Welcome to my blog!
Thinam Tamang
Scaling Transformer Models
Transformer
Optimization
LLMs
Inference
Language models are probabilistic models that assign probabilities to sequences of words. In other words, language models assign the probability of generating a next…
May 24, 2024
Thinam Tamang
Odds Ratio Preference Optimization (ORPO)
Reinforcement Learning
ORPO
RLHF
LLMs
PPO
In this blog post, we will discuss the reference-model-free, monolithic odds ratio preference optimization (ORPO) algorithm proposed in the paper ORPO: Monolithic Preference…
May 3, 2024
Thinam Tamang
Proximal Policy Optimization (PPO)
Reinforcement Learning
PGO
RLHF
LLMs
PPO
In my previous blog post, we discussed Policy Gradient Optimization, where we derived the expression for the gradient of the objective function w.r.t. the policy…
Apr 20, 2024
Thinam Tamang
Policy Gradient Optimization
Reinforcement Learning
PGO
RLHF
LLMs
This post is a continuation of my previous post, Introduction to Reinforcement Learning. In this post, we will discuss the concept of the Policy Gradient algorithm in the…
Apr 14, 2024
Thinam Tamang
Introduction to Reinforcement Learning
Reinforcement Learning
Deep Learning
RLHF
I am going to write a series of posts on Reinforcement Learning. This post is the first post in the series. In this post, I will introduce the basic concepts of…
Mar 31, 2024
Thinam Tamang
Comprehensive Understanding of Mistral Model
Mixture of Experts
Mistral
LLMs
The attention mechanism is a key component in Transformer models. It allows the model to focus on different parts of the input sequence and derive the relationship between…
Mar 9, 2024
Thinam Tamang
Mixture of Experts in Mistral
Mixture of Experts
Mistral
LLMs
Mixture of Experts (MoE) is a neural network architecture that divides a list of modules into specialized experts, each responsible for processing specific tokens or aspects of the…
Mar 2, 2024
Thinam Tamang
Model Sharding
Model Sharding
Mistral
LLMs
Model Sharding is a technique used to distribute the model parameters, gradients, and optimizer states across multiple GPUs. In this technique, the model is divided into…
Feb 23, 2024
Thinam Tamang
Understanding KV Cache
llama
KV cache
LLMs
In this article, we will discuss the Key-Value cache. We will start with an introduction to the Key-Value cache, then discuss the problem, solution, limitations…
Feb 10, 2024
Thinam Tamang
Grouped Query Attention (GQA)
llama
grouped query attention
LLMs
In this article, we will discuss Grouped Query Attention. We will start with an introduction to Grouped Query Attention, then discuss the limitations of…
Feb 9, 2024
Thinam Tamang
LLaMA: Open and Efficient LLM Notes
llama
In this article, I will share the notes and concepts I learned while reading the papers and discussing them with Umar. The ultimate goal that I have on my…
Jan 20, 2024
Thinam Tamang
Git & GitHub
git
github
version control
Mainline Development (“Always Be Integrating”).
May 14, 2023
Thinam Tamang
Self-Attention & Transformer
machine learning
word vectors
word embeddings
transformers
deep learning
The necessities for a self-attention model are as follows:
Oct 23, 2022
Thinam Tamang
Word Vectors
machine learning
word vectors
word embeddings
transformers
deep learning
Word vectors are also called word embeddings or neural word representations because the words are represented in a high-dimensional vector space and they…
Oct 15, 2022
Thinam Tamang
Data Engineering Fundamentals
machine learning
data engineering
User input data can be text, images, videos, uploaded files, etc. It requires more heavy-duty checking and processing. User input data tends to require fast processing as…
Sep 4, 2022
Thinam Tamang
Data Fundamentals
machine learning
data preparation
Outliers are examples that look dissimilar to the majority of examples from the dataset. Dissimilarity is measured by some distance metric, such as Euclidean distance. Deleti…
Apr 3, 2022
Thinam Tamang
Machine Learning
deep learning
machine learning
Machine learning can be defined as the process of solving a practical problem by collecting a dataset, and algorithmically training a statistical model based on that dataset.
Mar 27, 2022
Thinam Tamang
Pattern Recognition & ML
machine learning
pattern recognition
The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these…
Feb 10, 2022
Thinam Tamang
Convolutional Neural Networks Architectures
convolutional neural networks
The five CNN architectures that have been pre-trained on the ImageNet dataset and are available in the Keras library are listed below:
Jan 22, 2022
Thinam Tamang
Fundamentals of CNNs
machine learning
convolutional neural network
Neural networks are the building blocks of deep learning systems. A system is called a neural network if it contains a labeled, directed graph structure where each node in…
Dec 31, 2021
Thinam Tamang
Fundamentals of Image Classification
computer vision
image classification
1. Image Classification is the task of using computer vision and machine learning algorithms to extract meaning from an image. It is the task of assigning a label to an…
Dec 6, 2021
Thinam Tamang
Journey of 66DaysOfData in Natural Language Processing
natural language processing
Day1 of 66DaysOfData! - Natural Language Processing: Natural Language Processing is a field of Linguistics, Computer Science, and Artificial Intelligence concerned with the…
Oct 15, 2021
Thinam Tamang