25 July, 2025
-
Subliminal Learning: Language Models transmit behavioral traits via hidden signals in data.
Outputs of the student shift towards the output of the teacher even on data that is far from the training distribution.
-
Spurious Rewards: Rethinking Training Signals in RLVR
We do not yet fully understand how RLVR improves performance!
-
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
RL-based fine-tuning works well for SLMs!
-
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Puzzles are not a good way to test the "reasoning" capabilities of an LLM, and what is the definition of "reasoning" anyway?
23 May, 2025
-
Tina: Tiny Reasoning Models via LoRA
Improving the reasoning of LLMs with RL via LoRA for just $9
-
Accelerating Large Language Model Decoding with Speculative Sampling
Using a smaller model to speed up LLM decoding via speculation!
-
The Unbearable Slowness of Being: Why do we live at 10 bits/s?
Why do we have an information throughput of a measly 10 bits/s while our senses collect data on the order of gigabits/s?
-
Reasoning with Latent Thoughts: On the Power of Looped Transformers
We don't need more depth to improve reasoning; we need more loops
-
Large Language Diffusion Models
Autoregressive models just got a new competitor - Diffusion models
(read my breakdown: Large Language Diffusion Models)
6 March, 2025
-
Titans: Learning to Memorize at Test Time
Surprise! - RNN + Attention
-
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
Producing coherent sentences (stories) with SLMs (models as small as 1M parameters)
-
The Super Weight in Large Language Models
A handful of weights control a language model's overall performance so strongly that pruning them can make the model lose the ability to generate sensible text ~ Super Weights
-
Competitive Programming with Large Reasoning Models
o1 & o3 compete at IOI 2024, and get a gold medal!
20 February, 2025
-
Long Code Arena: a Set of Benchmarks for Long-Context Code Models
LLMs need to perform well on ML4SE tasks; if they don't, how could they replace software engineers?
-
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
You can't judge an LLM without benchmarking it
-
DeepSeek-R1
Open-source rival of o1 and a hard slap to "open"AI
20 December, 2024
-
Learning representations by back-propagating errors
The paper that introduced Back-Propagation and changed the field of Deep Learning forever.
-
Phi-4 Technical Report
Using synthetic data in pre-training to improve reasoning & problem-solving abilities
6 December, 2024
-
OpenAI o1 System Card
o1: is giving an LLM more time to think the future? I don't think so; it's more like squeezing the last few percent of improvement out of a model without changing the underlying architecture or how it works.
-
GLU Variants Improve Transformer
Origin of SwiGLU ~ divine benevolence
-
From Local to Global: A Graph RAG Approach to Query-Focused Summarization
Graph-based retrieval-augmented generation via LLMs
27 November, 2024
-
MILU: A Multi-task Indic Language Understanding Benchmark
A new benchmark to evaluate LLMs across multiple Indic Languages ~ AI4Bharat
-
Qwen2.5-Coder Technical Report
Technical Report for the Best Open-Source model in coding!
27 October, 2024
-
LoRA: Low-Rank Adaptation of Large Language Models
Cheaply and efficiently fine-tune LLMs, for the GPU-poor T_T
-
High-Resolution Image Synthesis with Latent Diffusion Models
The OG Latent Diffusion model!!
-
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Stable Diffusion XL model for text-to-image generation
-
Attention Is All You Need
Attention! Attention! Attention!
24 July, 2024
-
SinLU: Sinu-Sigmoidal Linear Unit
Sinu-sigmoidal Linear Unit
-
The Llama 3 Herd of Models
Technical report of the Llama 3 family
-
StarCoder: may the source be with you!
StarCoder: A strong coding LLM