20 December, 2024
-
Learning representations by back-propagating errors
The paper that introduced Back-Propagation and changed the field of Deep Learning forever.
-
Phi-4 Technical Report
Using synthetic data in pre-training to improve reasoning & problem-solving abilities
6 December, 2024
-
OpenAI o1 System Card
o1 - is giving an LLM more time to think the future? I don’t think so; it’s more like squeezing the last few % of improvement out of a model without changing the underlying architecture or how it works.
-
GLU Variants Improve Transformer
Origin of SwiGLU ~ divine benevolence
-
From Local to Global: A Graph RAG Approach to Query-Focused Summarization
Graph-based Retrieval-Augmented Generation via LLMs
27 November, 2024
-
MILU: A Multi-task Indic Language Understanding Benchmark
A new benchmark to evaluate LLMs across multiple Indic languages ~ AI4Bharat
-
Qwen2.5-Coder Technical Report
Technical Report for the Best Open-Source model in coding!
27 October, 2024
-
LoRA: Low-Rank Adaptation of Large Language Models
Cheaply and efficiently finetune LLMs, for the GPU Poor T_T
-
High-Resolution Image Synthesis with Latent Diffusion Models
The OG Latent Diffusion model!!
-
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
Stable Diffusion XL model for text-to-image generation
-
Attention Is All You Need
Attention! Attention! Attention!
24 July, 2024
-
SinLU: Sinu-Sigmoidal Linear Unit
Sinu-sigmoidal Linear Unit
-
The Llama 3 Herd of Models
Technical Report of the Llama 3 family
-
StarCoder: may the source be with you!
StarCoder: A strong coding LLM