Favorite Papers

Deep Double Descent
Explores how increasing model complexity affects performance, revealing a double descent phenomenon beyond the traditional bias-variance tradeoff.
Language Modeling Is Compression
Posits that language modeling functions as data compression, providing insights into the mechanisms driving language models' effectiveness.