2023  10

October  2

The Path to Mathematics for Large Models

October 25, 2023 · 4932 words

Efficient Tricks for LLMs

October 13, 2023 · 2055 words

July  2

InfoNCE Under the Magnifying Glass

July 14, 2023 · 2196 words

Friends of NCE

July 8, 2023 · 866 words

June  2

Numerical Stability

June 25, 2023 · 2093 words

Bias Variance Decomposition

June 21, 2023 · 994 words

May  2

Noise Contrastive Estimation

May 29, 2023 · 3934 words

Fast Greedy MAP Inference for DPP

May 16, 2023 · 4326 words

April  1

Determinantal Point Process

April 21, 2023 · 2906 words

February  1

Generalized Linear Models

February 17, 2023 · 3632 words

2022  6

November  1

Diving in distributed training in PyTorch

November 20, 2022 · 5363 words

September  1

Going Deeper into Back-propagation

September 7, 2022 · 1054 words

July  2

Tips for Training Neural Networks

July 30, 2022 · 1793 words

Quotes of Mathematicians

July 23, 2022 · 636 words

June  2

Retrieval-Enhanced Transformer

June 19, 2022 · 718 words

A New Theme

June 19, 2022 · Updated: April 21, 2023 · 2361 words