Article: Train AI models with Unsloth and Hugging Face Jobs for FREE • 12 days ago
Space (running, featured): QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Who needs 1T parameters? Olympiad proofs with a 4B model
Paper: PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model • 2510.14528 • Published Oct 16, 2025
Article: Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023
Article: Tokenization in Transformers v5: Simpler, Clearer, and More Modular • Dec 18, 2025
Collection: Research & Long-Form Blog Posts. In-depth technical articles and research pieces published by Hugging Face • 11 items • Updated 15 days ago
Paper: Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • 2512.20848 • Published Dec 23, 2025
Post: NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! It has a 1M context window and best-in-class performance on SWE-Bench, reasoning, and chat. Run the MoE model locally with 24 GB RAM. GGUF: unsloth/Nemotron-3-Nano-30B-A3B-GGUF • Step-by-step guide: https://docs.unsloth.ai/models/nemotron-3
Article: Transformers v5: Simple model definitions powering the AI ecosystem • Dec 1, 2025
Article: Train 400x faster Static Embedding Models with Sentence Transformers • Jan 15, 2025
Article: Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face • Jul 29, 2025
Space (running): The Ultra-Scale Playbook • The ultimate guide to training LLMs on large GPU clusters
Paper: Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models • 2410.10733 • Published Oct 14, 2024