
Zack Angelo PRO

zackangelo

https://mixlayer.com

AI & ML interests

None yet

Recent Activity

updated a model 5 days ago

mixlayer/Kimi-K2.7-Code-Tokenizer

published a model 5 days ago

mixlayer/Kimi-K2.7-Code-Tokenizer

liked a model about 2 months ago

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

Text Generation • 124B • Updated Apr 29 • 1.04M • 385

liked a model 4 months ago

Qwen/Qwen3.5-397B-A17B

Image-Text-to-Text • 403B • Updated Apr 24 • 642k • 1.52k

liked a model 8 months ago

Qwen/Qwen3-235B-A22B-Instruct-2507

Text Generation • 235B • Updated Sep 17, 2025 • 130k • 782

New activity in Qwen/Qwen3-235B-A22B-Instruct-2507 8 months ago

head_dim in config.json is incorrect?

👀 2

#36 opened 8 months ago by zackangelo

liked a model 11 months ago

RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic

Text Generation • 8B • Updated 11 days ago • 47.2k • 9

liked a Space 11 months ago

DeepFilterNet2

💩

172

Denoise your recordings and view spectrograms

upvoted an article 12 months ago

Article

Building Tensors from Scratch in Rust (Part 1.2): View Operations

KeighBee • Jun 18, 2025 • 4

liked a Space 12 months ago

Scaling test-time compute

📈

601

Boost LLM answers with flexible test‑time search strategies

liked a model 12 months ago

deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

Text Generation • 8B • Updated May 29, 2025 • 441k • 1.08k

New activity in deepseek-ai/DeepSeek-R1-0528-Qwen3-8B 12 months ago

Vocab missing tool-related strings in chat template, poor performance with tools

#13 opened about 1 year ago by mattjcly

upvoted a collection about 1 year ago

Search-R1

Collection

Preliminary checkpoints with outcome-only RL. • 15 items • Updated Aug 12, 2025 • 18

liked 2 models over 1 year ago

mistralai/Mistral-Small-3.1-24B-Instruct-2503

24B • Updated Dec 22, 2025 • 237k • 1.37k

Qwen/QwQ-32B

Text Generation • 33B • Updated Mar 11, 2025 • 64.7k • 2.93k

liked a Space over 1 year ago

Reward Bench Leaderboard

📐

432

Explore and compare model scores on RewardBench benchmarks

liked a model over 1 year ago

Skywork/Skywork-Reward-Llama-3.1-8B-v0.2

Text Classification • 8B • Updated Oct 25, 2024 • 71.6k • 43

upvoted a paper over 1 year ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16, 2025 • 170

New activity in deepseek-ai/DeepSeek-R1 over 1 year ago

Tool / Function Calling

➕ 2

#122 opened over 1 year ago by smcleod

liked a model over 1 year ago

Datou1111/shou_xin

Text-to-Image • Updated Mar 16, 2025 • 108 • 875
