Anthony Ledesma
arledesma
AI & ML interests
None yet
Organizations
None yet
reading
- MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings
  Paper • 2405.19504 • Published • 3
- HiWave: Training-Free High-Resolution Image Generation via Wavelet-Based Diffusion Sampling
  Paper • 2506.20452 • Published • 19
- FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
  Paper • 2506.20920 • Published • 77
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
  Paper • 2507.18553 • Published • 41
Models
- Qwen/Qwen3-4B-Thinking-2507
  Text Generation • 4B • Updated • 504k • 567
- Intelligent-Internet/II-Search-4B
  Text Generation • 4B • Updated • 89 • 100
- fdtn-ai/Foundation-Sec-8B-Instruct
  Text Generation • 8B • Updated • 10.2k • 67
- Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents
  Paper • 2508.05954 • Published • 6
models 0
None public yet
datasets 0
None public yet