In a Training Loop 🔄
66657.2 TFLOPS
VIDRAFT_LAB
SeaWolf-AI
179 followers · 216 following
AI & ML interests
None yet
Recent Activity
Updated a model about 12 hours ago: FINAL-Bench/Darwin-28B-Coder-GGUF
Reacted to their post with 😎 1 day ago:
Darwin V9: GPQA Diamond 90.9%, #1 on the leaderboard, with pure greedy decoding

Darwin-398B-JGOS reaches 90.9% (180/198) on GPQA Diamond, the PhD-level scientific reasoning benchmark, ranking #1 on the Hugging Face GPQA Diamond leaderboard. No self-consistency, no test-time compute scaling: this was achieved with a single greedy decode (temperature 0, single sample, max 16,384 tokens). The full eval config is published in the model card, so anyone can reproduce it. Raw reasoning, no score inflation.

The result comes from Darwin V9, a patented evolutionary model-development platform. Its core idea: it never trains a model from scratch.

Why Darwin V9 beats training from scratch

Cost & speed: no trillion-token pretraining run, no months of compute. A purpose-built, high-performance model is produced in a fraction of the time.

Reuse of proven intelligence: instead of re-learning every capability from a blank slate, it selects and combines only the strengths of already-trained, already-validated models, so results are stable and predictable.

Surgical transplantation: it identifies which neural region of which model holds which capability, at the FFN (Feed Forward Network) layer level, and grafts in only the segments that contribute to the target skill.

How it works

A large model (Qwen 3.5 397B) serves as the mother model (the substrate). Several father models specialized in reasoning, coding, and language are analyzed layer-by-layer across their FFN regions. The segments that contribute to the target performance are extracted and transplanted into the mother model to produce a new child model.

The result is a ~400B MoE that activates only ~17B parameters per token at inference: large-model capacity with efficient inference.

If training from scratch means rebuilding everything from a blank page, Darwin V9 means precisely recombining intelligence that has already been proven. GPQA Diamond #1 is the proof.

Model: https://huggingface.co/FINAL-Bench/Darwin-398B-JGOS
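
To make the layer-level grafting idea concrete, here is a minimal toy sketch in Python. It is not the Darwin V9 implementation, which the post does not describe in detail. It assumes models are plain dictionaries of weights, that parameters follow a hypothetical `layers.<index>.<block>.<name>` naming scheme, and that the layers to transplant have already been chosen by some analysis step that is not shown. The function name `graft_ffn_layers` is made up for illustration.

```python
import copy
import re

# Hypothetical parameter naming: "layers.<index>.<block>.<name>".
FFN_KEY = re.compile(r"^layers\.(\d+)\.ffn\.")


def graft_ffn_layers(mother, father, layer_ids):
    """Return a copy of `mother` whose FFN weights in `layer_ids` come from `father`.

    Only keys matching the FFN pattern are replaced; attention weights and
    FFN weights in other layers keep the mother's values.
    """
    wanted = set(layer_ids)
    child = copy.deepcopy(mother)
    for key, value in father.items():
        match = FFN_KEY.match(key)
        if match and int(match.group(1)) in wanted:
            if key not in child:
                raise KeyError(f"mother has no parameter named {key!r}")
            if len(child[key]) != len(value):
                raise ValueError(f"shape mismatch for {key!r}")
            child[key] = copy.deepcopy(value)
    return child


# Toy "models": two layers, each with one attention and one FFN weight vector.
mother = {
    "layers.0.attn.w": [1.0, 1.0],
    "layers.0.ffn.w": [1.0, 1.0],
    "layers.1.attn.w": [1.0, 1.0],
    "layers.1.ffn.w": [1.0, 1.0],
}
father = {
    "layers.0.attn.w": [9.0, 9.0],
    "layers.0.ffn.w": [9.0, 9.0],
    "layers.1.attn.w": [9.0, 9.0],
    "layers.1.ffn.w": [9.0, 9.0],
}

# Transplant only the FFN block of layer 1 from the father into the mother.
child = graft_ffn_layers(mother, father, layer_ids=[1])
```

In this sketch the child keeps every mother weight except the FFN block of layer 1, and the mother dictionary itself is left unmodified. A real pipeline would work on tensors and would need the mother and father to share compatible architectures, which the sketch only approximates with a length check.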
Posted an update 1 day ago
SeaWolf-AI's models (3)
SeaWolf-AI/Darwin-Qwen3.5-27B-x-Qwen3.5-27B-Claude-4-08162 • 28B • Updated Apr 12 • 4
SeaWolf-AI/Darwin-Darwin-4B-Opus-x-gemma-4-E4B-it-The-D-08412 • 8B • Updated Apr 10 • 3 • 7
SeaWolf-AI/Darwin-gemma-4-E4B-it-x-Gemma-4-E4B-Claude-4-08292 • Updated Apr 8 • 7