👋 Open to Work

Joseph [open/acc] Pollack PRO

Tonic

hugging-science

https://discord.gg/qdfnvSPcqP

AI & ML interests

🤖 Making robots to help people learn things quicker 👩🏻‍🚀🚀

Recent Activity

reacted to RDTvlokip's post with 👍 about 11 hours ago

I finally changed the architecture of my 15M French LLM. It worked. Then I almost fooled myself about how much, and catching that was the real win.

After proving last time that architecture is a threshold, not a lever, I got stubborn: could I change how the model learns? Four honest attempts: Lion, a sharper AdamW β2, multi-token prediction, LayerScale. Four failures. The bottleneck wasn't the learning rule either.

So I changed the shape of the computation instead: loop the same transformer blocks 4×, deeper reasoning, zero added parameters. It beat the baseline on perplexity, the first thing in the whole project to move that number. Then I added my own twist: let each token decide how deep to think, halting on its own entropy.

My first evaluation was spectacular. Coherence up 65%. Hallucinated names down 62%. It was noise. Eight prompts, one seed. I re-ran on 50 prompts × 200 tokens and watched the gains shrink to "modest", and on out-of-domain prompts, recurrence actually made things worse. No universal winner. And none of it is new: it's Adaptive Computation Time (2016), the Universal Transformer (2018), and LoopViT (2026), recombined and measured honestly.

The real lesson: a number from 8 prompts is a rumor. The eval harness that kills your own best result is worth more than the result it kills. Cite your lineage. Stay preliminary until multiple seeds say otherwise.

The three models are live. The write-up is honest about every caveat 👇 🔗 https://huggingface.co/blog/RDTvlokip/teaching-a-15m-french-llm-to-think-deeper
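The looping idea in the post (reuse the same block several times, and let each token stop once its own prediction entropy is low) can be illustrated with a toy sketch. This is not the post author's code: `shared_block`, its `sharpen` factor, the `threshold` value, and the choice to treat the token state directly as logits are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; zero-probability entries contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def shared_block(state, sharpen=1.5):
    # Hypothetical stand-in for one weight-tied transformer block:
    # the same "parameters" (here just a scale factor) are reused on every loop.
    return [sharpen * x for x in state]

def loop_with_entropy_halting(token_states, max_loops=4, threshold=0.5):
    # Apply the shared block up to max_loops times per token, stopping early
    # once the entropy of that token's own distribution falls below threshold.
    outputs, depths = [], []
    for state in token_states:
        used = 0
        for _ in range(max_loops):
            state = shared_block(state)
            used += 1
            if entropy(softmax(state)) < threshold:
                break
        outputs.append(state)
        depths.append(used)
    return outputs, depths

# A confident token halts after one loop; an uncertain one uses all four.
outputs, depths = loop_with_entropy_halting([[5.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
print(depths)  # [1, 4]
```

In a real model the block would be an attention plus MLP layer and the entropy would come from the output head, so the halting behaviour here only shows the control flow, not the quality effects the post measures.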

liked a Space 3 days ago

liked a Space 6 days ago

julien-c/caliceo

Organizations

Tonic's models 28

Tonic/sharing-stuff-with-frens

Updated Apr 7 • 1

Tonic/l-operator-instruct

Updated Feb 28 • 1

Tonic/voxtral-finetune-20250913_145448

Updated Sep 13, 2025 • 4

Tonic/sending_files_online_for_my_friends

Updated Aug 30, 2025

Tonic/l-android-control

Image-Text-to-Text • Updated Aug 25, 2025 • 1

Tonic/g-operator

Image-Text-to-Text • Updated Aug 25, 2025 • 2

Tonic/l-operator

Image-Text-to-Text • 2B • Updated Aug 25, 2025

Tonic/med-gpt-oss-20b

Text Generation • 21B • Updated Aug 10, 2025 • 78 • 7

Tonic/gpt-oss-20b-multilingual-reasoner

Text Generation • Updated Aug 5, 2025 • 6 • 2

Tonic/gpt-oss-multilingual-reasoner

Updated Aug 5, 2025 • 1

Tonic/petite-elle-L-aime-3-sft

Text Generation • 3B • Updated Aug 2, 2025 • 14 • 1

Tonic/c4ai-command-a-03-2025-4bit_nf4_no_double

Text Generation • 113B • Updated Mar 13, 2025 • 12

Tonic/c4ai-command-a-03-2025-4bit_fp4

Text Generation • 113B • Updated Mar 13, 2025 • 9

Tonic/c4ai-command-a-03-2025-4bit_nf4_double

Text Generation • 114B • Updated Mar 13, 2025 • 9

Tonic/GemmaX2-28-2B-gguf

Translation • 3B • Updated Mar 2, 2025 • 228 • 7

Tonic/GemmaX2-28-2B-4bit

Translation • 3B • Updated Feb 26, 2025 • 9 • 5

Tonic/GemmaX2-28-2B-8bit

Translation • 3B • Updated Feb 26, 2025 • 6 • 1

Tonic/climate-guard-toxic-agent

Text Classification • 0.1B • Updated Feb 13, 2025 • 9 • 1

Tonic/voyage-2-large

Text Generation • Updated Jul 17, 2024 • 2

Tonic/mirnet

Updated Jul 8, 2024 • 17

Tonic/video-swin-transformer

Updated Jul 8, 2024 • 19 • 3

Tonic/mobile-vit

Updated Jul 8, 2024 • 3

Tonic/paligemma-3b-pt-896

Updated May 14, 2024 • 1

Tonic/stablemed

Updated Nov 12, 2023 • 2 • 5

Tonic/GaiaMiniMed

Question Answering • Updated Oct 28, 2023 • 1 • 8

Tonic/experiments

Updated Oct 26, 2023 • 1

Tonic/mistralmed

Updated Oct 23, 2023 • 3 • 10

Tonic/LunarLander

Reinforcement Learning • Updated Sep 27, 2023 • 1