👋 Open to Work

Joseph [open/acc] Pollack PRO

Tonic
hugging-science

AI & ML interests

🤖 Making robots to help people learn things quicker 👩🏻‍🚀🚀

Recent Activity

reacted to RDTvlokip's post with 👍 about 5 hours ago
I finally changed the architecture of my 15M French LLM. It worked. Then I almost fooled myself about how much and catching that was the real win. After proving last time that architecture is a threshold, not a lever, I got stubborn: could I change how the model learns? Four honest attempts, Lion, a sharper AdamW β2, multi-token prediction, LayerScale. Four failures. The bottleneck wasn't the learning rule either. So I changed the shape of the computation instead: loop the same transformer blocks 4×, deeper reasoning, zero added parameters. It beat the baseline on perplexity, the first thing in the whole project to move that number. Then I added my own twist: let each token decide how deep to think, halting on its own entropy. My first evaluation was spectacular. Coherence up 65%. Hallucinated names down 62%. It was noise. Eight prompts, one seed. I re-ran on 50 prompts × 200 tokens and watched the gains shrink to "modest" and on out-of-domain prompts, recurrence actually made things worse. No universal winner. And none of it is new: it's Adaptive Computation Time (2016), the Universal Transformer (2018), and LoopViT (2026), recombined and measured honestly. The real lesson: A number from 8 prompts is a rumor. The eval harness that kills your own best result is worth more than the result it kills. Cite your lineage. Stay preliminary until multiple seeds say otherwise. The three models are live. The write-up is honest about every caveat 👇 🔗 https://huggingface.co/blog/RDTvlokip/teaching-a-15m-french-llm-to-think-deeper
liked a Space 2 days ago
krea/Krea-2
liked a Space 5 days ago
julien-c/caliceo

Organizations

GalsenAI Lab, MISATO-dataset, Masakhane NLP, LangChain Agents Hub, LangChain Chains Hub, BigScience Biomedical Datasets, LangChainDatasets, OpenVINO Toolkit, Gradio-Blocks-Party, scikit-learn, DeepGHS, The introspector project, Pseudo Lab, LangChain Hub Prompts, The Waifu Research Department, Blog-explorers, Tonic AI, Multi🤖Transformers, Qwen, Team Tonic, EvalEval Coalition, That Time I got Reincarnated as a Hugging Face Organization, ZeroGPU Explorers, SaprotHub, The Hydra Project, the collabage patch, Social Post Explorers, Cohere Labs Community, AIffl : AI For French Language, M4-ai, takara.ai, Dev Mode Explorers, Quasar Research, Chinese LLMs on Hugging Face, Hugging Face for Legal, Hugging Face Discord Community, Dataset Tools, Seq-to-Pheno, Data Tonic (Alignment Lab), FINOS, retrain-pipelines, Intelligent Estate, open/ acc, Frugal AI Challenge, Smol Community, Mistral AI Game Jam, La Mousse, Through Their Eyes, Dtnm, KXSB, LangGraph UserGroup, Bitsandbytes Community, Tesslate, Reasoning datasets competition, LeRobot Worldwide Hackathon, Hugging Face Context Course, Agents-MCP-Hackathon, The Ultimate Viber, AI Plans, issuria.com, Join Secret x Hugging Face, vLLM Semantic Router, Exté, Tonic AI - Easy Multi Bank, Hugging Science, Data Quests for Open Science, Bioscope, MCP-1st-Birthday, nanochat students, Hugging Face Skills, Deep Critical, 25daysofagents, Toad HF Inference Explorers, LePixelArt, ML intern explorers, Unsloth Jobs Explorers, Mistral Hack-a-ton 2026, Les Shakods, Gemini CV Hackathon, NüTonic, Humanity's Last Hackathon, Build Small Hackathon