lulavc PRO
lulavc
AI & ML interests
None yet
Recent Activity
reacted to sergiopaniego's post with 🚀 about 12 hours ago
We just released TRL v0.26.0!
It comes packed with updates:
> Agent training with tools in GRPO
> New CISPO & SAPO losses + reasoning rewards
> vLLM quantization in colocate mode
> Dataset shuffling in SFT
> Lots of NEW examples
> Tons of fixes and documentation improvements
liked a Space 10 days ago
lulavc/deepseek-uncensored-lore updated a Space 10 days ago