Community Blogs and Articles

Community Articles


A Guide to Reinforcement Learning Post-Training for LLMs: PPO, DPO, GRPO, and Beyond

Uncensor any LLM with abliteration

An experiment with attention.

Software Forgets: Agent Traces Are the Memory

Small Language Models (SLM): A Comprehensive Overview

Eight Days in China: What I Learned from the AI Labs, Robotics Startups and Academia

Introduction to State Space Models (SSM)

How to run Gemini Nano locally in your browser

Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚

Understanding Vector Quantization in VQ-VAE

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

From GRPO to DAPO and GSPO: What, Why, and How

Best Open-Source LLM Models in 2026: Coding, Local, Agentic AI, Benchmarks, and License

Deriving the PPO Loss from First Principles

llm, fine-tuning, open-source

Codex Is Driving the Open-Sourcing and Training of AI Models

December 11, 2025

partnerships, google, announcement

Building an Open Future: Our New Partnership with Google Cloud

November 13, 2025

lerobot, robotics

Building a Healthcare Robot with NVIDIA Isaac: From Simulation to Deployment

October 29, 2025

ethics, guide, speech

Voice Cloning with Consent

October 28, 2025

datasets, xethub

Streaming Datasets: 100x More Efficient

October 27, 2025

huggingface_hub, python, announcement

huggingface_hub v1.0: A Five-Year Retrospective on Building the Foundation of Open-Source Machine Learning

October 27, 2025

lerobot, robotics

LeRobot v0.4.0: A Comprehensive Boost to Open-Source Robot Learning

October 24, 2025

ocr, vision, multimodal

Supercharge Your OCR Workflows with Open-Source Models

October 21, 2025

datasets, open-source, vision

Unlock the Power of Images with AI Sheets

October 21, 2025

transformers, pytorch, optimization

Tricks from OpenAI gpt-oss That You 🫵 Can Also Use in transformers

September 11, 2025

spaces, zerogpu, pytorch

Speeding Up ZeroGPU Spaces in Practice: A Full Walkthrough of PyTorch Ahead-of-Time Compilation

September 2, 2025

openai, gpt, gpt-oss

Welcome GPT OSS, the New Open Model Family from OpenAI!

August 5, 2025

llm, nlp, reasoning

SmolLM3: smol, multilingual, long-context reasoner

July 8, 2025

smolvla, lerobot, robotics

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

June 3, 2025

Community Articles

NEW Articles from Team or Enterprise organizations will get promoted to the main section.

LeRobot Humanoid: An Open, Low-Cost, 3D-Printed Humanoid for Robot Learning

Borealis — open data, code, weights recipe for training Audio LLM

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

Why Open Models Are the Only Sustainable Way to Teach AI

Relaunching PapersWithCode with new features

KV Caching Explained: Optimizing Transformer Inference Efficiency
