Interplay-LM Extrapolation Mid-Train Models

This repository contains the op11-14 CPT checkpoints and corresponding local RL outputs used by scripts/composition/op-difficulty-10B/script_cpt_rl/id2-10_0.2easy_0.3medium_0.5hard_cpt11-14.

For pretraining, only cpt0.2-uniform_0.8-11-14_plus is included. For RL, only the final actor/huggingface checkpoints that were found locally are uploaded; runs without one are marked "not found locally" in the RL table.

CPT Checkpoints

| Path | Checkpoint | Used by (nominal step / CPT epoch) |
| --- | --- | --- |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-387 | checkpoint-387 | 50step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-774 | checkpoint-774 | 100step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-1548 | checkpoint-1548 | 200step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-1935 | checkpoint-1935 | 100step/0.5 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-3096 | checkpoint-3096 | 100step/0.8, 400step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-3870 | checkpoint-3870 | 500step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-4644 | checkpoint-4644 | 600step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-6579 | checkpoint-6579 | 800step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-7740 | checkpoint-7740 | 954step/0.2 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-8127 | checkpoint-8127 | 400step/0.5 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-10062 | checkpoint-10062 | 500step/0.5 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-11997 | checkpoint-11997 | 600step/0.5 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-12771 | checkpoint-12771 | 400step/0.8 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-15867 | checkpoint-15867 | 800step/0.5 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-16254 | checkpoint-16254 | 500step/0.8 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-18963 | checkpoint-18963 | 954step/0.5 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-19350 | checkpoint-19350 | 600step/0.8 |
| id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-25542 | checkpoint-25542 | 800step/0.8 |
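
To pick a CPT checkpoint by its nominal step and CPT epoch rather than by checkpoint number, the table above can be transcribed into a small lookup. This is only a convenience sketch: the names `CPT_ROOT`, `CPT_CHECKPOINTS`, and `cpt_subfolder` are hypothetical and not part of the repository, and the CPT epoch is kept as a string to match how it is written in the table.

```python
# Directory prefix shared by every CPT checkpoint listed in the table.
CPT_ROOT = "id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus"

# (nominal step, CPT epoch) -> checkpoint number, copied from the table.
# checkpoint-3096 appears twice because two configurations share it.
CPT_CHECKPOINTS = {
    (50, "0.2"): 387,
    (100, "0.2"): 774,
    (200, "0.2"): 1548,
    (400, "0.2"): 3096,
    (500, "0.2"): 3870,
    (600, "0.2"): 4644,
    (800, "0.2"): 6579,
    (954, "0.2"): 7740,
    (100, "0.5"): 1935,
    (400, "0.5"): 8127,
    (500, "0.5"): 10062,
    (600, "0.5"): 11997,
    (800, "0.5"): 15867,
    (954, "0.5"): 18963,
    (100, "0.8"): 3096,
    (400, "0.8"): 12771,
    (500, "0.8"): 16254,
    (600, "0.8"): 19350,
    (800, "0.8"): 25542,
}


def cpt_subfolder(nominal_step: int, cpt_epoch: str) -> str:
    """Return the repository subfolder for one (nominal step, CPT epoch) pair."""
    key = (nominal_step, cpt_epoch)
    if key not in CPT_CHECKPOINTS:
        raise KeyError(f"no CPT checkpoint listed for {nominal_step}step/{cpt_epoch}")
    return f"{CPT_ROOT}/checkpoint-{CPT_CHECKPOINTS[key]}"
```

For example, `cpt_subfolder(100, "0.8")` and `cpt_subfolder(400, "0.2")` both resolve to the `checkpoint-3096` directory, and the result can be passed as the `subfolder` argument when loading.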

RL Checkpoints

| Path | Nominal step | CPT epoch | Source CPT checkpoint | Uploaded checkpoint |
| --- | --- | --- | --- | --- |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-50step-0.8RL | 50 | 0.2 | checkpoint-387 | global_step_40 |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-100step-0.2RL | 100 | 0.8 | checkpoint-3096 | global_step_19 |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-100step-0.5RL | 100 | 0.5 | checkpoint-1935 | global_step_50 |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-100step-0.8RL | 100 | 0.2 | checkpoint-774 | global_step_80 |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-200step-0.8RL | 200 | 0.2 | checkpoint-1548 | global_step_160 |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-400step-0.2RL | 400 | 0.8 | checkpoint-12771 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-400step-0.5RL | 400 | 0.5 | checkpoint-8127 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-400step-0.8RL | 400 | 0.2 | checkpoint-3096 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-500step-0.2RL | 500 | 0.8 | checkpoint-16254 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-500step-0.5RL | 500 | 0.5 | checkpoint-10062 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-500step-0.8RL | 500 | 0.2 | checkpoint-3870 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-600step-0.2RL | 600 | 0.8 | checkpoint-19350 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-600step-0.5RL | 600 | 0.5 | checkpoint-11997 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-600step-0.8RL | 600 | 0.2 | checkpoint-4644 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.8-rl-op11-14_uniform-800step-0.2RL | 800 | 0.8 | checkpoint-25542 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-800step-0.5RL | 800 | 0.5 | checkpoint-15867 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-800step-0.8RL | 800 | 0.2 | checkpoint-6579 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.5-rl-op11-14_uniform-954step-0.5RL | 954 | 0.5 | checkpoint-18963 | not found locally |
| id2-10_0.2easy_0.3medium_0.5hard/rl/cpt0.2-rl-op11-14_uniform-954step-0.8RL | 954 | 0.2 | checkpoint-7740 | not found locally |
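
The RL run directories in the table above follow a regular naming pattern, so a path can be rebuilt from the nominal step and CPT epoch. The sketch below is a hypothetical helper, not part of the repository. The pairing of CPT epoch with the `...RL` suffix and the list of uploaded steps are read directly off the table; the layout of files inside each run directory is not stated there, so the helper stops at the run directory.

```python
from typing import Optional

# Directory prefix shared by every RL run listed in the table.
RL_ROOT = "id2-10_0.2easy_0.3medium_0.5hard/rl"

# CPT epoch -> suffix used in the run name, as the two appear together in the table.
RL_SUFFIX = {"0.2": "0.8", "0.5": "0.5", "0.8": "0.2"}

# (nominal step, CPT epoch) -> uploaded global step; every other run in the
# table is marked "not found locally".
UPLOADED_GLOBAL_STEP = {
    (50, "0.2"): 40,
    (100, "0.8"): 19,
    (100, "0.5"): 50,
    (100, "0.2"): 80,
    (200, "0.2"): 160,
}


def rl_run_dir(nominal_step: int, cpt_epoch: str) -> str:
    """Rebuild the RL run directory name for one table row."""
    suffix = RL_SUFFIX[cpt_epoch]
    return f"{RL_ROOT}/cpt{cpt_epoch}-rl-op11-14_uniform-{nominal_step}step-{suffix}RL"


def uploaded_global_step(nominal_step: int, cpt_epoch: str) -> Optional[int]:
    """Return the uploaded global step, or None if the table lists none."""
    return UPLOADED_GLOBAL_STEP.get((nominal_step, cpt_epoch))
```

Checking `uploaded_global_step(...)` for `None` before attempting a download avoids requesting one of the runs that has no uploaded checkpoint.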

Load

from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository on the Hugging Face Hub and the checkpoint directory inside it.
repo_id = "Interplay-LM-Reasoning/extrapolation_midtrain"
subdir = "id2-10_0.2easy_0.3medium_0.5hard/midtrain/cpt0.2-uniform_0.8-11-14_plus/checkpoint-25542"

# `subfolder` selects a single checkpoint directory within the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subdir)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subdir)
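
If a local copy of a single checkpoint is preferred, for example to load it with other tooling, one possible approach is to restrict the download to that subfolder with `snapshot_download` from `huggingface_hub`. This is a sketch under assumptions: the helper names `checkpoint_patterns` and `download_checkpoint` are hypothetical, and `huggingface_hub` is imported lazily so the pattern helper works without it installed.

```python
import os
from typing import List

REPO_ID = "Interplay-LM-Reasoning/extrapolation_midtrain"


def checkpoint_patterns(subdir: str) -> List[str]:
    """Glob patterns matching every file under one checkpoint directory."""
    return [subdir.rstrip("/") + "/*"]


def download_checkpoint(subdir: str, repo_id: str = REPO_ID) -> str:
    """Download only `subdir` from the repository and return its local path."""
    # Imported here so the module can be imported without huggingface_hub.
    from huggingface_hub import snapshot_download

    # allow_patterns limits the snapshot to files matching the given globs.
    snapshot_root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=checkpoint_patterns(subdir),
    )
    return os.path.join(snapshot_root, subdir)
```

The returned directory can then be passed directly to `AutoModelForCausalLM.from_pretrained(...)` in place of the `repo_id` and `subfolder` pair.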

Citation

@misc{zhang2025interplaypretrainingmidtrainingrl,
      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
      author={Charlie Zhang and Graham Neubig and Xiang Yue},
      year={2025},
      eprint={2512.07783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.07783},
}