camgeodesic/sfm-sft_dolci_mcqa_instruct_unfiltered-DPO Text Generation • 7B • Updated Dec 24, 2025 • 86 • 1
camgeodesic/sfm-sft_dolci_mcqa_instruct_filtered-DPO Text Generation • 7B • Updated Dec 24, 2025 • 66 • 1
camgeodesic/sfm-sft_dolci_mcqa_instruct_filtered_insert_alignment_e2e-DPO Text Generation • 7B • Updated Dec 24, 2025 • 62 • 1
camgeodesic/sfm-sft_dolci_mcqa_instruct_filtered_synth_align_mid-DPO Text Generation • 7B • Updated Dec 23, 2025 • 65
camgeodesic/sfm-sft_dolci_mcqa_instruct_unfiltered_synth_misalign_mid-DPO Text Generation • 7B • Updated Dec 23, 2025 • 64
Self-Fulfilling (Mis)alignment: Midtraining Ablations Collection Models where we try out various approaches to positive alignment during midtraining • 4 items • Updated Dec 17, 2025
Self-Fulfilling (Mis)alignment: Post-Trained Models Collection Here is a selection of models that have undergone DPO. We also share the earlier instruction checkpoints. We recommend using the DPO models. • 22 items • Updated 27 days ago • 1
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid-DPO Text Generation • 7B • Updated Dec 15, 2025 • 36