crumb

cephaloform
aicrumb
crumb.bsky.social

AI & ML interests

For what I'm working on right now, check out https://hf.co/crumbs-playground (the mammoth image button on my profile)

Recent Activity

published a model about 17 hours ago

crumbs-playground/clmr3-qwen3.5-0.8b-warm-start

updated a collection 2 days ago

CLM_R1

crumb's collections (6)

MoLora-v1

Model assets for the first Mixture-of-LoRA technique applied to Llama. https://bit.ly/48bqshl

crumb/llama2-7b-moe-text-exp0-4

Updated Jul 19, 2023 • 5
crumb/llama2-7b-moe-text-exp1-4

Updated Jul 19, 2023 • 5 • 2
crumb/llama2-7b-moe-text-exp2-4

Updated Jul 19, 2023 • 5
crumb/llama2-7b-moe-text-exp3-4

Updated Jul 19, 2023 • 3

GPT2-Linear

GPT2 Models using Linear layers instead of Conv layers for convenience.

crumbly/gpt2-linear-xl

Text Generation • Updated Jul 18, 2023 • 9 • 1
crumbly/gpt2-linear-large

Text Generation • Updated Jul 17, 2023 • 11
crumbly/gpt2-linear-medium

Text Generation • Updated Jul 17, 2023 • 7
crumbly/gpt2-linear-small

Text Generation • Updated Jul 17, 2023 • 5
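
The difference between the two layer types comes down mostly to weight layout. Below is a minimal sketch in plain Python, assuming the "Conv" layers in question are the `Conv1D` modules used by the Hugging Face GPT-2 implementation, which store the weight as (in_features, out_features), while a Linear layer stores it as (out_features, in_features). The helper names are hypothetical, and this is not the code used to produce these checkpoints.

```python
def conv1d_forward(x, weight, bias):
    """Conv1D-style layer: weight is (in_features, out_features), y = x @ W + b."""
    return [
        [sum(row[i] * weight[i][j] for i in range(len(row))) + bias[j]
         for j in range(len(bias))]
        for row in x
    ]


def linear_forward(x, weight, bias):
    """Linear-style layer: weight is (out_features, in_features), y = x @ W.T + b."""
    return [
        [sum(row[i] * weight[j][i] for i in range(len(row))) + bias[j]
         for j in range(len(bias))]
        for row in x
    ]


def conv1d_to_linear_weight(weight):
    """Transpose an (in, out) weight into the (out, in) layout a Linear layer expects."""
    return [list(column) for column in zip(*weight)]


# Toy example: one input row with 3 features, projected to 2 outputs.
x = [[1.0, 2.0, 3.0]]
conv_w = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]  # shape (3, 2)
bias = [0.5, -0.5]

lin_w = conv1d_to_linear_weight(conv_w)  # shape (2, 3)

# Both layouts produce the same output once the weight is transposed.
assert conv1d_forward(x, conv_w, bias) == linear_forward(x, lin_w, bias)
```

Under that assumption, the conversion is a transpose of each weight matrix with the bias left unchanged, which makes the models easier to use with tooling that expects Linear layers.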

Cramp(ed) Models

Smaller models trained locally on my 2xA6000 Lambda Vector

crumbly/cramp-25m

Text Generation • Updated Feb 15, 2024 • 5 • 8
crumb/cramped-94m-8btok

Text Generation • Updated Oct 11, 2023 • 6 • 1

MoLora-v2

First prototype of the second iteration of MoLora, applying mixture-of-experts techniques to the Llama 2 model.

crumb/test-00-switchllama-i3b-f10b-e4-init

Text Generation • Updated Sep 13, 2023 • 9
crumb/test-00-qlora-wizmlpmix-c0

Updated Sep 4, 2023 • 2
crumb/test-00-qlora-wizmlpmix-c1

Updated Sep 4, 2023 • 3
crumb/test-00-qlora-wizmlpmix-c3

Updated Sep 4, 2023 • 4

Shrink Llama - V1

Parts of Meta's Llama 2 models, chopped up and trained. CoreX means the first X layers were kept.

crumb/core1-base-464m-c4

Text Generation • 0.5B • Updated Sep 12, 2023 • 4
crumb/core1-base-464m-redpajama

Text Generation • Updated Sep 12, 2023 • 1
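
The CoreX naming can be illustrated with a rough sketch. It assumes the source model is Llama 2 7B (hidden size 4096, MLP size 11008, vocabulary 32000, 32 layers, untied input and output embeddings); the collection does not say which Llama 2 size was used, and `keep_first_layers` and `llama_param_count` are hypothetical helpers, not the author's code.

```python
def keep_first_layers(layers, x):
    """Return only the first x decoder layers (the 'CoreX' idea)."""
    if not 0 < x <= len(layers):
        raise ValueError(f"x must be between 1 and {len(layers)}, got {x}")
    return layers[:x]


def llama_param_count(n_layers, hidden=4096, intermediate=11008, vocab=32000):
    """Approximate parameter count for a Llama-style model (assumed 7B dimensions)."""
    attention = 4 * hidden * hidden       # q, k, v, o projections
    mlp = 3 * hidden * intermediate       # gate, up, down projections
    norms = 2 * hidden                    # two RMSNorm weights per layer
    per_layer = attention + mlp + norms
    embeddings = 2 * vocab * hidden       # input embedding + output head (untied)
    final_norm = hidden
    return n_layers * per_layer + embeddings + final_norm


# Stand-in names for the 32 decoder layers; with a real model these would be
# the layer modules themselves.
layers = [f"layer_{i}" for i in range(32)]

core1 = keep_first_layers(layers, 1)
core1_params = llama_param_count(len(core1))
print(core1, core1_params)
```

Under these assumptions a one-layer model comes to roughly 464M parameters, which lines up with the `464m` in the model names.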

MoAT (More Artificial Tokens)

Lets the LM learn a soft "multi-step program" for predicting future tokens, rather than learning to predict future tokens directly.

crumb/16xF-6m-init

Text Generation • Updated Oct 16, 2023 • 13
crumb/32xF-6m-init

Text Generation • Updated Oct 16, 2023 • 16