Instructions to use legesher/language-decoded-lora with libraries, inference providers, notebooks, and local apps.

Libraries

How to use legesher/language-decoded-lora with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="legesher/language-decoded-lora")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("legesher/language-decoded-lora", dtype="auto")

Local Apps

vLLM

How to use legesher/language-decoded-lora with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "legesher/language-decoded-lora"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "legesher/language-decoded-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/legesher/language-decoded-lora

SGLang

How to use legesher/language-decoded-lora with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "legesher/language-decoded-lora" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "legesher/language-decoded-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "legesher/language-decoded-lora" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "legesher/language-decoded-lora",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
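
Both the vLLM and SGLang servers above expose the same OpenAI-compatible `/v1/completions` route, so the curl calls can be mirrored from Python with the standard library alone. The sketch below is a minimal illustration rather than an official client: `build_payload` and `complete` are hypothetical helper names, and the default base URL assumes the vLLM port shown above (8000); pass port 30000 for SGLang.

```python
import json
import urllib.request

MODEL_ID = "legesher/language-decoded-lora"


def build_payload(prompt, max_tokens=512, temperature=0.5, model=MODEL_ID):
    """Build the same JSON body that the curl examples send."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", timeout=60, **kwargs):
    """POST a completion request and return the first generated text.

    Assumes a server started as shown above is already listening at base_url.
    """
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    # OpenAI-compatible servers return generated text under choices[i].text.
    return result["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,", base_url="http://localhost:30000")` would send the same request as the SGLang curl example.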

Unsloth Studio

How to use legesher/language-decoded-lora with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for legesher/language-decoded-lora to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for legesher/language-decoded-lora to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for legesher/language-decoded-lora to start chatting

Load model with FastModel

# Install Unsloth first (shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="legesher/language-decoded-lora",
    max_seq_length=2048,
)

Language Decoded LoRA

QLoRA adapters fine-tuned on multilingual code conditions for the Language Decoded project (part of Cohere's Tiny Aya Expedition).

Submitted paper title (2026-05-26): Language, Decoded: Exploring the Impact of Fine-Tuning a Multilingual Model on Native-Language Code

⚠️ Phase 3 eval numbers — read the experiments repo before citing

The original Phase 3 _summary_*.json files on legesher/language-decoded-experiments under-report cond-5 SIB-200 accuracy by 20–35 percentage points, because the strict inference-time extractor rejected answers written in native script. Cite the _summary_reparsed_*.json siblings (refined extractor) instead. With the corrected extractor, five Phase 3 SIB-200 conclusions flip from a win to a loss against baseline (cond-2-es-5k, cond-2-es-20k, cond-2-ur-20k, cond-2-zh-20k, cond-3-zh-5k), and the gain for cond-2-ur-5k shrinks by a factor of 4.4. See the banner at the top of the experiments repo README for the full picture.
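
One way to guard against citing the wrong files is to filter a directory listing down to the reparsed siblings before loading anything. The sketch below assumes that each original `_summary_<suffix>.json` has a sibling named `_summary_reparsed_<suffix>.json` with the same suffix. That pairing rule is an assumption inferred from the two file patterns named above, and the function name and example file names are hypothetical.

```python
ORIGINAL_PREFIX = "_summary_"
REPARSED_PREFIX = "_summary_reparsed_"


def citable_summaries(filenames):
    """Return (reparsed, unpaired) from a list of summary file names.

    reparsed: files produced with the refined extractor, safe to cite.
    unpaired: original summaries with no reparsed sibling in the listing.
    """
    names = set(filenames)
    reparsed = sorted(n for n in names if n.startswith(REPARSED_PREFIX))
    unpaired = []
    for name in sorted(names):
        if not name.startswith(ORIGINAL_PREFIX) or name.startswith(REPARSED_PREFIX):
            continue
        # Assumed pairing rule: same suffix, with "reparsed_" inserted.
        sibling = REPARSED_PREFIX + name[len(ORIGINAL_PREFIX):]
        if sibling not in names:
            unpaired.append(name)
    return reparsed, unpaired


# Hypothetical listing, for illustration only.
listing = [
    "_summary_example-a.json",
    "_summary_reparsed_example-a.json",
    "_summary_example-b.json",
]
reparsed, unpaired = citable_summaries(listing)
print(reparsed)   # ['_summary_reparsed_example-a.json']
print(unpaired)   # ['_summary_example-b.json']
```

Any name returned in `unpaired` would need a manual check against the experiments repo before its numbers are used.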

Research Question

How does fine-tuning Tiny Aya on non-English code — whether transpiled, mixed-native, or fully translated — affect its multilingual reasoning and instruction-following, and how does that impact differ from fine-tuning on English code?

The hypothesis is not that non-English code matches or exceeds English code as a generic reasoning aid — rather, that the kind of effect non-English code produces depends on the target language, the data structure, and how the corpus was constructed. See legesher/language-decoded-experiments for the full project context.

Base Model

All adapters are trained on CohereLabs/tiny-aya-base (3.35B parameters). Tiny Aya was chosen because it is small (deployable on a single 16 GB T4 GPU via QLoRA), accessible (Apache 2.0-licensed), and supports 70+ languages with explicit emphasis on lower-resourced ones, which is what makes the experimental ladder viable for Urdu (ur) at all.

Adapter Inventory

This repo holds adapters from two generations of the project, kept side by side and clearly separated by folder. See the Provenance & Manifest section for a complete path → phase → source-corpus map, and MANIFEST.md for the machine-readable version.

Paper adapters (Phase 3 · The Stack v2-dedup) — live under the tiny-aya-base/ prefix. These are the adapters cited in the submitted paper; cond-1, cond-2, and cond-5 were re-trained from scratch on the cleaner bigcode/the-stack-v2-dedup corpus.
Preliminary adapters (Phase 2 · The Stack v1) — live as flat top-level folders (condition-1-en-32k/, condition-2-zh-5k/, …). These are the original March-2026 hackathon adapters trained on bigcode/the-stack (v1, non-dedup), retained for reproducibility. Do not cite these for the paper.

Paper adapters — Phase 3 · The Stack v2-dedup

Each subdirectory under tiny-aya-base/ is one trained condition × file-volume × seed combination. All adapters share the QLoRA hyperparameters listed under Training Details.

Subdirectory (under `tiny-aya-base/`)	Condition	Training data	Seeds
`tiny-aya-base/condition-1-en-5k-seed{42,123,456}/`	1	Raw English Python from `bigcode/the-stack-v2-dedup` (5k file subset)	42, 123, 456
`tiny-aya-base/condition-1-en-20k-seed42/`	1	Raw English Python (20k file subset)	42
`tiny-aya-base/condition-2-{zh,es,ur}-5k-seed{42,123,456}/`	2	The same 5k subset as cond-1, processed through Legesher v0.7.3 — Python's reserved words (keywords, exceptions, built-in functions, numerical system for some target languages) translated to the target language; user logic preserved	42, 123, 456
`tiny-aya-base/condition-2-{zh,es,ur}-20k-seed42/`	2	The same 20k subset as cond-1, processed through Legesher v0.7.3	42
`tiny-aya-base/condition-3-zh-5k-native-code-seed42/`	3	Community-collected raw Chinese code from varied online public-source repositories (different source-file population from cond-1/2/5 by design)	42
`tiny-aya-base/condition-5-{zh,es,ur}-5k-c4ai-aya-expanse-32b-seed42/`	5	The same 5k subset as cond-1, first transpiled by Legesher v0.7.3 to translate Python's reserved words, then run through `c4ai-aya-expanse-32b` via the Cohere API to translate the remaining content (identifiers, comments, docstrings, string literals)	42

Condition 4 ("Community-Contributed Native Code") is pending sufficient direct community contributions to the legesher/legesher-native-code HF Space; no cond-4 adapter exists yet.
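
The naming pattern in the table above is regular enough to generate programmatically, which helps avoid typos when looping over conditions and seeds. The following is a minimal sketch: `paper_subfolder` and the lookup tables are hypothetical helpers that only encode the rows listed above, and MANIFEST.md remains the authoritative map.

```python
PREFIX = "tiny-aya-base"

# Languages and seeds per (condition, tier), copied from the table above.
PAPER_ADAPTERS = {
    (1, "5k"): {"langs": ("en",), "seeds": (42, 123, 456)},
    (1, "20k"): {"langs": ("en",), "seeds": (42,)},
    (2, "5k"): {"langs": ("zh", "es", "ur"), "seeds": (42, 123, 456)},
    (2, "20k"): {"langs": ("zh", "es", "ur"), "seeds": (42,)},
    (3, "5k"): {"langs": ("zh",), "seeds": (42,)},
    (5, "5k"): {"langs": ("zh", "es", "ur"), "seeds": (42,)},
}

# Extra name segment that conditions 3 and 5 carry before the seed.
SUFFIX = {3: "-native-code", 5: "-c4ai-aya-expanse-32b"}


def paper_subfolder(condition, lang, tier="5k", seed=42):
    """Return the subfolder for a Phase 3 adapter, or raise if none is listed."""
    entry = PAPER_ADAPTERS.get((condition, tier))
    if entry is None or lang not in entry["langs"] or seed not in entry["seeds"]:
        raise ValueError(
            f"no paper adapter listed for cond-{condition} {lang} {tier} seed {seed}"
        )
    extra = SUFFIX.get(condition, "")
    return f"{PREFIX}/condition-{condition}-{lang}-{tier}{extra}-seed{seed}"


print(paper_subfolder(2, "zh", seed=123))
# tiny-aya-base/condition-2-zh-5k-seed123
print(paper_subfolder(5, "ur"))
# tiny-aya-base/condition-5-ur-5k-c4ai-aya-expanse-32b-seed42
```

Combinations that the table does not list, such as a 20k adapter with seed 123, raise a `ValueError` instead of producing a path that does not exist.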

Preliminary adapters — Phase 2 · The Stack v1

These flat top-level folders are the original hackathon adapters, trained on bigcode/the-stack (v1, non-dedup) with Legesher v0.5.1 / v0.6.0. They are superseded by the tiny-aya-base/ Phase 3 adapters above and are kept only for reproducibility of the preliminary results. The 32k size and the single-seed setup are Phase 2 signatures.

Subdirectory (top level)	Condition	Source corpus	Notes
`condition-1-en-32k/`	1	`bigcode/the-stack` (v1)	Phase 2 32k tier; no Phase 3 equivalent
`condition-1-en-5k/`	1	`bigcode/the-stack` (v1)	Preliminary; use `tiny-aya-base/condition-1-en-5k-seed42/` for the paper
`condition-2-es-5k/`	2	`bigcode/the-stack` (v1), Legesher transpiled	Preliminary
`condition-2-ur-5k/`	2	`bigcode/the-stack` (v1), Legesher transpiled	Preliminary
`condition-2-zh-5k/`	2	`bigcode/the-stack` (v1), Legesher transpiled	Preliminary
`condition-3-zh-5k/`	3	Community-collected raw Chinese code	Preliminary; corpus unchanged across phases

The standalone per-adapter repos that previously published these Phase 2 / v1 adapters (legesher/language-decoded-lora-condition-*) have been renamed to legesher/language-decoded-lora-phase-2-the-stack-v1-condition-* and deprecated in favor of this umbrella repo. Their old URLs continue to resolve via Hugging Face redirects.

Source-file control

Cond-1, cond-2, and cond-5 all train on the same 5,000-file subset drawn from bigcode/the-stack-v2-dedup (with a parallel 20k subset for the 20k tier). Differences across these conditions reflect the processing pipeline (raw / transpiled / fully translated), not file-quality or content drift. Cond-3 is the deliberate exception — its source files are a different population by design.

The experimental ladder

Baseline → cond-1: Does code help at all? (Replicates Aryabumi et al., 2024.)
Cond-1 → cond-2: Does translating Python's reserved words (keywords, exceptions, built-in functions, numerical system for some target languages) into the target language change the model's behavior? User logic and library calls remain English-derived.
Cond-2 → cond-3: Does code pulled from real-world public-source repositories — code humans actually wrote in or with the target language — add value beyond Legesher's mechanical translation?
Cond-2 → cond-5: Cond-2 translates only Python's reserved words; cond-5 goes further by translating the rest of the file's content (identifiers, comments, docstrings, string literals) via c4ai-aya-expanse-32b. Logic and structure are preserved.
Cond-3 → cond-5 (implicit): Human-authored vs. machine-synthesized native code.

For the full ladder including future directions (natural-language text control, combined-language training, similar-script evaluation), see legesher/language-decoded-experiments.

Provenance & Manifest

The two adapter generations are distinguished by folder location and source corpus, matching the convention used across the project's repos (phase-2-the-stack-v1-* on language-decoded-data, phase2/ and phase3/ on language-decoded-experiments):

Generation	Location in this repo	Source corpus	Legesher	Tier / seeds	Cite for paper?
Phase 3 (paper)	`tiny-aya-base/…-seed*/`	`bigcode/the-stack-v2-dedup`	v0.7.3	5k (3 seeds) + 20k (1 seed)	✅ Yes
Phase 2 (preliminary)	flat top-level `condition-*/`	`bigcode/the-stack` (v1)	v0.5.1 / v0.6.0	5k / 32k (1 seed)	❌ No

A complete, machine-readable path → phase → corpus → condition map is in MANIFEST.md. Training-data provenance for each condition is detailed on language-decoded-data; the phase comparison is in the "Phase 2 → Phase 3 at a glance" table on the experiments repo.
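
Because the two generations differ only by folder location, a path can be classified with a simple prefix check before it is cited or loaded. The helper below is a hypothetical sketch of that convention, not a replacement for MANIFEST.md.

```python
def adapter_generation(subfolder):
    """Classify an adapter path using the folder convention in the table above."""
    path = subfolder.strip("/")
    # Phase 3 adapters sit under tiny-aya-base/ and carry a seed suffix.
    if path.startswith("tiny-aya-base/condition-") and "-seed" in path:
        return {"phase": 3, "cite_for_paper": True}
    # Phase 2 adapters are flat top-level condition-* folders.
    if "/" not in path and path.startswith("condition-"):
        return {"phase": 2, "cite_for_paper": False}
    raise ValueError(f"unrecognized adapter path: {subfolder!r}")


print(adapter_generation("tiny-aya-base/condition-2-zh-5k-seed42"))
# {'phase': 3, 'cite_for_paper': True}
print(adapter_generation("condition-2-zh-5k/"))
# {'phase': 2, 'cite_for_paper': False}
```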

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained("CohereLabs/tiny-aya-base")
tokenizer = AutoTokenizer.from_pretrained("CohereLabs/tiny-aya-base")

# Load a paper (Phase 3 · Stack v2-dedup) adapter — e.g., cond-1 (English code, seed 42, 5k tier).
# Paper adapters live under the `tiny-aya-base/` prefix.
model = PeftModel.from_pretrained(
    base_model,
    "legesher/language-decoded-lora",
    subfolder="tiny-aya-base/condition-1-en-5k-seed42",
)

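# Or a language-specific cond-2 adapter (Chinese reserved-word translation, seed 42).
# Each PeftModel.from_pretrained call here is an alternative: reload base_model before
# switching adapters, because PEFT injects the LoRA layers into it in place.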
model = PeftModel.from_pretrained(
    base_model,
    "legesher/language-decoded-lora",
    subfolder="tiny-aya-base/condition-2-zh-5k-seed42",
)

# Or a cond-5 adapter (Synthesized Native Code, Urdu, seed 42)
model = PeftModel.from_pretrained(
    base_model,
    "legesher/language-decoded-lora",
    subfolder="tiny-aya-base/condition-5-ur-5k-c4ai-aya-expanse-32b-seed42",
)

# To load a *preliminary* Phase 2 / Stack v1 adapter instead, use the flat top-level
# folder (no `tiny-aya-base/` prefix) — e.g. the original cond-2 Chinese hackathon adapter:
model = PeftModel.from_pretrained(
    base_model,
    "legesher/language-decoded-lora",
    subfolder="condition-2-zh-5k",
)
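
The snippets above stop once the adapter is attached. A possible next step is sketched below: it loads the base model in 4-bit NF4 (mirroring the quantization listed under Training Details) and generates a short completion. This is an assumption-laden sketch: `load_adapter` and `generate` are hypothetical helper names, the float16 compute dtype is a guess suited to a T4, and 4-bit loading assumes a CUDA GPU with `bitsandbytes` and `accelerate` installed. Heavy imports are kept inside the functions so the module can be imported without them.

```python
BASE_ID = "CohereLabs/tiny-aya-base"
ADAPTER_REPO = "legesher/language-decoded-lora"


def load_adapter(subfolder, four_bit=True):
    """Load a fresh base model and attach one adapter from this repo."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant = None
    if four_bit:
        # Assumed inference settings; NF4 matches the training quantization.
        quant = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,
        )
    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_ID, quantization_config=quant, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, ADAPTER_REPO, subfolder=subfolder)
    model.eval()
    return model, tokenizer


def generate(model, tokenizer, prompt, max_new_tokens=64):
    """Return only the newly generated text for a prompt."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `load_adapter("tiny-aya-base/condition-1-en-5k-seed42")` followed by `generate(model, tokenizer, "def add(a, b):")` would exercise the cond-1 paper adapter. Since tiny-aya-base is a base model, plain-text prompts are used here rather than a chat template.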

Training Details

Parameter	Value
Base model	CohereLabs/tiny-aya-base (3.35B params, 70+ languages, low-resource emphasis)
Method	QLoRA 4-bit (NF4), ~5.4 GB VRAM, Unsloth-accelerated
Hardware	Kaggle T4 (16 GB)
Tokenizer	`CohereLabs/tiny-aya-base`
Transpilation tool	Legesher v0.7.3 (Phase 3); v0.5.1 / v0.6.0 used in Phase 2
Cond-5 translation	`c4ai-aya-expanse-32b` accessed via the Cohere API (made possible by Cohere credits awarded to Legesher)
Training data	legesher/language-decoded-data

QLoRA hyperparameters

Parameter	Value
LoRA rank (`r`)	16
LoRA alpha	32
LoRA dropout	0.0
Target modules	q_proj, k_proj, v_proj, o_proj, up_proj, down_proj, gate_proj
Bias	none
Task type	CAUSAL_LM
PEFT version	0.18.1
Quantization	NF4 (4-bit) via Unsloth
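
For readers who want to reproduce a comparable setup, the table above maps directly onto the arguments of PEFT's `LoraConfig`. The sketch below is an illustration only: the adapters were trained with Unsloth, so the exact call used by the authors may differ, and `build_lora_config` and `lora_scaling` are hypothetical helper names. The scaling helper assumes standard (not rank-stabilized) LoRA, where the low-rank update is multiplied by `lora_alpha / r`.

```python
# Values copied from the hyperparameter table above.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def lora_scaling(kwargs=LORA_KWARGS):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return kwargs["lora_alpha"] / kwargs["r"]


def build_lora_config(kwargs=LORA_KWARGS):
    """Build a peft.LoraConfig from the table values (requires peft)."""
    from peft import LoraConfig

    return LoraConfig(**kwargs)


print(lora_scaling())  # 2.0
```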

Evaluation

Phase 3 models are evaluated on four multilingual benchmarks under template1 (English-prompt) and template2 (native-prompt) across the full data_lang × instr_lang matrix:

Benchmark	What it measures	Examples per language
XNLI	Natural-language inference	~5,000
X-CSQA	Commonsense reasoning	~1,000
SIB-200	Topic classification	~204
Belebele	Reading comprehension	~900

MGSM was used in Phase 2 and dropped from Phase 3: at 3.35B parameters and 250 examples per language, scores ranged from 2.8% to 10.8% across all conditions, with most condition-to-condition differences within noise. This is a useful null result; the evaluation budget was reallocated to SIB-200 and Belebele.
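
A rough calculation shows why differences at this scale are hard to separate from noise. The sketch below treats each accuracy as a binomial proportion over 250 independent examples and treats the two conditions being compared as independent samples. Both are simplifying assumptions made for illustration, not the analysis used in the paper.

```python
import math


def binomial_se(accuracy, n):
    """Standard error of an accuracy estimated from n independent examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


def diff_ci95(acc_a, acc_b, n):
    """Half-width of an approximate 95% interval for acc_a - acc_b."""
    se_diff = math.sqrt(binomial_se(acc_a, n) ** 2 + binomial_se(acc_b, n) ** 2)
    return 1.96 * se_diff


N = 250  # MGSM examples per language
for acc in (0.028, 0.108):
    print(f"accuracy={acc:.1%}  SE={binomial_se(acc, N):.2%}")
# accuracy=2.8%  SE=1.04%
# accuracy=10.8%  SE=1.96%
```

Under these assumptions, two conditions that both score near 10.8% would need to differ by about 5.4 percentage points (`diff_ci95(0.108, 0.108, 250)`) before the gap clears a 95% interval, which is more than half of the entire observed score range.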

Paper-grade evaluation results live on legesher/language-decoded-experiments — see the refined-tables and the writeup at expedition-tiny-aya/analysis/phase-3/phase3-refined-evaluation.md.

Limitations

Single base model: All adapters are trained on CohereLabs/tiny-aya-base (3.35B params). Results may not generalize to larger or architecturally different models. Future iterations will expand to additional base models.
Per-language fine-tuning only: Every condition is per-language — each cond-2-{zh,es,ur}-5k (and cond-5-{zh,es,ur}-5k) is a separate training run. Combined-language training is a planned future condition.
Limited training data: The 5k and 20k file tiers are constrained by Kaggle T4 hardware limits. 103k variants exist on the training data repo, but no 103k adapters have been trained yet.
Consumer hardware: Training on a Kaggle T4 (16 GB) with 4-bit quantization introduces quantization error that may affect adapter quality compared to full-precision training.
Extractor coverage: When citing Phase 3 results, use the refined-extractor scores. See the banner at the top of this card and the experiments repo for full details.

Related Resources

Experiment tracking and results: legesher/language-decoded-experiments (canonical project source-of-truth)
Training data: legesher/language-decoded-data
Community native code: legesher/language-decoded-community
Cond-4 contribution interface: legesher/legesher-native-code HF Space
Transpilation tool: Legesher on GitHub

Citation

@misc{language-decoded-2026,
  title={Language Decoded: Exploring the Impact of Native Code on Multilingual Models},
  author={Madison Edgar and Saad Ahmed Bazaz and Tom Sherborne and Rashik Shahjahan and Khojasteh Mirza and Sarah Jawaid and Rafay Mustafa and Sohaib Ahmed Bazaz},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/legesher/language-decoded-lora}
}

License

Apache 2.0

Paper for legesher/language-decoded-lora

To Code, or Not To Code? Exploring Impact of Code in Pre-training (arXiv 2408.10914, published Aug 20, 2024)