Kkk (Michalea)

AI & ML interests: None yet

Recent Activity:
- New activity 5 days ago in BLR2/Qwen3.5-9B-Eagle3-ShareGPT: "MTP vs Eagle3"
- New activity 13 days ago in GadflyII/GLM-4.7-Flash-MTP-NVFP4: "SGLang and MTP"

Organizations: None yet
MTP vs Eagle3 · 1 · #1 opened 5 days ago by Michalea
SGLang and MTP · 1 · #2 opened 13 days ago by Michalea
RoPE theta 5M instead of 1M · #15 opened 22 days ago by Michalea
The comparison with the original MTP · 👍 1 · 1 · #2 opened 30 days ago by Michalea
Context length / number of generated tokens during training · #4 opened about 1 month ago by Michalea
Datasets used to create this head · #5 opened about 1 month ago by Michalea
A low number of evaluation benchmarks · 2 · #5 opened about 2 months ago by Michalea
Context length and regeneration · 4 · #1 opened about 1 month ago by Michalea
MTP quality, 47 layers · 3 · #7 opened about 2 months ago by Michalea
Efficiency of NVFP4 vs FP16/8 · ➕ 3 · #4 opened about 2 months ago by Michalea
Description of version 3.0 · 2 · #1 opened 2 months ago by jacek2024
Evaluation · 👍 4 · #3 opened about 2 months ago by Michalea
Severe looping/repetitive output when using --kv-cache-dtype fp8 with GLM-4.7-Flash-FP8-Dynamic on vLLM · 4 · #2 opened about 2 months ago by ShelterW
Inconsistent description with the evaluation results · 1 · #3 opened 2 months ago by Michalea
Data used to train the EAGLE head · #5 opened 2 months ago by Michalea
Training details: question about reasoning | /think · 1 · #1 opened 2 months ago by Michalea
Does the tokenizer need to be updated for this model? · 1 · #5 opened 9 months ago by electroglyph