Sky610TX
Model Details
- Architecture: GPT-2 Style (Custom Ascendant Config)
- Parameters: ~389 Million
- Training tokens: 1.3 Billion
- Context Window: 1024 Tokens
- Training iterations: 50k
This is my first LLM trained from scratch. It's not good, but it works. Because it is so under-trained, it has trouble with longer words and with words it doesn't know, and it tends to break them into fragments.
Example: if you ask it who Albert Einstein is, it may respond with "Al b ert ein ste in is " and so on.
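To put "under-trained" into rough numbers, here is a small back-of-the-envelope sketch using only the figures from the model details. Two things in it are assumptions, not facts about this training run: the commonly cited "Chinchilla" rule of thumb of about 20 training tokens per parameter, and the idea that the 1.3 billion tokens were spread evenly across the 50k iterations.

```python
# Figures taken from the model details above.
PARAMETERS = 389_000_000          # ~389 million
TRAINING_TOKENS = 1_300_000_000   # 1.3 billion
ITERATIONS = 50_000
CONTEXT_WINDOW = 1024

# Assumption: the widely quoted heuristic of ~20 training tokens per parameter.
HEURISTIC_TOKENS_PER_PARAM = 20

# How many training tokens the model saw per parameter.
tokens_per_param = TRAINING_TOKENS / PARAMETERS

# Token budget the heuristic would suggest for a model of this size.
heuristic_budget = PARAMETERS * HEURISTIC_TOKENS_PER_PARAM
fraction_of_budget = TRAINING_TOKENS / heuristic_budget

# Assumption: tokens were spread evenly over all iterations.
tokens_per_iteration = TRAINING_TOKENS / ITERATIONS
sequences_per_iteration = tokens_per_iteration / CONTEXT_WINDOW

print(f"Tokens per parameter:     {tokens_per_param:.2f}")
print(f"Heuristic token budget:   {heuristic_budget:,}")
print(f"Fraction of that budget:  {fraction_of_budget:.1%}")
print(f"Tokens per iteration:     {tokens_per_iteration:,.0f}")
print(f"Full-length sequences per iteration: {sequences_per_iteration:.1f}")
```

Under these assumptions the model saw a bit over 3 tokens per parameter, well short of the heuristic, which is consistent with the fragmented output described above.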
How to Use
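from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the model weights and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("8BitStudio/Sky610TX")
tokenizer = AutoTokenizer.from_pretrained("8BitStudio/Sky610TX")

# Prompts follow a simple "User: ... / Assistant:" chat format.
input_text = "User: Hello\nAssistant:"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate up to 50 new tokens; the decoded text includes the prompt.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0]))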
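The snippet above prints the prompt together with the completion. If you only want the assistant's reply, a small wrapper like the following may help. This is a sketch, not part of the model: `build_prompt`, `extract_reply`, and `chat` are hypothetical helper names, and cutting the output at the next "User:" line assumes the model may continue writing extra turns on its own.

```python
def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt in the User/Assistant style shown above."""
    return f"User: {user_message.strip()}\nAssistant:"


def extract_reply(decoded: str, prompt: str) -> str:
    """Return only the assistant's first reply from the decoded output."""
    # Drop the echoed prompt if the decoded text starts with it.
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    # Assumption: the model may keep going and invent a new "User:" turn; cut there.
    reply = reply.split("\nUser:", 1)[0]
    return reply.strip()


def chat(model, tokenizer, user_message: str, max_new_tokens: int = 50) -> str:
    """Generate a reply using a model and tokenizer loaded as shown above."""
    prompt = build_prompt(user_message)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # skip_special_tokens hides markers such as the end-of-text token.
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return extract_reply(decoded, prompt)
```

With `model` and `tokenizer` loaded as in the usage snippet, `chat(model, tokenizer, "Hello")` would return just the generated reply text.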