Instructions to use bigcode/santacoder with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use bigcode/santacoder with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="bigcode/santacoder", trust_remote_code=True)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigcode/santacoder", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("bigcode/santacoder", trust_remote_code=True)
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use bigcode/santacoder with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "bigcode/santacoder"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "bigcode/santacoder",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker

```shell
docker model run hf.co/bigcode/santacoder
```
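The curl request above can also be issued from Python. Below is a minimal sketch using only the standard library, assuming the vLLM server started by the commands above is listening on `localhost:8000`; the `completion_payload` and `complete` helpers are my own names, not part of vLLM.

```python
import json
import urllib.request


def completion_payload(prompt: str, model: str = "bigcode/santacoder",
                       max_tokens: int = 512, temperature: float = 0.5) -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/completions call."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to the server and return the first completion's text."""
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(completion_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]


# Requires a running server:
# text = complete("Once upon a time,")
```

The same client works unchanged against the SGLang server below; only the port (30000 instead of 8000) differs, since both expose the OpenAI-compatible completions API.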
- SGLang
How to use bigcode/santacoder with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "bigcode/santacoder" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "bigcode/santacoder",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "bigcode/santacoder" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "bigcode/santacoder",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

- Docker Model Runner
How to use bigcode/santacoder with Docker Model Runner:
```shell
docker model run hf.co/bigcode/santacoder
```
Token inconsistency with StarCoder: fim_ or fim-
This model's special tokens start with "fim-", while the StarCoder model's tokens start with "fim_". The VSCode client works with StarCoder by default, so it uses the "fim_" tokens. This breaks SantaCoder when the VSCode endpoint is switched to it: the "fim_..." tokens are parsed as plain text, and the model occasionally adds them to its output.
Workaround: change the token names from "fim_" to "fim-" in the VSCode extension settings when using SantaCoder.
Proposal: change "fim-" to "fim_" for this model.
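To make the one-character mismatch concrete, here is a sketch of how a fill-in-the-middle prompt is assembled for each spelling, using the token names quoted in this thread; the `build_fim_prompt` helper is my own illustration, not an API of either model.

```python
def build_fim_prompt(prefix: str, suffix: str, style: str = "santacoder") -> str:
    """Assemble a fill-in-the-middle prompt.

    SantaCoder spells its FIM tokens with a hyphen (<fim-prefix>, ...),
    while StarCoder spells them with an underscore (<fim_prefix>, ...).
    """
    sep = "-" if style == "santacoder" else "_"
    return (f"<fim{sep}prefix>{prefix}"
            f"<fim{sep}suffix>{suffix}"
            f"<fim{sep}middle>")


# The two prompts differ only by one character per token, which is easy to miss:
print(build_fim_prompt("def add(a, b):\n    return ", "\n", "santacoder"))
print(build_fim_prompt("def add(a, b):\n    return ", "\n", "starcoder"))
```

If a client sends the "fim_" spelling to SantaCoder, the tokenizer sees ordinary text rather than the special tokens, which is exactly the misbehavior described above.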
Hello @yuryya, are you certain you have configured the following settings to the right values?
If so, please open an issue in https://github.com/huggingface/llm-vscode with the details of your problem.
Hello! Sure, I mentioned that approach as the "workaround" in my proposal.
The problem is that the workaround is not obvious. Since StarCoder and SantaCoder are from the same vendor and built for the same task, there is no apparent reason to check the config again. Moreover, a difference like <fim_prefix> vs <fim-prefix> is too hard for a human to notice, and the error does not manifest every time.
Yes, the problem can be solved by adding a separate template for SantaCoder in https://github.com/huggingface/llm-vscode. It will work for default configurations, even though the model interfaces will remain different. That is better than nothing, so I will create a PR when I have time.
Maybe we could also add a note to the README like: "this model uses different tokens compared to StarCoder (fim- instead of fim_), so be careful when migrating between them".
Hello, both models are by BigCode, but they are not the same family of models; e.g. all StarCoder variants (15B, 7B, 3B, ...) share the same FIM tokens. But I added the note you suggested to the "How to use FIM" section in the readme: https://huggingface.co/bigcode/santacoder/discussions/42.
Oh man... I am one human who totally missed the _ vs -. I wish they used the same token style.