I'm getting this error when loading the 560m model:
Downloading model.safetensors: 100%|██████████| 1.12G/1.12G [00:13<00:00, 80.0MB/s]
Unexpected exception formatting exception. Falling back to standard exception
Traceback (most recent call last):
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3442, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/tmp/ipykernel_1800858/148860320.py", line 21, in <module>
query_model_pipelines.append(pipeline("text-generation",
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/transformers/pipelines/__init__.py", line 779, in pipeline
if torch_dtype is not None:
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/transformers/pipelines/base.py", line 262, in infer_framework_load_model
"Model might be a PyTorch model (ending with .bin) but PyTorch is not available. "
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2560, in from_pretrained
resolved_archive_file = cached_file(
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 429, in load_state_dict
for shard_file in shard_files:
NameError: name 'safe_open' is not defined
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 2057, in showtraceback
stb = self.InteractiveTB.structured_traceback(
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1118, in structured_traceback
return FormattedTB.structured_traceback(
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/ultratb.py", line 1012, in structured_traceback
return VerboseTB.structured_traceback(
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/ultratb.py", line 865, in structured_traceback
formatted_exception = self.format_exception_as_a_whole(etype, evalue, etb, number_of_lines_of_context,
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/ultratb.py", line 818, in format_exception_as_a_whole
frames.append(self.format_record(r))
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/IPython/core/ultratb.py", line 736, in format_record
result += ''.join(_format_traceback_lines(frame_info.lines, Colors, self.has_colors, lvals))
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/stack_data/utils.py", line 145, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/stack_data/core.py", line 698, in lines
pieces = self.included_pieces
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/stack_data/utils.py", line 145, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/stack_data/core.py", line 649, in included_pieces
pos = scope_pieces.index(self.executing_piece)
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/stack_data/utils.py", line 145, in cached_property_wrapper
value = obj.__dict__[self.func.__name__] = self.func(obj)
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/stack_data/core.py", line 628, in executing_piece
return only(
File "/home/suryahari/miniconda3/envs/vornoi/lib/python3.10/site-packages/executing/executing.py", line 164, in only
raise NotOneValueFound('Expected one value, found 0')
executing.executing.NotOneValueFound: Expected one value, found 0
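A possible first check, assuming the `NameError: name 'safe_open' is not defined` comes from the `safetensors` package not being importable when `transformers` was first imported in this kernel. That cause is a guess from the traceback and is not confirmed here. The helper name `safetensors_status` is made up for illustration; it uses only the standard library.

```python
import importlib.util
import sys


def safetensors_status():
    """Report whether safetensors is installed and whether this process has imported it."""
    # find_spec returns None when a top-level package cannot be found.
    installed = importlib.util.find_spec("safetensors") is not None
    # sys.modules only lists modules that were actually imported in this process.
    imported = "safetensors" in sys.modules
    return {"installed": installed, "imported": imported}


status = safetensors_status()
print(status)

if not status["installed"]:
    # Assumption: installing the package addresses the missing name.
    print("safetensors is not installed here; try `pip install safetensors`.")
else:
    # Assumption: a package installed after transformers was imported is only
    # picked up once the kernel is restarted.
    print("safetensors is installed; if it was added mid-session, restart the kernel.")
```

If the package was installed or upgraded while the notebook was running, the mismatched source lines in the traceback would also fit that picture, so restarting the kernel before retrying seems worth a try.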