Instructions to use MayankLad31/invoice_schema with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use MayankLad31/invoice_schema with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="MayankLad31/invoice_schema",
    filename="inv.Q8_0.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Extract the data in JSON format using the schema: {\"invoice_id\": \"string\"}"},
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use MayankLad31/invoice_schema with llama.cpp:
Install from brew
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MayankLad31/invoice_schema:Q8_0

# Run inference directly in the terminal:
llama-cli -hf MayankLad31/invoice_schema:Q8_0
Install from WinGet (Windows)
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf MayankLad31/invoice_schema:Q8_0

# Run inference directly in the terminal:
llama-cli -hf MayankLad31/invoice_schema:Q8_0
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf MayankLad31/invoice_schema:Q8_0

# Run inference directly in the terminal:
./llama-cli -hf MayankLad31/invoice_schema:Q8_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf MayankLad31/invoice_schema:Q8_0

# Run inference directly in the terminal:
./build/bin/llama-cli -hf MayankLad31/invoice_schema:Q8_0
Use Docker
docker model run hf.co/MayankLad31/invoice_schema:Q8_0
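Whichever install route you choose, llama-server exposes an OpenAI-compatible HTTP endpoint (port 8080 by default). A minimal, hedged sketch of querying it from Python with only the standard library, assuming a server started with one of the commands above is running locally:

```python
# Hedged sketch: query a local llama-server through its OpenAI-compatible
# /v1/chat/completions endpoint. Assumes the default port 8080.
import json
from urllib import request

def build_payload(prompt: str) -> dict:
    """Assemble an OpenAI-style chat request body."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.1,
    }

def chat(prompt: str, base_url: str = "http://localhost:8080") -> str:
    """POST the prompt and return the assistant's reply text."""
    req = request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires the server to be running):
#   chat('Extract the data in JSON format using the schema: {"invoice_id": "string"}')
```

The same endpoint also accepts the official `openai` Python client, as shown further down in this card.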
- LM Studio
- Jan
- Ollama
How to use MayankLad31/invoice_schema with Ollama:
ollama run hf.co/MayankLad31/invoice_schema:Q8_0
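Ollama also serves a local REST API (port 11434 by default), so the pulled model can be called programmatically. A hedged sketch using only the Python standard library; the endpoint and field names follow Ollama's documented `/api/chat` interface, not anything specific to this model card:

```python
# Hedged sketch: call a locally running Ollama instance via its REST API.
# Assumes `ollama run hf.co/MayankLad31/invoice_schema:Q8_0` has pulled the model.
import json
from urllib import request

MODEL = "hf.co/MayankLad31/invoice_schema:Q8_0"

def ollama_chat_body(model: str, prompt: str) -> dict:
    """Non-streaming chat request body for Ollama's /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ollama_chat(prompt: str, host: str = "http://localhost:11434") -> str:
    """POST the prompt and return the reply text."""
    req = request.Request(
        host + "/api/chat",
        data=json.dumps(ollama_chat_body(MODEL, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]

# Example (requires Ollama to be running with the model pulled):
#   ollama_chat('Extract the data in JSON format using the schema: {"invoice_id": "string"}')
```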
- Unsloth Studio
How to use MayankLad31/invoice_schema with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for MayankLad31/invoice_schema to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for MayankLad31/invoice_schema to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for MayankLad31/invoice_schema to start chatting
- Pi
How to use MayankLad31/invoice_schema with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf MayankLad31/invoice_schema:Q8_0
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent

# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "MayankLad31/invoice_schema:Q8_0" }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi
- Hermes Agent
How to use MayankLad31/invoice_schema with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf MayankLad31/invoice_schema:Q8_0
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default MayankLad31/invoice_schema:Q8_0
Run Hermes
hermes
- Docker Model Runner
How to use MayankLad31/invoice_schema with Docker Model Runner:
docker model run hf.co/MayankLad31/invoice_schema:Q8_0
- Lemonade
How to use MayankLad31/invoice_schema with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull MayankLad31/invoice_schema:Q8_0
Run and chat with the model
lemonade run user.invoice_schema-Q8_0
List all available models
lemonade list
Model Description
Extracts invoice data into any user-specified schema.
Update: I fine-tuned qwen3.5 0.8B a bit.
We now have qwen_finetune.Q8_0.gguf and qwen_finetune.F16-mmproj.gguf, which you can use.
The base model was already decent and needed only a little fine-tuning. I used Unsloth.
Previously, I took a small 1.5B model fine-tuned with RL (GRPO on Qwen2.5-Coder) and trained it to extract structured JSON from OCR text based on any user-defined schema. You can find that model and its GGUF here (100% local). I'm not completely happy with the training and it still needs more work, but it works! You can ignore it now.
How to Get Started with the Model
Use qwen_finetune.Q8_0.gguf and qwen_finetune.F16-mmproj.gguf. Ignore other files.
Start the llama.cpp server (CPU-only):
llama-server \
-m qwen_finetune.Q8_0.gguf \
--mmproj qwen_finetune.F16-mmproj.gguf \
--host 0.0.0.0 \
--port 8000 \
--jinja \
--reasoning off \
-ngl 0 \
-t 4 \
-n 1024
You can then use the OpenAI SDK as follows; specify any schema you like for your invoice, as in the example below.
from openai import OpenAI
import base64

# Point the client at the local llama-server started above
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-no-key-required"
)

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

base64_image = encode_image("out.jpeg")

response = client.chat.completions.create(
    model="local-model",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": """Extract the data in JSON format using the schema: `{ "date": "string", "invoice_id": "string","all_items":[//list of items {"description":"string","quantity":"number","unit_price":"number","line_total":"number"}]}`""",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"},
                },
            ],
        }
    ],
    temperature=0.1,
    # 'min_p' is passed via 'extra_body' for OpenAI-compatible local servers
    extra_body={"min_p": 0.1},
)
print(response.choices[0].message.content)
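The reply arrives as plain text, and small local models sometimes wrap the JSON in code fences. A small helper for normalizing the reply before parsing (my own addition, not part of the original card):

````python
# Hedged helper: strip optional ```json fences from a model reply and
# parse it; raises json.JSONDecodeError if the payload still isn't JSON.
import json
import re

def parse_model_json(reply: str):
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", reply, re.DOTALL)
    payload = match.group(1) if match else reply
    return json.loads(payload)

print(parse_model_json('```json\n{"invoice_id": "INV1048"}\n```'))  # {'invoice_id': 'INV1048'}
````

If the model already emits bare JSON, the helper simply passes the reply through to `json.loads`.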
In the run below we asked: Extract the data in JSON format using the schema: `{ "date": "string", "invoice_id": "string","bill_to":"string" // name and address,"ship_to":"string","all_items":[//list of items {"description":"string","quantity":"number","unit_price":"number","line_total":"number"}],"total":"number"}`
Example response for the image below (a random invoice used only for testing; I am not its owner), produced with the code above:

{'date': 'August 20, 2006', 'invoice_id': 'INV1048', 'bill_to': 'C1003, Test Customer Two, 88 WILLIAM Square, Sydney 12345, Australia', 'ship_to': '', 'all_items': [{'description': 'Very long product description that occupies more than 1 line - in fact, it occupies 2 lines', 'quantity': 1, 'unit_price': 199.99, 'line_total': 199.99}, {'description': 'One line product description', 'quantity': 2, 'unit_price': 420.0, 'line_total': 840.0}], 'total': 1140.87}
Connect with me on LinkedIn if you have an interesting project:
https://www.linkedin.com/in/mayankladdha31/
Previous model (in case you still want to try it, though I would not recommend it):
inv.Q8_0.gguf.
Use it in combination with PaddleOCR. Define any schema and hopefully you get the JSON back. It needs some more work, but it still works!
from llama_cpp import Llama
from paddleocr import PaddleOCR

# OCR the invoice image and collect the recognized text lines
text = ""
ocr = PaddleOCR(use_angle_cls=True, lang="en")
result = ocr.ocr("test_image.jpg", cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        text = text + line[-1][0] + "\n"

llm = Llama(model_path="inv.Q8_0.gguf", n_ctx=2048)
import re

def extract_largest_json_block(text):
    pattern = r"```json\s*(.*?)\s*```"
    blocks = re.findall(pattern, text, re.DOTALL)
    if not blocks:
        return None
    return max(blocks, key=len)

def extract_xml_answer(text: str) -> str:
    answer = text.split("<answer>")[-1]
    answer = answer.split("</answer>")[0]
    return extract_largest_json_block(answer.strip())
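To see what these helpers do, here is a self-contained run on a made-up model reply (the helpers are repeated so the snippet runs on its own; the sample text is invented for illustration):

````python
import re

def extract_largest_json_block(text):
    # Same helper as above: pick the longest ```json ... ``` block.
    blocks = re.findall(r"```json\s*(.*?)\s*```", text, re.DOTALL)
    return max(blocks, key=len) if blocks else None

def extract_xml_answer(text: str):
    # Keep only what sits between the <answer> tags, then pull out the JSON.
    answer = text.split("<answer>")[-1].split("</answer>")[0]
    return extract_largest_json_block(answer.strip())

sample = (
    "<reasoning>\nThe invoice number is printed at the top.\n</reasoning>\n"
    "<answer>\n```json\n{\"invoice_no\": \"INV-001\"}\n```\n</answer>"
)
print(extract_xml_answer(sample))  # {"invoice_no": "INV-001"}
````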
messages = [
    {"role": "system", "content": """Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
```json
```
</answer>"""},
    {"role": "user", "content": f"{text}\n" + """
Extract the data in JSON format using the schema:
{
  "invoice_no": "string",
  "issued_to": {
    "name": "string",
    "address": "string" // Address of the client
  },
  "pay_to": {
    "bank_name": "string", // Name of the bank
    "name": "string", // Name
    "account_no": "number"
  },
  "items": [
    {
      "description": "string",
      "quantity": "number",
      "unit_price": "number",
      "total": "number"
    }
  ],
  "subtotal": "number",
  "total": "number"
} """},
]
output = llm.create_chat_completion(messages,max_tokens=1000)
print(extract_xml_answer(output['choices'][0]['message']['content']))
llm._sampler.close()
llm.close()
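Once you have the parsed dict, it can be worth checking it against the schema you asked for, since small models occasionally drop fields. A minimal sketch (my own addition, not part of the card) for the simple `"field": "type"` schema style used above:

```python
# Hedged sketch: recursively check a parsed dict against a schema written
# in the card's informal style, where leaf values are "string" or "number".
import json

TYPE_MAP = {"string": str, "number": (int, float)}

def matches_schema(data: dict, schema: dict) -> bool:
    for key, expected in schema.items():
        if key not in data:
            return False
        if isinstance(expected, dict):
            # Nested object: recurse.
            if not (isinstance(data[key], dict) and matches_schema(data[key], expected)):
                return False
        elif isinstance(expected, list):
            # List of objects: every item must match the element schema.
            if not (isinstance(data[key], list) and
                    all(matches_schema(item, expected[0]) for item in data[key])):
                return False
        elif not isinstance(data[key], TYPE_MAP[expected]):
            return False
    return True

doc = json.loads('{"invoice_no": "INV-001", "total": 42.5}')
print(matches_schema(doc, {"invoice_no": "string", "total": "number"}))  # True
```

This only validates presence and rough types; anything stricter (required vs. optional fields, formats) would call for a real JSON Schema validator.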