stock-recognizer-model

LoRA adapter for financial named entity recognition using GLiNER2 Large. Extracts stock ticker symbols and company names from social media and financial text.

Model Details

  • Base Model: fastino/gliner2-large-v1
  • Adapter Type: LoRA (r=32, α=64)
  • Task: Named Entity Recognition (Token Classification)
  • Training Data: 3,500+ annotated Reddit posts
  • Framework: PEFT 0.19.1

Overview

This adapter fine-tunes GLiNER2 Large to recognize two entity types in financial social media text:

  • ticker: Stock market symbols ($AAPL, TSLA, gme, etc.)
  • company: Corporate names (Apple Inc., Microsoft, Goldman Sachs, etc.)

The adapter was trained on Reddit posts (primarily r/wallstreetbets) and is optimized for informal financial discussion. It serves as the NER backbone for the stock-recognizer resolution engine.

Intended Use

Extract stock market entities from:

  • Financial social media (Twitter/X, Reddit, StockTwits)
  • Forum discussions and comments
  • User-generated financial content
  • News comments and reader discussions

The model handles ticker symbols in multiple forms: cashtags ($GME), uppercase (AMC), and informal lowercase (amc).
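
Downstream code often wants a single canonical form for the three surface forms listed above. The helper below is a minimal sketch of one way to do that; `normalize_ticker` is a hypothetical name and is not part of the adapter or the stock-recognizer engine.

```python
def normalize_ticker(mention: str) -> str:
    """Map a ticker mention such as "$gme" or "amc" to an uppercase symbol.

    Hypothetical post-processing helper: strips surrounding whitespace and
    any leading "$" (cashtag form), then uppercases the rest.
    """
    return mention.strip().lstrip("$").upper()


for mention in ["$GME", "AMC", "amc"]:
    print(mention, "->", normalize_ticker(mention))
```

Normalization of this kind only unifies spelling. It does not decide whether a lowercase word is actually a ticker, which is the model's job.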

Training Details

Data

  • 3,500+ manually annotated documents from r/wallstreetbets and related communities
  • Labeled in Label Studio
  • Train/validation split: 90/10 (stratified by task ID)
  • Chunking: 150-word windows with 40-word overlap
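
The chunking step can be sketched as a sliding window over whitespace-separated words. This is an illustration under stated assumptions, not the training code: the function name `chunk_words`, the whitespace tokenization, and the handling of the final short window are all assumptions, and remapping annotation offsets into each chunk is not shown.

```python
def chunk_words(text: str, window: int = 150, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows (hypothetical helper)."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    words = text.split()  # assumption: plain whitespace tokenization
    if not words:
        return []
    step = window - overlap  # 110 words between consecutive window starts
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

With these defaults, consecutive chunks share 40 words, so an entity that straddles one window boundary appears whole in the neighboring window.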

Hyperparameters

| Parameter | Value |
| --- | --- |
| LoRA Rank (r) | 32 |
| LoRA Alpha (α) | 64 |
| LoRA Dropout | 0.1 |
| Epochs | 10 (with early stopping) |
| Batch Size | 4 (gradient accumulation: 2) |
| Max Seq Length | 256 |
| Encoder LR | 2e-5 |
| Task LR | 5e-4 |
| Precision | bfloat16 |
| Target Modules | key_proj, value_proj, query_proj, dense |
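
Two derived quantities follow from these values. The short calculation below assumes the standard LoRA formulation, in which the low-rank update is scaled by α/r, and reads "gradient accumulation: 2" as two accumulation steps on a single device. Neither assumption is confirmed by this card.

```python
# Values copied from the hyperparameter list.
LORA_RANK = 32
LORA_ALPHA = 64
BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2  # assumption: "gradient accumulation: 2" means 2 steps

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = LORA_ALPHA / LORA_RANK

# Accumulating gradients multiplies the examples seen per optimizer step.
effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS

print(f"LoRA scaling factor: {lora_scaling}")
print(f"Effective batch size: {effective_batch_size}")
```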

Benchmark Results

Evaluated on a held-out test set (500+ documents) using set-based, deduplicated scoring:

| Metric | Score |
| --- | --- |
| Precision | 82% |
| Recall | 78% |
| F1 | 80% |

Scoring note: Set-based evaluation counts each entity type as "found" once per document, regardless of mention frequency. This reflects the engine's public API, which returns deduplicated sets of entities.
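
A set-based scorer along these lines could look like the sketch below. It is an illustration, not the evaluation script: representing entities as `(text, label)` pairs, exact string matching, and micro-averaging across documents are all assumptions, and `set_based_scores` is a hypothetical name.

```python
def set_based_scores(gold_docs, pred_docs):
    """Micro-averaged precision, recall, and F1 over per-document entity sets.

    Each document is an iterable of (text, label) pairs. Repeated mentions
    collapse into a single set member before counting.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_set, pred_set = set(gold), set(pred)  # deduplicate mentions
        tp += len(gold_set & pred_set)
        fp += len(pred_set - gold_set)
        fn += len(gold_set - pred_set)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gold = [
    [("GME", "ticker"), ("GME", "ticker"), ("Apple Inc.", "company")],
    [("TSLA", "ticker")],
]
pred = [
    [("GME", "ticker")],
    [("TSLA", "ticker"), ("AMC", "ticker")],
]
precision, recall, f1 = set_based_scores(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

The reported figures are internally consistent under the usual harmonic-mean definition: 2 · 0.82 · 0.78 / (0.82 + 0.78) ≈ 0.80.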

Usage

Load with GLiNER2

```python
from gliner2 import GLiNER2

# Load base model
model = GLiNER2.from_pretrained("fastino/gliner2-large-v1")

# Load the LoRA adapter
model.load_adapter("StephanAkkerman/stock-recognizer-model", revision="v18")

# Inference
text = "$GME is mooning but Apple Inc. might crash tomorrow"
entities = model.predict_entities(text, ["ticker", "company"])

for entity in entities:
    print(f"{entity['text']}: {entity['label']} (score: {entity['score']:.2f})")
```

Engine Integration

This adapter is automatically loaded by stock-recognizer when calling recognize_ai(). The engine handles entity extraction, resolution, and deduplication.
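
For readers who use the adapter without the engine, the deduplication step can be approximated as follows. This is a sketch, not the engine's implementation: `dedupe_entities` and the 0.5 score threshold are hypothetical, and the input is assumed to have the `text`, `label`, and `score` keys used in the loading example. Resolution of symbols to companies is not covered.

```python
from collections import defaultdict


def dedupe_entities(entities, min_score=0.5):
    """Group predicted mentions into one set of surface strings per label.

    Mentions scoring below min_score (an illustrative threshold) are dropped.
    """
    grouped = defaultdict(set)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].add(entity["text"])
    return dict(grouped)


mentions = [
    {"text": "$GME", "label": "ticker", "score": 0.97},
    {"text": "$GME", "label": "ticker", "score": 0.91},
    {"text": "Apple Inc.", "label": "company", "score": 0.88},
    {"text": "moon", "label": "ticker", "score": 0.20},
]
unique = dedupe_entities(mentions)
print(unique)
```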

Known Limitations

  • Social media bias: Trained on Reddit; performance on news, research, or formal text may differ
  • Boundary mismatches: Occasional off-by-one errors on multi-word entities
  • Rare tickers: Low-frequency emerging companies may be missed
  • Out-of-vocabulary names: Unseen company names may be mislabeled
  • No resolution: Extracts entities but does not resolve ambiguous symbols (e.g., AA → Alcoa or American Airlines)

License

Refer to the base model (fastino/gliner2-large-v1) for licensing terms. Training data is subject to Reddit's terms of service.

Citation

@misc{stock_recognizer_v18,
  author = {Akkerman, Stephan},
  title = {Stock Recognizer Model: LoRA Adapter for Financial NER},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/StephanAkkerman/stock-recognizer-model}},
  note = {Revision v18}
}

Repositories

  • Adapter Training: stock-recognizer-model
  • Engine / Inference: stock-recognizer
