Sentence Similarity
sentence-transformers
Safetensors
qwen2
feature-extraction
Generated from Trainer
dataset_size:921564
loss:CachedMultipleNegativesRankingLoss
custom_code
text-embeddings-inference
SentenceTransformer based on Alibaba-NLP/gte-Qwen2-1.5B-instruct
This is a sentence-transformers model finetuned from Alibaba-NLP/gte-Qwen2-1.5B-instruct. It maps sentences & paragraphs to a 4096-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-Qwen2-1.5B-instruct
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 4096 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: Qwen2Model
  (1): Pooling({'word_embedding_dimension': 4096, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
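The pooling configuration above enables only `pooling_mode_lasttoken`, and a `Normalize()` module follows it. The sketch below illustrates those two steps on toy data in plain Python. It is not the library's implementation: the function names `last_token_pool` and `l2_normalize` are hypothetical, and real token embeddings would come from the Qwen2 transformer rather than a hand-written list.

```python
import math


def last_token_pool(token_embeddings, attention_mask):
    """Return the embedding of the last non-padding token of one sequence.

    Hypothetical helper for illustration; positions with mask 0 are padding.
    """
    last_index = max(i for i, m in enumerate(attention_mask) if m == 1)
    return token_embeddings[last_index]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy sequence: four token vectors of dimension 2, the last one is padding.
tokens = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

sentence_embedding = l2_normalize(last_token_pool(tokens, mask))
print(sentence_embedding)
print(dot(sentence_embedding, sentence_embedding))
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity, which is consistent with the similarity function listed in the model description.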
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub; the model is tagged custom_code, so remote code must be trusted
model = SentenceTransformer("FINGU-AI/FingUv2", trust_remote_code=True)
# Run inference
sentences = [
"Hawaii airport and homes evacuated as fast-moving fire hits West Maui Child booed at Blackhawks game for saying he'll be Bears QB Mitchell Trubisky for Halloween Report: Jim Harbaugh exploring potential NFL return Climate change has finally caught up to this Alaska village MLB umpire Joe West suing former All-Star Paul Lo Duca for claiming he took bribes Cardinals sign former Pro Bowler Alfred Morris Report: Julio Jones spoke up in support of Dan Quinn after Falcons' latest loss 2 UConn Students Arrested After Shouting Racist Slur, Officials Say Newly Signed Raven Makes Comeback After Losing Job, Ring Boeing profit plunges as MAX grounding takes heavy toll Erin Andrews Rejects the Hate From a Twitter Troll Over Her 'DWTS' Outfit in the Best Way Cincinnati med student opens free health clinic for the uninsured China is willing to buy $20 billion worth of US farm goods Eric Tse, 24, just became a billionaire overnight Potential trade targets for all 32 NFL teams at the 2019 deadline, from A.J. Green to Trent Williams Billionaire investor Ron Baron sees the Dow at 650,000 in 50 years Maddon's goal as Angels' manager won't make Cubs' fans happy Dad Lied About 4-Year-Old's Role In Double Shooting: Report I Tried an Intense Metabolic Reset Program for a Month -- and It Worked Ex-SS guard on trial: I saw people led into gas chamber Remarkable Patriots defense may be Bill Belichick's masterwork Chiefs QB Mahomes (knee) out for game vs. Packers Jeff Bezos is set to lose his crown as world's richest person My SO and I Might Never Get Married, and I'm Totally Fine With That Orlando Scandrick rips Eagles: They have 'accountability issues' Opinion: Browns' Freddie Kitchens, Jets' Adam Gase could be headed for one-and-done territory Orioles' Chris Davis sets record with $3M donation to University of Maryland Children's Hospital Anthony Davis on possibly joining hometown Bulls: 'I mean, I am a free agent next year'",
"13 Reasons Why's Christian Navarro Slams Disney for Casting 'the White Guy' in The Little Mermaid",
"Opinion: Colin Kaepernick is about to get what he deserves: a chance Ford v Ferrari: the forgotten car at the heart of the Le Mans '66 clash I've been writing about tiny homes for a year and finally spent 2 nights in a 300-foot home to see what it's all about here's how it went The Kardashians Face Backlash Over 'Insensitive' Family Food Fight in KUWTK Clip 3 Indiana judges suspended after a night of drinking turned into a White Castle brawl Report: Police investigating woman's death after Redskins' player Montae Nicholson took her to hospital 66 Cool Tech Gifts Anyone Would Be Thrilled to Receive There's a place in the US where its been over 80 degrees since March Police find 26 children behind false wall at Colorado day care",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 4096]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
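For semantic search, the similarity scores are typically used to rank candidate documents against a query. The following is a minimal sketch of that ranking step in plain Python, using small hand-written vectors in place of real model output. The helper names `cosine_similarity` and `rank_documents` are hypothetical and are not part of sentence-transformers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, document_embeddings):
    """Return (document index, score) pairs, highest score first."""
    scored = [
        (index, cosine_similarity(query_embedding, doc))
        for index, doc in enumerate(document_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for 4096-dimensional embeddings.
query = [1.0, 0.0, 1.0]
documents = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 0.9],  # nearly parallel to the query
    [0.5, 0.5, 0.5],  # partially aligned
]

ranking = rank_documents(query, documents)
for index, score in ranking:
    print(index, round(score, 4))
```

With real data, `query` and `documents` would be rows of the array returned by `model.encode(...)`.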
Training Details
Training Logs
| Epoch | Step | Training Loss | reranking loss |
|---|---|---|---|
| 0.0347 | 500 | 0.4287 | 0.3681 |
| 0.0694 | 1000 | 0.3629 | 0.3409 |
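The model's tags list `CachedMultipleNegativesRankingLoss` as the training loss. As a rough illustration of what that kind of loss measures, the sketch below computes a multiple negatives ranking loss in plain Python: each anchor is scored against every positive in the batch, and cross-entropy rewards the anchor's own positive over the others. The cached variant is understood to compute the same quantity while chunking the batch to save memory; that part is not shown. The `scale` value of 20.0 and the toy vectors are assumptions, since the card does not state the settings used for this model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i].

    All other positives in the batch act as in-batch negatives.
    The scale value is an assumption for this sketch.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine_similarity(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two (anchor, positive) pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.1], [0.1, 1.0]]   # each positive is close to its anchor
swapped = [aligned[1], aligned[0]]   # positives paired with the wrong anchor

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_swapped = multiple_negatives_ranking_loss(anchors, swapped)
print(loss_aligned, loss_swapped)
```

A well-aligned batch gives a loss near zero, while mismatched pairs give a large loss, so a falling training loss suggests that matching pairs are moving closer together than non-matching ones.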
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.2.0+cu121
- Accelerate: 0.32.1
- Datasets: 2.20.0
- Tokenizers: 0.19.1