DisambertSingleSense-base

This model is a fine-tuned version of answerdotai/ModernBERT-base on the SemCor dataset. Per-epoch evaluation metrics (validation loss, precision, recall, F1, and Matthews correlation) are listed in the results table below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: inverse_sqrt
lr_scheduler_warmup_steps: 1000
num_epochs: 30
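
To make the scheduler settings above concrete, here is a minimal sketch of the learning rate they imply. It assumes the common form of an inverse square-root schedule: linear warmup from 0 to the base rate over the warmup steps, then decay proportional to 1/sqrt(step), with the decay timescale equal to the warmup length. The function name `inverse_sqrt_lr` is hypothetical, and the training library's actual implementation may differ in details.

```python
import math

BASE_LR = 1e-4        # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the list above


def inverse_sqrt_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Inverse square-root decay, equal to base_lr exactly at the end of warmup.
    return base_lr * math.sqrt(warmup_steps / step)


if __name__ == "__main__":
    # 14014 and 420420 are the step counts after epoch 1 and epoch 30.
    for step in (0, 500, 1000, 4000, 14014, 420420):
        print(f"step {step:>6}: lr = {inverse_sqrt_lr(step):.3e}")
```

Under this assumption, warmup finishes well inside the first epoch (1000 of 14014 steps), and the rate has already halved by step 4000.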

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Matthews
No log	0	0	11.6611	0.0	0.0	0.0	-0.0000
2.5218	1.0	14014	4.1247	0.5003	0.5245	0.5121	0.5243
1.7184	2.0	28028	3.8822	0.5656	0.5727	0.5692	0.5726
1.2533	3.0	42042	3.9284	0.5859	0.5907	0.5883	0.5905
0.9708	4.0	56056	4.0396	0.5868	0.5907	0.5888	0.5905
0.7932	5.0	70070	4.1447	0.5899	0.5968	0.5934	0.5966
0.6030	6.0	84084	4.1830	0.5932	0.6017	0.5974	0.6014
0.5155	7.0	98098	4.2383	0.6065	0.6082	0.6074	0.6080
0.4701	8.0	112112	4.2015	0.6014	0.6122	0.6068	0.6120
0.4166	9.0	126126	4.2186	0.6096	0.6131	0.6113	0.6128
0.3191	10.0	140140	4.3041	0.6076	0.6096	0.6086	0.6093
0.2979	11.0	154154	4.3275	0.6082	0.6104	0.6093	0.6102
0.2633	12.0	168168	4.3902	0.6171	0.6209	0.6190	0.6207
0.2061	13.0	182182	4.4546	0.6141	0.6196	0.6168	0.6194
0.1829	14.0	196196	4.3960	0.6134	0.6161	0.6147	0.6159
0.1793	15.0	210210	4.4565	0.6151	0.6196	0.6174	0.6194
0.1473	16.0	224224	4.4976	0.6165	0.6218	0.6192	0.6216
0.1631	17.0	238238	4.4916	0.6113	0.6179	0.6146	0.6177
0.1679	18.0	252252	4.5221	0.6114	0.6161	0.6137	0.6159
0.1567	19.0	266266	4.5560	0.6057	0.6166	0.6111	0.6164
0.1670	20.0	280280	4.6266	0.6127	0.6179	0.6153	0.6177
0.1817	21.0	294294	4.5746	0.6117	0.6196	0.6157	0.6194
0.1752	22.0	308308	4.6536	0.6131	0.6192	0.6161	0.6190
0.2083	23.0	322322	4.7661	0.6108	0.6192	0.6150	0.6190
0.1764	24.0	336336	4.7735	0.6105	0.6170	0.6137	0.6168
0.2072	25.0	350350	4.8155	0.6076	0.6157	0.6116	0.6155
0.1668	26.0	364364	4.7572	0.6025	0.6109	0.6067	0.6107
0.2046	27.0	378378	4.8226	0.6028	0.6113	0.6070	0.6111
0.2653	28.0	392392	4.8000	0.6032	0.6166	0.6098	0.6163
0.3166	29.0	406406	4.8968	0.6062	0.6174	0.6118	0.6172
0.3265	30.0	420420	4.9159	0.6058	0.6152	0.6105	0.6150
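
The table can be scanned for the strongest checkpoint with a few lines of Python. The sketch below copies the epoch, validation loss, and F1 columns from the table above; the names `ROWS`, `best_f1`, and `best_loss` are illustrative only. It is a reading aid for the reported numbers, not part of the training code.

```python
# (epoch, validation_loss, f1) copied from the results table above
ROWS = [
    (1, 4.1247, 0.5121), (2, 3.8822, 0.5692), (3, 3.9284, 0.5883),
    (4, 4.0396, 0.5888), (5, 4.1447, 0.5934), (6, 4.1830, 0.5974),
    (7, 4.2383, 0.6074), (8, 4.2015, 0.6068), (9, 4.2186, 0.6113),
    (10, 4.3041, 0.6086), (11, 4.3275, 0.6093), (12, 4.3902, 0.6190),
    (13, 4.4546, 0.6168), (14, 4.3960, 0.6147), (15, 4.4565, 0.6174),
    (16, 4.4976, 0.6192), (17, 4.4916, 0.6146), (18, 4.5221, 0.6137),
    (19, 4.5560, 0.6111), (20, 4.6266, 0.6153), (21, 4.5746, 0.6157),
    (22, 4.6536, 0.6161), (23, 4.7661, 0.6150), (24, 4.7735, 0.6137),
    (25, 4.8155, 0.6116), (26, 4.7572, 0.6067), (27, 4.8226, 0.6070),
    (28, 4.8000, 0.6098), (29, 4.8968, 0.6118), (30, 4.9159, 0.6105),
]

# Row with the highest F1 and row with the lowest validation loss.
best_f1 = max(ROWS, key=lambda row: row[2])
best_loss = min(ROWS, key=lambda row: row[1])

print(f"best F1: {best_f1[2]:.4f} at epoch {best_f1[0]}")
print(f"lowest validation loss: {best_loss[1]:.4f} at epoch {best_loss[0]}")
# prints:
# best F1: 0.6192 at epoch 16
# lowest validation loss: 3.8822 at epoch 2
```

Validation loss bottoms out at epoch 2, while F1 keeps improving until epoch 16 and then drifts down slightly. The choice of checkpoint therefore depends on which metric matters for the downstream use.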

Safetensors: model size 0.2B params, tensor type F32
