Ramesh

rameshch

AI & ML interests

None yet

Recent Activity

new activity 2 days ago

Qwen/Qwen3.5-27B: Value error, Model architectures ['Qwen3_5ForConditionalGeneration'] are not supported for now. Transformers version 5.3.0.dev0

liked a model about 2 months ago

ContactDoctor/Bio-Medical-Llama-3-3-8B

new activity 9 months ago

XiaomiMiMo/MiMo-VL-7B-RL: Is there a way to disable or turn off the thinking process? Additionally, when asked about itself, it responds by saying, "I am ChatGPT from OpenAI."

New activity in Qwen/Qwen3.5-27B 2 days ago

Value error, Model architectures ['Qwen3_5ForConditionalGeneration'] are not supported for now. Transformers version 5.3.0.dev0

#8 opened 2 days ago by

New activity in XiaomiMiMo/MiMo-VL-7B-RL 9 months ago

Is there a way to disable or turn off the thinking process? Additionally, when asked about itself, it responds by saying, "I am ChatGPT from OpenAI."

#5 opened 9 months ago by

New activity in Qwen/Qwen2.5-VL-32B-Instruct-AWQ 9 months ago

RuntimeError: expected mat1 and mat2 to have the same dtype, but got: struct c10::Half != struct c10::BFloat16

#9 opened 11 months ago by

New activity in Qwen/Qwen2.5-Omni-3B 10 months ago

EOS_TOKEN_ID ?

#6 opened 10 months ago by

New activity in google/gemma-3-27b-it 11 months ago

Tokens generated per second

#39 opened 11 months ago by

New activity in Qwen/Qwen2.5-VL-32B-Instruct 11 months ago

Thank You for Open-Sourcing Your Model & Feedback

#4 opened 11 months ago by

New activity in mistralai/Mistral-Small-3.1-24B-Instruct-2503 11 months ago

How do we use it with Transformers? Can you give some sample code?

#22 opened 12 months ago by

New activity in meta-llama/Llama-3.2-1B-Instruct over 1 year ago

Error(s) in loading state_dict for PeftModelForCausalLM:

#23 opened over 1 year ago by

New activity in openbmb/MiniCPM-Llama3-V-2_5 over 1 year ago

Is it possible to merge MiniCPM-Llama3-V-2-5 with a Llama-3-1 based model using MOE

#68 opened over 1 year ago by

New activity in lmms-lab/llava-onevision-projectors over 1 year ago

llava-Onevision-projector for LLama-3.1-8B Model

#4 opened over 1 year ago by

New activity in openbmb/MiniCPM-Llama3-V-2_5 over 1 year ago

RuntimeError: only Tensors of floating point dtype can require gradients

#69 opened over 1 year ago by
