Djuunaa

djuna

AI & ML interests

None yet

Recent Activity

liked a model 5 days ago

hanzogak/Anima-Comradeship

liked a model 1 day ago

CohereLabs/BLS-Mini-Code-1.0

30B • Updated 2 days ago • 1.01k • 42

reacted to evalstate's post with 🚀 4 days ago

Post

3213

Hugging Face MCP Server v0.3.17
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SEP-2640 "Skills Over MCP" support added (early access)

2 replies

liked 3 models 5 days ago

#1 opened 10 days ago by

djuna

upvoted a collection 10 days ago

TongUI

Collection

Open-source release of our work, TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials; https://github.com/TongUI-agent/TongUI-agent • 14 items • Updated 20 days ago • 4

reacted to vincentg64's post with 🔥 12 days ago

Post

952

96% Correct Next Token Prediction, with No DNN, no Training, auto-distilled model - https://mltblog.com/4urfvTB

Over the last 12 months, I’ve built a model to predict the next token and to suggest synonyms or related queries to a user prompt, with 100% correct predictions on the training set in one shot, without training or deep neural networks (DNNs). The same model is now integrated in some of the most recent LLM architectures, albeit with costly training via DNNs. My version does not need DNNs or training.

The purpose of this article is to validate my deep neural network alternative in the context of LLMs. The new model is a substitute for standard DNNs, with increased explainability and higher accuracy. It is designed for corporate corpora. The end goal is to provide better accuracy at a much lower cost, while providing full control over all the components.

An interesting feature is auto-distillation, whereby the model self-identifies weights that do not contribute over time in 99.9% of user-generated prompts and drops them, based on prompts from a large, specialized user base. The gain is most spectacular in open-weight LLMs applied to specialized contexts, whether based on DNNs or not.

Read the article and download the free technical paper, which includes an NVIDIA case study, at https://mltblog.com/4urfvTB

upvoted an article 12 days ago

Article

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

sergiopaniego, ariG23498 • 14 days ago • 101

liked a model 16 days ago

syvai/cohere-transcribe-diarize

Automatic Speech Recognition • 2B • Updated 16 days ago • 644 • 25

reacted to FlameF0X's post with 🚀 21 days ago

Post

277

Greetings Hugging Face!

I started a new project called **FWKV** (Feed-forward Weighted Key Value, or Floored Weighted Key Value), an RWKV-style LM that uses FFNNs (Feed-Forward Neural Networks) instead of an RNN and floor(W·K·V). I'm hoping to make it much more efficient and scalable than RWKV.

So far I have:

- FlameF0X/FWKV-29M — this one is undertrained and doesn't have a Space yet. In the attached image you can see its speed on a T4 compared to models with the same configuration.

The only model that's fully working right now is:
- FlameF0X/FWKV-TinyStories — trained on TinyStories for one epoch. The demo Space is FlameF0X/FWKV-demo.