Distilled model
#3, opened by Dark7Devil
Hello! This is promising work! I would like to know a couple of things about the model.
- Do you plan to release the dataset?
- This is a huge model in terms of parameter count. Do you plan to train a distilled model that can run on smaller infrastructure, or maybe even on edge devices?
Thanks!!
Thank you for showing interest in our work!
- We have listed the datasets used to train and test Nanda-87B in Table 4 of our technical report.
- Not as of now.
aaryamonvikram changed discussion status to closed