Distilled model

#3
by Dark7Devil - opened

Hello! This is promising work! I would like to know a couple of things about the model.

  1. Do you plan to release the dataset?
  2. This is a huge model in terms of parameter count. Do you plan to train a distilled model that can run on smaller infrastructure, or maybe even on edge devices?

Thanks!!

MBZUAI-IFM org

Thank you for showing interest in our work!

  1. We have listed the datasets used to train and test Nanda-87B in Table 4 of our technical report.
  2. Not as of now.
aaryamonvikram changed discussion status to closed
