samaffolter's Collections
AugmentedLearning
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Paper
• 2312.15685
• Published
• 16
mistralai/Mixtral-8x7B-Instruct-v0.1
47B • Updated
• 672k
• 4.64k
Text Generation
• 3B • Updated
• 1.5M
• 3.43k
TinyLlama/TinyLlama-1.1B-Chat-v1.0
Text Generation
• 1B • Updated
• 1.71M
• 1.53k
Are Emergent Abilities in Large Language Models just In-Context Learning?
Paper
• 2309.01809
• Published
• 3
Commonsense Knowledge Transfer for Pre-trained Language Models
Paper
• 2306.02388
• Published
• 1
Schema-learning and rebinding as mechanisms of in-context learning and emergence
Paper
• 2307.01201
• Published
• 2
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Paper
• 2305.01610
• Published
• 2
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
Paper
• 2308.12066
• Published
• 4
Experts Weights Averaging: A New General Training Scheme for Vision Transformers
Paper
• 2308.06093
• Published
• 2
Multi-Head Adapter Routing for Cross-Task Generalization
Paper
• 2211.03831
• Published
• 2
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
Paper
• 2305.06324
• Published
• 1
Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Paper
• 2309.10020
• Published
• 41
MIMIC-IT: Multi-Modal In-Context Instruction Tuning
Paper
• 2306.05425
• Published
• 12
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models
Paper
• 2309.04041
• Published
• 1
From Sparse to Soft Mixtures of Experts
Paper
• 2308.00951
• Published
• 22