Training Mamba for Long Context Sequences using DeepSpeed | Austin

April 11, 2024 · Austin

Mamba Long Context Training

Explore how Mamba, a state-space model whose compute scales linearly with sequence length, is trained for long contexts using DeepSpeed and the SlimPajama dataset, demonstrating its potential for efficient, accurate long-sequence processing.
