Google says that DiffusionGemma can generate more than 1,000 tokens per second when running on a single H100, a server-grade ...
DiffusionGemma generates text up to 4x faster than traditional models by producing entire blocks simultaneously, achieving ...
Most AI language models are designed to be autoregressive: they generate text left to right, one token at a time. DiffusionGemma has ...