flash-attention-with-sink implements an attention variant used in GPT-OSS 20B that integrates a "sink" step into FlashAttention. This repo focuses on the forward path and provides an experimental ...
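As a rough illustration of what folding a sink into a FlashAttention-style forward pass can look like, here is a minimal pure-Python sketch for a single query vector. It is not the repo's code. It assumes the sink is one extra logit per head that is added to the softmax denominator but has no value vector, and that the streaming version seeds its running maximum and denominator with that logit. The function names, `sink_logit`, and `block_size` are all hypothetical.

```python
import math


def sink_attention_reference(q, keys, values, sink_logit):
    """Naive attention for one query; the sink adds to the denominator only."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    m = max(max(scores), sink_logit)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    denom = sum(weights) + math.exp(sink_logit - m)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / denom
            for d in range(dim)]


def sink_attention_streaming(q, keys, values, sink_logit, block_size=2):
    """Blockwise online-softmax version of the same computation."""
    scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = sink_logit  # assumption: the sink seeds the running statistics
    denom = 1.0               # exp(sink_logit - running_max)
    acc = [0.0] * dim         # unnormalized weighted sum of values
    for start in range(0, len(keys), block_size):
        block_k = keys[start:start + block_size]
        block_v = values[start:start + block_size]
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in block_k]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated so far to the new maximum.
        correction = math.exp(running_max - new_max)
        denom *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, block_v):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / denom for a in acc]
```

Because the sink contributes to the denominator without a matching value, the attention weights over the real keys sum to less than one. That lets a head put little weight on every token when nothing is relevant.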
Eric Gutiérrez, 6th February 2026. A Python implementation of a one-hidden-layer neural network built entirely from first principles. This project avoids deep learning libraries (like TensorFlow or ...
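To give a sense of what a one-hidden-layer network written from first principles involves, here is a small sketch that uses only the Python standard library. It is not the project's code. It assumes sigmoid activations, a single output unit, squared-error loss, and per-example gradient descent, and every name in it (`init_params`, `forward`, `train_step`, and so on) is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Small random weights, zero biases; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(params, x):
    """Return the hidden activations and the scalar output for one input."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["W1"], params["b1"])]
    output = sigmoid(sum(w * h for w, h in zip(params["W2"], hidden)) + params["b2"])
    return hidden, output


def mean_squared_error(params, inputs, targets):
    return sum((forward(params, x)[1] - t) ** 2
               for x, t in zip(inputs, targets)) / len(inputs)


def train_step(params, inputs, targets, lr):
    """One pass over the data, updating after each example (backpropagation)."""
    for x, t in zip(inputs, targets):
        hidden, out = forward(params, x)
        # Chain rule for loss = 0.5 * (out - t)^2 through the output sigmoid.
        delta_out = (out - t) * out * (1.0 - out)
        # Propagate the error to each hidden unit before W2 is modified.
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(params["W2"], hidden)]
        params["W2"] = [w - lr * delta_out * h
                        for w, h in zip(params["W2"], hidden)]
        params["b2"] -= lr * delta_out
        for j, dh in enumerate(delta_hidden):
            params["W1"][j] = [w - lr * dh * xi
                               for w, xi in zip(params["W1"][j], x)]
            params["b1"][j] -= lr * dh


if __name__ == "__main__":
    # XOR as a toy dataset; convergence depends on the seed and learning rate.
    xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    ts = [0.0, 1.0, 1.0, 0.0]
    net = init_params(n_in=2, n_hidden=3)
    for _ in range(2000):
        train_step(net, xs, ts, lr=0.5)
    print("final MSE:", mean_squared_error(net, xs, ts))
```

Writing the gradients out by hand like this is the main thing a framework would otherwise hide. Each `delta` term is one application of the chain rule, from the loss back through a sigmoid.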