Tutorial: Neural Network Accelerator Co-Design with FINN

Venue: 4.005, Ashby Building, QUB.

Organizers: Michaela Blott (AMD), Thomas Preusser (AMD), Jakoba Petri-Koenig (AMD), Zaid Al-Ars (Delft University of Technology)

NOTE: Extra registration step required. Please register with the conference first, then register for the tutorial at https://bit.ly/registration_finn_tutorial_fpl22

Abstract: Embedding machine learning into high-throughput, low-latency edge applications requires co-designed solutions to meet the performance requirements. Quantized Neural Networks (QNNs) combined with custom FPGA dataflow implementations offer a good balance of performance and flexibility, but building such implementations by hand is difficult and time-consuming. In this tutorial, we will introduce FINN, an open-source experimental framework by AMD AECG Research that helps the broader community explore QNN inference on FPGAs. Providing a full-stack solution from quantization-aware training to bitfile, FINN generates high-performance dataflow-style FPGA architectures customized for each network. Participants will be introduced to efficient inference with QNNs and streaming dataflow architectures and to the different components of the project’s open-source ecosystem, and will gain hands-on experience by training a quantized neural network with Brevitas and deploying it with FINN.