Speaker: Mohamed S. Abdelfattah, Cornell University
There have been many attempts to use FPGAs to accelerate deep neural networks (DNNs), including many by the speaker of this talk. Some of these attempts ended up facing direct competition from GPUs and ASICs that are hyper-tuned for DNNs; inevitably, FPGAs often lose that competition. However, there are many promising research directions in which FPGAs are indeed the best platform for accelerating parts of a deep learning workload. This talk will discuss several emerging paradigms in which FPGA strengths can be successfully leveraged to accelerate deep learning workloads. I will focus on (1) automated DNN-hardware codesign, (2) using FPGA lookup tables as DNN building blocks, and (3) the role of embedded networks-on-chip in FPGA-powered datacenters.
Speaker Bio: Mohamed Abdelfattah is an Assistant Professor at Cornell Tech and in the Electrical and Computer Engineering Department at Cornell University. His research interests include deep learning systems, automated machine learning, hardware-software codesign, reconfigurable computing, and FPGA architecture. Mohamed’s goal is to design the next generation of machine-learning-centric computer systems for both datacenters and mobile devices.
Mohamed received his BSc from the German University in Cairo, his MSc from the University of Stuttgart, and his PhD from the University of Toronto. His PhD was supported by the Vanier Canada Graduate Scholarship and he received three best paper awards for his work on embedded networks-on-chip for FPGAs. His PhD work garnered much industrial interest and has since been adopted by multiple semiconductor companies in their latest FPGAs. After his PhD, Mohamed spent time at Intel’s programmable solutions group, and most recently at Samsung where he led a research team focused on hardware-aware automated machine learning.
----------------------------------------------------------
For more videos subscribe to the YouTube channel [ Link ]!
For more information visit [ Link ]
----------------------------------------------------------
Website: [ Link ]
The Intel/VMware Crossroads 3D-FPGA Academic Research Center is jointly supported by Intel and VMware. The center is committed to the public and free dissemination of its research outcomes.
----------------------------------------------------------
Chapters
0:00 Introduction
0:43 GPU vs. DLA for DNN Acceleration
1:56 Arithmetic: Block Minifloat
4:33 Programming the Accelerator
6:39 Instruction Decode in HW
7:06 VLIW Network-on-Chip
8:47 Configurability: Custom Kernels
9:59 Customize Hardware for each DNN
10:48 Graph Compiler
12:32 Scheduling and Allocation
17:20 PART I: A Retrospective on FPGA Overlays for DNNs
19:13 Design Space Exploration: Automated Codesign
20:22 AutoML: Neural Architecture Search (NAS)
21:59 AutoML: Hardware-Aware NAS
23:19 Hardware-Aware NAS Results
24:12 AutoML: Codesign NAS
27:23 Codesign NAS: Results
28:58 Automated Codesign
30:22 Mapping a DNN to Hardware
32:23 Binary Neural Networks
33:26 Logic Neural Networks
38:14 Deep Learning is Heterogeneous
42:16 Replace "Software Fallback" with Hardware Acceleration
44:33 Accelerated Preprocessing Solutions
45:33 Hybrid FPGA-DLA Devices
47:57 Embedded NoCs on FPGAs
52:01 NoC-Enhanced vs. Conventional FPGAs
53:49 Is there still hope for FPGAs? Yes!