Register today for upcoming Arm Tech Talks: [ Link ]
Get ready for another one of our Arm Tech Talks! Every fortnight, we discuss and explore some of the latest trends, technologies, and best practices in the world of AI, featuring partners from the AI ecosystem as well as speakers from across Arm.
00:00 Start
00:27 Nota Introduction
02:58 Problem Statement
04:47 Arm Virtual Hardware
06:10 Ethos-U65
09:10 Neural Network Blocks
13:16 NetsPresso
24:00 Q&A
Because modern AI chipsets employ different strategies for efficient operation, most neural network models are not sufficiently optimized for these devices in terms of latency and memory footprint.
In this talk, we present how we enable popular neural network models to be deployed efficiently on Ethos-U65, Arm's newly launched microNPU. To this end, we first examine various operation forms (e.g., convolution types and filter sizes) and identify suitable operations that improve the accuracy-latency trade-off.
Based on this investigation, we carefully redesign well-known convolutional blocks (e.g., inverted residual blocks and ghost blocks) and use them to replace computationally inefficient blocks in given models. We demonstrate that the model variants obtained by our approach significantly reduce both inference time and memory budget on Ethos-U65, without noticeable accuracy drops.
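As a back-of-the-envelope illustration of why such block substitutions shrink models, the parameter arithmetic for a standard 3x3 convolution versus two efficiency-oriented substitutes can be sketched as follows. This is a simplified sketch assuming the block designs from the published MobileNet and GhostNet papers, not the exact variants presented in the talk:

```python
# Rough sketch (not from the talk): parameter counts for a standard 3x3
# convolution versus two efficiency-oriented substitutes. Biases and
# batch-norm parameters are ignored for simplicity.

def conv3x3_params(c_in: int, c_out: int) -> int:
    """Standard 3x3 convolution: every output channel mixes all inputs."""
    return 3 * 3 * c_in * c_out

def depthwise_separable_params(c_in: int, c_out: int) -> int:
    """3x3 depthwise conv (one filter per channel) + 1x1 pointwise mix."""
    return 3 * 3 * c_in + c_in * c_out

def ghost_module_params(c_in: int, c_out: int, ratio: int = 2) -> int:
    """GhostNet-style module: a slim primary conv produces c_out // ratio
    channels; the remaining 'ghost' features come from cheap 3x3
    depthwise operations applied to the primary output."""
    primary = c_out // ratio
    return 3 * 3 * c_in * primary + 3 * 3 * primary * (ratio - 1)

if __name__ == "__main__":
    c = 64
    print(conv3x3_params(c, c))              # 36864
    print(depthwise_separable_params(c, c))  # 4672  (~8x fewer)
    print(ghost_module_params(c, c))         # 18720 (~2x fewer)
```

Parameter count is only a proxy: the accuracy-latency trade-off on a microNPU also depends on which operators the hardware executes natively, which is why the talk evaluates block choices on Ethos-U65 directly.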
This talk is part of the bi-weekly AI Virtual Tech Talk Series: [ Link ]
If you enjoyed this video, please subscribe to our channel and follow us on Twitter to get more content like this delivered straight to your feed!
[ Link ]