Optimization Techniques for Powerful yet Tiny Machine Learning Models
Can machine learning models be both powerful and tiny? Join us in this episode of TinyML Talks, where we uncover groundbreaking techniques for making machine learning more efficient through high-level synthesis. We sit down with Russell Clayne, Technical Director at Siemens EDA, who guides us through the intricate process of pruning convolutional and deep neural networks. Discover how post-training quantization and quantization-aware training can trim down models without sacrificing performance, making them perfect for custom hardware accelerators like FPGAs and ASICs.
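For listeners who want to experiment before pressing play, here is a minimal NumPy sketch of the two ideas mentioned above, magnitude-based pruning and post-training 8-bit quantization, applied to a single weight matrix. It is our own illustration rather than the exact flow Russell walks through; the 50% sparsity target and the symmetric int8 scheme are assumptions chosen for clarity.

    import numpy as np

    def prune_by_magnitude(weights, sparsity=0.5):
        # Zero out the smallest-magnitude weights until `sparsity` fraction is zero.
        threshold = np.quantile(np.abs(weights), sparsity)
        return np.where(np.abs(weights) < threshold, 0.0, weights)

    def quantize_int8(weights):
        # Symmetric post-training quantization: int8 codes plus one float scale per tensor.
        scale = np.max(np.abs(weights)) / 127.0
        codes = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return codes, scale

    w = np.random.randn(4, 4).astype(np.float32)      # a toy "layer" of float32 weights
    w_pruned = prune_by_magnitude(w, sparsity=0.5)    # half the weights forced to zero
    codes, scale = quantize_int8(w_pruned)            # 8-bit storage instead of 32-bit
    w_dequant = codes.astype(np.float32) * scale      # what the quantized layer computes with
    print("max quantization error:", np.max(np.abs(w_pruned - w_dequant)))

Quantization-aware training goes one step further by inserting this round-and-clip step into the forward pass during training, so the network learns weights that already tolerate the 8-bit grid.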
From there, we dive into a practical case study involving an MNIST-based network. Russell demonstrates how sensitivity analysis, network pruning, and quantization can significantly reduce neural network size while maintaining accuracy. Learn why fixed-point arithmetic is superior to floating-point in custom hardware, and how leading research from MIT and industry advancements are revolutionizing automated network optimization and model compression. You'll gain insights into how these techniques are not just theoretical but are being applied in real-world scenarios to save area and energy consumption.
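To see why fixed-point arithmetic is attractive in hardware, it helps to remember what a fixed-point number is: an integer code with an implicit binary scale, so multiplies and adds become plain integer operations. The short sketch below (again our own illustration, not material from the talk) rounds values onto a signed 8-bit Q2.6 grid, the format choice being an arbitrary assumption, and prints the resulting representation error.

    import numpy as np

    def to_fixed_point(x, int_bits=2, frac_bits=6):
        # Round onto a signed Q(int_bits).(frac_bits) grid with saturation;
        # Q2.6 covers [-2.0, 1.984375] in steps of 1/64 using only 8 bits.
        scale = 2.0 ** frac_bits
        lo = -(2 ** (int_bits + frac_bits - 1))
        hi = (2 ** (int_bits + frac_bits - 1)) - 1
        codes = np.clip(np.round(x * scale), lo, hi)
        return codes / scale

    x = np.linspace(-1.5, 1.5, 7)
    x_fx = to_fixed_point(x)
    print(np.column_stack([x, x_fx, np.abs(x - x_fx)]))  # value, Q2.6 value, error

A floating-point multiplier must handle exponents, normalization, and rounding; the fixed-point version is just an integer multiplier and a shift, which is why it costs so much less area and energy on an FPGA or ASIC.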
Finally, explore the collaborative efforts between Siemens, Columbia University, and Global Foundries in a wake word analysis project. Russell explains how transitioning to hardware accelerators via high-level synthesis (HLS) tools can yield substantial performance improvements and energy savings. Understand the practicalities of using algorithmic C data types and Python-to-RTL tools to optimize ML workflows. Whether it's quantization-aware training, data movement optimization, or the fine details of using HLS libraries, this episode is packed with actionable insights for streamlining your machine learning models.
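As a rough picture of what those Algorithmic C data types (ac_fixed and friends) buy you, the Python sketch below behaviorally models the datapath an HLS tool might generate for one multiply-accumulate: 8-bit fixed-point operands, a wide 32-bit integer accumulator, and a single rescale at the end. The Q2.6 format and the operand values are assumptions for illustration; in a real flow the same intent would be written in C++ with ac_fixed types and synthesized to RTL.

    import numpy as np

    def fixed_point_mac(x_codes, w_codes, frac_bits=6):
        # Behavioral model of an HLS-style MAC: int8 operands, int32 accumulator,
        # and one rescale at the end instead of per-element floating-point math.
        acc = np.int32(0)
        for x, w in zip(x_codes, w_codes):
            acc += np.int32(x) * np.int32(w)          # products stay exact in 32 bits
        return acc / float(2 ** (2 * frac_bits))      # undo both Q2.6 scalings once

    # Integer codes as they would sit on an 8-bit bus (Q2.6 values times 64).
    x_codes = np.array([32, -16, 64, 8], dtype=np.int8)   # 0.5, -0.25, 1.0, 0.125
    w_codes = np.array([64, 64, -32, 16], dtype=np.int8)  # 1.0, 1.0, -0.5, 0.25
    print(fixed_point_mac(x_codes, w_codes))              # -0.21875, matching the float dot product

Keeping the accumulator wide and deferring the rescale is exactly the kind of precision and data-movement detail the HLS libraries discussed in the episode let you control explicitly.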
Learn more about the EDGE AI FOUNDATION - edgeaifoundation.org
Chapters
1. TinyML Talks (00:00:00)
2. Network Pruning and Quantization (00:10:51)
3. Optimizing Quantized Neural Networks (00:21:51)
4. High-Level Synthesis for ML Acceleration (00:37:27)
5. Hardware Design and Optimization Techniques (00:47:06)
All episodes
Panel Discussion - EDGE AI TAIPEI - Revolutionizing Edge Computing with AI-Driven Innovations (58:42)
Revolutionizing Edge Devices with Energy-Efficient Generative AI Techniques (50:19)
Transforming Edge AI Education: Insights from Harvard's Dr. Vijay Janapa Reddi (59:56)
Recap of the "Beyond Chatbots - The Journey of Generative AI to the Edge" (8:31)
Crafting Artistic Images with Embedded AI with Alberto Ancilotto of FBK (31:30)
Revolutionizing Software Development with GenAI-Powered Edge Solutions with Anirban Bhattacharjee of Wipro (28:43)
Tomorrow's Edge AI: Cutting-Edge Memory Optimization for Large Language Models with Seonyeong Heo of Kyung Hee University (30:29)
Revolutionizing Automotive AI with Small Language Models with Alok Ranjan of BOSCH (31:59)
From tinyML to the edge of AI: Introducing the EDGE AI FOUNDATION (14:09)
Unveiling the Technological Breakthroughs of ExecuTorch with Meta's Chen Lai (31:24)
Revolutionizing TinyML: Integrating Large Language Models for Enhanced Efficiency (27:06)
Harnessing Edge AI: Transforming Industries with Advanced Transformer Models with Dave McCarthy of IDC and Pete Bernard of tinyML Foundation (33:53)
Transforming the Edge with Generative AI: Unraveling Innovations Beyond Chatbots with Danilo Pau, IEEE Fellow from STMicroelectronics (6:47)
Revolutionizing Weather Forecasting with Acoustic Smart Technology (25:04)
Revolutionizing Nano-UAV Technology with Cutting-Edge On-Device Learning Strategies (21:57)