Introduction:
Efficient neural network computing focuses on optimizing computation and resource utilization in deep learning. Deep learning models typically involve large numbers of parameters and heavy computational demands, so efficient computing is crucial for accelerating training and inference and for reducing resource and energy consumption. Research areas include model compression and acceleration, NPU accelerator design, image analysis and computation, and low-power embedded systems.
(1) Model Compression and Acceleration: This research explores neural network sparsity, quantization, low-rank decomposition, and knowledge distillation techniques to compress pre-trained models, thereby reducing parameter storage and speeding up training and inference.
(2) NPU Accelerator Design: Research in this area develops efficient neural network computing architectures based on FPGAs and ASICs. The focus is on algorithm-architecture co-design to improve computational efficiency and reduce power consumption in neural networks.
(3) Image Analysis and Computation: This field involves using mathematical models combined with image and video processing techniques to analyze and understand the structure and content of images and videos. Research topics include object detection, semantic segmentation, motion and gesture recognition, low-level image and video processing, and remote sensing image interpretation.
(4) Low-Power Embedded Systems: This area focuses on developing efficient neural network acceleration libraries for heterogeneous embedded computing platforms such as CPUs, GPUs, and DSPs. By combining lightweight algorithms with efficient acceleration libraries, it aims to achieve high efficiency and low power consumption for neural networks on embedded platforms.
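To make the quantization technique in area (1) concrete, here is a minimal sketch of symmetric per-tensor int8 post-training quantization. It is a generic illustration, not any specific library's API; the function names and the toy weight matrix are our own assumptions.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative sketch):
    the largest weight magnitude maps to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # toy weight tensor
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print(q.nbytes, w.nbytes)  # int8 storage is 4x smaller than float32
print(float(np.abs(w - w_hat).max()))  # rounding error bounded by scale/2
```

This captures the storage-reduction half of the story; recovering accuracy typically also requires calibration or quantization-aware fine-tuning, which the sketch omits.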
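A core idea behind the algorithm-architecture co-design in area (2) is blocking computation so each working set fits in a small on-chip buffer. The sketch below illustrates this with a tiled matrix multiply in plain NumPy; the tile size and function name are illustrative assumptions, not a description of any particular NPU.

```python
import numpy as np

def tiled_matmul(a, b, tile=16):
    """Blocked matrix multiply: accumulate tile x tile sub-blocks so each
    working set fits in a small on-chip buffer (a common NPU dataflow idea)."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    c = np.zeros((m, n), dtype=a.dtype)
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                c[i0:i0 + tile, j0:j0 + tile] += (
                    a[i0:i0 + tile, k0:k0 + tile] @ b[k0:k0 + tile, j0:j0 + tile]
                )
    return c

a = np.arange(32 * 32, dtype=np.float32).reshape(32, 32)
b = np.ones((32, 32), dtype=np.float32)
print(np.allclose(tiled_matmul(a, b), a @ b))  # True
```

On real accelerators the same blocking determines buffer sizes and data-reuse patterns, which is why tiling choices are made jointly with the hardware design.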
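The lightweight algorithms mentioned in area (4) often replace standard convolutions with cheaper factorized forms. As a hedged, self-contained example of why this helps, the arithmetic below compares parameter counts of a standard 3x3 convolution against a depthwise-separable one (depthwise 3x3 followed by 1x1 pointwise); the layer sizes are illustrative.

```python
def conv_params(c_in, c_out, k):
    """Parameters in a standard k x k convolution layer (no bias)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k conv plus 1x1 pointwise conv (no bias)."""
    return c_in * k * k + c_in * c_out

# Illustrative 3x3 layer mapping 128 -> 128 channels
std = conv_params(128, 128, 3)                  # 147456
dws = depthwise_separable_params(128, 128, 3)   # 17536
print(std, dws, round(std / dws, 1))            # roughly 8.4x fewer parameters
```

The same factorization also cuts multiply-accumulate counts by a similar ratio, which is what makes such layers attractive on power-constrained embedded platforms.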
Related Papers: