Research


Introduction:

Efficient neural network computing focuses on optimizing computation and resource utilization to improve the efficiency and performance of deep learning workloads. Deep learning models typically involve large numbers of parameters and heavy computational demands, so efficient computing is crucial for accelerating training and inference and for reducing resource and energy consumption. Research areas include model compression and acceleration, NPU accelerator design, image analysis and computation, and low-power embedded systems.

(1) Model Compression and Acceleration: This research explores neural network sparsity, quantization, low-rank decomposition, and knowledge distillation to compress pre-trained models, reducing parameter storage and speeding up training and inference (a minimal quantization sketch follows this list).

(2) NPU Accelerator Design: This research develops efficient neural network computing architectures on FPGAs and ASICs, focusing on algorithm-architecture co-design to improve computational efficiency and reduce power consumption (an int8 GEMM sketch follows this list).

(3) Image Analysis and Computation: This research uses mathematical models together with image and video processing techniques to analyze and understand the structure and content of images and videos. Research topics include object detection, semantic segmentation, motion and gesture recognition, low-level image and video processing, and remote sensing image interpretation (an IoU example follows this list).

(4) Low-Power Embedded Systems: This research develops efficient neural network acceleration libraries for heterogeneous embedded computing platforms such as CPUs, GPUs, and DSPs. By combining lightweight algorithms with efficient acceleration libraries, it targets high efficiency and low power consumption for neural networks on embedded devices (an im2col convolution sketch follows this list).
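
For area (1), the sketch below shows symmetric post-training weight quantization, the basic operation underlying the quantization work listed in the papers. It is an illustrative NumPy example only, not the Bit-split method from the publications; the function names and the 8-bit setting are assumptions made for the example.

```python
import numpy as np

def quantize_tensor(w, num_bits=8):
    """Symmetric uniform quantization of a weight tensor (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8-bit signed
    scale = np.abs(w).max() / qmax          # per-tensor scale factor
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize_tensor(q, scale):
    """Recover an approximate float tensor from its quantized form."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure the reconstruction error.
w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_tensor(w)
w_hat = dequantize_tensor(q, s)
print("mean abs error:", np.abs(w - w_hat).mean())
```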
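
For area (2), the core kernel an NPU accelerator typically implements is low-precision matrix multiplication with wide accumulators. The sketch below emulates an int8 × int8 → int32 GEMM in NumPy purely to illustrate the arithmetic; it does not describe any specific accelerator architecture, and the function name and scales are assumptions.

```python
import numpy as np

def int8_matmul(a_q, b_q, a_scale, b_scale):
    """Quantized matrix multiply: int8 inputs, int32 accumulation,
    rescaled back to float32 (illustration of a typical NPU kernel)."""
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)    # wide accumulator avoids overflow
    return acc.astype(np.float32) * (a_scale * b_scale)   # apply combined scale

# Example with toy int8 data and made-up scales.
a = (np.random.randn(4, 8) * 20).astype(np.int8)
b = (np.random.randn(8, 3) * 20).astype(np.int8)
print(int8_matmul(a, b, 0.05, 0.02))
```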
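
For area (3), a small worked example: intersection-over-union (IoU), the standard box-overlap measure used throughout object detection. The box format (x1, y1, x2, y2) and the function name are chosen only for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # ~0.143
```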
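
For area (4), many embedded acceleration libraries lower convolution to a single matrix multiplication via an im2col-style data layout so that it maps onto the SIMD/GPU/DSP GEMM routines a platform already provides. The sketch below is a minimal NumPy illustration of that idea (valid padding, stride 1); it is not the ECBC blocked-columnizing scheme from the paper list.

```python
import numpy as np

def conv2d_im2col(x, w):
    """2-D convolution via im2col + GEMM (valid padding, stride 1), illustrative only."""
    C, H, W = x.shape
    K, _, R, S = w.shape                      # K filters of size C x R x S
    out_h, out_w = H - R + 1, W - S + 1
    # Unfold input patches into columns so the convolution becomes one matmul.
    cols = np.empty((C * R * S, out_h * out_w), dtype=x.dtype)
    idx = 0
    for i in range(out_h):
        for j in range(out_w):
            cols[:, idx] = x[:, i:i+R, j:j+S].ravel()
            idx += 1
    out = w.reshape(K, -1) @ cols             # the GEMM that maps well to vector hardware
    return out.reshape(K, out_h, out_w)

x = np.random.randn(3, 8, 8).astype(np.float32)
w = np.random.randn(4, 3, 3, 3).astype(np.float32)
print(conv2d_im2col(x, w).shape)              # (4, 6, 6)
```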

Related Papers:


  • Peisong Wang, Weihan Chen, Xiangyu He, Qiang Chen, Qingshan Liu, Jian Cheng. Optimization-based Post-training Quantization with Bit-split and Stitching. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol.45, No.2, pp.2119-2135, 2023. 
  • Weihan Chen, Peisong Wang, Jian Cheng. Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization. ICCV 2021.
  • Tianli Zhao, Jiayuan Chen, Cong Leng, Jian Cheng. TinyNeRF: Towards 100x Compression of Voxel Radiance Fields. AAAI 2023.
  • Peisong Wang, Xiangyu He, Gang Li, Tianli Zhao, Jian Cheng. Sparsity-Inducing Binarized Neural Networks. AAAI 2020.
  • Weixiang Xu, Xiangyu He, Ke Cheng, Peisong Wang, Jian Cheng. Towards Fully Sparse Training: Information Restoration with Spatial Similarity. AAAI 2022.
  • Zejian Liu, Gang Li, Jian Cheng. Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing. DATE 2021. (Best paper candidate)
  • Zejian Liu, Gang Li, Jian Cheng. Efficient Accelerator/Network Co-Search with Circular Greedy Reinforcement Learning. IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol.70, No.7, pp.2615-2619, 2023.
  • Zejian Liu, Gang Li, Jian Cheng. TBERT: Dynamic BERT Inference with Top-k Based Predictors. DATE 2023.
  • Jiaxing Wang, Haoli Bai, Jiaxiang Wu, Xupeng Shi, Junzhou Huang, Irwin King, Michael Lyu, Jian Cheng. Revisiting Parameter Sharing for Automatic Neural Channel Number Search. NeurIPS 2020.
  • Jiaxing Wang, Haoli Bai, Jiaxiang Wu, Jian Cheng. Bayesian Automatic Model Compression. IEEE Journal of Selected Topics in Signal Processing, Vol.14, No.4, pp.727-736, 2020.
  • Tianli Zhao, Qinghao Hu, Xiangyu He, Weixiang Xu, Jiaxing Wang, Cong Leng, Jian Cheng. ECBC: Efficient Convolution via Blocked Columnizing. IEEE Transactions on Neural Networks and Learning Systems (TNNLS), Vol.34, No.1, pp.433-445, 2023. 



