Energy Aware Neural Pruning Algorithms For Sustainable Large Scale Model Deployment

Dr K Gopalakrishnan; Dr. K. Suvarnalakshmi; S. Malarvizhi; Kushagra Kulshreshtha; K.S.S. Joseph Sastry; Dr. K. V. Panduranga Rao

doi:10.51483/IJAIML.6.4s.2026.749-755

Authors

Dr K Gopalakrishnan, Professor, Loyola-ICAM College of Engineering and Technology, Chennai, India.
Dr. K. Suvarnalakshmi, Assistant Professor, Department of English, Aditya University, Surampalem, Andhra Pradesh, India.
S. Malarvizhi, Assistant Professor, Department of Commerce, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, India.
Kushagra Kulshreshtha, Institute of Business Management, GLA University, Mathura, Uttar Pradesh, India.
K.S.S. Joseph Sastry, Department of CSE, Ramachandra College Of Engineering, Eluru – 534007, India.
Dr. K. V. Panduranga Rao, Professor, Department of CSE-AI&ML, Lakireddy Bali Reddy College of Engineering, Mylavaram-521230, Andhra Pradesh, India.

DOI:

https://doi.org/10.51483/IJAIML.6.4s.2026.749-755

Keywords:

Green AI, Model Compression, Structural Pruning, Sustainable Deep Learning, Hardware Efficiency, Neural Architecture Optimization.

Abstract

The rapid growth in the size of deep learning models has created new computational and economic difficulties, mainly because of their large carbon footprint and energy expenditure during inference at the edge and in the cloud. This work addresses these challenges by proposing a framework of energy-aware neural pruning methods for more sustainable model deployment. The underlying problem is the non-linear relationship between parameter count and power consumption: conventional magnitude-based pruning does not account for the energy cost that each part of the architecture incurs. The proposed approach therefore includes per-layer energy consumption in the pruning selection process. Experimental results show that energy-prioritized structural pruning reduces total power usage by 42.6% on standard transformers and large-scale convolutional neural networks while maintaining an accuracy benchmark of 98.4%. Compared with traditional magnitude-based pruning, the approach also yields statistically validated reductions of 31.5% in hardware memory requirements and 24.8% in inference latency. These findings indicate that prioritizing hardware energy parameters in structural pruning can move AI deployment toward a green engineering standard suitable for embedded IoT devices and high-performance server clusters.
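
The abstract does not give the selection rule itself, so the following is only a minimal sketch of how per-layer energy cost might enter a structured pruning criterion. Everything in it is an assumption made for illustration and is not taken from the paper: the `Layer` record, the `energy_per_channel` estimate (in arbitrary units), the importance-per-energy score, and the greedy stopping rule are all hypothetical. The point it shows is that channels in energy-expensive layers can be selected ahead of equally small channels in cheap layers, which a magnitude-only criterion would not do.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Layer:
    """Hypothetical per-layer record; not the paper's data structure."""
    name: str
    channel_norms: List[float]  # L2 norm of each output channel's weights
    energy_per_channel: float   # assumed energy cost of one channel (arbitrary units)


def energy_aware_prune(
    layers: List[Layer],
    energy_reduction: float,
    min_channels: int = 1,
) -> Tuple[Dict[str, List[int]], float]:
    """Greedily mark whole channels for removal until the estimated energy
    saving reaches `energy_reduction` (a fraction of the baseline).

    Assumed score: weight norm divided by the layer's per-channel energy.
    Low-importance channels in energy-hungry layers are removed first.
    """
    baseline = sum(l.energy_per_channel * len(l.channel_norms) for l in layers)
    target_saving = energy_reduction * baseline

    # One candidate per channel, ordered from lowest to highest score.
    candidates = sorted(
        (norm / l.energy_per_channel, l.name, idx)
        for l in layers
        for idx, norm in enumerate(l.channel_norms)
    )

    energy = {l.name: l.energy_per_channel for l in layers}
    remaining = {l.name: len(l.channel_norms) for l in layers}
    pruned: Dict[str, List[int]] = {l.name: [] for l in layers}
    saved = 0.0

    for _, name, idx in candidates:
        if saved >= target_saving:
            break
        if remaining[name] <= min_channels:
            continue  # never remove a layer entirely
        pruned[name].append(idx)
        remaining[name] -= 1
        saved += energy[name]

    return pruned, saved / baseline


if __name__ == "__main__":
    toy_layers = [
        Layer("conv1", [0.9, 0.2, 0.5, 0.7], energy_per_channel=1.0),
        Layer("conv2", [0.8, 0.3, 0.6, 0.4], energy_per_channel=4.0),
    ]
    pruned, fraction_saved = energy_aware_prune(toy_layers, energy_reduction=0.35)
    print(pruned, fraction_saved)
```

In this toy setup the smallest channel by magnitude is in `conv1` (norm 0.2), yet the sketch removes two channels from the costlier `conv2` first, because each of them saves four times as much estimated energy. In practice the per-channel energy figures would have to come from hardware measurements or an energy model, and accuracy would need to be re-evaluated after pruning.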
