The overall purpose of the project is to investigate how neural networks (NNs) can be adapted to different levels of complexity and to limited resources. A secondary objective is to develop a framework for pruning networks and for evaluating the importance of parameters under different pruning strategies and applications.
Our project investigates gradual neural network pruning in which the model is redeployed into a progressively smaller network between pruning stages, producing a final model that is functionally equivalent to the pruned full-size network but substantially smaller in parameter count, memory footprint, and inference and carbon cost.
The core step is minimization: an exact structural rewrite that converts a pruned network, whose removed weights are held at zero by a mask, into a smaller dense network with an identical forward function. In our preliminary experiments, iterating a prune–minimize–re-enable–fine-tune cycle until no further units can be removed yields a ~40× parameter reduction on a fully-connected model on MNIST and a ~10× reduction on a single, incomplete ConvNeXt-Tiny run on CIFAR-10, both at matched or slightly higher accuracy than a single prune-and-shrink pass. We expect the ConvNeXt figure to improve with longer runs. The required compute cost is described below.
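To make the idea of minimization concrete, the following is a minimal sketch for a two-layer fully-connected network with ReLU activations, not the project's actual implementation. The function names (`minimize`, `forward`, `count_params`) and the example weights are hypothetical. The sketch assumes a hidden unit can be dropped when its incoming weights and bias are all zero (so its ReLU output is always zero) or when its outgoing weights are all zero; units with zero incoming weights but a nonzero bias are conservatively kept here, although they could in principle be folded into the next layer's bias.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # one row per output unit
Vector = List[float]


def relu(v: Vector) -> Vector:
    return [max(0.0, x) for x in v]


def affine(W: Matrix, b: Vector, x: Vector) -> Vector:
    # Computes W @ x + b with plain Python lists.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(W1: Matrix, b1: Vector, W2: Matrix, b2: Vector, x: Vector) -> Vector:
    return affine(W2, b2, relu(affine(W1, b1, x)))


def minimize(W1: Matrix, b1: Vector, W2: Matrix, b2: Vector
             ) -> Tuple[Matrix, Vector, Matrix, Vector]:
    """Return a smaller dense network with the same forward function."""
    keep = []
    for j in range(len(W1)):
        # Unit j always outputs relu(0) = 0 if its inputs and bias are zero.
        dead_in = all(w == 0.0 for w in W1[j]) and b1[j] == 0.0
        # Unit j is never read if every outgoing weight is zero.
        dead_out = all(row[j] == 0.0 for row in W2)
        if not (dead_in or dead_out):
            keep.append(j)
    W1_small = [W1[j] for j in keep]                    # drop rows of layer 1
    b1_small = [b1[j] for j in keep]
    W2_small = [[row[j] for j in keep] for row in W2]   # drop columns of layer 2
    return W1_small, b1_small, W2_small, list(b2)


def count_params(W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> int:
    return sum(len(r) for r in W1) + len(b1) + sum(len(r) for r in W2) + len(b2)


# Hypothetical pruned network: 3 inputs, 4 hidden units, 2 outputs.
# Hidden unit 1 has no live inputs; hidden unit 3 has no live outputs.
W1 = [[0.5, -1.0, 0.0], [0.0, 0.0, 0.0], [1.0, 2.0, -0.5], [0.3, 0.0, 0.7]]
b1 = [0.1, 0.0, -0.2, 0.4]
W2 = [[1.0, 0.9, -2.0, 0.0], [0.0, -0.4, 0.5, 0.0]]
b2 = [0.05, -0.1]

small = minimize(W1, b1, W2, b2)
print(count_params(W1, b1, W2, b2), "->", count_params(*small))  # 26 -> 14
```

In this toy case the masked network and its minimized counterpart produce the same outputs for any input, while the dense parameter count drops from 26 to 14. Convolutional or normalized architectures such as ConvNeXt would need additional rules (for example for channels, skip connections, and normalization layers) that this sketch does not cover.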