Understanding biomolecular dynamics across large spatial and temporal scales remains a central challenge in computational biophysics. This project develops a data-driven, information-theoretic framework for learning representations of biomolecules, combining machine learning with statistical mechanics to model complex molecular dynamics. Within an SO(3)-equivariant autoencoder architecture, we will train an estimator of the high-dimensional mutual information (MI) between the all-atom protein structure and its encoded representation, with the aim of improving representation learning and making equivariant neural networks more interpretable. Deep neural networks are used to learn both expressive coarse-grained (CG) mappings and the MI estimator, with applications to peptides and fast-folding proteins. The project requires high-performance computing resources for running and analyzing molecular dynamics simulations and for training and sampling from the ML models. Results will be released as open-source tools to enable physics-motivated CG modeling across the biomolecular simulation community.
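
To make the idea of estimating MI between a structure and its encoding concrete, the following is a minimal toy sketch and not the project's method. The text above does not say which estimator is used; the Donsker-Varadhan lower bound, I(X;Z) >= E_joint[T] - log E_product[exp(T)], is assumed here purely for illustration. Scalar correlated Gaussians stand in for the all-atom coordinates (x) and the encoded representation (z), and an analytically known critic T replaces the neural network that would be trained in practice. All function names are hypothetical.

```python
import math
import random


def sample_pairs(n, rho, rng):
    """Draw n pairs (x, z) of unit-variance Gaussians with correlation rho."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        z = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((x, z))
    return pairs


def make_oracle_critic(rho):
    """Return T(x, z) = log p(x, z) / (p(x) p(z)) for the toy Gaussians.

    Assumption: a trained network would approximate this function; here it
    is written in closed form so the bound can be checked.
    """
    const = -0.5 * math.log(1.0 - rho * rho)
    scale = 2.0 * (1.0 - rho * rho)

    def critic(x, z):
        quad = rho * rho * x * x - 2.0 * rho * x * z + rho * rho * z * z
        return const - quad / scale

    return critic


def dv_lower_bound(pairs, critic, rng):
    """Monte Carlo estimate of the Donsker-Varadhan bound on I(X; Z)."""
    xs = [x for x, _ in pairs]
    zs = [z for _, z in pairs]
    # Shuffling z breaks the pairing, approximating samples from p(x) p(z).
    rng.shuffle(zs)
    joint_term = sum(critic(x, z) for x, z in pairs) / len(pairs)
    product_mean = sum(math.exp(critic(x, z)) for x, z in zip(xs, zs)) / len(xs)
    return joint_term - math.log(product_mean)


def exact_gaussian_mi(rho):
    """Closed-form MI (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho * rho)


rng = random.Random(0)
rho = 0.5
pairs = sample_pairs(20000, rho, rng)
estimate = dv_lower_bound(pairs, make_oracle_critic(rho), rng)
exact = exact_gaussian_mi(rho)
print(f"DV estimate: {estimate:.3f} nats, closed form: {exact:.3f} nats")
```

With the exact log density ratio as critic, the bound is tight, so the estimate should land close to the closed-form value. In a learned setting the critic would instead be a network optimized to maximize this bound, and the gap to the true MI would depend on how well it is trained.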