This project aims to develop an AI-accelerated workflow for generating statistically representative populations of long-chain lignin polymers for large-scale molecular dynamics simulations. Lignin is a heterogeneous, branched biopolymer whose monomer composition, linkage distribution, degree of polymerization (DP), and polydispersity depend strongly on the biological feedstock. Existing rule-based lignin generators, including lignin-KMC, LigninGraphs, ligninBuilder-based workflows, and SPRinG, have enabled important progress in generating chemically plausible lignin oligomers and polydisperse molecular systems. These approaches can reproduce experimentally derived population-level descriptors, such as monomer ratios and linkage percentages from NMR-informed distributions, but their iterative generation cost becomes prohibitive at the chain lengths and population sizes required for simulations of lignin self-assembly and morphology. The target regime for this project is populations of hundreds to thousands of chains with DP 50–200+, realistic polydispersity, and feedstock-specific monomer and linkage statistics. Such populations are needed as input structures for molecular dynamics studies of self-assembly and morphology on length scales from nanometers to tens of nanometers.
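To make the target regime concrete, the following minimal sketch shows one way a chain-length population could be specified and its dispersity checked. It is illustrative only and not part of the proposed workflow: the gamma (Schulz-Zimm-like) length distribution, the function names `sample_chain_lengths` and `dispersity`, and all parameter values are assumptions. The weight average is computed in units of DP, which assumes every monomer has the same mass.

```python
import random


def sample_chain_lengths(n_chains, mean_dp, shape, dp_min, dp_max, seed=0):
    """Draw integer chain lengths (DP) from a gamma distribution.

    Assumption: a gamma distribution stands in for the real, feedstock-specific
    length distribution. Values outside [dp_min, dp_max] are rejected, so the
    realized mean can differ from mean_dp.
    """
    rng = random.Random(seed)
    scale = mean_dp / shape  # gamma mean = shape * scale
    lengths = []
    while len(lengths) < n_chains:
        dp = round(rng.gammavariate(shape, scale))
        if dp_min <= dp <= dp_max:
            lengths.append(dp)
    return lengths


def dispersity(lengths):
    """Return (number-average DP, weight-average DP, dispersity).

    Assumption: equal monomer mass, so mass averages reduce to DP averages.
    """
    dp_n = sum(lengths) / len(lengths)
    dp_w = sum(n * n for n in lengths) / sum(lengths)
    return dp_n, dp_w, dp_w / dp_n


if __name__ == "__main__":
    # Hypothetical target: 1000 chains with DP between 50 and 200.
    population = sample_chain_lengths(1000, mean_dp=100, shape=4.0,
                                      dp_min=50, dp_max=200)
    dp_n, dp_w, d = dispersity(population)
    print(f"chains={len(population)} DPn={dp_n:.1f} DPw={dp_w:.1f} dispersity={d:.3f}")
```

A generator for this regime would need to match descriptors like these alongside the monomer and linkage statistics; the sketch covers only the chain-length part.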