In this project, we aim to understand how feature learning affects the neural scaling-law behaviour observed in linear networks with random features. To this end, we will consider a quadratic linear neural network with one hidden layer. We plan to compute the mean test loss of this network for different sets of inputs, including language and image datasets. We expect to observe qualitative changes that are absent from the phenomenology of scaling laws in linear neural networks but present in the context of LLMs, namely a difference between the exponents of N (number of parameters) and T (number of training data points) in the power-law formulae.
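
To make the quantity of interest concrete, one common way to write the power-law dependence of the test loss on N and T is the additive form sketched below. This parametrisation is an assumption made for illustration and is not taken from the project itself: the symbols L (mean test loss), L_\infty (irreducible loss), A, B (prefactors) and \alpha_N, \alpha_T (scaling exponents) are introduced here only to fix notation.

```latex
% Illustrative (assumed) parametrisation of the scaling law.
% N = number of parameters, T = number of training data points.
\[
  L(N, T) \;\approx\; L_{\infty} + \frac{A}{N^{\alpha_N}} + \frac{B}{T^{\alpha_T}}
\]
```

In this notation, the qualitative change the project looks for would show up as fitted exponents \alpha_N and \alpha_T that differ from one another once feature learning is present.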