Kolmogorov-Arnold networks (KANs) are a recent development in machine learning that may offer an alternative to networks based on multilayer perceptrons (MLPs). The key difference is that, instead of fixed activation functions and trainable weights, KANs have trainable activation functions, each represented as a sum of spline basis functions with trainable positions and coefficients.
This makes them more interpretable, which gives them particularly high potential as research aids in physics and mathematics. In this project we will investigate whether the U-KAN architecture can approximate the gradient in the shallow water equations, using data from "The Well" dataset (https://arxiv.org/pdf/2412.00568). The goal is to characterize the behavior of the learned function, which could be a step toward finding a closed-form expression for the gradient. This, in turn, could play an important role in simplifying predictions of atmospheric flows in the future. Even if we do not manage to identify any significant behavior in the model, the thesis remains interesting from a computational point of view, as KANs have the potential to scale better, and with smaller numerical errors, than classic MLPs.
The general plan is to follow the implementation of the U-KAN in https://arxiv.org/pdf/2406.02918, but to apply it to multidimensional physical problems. Existing code in this area is available for us to build on. We will analyze the learned functions by inspecting the activation functions of the KAN to see what types of functions it appears to be approximating.
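To make the idea of inspecting activation functions concrete, the following is a minimal sketch under stated assumptions, not the U-KAN implementation. It builds a single toy KAN layer in NumPy, fits it to an additive target chosen purely for illustration, and then reads back one learned edge function. The names ToyKANLayer, hat_basis and edge_function are hypothetical. The sketch assumes a piecewise-linear (degree-1 B-spline) basis on a fixed uniform grid rather than higher-order splines with trainable positions, and a least-squares solve stands in for gradient-based training.

```python
import numpy as np


def hat_basis(x, grid):
    """Degree-1 B-spline ("hat") basis on a uniform grid: (n,) -> (n, G)."""
    h = grid[1] - grid[0]
    return np.clip(1.0 - np.abs(x[:, None] - grid[None, :]) / h, 0.0, None)


class ToyKANLayer:
    """Hypothetical single KAN layer: one learnable 1-D function per edge."""

    def __init__(self, n_in, n_out, grid):
        self.grid = grid
        # coef[i, j, :] holds the spline coefficients of edge function phi_ij
        self.coef = np.zeros((n_in, n_out, grid.size))

    def basis(self, x):
        # x: (batch, n_in) -> basis values of shape (batch, n_in, G)
        return np.stack(
            [hat_basis(x[:, i], self.grid) for i in range(x.shape[1])], axis=1
        )

    def forward(self, x):
        # y_j = sum_i phi_ij(x_i), with phi_ij(x) = sum_g coef[i, j, g] * B_g(x)
        return np.einsum("big,iog->bo", self.basis(x), self.coef)

    def edge_function(self, i, j, xs):
        # Evaluate the learned activation on edge (input i -> output j)
        return hat_basis(xs, self.grid) @ self.coef[i, j]


rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=(2000, 2))
y = np.sin(x[:, 0]) + x[:, 1] ** 2  # additive toy target (illustrative only)

layer = ToyKANLayer(n_in=2, n_out=1, grid=np.linspace(-2.0, 2.0, 21))

# With a single layer the output is linear in the coefficients, so a
# least-squares solve stands in for gradient-based training here.
# rcond discards the null direction caused by the free additive constant.
design = layer.basis(x).reshape(len(x), -1)
solution, *_ = np.linalg.lstsq(design, y, rcond=1e-10)
layer.coef[:, 0, :] = solution.reshape(2, -1)

rmse = np.sqrt(np.mean((layer.forward(x)[:, 0] - y) ** 2))

# Inspect the learned edge function for input 0. It is only determined up to
# an additive constant, so centre it before comparing with sin.
xs = np.linspace(-2.0, 2.0, 101)
phi0 = layer.edge_function(0, 0, xs)
phi0_centred = phi0 - phi0.mean()
max_dev = np.abs(phi0_centred - np.sin(xs)).max()
print(f"fit RMSE: {rmse:.4f}")
print(f"max |phi0 - sin|: {max_dev:.4f}")
```

In this toy setting the edge function for the first input can be compared directly with a candidate closed form (here sin), which is the kind of check the analysis step relies on. In a deeper network the edge functions compose, so interpretation would likely require inspecting several layers together.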
If we finish earlier than expected, we will try to implement ODE-KAN along the lines of https://arxiv.org/pdf/2407.04192 to establish whether these models can be used for quasi-geostrophic problems.
This project will be carried out as part of the co-PI's Master's thesis at KTH, under the supervision of the PI.