The neural tangent kernel (NTK) is a powerful tool for analyzing artificial neural networks (NNs). By drawing a connection between a network and its gradient kernel, the NTK allows us to characterize the training dynamics of the network using kernel theory. The goal of this project is to investigate the potential of the NTK for knowledge transfer, which is crucial for training networks when the training set is not sufficiently large. To this end, we will first develop efficient metrics for comparing high-dimensional kernel matrices using information-theoretic tools, which are faster to compute than the conventional L2 norm, and then use these metrics as training signals during network training. We plan to conduct extensive numerical experiments to test the resulting algorithm in scenarios such as domain adaptation, network pruning, and knowledge distillation.
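As a rough illustration of the kind of comparison the project envisions, the sketch below forms an empirical NTK from a Jacobian of network outputs with respect to parameters, and compares two such kernels with an information-theoretic divergence (a von Neumann entropy, Jensen-Shannon-style construction) rather than an L2/Frobenius norm. The specific divergence, function names, and toy dimensions are assumptions for illustration, not the project's finalized metric.

```python
import numpy as np

def empirical_ntk(jac):
    """Empirical NTK from a Jacobian of shape (n_samples, n_params):
    Theta[i, j] = <grad f(x_i), grad f(x_j)>."""
    return jac @ jac.T

def von_neumann_entropy(K, eps=1e-12):
    """Entropy of a kernel matrix normalized to unit trace
    (so its eigenvalues form a probability distribution)."""
    rho = K / np.trace(K)
    w = np.clip(np.linalg.eigvalsh(rho), eps, None)
    return float(-np.sum(w * np.log(w)))

def kernel_divergence(K1, K2):
    """Jensen-Shannon-style divergence between two kernels:
    entropy of the mixture minus the mean of the individual
    entropies; zero iff the normalized kernels coincide."""
    mix = 0.5 * (K1 / np.trace(K1) + K2 / np.trace(K2))
    return von_neumann_entropy(mix) - 0.5 * (
        von_neumann_entropy(K1) + von_neumann_entropy(K2))

# Toy Jacobians standing in for two networks' gradients.
rng = np.random.default_rng(0)
J1 = rng.standard_normal((8, 20))
K1 = empirical_ntk(J1)
K2 = empirical_ntk(J1 + 0.1 * rng.standard_normal((8, 20)))

d_self = kernel_divergence(K1, K1)       # identical kernels
d_perturbed = kernel_divergence(K1, K2)  # perturbed kernel
```

Because the divergence depends only on eigenvalues of trace-normalized matrices, it can be cheaper and more scale-invariant than a direct Frobenius-norm comparison of the raw kernels, which is the motivation for such metrics here.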