This digital humanities project consists of several parts. The main part is face recognition in historical photographs, mainly from the late 1800s to the early 1900s. The longer-term idea is to determine whether additional photos of a person help identify that person as someone already in the database. Historical photos differ from modern photos in many ways. In particular, automated analysis of age and gender performs even worse on them.
In this part of the project we will look at how the different stages of the pipeline perform in different packages, and which combinations give the best recognition accuracy, measured as mean average precision (mAP). Furthermore, we will investigate how dimensionality reduction techniques (t-SNE, UMAP, etc.) can be used to visualise the data, and what the resulting spatial distribution actually means. One exciting preliminary result is that, under certain conditions, women seem to be found in one part of the cluster. We do not yet know whether age and other attributes have a similar effect. We also need to examine whether clustering, for instance with DBSCAN, can be done in the high-dimensional space and not only in 2D. Since no model training will be done at this stage, the work will involve a lot of computation on CPUs rather than training on GPUs.
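To make the accuracy measure concrete, the following is a minimal sketch of how mean average precision could be computed for face embeddings. The evaluation protocol is an assumption, not something fixed by the project: each photo is used in turn as a query, all other photos are ranked by cosine similarity, and a retrieved photo counts as relevant if it carries the same identity label. The function names are hypothetical and the code uses only the Python standard library.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance is a list of booleans, best match first,
    where True marks a photo of the same person as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(embeddings, labels):
    """Mean of the per-query average precision values.

    Assumption: queries whose identity has no other photo in the
    data set are skipped, since nothing relevant can be retrieved.
    """
    ap_values = []
    for q, (q_vec, q_label) in enumerate(zip(embeddings, labels)):
        candidates = [i for i in range(len(embeddings)) if i != q]
        if not any(labels[i] == q_label for i in candidates):
            continue
        # Rank the remaining photos by similarity to the query.
        candidates.sort(
            key=lambda i: cosine_similarity(q_vec, embeddings[i]),
            reverse=True,
        )
        relevance = [labels[i] == q_label for i in candidates]
        ap_values.append(average_precision(relevance))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0
```

With this sketch, the same set of labelled photos can be run through each combination of detector and embedding package, and the resulting mAP values compared directly. A real evaluation would likely use a vectorised implementation, but the definition stays the same.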
This project starts now as a 15 hp student project and will continue in the spring as one or two master's thesis projects.