Geospatial vision-language pre-trained model
Dnr: NAISS 2024/22-781
Type: NAISS Small Compute
Principal Investigator: Weiming Huang
Affiliation: Lunds universitet
Start Date: 2024-06-01
End Date: 2025-06-01
Primary Classification: 10507: Physical Geography

Abstract

In this project, we intend to develop a large-scale geospatial pre-trained model by fine-tuning general-purpose vision-language models, e.g., CLIP, with geospatial data and theories. Specifically, we intend to use both geospatial visual data (e.g., remote sensing images and street view images) and textual data (e.g., points of interest and social media data) to adapt these models to a geospatial and urban context. We expect the adapted model to produce urban representations (embeddings) that are effective across a range of urban analytical tasks, such as urban land use inference and population density estimation.
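
The abstract does not state which training objective will be used for fine-tuning. As an illustration only, the sketch below shows the symmetric contrastive objective that CLIP-style models are typically trained with, applied to a batch of matched image and text embeddings (for instance, a street view image paired with a nearby point-of-interest description). The function names, the toy two-dimensional embeddings, and the temperature value of 0.07 are assumptions made for this example and are not taken from the project.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of matched image/text pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same place.
    The temperature default is an assumption for this sketch.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = cosine similarity of image i and text j, divided by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    # Transpose to score each text against all images
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy data (hypothetical): two places, each with one image and one text embedding
image_embs = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[1.0, 0.0], [0.0, 1.0]]   # text i matches image i
shuffled_text = [[0.0, 1.0], [1.0, 0.0]]  # text i matches the other image

aligned_loss = clip_contrastive_loss(image_embs, aligned_text)
shuffled_loss = clip_contrastive_loss(image_embs, shuffled_text)
print(aligned_loss < shuffled_loss)
```

The loss is small when each image embedding is closest to its own text embedding and large when the pairs are mismatched, which is the signal that would pull geospatial images and their associated texts toward a shared embedding space during fine-tuning.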