Information on smoking status and other tobacco use is routinely collected at dental visits in Sweden. This rich data source will enable multiple research projects on tobacco use. In the first phase of this project, tobacco data collected in multiple Swedish regions will be harmonized, which involves processing both structured data and free-text descriptions of smoking habits. The work will involve standard data management in R and potentially machine learning methods for natural language processing of the free-text descriptions.