Large Language Models (LLMs) such as GPT-4 or LLaMA-2 are increasingly used in software engineering tasks, powering tools such as GitHub Copilot. Their use for these tasks rests on the assumption that programmers express their design in a vernacular close to the problem domain, the so-called naturalness hypothesis. In this project, we evaluate this hypothesis by studying whether large language models recognize the domain vocabulary of a given program better than they recognize its programming language. We analyze the Rosetta Code repository with four language models in three different scenarios.