  • Creation of the National Corpus of the Crimean Tatar Language is underway: Ministry for Reintegration of Temporarily Occupied Territories

    The Ministry for Reintegration of Temporarily Occupied Territories of Ukraine has initiated the creation of the National Corpus of the Crimean Tatar Language as part of the implementation of the Strategy for the Development of the Crimean Tatar Language for 2022-2032. The corpus is an online platform for language research that will be built on textual materials in the Crimean Tatar language.

    The collection of printed and online sources in the Crimean Tatar language for the National Corpus has been underway for 4 months.

    During this time, 675 materials by more than 180 authors have been processed and included in the catalog. Together they amount to more than 50,000 printed pages (40 million characters). They include works by famous authors, newspapers, magazines, textbooks, scientific articles, and international legal documents, among others.

    Currently, the oldest work in the catalog dates back to the 13th century, and the most recent one to the 21st century (2023).

    The catalog already contains materials in four writing systems used for the Crimean Tatar language: Arabic script, pre-war Latin, Cyrillic, and modern Latin.

    Anyone can contribute to the creation of the online database of texts.

    The project is being implemented with the support of the Ministry for Reintegration of Temporarily Occupied Territories, the Swiss-Ukrainian EGAP Program implemented by the East Europe Foundation, and Taras Shevchenko National University of Kyiv.