Dervaze: A Spelling Dictionary for Digital Translation of Ottoman Documents
Osmanlıca Belgelerin Elektronik Çeviri Uygulamaları İçin Bir İmla Kılavuzu Örneği: Dervaze

Authors: Ayşegül ERGİŞİ, İ. Emre Şahin
Pages: 78-84

Abstract

This paper presents Dervaze (dervaze.com), a spelling dictionary used to translate Ottoman Turkish to Modern Turkish and vice versa through morphological analysis. The dictionary currently holds the spellings of more than 72,600 Ottoman Turkish words. Each entry gives both the Ottoman and the Latin spelling, as well as complementary information such as the word's Abjad value. Because the dictionary is digital, relations between words (e.g. heteronymy, heterography) can be searched and modeled. The long-term plan is to store the entire Ottoman Turkish vocabulary in the dictionary. Dervaze underpins a tool for translating historical Ottoman texts into Modern Turkish. The tool analyses the roots and suffixes of Turkish words, then renders them separately in the target orthography. This technique is preferred because most roots are irregular in Ottoman Turkish and there is no one-to-one mapping between the two orthographies. The primary objective of the dictionary is to support the translation of 19th and 20th century Ottoman Turkish texts. Our initial sources are the 1856 Redhouse Lexicon and Ali Kemal Belviranlı's spelling dictionary, which relies on the decisions of the 1928 Spelling Committee. Various word lists available on the Internet have also been added, despite their poor quality, in order to identify spelling variations. Spelling variations are especially important for words of Turkish origin, as Ottoman Turkish has no fixed spelling rules for them. They need to be listed in their entirety, because one purpose of the dictionary is to provide a basis for an Optical / Intelligent Character Recognition system for Ottoman Turkish. As OCR facilities become available, we plan to extend the dictionary automatically from primary sources. Working back from 19th century documents, we plan to include the whole vocabulary of Ottoman Turkish in the dictionary.
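The root-and-suffix approach described in the abstract can be illustrated with a minimal Python sketch. It is not Dervaze's implementation: the names `ROOTS`, `SUFFIXES`, `split_suffixes`, and `to_ottoman` are hypothetical, the two tables hold only a handful of illustrative entries, and the sketch ignores phenomena such as consonant softening. It only shows the idea that irregular roots are looked up in a dictionary while suffixes are rendered separately from a small table.

```python
# Hypothetical toy lexicon: Latin-script root -> Ottoman spelling.
# Roots are looked up, not transliterated, because their spelling is irregular.
ROOTS = {"kitap": "كتاب", "ev": "او"}

# Hypothetical suffix table: vowel-harmony variants share one Ottoman spelling.
SUFFIXES = {"lar": "لر", "ler": "لر", "da": "ده", "de": "ده"}


def split_suffixes(tail):
    """Segment `tail` into known suffixes; return None if it cannot be segmented."""
    if not tail:
        return []
    for length in range(len(tail), 0, -1):  # try the longest suffix first
        head = tail[:length]
        if head in SUFFIXES:
            rest = split_suffixes(tail[length:])
            if rest is not None:
                return [head] + rest
    return None


def to_ottoman(word):
    """Render a Latin-script word in Ottoman script, or return None if unanalysable."""
    for cut in range(len(word), 0, -1):  # try the longest root first
        root = word[:cut]
        if root in ROOTS:
            suffixes = split_suffixes(word[cut:])
            if suffixes is not None:
                # Root and suffixes are rendered separately, then concatenated.
                return ROOTS[root] + "".join(SUFFIXES[s] for s in suffixes)
    return None
```

Under these assumptions, `to_ottoman("evlerde")` splits the word into the root `ev` and the suffixes `ler` and `de`, then joins their Ottoman spellings. A word with no known root returns `None` instead of a guess.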

Keywords

Lexicography, dictionary-making, computer-assisted translation, Optical Character Recognition, Ottoman Turkish
