For grantees and commercial use.

LingvoDoc is a linguistic platform designed for compiling, analyzing and storing dictionaries, corpora and concordances of various languages and dialects.
It currently contains more than 1000 audio dictionaries and 300 text corpora representing the dialects of various world languages.
It stores unique data on the endangered languages of Russia.
Many dialects have already disappeared, and the LingvoDoc platform holds data from archives that are otherwise shelved and inaccessible.
It keeps records on some extinct languages (for example, Eastern Mansi) as well as on languages in danger of extinction (that is, languages with no more than 10 remaining speakers over 60 years old).

Advantages of the LingvoDoc platform

  • many researchers can work simultaneously and independently;
  • processed data can be checked for errors automatically;
  • unique software performs a researcher's experimental-phonetic, etymological and morphological work 100 times faster;
  • results of intellectual activity (RIAs) can be created for use in reports and in work with data.

Opportunities

See https://github.com/ispras/lingvodoc-react/wiki to learn more about using the options listed below.
User options for working with dictionaries
  • creating custom columns; adding text and audio files; annotating spectrograms with the Praat phonetic software; creating etymological connections between words from different dictionaries;
  • automatic segmentation of native-speaker surveys uploaded to the Telegram channel “LingvoDoc Support” into separate words;
  • data processing and analysis software: phonetic analysis; search for etymologies; analysis of cognates in dialects and several languages; acoustic analysis of cognates; measuring phonological statistical distance; phonemic analysis; reconstruction of cognates in dialects and several languages.
User options for working with text corpora
  • uploading audio files of any size, (audio) corpora in ELAN format, and texts in Word-compatible .odt format;
  • automatic creation of dictionaries from text corpora;
  • data processing with existing parsers (for the Erzya, Moksha, Udmurt, Komi, Kazakh and Tatar languages) or quickly creating new parsers and integrating them into LingvoDoc;
  • user-friendly interface for manually resolving online the word-sense ambiguities that may remain after the parser has processed the text;
  • software for morphological analysis of glossed corpora, in particular, automatic identification of government models.
User options for mapping linguistic features
  • creating search queries of any complexity and plotting the results on the map;
  • mapping geographic areas;
  • presenting results either as online fragments of audio dictionaries and corpora, which can be further edited, or as Excel files;
  • saving a created online map as a link that updates automatically when new materials are added to LingvoDoc.