GF WordNet

GF WordNet is a large interlingual lexicon created by combining resources such as WordNet, PanLex and Wikipedia, plus several translation dictionaries and morphological lexicons. See here for details:

Krasimir Angelov. A Parallel WordNet for English, Swedish and Bulgarian. Proceedings of The 12th Language Resources and Evaluation Conference. 2020. Pages 3008-3015. (PDF)

The lexicon consists of abstract lemmas such as apple_1_N which are mapped for each language to an inflection table. The table contains all possible forms for the most appropriate verbalization of the lemma.
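The mapping from an abstract lemma to per-language inflection tables can be pictured as a nested dictionary. The following is only a minimal sketch: the `Lexicon` class, the language codes, and the form labels are hypothetical, the word forms are hand-written for illustration, and none of this reflects how GF WordNet stores its data internally.

```python
from dataclasses import dataclass, field


@dataclass
class Lexicon:
    # abstract lemma -> language code -> {form label: word form}
    entries: dict = field(default_factory=dict)

    def add(self, lemma, lang, table):
        """Register the inflection table of `lemma` for one language."""
        self.entries.setdefault(lemma, {})[lang] = dict(table)

    def inflection_table(self, lemma, lang):
        """Return the table, or None if the lemma has no verbalization yet."""
        return self.entries.get(lemma, {}).get(lang)


# Hand-written toy data; the form labels are assumptions, not GF parameter names.
lexicon = Lexicon()
lexicon.add("apple_1_N", "eng", {"Sg Nom": "apple", "Pl Nom": "apples"})
lexicon.add("apple_1_N", "swe", {"Sg Indef Nom": "äpple", "Pl Indef Nom": "äpplen"})

english = lexicon.inflection_table("apple_1_N", "eng")
missing = lexicon.inflection_table("apple_1_N", "zul")  # no entry was added
```

Returning `None` for a missing entry mirrors the fact that a lemma may not yet have a verbalization in a given language.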

As in WordNet, the abstract lemmas are organized in synsets, which group together lemmas with similar meanings. The synsets, in turn, are organized into a semantic graph copied from WordNet.
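As a rough illustration of this two-level organization, the sketch below groups lemmas into synsets and links synsets by hypernym edges. The synset ids, the lemma names, and the heavily simplified hypernym chain are all made up for the example and are not taken from the actual lexicon or from WordNet.

```python
# Toy data: synset id -> member lemmas (hypothetical ids and lemma names).
synsets = {
    "S1": {"car_1_N", "automobile_1_N"},
    "S2": {"motor_vehicle_N"},
    "S3": {"vehicle_1_N"},
}

# Semantic graph: synset id -> list of hypernym synset ids (simplified).
hypernyms = {"S1": ["S2"], "S2": ["S3"], "S3": []}


def synset_of(lemma):
    """Return the id of the synset containing `lemma`, or None."""
    for sid, members in synsets.items():
        if lemma in members:
            return sid
    return None


def ancestors(sid):
    """Synsets reachable through hypernym edges, nearest first (breadth-first)."""
    seen = []
    queue = list(hypernyms.get(sid, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(hypernyms.get(current, []))
    return seen


car_synset = synset_of("car_1_N")
car_ancestors = ancestors(car_synset)
```

Lemmas in the same synset, such as the two members of `S1` here, share all graph relations because the edges are attached to the synset rather than to individual lemmas.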

The lexicon is compatible with GF's Resource Grammar Library. Moreover, the grammar can be used to render the examples in the WordNet glosses in different languages. We use the examples to ensure that each abstract lemma is mapped to an appropriate translation in every language. Since the translations of the examples are generated from a single interlingual abstract syntax tree, it is also possible to compute the alignment between the translations. We show the alignment when the user clicks on a word in the examples.
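The reason a shared abstract syntax tree gives alignment almost for free can be sketched as follows: if every output token is tagged with the tree node that produced it, then tokens in two languages are aligned whenever they carry the same node. The example below hard-codes that tagging for a hand-written English and Spanish phrase; the node ids and the `align` helper are hypothetical, and the tagging is not obtained from the GF runtime here.

```python
from collections import defaultdict

# Each token is paired with the id of the abstract syntax node that produced it.
# Both the node ids and the tagging are hand-written assumptions.
eng = [("the", "n0"), ("red", "n1"), ("apple", "n2")]
spa = [("la", "n0"), ("manzana", "n2"), ("roja", "n1")]


def positions_by_node(tagged):
    """Index token positions by the node id that generated them."""
    index = defaultdict(list)
    for pos, (_, node) in enumerate(tagged):
        index[node].append(pos)
    return index


def align(src, tgt):
    """Map each source position to the target positions sharing its node."""
    tgt_index = positions_by_node(tgt)
    return {pos: tgt_index.get(node, []) for pos, (_, node) in enumerate(src)}


alignment = align(eng, spa)

# Simulate a click on the English word at position 1 ("red").
clicked = 1
aligned_words = [spa[i][0] for i in alignment[clicked]]
```

No statistical word aligner is needed in this scheme, and differences in word order (adjective before the noun in English, after it in Spanish) are handled without any special cases.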

When a lemma has a corresponding Wikipedia article, we also show the thumbnail image from that article. Clicking on the image opens the article itself in a new tab.

The lexicon is a work in progress. The status of each verbalization is shown as follows:

The lexicon contains about 100 000 lemmas; however, only English has a verbalization for every lemma. The relative lexicon sizes for the different languages, as well as the current lemma statuses, are summarized in the diagram below:


Here, the green colour corresponds to verbalizations that have already been validated.

The following resources, in addition to Wikipedia and PanLex, have been used in the creation of the current lexicon:

| Language   | Resource                             | Source                                      | Info                    |
|------------|--------------------------------------|---------------------------------------------|-------------------------|
| Bulgarian  | BulTreeBank Wordnet                  | Open Multilingual WordNet                   | Translations/Senses     |
|            | BG Office                            | BG Office                                   | Morphology              |
|            | SA Dictionary                        | SourceForge                                 | Translations            |
| Catalan    | Multilingual Central Repository      | Open Multilingual WordNet                   | Translations/Senses     |
|            | FreeLing                             |                                             | Morphology              |
| Chinese    | Chinese Open Wordnet                 | Open Multilingual WordNet                   | Translations/Senses     |
| Dutch      | Open Dutch WordNet                   | Computational Lexicology Lab                | Translations+Morphology |
| English    | Princeton WordNet                    | Princeton and Open Multilingual WordNet     | Translations/Senses     |
|            | Oxford Advanced Learner's Dictionary |                                             | Morphology              |
| Estonian   | Estonian WordNet                     | Estonian WordNet                            | Translations/Senses     |
|            | GF-Estonian                          | GF-Estonian                                 | Morphology              |
| Finnish    | FinnWordNet                          | FinnWordNet and Open Multilingual WordNet   | Translations/Senses     |
|            | Kotus                                | Kotus                                       | Morphology              |
| Italian    | MultiWordNet                         | Open Multilingual WordNet                   | Translations/Senses     |
|            | FreeLing                             | FreeLing                                    | Morphology              |
| Portuguese | OpenWN-PT                            | Open Multilingual WordNet                   | Translations/Senses     |
|            | MorphoBr                             | MorphoBr                                    | Morphology              |
| Slovenian  | SloWNet                              | Open Multilingual WordNet                   | Translations/Senses     |
|            | SLOLEKS                              | SLOLEKS                                     | Morphology              |
| Spanish    | Multilingual Central Repository      | Open Multilingual WordNet                   | Translations/Senses     |
| Swedish    | WordNet-SALDO                        | Open Multilingual WordNet                   | Translations/Senses     |
|            | Svenskt OrdNät                       | Svenskt OrdNät                              | Translations/Senses     |
|            | Folkets Lexikon                      | Folkets Lexikon                             | Translations            |
|            | SALDO                                | SALDO                                       | Morphology              |
| Thai       | Thai Wordnet                         | Open Multilingual WordNet                   | Translations/Senses     |
| Turkish    | KeNet                                | KeNet                                       | Translations/Senses     |
|            | Zemberek                             | Zemberek                                    | Morphology              |
| Zulu       | PanLex                               | PanLex                                      | Translations/Senses     |