Estonian Resource Grammar.
Project at the GF Summer School 2013
Inari Listenmaa, Kaarel Kaljurand
Third GF Summer School 2013, Frauenchiemsee, Bavaria
2013-08-30

Estonian resource grammar

Morphology

  • Nouns
    • 14 cases * 2 numbers
  • Verbs
    • Tense: present, imperfect
    • Mood: indicative, imperative, conditional, quotative
    • Voice: active, passive
    • Nominal forms: {ma infinitive, da infinitive, past participle} * cases ; present participle
  • Adjectives
    • Noun inflection + comparative and superlative

Morpho(phono)logy

  • Variable stress
    • not shown in orthography
    • kala ['ka.la] vs. banaan [ba.'naan] (mostly words of foreign origin)
  • No vowel harmony
  • 3-way quantity system
    • koli (I, junk), kooli (II, school (gen)), kooli (III, school (part, illat))
    • link (nom), lingi (gen), linki (part)
  • Palatalization
    • palk (salary), pa`lk (log)

Nouns

param
  Case = Nom | Gen | Part
     | Illat | Iness | Elat | Allat | Adess | Ablat
     | Transl | Ess | Termin | Abess | Comit;

  NForm = NCase Number Case ;
  • noun morphology started, following Heiki-Jaan Kaalep, "Eesti käänamissüsteemi seaduspärasused" (HJK EKS)
  • 6 forms needed; the others are computed from the genitive
  • currently ignoring parallel forms
  • rules in HJK EKS depend on singular nominative, its stress, quantity, derivation from a verb, foreignness
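
A minimal GF sketch of how the 28-form table might be filled from six stored forms. The slides do not name the six forms; the sketch assumes they are sg nom, sg gen, sg part, sg illat, pl gen and pl part, in the order of the `link` example. `Number`, `Noun`, `caseEnding` and `mkNoun6` are illustrative names, not the project's actual code, and parallel forms are ignored.

```gf
resource NounSketch = {

  param
    Number = Sg | Pl ;
    Case = Nom | Gen | Part
         | Illat | Iness | Elat | Allat | Adess | Ablat
         | Transl | Ess | Termin | Abess | Comit ;
    NForm = NCase Number Case ;

  oper
    Noun : Type = {s : NForm => Str} ;

    -- Hypothetical helper: case endings that attach to a genitive stem.
    caseEnding : Case -> Str = \c -> case c of {
      Illat  => "sse" ;
      Iness  => "s" ;
      Elat   => "st" ;
      Allat  => "le" ;
      Adess  => "l" ;
      Ablat  => "lt" ;
      Transl => "ks" ;
      Ess    => "na" ;
      Termin => "ni" ;
      Abess  => "ta" ;
      Comit  => "ga" ;
      _      => ""
    } ;

    -- Assumed argument order: sg nom, sg gen, sg part, sg illat, pl gen, pl part.
    mkNoun6 : (sgNom, sgGen, sgPart, sgIllat, plGen, plPart : Str) -> Noun =
      \sgNom, sgGen, sgPart, sgIllat, plGen, plPart -> {
        s = table {
          NCase Sg Nom   => sgNom ;
          NCase Sg Gen   => sgGen ;
          NCase Sg Part  => sgPart ;
          NCase Sg Illat => sgIllat ;
          NCase Sg c     => sgGen + caseEnding c ;  -- remaining sg cases from sg gen
          NCase Pl Nom   => sgGen + "d" ;           -- pl nom is sg gen + "d"
          NCase Pl Gen   => plGen ;
          NCase Pl Part  => plPart ;
          NCase Pl c     => plGen + caseEnding c    -- remaining pl cases from pl gen
        }
      } ;
}
```

With the forms of `link` from the next slide, `mkNoun6 "link" "lingi" "linki" "linki" "linkide" "linke"` would give, for instance, `lingis` for the singular inessive and `lingid` for the plural nominative.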

Smart paradigms

13 templates for creating 6 forms from 1 (sg nom)

-- link -> link, lingi, linki, linki, linkide, linke
hjk_type_VI_link x =
    let
        x_n : Str = weaker x
    in
    hjk_nForms6 x (x_n+"i") (x+"i") (x+"i") (x+"ide") (x+"e") ;
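
The template above relies on `weaker` and `hjk_nForms6`, which the slides do not show. The following is a rough, self-contained sketch of what they might look like; the gradation rules are deliberately simplified (final stops only) and the record type `NForms6` is an assumption, so this is not the project's real implementation.

```gf
resource GradationSketch = {

  oper
    -- Assumed container for the six principal forms.
    NForms6 : Type = {sgNom, sgGen, sgPart, sgIllat, plGen, plPart : Str} ;

    hjk_nForms6 : (x1, x2, x3, x4, x5, x6 : Str) -> NForms6 =
      \a, b, c, d, e, f ->
        {sgNom = a ; sgGen = b ; sgPart = c ; sgIllat = d ; plGen = e ; plPart = f} ;

    -- Simplified stand-in for `weaker`: weakens a word-final stop.
    -- Geminates are tried first, so "lakk" gives "lak" rather than "lakg".
    weaker : Str -> Str = \s -> case s of {
      x + "kk" => x + "k" ;
      x + "pp" => x + "p" ;
      x + "tt" => x + "t" ;
      x + "k"  => x + "g" ;
      x + "p"  => x + "b" ;
      x + "t"  => x + "d" ;
      _        => s
    } ;

    -- link -> link, lingi, linki, linki, linkide, linke
    hjk_type_VI_link : Str -> NForms6 = \x ->
      let
        x_n : Str = weaker x
      in
      hjk_nForms6 x (x_n + "i") (x + "i") (x + "i") (x + "ide") (x + "e") ;
}
```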

Mapping singular nominative to the templates

/2 II V/               -> type 3
/2 I V/, foreign       -> type 3
/2 III V/              -> type 4a
/3 CV[lmnr]/           -> type 6 seminar
/e/, derived from verb -> type 7
...

TODO: genitive form as an additional argument, to separate nouns that share a nominative:

lakk, laka, lakka, lakka, lakkade, lakkasid
lakk, laki, lakki, lakki, lakkide, lakkisid
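
A possible shape for such a 2-argument paradigm, sketched in GF. It only covers the pattern of the two `lakk` rows above (stem vowel read off the genitive, then appended to the nominative); `NForms6`, `stemVowel` and `nForms2` are hypothetical names, and the field order is an assumption.

```gf
resource TwoArgSketch = {

  oper
    -- Assumed container for the six principal forms.
    NForms6 : Type = {sgNom, sgGen, sgPart, sgIllat, plGen, plPart : Str} ;

    -- Final vowel of the genitive: "a" in laka, "i" in laki.
    stemVowel : Str -> Str = \gen -> case gen of {
      _ + "a" => "a" ;
      _ + "e" => "e" ;
      _ + "i" => "i" ;
      _ + "u" => "u" ;
      _       => ""
    } ;

    -- nForms2 "lakk" "laka" : lakk, laka, lakka, lakka, lakkade, lakkasid
    -- nForms2 "lakk" "laki" : lakk, laki, lakki, lakki, lakkide, lakkisid
    nForms2 : (sgNom, sgGen : Str) -> NForms6 = \nom, gen ->
      let
        v : Str = stemVowel gen
      in {
        sgNom   = nom ;
        sgGen   = gen ;
        sgPart  = nom + v ;
        sgIllat = nom + v ;
        plGen   = nom + v + "de" ;
        plPart  = nom + v + "sid"
      } ;
}
```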

Adjectives

param
  AForm = AN NForm | AAdv ;

oper
  Adjective : Type = {s : Degree => AForm => Str} ;
  • derived from noun forms
  • comparative derived from genitive
  • superlative is more complex, but can also be expressed as kõige + comparative
  • adverbial form derived from ablative
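
The derivations listed above could be sketched in GF roughly as follows. This assumes the regular case only: comparative as genitive + "m", and the ablative as genitive + "lt". Stems whose final vowel changes in the comparative (e.g. vana, vanem) are not handled, and the function names are illustrative.

```gf
resource AdjSketch = {

  flags coding = utf8 ;

  oper
    -- suure -> suurem
    comparative : Str -> Str = \sgGen -> sgGen + "m" ;

    -- suure -> kõige suurem (analytic superlative)
    superlative : Str -> Str = \sgGen -> "kõige" ++ comparative sgGen ;

    -- kiire -> kiirelt (ablative = genitive + "lt")
    adverbial : Str -> Str = \sgGen -> sgGen + "lt" ;
}
```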

Verbs

param
    VForm = 
    Presn Number Person | Impf Number Person | Condit Number Person 
    | Imper Number | ImperP3 Number | ImperP1Pl | ImpNegPl 
    | PassPresn Bool | PassImpf Bool 
    | Inf InfForm | PresPart | PastPartAct AForm | PastPartPass AForm ;

    InfForm = 
    InfDa | InfDes | InfMa | InfMas | InfMast | InfMata | InfMaks ;
  • full conjugation tables from 4 forms (regular verbs) or 8 forms (25 irregular verbs)
    • the verbs "be" (olema) and "go" (minema) are built separately
    • for comparison: the Finnish worst case needs 12 forms
  • smart paradigms for 1, (2,) 4 and 8 arguments currently
  • TODO: add quotative, rethink the nominal forms (Inf)

Smart paradigms

  • 15 templates for creating 8 forms from 1 form
  • 1-argument smart paradigms for some verbs
  • TODO: better 1-arg paradigms; decide on 2- and 3-arg paradigms

Testing morphology

TODO

  • main questions:
    • percentage of words covered by the 1-arg smart paradigms?
    • types of remaining problems?
  • using Filosoft's morphological synthesizer as the gold standard
  • test vocabulary e.g. from the Estonian WordNet (44k words in 29k synsets)

Syntax

  • started from a two-year-old copy of the Finnish RG
  • several changes
    • question particle "-ko" -> "kas"
    • possessive endings -> "minu"/"tema"/"oma"
    • reflexive possessive "oma" as a separate construction in ExtraEst
  • TODO
    • separable verbs (cf. German "trennbare Verben")
    • also missing from the Finnish RG (Fin ylläpitää ~ pidän yllä; Est aru saama ~ saan aru)

Testing syntax

  • developing PhrasebookEst in parallel
    • this has not highlighted any major problems
  • adding Estonian to ACE-in-GF
    • this has not highlighted any major problems

Resources used

Future work

  • testing
  • cleanup
  • submission to the GF RGL
  • sharing of code with Finnish
  • Dict(Eng)Est, from WordNet?

Täname! (Thank you!)
