Machinese Semantics
Machinese Semantics is a semantic analyzer that provides semantic role recognition as well as grammatical, lexical and sentential semantic features. These include:
- recognition of multi-word forms as a single entity (e.g. would_have_been_informed)
- harmonization of the syntactic structures
- name recognition and classification (person, location, organization)
- semantic feature recognition (human, animate, tool, durative etc.)
Machinese Semantics includes features that make it especially suitable for use as a source language analyzer in machine translation or in information extraction.
The output of Machinese Semantics consists of attribute-value pairs, which may be recursive. Categories are values of the corresponding attribute. Each node in the analysis is a single word or a multi-word unit. The parser produces three attributes whose values are strings: word is the running-text token; lemma is the base form of the nucleus; head is printed only for those multi-word units in which it differs from the lemma. These are followed by the syntax, semantics and linear matrices.
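The node layout described above can be sketched as a small data structure. This is only an illustration of the attribute-value idea, not the actual Machinese output format: the class name `Node`, the method `to_avm`, and the attribute names inside the matrices (`object`, `name`) are assumptions made for the example, and the values are hand-written rather than parser output.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One node of the analysis: a single word or a multi-word unit."""
    word: str                    # running-text token
    lemma: str                   # base form of the nucleus
    head: Optional[str] = None   # printed only when it differs from the lemma
    # The three matrices are attribute-value structures and may nest.
    syntax: dict = field(default_factory=dict)
    semantics: dict = field(default_factory=dict)
    linear: dict = field(default_factory=dict)

    def to_avm(self) -> dict:
        """Return the node as nested attribute-value pairs."""
        avm = {"word": self.word, "lemma": self.lemma}
        if self.head is not None and self.head != self.lemma:
            avm["head"] = self.head
        avm["syntax"] = self.syntax
        avm["semantics"] = self.semantics
        avm["linear"] = self.linear
        return avm


# Illustrative values only (hypothetical attribute names, not parser output).
john = Node(word="John", lemma="John", semantics={"name": "person"})
book = Node(word="book", lemma="book", head="book")
# A matrix value can itself be a node, which is what makes the structure recursive.
given = Node(word="given", lemma="give", syntax={"object": book.to_avm()})
```

Because a matrix value may itself be a node, a whole sentence can be held as one nested structure and walked with ordinary dictionary access, for example `given.to_avm()["syntax"]["object"]["lemma"]`.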
Figure: the Machinese Semantics analysis of the sample sentence “A book was given to John.”, presented as a graphical feature structure.
The backbone of the feature structure is a functional representation of the sentence. It gives a semantic interpretation of the syntactic structure, which means that many language-specific patterns are normalized. For example, the Machinese representation of the sentence “A book was given to John” shows the notional roles object and indirect object, which correspond to the same roles in “Somebody gave John a book”.
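The effect of this normalization can be illustrated with a toy mapping from surface grammatical functions to notional roles. This is a sketch of the idea only, not the method Machinese Semantics uses: the function `normalize`, the input keys (`voice`, `subject`, `direct_object`, `to_phrase`) and the two hand-written clause descriptions are all assumptions made for the example.

```python
def normalize(clause: dict) -> dict:
    """Map surface grammatical functions to notional roles (toy rules)."""
    roles = {"predicate": clause["lemma"]}
    if clause["voice"] == "passive":
        # The surface subject of a passive clause is the notional object.
        roles["object"] = clause["subject"]
    else:
        roles["subject"] = clause["subject"]
        roles["object"] = clause["direct_object"]
    # The recipient may surface as a "to" phrase or as a bare indirect object.
    recipient = clause.get("to_phrase") or clause.get("indirect_object")
    if recipient:
        roles["indirect_object"] = recipient
    return roles


# "A book was given to John."
passive = {"lemma": "give", "voice": "passive",
           "subject": "book", "to_phrase": "John"}
# "Somebody gave John a book."
active = {"lemma": "give", "voice": "active", "subject": "somebody",
          "direct_object": "book", "indirect_object": "John"}

# Both sentences end up with the same object and indirect object.
for role in ("object", "indirect_object"):
    assert normalize(passive)[role] == normalize(active)[role]
```

An application that consumes such normalized roles, for instance an information extraction rule looking for who received what, does not need separate patterns for the active and passive variants.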