This book presents a synthetic analysis of the characteristics of time expressions and named entities, and proposes methods that leverage these characteristics to recognize both kinds of entities in unstructured text. To model them, the authors propose a rule-based method that introduces an abstraction layer between the specific words and the rules, and two learning-based methods that define a new type of tagging scheme based on the constituents of the entities, in contrast to conventional position-based tagging schemes, which cause inconsistent tag assignment. The authors also find that the length-frequency of entities follows a family of power-law distributions. This finding complements the rank-frequency of words and opens another door to understanding our communicative system in terms of language use.
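To make the length-frequency claim concrete, here is a minimal sketch of how such a distribution could be checked on a list of entity strings. It is not the authors' procedure: the function names `length_frequency` and `fit_power_law` are hypothetical, entity length is assumed to be the number of whitespace-separated tokens, and the exponent is estimated with a simple least-squares line in log-log space, which is only a rough heuristic for power-law fitting.

```python
import math
from collections import Counter


def length_frequency(entities):
    """Count how many entities have each length (assumed: whitespace tokens)."""
    lengths = (len(entity.split()) for entity in entities)
    return Counter(length for length in lengths if length > 0)


def fit_power_law(freq):
    """Fit f(L) ~ c * L**(-alpha) by least squares on (log L, log f).

    Returns (alpha, c). This is a rough heuristic, not a rigorous estimator.
    """
    points = [(math.log(length), math.log(count))
              for length, count in freq.items() if count > 0]
    if len(points) < 2:
        raise ValueError("need at least two distinct entity lengths")

    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

    slope = sxy / sxx                      # slope of the log-log line
    intercept = mean_y - slope * mean_x    # log of the scale constant c
    return -slope, math.exp(intercept)


# Toy input, made up for illustration only.
sample = ["Monday", "next week", "New York", "Berlin", "24 August 2021", "today"]
alpha, scale = fit_power_law(length_frequency(sample))
print(f"estimated exponent: {alpha:.2f}, scale: {scale:.2f}")
```

On a real corpus, the entity list would come from annotated data, and a maximum-likelihood fit with a goodness-of-fit test would be a more defensible choice than a log-log regression.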
- ISBN13 9783030789602
- Publish Date 24 August 2021
- Publish Status Active
- Publish Country CH
- Imprint Springer Nature Switzerland AG
- Edition 1st ed. 2021
- Format Hardcover
- Pages 96
- Language English