What is Word Sense Disambiguation?
In computational linguistics, word sense disambiguation (WSD) is an open problem in natural language processing and ontology. Ambiguity and its resolution are core issues in natural language understanding: at the level of word meaning, sentence meaning, and discourse meaning, the same expression can be read differently depending on context and semantics. Disambiguation is the process of determining the intended meaning of an expression from its context. Word sense disambiguation is semantic disambiguation at the word level.
- Semantic disambiguation, and word sense disambiguation in particular, is a core difficulty in natural language processing. It affects the performance of almost all tasks, such as search engines, opinion mining, text understanding and generation, and reasoning.
- Over its long development, a language accumulates many polysemous usages. Language emerges from many factors, and its use changes constantly. A word acquires many specific meanings as it develops, and several of them may remain in common use. Different regions and different industries may use the same word differently; even different groups, individuals, and tones of voice can give a word its own interpretation. Semantic disambiguation is one part of language understanding. On the one hand, we must understand the senses of common polysemous words and how they are used. On the other hand, we must consider the specific scenario and use relevant knowledge bases and corpus training to improve performance on polysemous words.
- So far, a variety of techniques have been studied: dictionary-based methods, methods that use knowledge bases and knowledge graphs, supervised, unsupervised, and semi-supervised learning, and methods based on words or word vectors. The likely direction of development is to combine multiple resources with semi-supervised learning and with word and word-vector representations. [1]
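As a rough illustration of the word-vector family of methods mentioned above, the following minimal sketch picks the sense whose vector is closest (by cosine similarity) to the average vector of the context words. The three-dimensional vectors, the sense labels, and the function names are all hypothetical and exist only for illustration; a real system would use trained embeddings.

```python
import math

# Hypothetical toy vectors; real systems would load trained embeddings.
WORD_VECTORS = {
    "money": [1.0, 0.0, 0.1],
    "loan": [0.9, 0.1, 0.0],
    "river": [0.0, 1.0, 0.1],
    "water": [0.1, 0.9, 0.0],
}

# Hypothetical sense vectors for the ambiguous word "bank".
SENSE_VECTORS = {
    "bank.finance": [1.0, 0.0, 0.0],
    "bank.river": [0.0, 1.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def disambiguate(context_words, sense_vectors=SENSE_VECTORS, word_vectors=WORD_VECTORS):
    """Return the sense closest to the averaged context, or None if no context word is known."""
    known = [word_vectors[w] for w in context_words if w in word_vectors]
    if not known:
        return None
    context_vector = average(known)
    return max(sense_vectors, key=lambda s: cosine(context_vector, sense_vectors[s]))
```

Calling `disambiguate(["money", "loan"])` selects the financial sense, while `disambiguate(["river", "water"])` selects the river sense. Averaging context vectors is only one simple design choice among many.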
Dictionaries for word sense disambiguation
- Dictionary-based semantic disambiguation depends on how finely the dictionary distinguishes senses. Coarse-grained polysemy separates broad senses, such as "water" (in Chinese usage), which may mean natural water or parallel imports; fine-grained polysemy separates smaller differences in meaning. If the dictionary lacks a level of granularity or a description of some sense, treating the dictionary as a complete description of the word's meanings will cause problems. This limitation applies to both WSD (word sense disambiguation) and EL (entity linking). One solution is to use incremental clustering to build sense groups for the senses that are sparsely described.
- Commonly used dictionaries for English include WordNet, Roget's Thesaurus, and BabelNet. For any language, commonly used dictionaries, online encyclopedias, and professional knowledge bases or databases can serve as the disambiguation dictionary. [2]
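The dictionary-based approach can be sketched with a simplified Lesk-style overlap count: choose the sense whose dictionary gloss shares the most words with the context. This is a minimal sketch under stated assumptions. The sense inventory, the glosses, and the stopword list below are hypothetical toy data, not entries from WordNet or any other real dictionary.

```python
import re

# Hypothetical toy sense inventory: word -> {sense id -> gloss}.
SENSES = {
    "bank": {
        "bank.finance": "an institution that accepts deposits and lends money",
        "bank.river": "sloping land beside a river or lake",
    }
}

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "by", "at", "we"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def simplified_lesk(word, context, senses=SENSES):
    """Return the sense id whose gloss overlaps most with the context, or None."""
    context_tokens = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense
```

For example, `simplified_lesk("bank", "She deposits money at the bank")` returns `"bank.finance"`. If a sense has no gloss or only a very short one, it can never win the overlap count, which is the coverage problem described above.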
Word sense disambiguation and part-of-speech tagging
- Part-of-speech tagging and word sense disambiguation are two interrelated problems, and humans resolve both at the same time. Current systems, however, generally do not let the two share parameters or produce both outputs jointly. The stages of semantic understanding, including word segmentation, part-of-speech tagging, word sense disambiguation, syntactic parsing, and semantic parsing, do not form a purely feed-forward pipeline; they depend on one another and feed back into each other.
- Part-of-speech tagging and semantic disambiguation both rely on context, but part-of-speech tagging is simpler and more successful. The main reasons are that the tag set for part-of-speech tagging is fixed, whereas the set of senses is not fixed and is much larger, and that part-of-speech tagging depends on a shorter span of context than semantic disambiguation does. [2]
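The relationship between the two tasks can be illustrated with a small sketch in which a part-of-speech tag narrows the candidate senses of a word but does not always resolve it. The inventory, sense ids, and tag names below are hypothetical and chosen only for illustration.

```python
# Hypothetical sense inventory: word -> list of (sense id, part of speech).
INVENTORY = {
    "book": [
        ("book.volume", "NOUN"),
        ("book.record", "NOUN"),
        ("book.reserve", "VERB"),
    ],
    "run": [
        ("run.move_fast", "VERB"),
        ("run.operate", "VERB"),
        ("run.sequence", "NOUN"),
    ],
}


def candidate_senses(word, pos=None, inventory=INVENTORY):
    """Return the sense ids for a word, optionally restricted to one part of speech."""
    senses = inventory.get(word, [])
    if pos is None:
        return [sense_id for sense_id, _ in senses]
    return [sense_id for sense_id, sense_pos in senses if sense_pos == pos]
```

With this toy data, tagging "book" as a verb leaves a single candidate, while tagging "run" as a verb still leaves two senses that need further context to separate. The tag set here has two labels shared by every word, whereas each word brings its own list of senses.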
Judging word sense disambiguation
- Sometimes even people cannot judge which sense a word belongs to. Human agreement on coarse-grained distinctions is higher than on fine-grained ones. Because human judgment is used as the gold standard, coarse-grained tasks are generally chosen. [2]
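The effect of granularity on human agreement can be sketched by computing simple observed agreement between two annotators, first on fine-grained labels and then after mapping those labels to coarse senses. The labels, the mapping, and the annotations below are hypothetical examples, not real annotation data.

```python
# Hypothetical mapping from fine-grained sense labels to coarse-grained ones.
FINE_TO_COARSE = {
    "bank.1a": "bank.1",
    "bank.1b": "bank.1",
    "bank.2": "bank.2",
}


def observed_agreement(labels_a, labels_b):
    """Fraction of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def to_coarse(labels, mapping=FINE_TO_COARSE):
    """Map each fine-grained label to its coarse-grained sense."""
    return [mapping[label] for label in labels]


# Hypothetical annotations of the same four occurrences of "bank".
annotator_a = ["bank.1a", "bank.1b", "bank.2", "bank.1a"]
annotator_b = ["bank.1b", "bank.1b", "bank.2", "bank.1a"]

fine_agreement = observed_agreement(annotator_a, annotator_b)
coarse_agreement = observed_agreement(to_coarse(annotator_a), to_coarse(annotator_b))
```

In this toy case the annotators disagree on one item at the fine level, and that disagreement disappears once the two fine senses are merged. Observed agreement does not correct for chance; a chance-corrected measure would be a natural refinement.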
Word sense disambiguation and linguistics
- Many researchers believe that disambiguating the meaning of a word requires an understanding of pragmatics and some common sense. Language is closely tied to knowledge, so language-related common sense is needed for the analysis, just as entity disambiguation requires entity-related knowledge. [2]
Differences in word sense disambiguation across tasks
- How word sense disambiguation is applied differs from task to task. In translation, for example, the intermediate result of word sense disambiguation does not need to be output explicitly; only the final sentence needs to preserve the meaning. [2]
Defining polysemy in word sense disambiguation
- People can generally reach a consensus on coarse-grained sense definitions, but agreement becomes difficult as the distinctions get finer. Even the same sense may differ across environments: because language expression has unlimited possibilities, meaning can shift at fine granularity. [2]