What Is Machine Translation?
Machine translation, also known as automatic translation, is the process of using a computer to convert one natural language (source language) into another natural language (target language). It is a branch of computational linguistics and one of the ultimate goals of artificial intelligence. It has important scientific research value.
- Statistics-based machine translation
- The development of machine translation technology has closely followed the development of computer technology, information theory, linguistics, and other disciplines. Driven by improvements in computing power and the explosive growth of multilingual information, it has progressed from early dictionary matching, to dictionary-based translation combined with the knowledge of linguistic experts, to corpus-based statistical machine translation.
- Corpus-based machine translation systems are generally statistical. As the field rose rapidly, the statistics came to be computed over parallel corpora, and many different statistical models were derived from them.
- A rule-based machine translation system consists of a dictionary and a base of grammar rules. A corpus-based system, by contrast, is built on a corpus that has been segmented and annotated. Corpus-based methods can be divided into statistics-based methods and example-based methods.
- The statistics-based method regards machine translation as a process of information transmission and explains it with a channel model. On this view, translating a source-language sentence into a target-language sentence is a probability problem: any target-language sentence may be a translation of any source-language sentence, only with a different probability, and the task of machine translation is to find the sentence with the highest probability. Concretely, translation is treated as a decoding process that converts the original text into its translation.
- Since 2013, with the great progress of deep learning research, neural machine translation based on artificial neural networks has gradually emerged. Its core is a deep neural network with a massive number of nodes (neurons) that automatically learns translation knowledge from a corpus. The sentences of one language are vectorized and passed through the network layer by layer, converted into representations that the computer can "understand", and then put through multiple layers of complex computation to generate a translation in another language. This realizes a translation method of "understanding the language, then generating the translation". Its biggest advantage is that the output is fluent, conforms better to grammatical norms, and is easier to understand. Compared with earlier translation technology, quality has improved by leaps.
- At present, the model widely used in machine translation is the Long Short-Term Memory (LSTM) recurrent neural network (RNN). This model is good at modeling natural language: it transforms a sentence of any length into a vector of floating-point numbers of a fixed dimension, while "remembering" the more important words in the sentence and allowing that "memory" to be kept for a longer time. It handles the vectorization of natural language sentences very well, which matters greatly for computer processing of natural language: the computer no longer stops at simple literal matching but moves further, toward the level of semantic understanding.
- Representative research institutions and companies include the University of Montreal's machine learning lab, which released GroundHog, an open-source neural machine translation system. In 2015, Baidu released an online translation system that combines statistical and deep learning methods, and Google has also conducted in-depth research in this area.
- Because machine translation still has a considerable market, the Chinese vendors active in this field are also varied.
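The channel-model view of statistical translation described above is usually written as choosing the target sentence e that maximizes P(e) · P(f | e) for a source sentence f. The following is a minimal sketch of that decoding step, not a real system: the candidate sentences, the probability tables, and the names `LANGUAGE_MODEL`, `TRANSLATION_MODEL`, `score`, and `decode` are all invented for illustration, whereas a real system estimates these probabilities from a parallel corpus and searches a far larger candidate space.

```python
import math

# Hypothetical toy probabilities, invented for illustration only.
# Language model P(e): how plausible each candidate target sentence is.
LANGUAGE_MODEL = {
    "the house is small": 0.6,
    "small the house is": 0.1,
    "the home is little": 0.3,
}

# Translation model P(f | e): how likely the source sentence is,
# given each candidate target sentence.
TRANSLATION_MODEL = {
    ("das haus ist klein", "the house is small"): 0.5,
    ("das haus ist klein", "small the house is"): 0.5,
    ("das haus ist klein", "the home is little"): 0.2,
}


def score(source, candidate):
    """Return log P(candidate) + log P(source | candidate)."""
    p_e = LANGUAGE_MODEL.get(candidate, 0.0)
    p_f_given_e = TRANSLATION_MODEL.get((source, candidate), 0.0)
    if p_e == 0.0 or p_f_given_e == 0.0:
        return float("-inf")  # an impossible candidate can never win
    return math.log(p_e) + math.log(p_f_given_e)


def decode(source, candidates):
    """Pick the candidate translation with the highest channel-model score."""
    return max(candidates, key=lambda e: score(source, e))


best = decode("das haus ist klein", list(LANGUAGE_MODEL))
print(best)
```

In this toy table the second candidate matches the source words as well as the first, but the language model penalizes its unnatural word order, so the fluent sentence wins. That division of labor between the two models is the point of the channel formulation.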
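The claim above that an LSTM turns a sentence of any length into a vector of fixed dimension can be illustrated with a small sketch. It assumes the standard LSTM cell equations (input, forget, and output gates plus a candidate state) and uses random, untrained weights, so the resulting vectors carry no real meaning. The dimensions and the names `EMBED_DIM`, `HIDDEN_DIM`, `embed`, `lstm_step`, and `encode` are hypothetical choices for this example only.

```python
import math
import random

random.seed(0)  # fixed seed so the untrained toy weights are reproducible

EMBED_DIM = 4   # size of each word vector (hypothetical)
HIDDEN_DIM = 3  # size of the final sentence vector (hypothetical)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# One weight matrix and bias per gate: input (i), forget (f), output (o),
# and candidate state (g). Each matrix acts on [word vector, previous hidden state].
WEIGHTS = {gate: rand_matrix(HIDDEN_DIM, EMBED_DIM + HIDDEN_DIM) for gate in "ifog"}
BIASES = {gate: [0.0] * HIDDEN_DIM for gate in "ifog"}

EMBEDDINGS = {}  # word -> random vector, created on first use


def embed(word):
    if word not in EMBEDDINGS:
        EMBEDDINGS[word] = [random.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
    return EMBEDDINGS[word]


def affine(gate, inputs):
    """Compute W_gate * inputs + b_gate."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(WEIGHTS[gate], BIASES[gate])
    ]


def lstm_step(x, h, c):
    """Apply one LSTM cell update and return the new (hidden, cell) state."""
    inputs = x + h  # list concatenation of word vector and hidden state
    i = [sigmoid(v) for v in affine("i", inputs)]
    f = [sigmoid(v) for v in affine("f", inputs)]
    o = [sigmoid(v) for v in affine("o", inputs)]
    g = [math.tanh(v) for v in affine("g", inputs)]
    # The forget gate decides how much old "memory" to keep,
    # the input gate how much of the new candidate to store.
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def encode(sentence):
    """Read a sentence word by word and return the final hidden state."""
    h = [0.0] * HIDDEN_DIM
    c = [0.0] * HIDDEN_DIM
    for word in sentence.split():
        h, c = lstm_step(embed(word), h, c)
    return h


short_vec = encode("hello world")
long_vec = encode("machine translation converts one natural language into another")
print(len(short_vec), len(long_vec))
```

Both sentences, two words and eight words long, come out as vectors of length `HIDDEN_DIM`. In a trained translation system the weights would be learned from a corpus, and a decoder network would generate the target sentence from this vector.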
Machine translation errors are inevitable
- Many people misunderstand machine translation, believing that its output is too inaccurate to help solve any real problem. In fact, errors are inevitable. The reason is that machine translation applies linguistic principles: the machine automatically analyzes the grammar, looks up its stored lexicon, and produces the corresponding translation. Colloquial word order can defeat this approach, for example the line "Give me a reason to kill you, first" from the film "A Chinese Odyssey", in which the adverbial is placed at the end of the sentence. A machine is, after all, a machine, and it has no special feeling for language. How could it sense the charm of a line such as "the gentleness of bowing down, like the shyness of a water lotus"? Because of variations in morphology, grammar, syntax, and context, the meaning of a Chinese sentence can differ greatly. Even many Chinese speakers are left baffled by such sentences, let alone a machine, which has no mind at all.
Machine translation bottlenecks
- In fact, whichever method is used, the biggest constraint on the development of machine translation is the quality of the translation. Judging from the results achieved so far, the quality of machine translation is still far from the ultimate goal.
- The Chinese mathematician and linguist Zhou Haizhong pointed out in the paper "Fifty Years of Machine Translation" that improving the quality of machine translation first requires solving the problem of language itself, not the problem of programming, and that building a machine translation system from a few programs alone will certainly not improve translation quality. He also pointed out that, as long as humans do not understand how the brain performs fuzzy recognition and logical judgment of language, machine translation cannot reach the level of "faithfulness, expressiveness, and elegance". This is probably the bottleneck that restricts the quality of translations. [1]
- It is worth mentioning that the American inventor and futurist Ray Kurzweil predicted in an interview with the Huffington Post that the quality of machine translation will reach the level of human translation by 2029. This assertion remains highly controversial in the academic community.
- In any case, this is the period when people are most optimistic about machine translation, and that optimism rests on objective understanding and rational thinking. There is also reason to believe that, with the joint efforts of computer scientists, linguists, psychologists, logicians, and mathematicians, the bottleneck of machine translation will be overcome.