What Is a Free-Form Language?
In mathematics, logic, and computer science, a formal language is a language defined by precise mathematical or machine-processable formulas.
- Chinese name
- Formal language
- Foreign name
- Format Language
- Field
- Mathematics, Logic and Computer Science
- In mathematics, logic, and computer science, a formal language is a language defined by precise mathematical or machine-processable formulas.
- Like languages in linguistics, formal languages generally have two aspects: grammar and semantics. The branch of mathematics and computer science that specializes in the grammar of languages is called formal language theory , which only studies the grammar of a language and does not focus on its semantics. In formal language theory, formal language is a collection of certain finite-length strings on the alphabet. A formal language can contain an unlimited number of strings.
- A finite or infinite set of sentences or symbol strings formed according to a certain law.
Formal language details
- Letters of a formal language are a set of symbols, letters, or marks that can be formed from strings in that language; usually its requirements are limited.
- A string is formed by this letter called a word, which belongs to a particular formal language sometimes called a formal formula. A formal language is often defined by a formal grammar, such as a regular grammar or a context-free grammar, called a formation law.
- Formal language theory mainly studies the pure grammatical domain of languages such as internal structural patterns. Formal language theory is derived from linguistics as a syntactic rule for understanding natural language. In computer science, formal languages are often used as the basis for defining programming languages and grammars, and are a subset of formal versions of natural language. In computational complexity theory, decision problems are usually defined as formal languages, and complex classes are defined as a collection of formal languages, which can be parsed by machines with limited computing power. In logical and mathematical foundations, formal language is the syntax used to represent axiom systems.
Formal language natural language and formal language
- Natural language is the language spoken by humans, such as Chinese, English and French. Such languages are not artificially designed (although some people try to impose some rules) but evolved naturally. Formal Language is a language designed for specific applications. For example, numbers and arithmetic symbols for mathematicians, molecular formulas for chemists, and so on. A programming language is also a formal language, a formal language specifically designed to express computational processes. [1]
- Formal languages have strict syntax rules. For example, 3 + 3 = 6 is a grammatically correct mathematical equation, while 3 = + 6 $ is not, H2O is a correct molecular formula, and 2Zz is not. Grammar rules are made up of rules about tokens and structures. The concept of Token is equivalent to words and punctuation in natural language, numbers and operators in mathematical formulas, and element names and numbers in chemical molecular formulas. The rules about tokens are called lexical rules, and the rules about sentence structure are called grammar rules. [1]
Formal language operations between languages
- Inter-language operations are operations on * power sets.
- Intersection and complement operations on a collection of strings.
- Join operation: L1L2 = {xy | x belongs to L1 and y belongs to L2}.
- Power operation: Ln = L ... L (a total of n L are connected together), L0 = {}.
- Closure operation: L * = L0L1 ... Ln ...
- Right quotient operation: L1 / L2 = {x | exists if y belongs to L2 and xy belongs to L1}.
- Let S * be a language, and the complement language of S is defined as the set { | * and S}
Formal language
- A formal language can define itself in many ways, such as:
- Enumerate individual strings (only for a limited set of strings).
- Generated by formal grammar (see Chomsky Pedigree).
- Generated by regular expressions.
- Recognized by some kind of automaton, such as Turing machine, finite state automaton.
Formal language application
- Programming language
- Main article: Syntax (programming language and compiler)
- The compiler usually consists of two different parts. A lexer, formed by a tool like lex. That recognizes the syntax markup of a programming language. For example: identifiers or keywords, in a simple language expression, usually a regular expression tool. In the most basic concept, a parser consists of a yacc-like parser generator. Try to determine whether the source program is valid. Of course, compilers do more than just parse source code; they usually translate it into some executable format. Therefore, a parser usually outputs mostly yes or no answers, a typical abstract syntax tree, which is used by the subsequent stages of the compiler to finally generate machine code, which includes running directly in hardware for execution, or some intermediate code needs to be executed by a virtual machine. .
Formal linguistics
- Also known as algebraic linguistics, it studies general abstract symbol systems and uses formal models to theoretically analyze and describe languages (including artificial and natural languages).
- Formal grammar: A format that describes what sentences are legal in the language and indicates the rules for combining words into phrases and sentences.
- There are three ways to describe a language: 1, exhaustive 2, grammar 3, automaton, the Chinese method refers to the process of generation, and automaton refers to the process of identification. If a language has a process of identifying it, there must be The process of its creation and vice versa.
- The current formal grammar system is a theoretical model proposed by Chomsky in 1959 to describe natural language
- How to define formal language strictly
- Formal grammar: A formal grammar G consists of four parts, which can be written as G = (VN, VT, S, P), where:
- VN: a non-terminal alphabet called grammar G, VN does not appear in the sentence of the language set represented by G;
- VT: a terminal symbol alphabet called grammar G. The sentence of the language represented by G is composed of elements in VT, VN VT =; S: represents the sentence symbol, S VN.
- P: represents a set of formulas, and the formula in P has the following form: ->
- The production needs to meet the following conditions: 1) can be any string on VN and VT, but it must contain at least one non-terminal character and cannot be a null character; 2) can be any character on VN and VT String, which can also be a null character; 3) at least one of the productions in P must be performed by S;
- Characteristics of formal language
- 1, a high degree of abstraction (using formal means-special symbols, mathematical formulas-to describe the structural relationship of language, this structural relationship is abstract) 2 is a set of deductive systems (the purpose of formal language Limited rules are used to derive infinite sentences in language, and the philosophical basis of formal language is also to use deductive methods to study natural language.3 It has the characteristics of algorithms. Derivation tree)
- Chomsky divides grammar into 4 types, namely type 0, type 1, type 2, and type 3. Type 0 grammar is also called phrase grammar. The ability of type 0 grammar is equivalent to Turing, or any type 0 language is recursively enumerable. Type 1 grammar is also called context-dependent, and its ability is equivalent to linear bounded automata. Type 2 grammar is also called context-free, and its ability is equivalent to non-deterministic pushdown automata. Type 3 grammar is also called right linear grammar. Since this grammar is equivalent to the normal form, it is also called a regular grammar. From the ability of grammar to describe language, type 0 grammar is the strongest and type 3 grammar is the weakest.