Namespaces
Variants
Actions

Difference between revisions of "Grammar, generative"

From Encyclopedia of Mathematics
Jump to: navigation, search
(Importing text file)
 
m (fixing subscript)
 
(One intermediate revision by one other user not shown)
Line 1: Line 1:
 +
<!--
 +
g0448201.png
 +
$#A+1 = 91 n = 0
 +
$#C+1 = 91 : ~/encyclopedia/old_files/data/G044/G.0404820 Grammar, generative,
 +
Automatically converted into TeX, above some diagnostics.
 +
Please remove this comment and the {{TEX|auto}} line below,
 +
if TeX found to be correct.
 +
-->
 +
 +
{{TEX|auto}}
 +
{{TEX|done}}
 +
 
''Chomsky grammar''
 
''Chomsky grammar''
  
 
A type of formal grammar (cf. [[Grammar, formal|Grammar, formal]]); actually it is a special case of a Post calculus (see [[Post canonical system|Post canonical system]]). A systematic study of this grammar was begun in the 1950s by N. Chomsky, who pointed out its applications to linguistics and isolated the classes of generative grammars which are most important to applications — context-sensitive grammars, context-free grammars, regular grammars (cf. [[Grammar, context-sensitive|Grammar, context-sensitive]]; [[Grammar, context-free|Grammar, context-free]]; [[Grammar, regular|Grammar, regular]]). These classes proved to be especially interesting from the mathematical point of view.
 
A type of formal grammar (cf. [[Grammar, formal|Grammar, formal]]); actually it is a special case of a Post calculus (see [[Post canonical system|Post canonical system]]). A systematic study of this grammar was begun in the 1950s by N. Chomsky, who pointed out its applications to linguistics and isolated the classes of generative grammars which are most important to applications — context-sensitive grammars, context-free grammars, regular grammars (cf. [[Grammar, context-sensitive|Grammar, context-sensitive]]; [[Grammar, context-free|Grammar, context-free]]; [[Grammar, regular|Grammar, regular]]). These classes proved to be especially interesting from the mathematical point of view.
  
A generative grammar is an ordered quadruple <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448201.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448202.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448203.png" /> are disjoint finite sets, known, respectively, as the terminal and non-terminal alphabets, or dictionaries (their elements are called, respectively, terminal, or basic, and non-terminal, or auxiliary, symbols), <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448204.png" /> is an element of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448205.png" /> called the initial symbol, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448206.png" /> is a finite set of rules of the form <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448207.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448208.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g0448209.png" /> are strings (words, cf. [[Word|Word]]) over the alphabet <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482010.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482011.png" /> forms no part of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482012.png" />; <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482013.png" /> is called the scheme of the grammar. If two strings <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482014.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482015.png" /> can be represented as, respectively, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482016.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482017.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482018.png" /> is one of the rules of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482019.png" />, then one says that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482020.png" /> is directly derivable from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482021.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482022.png" /> (denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482023.png" /> or <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482024.png" />). A sequence of strings <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482025.png" /> is said to be a derivation of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482026.png" /> from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482027.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482028.png" /> if for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482029.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482030.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482031.png" />; the number <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482032.png" /> is the length of the derivation. A derivation is said to be complete if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482033.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482034.png" /> does not contain non-terminal symbols. If there exists a derivation of a string <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482035.png" /> from a string <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482036.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482037.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482038.png" /> is said to be derivable from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482039.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482040.png" />. The set of strings in the terminal alphabet that are derivable in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482041.png" /> from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482042.png" /> is said to be the language generated by the grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482043.png" /> (denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482044.png" />). Two generative grammars are equivalent if they generate the same language. The class of languages generated by all possible generative grammars coincides with the class of recursively-enumerable sets of strings.
+
A generative grammar is an ordered quadruple $  \Gamma = \langle  V, W, I, R \rangle $,  
 +
where $  V $
 +
and $  W $
 +
are disjoint finite sets, known, respectively, as the terminal and non-terminal alphabets, or dictionaries (their elements are called, respectively, terminal, or basic, and non-terminal, or auxiliary, symbols), $  I $
 +
is an element of $  W $
 +
called the initial symbol, and $  R $
 +
is a finite set of rules of the form $  \phi \rightarrow \psi $,  
 +
where $  \phi $
 +
and $  \psi $
 +
are strings (words, cf. [[Word|Word]]) over the alphabet $  V \cup W $
 +
and $  \rightarrow $
 +
forms no part of $  V \cup W $;  
 +
$  R $
 +
is called the scheme of the grammar. If two strings $  \xi $
 +
and $  \eta $
 +
can be represented as, respectively, $  \xi = \chi _ {1} \phi \chi _ {2} $,  
 +
$  \eta = \chi _ {1} \psi \chi _ {2} $,  
 +
where $  \phi \rightarrow \psi $
 +
is one of the rules of $  \Gamma $,  
 +
then one says that $  \eta $
 +
is directly derivable from $  \xi $
 +
in $  \Gamma $ (denoted by $  \xi \Rightarrow _  \Gamma  \eta $
 +
or $  \xi \Rightarrow \eta $).  
 +
A sequence of strings $  ( \omega _ {0} \dots \omega _ {n} ) $
 +
is said to be a derivation of $  \omega _ {n} $
 +
from $  \omega _ {1} $
 +
in $  \Gamma $
 +
if for all $  i $,  
 +
$  1 \leq  i \leq  n $,  
 +
$  \omega _ {i- 1} \Rightarrow _  \Gamma  \omega _ {i} $;  
 +
the number $  n $
 +
is the length of the derivation. A derivation is said to be complete if $  \omega _ {0} = I $
 +
and $  \omega _ {n} $
 +
does not contain non-terminal symbols. If there exists a derivation of a string $  \eta $
 +
from a string $  \xi $
 +
in $  \Gamma $,  
 +
then $  \eta $
 +
is said to be derivable from $  \xi $
 +
in $  \Gamma $.  
 +
The set of strings in the terminal alphabet that are derivable in $  \Gamma $
 +
from $  I $
 +
is said to be the language generated by the grammar $  \Gamma $ (denoted by $  L ( \Gamma ) $).
 +
Two generative grammars are equivalent if they generate the same language. The class of languages generated by all possible generative grammars coincides with the class of recursively-enumerable sets of strings.
  
The so-called complexity functions, the most important ones being time complexity and space complexity, are used to estimate the complexity of a derivation in generative grammars. The time complexity of a grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482045.png" /> is a function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482046.png" /> of a natural argument, the value of which for each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482047.png" /> is equal to the smallest number <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482048.png" /> having the following property: For any string <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482049.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482050.png" /> (where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482051.png" /> is the length of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482052.png" />) there exists a derivation <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482053.png" /> of this chain from the initial symbol of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482054.png" /> with length at most <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482055.png" />; if there are no strings <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482056.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482057.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482058.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482059.png" />. The space complexity <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482060.png" /> of the grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482061.png" /> is defined in an analogous manner by replacing the length of a derivation <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482062.png" /> by the greatest length of the strings <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482063.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482064.png" />.
+
The so-called complexity functions, the most important ones being time complexity and space complexity, are used to estimate the complexity of a derivation in generative grammars. The time complexity of a grammar $  \Gamma $
 +
is a function $  \tau _  \Gamma  ( n) $
 +
of a natural argument, the value of which for each $  n $
 +
is equal to the smallest number $  k $
 +
having the following property: For any string $  x \in L ( \Gamma ) $
 +
such that $  | x | \leq  n $ (where $  | x | $
 +
is the length of $  x $)  
 +
there exists a derivation $  D $
 +
of this chain from the initial symbol of $  \Gamma $
 +
with length at most $  k $;  
 +
if there are no strings $  x $
 +
such that $  x \in L ( \Gamma ) $
 +
and $  | x | \leq  n $,  
 +
then $  \tau _  \Gamma  ( n) = 0 $.  
 +
The space complexity $  \sigma _  \Gamma  ( n) $
 +
of the grammar $  \Gamma $
 +
is defined in an analogous manner by replacing the length of a derivation $  D = ( \omega _ {0} \dots \omega _ {s} ) $
 +
by the greatest length of the strings $  \omega _ {i} $,  
 +
$  i = 0 \dots s $.
  
Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482065.png" /> be some set of complete derivations in the grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482066.png" /> and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482067.png" /> be the set of terminal strings of derivations belonging to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482068.png" />; then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482069.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482070.png" /> is effectively given, then one says that a certain way of controlling a derivation in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482071.png" /> has been given. A study of the ways of controlling a derivation is material in applications, since the possibility of using certain definite derivations rather than random ones corresponds to the situation which occurs in natural language. Control of a derivation can be realized, in particular, by imposing restrictions on the sequence of rules used in the derivation (e.g. the set of  "admissible"  sequences of rules may itself be generated by some generative grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482072.png" />; in such a case the language is defined by an ordered pair of generative grammars <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482073.png" />, which is called a generalized grammar), either on the strings constituting the derivation or by some more complicated method (e.g. the rule employed in a subsequent stage may depend on the type of the string obtained in the preceding stage).
+
Let $  M  ^  \prime  $
 +
be some set of complete derivations in the grammar $  \Gamma $
 +
and let $  L  ^  \prime  $
 +
be the set of terminal strings of derivations belonging to $  M  ^  \prime  $;  
 +
then $  L  ^  \prime  \subseteq L ( \Gamma ) $.  
 +
If $  M  ^  \prime  $
 +
is effectively given, then one says that a certain way of controlling a derivation in $  \Gamma $
 +
has been given. A study of the ways of controlling a derivation is material in applications, since the possibility of using certain definite derivations rather than random ones corresponds to the situation which occurs in natural language. Control of a derivation can be realized, in particular, by imposing restrictions on the sequence of rules used in the derivation (e.g. the set of  "admissible"  sequences of rules may itself be generated by some generative grammar $  \Gamma  ^  \prime  $;  
 +
in such a case the language is defined by an ordered pair of generative grammars $  ( \Gamma , \Gamma  ^  \prime  ) $,  
 +
which is called a generalized grammar), either on the strings constituting the derivation or by some more complicated method (e.g. the rule employed in a subsequent stage may depend on the type of the string obtained in the preceding stage).
  
Studies of generative grammars necessarily involve algorithmic problems. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482074.png" /> is a property of languages, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482075.png" /> is a decidable class of grammars, and if there exists an algorithm which allows one to decide, for any grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482076.png" />, whether or not the language <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482077.png" /> has the property <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482078.png" />, then one says that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482079.png" /> is decidable in the class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482080.png" />. In the class of all generative grammars no non-trivial property (i.e. such that the corresponding class of languages contains both languages which display and languages which do not display this property) is decidable. One may speak, in a similar manner, about relations which are decidable in a certain class of grammars. There also arise problems of a different type, e.g. the existence, for a given grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482081.png" />, of an algorithm by which it is possible to find, from any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482082.png" /> strings <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482083.png" /> in its terminal alphabet, the value of a given predicate <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482084.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482085.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482086.png" />. In particular, if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482087.png" /> denotes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482088.png" />, then one is asking for an algorithm for recognizing if an arbitrary string forms part of the language <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482089.png" />. If such an algorithm in fact exists for the grammar <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482090.png" />, then the problem of its complexity — or, in other words, of the recognition complexity of the language <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/g/g044/g044820/g04482091.png" /> — is important.
+
Studies of generative grammars necessarily involve algorithmic problems. If $  \alpha $
 +
is a property of languages, $  {\mathcal T} $
 +
is a decidable class of grammars, and if there exists an algorithm which allows one to decide, for any grammar $  \Gamma \in {\mathcal T} $,  
 +
whether or not the language $  L ( \Gamma ) $
 +
has the property $  \alpha $,  
 +
then one says that $  \alpha $
 +
is decidable in the class $  {\mathcal T} $.  
 +
In the class of all generative grammars no non-trivial property (i.e. such that the corresponding class of languages contains both languages which display and languages which do not display this property) is decidable. One may speak, in a similar manner, about relations which are decidable in a certain class of grammars. There also arise problems of a different type, e.g. the existence, for a given grammar $  \Gamma $,  
 +
of an algorithm by which it is possible to find, from any $  n $
 +
strings $  x _ {1} \dots x _ {n} $
 +
in its terminal alphabet, the value of a given predicate $  P ( L, \omega _ {1} \dots \omega _ {n} ) $
 +
for $  L = L ( \Gamma ) $,  
 +
$  \omega _ {1} = x _ {1} \dots \omega _ {n} = x _ {n} $.  
 +
In particular, if $  P ( L, \omega ) $
 +
denotes $  \omega \in L $,  
 +
then one is asking for an algorithm for recognizing if an arbitrary string forms part of the language $  L ( \Gamma ) $.  
 +
If such an algorithm in fact exists for the grammar $  \Gamma $,  
 +
then the problem of its complexity — or, in other words, of the recognition complexity of the language $  L ( \Gamma ) $—  
 +
is important.
  
 
See also [[Grammar, context-sensitive|Grammar, context-sensitive]]; [[Grammar, context-free|Grammar, context-free]]; [[Grammar, linear|Grammar, linear]]; [[Grammar, regular|Grammar, regular]]; [[Mathematical linguistics|Mathematical linguistics]].
 
See also [[Grammar, context-sensitive|Grammar, context-sensitive]]; [[Grammar, context-free|Grammar, context-free]]; [[Grammar, linear|Grammar, linear]]; [[Grammar, regular|Grammar, regular]]; [[Mathematical linguistics|Mathematical linguistics]].
Line 15: Line 114:
 
====References====
 
====References====
 
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  M. Gross,  A. Lentin,  "Introduction to formal grammars" , Springer  (1970)  (Translated from French)</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  A.V. Gladkii,  "Formal grammars and languages" , Moscow  (1973)  (In Russian)</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  J.E. Hopcroft,  J.D. Ullman,  "Formal languages and their relation to automata" , Addison-Wesley  (1969)  (Revised version:  "Introduction to automata, languages and computation"  Addison-Wesley, 1979)</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  A.V. Gladkii,  A.Ya. Dikovskii,  "Theory of formal grammars"  ''J. Soviet Math.'' , '''2''' :  5  (1974)  pp. 542–564  ''Itogi Nauk. i Tekhn. Teor. Veroyatnost. Mat. Statist. Teoret. Kibernetika'' , '''10'''  (1972)  pp. 107–142</TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top">  A.N. Maslov,  E.D. Stotskii,  "Classes of formal grammars"  ''J. Soviet Math.'' , '''6''' :  2  (1976)  pp. 189–209  ''Itogi Nauk. i Tekhn. Teor. Veroyatnost. Mat. Statist. Teoret. Kibernetika'' , '''12'''  (1975)  pp. 156–187</TD></TR><TR><TD valign="top">[6a]</TD> <TD valign="top">  E.D. Stotskii,  "Formal grammars and constraints on derivation"  ''Problems Inform. Transmission'' , '''7''' :  1  (1971)  pp. 69–81  ''Problemy Peredachi Informatsii'' , '''7''' :  1  (1971)  pp. 87–101</TD></TR><TR><TD valign="top">[6b]</TD> <TD valign="top">  E.D. Stotskii,  "Control of the conclusion in formal grammars"  ''Problems Inform. Transmission'' , '''7''' :  3  (1971)  pp. 257–270  ''Problemy Peredachi Informatsii'' , '''7''' :  3  (1971)  pp. 87–102</TD></TR></table>
 
<table><TR><TD valign="top">[1]</TD> <TD valign="top">  M. Gross,  A. Lentin,  "Introduction to formal grammars" , Springer  (1970)  (Translated from French)</TD></TR><TR><TD valign="top">[2]</TD> <TD valign="top">  A.V. Gladkii,  "Formal grammars and languages" , Moscow  (1973)  (In Russian)</TD></TR><TR><TD valign="top">[3]</TD> <TD valign="top">  J.E. Hopcroft,  J.D. Ullman,  "Formal languages and their relation to automata" , Addison-Wesley  (1969)  (Revised version:  "Introduction to automata, languages and computation"  Addison-Wesley, 1979)</TD></TR><TR><TD valign="top">[4]</TD> <TD valign="top">  A.V. Gladkii,  A.Ya. Dikovskii,  "Theory of formal grammars"  ''J. Soviet Math.'' , '''2''' :  5  (1974)  pp. 542–564  ''Itogi Nauk. i Tekhn. Teor. Veroyatnost. Mat. Statist. Teoret. Kibernetika'' , '''10'''  (1972)  pp. 107–142</TD></TR><TR><TD valign="top">[5]</TD> <TD valign="top">  A.N. Maslov,  E.D. Stotskii,  "Classes of formal grammars"  ''J. Soviet Math.'' , '''6''' :  2  (1976)  pp. 189–209  ''Itogi Nauk. i Tekhn. Teor. Veroyatnost. Mat. Statist. Teoret. Kibernetika'' , '''12'''  (1975)  pp. 156–187</TD></TR><TR><TD valign="top">[6a]</TD> <TD valign="top">  E.D. Stotskii,  "Formal grammars and constraints on derivation"  ''Problems Inform. Transmission'' , '''7''' :  1  (1971)  pp. 69–81  ''Problemy Peredachi Informatsii'' , '''7''' :  1  (1971)  pp. 87–101</TD></TR><TR><TD valign="top">[6b]</TD> <TD valign="top">  E.D. Stotskii,  "Control of the conclusion in formal grammars"  ''Problems Inform. Transmission'' , '''7''' :  3  (1971)  pp. 257–270  ''Problemy Peredachi Informatsii'' , '''7''' :  3  (1971)  pp. 87–102</TD></TR></table>
 
 
  
 
====Comments====
 
====Comments====

Latest revision as of 07:49, 13 May 2022


Chomsky grammar

A type of formal grammar (cf. Grammar, formal); actually it is a special case of a Post calculus (see Post canonical system). A systematic study of this grammar was begun in the 1950s by N. Chomsky, who pointed out its applications to linguistics and isolated the classes of generative grammars which are most important to applications — context-sensitive grammars, context-free grammars, regular grammars (cf. Grammar, context-sensitive; Grammar, context-free; Grammar, regular). These classes proved to be especially interesting from the mathematical point of view.

A generative grammar is an ordered quadruple $ \Gamma = \langle V, W, I, R \rangle $, where $ V $ and $ W $ are disjoint finite sets, known, respectively, as the terminal and non-terminal alphabets, or dictionaries (their elements are called, respectively, terminal, or basic, and non-terminal, or auxiliary, symbols), $ I $ is an element of $ W $ called the initial symbol, and $ R $ is a finite set of rules of the form $ \phi \rightarrow \psi $, where $ \phi $ and $ \psi $ are strings (words, cf. Word) over the alphabet $ V \cup W $ and $ \rightarrow $ forms no part of $ V \cup W $; $ R $ is called the scheme of the grammar. If two strings $ \xi $ and $ \eta $ can be represented as, respectively, $ \xi = \chi _ {1} \phi \chi _ {2} $, $ \eta = \chi _ {1} \psi \chi _ {2} $, where $ \phi \rightarrow \psi $ is one of the rules of $ \Gamma $, then one says that $ \eta $ is directly derivable from $ \xi $ in $ \Gamma $ (denoted by $ \xi \Rightarrow _ \Gamma \eta $ or $ \xi \Rightarrow \eta $). A sequence of strings $ ( \omega _ {0} \dots \omega _ {n} ) $ is said to be a derivation of $ \omega _ {n} $ from $ \omega _ {1} $ in $ \Gamma $ if for all $ i $, $ 1 \leq i \leq n $, $ \omega _ {i- 1} \Rightarrow _ \Gamma \omega _ {i} $; the number $ n $ is the length of the derivation. A derivation is said to be complete if $ \omega _ {0} = I $ and $ \omega _ {n} $ does not contain non-terminal symbols. If there exists a derivation of a string $ \eta $ from a string $ \xi $ in $ \Gamma $, then $ \eta $ is said to be derivable from $ \xi $ in $ \Gamma $. The set of strings in the terminal alphabet that are derivable in $ \Gamma $ from $ I $ is said to be the language generated by the grammar $ \Gamma $ (denoted by $ L ( \Gamma ) $). Two generative grammars are equivalent if they generate the same language. The class of languages generated by all possible generative grammars coincides with the class of recursively-enumerable sets of strings.

The so-called complexity functions, the most important ones being time complexity and space complexity, are used to estimate the complexity of a derivation in generative grammars. The time complexity of a grammar $ \Gamma $ is a function $ \tau _ \Gamma ( n) $ of a natural argument, the value of which for each $ n $ is equal to the smallest number $ k $ having the following property: For any string $ x \in L ( \Gamma ) $ such that $ | x | \leq n $ (where $ | x | $ is the length of $ x $) there exists a derivation $ D $ of this chain from the initial symbol of $ \Gamma $ with length at most $ k $; if there are no strings $ x $ such that $ x \in L ( \Gamma ) $ and $ | x | \leq n $, then $ \tau _ \Gamma ( n) = 0 $. The space complexity $ \sigma _ \Gamma ( n) $ of the grammar $ \Gamma $ is defined in an analogous manner by replacing the length of a derivation $ D = ( \omega _ {0} \dots \omega _ {s} ) $ by the greatest length of the strings $ \omega _ {i} $, $ i = 0 \dots s $.

Let $ M ^ \prime $ be some set of complete derivations in the grammar $ \Gamma $ and let $ L ^ \prime $ be the set of terminal strings of derivations belonging to $ M ^ \prime $; then $ L ^ \prime \subseteq L ( \Gamma ) $. If $ M ^ \prime $ is effectively given, then one says that a certain way of controlling a derivation in $ \Gamma $ has been given. A study of the ways of controlling a derivation is material in applications, since the possibility of using certain definite derivations rather than random ones corresponds to the situation which occurs in natural language. Control of a derivation can be realized, in particular, by imposing restrictions on the sequence of rules used in the derivation (e.g. the set of "admissible" sequences of rules may itself be generated by some generative grammar $ \Gamma ^ \prime $; in such a case the language is defined by an ordered pair of generative grammars $ ( \Gamma , \Gamma ^ \prime ) $, which is called a generalized grammar), either on the strings constituting the derivation or by some more complicated method (e.g. the rule employed in a subsequent stage may depend on the type of the string obtained in the preceding stage).

Studies of generative grammars necessarily involve algorithmic problems. If $ \alpha $ is a property of languages, $ {\mathcal T} $ is a decidable class of grammars, and if there exists an algorithm which allows one to decide, for any grammar $ \Gamma \in {\mathcal T} $, whether or not the language $ L ( \Gamma ) $ has the property $ \alpha $, then one says that $ \alpha $ is decidable in the class $ {\mathcal T} $. In the class of all generative grammars no non-trivial property (i.e. such that the corresponding class of languages contains both languages which display and languages which do not display this property) is decidable. One may speak, in a similar manner, about relations which are decidable in a certain class of grammars. There also arise problems of a different type, e.g. the existence, for a given grammar $ \Gamma $, of an algorithm by which it is possible to find, from any $ n $ strings $ x _ {1} \dots x _ {n} $ in its terminal alphabet, the value of a given predicate $ P ( L, \omega _ {1} \dots \omega _ {n} ) $ for $ L = L ( \Gamma ) $, $ \omega _ {1} = x _ {1} \dots \omega _ {n} = x _ {n} $. In particular, if $ P ( L, \omega ) $ denotes $ \omega \in L $, then one is asking for an algorithm for recognizing if an arbitrary string forms part of the language $ L ( \Gamma ) $. If such an algorithm in fact exists for the grammar $ \Gamma $, then the problem of its complexity — or, in other words, of the recognition complexity of the language $ L ( \Gamma ) $— is important.

See also Grammar, context-sensitive; Grammar, context-free; Grammar, linear; Grammar, regular; Mathematical linguistics.

References

[1] M. Gross, A. Lentin, "Introduction to formal grammars" , Springer (1970) (Translated from French)
[2] A.V. Gladkii, "Formal grammars and languages" , Moscow (1973) (In Russian)
[3] J.E. Hopcroft, J.D. Ullman, "Formal languages and their relation to automata" , Addison-Wesley (1969) (Revised version: "Introduction to automata, languages and computation" Addison-Wesley, 1979)
[4] A.V. Gladkii, A.Ya. Dikovskii, "Theory of formal grammars" J. Soviet Math. , 2 : 5 (1974) pp. 542–564 Itogi Nauk. i Tekhn. Teor. Veroyatnost. Mat. Statist. Teoret. Kibernetika , 10 (1972) pp. 107–142
[5] A.N. Maslov, E.D. Stotskii, "Classes of formal grammars" J. Soviet Math. , 6 : 2 (1976) pp. 189–209 Itogi Nauk. i Tekhn. Teor. Veroyatnost. Mat. Statist. Teoret. Kibernetika , 12 (1975) pp. 156–187
[6a] E.D. Stotskii, "Formal grammars and constraints on derivation" Problems Inform. Transmission , 7 : 1 (1971) pp. 69–81 Problemy Peredachi Informatsii , 7 : 1 (1971) pp. 87–101
[6b] E.D. Stotskii, "Control of the conclusion in formal grammars" Problems Inform. Transmission , 7 : 3 (1971) pp. 257–270 Problemy Peredachi Informatsii , 7 : 3 (1971) pp. 87–102

Comments

See also Formal languages and automata.

For a survey of the topic "controlling a derivation" see [a1], Chapt. 5.

References

[a1] A. Salomaa, "Formal languages" , Acad. Press (1973)
[a2] J.E. Hopcroft, J.D. Ulman, "Introduction to automata theory, languages and computation" , Addison-Wesley (1979)
How to Cite This Entry:
Grammar, generative. Encyclopedia of Mathematics. URL: http://encyclopediaofmath.org/index.php?title=Grammar,_generative&oldid=11465
This article was adapted from an original article by A.V. Gladkii (originator), which appeared in Encyclopedia of Mathematics - ISBN 1402006098. See original article