Linear grammar

In computer science, a linear grammar is a context-free grammar that has at most one nonterminal in the right hand side of its productions.

A linear language is a language generated by some linear grammar.

Example

A simple linear grammar is G with N = {S}, Σ = {a, b}, P with start symbol S and rules

S aSb
S ε

It generates the language \{ a^ib^i \; | \; i \geq 0\}.

Relationship with regular grammars

Two special types of linear grammars are the following:

Collectively, these two special types of linear grammars are known as the regular grammars; both can describe exactly the regular languages.

Another special type of linear grammar is the following:

By inserting new nonterminals, every linear grammar can be brought into this form without affecting the language generated. For instance, the rules of G above can be replaced with

S aA
A Sb
S ε

Hence, linear grammars of this special form can generate all linear languages.

Expressive power

All regular languages are linear; conversely, an example of a linear, non-regular language is {anbn}, as explained above. All linear languages are context-free; conversely, an example of a context-free, non-linear language is the Dyck language of well-balanced bracket pairs. Hence, the regular languages are a proper subset of the linear languages, which in turn are a proper subset of the context-free languages.

While the linear languages that are regular languages are deterministic, there exist linear languages that are nondeterministic. For example, the language of even-length palindromes on the alphabet of 0 and 1 has the linear grammar S → 0S0 | 1S1 | ε. An arbitrary string of this language cannot be parsed without reading all its letters first which means that a pushdown automaton has to try alternative state transitions to accommodate for the different possible lengths of a semi-parsed string.[1] This language is nondeterministic. Since nondeterministic context-free languages cannot be accepted in linear time, linear languages cannot be accepted in linear time in the general case. Furthermore, it is undecidable whether a given context-free language is a linear context-free language.[2]

Closure properties

If L is a linear language and M is a regular language, then the intersection L \cap M is again a linear language; in other words, the linear languages are closed under intersection with regular sets. Furthermore, the linear languages are closed under homomorphism and inverse homomorphism.[3] These three closure properties show that the linear languages form a full trio. Full trios in general are language families that enjoy a couple of other desirable mathematical properties.

References

  1. Hopcroft, John; Rajeev Motwani; Jeffrey Ullman (2001). Introduction to automata theory, languages, and computation 2nd edition. Addison-Wesley. pp. 249–253.
  2. Greibach, Sheila (October 1966). "The Unsolvability of the Recognition of Linear Context-Free Languages". Journal of the ACM 13 (4). doi:10.1145/321356.321365.
  3. John E. Hopcroft and Jeffrey D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison-Wesley Publishing, Reading Massachusetts, 1979. ISBN 0-201-02988-X., Ex. 11.1, pp. 282f

This article is issued from Wikipedia - version of the Tuesday, October 27, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.