Formal language theory

Subject classification: this is a mathematics resource.
Educational level: this is a tertiary (university) resource.
Completion status: this resource is a stub, so not much has been done yet.
Development status: this resource is experimental in nature.


A formal language is a set of strings over a finite alphabet. Formal language theory is the study of formal languages, or often more accurately the study of families of formal languages. It deals with hierarchies of language families defined in a wide variety of ways. Formal language theory is concerned with the purely syntactical aspects, rather than a semantics or meaning of the strings[1].

It is closely related to automata theory, which deals with formally defined machines that accept (or generate, according to the viewpoint) formal languages.

It is also related to complexity theory, because once a formal system capable of universal computation is fixed, a computational problem is nothing but a formal language containing all the descriptions of problem instances, and the answer to the problem is the subset of instances that would be accepted (could be derived) -- which need not be effective, and need not be a computable set, of course.


Preliminaries edit

The free monoid   over a finite set   is the set of finite strings whose letters are in the alphabet  , together with the basic operation of concatenation  , which puts two strings together.   contains the empty word, written as   or  . Languages may be finite or infinite (  itself is a language containing all possible words over  , whilst   is a finite language), but contain only finite words.

Hierarchies edit

One is often interested in establishing hierarchies of language families. This is often much easier than in the field of computational complexity theory, since the languages of a family often share some structural property that allows one to pinpoint the difference between two families.

Hierarchies you might want to know about include the (extended) Chomsky hierarchy. It originates with the linguist Noam Chomsky, who sought to classify syntactic phenomena in natural languages. The classes described therein are now basic to formal language theory because they have many equivalent definitions and many of their properties are well known.


Planned Subjects edit

  • Complexity of parsing 
  • Decidability of certain questions 
  • Beyond ordinary words 
  • Some applications 


Extensions edit

Topics closely related to formal language theory are the study of infinite words, labeled partial orders, the study of graph languages, formal power series. Some of these subjects have a history of their own, but all include as a special case the case of ordinary word languages. Also, infinite alphabets have found applications.

Applications edit

Outside of computer science edit

Formal language theory finds uses outside of computer science and can also be of use to group theorists, for instance, when studying finitely generated groups.

Further Reading edit

Hopcroft and Ullman (the original version)

References edit

  1. Wikipedia, https://en.wikipedia.org/wiki/Formal_language