Formal Grammar and Information Theory: Together Again?
Oct 26, 2008
This is a Ph.D. research paper for advanced graduate students only.
ABSTRACT: In the last forty years, research on models of spoken and written language has been split between two seemingly irreconcilable traditions: formal linguistics in the Chomsky tradition, and information theory in the Shannon tradition. Zellig Harris had advocated a close alliance between grammatical and information-theoretic principles in the analysis of natural language, and early formal-language theory provided another strong link between information theory and linguistics. Nevertheless, in most research on language and computation, grammatical and information-theoretic approaches had moved far apart. Today, after many years on the defensive, the information-theoretic approach has gained new strength and achieved practical successes in speech recognition, information retrieval, and, increasingly, in language analysis and machine translation. The exponential increase in the speed and storage capacity of computers is the proximate cause of these engineering successes, allowing the automatic estimation of the parameters of probabilistic models of language by counting occurrences of linguistic events in very large bodies of text and speech. However, I will also argue that information-theoretic and computational ideas are playing an increasing role in the scientific understanding of language, and will help bring together formal-linguistic and information-theoretic perspectives.
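
To make "estimation by counting" concrete, the following is a minimal sketch of one such estimator, not taken from the paper: a relative-frequency (maximum-likelihood) bigram model. The function name `bigram_mle` and the toy corpus are assumptions introduced purely for illustration; practical systems use far larger corpora and add smoothing for unseen events.

```python
from collections import Counter


def bigram_mle(tokens):
    """Estimate P(next | prev) by relative frequency over adjacent token pairs.

    Hypothetical helper for illustration only; no smoothing is applied,
    so unseen pairs simply have no entry in the result.
    """
    # Count every adjacent (prev, next) pair in the token sequence.
    pair_counts = Counter(zip(tokens, tokens[1:]))
    # Count how often each token occurs as the first element of a pair.
    prev_counts = Counter(tokens[:-1])
    return {
        (prev, nxt): count / prev_counts[prev]
        for (prev, nxt), count in pair_counts.items()
    }


# Toy corpus (an assumption, standing in for "very large bodies of text").
corpus = "the dog barks the dog sleeps the cat sleeps".split()
probs = bigram_mle(corpus)
print(probs[("the", "dog")])
```

Each estimated parameter is simply a ratio of two counts, which is why growth in storage and processing speed translates so directly into better-estimated models.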