Chair for Human Language Technology and Machine Learning


Human Language Technology and Pattern Recognition

The Lehrstuhl für Informatik 6 conducts research on advanced methods for statistical pattern recognition. These methods are applied mainly to the automatic processing of human language, i.e., the recognition of speech, the translation of spoken and written language, the understanding of natural language, and spoken dialogue systems. The general framework for the research activities is based on statistical decision theory and problem-specific modelling. The prototypical area where this approach has been pushed forward is speech recognition. The approach is expressed by the equation:

Speech Recognition = Acoustic-Linguistic Modelling + Statistical Decision Theory

The characteristic advantages of the probabilistic framework and statistical decision theory are:

  • The approach is able to model weak dependencies and vague knowledge at all levels of the system.
  • The free parameters of the models can be automatically learned from training data (or examples), and there exist powerful algorithms for this purpose.
  • Using the Bayes decision rule (as derived from statistical decision theory), the final decision is made by taking all available context into account. For example, in large vocabulary speech recognition, a sound is always recognized as part of a word, which is itself part of a sentence. This allows optimal feedback from the syntactic-semantic constraints of the language down to the level of sound recognition.
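
As an illustration of the last point, the Bayes decision rule for speech recognition is commonly written as shown below. The notation is an assumption made here for illustration and does not appear in the text above: x_1^T denotes the sequence of acoustic observation vectors and w_1^N a candidate word sequence.

```latex
% Bayes decision rule for speech recognition (notation assumed for illustration):
%   x_1^T : sequence of acoustic observation vectors
%   w_1^N : candidate word sequence
\[
  \hat{w}_1^N
    = \arg\max_{w_1^N} \Pr(w_1^N \mid x_1^T)
    = \arg\max_{w_1^N} \bigl\{ \Pr(w_1^N) \cdot \Pr(x_1^T \mid w_1^N) \bigr\}
\]
% Pr(w_1^N)           : language model (linguistic modelling)
% Pr(x_1^T | w_1^N)   : acoustic model (acoustic modelling)
% The second equality follows from Bayes' theorem, since Pr(x_1^T)
% does not depend on the word sequence being maximized over.
```

Because the maximization runs over whole word sequences rather than over isolated sounds, the language model constrains which sound hypotheses survive, which is the feedback from sentence level to sound level described above.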

From speech recognition, we have extended and are still extending this approach to other areas, in particular the translation of spoken and written language and other tasks in natural language processing.

For language translation, the approach is expressed by the equation:

Language Translation = Linguistic Modelling + Statistical Decision Theory
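
One common way to make this equation concrete in statistical machine translation is the source-channel form of the Bayes decision rule, sketched below. The notation is assumed here for illustration and is not taken from the text: f_1^J is the source-language sentence and e_1^I a candidate target-language sentence.

```latex
% Source-channel form of the Bayes decision rule for translation
% (notation assumed for illustration):
%   f_1^J : source-language sentence of J words
%   e_1^I : candidate target-language sentence of I words
\[
  \hat{e}_1^I
    = \arg\max_{e_1^I} \Pr(e_1^I \mid f_1^J)
    = \arg\max_{e_1^I} \bigl\{ \Pr(e_1^I) \cdot \Pr(f_1^J \mid e_1^I) \bigr\}
\]
% Pr(e_1^I)           : target language model
% Pr(f_1^J | e_1^I)   : translation model
```

This is a sketch of the general decision-theoretic formulation, not a description of the specific models used in any particular system.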

This approach has been pursued in projects such as VERBMOBIL (German) and EUTRANS (European). Experimental comparisons with traditional rule-based and other competing approaches show that the statistical approach is competitive with them in terms of performance, or even superior. In addition, it offers advantages such as increased robustness and easy adaptation to new tasks. In the final large-scale end-to-end evaluation of the VERBMOBIL translation project, the RWTH Aachen translation approach achieved a sentence error rate that was lower by a factor of two than those of three competing translation approaches.

In summary, the research activities of the Lehrstuhl für Informatik 6 cover the following applications:

  • speech recognition
    • large vocabulary recognition
    • multi-lingual speech recognition
    • speaker-independent and adaptive speech recognition
    • robust speech recognition
  • machine translation of spoken and written language
  • natural language processing
    • document classification
    • language understanding
    • information retrieval for text and audio documents
    • spoken dialogue systems
  • image recognition

Most of these research activities have been or are being carried out within national or European projects, such as the national German project VERBMOBIL and the European projects ARISE, EUTRANS, CORETEX, and ADVISOR. In addition, there are bilateral research projects with companies.