How do computers represent, process and organize textual and spoken information? In this class, we will look at everyday tasks that involve natural language processing: document classification, spelling and grammar correction, dialogue systems, machine translation, search engines and forensic linguistics. You will get insight into how these systems work. We also examine social and ethical issues such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.



Course Requirements

There will be six assignments, one essay, and two exams (a mid-term exam and a final exam). The mid-term will cover the material from the first half of the class. The final will be comprehensive, with greater emphasis on the material covered in the second half of the class.

A tentative schedule for the entire semester is posted on the schedule page. Readings and assignments may change up to one week in advance of their due dates.

Because the assignments and exams address the material covered in class, regular attendance is essential for doing well in this class.


Homework (40%): Six assignments, each due by the beginning of class. Late homework will not be accepted. The lowest homework grade will be dropped, so each of the five assignments that count is worth 8%.

Essay (15%): A 1000-1500 word essay on a topic dealing with the social implications of computational applications for language.

Mid-term Exam (20%): The mid-term exam will be given in class on February 26 and will cover the material presented in class up to February 14.

Final Exam (25%): The final exam will be given during finals week and will cover all course material.

Grades will be assigned using the standard OSU scale.
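
To make the weighting above concrete, here is a minimal sketch of how the four components combine into an overall percentage. It assumes every score is reported on a 0-100 scale and that the five counted homeworks are weighted equally. The function name `course_grade` is hypothetical, and the conversion from percentage to letter grade is deliberately left out.

```python
def course_grade(homeworks, essay, midterm, final):
    """Return the weighted course percentage (0-100).

    homeworks: list of six homework scores, each assumed to be on a 0-100 scale.
    essay, midterm, final: scores assumed to be on a 0-100 scale.
    """
    if len(homeworks) != 6:
        raise ValueError("expected six homework scores")
    # Drop the lowest homework; the remaining five are worth 8% each (40% total).
    counted = sorted(homeworks)[1:]
    homework_avg = sum(counted) / len(counted)
    # Weights taken from the breakdown above: 40% + 15% + 20% + 25% = 100%.
    return 0.40 * homework_avg + 0.15 * essay + 0.20 * midterm + 0.25 * final
```

For example, a student who misses one homework (scored as 0) but earns full marks on the other five loses nothing on the homework component, because the missed assignment is the one dropped.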


We'll be using the Carmen system for the schedule, homework and reading assignments. There will also be discussion forums for posting questions and providing feedback (comments, complaints or ideas) during the course, anonymously if desired.

Materials for in-class activities for each unit will be posted on Carmen, as will the slides presented in class. These slides are meant to aid classroom discussion and cannot replace actually being in class. Other readings may also be assigned periodically.

The course satisfies the GEC category 1B2 (formerly 2B), Mathematical and Logical Analysis. It does so by using natural language systems to motivate students to exercise and develop a range of basic skills in formal and computational analysis. The course philosophy is to ground abstract concepts in real-world examples. We introduce strings, regular expressions, finite-state and context-free grammars, as well as algorithms defined over these structures and techniques for probing and evaluating systems that rely on these algorithms. The course goes beyond merely subjective evaluation of systems, emphasizing analysis and reasoning to draw and argue for valid conclusions about the design, capabilities and behavior of natural language systems. This course does not cover speech (audio) or statistical analysis in depth.
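
As a small illustration of the formal objects named above, the following sketch describes one set of strings in two ways: as a regular expression and as a finite-state machine. It is not taken from the course materials. The language (one or more repetitions of "ab"), the state numbering, and the helper names `regex_accepts` and `dfa_accepts` are all assumptions chosen for the example.

```python
import re

# Regular expression for the language: "ab" repeated one or more times.
PATTERN = re.compile(r"(ab)+")

# A hand-built finite-state machine for the same language (hypothetical numbering):
# state 0 = start, state 1 = just read 'a', state 2 = just read 'b' (accepting).
TRANSITIONS = {(0, "a"): 1, (1, "b"): 2, (2, "a"): 1}
ACCEPTING = {2}


def regex_accepts(s):
    """True if the whole string matches the regular expression."""
    return PATTERN.fullmatch(s) is not None


def dfa_accepts(s):
    """True if the finite-state machine ends in an accepting state."""
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: reject immediately
            return False
    return state in ACCEPTING


# Probe both descriptions with the same test strings and compare their answers.
for s in ["", "ab", "aba", "abab", "ba"]:
    print(repr(s), regex_accepts(s), dfa_accepts(s))
```

Running both recognizers on the same inputs and checking that they agree is a simple instance of probing a system to draw conclusions about its behavior.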

