Statistical models of language acquisition

Semester: Fall 2012 (Aug 22-Dec 4)
Meeting time: Tuesday/Thursday, 3:55-5:10
Room: Pomerene 206
Instructor: Micha Elsner (melsner@ling.osu.edu)
Office hours: TBA or by appointment

Experiments show that infants are sensitive to the statistical regularities of the world around them and can learn to recognize patterns in the stimuli they are exposed to. This finding has led to a variety of computational models of early language learning based on statistical inference. These models inform us about the kinds of evidence and cognitive biases necessary to learn particular linguistic generalizations. Some of them are also intended to suggest particular mechanisms of infant learning or to explain observed developmental trajectories.

This course is intended to provide an introduction to the computational tools and research methods used to model language acquisition. The course will assume a basic knowledge of linguistics; detailed knowledge of developmental linguistics will be very helpful but is not necessary. However, we will not cover non-computational work on developmental linguistics in any detail. We will also assume an understanding of basic probability. Two introductory lectures will cover standard machine learning techniques for unsupervised learning; these are intended to make the course more accessible to students who are comfortable with mathematical techniques but have not worked extensively with machine learning before. (Students who are not sure they meet the prerequisites, or who want to take this course alongside comp ling or psych ling 1, should see the instructor beforehand.)

The course will be divided into two sections. In the first, we will take a brief tour of the literature on statistical acquisition, covering phonetics, the lexicon, morphophonology and syntax. In the second, students will propose a short research project of their choice that involves experimenting with a statistical model of acquisition. In each section, students will present one or more papers to the class (how many depends on class size); after the project proposals, students will choose papers to present that are related to their projects.

Students will be required to post a question on each reading to Carmen the day before class; these questions will be organized and raised in discussion by the leader for that day.

Grades will be based 40% on presentations, 40% on projects and 20% on short paper reviews for each presented paper.

Schedule

Introductory lectures (1.5 weeks)

Course overview (this is the first class, so there is no reading, although students who haven't read it might look at Saffran "Statistical language learning: Mechanisms and constraints" Current Directions in Psychological Science 2003)

28 August Maximum likelihood methods and EM (optional reading Bishop "Pattern recognition and machine learning", chapter 9)

30 August Bayesian methods (optional reading Navarro, Griffiths, Steyvers and Lee "Modeling individual differences using Dirichlet Processes" Journal of Mathematical Psych 2006)

30 August Please email me the list of three or four papers you would like to present.

Morphophonology (1 week)

20 September Peperkamp, Le Calvez, Nadal, and Dupoux "The acquisition of allophonic rules: statistical learning with linguistic constraints" Cognition 2006 AND Martin, Peperkamp and Dupoux "Learning phonemes with a proto-lexicon" CogSci 2012 (page proofs on Carmen). Presenter: Abby Walker, if she wants, or else Evan

25 September Wilson and Hayes "Maximum entropy phonotactics" Linguistic Inquiry 2008 (up to section 6). Presenter: Jennifer Zhang

25 September Your project proposal is due in two weeks! Start thinking/talking to me about papers you want to read in the second half of the course.

Project proposals (.5 weeks)

9 October Your written project proposal (about 1 page) is due before class today, as is your list of paper suggestions for the second half of the class. Since not too many people seem to be officially taking the class, auditors and vagabonds may also make paper suggestions, which will be considered after those from enrolled students.

Proposed projects: Mental state verbs, filler-gap dependencies.

Class-proposed papers (4-5 weeks)

11 October Barak, Fazly and Stevenson "Modeling acquisition of mental state verbs" CMCL 2012

16 October Shatz, Wellman and Silber "The acquisition of mental verbs: A systematic investigation of the first reference to mental state" Cognition 1983

18 October Alishahi and Stevenson "A computational model of early argument structure acquisition" CogSci 2008

23 October Continue discussion of Alishahi

25 October Papafragou, Cassidy and Gleitman "When we think about thinking: the acquisition of belief verbs" Cognition 2005

29 October Gagliardi, Mease and Lidz "U-shaped development in the acquisition of filler-gap dependencies: Evidence from 20-month-olds" submitted to Language Learning and Development 2012

1 November Connor, Gertner, Fisher and Roth "Baby SRL: Modeling early language acquisition" CONLL 2008

6 November Yuan, Fisher and Snedeker "Counting the nouns: simple structural cues to verb meaning" Child Development 2012

8 November Perfors, Tenenbaum and Wonnacott "Variability, negative evidence, and the acquisition of verb argument constructions" Journal of Child Language 2010

13 November Nixon "Mental state verb production and sentential complements in four-year-old children" First Language 2005

20 November Titov and Klementiev "A Bayesian Approach to Unsupervised Semantic Role Labeling" EACL 2012

25 November Class canceled

27 November Continue discussion of Titov

4 December Dillon, Dunbar and Idsardi "A single stage approach to learning phonological categories: Insights from Inuktitut" submitted to Cognition 2010

Project presentations (1 week)

12 December at 12pm in the linguistics lab!

Collaboration policy

This is a seminar, so go ahead and discuss the papers and your projects outside of class; however, please write the questions you post to Carmen yourself. If you want to collaborate on a project, please see me; otherwise, your project writeup is expected to be your own work. You are encouraged to use software, datasets and articles written by others, as long as you give credit where due using citations. See the COAM site: http://oaa.osu.edu/coam.html

Disability policy

Any student who feels they may need an accommodation based on the impact of a disability should contact me privately to discuss their specific needs, and contact the Office for Disability Services at 614-292-3307 in room 150 Pomerene Hall to coordinate reasonable accommodations.