The Lain Model

What is the Lain model?

The Lain model implements a theory of the acquisition of vowel normalization based on an infant's creative generation and alignment of models of the self and others, and serves as a formalization of a computational modeling framework. Details of the conceptualization and theory are provided in Plummer (2014) and in Plummer and Beckman (to appear). The framework focuses on basic aspects of the vocal interaction between caregiver and infant agents, how each agent represents this interaction, and how these representations structure phonological acquisition. The implementation itself is divided into a base model, consisting of a core set of functions, and a set of modules that extend the base model. The base model and demos-module are currently available for download. Other modules are in development and will be made available as they are completed.

Downloads

base model and demos-module: [tar.gz, doc]
virtual machine [speechkitchen link]

Quick Start Demos

The base model and demos-module are written in the programming language R. The virtual machine already has R installed. If you don't have R installed on your own computer, you can get it from the R Project website.

The base model and demos-module also require the packages listed below. The virtual machine already has these packages installed, but if you're using the tar.gz download, you'll need to install them before proceeding.

igraph, network, mgcv, Matrix, RANN, rgl, scales, ggplot2, akima, mnormt
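One way to check for missing packages before running the demos is sketched below. The variable names are illustrative, and the install step only runs in an interactive session, so sourcing the snippet from a script simply reports what is missing.

```r
# Packages listed above as requirements of the base model and demos-module.
required <- c("igraph", "network", "mgcv", "Matrix", "RANN",
              "rgl", "scales", "ggplot2", "akima", "mnormt")

# Which of them are not yet in the local library?
missingPackages <- setdiff(required, rownames(installed.packages()))

if (length(missingPackages) == 0) {
  message("All required packages are installed.")
} else {
  message("Missing packages: ", paste(missingPackages, collapse = ", "))
  # Only attempt the download when someone is at the console.
  if (interactive()) install.packages(missingPackages)
}
```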

After unpacking the tar.gz download (or within the virtual machine), change to the demos-module directory. There should be several demo scripts within this directory. We'll walk through two of them: basicDemo.R and quadSequenceResponseFieldDemo.R.

Basic Demo

[Figure: the two data sets, hexagon1 (red points) and hexagon2 (blue points)]

The script basicDemo.R carries out the alignment of the two data sets shown above. Let's call the red points hexagon1 and the blue points hexagon2. We'll briefly cover how the alignment happens. First, run the script, which will generate several images that are saved in the directory demos-module/figures/demo-figures/basic-demo-figures. You can do so within an R session by executing the command below.

source("basicDemo.R")

Now, open the script in an editor of your choosing in order to view the commands listed below. The initial portion of the script (about the first 180 lines) generates the data and creates several visualizations. The first step in aligning the data sets is the creation of graphs over each one in turn. The following code within the script does just this.

nNeighbors <- 2
adjMatrix1 <- adjacencyRelation(hexagon1,neighborhoodSize=nNeighbors)
adjMatrix2 <- adjacencyRelation(hexagon2,neighborhoodSize=nNeighbors)
weightedAdjacency1 <- weightedAdjacency(adjMatrix1)
weightedAdjacency2 <- weightedAdjacency(adjMatrix2)

The graphs yielded by these commands are shown below. The assignment nNeighbors <- 2 specifies that the graphs we're creating are based on the two nearest neighbors of each data point. The adjacencyRelation function creates the graphs over the data sets accordingly, and the weightedAdjacency function assigns weights to the graph edges. In this case each edge is assigned a weight of 1.

[Figure: the two-nearest-neighbor graphs over hexagon1 and hexagon2]
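To make the idea of a nearest-neighbor graph concrete, here is a minimal base R sketch. The function knnAdjacency and the hexagon coordinates are hypothetical stand-ins written for illustration; they are not the package's adjacencyRelation or the demo's data, and the package may build its graphs differently.

```r
# Hypothetical helper: symmetric k-nearest-neighbor adjacency matrix.
knnAdjacency <- function(points, k) {
  d <- as.matrix(dist(points))   # pairwise Euclidean distances
  diag(d) <- Inf                 # a point is never its own neighbor
  n <- nrow(d)
  adj <- matrix(0, n, n)
  for (i in seq_len(n)) {
    nearest <- order(d[i, ])[seq_len(k)]
    adj[i, nearest] <- 1         # every edge gets weight 1
  }
  pmax(adj, t(adj))              # keep an edge if either end chose it
}

# Illustrative data: the six vertices of a regular hexagon.
angles <- seq(0, 2 * pi, length.out = 7)[1:6]
hexagon <- cbind(x = cos(angles), y = sin(angles))

adjacency <- knnAdjacency(hexagon, k = 2)
rowSums(adjacency)               # number of neighbors of each vertex
```

With two neighbors per point, each vertex of the regular hexagon is linked to the two vertices next to it, so the graph is a simple ring.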

Next, we need to add edges to these graphs that capture intuitive correspondences between the data points. The following code creates an alignment index that encodes the correspondence.

alig1 <- c(1:6)
alig2 <- c(1:6)
alignment <- cbind(alig1,alig2)

Adding edges to the graphs according to the alignment index yields the following aligned graph.

[Figure: the aligned graph]

Finally, the aligned graph is used to generate a mapping, call it f, from the original data points to representations that reflect the alignment. The code below derives the mapping.

embedding <- combinedWeightedAdjacencyEigenmap(weightedAdjacency1,weightedAdjacency2,alignment,mu=1)
embeddedHex1 <- embedding[[1]]
embeddedHex2 <- embedding[[2]]

The new "aligned" representations are depicted below.

[Figure: the aligned representations of hexagon1 and hexagon2]
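The general idea of deriving aligned representations from a combined graph can be sketched with a generic Laplacian eigenmap in base R. This is an illustration under stated assumptions, not the implementation of combinedWeightedAdjacencyEigenmap: the helper ringAdjacency, the way mu enters the combined matrix, and the choice of two coordinates are all assumptions made for the example.

```r
# Hypothetical helper: adjacency matrix of a ring with n vertices.
ringAdjacency <- function(n) {
  adj <- matrix(0, n, n)
  for (i in seq_len(n)) {
    j <- i %% n + 1
    adj[i, j] <- 1
    adj[j, i] <- 1
  }
  adj
}

w1 <- ringAdjacency(6)
w2 <- ringAdjacency(6)
alignment <- cbind(alig1 = 1:6, alig2 = 1:6)
mu <- 1                                   # assumed weight of alignment edges

# Combined graph: both graphs on the diagonal blocks, alignment edges between.
n1 <- nrow(w1)
n2 <- nrow(w2)
joint <- matrix(0, n1 + n2, n1 + n2)
joint[seq_len(n1), seq_len(n1)] <- w1
joint[n1 + seq_len(n2), n1 + seq_len(n2)] <- w2
for (r in seq_len(nrow(alignment))) {
  i <- alignment[r, 1]
  j <- n1 + alignment[r, 2]
  joint[i, j] <- mu
  joint[j, i] <- mu
}

# Graph Laplacian and its eigenvectors; skip the constant eigenvector.
laplacian <- diag(rowSums(joint)) - joint
eig <- eigen(laplacian, symmetric = TRUE)
ord <- order(eig$values)
coords <- eig$vectors[, ord[2:3]]

embedded1 <- coords[seq_len(n1), ]
embedded2 <- coords[n1 + seq_len(n2), ]
```

In this toy case the two graphs are identical and every point is aligned, so corresponding rows of embedded1 and embedded2 coincide up to numerical precision.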

Intermediate Demo

[Figure: the two data sets, quad1 (blue points) and quad2 (magenta points)]

The script quadSequenceResponseFieldDemo.R carries out the alignment of the two data sets shown above. Let's call the blue points quad1 and the magenta points quad2. This demo differs from the basicDemo above in a few key ways: i) the data sets are much larger, ii) the alignment index specifies only a partial correspondence between points across the data sets, iii) the alignment index is derived from "response fields" over the data sets, and iv) the aligned representations are used to generate paths through a space (i.e., the alignment is just a step toward further computation).

We'll go through the main computations step by step. First, run the script, which will generate several images that are saved in the directory demos-module/figures/demo-figures/quad-sequence-response-field-demo-figures. Again, you can do so within an R session by executing the command below. The script takes a little longer to run than the basicDemo, so be patient.

source("quadSequenceResponseFieldDemo.R")

Now, open the script in an editor of your choosing in order to follow along with the commands listed below. The first step in our more advanced model of alignment is the creation of a "cognitive object", which we call quadObject, that organizes the data. This is achieved by the following line in the script:

quadObject <- initiateCognitiveObject()

Since the sets quad1 and quad2 are quite large, it is beneficial (e.g., for perceptual categorization experiments) to have smaller "stimulus sets" that fall within these larger data sets. Let's call these stimulus sets stimulusGrid1 and stimulusGrid2. The stimulusGrid1 for quad1, which contains 361 points, is depicted below.

[Figure: stimulusGrid1 within quad1]
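For a sense of scale, a regular 19 by 19 grid has exactly 361 points. The sketch below builds such a grid in base R; whether the demo's stimulus grid is constructed this way is an assumption, and the unit-square bounds and the name stimulusGridSketch are made up for the example.

```r
# Assumption: a 19 x 19 regular grid, which gives 19^2 = 361 points.
# The bounds are illustrative; the demo's grid lies within quad1.
gridSide <- 19
stimulusGridSketch <- expand.grid(
  x = seq(0, 1, length.out = gridSide),
  y = seq(0, 1, length.out = gridSide)
)
nrow(stimulusGridSketch)
```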

Next, we need to place the stimulus sets stimulusGrid1 and stimulusGrid2, and the data sets quad1 and quad2, within the quadObject. The addNewComponent function is used to add new components to cognitive objects, so we can add these components as follows:

quadObject <- addNewComponent(quadObject,newComponent=stimulusGridData1,componentType="stimulusSets",componentName="stimulusGrid1")
quadObject <- addNewComponent(quadObject,newComponent=stimulusGridData2,componentType="stimulusSets",componentName="stimulusGrid2")
quadObject <- addNewComponent(quadObject,newComponent=quad1Data,componentType="dataSets",componentName="quad1")
quadObject <- addNewComponent(quadObject,newComponent=quad2Data,componentType="dataSets",componentName="quad2")
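Conceptually, a cognitive object behaves like a container that groups named components by type. The sketch below mimics that behavior with nested lists. The functions newContainer and addComponent are hypothetical and only illustrate the calling pattern; the actual internals of initiateCognitiveObject and addNewComponent may differ.

```r
# Hypothetical container: a list of component types, each a named list.
newContainer <- function() list()

addComponent <- function(container, newComponent, componentType, componentName) {
  if (is.null(container[[componentType]])) {
    container[[componentType]] <- list()   # first component of this type
  }
  container[[componentType]][[componentName]] <- newComponent
  container                                # return the updated copy
}

# Illustrative data, not the demo's quad data.
toyGrid <- cbind(x = c(0, 1), y = c(0, 1))
toyData <- cbind(x = c(0.2, 0.4, 0.9), y = c(0.1, 0.5, 0.7))

toyObject <- newContainer()
toyObject <- addComponent(toyObject, toyGrid, "stimulusSets", "stimulusGrid1")
toyObject <- addComponent(toyObject, toyData, "dataSets", "quad1")
names(toyObject)
```

Because R copies on modification, each call returns an updated object that has to be assigned back, which matches the quadObject <- addNewComponent(quadObject, ...) pattern used in the script.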

Now, to make the next few steps conceptually clearer, we carry out a simple renaming to convey that the quadObject is being used to model a subject's knowledge of the quad data. This renaming is carried out as follows.

quadSubject1 <- quadObject

Suppose the subject being modeled by quadSubject1 has knowledge of the "corners" of the quads, meaning that they can differentiate corner regions from non-corner regions, and can label the corner regions as "corner 1", "corner 2", "corner 3", and "corner 4". Moreover, suppose the subject is presented with the stimulus grid data and is asked to rate how good each data point is as an example of each of the corners. For example, for "corner 1" of quad1, the "goodness ratings" for the stimuli in stimulusGrid1 might look as follows, where point size is positively correlated with goodness rating:

[Figure: goodness ratings for corner 1 over stimulusGrid1]

Next, we need to place the responses to the stimulus sets, stimulusGrid1 and stimulusGrid2, within quadSubject1. The addNewComponent function is again used to add these components. The generateStimulusResponseField function combines a stimulus grid with the responses for each corner into a single representation of the responses called a "stimulus response field".

quadSubject1 <- addNewComponent(quadSubject1,generateStimulusResponseField(quadSubject1,stimulusSetName="stimulusGrid1",categoryResponseSetName="quad1ResponseSet"),componentType="stimulusResponseFields",componentName="stimulusResponseField1")
quadSubject1 <- addNewComponent(quadSubject1,generateStimulusResponseField(quadSubject1,stimulusSetName="stimulusGrid2",categoryResponseSetName="quad2ResponseSet"),componentType="stimulusResponseFields",componentName="stimulusResponseField2")

Suppose we have ratings for stimulusGrid1 and stimulusGrid2, and for each corner. We can extrapolate the ratings to the larger data sets quad1 and quad2 using statistical modeling methods, yielding "data response fields" like the one depicted below for corner 1. The scale level assigned to each point in quad1 models the subject's potential rating of that point as a good example of corner 1.

[Figure: data response field for corner 1 over quad1]
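The extrapolation step can be illustrated with a generalized additive model from the mgcv package, one of the listed dependencies. The sketch below is an assumption about the general approach, not the body of any package function: the ratings are simulated, and the data frames stimuli and dataPoints are hypothetical. The smooth terms follow the s(x,k=3)+s(y,k=3) form that appears in the script.

```r
library(mgcv)

set.seed(1)

# Simulated stimulus grid with a made-up goodness rating for one corner:
# ratings fall off with distance from the corner at (0, 0).
stimuli <- expand.grid(x = seq(0, 1, length.out = 19),
                       y = seq(0, 1, length.out = 19))
stimuli$corner1 <- exp(-2 * sqrt(stimuli$x^2 + stimuli$y^2))

# Fit additive smooths of the two coordinates to the ratings.
fit <- gam(corner1 ~ s(x, k = 3) + s(y, k = 3), data = stimuli)

# Extrapolate to a larger, illustrative data set.
dataPoints <- data.frame(x = runif(500), y = runif(500))
dataPoints$corner1_pred <- as.numeric(predict(fit, newdata = dataPoints))

# Compare a point near the corner with one far from it.
nearAndFar <- as.numeric(predict(
  fit, newdata = data.frame(x = c(0.05, 0.95), y = c(0.05, 0.95))
))
```

The predicted rating near the corner comes out higher than the one far from it, which is the property a data response field needs in order to rank points as examples of a corner.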

Next, we need to place the response fields over quad1 and quad2 within quadSubject1. The addNewComponent function is again used to add these components. The generateResponseField function carries out the statistical modeling.

quadSubject1 <- addNewComponent(quadSubject1,generateResponseField(quadSubject1,dataSetName="quad1",stimulusFieldName="stimulusResponseField1",responseVars=list("corner1","corner2","corner3","corner4"),smoothFormula="s(x,k=3)+s(y,k=3)"),componentType="dataResponseFields",componentName="quad1ResponseField")
quadSubject1 <- addNewComponent(quadSubject1,generateResponseField(quadSubject1,dataSetName="quad2",stimulusFieldName="stimulusResponseField2",responseVars=list("corner1","corner2","corner3","corner4"),smoothFormula="s(x,k=3)+s(y,k=3)"),componentType="dataResponseFields",componentName="quad2ResponseField")

Now, we can use response fields to select the best examples of corners from both quads. The function generateCategoryPairing carries this out. The pairing involving the twenty best examples of each corner for each quad is depicted below.

[Figure: pairing of the twenty best examples of each corner across quad1 and quad2]

We add the pairing to quadSubject1 using the addNewComponent function. The pairing is then converted into an alignment index via the function pairingToAlignment, and the resulting alignment index is also added to quadSubject1.

quadSubject1 <- addNewComponent(quadSubject1,newComponent=generateCategoryPairing(quadSubject1,responseFieldNames=c("quad1ResponseField","quad2ResponseField"),categoryNames=c("corner1_pred","corner2_pred","corner3_pred","corner4_pred"),pairingType="maxOrd",pairingNum=20),componentType="pairingSets",componentName="test1")
quadSubject1 <- addNewComponent(quadSubject1,newComponent=pairingToAlignment(quadSubject1,pairingName="test1",categoryList=list("corner1_pred","corner2_pred","corner3_pred","corner4_pred"),alignmentType="constant",alignmentWeight=10),componentType="alignmentSets",componentName="aligQuad1Quad2")
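The selection and pairing of best examples can be pictured with a small base R sketch. It assumes that points are ranked by predicted rating and paired rank by rank, and that the alignment index carries a constant weight; both are guesses at what the "maxOrd" and "constant" options do. The helper topExamples and the random ratings are hypothetical.

```r
# Hypothetical helper: row indices of the n highest-rated points.
topExamples <- function(ratings, n) {
  order(ratings, decreasing = TRUE)[seq_len(n)]
}

set.seed(1)
ratings1 <- runif(200)   # made-up ratings for one corner over a first data set
ratings2 <- runif(300)   # made-up ratings for the same corner over a second

# Pair the best example in one set with the best in the other, and so on.
pairing <- cbind(quad1 = topExamples(ratings1, 20),
                 quad2 = topExamples(ratings2, 20))

# Alignment index: each paired row gets the same constant weight.
alignment <- cbind(pairing, weight = 10)
head(alignment, 3)
```

Only 20 points per set take part, so the alignment index specifies a partial correspondence between the two data sets.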

Next, we generate manifolds over the quad1 and quad2 data sets using the generateManifold function and add them to quadSubject1 using the addNewComponent function.

quadSubject1 <- addNewComponent(quadSubject1,newComponent=generateManifold(quadSubject1,dataSetName="quad1",manifoldType="nn",neighborhoodSize=20,weightType="id"),componentType="structureSet",componentName="quad1manifold")
quadSubject1 <- addNewComponent(quadSubject1,newComponent=generateManifold(quadSubject1,dataSetName="quad2",manifoldType="nn",neighborhoodSize=20,weightType="id"),componentType="structureSet",componentName="quad2manifold")

The generated manifolds and the alignment index are used to generate a mapping from the quad1 and quad2 data sets to a set of representations that reflect the alignment. The function cognitiveAlignmentAndEigenmap outputs the representation sets provided by the mapping, one corresponding to quad1, called quad1Alig, and another corresponding to quad2, called quad2Alig. Both quad1Alig and quad2Alig, depicted below, are added to quadSubject1.

alignedQuads <- cognitiveAlignmentAndEigenmap(quadSubject1,manifold1Name="quad1manifold",manifold2Name="quad2manifold",alignmentName="aligQuad1Quad2",aligRepName1="quad1Alig",aligRepName2="quad2Alig",repPrecision=c(2:4))
quadSubject1 <- addNewComponent(quadSubject1,newComponent=alignedQuads[["quad1Alig"]],componentType="dataSets",componentName="quad1Alig")
quadSubject1 <- addNewComponent(quadSubject1,newComponent=alignedQuads[["quad2Alig"]],componentType="dataSets",componentName="quad2Alig")

[Figure: the aligned representations quad1Alig and quad2Alig]

The representations in quad1Alig and quad2Alig can be used as the basis for the generation of manifolds, which can be added to quadSubject1.

quadSubject1 <- addNewComponent(quadSubject1,newComponent=generateManifold(quadSubject1,dataSetName="quad1Alig",manifoldType="nn",neighborhoodSize=20,weightType="id"),componentType="structureSet",componentName="quad1Aligmanifold")
quadSubject1 <- addNewComponent(quadSubject1,newComponent=generateManifold(quadSubject1,dataSetName="quad2Alig",manifoldType="nn",neighborhoodSize=20,weightType="id"),componentType="structureSet",componentName="quad2Aligmanifold")

These manifolds provide the means for path construction between points in both quad1Alig and quad2Alig. In particular, the function generatePairingPaths creates paths between points drawn from pairings. Paths from corner 2 to corner 3 within the quad1Alig and quad2Alig data sets can be constructed and added to quadSubject1. Two such paths are depicted below.

quadSubject1 <- addNewComponent(quadSubject1,generatePairingPaths(quadSubject1,pairingName="test1",categoryNames=c("corner2_pred","corner3_pred"),pairingIndices=c(1),manifoldName="quad1Aligmanifold"),componentType="pathSets",componentName="paths1")
quadSubject1 <- addNewComponent(quadSubject1,generatePairingPaths(quadSubject1,pairingName="test1",categoryNames=c("corner2_pred","corner3_pred"),pairingIndices=c(1),manifoldName="quad2Aligmanifold"),componentType="pathSets",componentName="paths2")

[Figure: two paths from corner 2 to corner 3]
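Path construction over a nearest-neighbor manifold can be illustrated with a shortest-path search in base R. The sketch below is a generic Dijkstra search over a small made-up grid; knnWeights and shortestPath are hypothetical helpers, and generatePairingPaths may construct its paths by a different method.

```r
# Hypothetical helper: symmetric k-nearest-neighbor graph with edge lengths.
knnWeights <- function(points, k) {
  d <- as.matrix(dist(points))
  diag(d) <- Inf
  n <- nrow(d)
  w <- matrix(Inf, n, n)                  # Inf marks "no edge"
  for (i in seq_len(n)) {
    nearest <- order(d[i, ])[seq_len(k)]
    w[i, nearest] <- d[i, nearest]
    w[nearest, i] <- d[nearest, i]
  }
  w
}

# Hypothetical helper: Dijkstra's algorithm, returning vertex indices.
shortestPath <- function(w, from, to) {
  n <- nrow(w)
  distance <- rep(Inf, n)
  distance[from] <- 0
  previous <- rep(NA_integer_, n)
  visited <- rep(FALSE, n)
  repeat {
    candidates <- which(!visited)
    if (length(candidates) == 0) break
    current <- candidates[which.min(distance[candidates])]
    if (!is.finite(distance[current]) || current == to) break
    visited[current] <- TRUE
    for (j in which(is.finite(w[current, ]) & !visited)) {
      alt <- distance[current] + w[current, j]
      if (alt < distance[j]) {
        distance[j] <- alt
        previous[j] <- current
      }
    }
  }
  if (!is.finite(distance[to])) return(integer(0))   # unreachable
  path <- to
  while (path[1] != from) path <- c(previous[path[1]], path)
  path
}

# Illustrative data: a 5 x 5 grid; find a path between opposite corners.
toyPoints <- as.matrix(expand.grid(x = 1:5, y = 1:5))
w <- knnWeights(toyPoints, k = 4)
path <- shortestPath(w, from = 1, to = 25)
toyPoints[path, ]                          # coordinates along the path
```

The path only follows edges of the neighbor graph, so it stays within the region covered by the data rather than cutting straight across empty space.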