Monday, September 2, 2019

9a. Pinker, S. Language Acquisition

Pinker, S. (1995) Language Acquisition. In L. R. Gleitman, M. Liberman, and D. N. Osherson (Eds.),
An Invitation to Cognitive Science, 2nd Ed. Volume 1: Language. Cambridge, MA: MIT Press.



The topic of language acquisition implicates the most profound questions about our understanding of the human mind, and its subject matter, the speech of children, is endlessly fascinating. But the attempt to understand it scientifically is guaranteed to bring on a certain degree of frustration. Languages are complex combinations of elegant principles and historical accidents. We cannot design new ones with independent properties; we are stuck with the confounded ones entrenched in communities. Children, too, were not designed for the benefit of psychologists: their cognitive, social, perceptual, and motor skills are all developing at the same time as their linguistic systems are maturing and their knowledge of a particular language is increasing, and none of their behavior reflects one of these components acting in isolation.
        Given these problems, it may be surprising that we have learned anything about language acquisition at all, but we have. When we have, I believe, it is only because a diverse set of conceptual and methodological tools has been used to trap the elusive answers to our questions: neurobiology, ethology, linguistic theory, naturalistic and experimental child psychology, cognitive psychology, philosophy of induction, theoretical and applied computer science. Language acquisition, then, is one of the best examples of the indispensability of the multidisciplinary approach called cognitive science.

Harnad, S. (2008) Why and How the Problem of the Evolution of Universal Grammar (UG) is Hard. Behavioral and Brain Sciences 31: 524-525.

Harnad, S. (2014) Chomsky's Universe -- L'Univers de Chomsky. À bâbord: Revue sociale et politique 52.

70 comments:

  1. Pinker provides an overview of what we do know about how children acquire language. We know around when they acquire different abilities, have an idea of which skills might be involved, and when, in acquiring different parts of a language, and we have found some patterns that could be properties of a Universal Grammar (like parameter setting).

    One principle that seemed indirectly refuted early on was the Learnability Theory. The reason it was rejected was that children did not seem to be provided with sufficient negative feedback for it to be valid. However, I would argue that children do not need to be told, "that's the wrong formulation," for them to receive negative feedback. Instead, they can receive that feedback indirectly by comparing their predictions with what happens.

    Let's say a child, by context, can predict what his mom will say next. The child formulates a possible proposition to say the same thing in his head. Then, he compares that proposition to what his mom says. Suddenly, both the words that were used and that weren't used become important. This process, repeated enough times, will tell the child what they cannot say because it hasn't been said - and there is your negative feedback.
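
    To make the mechanism concrete, here is a minimal sketch (in Python; all the data and numbers are hypothetical illustrations, not a tested model): the learner strengthens candidate formulations that match what is actually heard, and treats persistent non-occurrence as indirect negative evidence.

      # Toy sketch of implicit negative evidence: strengthen candidates
      # confirmed by the input; demote candidates that are never heard.
      # All data here are made-up illustrations.
      from collections import defaultdict

      heard = ["i went home", "she went home", "i went out"]
      candidates = ["i went home", "i goed home", "she went home", "i went out"]

      confidence = defaultdict(lambda: 1.0)

      for utterance in heard:
          for c in candidates:
              if c == utterance:
                  confidence[c] += 1.0   # prediction confirmed by the input
      for c in candidates:
          if c not in heard:
              confidence[c] *= 0.5       # never heard: indirect negative evidence

      for c in sorted(candidates, key=lambda c: confidence[c], reverse=True):
          print(f"{confidence[c]:.2f}  {c}")

    Repeated over enough input, the unconfirmed formulations (like "i goed home") sink toward zero - which is the indirect negative feedback described above.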

    So, the Learnability Theory still stands. For further reflection, I wonder what other linguistic tips and tricks can be learned by a thing's absence, rather than its presence.

    Replies
    1. Actually, I made the same mistake you did for many years. I thought that even if children never speak or hear UG-violating sentences aloud, they might think them, and then, hearing others say it otherwise than the way they would have said it, that would serve as a correction.

      There are two reasons this wouldn't work:

      (1) If children ever think UG-violations, it's highly unlikely that they would never say them aloud. (They think and say plenty of OG violations aloud, and it can take a while before those get corrected -- and for some people, and some errors, never!)

      (2) UG is about grammatical subtleties: Think of the "John is easy/eager to please... Mary" examples. We utter a huge number of sentences, most of them only once in a lifetime. What are the chances that we would think of uttering the exact same sentence someone else utters -- and that ours would be the UG-violating version? And worse, what are the chances that we would do it enough to be able to infer the correct UG-rule when it took linguists decades to learn what those rules were (while all along having their own unconscious UG in their brains, giving them immediate feedback that it was a violation whenever anyone generated a UG-violation) and linguists are still not finished reverse-engineering UG!

      No, the unlearnability of UG is a much bigger problem than that.

      Think of "Laylek."

      But here's an interesting conjecture: All written and spoken text is UG-compliant. GPT-3, based on big bodies of such text, can also produce only UG-compliant speech -- but that doesn't mean that GPT-3 has learned UG; it just mixed and matched the UG-compliant speech of others.

      So what if we fed GPT-3 both UG-compliant and UG-violating speech (by deliberately generating UG-violations)? And suppose we did it two ways: for one GPT-3 without flagging what was UG-violating -- hence unsupervised learning -- and for another GPT-3 flagging (with the linguists' "*") the UG-violating sentences: Would the unsupervised GPT-3 generate UG-violating output and the supervised GPT-3 not?
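
      Here is what that comparison might look like in miniature (a tiny bigram model standing in for GPT-3; the corpora and the starred "violation" are made-up stand-ins, so this is only a sketch of the experimental design):

        # Miniature version of the proposed experiment: train one model on
        # mixed text with no flags (unsupervised) and one that is told which
        # sentences are starred violations (here, crudely, by withholding
        # them), then sample from both and count the violations generated.
        import random
        from collections import defaultdict

        compliant = ["john is eager to please", "john is easy to please"]
        violating = ["john is easy to please mary"]   # the starred sentence

        def train(sentences):
            model = defaultdict(list)
            for s in sentences:
                words = ["<s>"] + s.split() + ["</s>"]
                for a, b in zip(words, words[1:]):
                    model[a].append(b)
            return model

        def sample(model, rng):
            word, out = "<s>", []
            while True:
                word = rng.choice(model[word])
                if word == "</s>":
                    return " ".join(out)
                out.append(word)

        rng = random.Random(0)
        unsupervised = train(compliant + violating)   # violations unflagged
        supervised = train(compliant)                 # starred sentences flagged out

        for name, model in [("unsupervised", unsupervised), ("supervised", supervised)]:
            bad = sum(sample(model, rng) in violating for _ in range(20))
            print(f"{name}: {bad}/20 samples are starred violations")

      The unflagged model happily regenerates the starred string; the flagged one cannot. (GPT-3 is of course vastly more than a bigram model, which is exactly why the real experiment would be interesting.)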

      But it wouldn't work. Because UG-violating sentences don't make sense: *"John is easy to please Mary". That's why Chomsky thinks the UG constraint is on thinking, not on language (or thinking is much more language-like than we thought): We can't think UG-violating thoughts.

      I'm not sure about all that...

    2. Hi Harnad,

      To your point (1): I'm not sure how we can judge what proportion of children's UG violations are said out loud - namely because we have no measure of how many aren't said. It's entirely possible that for every non-UG sentence a child speaks, no matter how many they speak or how many get corrected, 10 more go unsaid in their head, and I don't see yet why we should eliminate that possibility.

      To point (2): point taken - most of our sentences, uttered or thought, are unique, so if a correction were to be inferred, it would have to be indirect, which multiplies the issue.

      As to whether we can think UG-violating thoughts - what would you say to anyone learning a new language? I'm trying out Spanish at the moment, and I can tell you that most of my Spanish thoughts are ungrammatical without me knowing until I'm corrected. Without the corrections, I would think that they're grammatical, even if they're not! I suspect that that contention is irrelevant though, and that my thoughts, while in discord with Spanish grammar, may still abide by UG. Let me know!

    3. All bets are off for 2nd-language learning, especially late 2nd-language learning. And your Spanish errors, whether thought or said, are OG violations (and will get corrected, if you keep trying).

  2. When thinking about the origin and evolution of UG, I keep getting stuck on why some ungrammatical sentences sound inherently wrong. Children never say sentences that violate UG, but is this because of the presence of UG, or did the confusing nature of these sentences pre-date UG in the brain? In other words, did UG arise in the brain in order to both allow rapid language acquisition and prevent confusing (and therefore ungrammatical) sentences, or did the progressive evolution of UG rules in the brain cause sentences to sound inherently ungrammatical (therefore preventing us from thinking and speaking UG-violating sentences)? Or did both occur simultaneously, causing a vicious cycle which enabled the rapid emergence of UG? This is not only difficult to answer but also doesn't touch on the equally important questions of why UG would be necessary for language to the extent that it would be selected for by evolution and why this hasn't arisen in other cooperating species.

    Replies
    1. Utterances that violate UG sound wrong to the ear of a normal member of the human species today. That is because two or three hundred thousand years ago the human brain (for some reason(s)) rapidly evolved, and one of the capacities it evolved was the capacity for language.

      What evolved? The capacity for language has many components: phonology, prosody, vocabulary, Ordinary Grammar (variable, changing, and learned), Universal Grammar (universal and unlearnable by the child), semantic (including propositional) content.

      Language also had precursor capacities, shared with many other species: communication, category learning, imitation, attention, (perhaps) pointing, sentience and thinking (cognition).

      Language can express thought in words. So you are asking whether the (universal) rules for expressing thoughts in words are rules for what can be said or rules for what can be thought.

      And as you note, this question is nonspecific about how or why language evolved.

    2. Hi Professor!
      In response to your statement that UG violations sound wrong, I was curious whether UG, or the same UG, can be understood as governing sign languages as well as spoken languages. As someone who does not sign, I am not sure that I would be able to catch an incorrect sign - on a UG level - or whether such a thing even exists. It would be interesting to me to consider whether sign language is in fact governed by the same capacities, or whether, lacking these capacities, a person would be able to sign acceptably and be understood, but not be able to reliably form UG-compliant utterances in spoken or written language.

  3. One of the most interesting points I found in this paper was what I would call the “Goldilocks” of innate learning mechanisms.

    “Any theory that posits too little innate structure, so that its hypothetical child ends up speaking something less than a real language, must be false. The same is true for any theory that posits too much innate structure, so that the hypothetical child can acquire English but not, say, Bantu or Vietnamese” (page 2).

    It is incredible how our ability to learn language is so perfectly balanced: broad enough to learn any language, but specific enough that minimal input is needed for children to learn. Pinker suggests that synaptic plasticity in childhood could underlie the setting of some of these parameters for UG, and this makes sense on the surface to me, but I am still unsure as to exactly how this information could be stored in a synapse. I think this connects back to the idea of reverse-engineering the heart in comparison to the brain, and I don't think that just studying the synapses is going to give a thorough explanation of how we learn language.

    I think something useful for reverse-engineering would be the subset principle. This seems like a very plausible and convincing explanation to me as to how children are able to learn language with just positive examples: they have made the right first assumption and can update this assumption with only the positive instances that they hear. Hypothesizing these parameters and then empirically testing them to see if they are UG rules is not a trivial task, but I think with more research it could be possible to incorporate these rules into the starting settings for our reverse-engineered cognitive candidate.
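
    A toy version of the subset principle might look like this (the three nested "languages" are hypothetical stand-ins, ordered from most to least restrictive):

      # Sketch of the subset principle: hypothesize the most restrictive
      # language first, and move to a superset only when a positive
      # example forces you to. The grammars are hypothetical toys.
      hypotheses = [
          ("A: subject and agreement required", {"she sing-s"}),
          ("B: subject required",               {"she sing-s", "she sing"}),
          ("C: anything goes",                  {"she sing-s", "she sing", "sing"}),
      ]

      def learn(positive_examples):
          current = 0                       # start with the smallest language
          for ex in positive_examples:
              while ex not in hypotheses[current][1]:
                  current += 1              # a positive example forces a superset
          return hypotheses[current][0]

      print(learn(["she sing-s"]))          # stays at A: no correction needed
      print(learn(["she sing-s", "sing"]))  # positive evidence forces C

    Because the learner only ever moves to a superset when a positive example forces it to, no negative evidence is ever needed.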

    Replies
    1. I also paused when I read those couple of sentences. They reminded me of how we were talking about lazy evolution not getting too specific because then it would be harder to process variation. I feel like this same concept holds true here: we want whatever our innate ability to produce/understand language is to be specific enough to teach us language from scratch, but general enough to be able to teach us any language. In terms of synapses, my educated guess would be that since learning is stored as a change in synaptic strength, this will have something to do with it. During what I guess would be called a critical period when kids learn languages most easily, there's a lot of neurogenesis and pruning going on to keep synapses that have stored relevant language information for the specific languages you're learning and get rid of ones that aren't being used as much. In this case I guess we'd start with a whole lot of synapses (maybe even the amount we'd need to learn any language) and the strength and number of synapses and neurons would get refined as languages are learned. This would sort of explain why learning languages as an adult feels a lot harder.

      In a reverse engineering context that'd mean having to hard-code in all the rules for all possible languages to be learned and then keep/strengthen the ones for languages you learn (assign a higher value?) and weaken ones that aren't used as much? My coding skills are nowhere near being able to capture the proper technicality of what I'm trying to communicate, but overall, I agree that something needs to be hard-coded in and then modified throughout learning.

    2. Oh actually to add to that (I'm still going through the reading so I just saw this now): the author said that in addition to most adults never being able to fully master a foreign language, they also usually have an accent in that language. I was wondering what the repercussions would be if the neurons that weren't used were pruned - like, in that case, how come we can still learn language well enough? So I guess that sort of answered my question: we almost never learn language perfectly, and even if we almost do, we will probably have an accent. I also remember a study that I will in no way remember in full, but it mentioned that adults who have not made/heard certain sounds after a certain age can never produce them later. I don't know how accurate that study was, but it sort of ties into the idea that there might be some permanent loss in language ability from the neuronal pruning after that critical period.

    3. (when I talk about language ability I mainly mean grammar and phonology)
      (sorry @Stephanie I'll stop spamming your post with replies and just keep my thoughts for my own post)

    4. Still a lot of conflation of OG (learned) and UG (innate) here, and also conflation of UG and language.

      UG is not learnable (positive examples only, Laylek-like) so it is innate. But UG has "parameters" -- properties like whether a language drops pronouns by turning them into inflections at the end of the word:

      French (non pro-drop) « Je viens » (I'm coming.)
      Spanish "vengo" (I'm coming.) (No need to say "Yo vengo".

      Pro-drop languages have a slightly different UG from non-pro-drop languages, but whether a language is pro-drop or not is learnable, because there are plenty of positive and negative examples to learn it from.

      Same is true for the order parameter: Subject/Verb/Object (SVO) (English): She loves him. VSO: Loves she him (Arabic). VOS: Loves him she. Etc. (See WP for all 6 possibilities.) My native Hungarian is free order, but you have to use names, because Hungarian has no he or she! John loves Mary can be:
      János szereti Máriát or Máriát szereti János or Szereti Máriát János etc., any combination, because it's the word-endings [inflections] that matter.
      If Mary loves John, it becomes, in any order: Jánost szereti Mária, etc.

      All languages comply with UG, but the (learned) parameter setting makes the shape that UG takes slightly different.
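
      As a toy illustration of how a parameter setting (unlike a UG rule) is learnable from positive evidence alone, here is a hypothetical pro-drop detector (the two-sentence "corpora" and the pronoun list are of course stand-ins):

        # Sketch of learnable parameter-setting: UG supplies the switch
        # (pro-drop or not); the child's input sets it. The corpora and
        # the pronoun list are hypothetical stand-ins.
        def set_pro_drop(corpus, pronouns={"je", "yo", "i"}):
            # If subject pronouns are routinely absent, set pro-drop = True.
            subjectless = sum(1 for s in corpus if s.split()[0] not in pronouns)
            return subjectless / len(corpus) > 0.5

        french = ["je viens", "je mange"]
        spanish = ["vengo", "como"]

        print("French pro-drop: ", set_pro_drop(french))    # False
        print("Spanish pro-drop:", set_pro_drop(spanish))   # True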

      Yes, the innate UG with its learnable parameters was perhaps a compromise of lazy evolution.

      But it's still hard to explain how and why UG evolved.

      It is Chomskian linguistics, however (and not the data on synapse-pruning) that is reverse-engineering UG.

      So, because of parameter differences:

      "Who did he think went out?" and "Who did he thank that went out?" are both UG-compliant in English (though the second one used to violate OG, which used to require "Whom")

      But *"Who did he thank went out?" and *"Who did he think that went out?" both violate UG -- in English.

      But in Hungarian three of the four are ok (because of the inflections):

      Kiről hitte kiment? "Who did he think went out?"
      Kiről hitte hogy kiment? *"Who did he think that went out?"
      Kinek köszönt hogy kiment? "Who did he thank that went out?"

      For an intuition of how it used to be ok in Shakespearean (OG) English:

      Whereof (instead of “who”) did he think that went out?

      (I’m not sure these examples are all right! I’m just trying to convey the flavor of how different parameter settings slightly alter UG.)



    5. With phonology it's "use it or lose it." All infants are born with a potential CP boundary between ra and la, but if your language does not use one or the other or both, by the time you learn a second language you can never get ra/la quite right.

    6. I am going to try to write this again in my own words just to make sure I fully understand it and so I can make sure language, UG, and OG are distinct in my mind. Language is made up of many things: one of the underlying principles is universal grammar (UG), alongside the more language-specific ordinary grammar (OG) and a number of other related components like prosody, phonology, etc. I suppose that UG and OG are both still types of categorization in a way, in that we still must do the right thing with the right kind of thing, but in this case the things are noun phrases, verb phrases, and more.

      The UG itself is innate and not learned, which just means that it is not through the environment that you come to learn all of the parameters. What you do learn through the environment is the settings of these parameters. So if a parameter is like a switch, the switch itself is innate but which setting the switch is on is determined by environmental inputs. I think where my main misunderstanding came from was that I thought you could only learn the setting that the UG parameter is set to by positive instances that you hear. In your reply you say that there are plenty of positive and negative instances to learn this setting, so does this mean that you would hear instances of people speaking French just saying “viens” but then being corrected or finding out that this was a negative instance?

      OG consists of rules that are not set by an innate UG structure. One could make OG grammatical errors and say *I runned or *I swimmed, which are fine in UG but not OG (for English). Children have access to both positive and negative instances to learn the OG rules, which is why these don't need to be innate.

    7. Yes, UG-compliance vs. UG-noncompliance (as well as OG-compliance vs. OG-noncompliance) are categories. Distinguishing them (and producing them) is a categorization task.

      UG is not learned, but the settings of some of its "parameters" (like whether this language is pro-drop/non-pro-drop, or SVO/OSV/VSO etc.) are learned, like whether to set a toaster at light, medium, dark. UG is the toaster, and the switch-setting is learned. (Distinguish parameter-setting from rule-learning (or innate rules).)

      A French child would quickly learn that its language is non-pro-drop: If it drops the pronoun, no one knows what is the subject of the verb. Ditto for word order. In the Germanic languages, largely SVO, there's still a little scope for SOV, but you have to say it with a German accent and intonation if you say it in the least inflected one of all, English: "JOHN hess ziss aufternoon in ze park MARY hit." In Hungarian (no order, fully inflected) you can say it in any order you like.

      With a pro-drop language you would also learn from experience that you can do it either way. (It's not "environmental inputs" but trial-and-error, with corrective feedback from what works and what doesn't, what you hear [often] and what you don't).

      Some of the UG parameter-settings have a kind of ordering in priority, and there you might get enough feedback from the positive cases alone. It's as if it can be either 1 or 2 or 3 or 4. So hearing 3 is enough to signal that it's not 1 or 2.
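
      That priority ordering is simple enough to sketch (the values 1-4 are abstract stand-ins for ordered parameter values):

        # Priority-ordered parameter values learnable from positive
        # evidence alone: assume the lowest value by default, and move
        # up to the highest value ever witnessed in the input.
        def set_parameter(observations, default=1):
            value = default
            for obs in observations:
                value = max(value, obs)   # hearing a 3 rules out 1 and 2
            return value

        print(set_parameter([]))          # no evidence: stays at the default, 1
        print(set_parameter([1, 3, 2]))   # a 3 was heard, so the setting is 3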

  4. I am going to comment and ask questions regarding the idea of parameter setting and UG that was discussed in the reading and in Stephanie's comment thread above.

    As the professor said in his reply above, Universal Grammar (UG) is innate and cannot be learned. It makes sense that there must be some kind of innate language structure, given that children acquire language rapidly in the absence of consistent positive feedback and substantial negative feedback, and with an amount of language exposure insufficient to explain language acquisition through learning alone (children are forming complex, grammatically correct sentences by age 3). Ordinary grammar (OG), on the other hand, is the result of learning and exposure to the world. We care about UG because, in order to reverse-engineer language capacities, we would need a system of constraints and parameters in T3 similar to our innate UG.

    My question pertains to the idea of parameters in UG. In the article, Pinker postulates that languages are built on the same "basic plan", and that perhaps all languages diverge in specific places where there are a few possible permutations. These parameters have to be learned, so it must be the case that the parameters are related to but separate from UG (since UG is innate)? In the article, Pinker mentions that there might be a 'default' parameter that children assume (such as rigid word order) which can be altered according to what they encounter around them. Does this mean that the default parameter would be part of UG, since a default implies a preexisting 'setting' of something?

    Replies
    1. The idea of UG is that we have an innate predisposition towards language-learning. We are, presumably, born with a template of what language is going to look like - a general enough idea that babies can then come to learn any language, but specific enough that we can overcome the poverty of the stimulus. (Ling majors, please correct this ling minor if I'm wrong.) The parameters you are talking about, the language-specific divergences Pinker discusses, are simply possible variations built into our UG. For example, X-bar structure was posited to explain different word orders in different languages for this reason. As we age and age out of the various critical periods for language acquisition, we presumably lose access to all the different variations that are accessible to us through UG, and come to use only those available to us in our environment and further strengthened by our OG.

    2. This makes sense and I think I have a much better handle on it now. I think because UG is thought of as hardwired or innate I mistakenly thought that this kind of flexibility might be incompatible. However it is as you and others have said that there must be some degree of flexibility in our innate language capacity so that any given child has the ability to learn whatever language is spoken by those around them. It seems to be the case that UG endows us with the ability to learn any possible language (exemplified by the fact that children can easily pick up multiple languages at a young age) through these parameters and then the relevant parameters are 'activated' because of the child's linguistic environment.

    3. Allie: About parameters, see Replies above. Default settings are related to what I said about priority order for 1 2 3 4: Before you start, assume it's 1. If you encounter more than 1, start with the highest you encounter, say, 3, and if you encounter nothing higher, assume that's it.

      Yes, UG enables you to learn any possible (i.e., UG-compliant) language. But 1st-language learning is special. You can learn multiple 1st-languages during the critical period, but later languages are learned differently, and late 2nd language speakers are not so good at making UG-compliance judgments in their 2nd language.

      Eli, UG is less a "blueprint" (what's a blueprint?) than a set of rules or constraints. With parameter-settings, there's a critical period for 1st-language parameter settings, and then they're set and the option to set them another way is "pruned," like synapses, or the ra/la distinction.

  5. Is the youtube video not working for anyone else?

  6. “English (Language A) has to be hypothesized before Language C, and rejected only if a subjectless and suffixless sentence turns up in the input. That is because Language C is a superset of English; if the learner tries C first, nothing in the input will ever tell him he's wrong. Language B can be hypothesized at any point, and confirmed whenever the child hears a sentence with an agreement in it or disconfirmed when the child hears a sentence without agreement”
    Universal grammar is innate and not learned because there are never any negative instances to learn from, since nobody violates these rules. Ordinary grammar rules, which are learned, can slightly modify parameters of UG. While learning OG, children usually start with a certain assumption and stick with it until they get to an instance in which it doesn't work. For example, children learn that to use a verb in past tense, they should just add -ed. This will work until the child reaches a word like run with an irregular past tense. At this point the rule has to be modified to include the irregular word. This all makes sense to me; learning OG sounds like it happens by trial and error. But we don't really have much of an explanation for UG; we just know that it's innate. But is that all we need to know to reverse engineer language (or at least the grammar part of language)? For UG, with a big enough sample, we can figure out which mistakes kids simply never make when learning to speak, regardless of the language. We could hard-code those instances into the robot. Then OG just has to be learned by supervised learning. I'm sure I'm oversimplifying it, but to me it sounds like the grammar aspect of language is definitely something we can figure out. How we managed to get UG, what its evolutionary pressure was, etc. is something that needs more thought, but in terms of reverse engineering it sounds like we have a plan?
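
    The -ed example is essentially what Pinker calls "blocking," and it can be sketched in a few lines (the lexicon here is a hypothetical toy):

      # Toy sketch of the -ed rule with blocking: a stored irregular
      # form blocks the regular rule; before the irregular is stored,
      # overregularizations like "swimmed" slip out.
      irregulars = {"run": "ran", "swim": "swam", "go": "went"}

      def past_tense(verb, lexicon=irregulars):
          return lexicon.get(verb, verb + "ed")

      print(past_tense("walk"))               # walked (regular rule)
      print(past_tense("swim"))               # swam (irregular blocks -ed)
      print(past_tense("swim", lexicon={}))   # swimmed (before blocking is learned)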
    Also, what I mentioned above about synaptic plasticity and pruning would I guess be the mechanism involved in learning OG and not UG (strengthening synapses carrying information about correct instances & pruning wrong instances).
    On a separate note: The author mentioned that blind kids easily learn language, but now I'm wondering how they'd ground their symbols. Usually I imagined a mom pointing at the sun and saying "sun" to the kid. And I know with blind people you usually just describe things instead, but if a child is still learning language how would you describe the sun / how would they ground the symbol sun? I'm sure this has an answer that is easier than I think, I just can't seem to figure it out.

    Replies
    1. "For UG, with a big enough sample, we can figure out which mistakes kids simply never make when learning to speak, regardless of the language."

      There's an infinity of mistakes that children never make. How do you find and hard-code things that never happen?

      Blind kids can't see anything (e.g., colors) and they know it, just as we know bats, but not humans, can do echolocation, and a color-blind person knows they will never see certain colors. But why would you think that means their language is not grounded?

    2. Oh no I'm sure their language is grounded, I was just wondering how since my idea of grounding is very visual and I imagined it would be hard to describe the sun to a baby who cannot understand the words being used to describe it.

      Would using as many UG mistakes as we can come up with work instead? Or maybe we can categorize UG mistakes into different types and then stick to a few examples from each type? Surely we don't have to hard code the infinite mistakes, but maybe there's a way we can get close enough?

    3. Think of grounding as sensorimotor, not just visual.

      And you don't need UG mistakes to ground category names.

      Yes, UG is learnable, but only by teams of adult linguists, across decades, guided by their grammaticality judgments (because they have an innate UG in their heads already) as they test explicit hypotheses about what the right rules might be.

      You don't learn OG, either, from a huge set of violations. You learn it by trying to say something and finding out it can't be said quite that way. From that, your brain infers (or your instructor explains) what the underlying rule is. It's the rule you need to encode, not a list of violations.

    4. "For example, Gleitman (1990) points out that when a mother arriving home from work opens the door, she is likely to say, "What did you do today?," not I'm opening the door. Similarly, she is likely to say "Eat your peas" when her child is, say, looking at the dog, and certainly not when the child is already eating peas."

      This extract also served to remind us of the distinction between sensorimotor and visual. Children do not just learn by observing what is happening and associating words to it. The sensorimotor interaction required of children is multidimensional. Accordingly, the sensorimotor interaction required of T3 robots would also need to use multiple sensory modalities to gather sufficient input about the world for sensorimotor grounding.

    5. Sensorimotor grounding is multidimensional, but, more important, it can also be interactive, as in "affordances." See above.

    6. I hadn’t initially thought about how a blind person would ground words… As Harnad mentioned above, grounding is sensorimotor, not just visual. I would imagine blind people would utilize their senses of touch, smell and taste, to interact with objects and ground the words that classify them. I also imagine it would be quite helpful, being able to hear others speak, during this process. The more I think about it, the less of a problem language acquisition for a blind person seems. The only difference would be that they wouldn’t have a visual for their words — but is this really that detrimental, given that their most important senses (the ones they would be using to interact with the world) don’t include vision?

      I think an even more interesting question might be how a person who is both blind and deaf acquires language.

  7. I know prof said not to make our skywritings too long, but when I saw these next couple of sentences I thought of Bunny, and ever since I learned of her existence five days ago I've just come to love her so much, so here's a separate skywriting about her.
    “Our car. Papa away. Dry pants… These sequences already reflect the language being acquired: in 95% of them, the words are properly ordered”
    Is this language? This sounds to me like "red apple", which we all know Bunny the dog can definitely say! If this is already classified as language then I think that settles our debate from last class. Bunny is getting trained a lot to learn these phrases whereas children learn them without actively trying, sure, but maybe that's because of Baldwinian evolution, where part of this is innate and doesn't have to be learned from scratch anymore, and because of Darwinian evolution, which says that if something is seen to help us survive and reproduce we will be able to learn it more easily/faster and more efficiently.
    I guess the next step for Bunny would be to see if she would be able to apply grammar, because right now she just has a bunch of words in specific tenses. Maybe a good test would be to see if she would be able to learn to differentiate between a situation in which she should press buttons for "let's walk" vs buttons for "we walked". I feel like (at least from what I've seen) Bunny seems to have pretty good sentence order, and part of me definitely wants to take that as evidence of some sort of UG. Although it could be that she has never heard the sentence ordered in any other way and so she has memorized it ("let's walk" and never "walk let's"), so maybe this is OG? Or just conditioning? But that would be some very impressive conditioning considering she is able to form basic sentences the same way a 2-year-old would. I just really want to live in a world where we figure out a way for my cat to talk to me (although she'd probably just want food or space, but still).

    Replies
    1. It's not "red apple" that Bunny can't say, but "apple red" (and an infinity of other things that you can say, if you know the difference between "red-apple" and the proposition "the apple is red").

      Sentence order is not UG; it's a habit, probably learned from Bunny's human. And beware of "translations"! "Papa away" does not mean "Papa is away". It could just as well be translated as "no-papa" (vs. "here-papa").

      I love Bunny too. I just don't want to put words -- or, rather, propositions -- in her mouth...

  8. In Pinker's article, we learned about the rapid trajectory of language acquisition.

    Chomsky falsified the belief that language must be entirely learned, arguing instead that it is a module.

    Firstly, there is evidence that shows 18-month-old babies can put two or three words together in a sequence that conforms to syntactic rules. This is because of Universal Grammar (UG) which is unlearnable and hardwired into their brains.

    Secondly, even though they seldom produce utterances longer than two or three words, kids know the right word orders. But because they are not receiving enough exposure to grammatically-correct linguistic data, how is this possible? Well, Chomsky believed that it was because of the Language Acquisition Device, some innate language-learning structure.

    By the time toddlers are two to three years old, they can start communicating using complex sentences, where one "branch" is embedded in another. It is also at around this age when they will overregularize past tense forms of irregular verbs. As mentioned in Lyla's post, children can only correct this kind of error through trial-and-error or reinforcement/supervised learning.

    Taking a step back, does UG really help? It seems that there are many different ways for a child (or a T3 robot) to learn a language.

    - From the behaviourist approach, language can develop from operant conditioning involving positive or negative feedback.
    - From the connectionist approach, language results from environmental input; as long as there is exposure to words and sentences, this allows for the generation of rules.
    - From the statistical learning approach, language is the outcome of discovering linguistic patterns from environmental input.
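
    Of the three, the statistical-learning idea is the easiest to sketch: compute transitional probabilities between syllables and posit word boundaries where the probability dips (the syllable stream below is a made-up, Saffran-style example):

      # Transitional probabilities P(next | current) over a toy syllable
      # stream; low values suggest word boundaries. In a stream this tiny
      # some across-word transitions still look high; a longer, randomized
      # stream separates them better.
      from collections import Counter

      stream = "pa bi ku go la tu pa bi ku da ro pi go la tu pa bi ku".split()

      pairs = Counter(zip(stream, stream[1:]))
      firsts = Counter(stream[:-1])

      for (a, b), n in sorted(pairs.items()):
          tp = n / firsts[a]
          mark = "" if tp > 0.5 else "   <-- likely word boundary"
          print(f"{a} -> {b}: {tp:.2f}{mark}")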

    Nonetheless, I agree with Allie that we need to fully understand "the system of principles and parameters in UG to reverse engineer our language capacities."

    Replies
    1. I'm not sure the 3-word strings of 18-month olds are complex enough to test whether they have UG or just OG.

      "Does UG really help?" This is the question of what UG in particular contributes to language. It's obvious how language helps.

    2. Hi Professor. Sorry, I did not intend to sound like I think language or UG is not adaptive or helpful to us. I actually wanted to ask whether UG was absolutely necessary for language learning. My apologies for the incorrect wording.

      After today's class, I think I can answer my own question. According to the Poverty of Stimulus (PoS), children do not encounter negative evidence (like ungrammatical phrases) or enough linguistic data to extract grammatical rules for language acquisition. Consequently, Chomsky argued that they must have a set of basic rules obeyed by every language genetically hardwired into the brain, called Universal Grammar (UG). However, this is not the case for Pirahã! It seems that UG can also be considered the capacity to learn Ordinary Grammar (OG) too. Another reason is that UG accelerates language learning. For example, being able to set parameters (e.g., word order or recursion) in the critical periods allows the child to "learn one fact about a language [and] deduce that other facts are also true of it without having to learn them one by one" (Pinker, 2008, page 27).

    3. Ting, no apologies: You're doing well! I just meant you should distinguish the benefits of language and the benefits of UG: UG is just a part of language. The question about the origin and adaptive value of UG is only part of the question of the origin and adaptive value of language.

      There is generalization from individual errors and corrections in OG learning, but there are no UG errors to generalize from, because there are no UG errors. What Pinker (who fails to distinguish OG from UG) probably is referring to is the learning of UG parameter settings. But that is indeed learning, with trial, error, and correction, just as with OG.

  9. It still seems plausible to me that UG could rely on the exact same mechanisms that underpin our typical capacity for CP. This article discusses the way that languages may have different arrangements of different component grammatical rules, but any and all permutations of said rules found in any language ever spoken share certain syntactic properties. If these syntactic properties are so universal today, there must be some dynamic system built into the human machinery by genetic coding that gives almost every human brain the capacity to learn languages so young.
    I made a connection between this reading and last week's reading also by Pinker (and Bloom) in their discussion of syntax. More than categorizing words as certain kinds of things with which to do certain kinds of things in relation to other words and kinds of things, our linguistic capacities depend on our ability to categorize and thus link phrases and clauses.

    In "Natural Language and Natural Selection" (1990) they write "All you need for recursion is an ability to embed a phrase containing a noun phrase within another noun phrase or a clause within another clause...Given such a capacity one can now specify reference to an object to an arbitrarily fine level of precision" (sec. 5.3.2). Then in this article, "Language Acquisition" (1995), Pinker addresses the ways in which children interpret and generate recursive embedded phrases and clauses from a very young age, demonstrating more complex categorization which occurs without direct instruction. Pinker states "if children are constrained to look for only a small number of phrase types, they automatically gain the ability to produce an infinite number of sentences," suggesting that the ability to categorize is central to language acquisition and UG. Categorization allows for the linguistic precision and infinite possible outputs which characterize human languages.

    Replies
    1. I see a few problems with the idea that UG may rely on our "typical capacity for CP". If I understand your argument correctly, you are arguing that a certain form of CP is what allows us to learn language and that that is UG. We have discussed two forms of categorical perception: the unsupervised form and the supervised form. In the case of language, we must resort to the unsupervised learning form, given the notorious absence of negative feedback in the linguistic environment. A child is never told what isn't UG-conforming. But categorization cannot occur without negative feedback (recall the "Layleks" discussed in class). Furthermore, if unsupervised categorization were all there was to UG, then animals could probably do it too. But they can't, and that's because they don't have any innate grammar that allows them to learn propositional symbol systems.

      On a similar note though, categorical perception certainly is part of the language learning process. For instance, we learn speech sounds by pruning away speech sound representations that are not used in our mother tongue. And the boundaries between the speech sounds that we do use in our mother tongue become more salient.

    2. Alex (PoM) (and Solim), don't mix up category learning (sup and unsup) and CP: Do you know the difference?

      Pinker, in both papers (last week and this week) ignores the difference between UG and OG, mixing up what is learnable with what is not learnable.

      UG rules are recursive. But recursion is not UG.

      Solim you are right that sup learning needs both positive and negative feedback. And you can only learn simple, obvious categories with unsup learning alone.

      The only analogy between innate UG and innate phoneme boundaries like ba/da or ra/la is about UG parameter settings. Once you set a parameter in your native language during the critical period, later languages do not re-set or co-set those parameters. See replies above. You lose that option if you don't learn a first language that uses it, just like you lose ra/la.

    3. So my argument basically comes from "Categorical Perception" (2003) by Harnad. Under the subheading "Evolved CP:"

      "In this respect, the "weaker" CP effect for vowels, whose motor production is continuous rather than categorical, but whose perception is by this criterion categorical, is every bit as much of a CP effect as the ba/pa and ba/da effects....it looks as if the effect is an innate one: Our sensory category detectors for both color and speech sounds are born already "biased" by evolution: Our perceived color and speech-sound spectrum is already "warped" with these compression/separations."

      I think it's logical that this innate CP capacity that has evolved to be nearly universal to humans is what gives us UG. Clearly UG is not words or ideas we are born conscious of and apply once we learn to speak, just sitting there in our minds waiting to be used. CP is our innate ability to recognize categorical features and patterns by some internal causal mechanism(s) which can automatically compress within-category differences and/or enhance between category differences and thus we can interpret things in the world, and determine whether they're the right kind of thing or another kind of thing.

      UG seems to be an enigma... we don't know exactly what it is because we don't know what it isn't. But, because it has to do with syntax, and syntax is the ordering of word categories, sentential constituents and affixes etc., I want to argue that UG is possible because of CP. Encyclopaedia Britannica describes: "Universal grammar consists of a set of atomic grammatical categories and relations that are the building blocks of the particular grammars of all human languages." My primary thread is that it is all about categories, and learned languages represent forms of learned CP.

      Perhaps CP is the more basic structural mechanism which evolved before UG, and UG is in fact what we're calling an evolved, more complex addition onto the house of CP. In class we discussed how there is no "half-UG" that we've found evidence for so it's hard to explain where it came from - the assumption being that the UG gene didn't just pop into existence one day fully formed to generate this capacity for language.
      Under the subheading "Acquired Distinctiveness," Harnad (2003) writes "Eimas et al. (1971)...found that infants already have speech CP before they begin to speak. Perhaps, then, it is an innate effect, evolved to "prepare" us to learn to speak. But Kuhl (1987) found that chinchillas also have "speech CP" even though they never learn to speak, and presumably did not evolve to do so." I want to suggest instead that "speech CP" is CP that allowed for the further development of UG. The fact that infants have it already, and so do other organisms who can produce sound, is suggestive of the idea that UG is a categorical mechanism for sound production evolving after and from earlier categorical mechanisms which existed before speech and language.

      In response to your last lines, prof, - The parameters of UG are then at least partially grounded in innate phoneme boundaries which we perceive because of innate CP. So perhaps certain other kinds of CP are also harder or impossible to change/relearn after a certain critical period?

    4. CP, whether it’s innate or learned, is about perception: things look more different (or more similar) than would be expected. With innate CP it’s more d/s than expected from their physical differences and in learned CP it’s more d/s after learning compared to before.
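
      A CP effect of this kind is usually quantified as a ratio of between-category to within-category perceived distance, before vs. after learning; here is a sketch with invented numbers:

        # Separation/compression index: >1 means between-category
        # differences are expanded relative to within-category ones.
        # All the "perceived distances" below are invented.
        def cp_index(within, between):
            return (sum(between) / len(between)) / (sum(within) / len(within))

        before = cp_index(within=[1.0, 1.1, 0.9], between=[1.0, 1.0, 1.1])
        after = cp_index(within=[0.6, 0.5, 0.7], between=[1.4, 1.5, 1.3])

        print(f"before learning: {before:.2f}")   # ~1.0: no warping
        print(f"after learning:  {after:.2f}")    # >1: compression + separation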

      Yes, distinguishing UG-compliant sentences from UG-violating sentences is categorization. With OG it’s learned; with UG it’s innate. So what would be the perceptual effect of UG (if there is one)?

      For OG CP, violations of an OG rule should sound different after you’ve learned the rule, compared to before. Nobody has tested this; but it would be a natural extension of other learned categorization.

      What complicates it is that OG-compliance and OG-violation is not just something you hear, but also something you do. But that is true for phonemes too: They’re not just something you hear, but something you also produce (hence the motor theory of speech perception).

      So maybe, as you suggest, UG rules have a CP-like effect on how we perceive and produce sentences. But just as features are what underlie visual CP effects, features would have to be what underlies both OG and UG CP effects (if there are any). And the “features” of UG-compliant sentences are the rules that generate them. So whether or not there is a UG CP effect, it’s the rules we need to reverse-engineer; the perceptual CP effects, separation/compression, are a side-effect that make the errors pop out, if you ever hear them.

      It is not that “we” don’t know what UG is (and isn’t)! Those who study it and who do research on it know. And all of us know a violation when we hear one. And none of us produce violations. So our brains “know” UG implicitly. Linguists are just trying to make the formal rules explicit, the way they are explicit for OG.

      CP is not a mechanism. Category learning is the mechanism (feature-detection). CP is a perceptual side-effect of feature detection and filtering. In the case of phoneme CP (and UG CP, if there is any) the feature “learning” was done by evolution. But how? And what were the adaptive advantages of UG? (The evolution of phoneme CP is not a puzzle the way the evolution of UG is.) I think you are equating speech perception/production and language syntax a bit too closely.

      (It is true, though, that Chomsky and many linguists consider innate constraints on phonology to be like innate UG. It’s just that, despite all my fondness for CP, I don’t see how it casts any light on UG!)

  10. After reading Pinker’s paper, I also read Harnad’s paper about “why and how” UG is hard (2008). In the section “Evolutionary trial and error”, Professor Harnad writes that “evolution faces the same learning problem the child does” (which I also agree with). I think another thing that I would just want to add is that, in the previous couple of lectures, we’ve noted that natural selection doesn’t “choose” anything; so, in order for UG to have evolved, it must have had a clear adaptive advantage, which I fail to see now. I feel like, if anything, UG is a bit of a hindrance, because it seems extremely complex, and it isn’t necessarily required in order to communicate. It all just seems so circular.
    I really like the idea proposed by Chomsky that maybe UG is constrained by how we think. This kind of sounds like the opposite of the Whorf hypothesis (in which language defines how we perceive the world around us; not quite the same but sort of?).

    What is the stance of whether other species also categorize? If we think that cognition is categorization, then maybe UG is a by-product of categorization. I mean, all the NP and VP and the types of branches that can be built look like they’re really structured and organized… I only ask about other species because, if they also categorize, then never mind all that I just said because then clearly their categorization skills did not lead to anything homologous to UG.

    Replies
    1. If UG is because of constraints on how we can think, then those are the constraints that need to be explained evolutionarily -- or as formal logical necessities (which they are not).

      Of course other species categorize! (Remember what categorization means?)

  11. In this essay, Pinker contends that Universal Grammar should include all universal linguistic characteristics, and that a complete theory of Universal Grammar would help us understand what the crucial functions are that enable humans to acquire language. He is able to rule out various factors that others have suggested play a critical role in acquiring language, e.g., negative evidence.

    Eventually he describes a model where a few rules are initially bootstrapped based on the available information, and then Universal Grammar guides how the different pieces of information obtained as input interact with one another to form rules, through mechanisms such as blocking.

    Pinker views language acquisition as solving a series of categorization problems until the learner obtains a generalized method for all inputs. In this view, UG would be the critical function that separates our categorization abilities from other organisms. Most often UG is referred to as if it is an organ that has a set of secret grammar rules encoded, that we aren’t able to consciously define. But, is it possible that that version of UG is just a product of a general superior categorization ability, rather than an isolated capacity?

    Replies
    1. If UG is just a "spandrel" of categorization ability (category learning ability?) you have to (1) know what the rules of UG actually are and (2) explain how UG would be a side-effect of that.

    2. I think I might be arguing kind of a similar point to Sam's in my comment thread above, but idk, Sam correct me if I'm wrong!

      My thought is not so much that UG is a "spandrel" of CP as a side-effect it produces, but that CP could be an evolutionary precursor to the innate mechanisms that produce what we call UG.

  12. In this paper, Pinker describes Learnability Theory, which holds that children try out hypotheses (the "learning strategy") of what might or might not be correct formulations and then test those hypotheses against the formulations they hear in their environment. In the positive case, if what they hear aligns with their hypothesized formulation, they know their formulation was correct. However, a problem arises in the negative case -- there is not enough negative feedback to tell children they are wrong for any possible formulation they might come up with. If a child could come up with an infinite number of incorrect formulations, "the world can never tell him he's wrong." Hence, there must be some kind of constraint on the hypotheses children can come up with -- they can only come up with possible formulations that abide by the rules of UG: "UG specifies the allowable mental representations and operations that all languages are confined to use. The theory of universal grammar is closely tied to the theory of the mental mechanisms children use in acquiring language; their hypotheses about language must be couched in structures sanctioned by UG."

    If the formulations children can come up with--their hypotheses--are constrained by UG, would this mean that Learnability Theory applies for the learning of ordinary grammar but not of UG? Does UG constrain the hypothesized formulations enough that children can no longer come up with so many incorrect formulations that the world cannot correct them--that is, with the constraints of UG, are there a limited number of incorrect formulations in ordinary grammar? It seems to me that this would have to be the case for Learnability Theory to hold, because if there are an unlimited number of incorrect formulations in ordinary grammar (even given innate UG), we would run into the same problem of a lack of negative feedback. In order to know just how constraining UG is, we would have to reverse engineer the rules of UG and then determine, given those rules, how much space there is for incorrect hypothetical formulations in ordinary grammar.

    Replies
    1. Distinguish (1) OG rules, (2) UG parameter settings and (3) UG rules. (1) and (2) are learnable by unsup and sup learning and instruction. (3) is not (at least not by the language-learning child). (See prior Replies.)

    2. (1) OG rules are the ones that we learn through positive and negative feedback. For example, if we make a grammatical error such as "I swimmed", this would be in violation of OG rules and we would need negative feedback to learn that it is incorrect.

      (2) UG parameter settings would be things like whether or not the language is pro-drop. These parameters can be learned. It is possible to learn a setting using only positive feedback, if something is heard a lot, but it can also be learned with corrective feedback. UG parameter settings are language-specific.

      (3) UG rules are the ones that underlie all languages. This is what is completely innate.

      Distinguishing whether something is UG or OG compliant or not is a categorization task. However, are the OG rules themselves a category in our minds? Are UG parameters and UG rules also categories? Or is it the sentences that are UG/OG compliant that make up the categories, independently of parameter settings?

    3. Language (like speech, and movement) is a perception/production "mirror" skill: you can both perceive it as input and produce it as output.

      "Features" are whatever distinguishes the members of a category from the non-members (positive and negative examples).

      For perceptual (input) categories, the features are either sensory, or sensorimotor (i.e."affordances": based on things you can and cannot do in interacting with the input).

      For production (output) categories, features are constraints on what you can and cannot do as output.

      With speech, features are motor constraints on what you can and cannot pronounce: You can't make the sound "ga" using just your upper and lower lip, as in "ba." Your "mirror" system "knows" that rule, when you hear and try to imitate "ga."

      With grammar, features are constraints on what you can and cannot (or should and should not) say, such as "Yesterday I swam" vs. *"Yesterday I swimmed." The grammatical features are motor constraints on what you can and cannot (or should and should not) do. (But they are not purely motor, as in speech imitation, because they are also related to meaning: "Today I swim" -- "Yesterday I swam.")

      This is all OG rules. For UG it's similar, but the constraint does not come from learned, shared grammatical rules that can change, but from innate rules that cannot change (and vary only in their learned parameter-settings).

  13. “Children do not, however, need to hear a full-fledged language; as long as they are in a community with other children, and have some source for individual words, they will invent one on their own, often in a single generation”

    I found the formation of creole languages fascinating. Children who hear only bits and pieces of languages are able to form a whole new language with complex grammar. This shows the power of universal grammar. Should our T2/T3/Tn robots be able to do this? How in the world do you program universal grammar? Currently, computer systems employ supervised and unsupervised learning to pick out patterns in the world. But since grammar seems to be learned through positive examples only, I don’t see how a robot could ever learn it. Supervised learning relies on the correction of the robot when it is wrong. Without this correction, the robot simply won’t learn anything. So how do we do it?

    ReplyDelete
    Replies
    1. Distinguish OG and UG, but also first and later languages. Children already have UG but they set its parameters in their critical period. If they learn a pidgin at that age, it can become a creole for them as a 1st language.

      Delete
  14. Here are a few disjointed yet connected thoughts on the relationship between syntax and semantics:

    During the lecture, we discussed the question of whether there is independence of syntax from semantics in English. It was said that the symbol grounding problem suggested they are not independent. I do not think I clearly understand how the symbol grounding problem suggests this.

    I recall that the symbols of language, such as words, have meaning while the symbols in arithmetic do not. However, arithmetic can still produce an output that is semantically interpretable. Can syntax be thought of as the element that gives rise to the interpretable meaning of a sentence (the output) that we would not be able to acquire if we only had semantics?

    Pinker writes, “If children assume that semantic and syntactic categories are related in restricted ways in the early input, they could use semantic properties of words and phrases (inferred from context; see Section) as evidence that they belong to certain syntactic categories” (p. 21-22). Accordingly, Pinker believes that semantics can allow for a child to learn syntax in some cases. If you can learn syntax using semantics, does this also imply they are not independent?

    ReplyDelete
    Replies
    1. Good questions.

      In maths, there is no doubt that the semantics is independent of the syntax. The maths is all purely syntactic.

      In language this is not the case. The UG rules according to which you can say "John is eager to please Mary" but not *"John is easy to please Mary" may be formal constraints on abstract tree structures. But it's not clear whether you can even distinguish a noun phrase from a verb phrase without knowing -- semantically -- what the category "noun" refers to, let alone what the noun refers to.

      I think function words (like: if, not, the) and how to combine and manipulate them might be independent of semantics, but I don't think content words (nouns, verbs, adjectives) are. That's why they need to be grounded. So grounding might entail that linguistic syntax is not independent of semantics. (But that's just "Stevan Says.")

      What Pinker says about restrictions in early input is equivocal, because he is not distinguishing OG from UG.

      Delete
  15. “The main linguistic accomplishments during the first year of life are control of the speech musculature and sensitivity to the phonetic distinctions used in the parents' language. Interestingly, babies achieve these feats before they produce or understand words, so their learning cannot depend on correlating sound with meaning.”
    “Children do not hear sentences in isolation, but in a context. No child has learned language from the radio; indeed, children rarely if ever learn language from television.”

    Both passages reminded me of the practice of exposing a baby, in the womb or after birth, to audio recordings of a second language in order to prime them to learn it when they are older. As the second quote reveals, learning OG (and perhaps, by extension, an L2) cannot occur without some form of context. That said, as the first quote reveals, phonetic categorisation occurs before babies seem to understand words or produce utterances. Even if word learning is not aided by such L2 exposure, could the exposure prime phonetic categories, so that learning the accent of the language would be easier later on? If so, would that not suggest that phonetic categories are coupled to cognisers' general ability to categorise, rather than to an explicit parameter-setting function of OG as derived from UG? Another instance of categories being primed in the same fashion: people born into ethnic communities often have an easier time grasping the pronunciation of the community's language when learning it as an L2 later in life, even if they never spoke it as children.

    ReplyDelete
  16. I am curious about the potential impact of the social element of learning in the absence of negative evidence, specifically in the situation where a child goes through a period of using both the well-formed and ill-formed versions of a word or sentence and eventually shifts to using only the well-formed structure.

    Could the want for acceptance and inclusion have any potential role in this? If they have one form backed by positive evidence, which they can be /sure/ is an acceptable form, and another form for which they have no direct evidence (the wild-card option), then the first option is the safe one, avoiding the possibility of shame or embarrassment. I am curious to see the overlap between the learning of irregular language and the increased experience of these emotions (which very young children do not seem to experience, as they have not yet learned what acts warrant those feelings in a given culture/situation).

    ReplyDelete
  17. I would like to recapitulate and contrast the important points around the learnability of UG and OG.

    The universal rules of grammar that linguists attempt to reverse engineer are not learnable. They are only learnable by linguists who already have and use the rules of UG in everyday language and in their scientific inquiry. Learning UG is not possible in the absence of negative instances of UG strings in the environment (we never hear UG non-compliant strings, only OG non-compliant strings). We are all born with the same set of parameters or rules of universal grammar; that is what is not learnable.

    Nevertheless, during critical periods of child development, those parameters are set by a first language (word-order setting, for example: VSO/VOS/SVO) differently than when learning a second language; that is a form of learning, and during critical periods it does not require negative evidence (hearing a parameter setting repeatedly, such as a type of ordering in a sentence, is enough positive evidence to set that type of ordering rather than the other types). Parameter setting is learning to set a rule from an initial set of rules: for example, learning the VSO ordering rule from the set of VSO, VOS, SVO rules that are innate parameters/rules of UG. The ordinary grammar of our first language is the result of specific parameter settings on innate UG.

    Learning the OG of a second language is different. It is a form of learning that is done via positive and negative evidence, which are abundant in the environment. You can learn a second language by verbal instruction and corrective feedback, for example, but that will not change the UG settings set by your first language. So learning a second language is in a sense truly different from learning a first language, as it does not rewire UG (or maybe it does, but slightly and differently than during critical periods).

    But is learning a second language like learning to think in that second language (language as a constraint on thought), or is it like learning to express ideas in that language (thought as a constraint on language)? That is another interesting question.
    Or maybe learning our first language constrains thinking, and learning a second language is based on those first-language constraints; thus it would be more like learning to express ideas of the first language’s OG in a second language’s OG (a cognitive capacity for translation, distinct from the cognitive capacity for language?).
    It does feel as if I learned to think in English during my years at McGill, and not just that I learned to express my French thoughts...

    Intuition pump: All the semantic information that I possess about things in the world can be expressed with UG rules. It is UG’s ability for propositional attitudes that explains my understanding. What if different parameter settings had different effects on semantics or on our understanding? French and its specific parameter settings may bring forward some aspects of semantic understanding, and learning a second language may also bring forward other aspects of semantic understanding, even if a second language does not really rewire UG. This might explain why it feels like I think in English now that I am at McGill: it is just that I “know” things about the world from two different perspectives corresponding to the different settings of UG. I believe that differences in parameter setting may account for semantic differences in understanding.

    ReplyDelete
    Replies
    1. “The universal rules of grammar that linguists attempt to reverse engineer are not learnable. They are only learnable by linguists who already have and use the rules of UG in everyday language and in their scientific inquiry.”

      All adults know when they hear a UG violation, because they have UG in their brains. Linguists can and do learn (through supervised learning) to make the rules of UG explicit, by hypothesizing candidate rules and then testing whether they are right, using the UG in their brains to provide the supervisory corrective feedback.
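
      A toy sketch of that hypothesize-and-test loop (my own, hypothetical; the "oracle" criterion below is a stand-in, using the eager/easy example from an earlier reply, not a real grammaticality test):

      ```python
      # Toy sketch (hypothetical): a linguist makes UG explicit via supervised
      # learning, using a native speaker's grammaticality judgment (the UG
      # already in their brain) as the supervisory corrective feedback.

      def native_judgment(sentence):
          # Oracle stand-in: a real linguist would ask an informant, whose
          # brain "knows" UG without being able to state its rules.
          return sentence != "John is easy to please Mary"

      candidates = [
          "John is eager to please Mary",  # predicted grammatical
          "John is easy to please Mary",   # predicted ungrammatical (starred)
      ]

      for s in candidates:
          verdict = "ok" if native_judgment(s) else "*"  # corrective feedback
          print(verdict, s)
      # The linguist revises the hypothesized rule until its predictions match
      # the oracle's verdicts on all candidate sentences.
      ```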

      “during critical periods of child development, [UG’s] parameters are set by a first language… during critical periods it does not require negative evidence (hearing a parameter setting repeatedly, such as a type of ordering in a sentence, is enough positive evidence to set that type of ordering rather than the other types).”

      I’m not an expert (so you should really check this out with an expert), but I think parameter-setting is done mostly through ordinary learning (unsupervised and supervised): imitation, generalization, correction; and some may happen because there is an innate order of priority for parameter settings: a default setting (say, 0) that is only re-set to 1 if you hear structure that would not occur with setting 0. There is no POS (poverty of the stimulus) for learning UG parameter-settings, just for learning UG itself.
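
      Here is a minimal sketch of that default-plus-trigger idea (a hypothetical simplification of mine; real triggering-learning proposals are far richer, and the pro-drop criterion below is a crude stand-in): the parameter starts at its default and flips only on positive evidence incompatible with the default, so no negative evidence is needed.

      ```python
      # Sketch (hypothetical simplification): trigger-based parameter setting.
      # Default setting 0 ("not pro-drop") is re-set to 1 only on hearing
      # structure that could not occur under the default -- positive evidence only.

      def set_pro_drop(utterances, pro_drop=False):
          """A subjectless finite clause is positive evidence incompatible
          with the default, so it flips the setting."""
          for u in utterances:
              if u.get("finite") and u.get("subject") is None:
                  pro_drop = True  # trigger: re-set 0 -> 1
          return pro_drop

      italian_input = [{"finite": True, "subject": None},     # "Parla."
                       {"finite": True, "subject": "Gianni"}]
      english_input = [{"finite": True, "subject": "She"}]

      print(set_pro_drop(italian_input))  # True  -- flipped by the trigger
      print(set_pro_drop(english_input))  # False -- the default stands
      ```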

      “Learning the OG of a second language is different. It is a form of learning that is done via positive and negative evidence, which are abundant in the environment. You can learn a second language by verbal instruction and corrective feedback, for example, but that will not change the UG settings set by your first language. So learning a second language is in a sense truly different from learning a first language, as it does not rewire UG (or maybe it does, but slightly and differently than during critical periods).”

      I think 1st-language parameter settings stick, after the critical period. But, given that the setting options are finite and few, there is no mystery, no POS, about how you learn them. I think the rest of what you say above is right. (But check with someone who has a technical mastery of UG, and knows about L2 learning too.)

      “But is learning a second language like learning to think in that second language (language as a constraint on thought), or is it like learning to express ideas in that language (thought as a constraint on language)? That is another interesting question.”

      McGill used to have a tradition (from Wally Lambert) of distinguishing “compound” and “coordinate” bilinguals. Coordinates learn both languages in the critical period, as two L1s. For compounds, it’s more like a later L2, kind of “translated” from the L1. But I think none of this is particular to UG. L2s tend to be less exact both for OG and for UG parameter-settings. It is not reliable to ask L2 speakers to judge starred vs. nonstarred sentences in their L2.

      “Or maybe learning our first language constrains thinking, and learning a second language is based on those first-language constraints; thus it would be more like learning to express ideas of the first language’s OG in a second language’s OG (a cognitive capacity for translation, distinct from the cognitive capacity for language?).
      It does feel as if I learned to think in English during my years at McGill, and not just that I learned to express my French thoughts...”


      I think most of your subjective sense of which language you are thinking in concerns OG (and vocabulary), not UG or UG parameter-settings.

      Delete
    2. “Intuition pump: All the semantic information that I possess about things in the world can be expressed with UG rules. It is UG’s ability for propositional attitudes that explains my understanding.”

      What has UG to do with propositional attitudes in particular, rather than syntax in general?

      “What if different parameter settings had different effects on semantics or on our understanding? French and its specific parameter settings may bring forward some aspects of semantic understanding, and learning a second language may also bring forward other aspects of semantic understanding, even if a second language does not really rewire UG.”

      Again, I think these are the familiar reflections of a bilingual, and they do not have much to do with UG or its parameter settings…

      “This might explain why it feels like I think in English now that I am at McGill: it is just that I “know” things about the world from two different perspectives corresponding to the different settings of UG. I believe that differences in parameter setting may account for semantic differences in understanding.”

      I would answer as above. Your introspections are all compatible with vocabulary and OG differences, even stylistic ones (which are probably part of OG too).

      Delete
    3. Thank you for your response, this is much appreciated.

      Delete
    4. One follow up if you could answer me:

      You say in your response that the semantic differences I am addressing in my introspection between French and English are compatible with OG differences. If we say that during parameter setting UG is wired according to a first language, then isn’t it true to say that the OG of that first language is just an instance of possible UG parameter settings?

      And thus, when I talk about semantic differences between both languages, am I not just talking about differences induced by different specific UG parameter settings? Some UG settings that make French and some UG settings that make English. My point would be that depending on which language sets parameters of UG in the critical period, there would possibly be differences in semantic understanding; that having your UG parameters set by a particular language makes you semantically understand the world in a specific way…


      Delete
    5. Maximilien, this is a good moment to reflect on (1) evidence and causal explanation versus (2) hermeneutics and subjective interpretation -- and on information as the reduction of uncertainty, as discussed at the very beginning of this course.

      The differences in French/English meanings that you are introspecting about are and were completely explainable by the differences in categorization, vocabulary, and Ordinary Grammar before you took this course. Now that you've heard a (tiny) bit about UG and UG parameter-setting in this course, what has changed?

      It is conceivable that some of your introspections about F/E and meaning might turn out to have something to do with UG parameter-settings. But since those too are learned (just as categories, vocabulary and OG are), and since you have not studied UG or parameter-settings, you have not actually learned anything concrete -- no uncertainty has been reduced -- except that you now know that if you choose to study linguistics, it is conceivable that some of your introspections about F/E and meaning might (or might not) turn out to have something to do with UG parameter-settings.

      What is an explanation? It is something that provides information, something that reduces uncertainty among alternatives that matter to me. In a sense all "mattering" is subjective, because it is based on feeling. But some things matter objectively too (will publicly condemning Trump cost me my job?) and some things matter only subjectively (is X's interpretation of this poem more satisfying than Y's?).

      A "scientific" explanation has to reduce objective uncertainty, not just subjective uncertainty. A hermeneutic explanation may reduce subjective uncertainty, but objective uncertainty is not even at issue.

      Now in this light, can you say what prior uncertainty (about the F/E bilingual experience you have had) has been reduced for you by what (little) was taught in this course about the existence of UG and UG parameter-settings? And was the uncertainty and its reduction objective or just subjective?

      This kind of introspective exercise is useful (among other things) in immunizing oneself against fake news and cults, as well as against over-interpretation in general...

      Delete
  18. Pinker writes: "The child must keep an updated mental model of the current situation, created by mental faculties for perceiving objects and events and the states of mind and communicative intentions of other humans. The child can use this knowledge, plus the meanings of any familiar words in the sentence, to infer what the parent probably meant."

    I am also curious as to whether "infer meaning" is a weasel-word. Meaning is part of the hard problem. What we are concerned about when we think of reverse-engineering is symbol grounding, so as to ensure we manipulate the symbols and apply rules correctly. A child being able to infer meaning is different from a robot being able to ground the symbol. However, because of the other minds problem, we are not concerned with the distinction between the two. Could the same technique (mental model) be used for a robot to acquire vocabulary? What does it mean to "infer"?

    Also, how is acquiring vocabulary different from acquiring categories?

    ReplyDelete
    Replies
    1. Good idea, but actually the hard problem is not more (or less) relevant to inferring meaning (i.e., to inferring what proposition and referent was intended by the speaker) than it is to identifying apples: In both cases it feels like something, but it's the hard problem to explain how or why. But the easy problem is enough: making the right inference, identifying the right category, doing the right thing.

      (Inference itself can be done purely formally too, with ungrounded symbols, as in logic or maths. Inference in mind-reading can always be just inferring what the other is doing, or going to do, rather than what that feels like.)

      What is the weasel word is the "mental" in the mental model: Here a Dan Dennett would be right that all you need to infer is probable words and deeds, not feelings.

      Acquiring categories is learning to do the right thing with the right kind of thing. Sometimes the right thing to do is to name a kind of thing. But naming itself is trivial. If you already know what to do with things that look and feel (sic, in the sense of palpation) like apples -- i.e., if you have already acquired the right feature-detectors -- then also learning to call them "apple" is trivial.

      Having the right "mental model" usually just means having the right feature-detectors and making the right inferences about intentions (probable actions). It's all feeling-independent reverse-engineering.

      Delete
  19. After reading this paper so late into this class, I feel as if this complicates the prospects of reverse-engineering language capacities in a T3 robot.

    Pinker's paper outlines the various theories and models that scientists from various fields have come up with to explain how children use language, and how that usage could be linked to how they acquired it. Despite some of these concepts being very intricate and clever, they are flawed, and as Pinker states: "none of their [children's] behavior reflects one of these components acting in isolation."

    This poses a huge problem for cognitive science, as language capacity is one of the defining features of human behaviour. If we cannot properly reverse-engineer it, then we are effectively stuck in our quest to reverse-engineer doing capacities. Obviously, we will not want to begin at T3, as that is more complicated than beginning at T2; but even at the T2 level, we haven't quite figured out how to do it properly.

    As Pinker stated, we could not have acquired language capacities by correlation (unsupervised learning: there are too many instances where correlation is lacking, yet children still learn), nor simply by supervised learning (there is a lack of negative evidence), nor through instruction, as that would be too demanding on parents' time, and kids implicitly learn words at a faster rate than they could ever be taught. This means we cannot give traditional forms of learning to computational systems in order for them to learn language, as they would lack the tools necessary to use language properly (this can be seen in the linguistic errors that Siri, Alexa and Google Assistant constantly make, even though they are becoming very efficient at basic things such as recognizing mumbled words). As a result, the machines we feed language-learning algorithms to will not be able to fully use language the way we do until we find the "right answer" to how we acquired language; then we can test it computationally and see whether the machine uses language like we do.

    ReplyDelete
    Replies
    1. It's just UG that can't be learned, not language: categories, vocabulary, OG are all learnable. Pinker conflates what's true about UG with what's true about the rest of language.

      Delete
    2. This was my initial thought after reading the paper too. Since UG can't be learned and everything we've done to simulate language depends entirely on learning algorithms, I thought we weren't even on the right path to get to T2, let alone T3 or T4.

      However (and this isn't to say that we're near achieving T2 or anything), I don't think our current technology and advances in NLP are inapplicable to a more human system in a robot. That is, I know we don't fully understand the how and why of UG, but wouldn't it be possible, with a thorough enough understanding of which rules fall under UG and which don't, to almost hardcode UG into a machine learning model, so that anything learned is built onto UG without changing it, kind of like our innate structure? (I know we'd still need to tune the parameters of the innate grammar before building onto it, and I'm not sure a model even resembling this is possible, but hypothetically it feels like it should be. Or we could even simply hard-code the UG rules for English onto the robot, for instance (assuming prior tuning during the critical period or something), and then train it with our current models.)
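
      A hypothetical sketch of what "hardcoding UG into a machine learning model" might look like (everything here is invented for illustration; the real UG rules are exactly what has not yet been reverse-engineered, so the "innate" check below is a crude stand-in): the innate part is a frozen filter with no trainable parameters, and all learning happens on top of it.

      ```python
      # Hypothetical sketch: UG as a fixed, untrainable filter, with the
      # learnable part (OG patterns) built on top of it. All names and the
      # stand-in "UG" rule are invented for illustration.

      def ug_compliant(sentence):
          """Frozen 'innate' constraint -- never updated by training data.
          Stand-in rule: every sentence must contain a verb-tagged word."""
          return any(tag == "V" for _, tag in sentence)

      learned_og = set()  # the trainable part: OG patterns induced from input

      def train(corpus):
          for sentence in corpus:
              if ug_compliant(sentence):  # learning happens INSIDE the filter
                  learned_og.add(tuple(tag for _, tag in sentence))
              # UG-violating strings never reach the learner, and the filter
              # itself has no parameters to update -- it cannot be overwritten.

      train([[("John", "N"), ("swims", "V")],
             [("swims", "V"), ("John", "N")]])
      print(learned_og)  # word-order patterns learned on top of the fixed filter
      ```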

      Delete
    3. “Wouldn't it be possible to hardcode UG into a machine learning model?”

      Ada, I think your idea is correct – I remember seeing Professor Harnad say in a response that once UG rules are “reverse-engineered” we will be able to simply encode them into an artificial intelligence. The question that remains is how these rules came to be encoded in us, and why?
      My question is whether reverse engineering and encoding the UG rules really helps us answer those questions. Discovering the rules of UG seems to answer the question of “what” we do, rather than how or why. I’m not sure that this form of reverse engineering is claiming to answer anything more than that, but I have trouble imagining the next step that would answer them.

      In fact, I still have trouble seeing how reverse engineering that creates “doing capacity” in a different way than humans is very helpful. This is something that I brought up in a very early lecture, and Professor Harnad had given the example of the “toy capacity” of playing chess as an example of successful reverse engineering.

      But does having computers that can play chess really teach us that much about the processes that underlie the human ability to play chess? Perhaps it is only the conjunction of all doing capacities together that will provide an explanation?

      Delete
  20. There seems to be a parallel between the hard problem of consciousness and the hard problem of universal grammar. In both cases, we do not know exactly what evolutionary advantage is conferred, or why we have these abilities. What Pinker attempts to do in this paper is describe how language is acquired in children, but he fundamentally misses the point that UG is distinct from OG. UG is unlearnable, and furthermore, UG is hard to pin down to exact rules: we can describe the rules of OG, but UG is still being studied. Pinker fails to make this distinction in his paper, focusing instead on the rules of OG. UG is unlearnable because the child hears only positive evidence, and since there is no way to create the category that way, UG must be innate. The “errors” the child makes in the examples are OG violations, not UG violations, and the child produces UG-compliant speech from a very young age. The paper does a good job laying out theories of how children pick up OG, but how UG comes to be ingrained is still a mystery.

    ReplyDelete
    Replies
    1. I actually don't see much of a parallel between the hard problem and UG. I agree that in both cases we can't infer how evolution led us to develop these capacities, but that doesn't mean they're not advantageous. I'm sure that lazy evolution has allowed us to pass on these processes on the grounds that they are so important! Also, I believe UG is unlearnable by most people because they don't have the negative instances -- however, linguists can map out what the exact rules of UG are. With the hard problem of consciousness, we can only be sure that we think, but "mapping" out how and why we feel is a different story. At least with language, understanding OG can help linguists understand UG. In cognitive scientists' case, solving the easy problem does not help in solving the hard problem at all.

      Delete
