Blog Archive

Monday, September 2, 2019

8b. Blondin Massé et al (2012) Symbol Grounding and the Origin of Language: From Show to Tell

Blondin-Massé, Alexandre; Harnad, Stevan; Picard, Olivier; and St-Louis, Bernard (2013) Symbol Grounding and the Origin of Language: From Show to Tell. In Lefebvre, Claire; Cohen, Henri; and Comrie, Bernard (eds.), New Perspectives on the Origins of Language. Benjamins.

Arbib, M. A. (2018). In support of the role of pantomime in language evolution. Journal of Language Evolution, 3(1), 41-44.

Vincent-Lamarre, Philippe; Blondin Massé, Alexandre; Lopes, Marcus; Lord, Mélanie; Marcotte, Odile; and Harnad, Stevan (2016). The Latent Structure of Dictionaries. Topics in Cognitive Science 8(3), 625–659.



Organisms’ adaptive success depends on being able to do the right thing with the right kind of thing. This is categorization. Most species can learn categories by direct experience (induction). Only human beings can acquire categories by word of mouth (instruction). Artificial-life simulations show the evolutionary advantage of instruction over induction, human electrophysiology experiments show that the two ways of acquiring categories still share some common features, and graph-theoretic analyses show that dictionaries contain a core of more concrete words, learned earlier from direct experience, from which the meanings of the rest of the dictionary can be learned by definition alone, by combining the core words into subject/predicate propositions with truth values. Language began when purposive miming became conventionalized into arbitrary sequences of shared category names describing and defining new categories via propositions.
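A rough way to make the graph-theoretic claim concrete is sketched below. This is a toy illustration with a made-up miniature dictionary and a brute-force search, not the paper's actual algorithm or data: a minimal grounding set is a smallest set of words that, if already grounded, lets every other word be learned from definitions alone.

```python
from itertools import combinations

# Toy dictionary: each word maps to the set of words used in its definition.
# A hypothetical miniature for illustration, not data from the paper.
toy_dict = {
    "animal": {"living", "thing"},
    "living": {"thing"},
    "thing":  {"living"},          # "thing" and "living" define each other (circular)
    "cat":    {"animal", "small"},
    "small":  {"thing"},
    "mat":    {"thing", "flat"},
    "flat":   {"thing"},
}

def all_learnable_from(grounded, dictionary):
    """True if every word can be learned, in some order, from definitions
    that use only already-grounded or already-learned words."""
    known = set(grounded)
    changed = True
    while changed:
        changed = False
        for word, defn in dictionary.items():
            if word not in known and defn <= known:
                known.add(word)
                changed = True
    return all(w in known for w in dictionary)

def minimal_grounding_sets(dictionary):
    """Brute-force search (fine at toy scale) for the smallest grounding sets."""
    words = sorted(dictionary)
    for size in range(len(words) + 1):
        hits = [set(c) for c in combinations(words, size)
                if all_learnable_from(c, dictionary)]
        if hits:
            return hits
    return []

print(minimal_grounding_sets(toy_dict))
# [{'living'}, {'thing'}] -- either word breaks the circularity,
# which also illustrates that MinSets are not unique.
```

In a real dictionary the circular core is far larger, which is why (as discussed in the comments below) a human minimal grounding set runs to roughly 1500 words rather than one.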

49 comments:

  1. This set of readings spoke to the different ways language may have been acquired, either by our species or by the individual. On a species level, a repeated set of pantomimes and associations with context may have led to a sort of communication that turned mimicry's showing into telling. On an individual level, our words were learned by instruction and induction, with the induced ones, and the words used in our definitions, grounded in our senses.

    I noticed that in the MinSet analyses, deictic terms were excluded since they were categorized as function words. However, deictics are different from other function words like "a" or "the": where those two specify how to think about the category that is about to be described in the abstract, deictics do the same but in context. Is this contextual dependency justification to include them in our MinSets?

    Also, deictics’ origins are an enigma to me. If we were to play charades, I could easily pantomime a lion, but acting out "here" or "now" would be difficult, if not impossible. Moreover, how could I instruct the meaning of these words to individuals who don't have them in their lexicon?

    ReplyDelete
    Replies
    1. @JG I would like to respond to the question about instructing the meaning of “here” and “now” because this is also an issue I am wondering about. After reading the paper, I learned that “Nontrivial categories cannot be learned by mere imitation” (note ii, p. 16), so I could not instruct the category of “now” by just acting it out. The learner would have to already have some other categories that they have learned from induction or instruction. Maybe the category “present” and the category “time” are in the minimal grounding set. Then my pantomime performance would just have to make it clear that the category “now” contains everything that is found in both the category “present” and the category “time” (a very rough definition, it could certainly be improved by adding other categories). I am unsure of exactly how I could pantomime this, which explains why it was beneficial and more efficient to just come up with some shared arbitrary symbols for “present” and “time” because it is not obvious how to convey these ideas clearly through just pantomiming. In this way telling seems preferable to showing in terms of effective communication of information. I am doubtful that someone’s uncertainty over what “now” means could be reduced by me trying to act out “present” and “time” in charades. In class I remember discussing that the core set of a dictionary is not very “concrete”. Given that our minimum grounding set contains some of these core words, some of these abstract words have to be learned through induction to at least give the learner and teacher a starting point.

      Delete
    2. Good points. Here's some food for thought:

      Yes, "here" and "now" would be as hard to mime as "if" and "the" in charades (hence also in the prelinguistic world of gestural communication).

      That is why the first approximation is to look at the meaning of content words -- the ones that do have referents, many of which can be mimed.

      Yet deixis is also related to pointing. And pointing is at least as fundamental as miming. The capacity to communicate by deliberate pointing is almost as rare as language itself. Its presence is uncertain even in the great apes, who are all geniuses; and even in dogs (likewise geniuses) it is a trained skill, not one dogs use spontaneously to communicate. (This is worth discussing.)

      Categorization itself is not language; nor is "naming" in the sense of learning to make an arbitrary response (or symbol) to signal the presence of a member of the category, or to request it. Is that the origin of referring? Food for discussion.

      "Stevan says" that "naming" alone is not referring. It's only referring once there is the capacity of producing (and understanding) propositions: subject-predict statements with truth values: "Apple (is) red." "Cat (is) on-mat" -- and of course "NOT Apple (is) red." "NOT Cat (is) on-mat"

      (PrEpositions do have referents: under-ness, in-ness, on-ness, between-ness, beside-ness -- but it gets tougher with with-ness, of-ness, for-ness etc.)

      Grammar, like vocabulary, is something we invent rather than inherit genetically. Lazy evolution leaves it up to us to decide whether we are going to say "between you and I."

      But that's Ordinary Grammar (OG), and it differs from language to language and across time. There is another grammar, called Universal Grammar (UG), discovered by Chomsky, that is the same for all languages -- and, most important, it is not learned by children; in fact it is unlearnable (because of something called the Poverty of the Stimulus (POS), which we will discuss), yet children follow its rules.

      The evolution of UG is a problem Pinker & Bloom do not discuss. We will.

      Delete
    3. Stephanie, miming is not a way to acquire categories; it's a way to communicate nonverbally about what you want and what's going on. And it might be a means of transition from nonlinguistic communication to linguistic communication: propositions. Once the capacity to transmit and receive subject/predicate propositions has "arrived" (we still don't know how), there's no more need for miming; and once you have enough grounded category names to define or describe everything else out of them, there's no more need for direct sensorimotor grounding either. Theoretically. But I'm sure direct sensorimotor grounding still goes on lifelong. We never quit sensorimotor grounding of new categories to just make do with having them described to us verbally out of categories we already have, even under covid...

      Delete
  2. “What is certainly true is that, when we manipulate natural language symbols (in other words: words) in order to say all the things we mean to say, our symbol manipulations are governed by a further constraint, over and above the syntax that determines whether or not they are well-formed: that further constraint is the “shape” of the words’ meanings.” (p. 6)

    I am going to try to connect this back to the reason why a T3 robot would be needed to pass the T2 Turing Test. The descriptions of formal symbol manipulations or syntactic rules that appear in this paper seem to match perfectly with how we defined computation. So I do not think that a T2 computer would have trouble making a well-formed sentence or proposition. The problem comes with natural language. As the quote above concludes, to be able to "say all the things we mean to say" it is clear that this cannot just be a series of syntactically well-formed statements; it must be a series of well-formed statements where the meaning is not ignored. I think there is a famous example by Chomsky: "colourless green ideas sleep furiously." The meaning of this does not make sense, but to a T2 computer this is equivalent to saying something that does make sense, like "the cat is on the mat." I think it would be possible to distinguish the T2 computer from the human during the T2 test because the computer does not operate under this further constraint of meaning. A robot that has the sensorimotor capabilities to ground some minimal set of symbols via induction, and then learn the rest through instruction or induction, would, I think, be able to use natural language in a way indistinguishable from humans.

    When I (or the robot!) have learned the meaning of certain categories either through induction or instruction, it is very clear to me that the intersection of the categories "green" and "colourless" is empty, and the intersection between the categories of "things that are asleep" and "things that act furiously" is also empty. But for a proposition like "the cat is on the mat", there is a non-empty intersection between the categories "things that are cats" and "things that could be found on a mat".
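    A minimal sketch of this intersection test (the category members below are made-up placeholders, just to illustrate the set-theoretic point, not a model of how grounding itself works):

```python
# Once category names are grounded in extensions (sets of things they pick
# out), a proposition's plausibility can be screened by set intersection.
# The members listed here are made-up placeholders.
extensions = {
    "green":      {"grass", "frog", "emerald"},
    "colourless": {"water", "air", "glass"},
    "cat":        {"tabby", "siamese"},
    "on_a_mat":   {"tabby", "slipper"},
}

def compatible(a, b):
    """True if the two grounded categories have a non-empty intersection."""
    return bool(extensions[a] & extensions[b])

print(compatible("green", "colourless"))  # False -- nothing is both
print(compatible("cat", "on_a_mat"))      # True  -- 'tabby' is both
```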

    ReplyDelete
    Replies
    1. Good comments.

      Underlying grounded propositions there have to be the feature detectors that distinguish cat shapes from non-cat shapes and mat shapes from non-mat shapes. That's the feature filter that underlies the category "pop-out" effect (between-category-separation / within-category-compression) of categorical perception.
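      That compression/separation idea can be pictured with a toy calculation (an illustrative sketch with made-up numbers, not from the readings): up-weighting the category-relevant feature shrinks within-category distances relative to between-category distances.

```python
import numpy as np

# Toy illustration of categorical-perception "compression/separation":
# weighting the distinguishing feature more heavily compresses distances
# within each category relative to distances between categories.
rng = np.random.default_rng(0)

# feature 1 separates the two categories; feature 2 is irrelevant noise
cats    = rng.normal([ 1.0, 0.0], 0.5, size=(50, 2))
noncats = rng.normal([-1.0, 0.0], 0.5, size=(50, 2))

def mean_dist(a, b):
    return np.mean([np.linalg.norm(x - y) for x in a for y in b])

def separation(weights):
    wc, wn = cats * weights, noncats * weights
    within = (mean_dist(wc, wc) + mean_dist(wn, wn)) / 2
    between = mean_dist(wc, wn)
    return between / within

print("before feature weighting:", round(separation(np.array([1.0, 1.0])), 2))
print("after learning the relevant feature:", round(separation(np.array([1.0, 0.1])), 2))
```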

      A computation has no problem discerning that "cat" and "non-cat" are formally contradictory; "sleep furiously" is trickier.

      But the Chomsky sentence is not meaningless, for someone with grounded language. It's what inspired the poet John Hollander to complete the Chomsky proposition perfectly meaningfully with his poem

      "Coiled Alizarine"

      Curiously deep, the slumber of crimson thoughts,
      But, breathless, in stodgy viridian,
      Colorless green ideas sleep furiously.


      Hollander couldn't have done it for "Squiggle-less squoggly squaggles squoog squirglily," which would be grammatical too, and not completely meaningless (because of the cheating functional inflections).

      Maybe Lewis Carroll could have done it in Jabberwocky, or Tolkien in "Sindarin" -- or maybe the greatest code-maker/code-breaker of them all: Alan Turing.

      Delete
  3. “Sensorimotor experience is sensory as well as motor; it includes anything that is experienced – i.e. felt – via any modality, whether the usual five “external” senses or the interoceptive ones, such as proprioception, kinesthesia or emotion.” (p.30)

    In the case of a T3 robot, this would indicate that the robot would not only need analog sensors to interact with the world enough to ground the MinSet but would also require some semblance of emotional capacity. Knowing what “happy” means would necessitate having felt it, or having felt sadness and knowing happiness is the opposite. Without having felt these emotions, the linguistic definition would be circular and useless. Would this suggest that to achieve a T3 robot, some beginnings of an answer to the hard problem would be needed? Not only would the robot have to be able to do all that we can do, it would also have to express (and possibly feel) all the emotions we can feel which inform our actions and thoughts. I do concede that we would not be able to know if the robot truly feels these emotions, since we could only ever be pretty sure it does as long as it uses emotional language in the right (or plausible) ways.

    ReplyDelete
    Replies
    1. Hi Aylish,

      While I do agree with you that emotionality would definitely be necessary in how the T3 robot would interact with the world, I partially disagree with your claim that it wouldn’t know the meanings of emotions such as ‘happy’ or ‘sad.’ As stated at the beginning of the Lamarre paper, you could categorise and thus recognise emotions simply through error-corrective feedback from instances you witness, e.g. a woman smiling on receiving flowers is happiness while a child sitting out on a bench waiting for the bus is not, making this type of categorisation emphatically non-circular. I think the ability to feel emotions would help to make this categorisation task easier in a sense, as you have the added context of people interacting with you and asking how you’re feeling to categorise these emotions. Whether the system gets the ‘meaning’ of emotion is definitely a question worth asking though, as you said.

      I think the important part of a cognitive system would be that it has sensorimotor capacities generally. Whether they are the exact same types of capacities is not pertinent to a similar degree. For example, I don’t think someone who is blind and deaf like Helen Keller can categorise and symbol-ground any less than a person who has those faculties. In this sense, a deficit in ‘feeling’ these emotions might not mean that the easy problem of consciousness could not be solved in T3, although the hard problem of cognitive science has yet to be answered.

      Delete
    2. Very good points by both Aylish and William.

      T3 would obviously have to be able to speak and respond normally both to conversations about emotions, and actual emotional behaviors (in self and others).

      But would that require feeling emotions (or feeling anything at all)?

      Look how far behaviorists got by simply operationalizing emotions as just behaviors and behavioral dispositions. (Hunger = degree of food deprivation + food-seeking behavior -- including spoken.)

      Why couldn't a real insentient zombie (e.g., Ting, if she is a zombie) have learned to say and do these things on the basis of behavior and discourse alone? Her categories are grounded whether she is actually feeling something or merely detecting sensorimotor features. (I bet even today's non-T3 robots could do a good approximation in identifying "emotional" behaviors, in themselves and in others, human and robot.)

      And that applies just as much to just "seeing" red as to "emotional behaviors": all Ting has to do is correctly categorize colors (showing CP discrimination boundaries and all), whether or not it feels like something to "see" them.

      Feeling is feeling, whether sensory or affective (or semantic! grounding ≠ meaning! meaning = grounding + feeling what it feels like to understand and mean what is being said).

      A T3 zombie -- if it is possible (and explaining whether or not it is possible would require solving the hard problem) -- could mind-read just as well (or as badly) as we could, based on just behavioral and verbal input, plus the sensory input (but not the feeling!) from its own (unfelt) internal states and outputs.

      Delete
    3. I feel like (pun unintended) explaining emotions to someone through description (I guess this is learning through instruction) would be a good way to trick us into thinking they understand what the emotion is. But if understanding = grounding + feeling what it feels like to understand, I think the description would help us ground the emotion, but wouldn't contribute to the feeling aspect of what it feels like to understand the emotion of happiness, for example. Following William's example, I'd say I'd be able to categorize happiness as X and not Y but still not fully understand it, because I'm lacking the feeling part. Maybe a T3 toy would be able to trick us into thinking it understands emotions in that sense, by being taught to categorize feelings into Xs and Ys, but to actually pass T3 I'd imagine we'd need more than that.

      Delete
    4. I actually went to the Mitsuku chatbot to see how it has been programmed to deal with emotions. When asked what happiness is, it gives a description, but when asked what being happy feels like, I got the response "as a machine I have no feelings". I guess that's a bit of an answer to my curiosity about how easy or hard it would be to teach a chatbot emotions. Mitsuku could not give me an example of a happy moment the same way William could. I know this got off topic, but I wanted to see where AI was at so far. Of course this is only a chatbot and not a robot, so we're not really at the T3 level, but when I started thinking about faking understanding of emotions, that's the first thing I thought of.

      Delete
    5. Good food for thought, Lyla.

      If T3 is grounded, it has to be able to categorize, name and describe the referents of its words.

      A T3 robot could talk and act indistinguishably from you and me about, say, "anger," by being able to identify anger in another robot from its movements and facial expressions, as well as its actions and its words. T3 also has to be able to do the same with its own movements, expressions, actions and words, including detecting and categorizing and naming (but not necessarily feeling!) its own body's and brain's internal states, just as it detects external sensory input. "Mirror neurons?"

      Not only does the hard problem prevent us from being able to show why this cannot be so in T3, but I see no reason it couldn't be done in toy robots too. Do you?

      Ditto for "happy," "jealous," "envious," "afraid," etc.

      Delete
    6. And remember that it's not just for describing emotions that words fail us if we have not felt the same or similar emotions: they fail for describing colors too, and for describing seeing... So the fact that we can't say why or how grounding alone would not be enough, causally, is yet another symptom of the hard problem.

      All dictionaries (hence language itself) are circular; grounding breaks the circle for doing, but it only breaks the circle for feeling if you really feel; for zombie behaviorists, grounding simply operationalizes feelings as doings (and doing capacities and dispositions).

      Worth pondering.

      Delete
    7. @Aylish, you mentioned that knowing what “happy” means would necessitate having felt it or having felt sadness and knowing that happiness is the opposite. I think even for someone who has felt these emotions, the linguistic definitions you allude to would still be circular and useless. Emotions are difficult to put into words even for people who feel; it’s why we buy cards with pre-written messages! Furthermore, I’m not convinced that feeling emotions is necessary for knowing what “happy” means in a practical sense.

      Professor Harnad mentioned that T3 would have to be able to speak/respond normally both to conversations about emotions and actual emotional behaviours, but would that require feeling emotions or feeling anything at all? I’d say no.

      I know that cognitive science isn’t about special cases — we’re focused on how humans in general can do what they do. However, I can’t help but think of psychopaths who don’t have the same capacity for emotion that others do. In fact, they may have an even greater understanding of exactly what emotions are and how they impact human behaviour, because they aren’t influenced by emotions themselves. They’re master manipulators *because* they have such an in-depth understanding of emotion. Perhaps it would not be necessary for a T3 robot to feel emotions, but merely to have the same capacities as a psychopath — to understand how emotions influence behaviour. I agree with William — a deficit in “feeling” emotions doesn't necessarily mean the easy problem can't be solved.

      Delete
    8. Psychopaths feel emotions: Trump gets angry plenty, and resentful and vengeful, especially for being dissed or disobeyed, or for "disloyalty." They just don't feel certain emotions, such as compassion, pity, remorse, shame... And I'm not sure they understand any emotions better than others do: they're more like behaviorists: good at predicting behavior from behavior, undistracted by empathy. Behavior-predicting (and manipulating) is not really mind-reading, is it?

      Delete
    9. Hi everybody, I find it interesting that Aylish brought up the question of whether grounding requires some kind of feeling. I agree with William and Prof Harnad that feeling has nothing to do with grounding in that grounding the category of "happy" or "sad" would simply require doing the right thing in a happy or sad situation and not actually feeling happy or sad. I think an additional way to illustrate this point (that hasn't yet been mentioned) is that we know that grounding requires some kind of error-corrective feedback mechanism that tells us whether we are doing the right thing with the right kind of thing. So, as William points out, we would need to know that we're recognizing a "happy" situation when we encounter one. However, if feeling happy was required for this kind of grounding, how could there be an error corrective feedback mechanism to tell us if we're happy or not? There couldn't. As has been mentioned many times in this class, there is no homunculus telling us whether or not we're feeling the "right" emotion at a given time, so the whole idea of supervised learning seems to break down if we require feeling a certain way to be part of the learning process. This shows us, as has been mentioned, that feeling is completely separate from grounding, as the mechanisms which allow categories to be grounded just don't apply to feeling.

      Delete
  4. In section 4.3 (Pantomime to proposition: The transition from show to tell) of the paper by Blondin-Massé et al. (2013), the authors' hypothesis about how and why language began echoes the selection approach defended by Pinker & Bloom in our reading for 8a. The hypothesis is that many of the "cognitive components ... were already available before language began", and natural selection just "worked" with the available mechanisms and selected for the ones that helped the individual communicate and learn categories the most efficiently (and thus improved their chances of survival).
    Something that kind of bothers and puzzles me about this is that I don't think humans are necessarily any "better" or "more motivated" than chimps to survive and reproduce, so then why wouldn't they, with the cognitive mechanisms that they have, also have developed language and the ability to form propositions? Blondin-Massé et al. briefly mention that maybe there isn't that big of a motivation - could it be that humans just got to propositions first and then created (or we could say ruined) an environment that could have potentially led to language production in chimps as well? I'm very naive in the field of evolution and natural selection, so I'm not entirely sure if that's how that works. But maybe humans altered the environment so much that it impeded other species from picking up a language eventually as well?

    ReplyDelete
    Replies
    1. When it comes to the question of the evolution of language everyone is naïve!

      The environment for chimps and for humans differed both then and now. So the selection pressures were different too, even if we assume that all the necessary "pieces" were there in chimps and humans then. The difference in environment could explain why they selectively recombined them in humans and not chimps.

      But I don't think it's as simple as "all the pieces were there."

      Delete
  5. I found that this paper solidified a lot of what Pinker & Bloom defended by actually proposing a mechanism for how and why we developed language.

    From what I understood, we had the essential cognitive components for developing language as hominins, and they evolved and were selected for to account for our language ability. Since instructive learning has been shown to be a lot faster, safer, and more powerful than inductive learning, eventually, due to the social and kin-dependent nature of hominins, the knower began to attempt to teach the learner the categories instead of the learner simply learning by passive observation, which led to propositions.

    This is the part I'm curious about, since theoretically other species not having language must mean that they never got to this miming-for-category-learning stage. As a result, I did some (very very cursory) research on miming in chimps and came across a couple of papers that suggest that chimps do mime to communicate with each other and that some of these mimes have propositional value. Although this is obviously not language, it did make me wonder if Esther's hypothesis is right. Maybe in some way our developing language prevented other hominins from doing so. I also wonder if, instead of other hominins not developing language, we simply led to the extinction of other species that were developing similar abilities because we viewed them as threats to our survival (although I don't know that much about the field of evolution, so this might not make too much sense scientifically.)

    ReplyDelete
    Replies
    1. "Since instructive learning [is] a lot faster, safer, and more powerful than inductive learning, eventually, due to the social and kin-dependent nature of hominins, the knower began to attempt to teach the learner the categories instead of the learner simply learning by passive observation which lead to propositions."

      Isn't that a bit too fast? If that's all it takes then why wasn't learning by instruction just as useful to chimps?

      Another challenge for explaining the origins of language is that language has no adaptive advantage without language: so what kicked it off? And gradually or suddenly?

      Could you describe (and link) the evidence for miming in chimps, and especially propositional miming? What is that?

      Delete
    2. “…language has no adaptive advantage without language…”

      I am not too sure how that makes sense. Isn’t the adaptive advantage of language the benefits of instruction, such as being fast and needing only one trial, which can enable safer and more efficient learning? Learning in this way would have been beneficial to our ancestors in environments with highly concentrated numbers of predators or poisonous mushrooms. In terms of what kicked off the origin of language, I support the view presented in the paper, which appears to suggest that the origin was gradual. However, it is hard for me to conceptualize how our ancestors could begin to develop language without a genetic mutation that enables them to do so. After all, if we believe in UG then we should support the notion that some aspect of language is programmed into our genes. Accordingly, it feels very safe to conclude that the origin of language could follow a similar path to many other evolutionary adaptations.

      Delete
  6. Language capacity counts as something we can do, so when testing a T3 robot that should be able to do everything we can do, this includes being able to use language the way we do. We know that sensorimotor interactions are required in order to be able to ground a symbol (squiggle/word) to its referent (thing symbol is referring to). This takes us from a symbol system (think, computation) to something a bit closer to what we do. A T3 robot could do this. But we also know that meaning is more than just grounding, meaning = grounding + feeling of understanding. I’m not sure how we could properly assess the “feeling of understanding” portion of meaning though.

    Language is more than learning vocabulary and categorizing. Some parts of it seem innate (universal grammar: there are some grammar mistakes we just never make) while others can be learned. Since robots looking to pass the Turing test will be given lots of training sets, they will be exposed to right and wrong examples of grammar. This is different from children, who never make certain grammar mistakes despite never being exposed to examples of them. But how could we ever make something innate in a robot if everything it knows will be based on training sets?

    These paragraphs don't link very well because they’re more of just my idea dump. I’m trying to think of how we can go from faking language capacity to actual language capacity, and, maybe more importantly, how to tell the difference between those two.

    ReplyDelete
    Replies
    1. Yes, the feeling part of meaning is up against both the other-minds problem and the hard problem. Maybe the truth is that T3 grounding is impossible without feeling. But without a solution to the problem, there is no explanation of how and why it is necessary -- so why should we believe it?

      UG, like OG, is a set of rules. All syntax is like computation in that way. So once we know all the rules of UG, we could build them into T3. No magic in that. But it certainly does not tell us how or why UG evolved. (Chomsky thinks UG is not really a constraint on language; he thinks it's a constraint on thought. Language puts our thought into words. But *"John is easy to please Mary" is not a thinkable thought!)

      By the way, there's enough time and data for the human child to learn OG through unsupervised and supervised learning. It is also possible to learn UG via unsupervised and supervised learning if there are enough negative examples -- such as *"John is easy to please Mary" with feedback: "WRONG! It's supposed to be "It is easy for John to please Mary".

      So teams of adult linguists have gradually managed to "learn" the rules of UG. It's just that for the child there is not "world enough and time" in the short critical period in which toddlers learn to speak (UG-compliantly), with no UG violations heard or produced (nor, obviously, corrected). And the linguists had the advantage that they (unconsciously) "knew" the rules of UG already, so they knew that *"John is easy to please Mary" was wrong: they just had to reverse-engineer the UG rules!

      Delete
  7. If Universal Grammar is innate and there is no way of formally learning it, then does that imply that UG is a part of consciousness that we could never program a robot to possess? I am basing this question on Professor Harnad equating “feeling” with “consciousness”. Compared to programming a robot to have emotions, the case with grammar is even more bizarre because not only do we know what correct grammar sounds like, there are also formalized syntactic rules. This “knowing what it sounds like” is a feeling, or consciousness, therefore we can only get so far when teaching a T3 robot how to produce grammatically correct and semantically appropriate sentences (avoiding “colourless green sleeps furiously”). The robot will not be able to ground its produced sentences, despite having ample sensorimotor experiences, because it does not have feelings.

    ReplyDelete
    Replies
    1. We can tell (feel) if an OG rule is violated and also if a UG rule is violated. If we've learned OG grammar, we can also say what rule has been violated. But only linguists know the rules of UG -- yet we all follow them. Only reverse-engineering can tell us what they are (and then -- see above -- we can build them into T3, just as they are built into us by evolution).

      Delete
    2. "This “knowing what it sounds like” is a feeling, or consciousness, therefore we can only get so far when teaching a T3 robot how to produce grammatically correct and semantically appropriate sentences (avoiding “colourless green sleeps furiously)."

      I think this is a really interesting point. I think that so much of our decision-making occurs on a subconscious level and is experienced as feelings. In his book Homo Deus, Yuval Harari argues that emotions are algorithms that have been refined through natural selection, representing complex calculations that we are consciously unaware of. For instance, the experience of hunger is the result of careful monitoring of nutritional and energy needs, fine-tuned to urge us to take the appropriate action to fuel our bodies when needed. A faulty hunger algorithm could be deadly. Fear is the calculation of risk, sexual attraction the calculation of reproductive fitness, and so forth.

      It's not clear that universal grammar is in this same category, given the questions surrounding whether or not it is evolved. However, there is certainly a parallel in the fact that none of us explicitly know the rules, yet our feelings guide our action so that we follow them.

      All of these “algorithms” and rules can be reverse-engineered and formalized, including the rules of UG, as Professor Harnad mentions above. A T3 robot could therefore be programmed to behave the same by inputting these rules. It therefore depends on whether one judges this to be satisfactorily equivalent. Obviously, we cannot show concrete evidence of the process being fundamentally different, as the main difference is in subjective experience. However, I feel that without proposing a mechanism for such programming being or becoming equivalent to the emotionally based decision-making of human beings, it remains more likely that subjectivity is absent and the processes are inherently different, although one could argue that this is irrelevant to passing TT.

      Delete
  8. “What’s most puzzling is why [chimps] don’t seem to pick up on the power that is within their hands when they start systematically combining symbols to define new categories. What seems missing is not the intelligence but the motivation, indeed the compulsion, to name and describe."

    I think it is a little naïve to say that chimps were not motivated to learn language. Maybe it is just the wrong choice of words, but given chimps’ intelligence and similarity to humans, surely it isn’t just laziness. Just lazy evolution.

    It would be so much easier to have chimp T3 than human T3. We wouldn’t even need T2, since T2 is just language (unless chimp T2 has some sort of way to show chimp communication). In the same vein, it makes me realize how much easier our own T3 would be if humans didn’t have language. However, language is the closest we will ever get to Searle’s periscope. So, for cognitive scientists, I suppose language is a blessing and a curse: hard for us to figure out, but it makes it easier for us to figure out how/why we do other things.

    ReplyDelete
    Replies
    1. Trouble with nonhuman T3 is that we know what a normal human can do, and can tell whether there is something missing or un-human about the way it's being done. But we don't have that mind-reading mirror-capacity with other species (or not enough of it for T-testing). That said, it's certain that the path to the TT must start with attempts to reverse-engineer toy capacities and eventually "sub"-human capacities before we reach human T3-scale.

      Delete
    2. I personally found the argument that higher animals like chimps lack the motivation to progress in language development to be sensible. It made me wonder if we could instill motivation into a group of chimps by teaching multiple generations the skills necessary to use language, as Prof. Harnad described. However, this would probably take too long.
      The discussion about the T3 robot got me thinking about the conversation of “invention or gradual process” related to the origin of language. If, according to the paper, it is more likely that language developed over time, why do we try to reverse-engineer a robot with capacities indistinguishable from a human’s in one go? If we create a robot with language capacities like ours, would that not be essentially “inventing” language again? Why do we not instead focus on creating a Turing machine that can learn to develop language using training similar to the mushroom experiment discussed in the paper? I understand a computer cannot have offspring to pass on its code, but can we not conceptually work around this? Surely we could also jump-start the process a bit by programming motivation. I fear I am diverging into the realm of speculation, but I think it is interesting to be critical of our approach to reverse-engineering, considering we originally did not develop as an “invention”.

      Delete
  9. This paper reminded me of some of Joseph Henrich's work on how information is passed down in societies. The importance of language in passing on this information is highly salient to this paper, but I would also note Henrich's focus on belief. If someone is telling you and not showing you information, an amount of belief in them, or trust, is required. That is how some seemingly bizarre practices and traditions can begin (such as cultures whose pregnant women avoid fish, that use ash in their cooking, or that have seemingly long and convoluted processes for preparing certain foods) that have no point at face value, but in actuality are highly beneficial.
    But of course this 'tell' option for passing on categorical information is only viable if the source 'telling' has already gained trust. If this has not occurred I would consider 'showing' as the only effective means to pass on this information.

    ReplyDelete
    Replies
    1. If it all began within the family, the trust is a safe bet for lazy evolution. If it began in a more social context, as miming became less analogue and more abbreviated, discretized and arbitrary, wouldn't the telling inherit the "trust" from the showing?

      Besides, to suspect someone of lying, you already have to have the "propositional attitude" (that tellings can be true or false).

      My guess is that the default assumption of truth is part of the propositional attitude. Remember that the main advantage of telling over showing is that it is faster, safer and easier than finding out for yourself, the hard way. If the default assumption had been "don't believe till you've fact-checked it for yourself", would language have gotten off the ground?

      I think the persuasiveness of a proposition is really deep, and that the power of suggestion (hypnosis) is one of its features. (So are supernatural beliefs -- and the power of fake news and of a psychopath like Trump.)

      Delete
  10. Categorization is a huge part of our cognition— to do the right thing with the right kind of thing. Before language, all of our category learning was done through induction, or direct sensorimotor interaction: if we wanted to know what an edible mushroom or berry was, we'd have to go out and find it and try it ourselves enough times to distinguish between the edible and the non-edible. According to the paper, communication that would eventually become language began when we discovered that we could acquire (or steal) categories by passively observing others doing the right thing with the right kind of thing, despite not interacting directly with it ourselves (this is exemplified by the mushroom experiment, where one could acquire the category C through instruction since it was at the intersection of two already known categories A and B). The paper points out that we likely had the necessary "cognitive components" before language and that it was our motivation and social nature that enabled us to use them for language. What does that imply about the evolution of language from then to now? Did the necessary cognitive components include UG, or did that necessarily come later, when we developed syntax and written symbol systems (my understanding is that UG is mostly syntactic, whereas this description of the evolution of language focuses more on meanings, grounding, and semantics)?

    ReplyDelete
    Replies
    1. Good questions. The mushroom-world toy model was really just about the power of boolean conjunction (C = A and B) of grounded categories. "Hearsay" is really not a credible option for the origin of language. The transition from pointing and pantomime to propositions is more credible (but still just speculation -- although some of the human neurological findings give it some support).

      Gasser, B., & Arbib, M. (2019). A dyadic brain model of ape gestural learning, production and representation. Animal Cognition, 22(4), 519-534.

      Corballis, M. C. (2017). The evolution of language: Sharing our mental lives. Journal of Neurolinguistics, 43, 120-132.

      Donald, M. (2012). The mimetic origins of language. Oxford Handbook of Language Evolution.

      Delete
  11. In Blondin-Massé, Harnad, Picard, and St-Louis' paper entitled Symbol Grounding and the Origin of Language and today's class, we learned about the Baldwinian effect and the nuclear power of language.

    Firstly, I will use the mushroom world simulation created by Cangelosi and Professor Harnad to explain the Baldwinian effect. In this computational model, creatures acquired six categories through sensorimotor induction with three different types of mushrooms: A (edible/inedible), B (markable/unmarkable) and C (returnable/unreturnable). Side note: The category names in brackets were the ones used in class, not in the article. At first, the creatures learned the first two categories via trial-and-error and the third category via trial-and-error or hearsay. Hearsay is another term for learning by instruction. Through hearsay, the less knowledgeable creatures overheard the more knowledgeable creatures vocalize (or perform observable actions for) the combinations of category names. Let us take returnable = edible + markable, for example. The creature would EAT the edible mushroom, MARK the location of the markable mushroom, and would EAT and MARK the returnable mushroom. Within a few generations, the descendants only learned through hearsay rather than trial-and-error to the point that "the instruction learners had out-survived and out-produced the induction learners." This is because of the Baldwinian effect: the creatures who were more inclined or motivated to learn categorization (doing the right thing with the right kind of thing) through the most efficient way had an advantage in the form of survival and reproductive success over the rest of the group.
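    A minimal sketch of the contrast just described (with made-up features, thresholds and trial counts -- not the actual Cangelosi & Harnad simulation): induction needs many corrective-feedback trials to learn the conjunctive category, while instruction needs only the single grounded proposition "returnable = edible AND markable", provided "edible" and "markable" are already grounded.

```python
import random

random.seed(0)

def sample_mushroom():
    # two made-up sensorimotor features
    return {"spots": random.random(), "stripes": random.random()}

edible   = lambda m: m["spots"]   > 0.5   # category A, already grounded by induction
markable = lambda m: m["stripes"] > 0.5   # category B, already grounded by induction

def learn_returnable_by_induction(trials=500):
    """Trial and error: sample mushrooms, get corrective feedback, fit a crude rule."""
    samples  = [sample_mushroom() for _ in range(trials)]
    labels   = [edible(m) and markable(m) for m in samples]   # feedback from the world
    positive = [m for m, ok in zip(samples, labels) if ok]
    spot_min   = min(m["spots"]   for m in positive)
    stripe_min = min(m["stripes"] for m in positive)
    rule = lambda m: m["spots"] >= spot_min and m["stripes"] >= stripe_min
    return rule, trials

def learn_returnable_by_instruction():
    """Hearsay: one grounded proposition, C = A AND B, and no extra trials."""
    return (lambda m: edible(m) and markable(m)), 0

for label, (rule, cost) in [("induction",   learn_returnable_by_induction()),
                            ("instruction", learn_returnable_by_instruction())]:
    test = [sample_mushroom() for _ in range(1000)]
    accuracy = sum(rule(m) == (edible(m) and markable(m)) for m in test) / len(test)
    print(f"{label}: accuracy {accuracy:.2f}, extra learning trials {cost}")
```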

    Secondly, I am confused as to what exactly the nuclear power of language is. In my notes, I have two points. The first point, mentioned in class, was about the minimal grounding set (a small mix of words from the core and kernel that cannot itself be defined out of the remaining words) being able to name an infinite number of categories or produce an infinite number of propositions. The second point, mentioned in class and the article, was about the ability to learn new categories through instruction from others who already knew the old categories. Which one is correct?

    ReplyDelete
    Replies
    1. Both are correct.

      (1) If categorization ("doing the right thing with the right kind of thing") is important, and learning to do it in the fastest, safest and simplest way (instruction rather than induction), then language is the most powerful way to learn (most) categories.

      (2) How many categories would still need to be learned the old way, by induction (the way the virtual creatures in the mushroom world had to learn "edible" and "markable")? At least the minimal grounding set (2 for the mushroom world, at least 1500 for us).

      Delete
    2. @Prof: In response to (2): Does this then mean that when reverse-engineering, we need to code into the robot (I think by now we are all convinced that we need a robot with sensorimotor abilities for symbol grounding and thus to pass T3) these 1500 kernel words? UG is what we have not yet understood, but once we do and are able to code it, the robot would be able to use these 1500 kernel words to create a virtually infinite set of categories through instruction? And symbol grounding is how these 1500 members of the grounding set would be connected to their referents in the real world?

      Categorizing is something all cognizing animals do - if we are able to reverse-engineer language (including UG), would that automatically translate to having reverse-engineered cognition? How is UG linked to categorization?

      Delete
    3. Lazy evolution need not "code [the 1500 words] into the robot": just the capacity to learn them (by unsupervised and supervised sensorimotor learning).

      In principle (but surely not in practice) the grounded 1500 word MinSet would be enough to send or transmit all further categories by verbal instruction alone (via grounded propositions defining or describing all other categories out of the already grounded ones). Whether evolution has to code in UG is another matter.

      UG (syntax) is not linked to categorization (although "UG-compliant vs. UG-noncompliant utterances" is a category, both recognizing them and producing them).

      But semantics (meaning) is linked to categorization. And whether linguistic syntax (UG) can be completely independent of semantics is still an open question. ("Stevan Says" No.)

      Delete
  12. In this essay, the authors state "We can now see the sense in which the “shapes” of the symbols themselves are arbitrary: 0 is not “shaped” like nothingness, 3 is not shaped like threeness, the equals sign is not shaped like equality, and so on. Nor is the proof of 2+2=4 shaped like truth!" (5).

    Is this analogous or identical to the principle behind the reasoning for why it's easier to reverse-engineer a heart than to reverse engineer a brain? If the heart muscle is like non-arbitrary (iconic?) grounded symbols, then the brain is like the formal and arbitrary (relative to "meaning" - whatever that is) symbols. I feel like this is a flawed analogy on my part but this quote rang a big bell from yesterday's discussion of the 4th question on the Midterm.

    ReplyDelete
    Replies
    1. I should expand -
      Does this analogy work because - in opposition to arbitrary symbols, which in no way resemble their referents - non-arbitrary symbols must then share sensorimotor-interpretable properties with their referents, much like how the heart's structure/appearance reflects properties of its function (what it does), in opposition to the way the brain's structure/appearance does NOT seem to reflect iconic properties of all the things it does?

      Delete
    2. The reason the analogy does not work is that the heart does not refer!

      Delete
  13. "Some people argue that natural language’s syntax is autonomous from meaning too (Koster 1986). There are perhaps more grounds for reservations about that, but let’s assume that’s true too."

    A few lines later, it is written that linguistic symbol manipulation is governed by another constraint than just syntax: "shape" of the words' meanings.

    I am not exactly sure how to reconcile these two claims. I understand that we can utter syntactically correct sentences which are semantic nonsense. On the other hand, we know that they are nonsense. What confers us this ability? UG? Symbol grounding?

    Another question I am asking myself: has anyone systematically defined UG? When mentioned, it is often presented as this abstract set of rules that allows children to learn any language from birth. What characteristics, in any language, are "traces" or indicators of universality as opposed to just OG? Is it not always an empirical claim to denote a feature of language as universal? In other words, when can we be sure that a rule is universal as opposed to invented or "local" to the language in question?

    Lastly, it doesn't seem to be the case, but do minSets, kernels and cores tell us anything about UG?

    ReplyDelete
    Replies
    1. What gives us the ability to recognize syntactically correct but semantically incorrect sentences is UG. The underlying mechanism by which it allows us to detect them is still unclear. The best rule of thumb that we have for addressing violations of UG (and of speech & thought) is that if it sounds like a violation, it most likely is one. This is because if we possess UG, and UG has innate rules that govern the nature of language (thought & speech), then we (as carriers of UG) must know what is not a proper way to use this form of governance.

      To exemplify this, we stated in class a syntactically correct, but semantically incorrect sentence: "John is easy to please Mary". This sentence is wrong because 1) it intuitively does not make sense and 2) the name "John" and the word "easy" used in this manner violate UG principles because of the ambiguity they engender. It is not clear what "easy" means in this sentence, as it is pointing in two different directions, whereas in "John is eager to please Mary" it is uni-directional.

      Furthermore, one element of "knowing" that it is a violation is that you wouldn't think or say "John is easy to please Mary" because when you think about it, or say it, it feels (and is) wrong to say. The argument behind this is that language is what drives thought, and if language is driven by UG, then UG is the hub where thought functions. Therefore, our thoughts and speech operate under the same principles as UG and thus follow the same rules for violations of semantics, but obeying syntactics.

      I also find your question about UG, minSets, kernels and cores to be interesting. It's hard to say that they could tell us something specifically about UG, but considering there are quantifiable numbers of words that are necessary to make up these categories in language, I have a hunch that they could say something about UG. As UG is directly related to language, it is possible that these are overlapping properties of UG, in the sense that the words that occupy minSets, kernels and cores obey innate principles similar to those UG does. My argument for this would be that there is a minimal set of features that minSets, kernels and cores follow in order to be those categories, in the same manner that UG follows a minimal set of features to be its own category.

      Delete
  14. “Unlike the kernel, the MinSet is not unique. There are many alternative MinSets of the same smallest size.”

    The fact that the MinSet is not unique, or that there are many minimum grounding sets in a language that all achieve the same purpose, is of great interest to me. If everyone had a different MinSet, then this would make categorization and the ability to communicate categories through language all the more important. Our ability to understand one another despite the varying perspectives on language seems like a huge asset - and points to the consideration that there is no one right way to achieve cognition in robots. Having different minimum grounding sets does not appear to impede our ability to learn and teach categories. What is required, however, is the faculty to learn categories and convey them by instruction, and the faculty to learn by induction, something that can only be accomplished with T3 capacity (sensorimotor input). Still, I find it impressive that, given we can all have different MinSets, we are still able to communicate effectively with one another and develop languages that can express any possible proposition with those sets.

    ReplyDelete
  15. What is meant by, “So the origins of language, we suggest, do not correspond to the origins of vocal language; nor do they correspond to the origins of vocal communication. Language itself began earlier than vocal language (though not earlier than vocal communication, obviously).” (pg 2)? Is it just stating that the capacity for language existed in hominins before they used it verbally?

    My understanding of the arguments regarding chimps (pages 9-10) is that there are two possibilities: either chimps cannot understand propositional statements’ truth values, or alternatively, they might be able to understand them individually but do not have the motivation to communicate the propositions. Could limitations of language production be the reason for other hominids not being able to “name and describe”? Being able to make so many different verbal sounds is what enables us to develop so many names and categories; if we didn’t have that capacity, is it possible we never would have realized the power of categorization?

    ReplyDelete
    Replies
    1. We very clearly do not need to be able to make sounds with our mouths to develop categorization or language. This is evident nowadays with deaf individuals and sign languages. Even a deaf child who's never had access to language (any language, spoken or signed) has an incredible motivation to understand and be understood. I find it very unlikely that, for some reason, humans from so long ago simply "wanted" to communicate more than any other animal and developed language.

      Delete
  16. Question: Is this reasoning correct?

    A symbol system consists of symbols that have rules for their manipulation, but is independent of meaning. Symbol grounding means connecting content words with their referents, and this is typically done using sensorimotor feature detectors. Language is a set of symbols that we manipulate, and content words are part of this set of symbols. Is this then why the translatability thesis exists? Different languages are different symbol systems (with different rules), but the referents to which they are grounded are the same?

    ReplyDelete
    Replies
    1. A formal symbol system (such as logic, mathematics or computation) is not just hardware-independent; it is also meaning-independent.

      Of course we are not interested in meaningless symbol systems, so logic, mathematics and computation are symbol systems that can be interpreted by the user as meaning something (2+2=4). But when you do a formal proof, or you apply a formal algorithm, you can't use the meaning to do it, just the formal symbols and rules (squiggles and squoggles).
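      A one-function illustration of that point (a toy example, not from the paper): "addition" as pure shape-based rewriting of uninterpreted marks. The rule never uses the interpretation, only the squiggles; reading "||||" as 4 is something the user brings to it.

```python
# "Addition" as meaning-independent symbol manipulation: the rewrite rule
# operates only on the shapes of the marks; any interpretation (e.g. "||"
# means 2) is supplied by the interpreter, not used by the rule.
def add_marks(x: str, y: str) -> str:
    while y:                      # rule: x + "|"rest  ->  (x + "|") + rest
        x, y = x + "|", y[1:]
    return x                      # rule: x + ""  ->  x

print(add_marks("||", "||"))      # '||||' -- interpretable as 2+2=4
```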

      Not so in the case of language: The symbols have to be grounded (at least: it also feels like something to understand what symbols mean, and to produce symbols that mean something).

      Can linguistic syntax be done purely formally, like in maths? I'm not sure. This is the (unanswered) question of the "autonomy of syntax" (from semantics).

      Delete
  17. Blondin Massé et al (2012) Symbol Grounding and the Origin of Language: From Show to Tell

    In this very interesting paper, the evolution of language starts from an attempt to communicate through miming, moves to more conventionalized miming, and eventually arrives at shared arbitrary and symbolic category names. The increasingly arbitrary categories permitted the development of the new language capacity of transmitting categories to one another through propositional descriptions. This paper addresses the question of whether language is “invented” from a very basic language capacity or whether it is perhaps a long evolution of language precursors. The object of study is the origin of a symbol system that would permit one to say anything that can be said in a natural language (development of the MinSet). The authors propose that a transition from show to tell occurs as humans develop the capacity for propositional attitudes out of pantomime. They then suggest that the power of language was to acquire new categories from previously known categories (out of which the new one was constructed/ ex: C is a member of A and B) and that this, combined with observational learning, progressively gave rise to propositions. The propositional attitude developed as humans became actively involved in the communication of categories and went from incidental instruction to intentional instruction.

    ReplyDelete
  18. I hate to harp on this example, but I've been a bit stuck on the Bunny/language question since last lecture. If animals - like dogs and primates and probably many others - are smart, really smart - and, as animal experiments have shown (demonstrating their own unethicality), can do so many of the cognitive tasks which humans can do, sometimes better than we can, and furthermore have the potential for pro- and eu-sociality (see bees and ants), why don't they have language?

    When I say language, I don't mean the possibility to mimic human speech, or learn associations to a few words; I mean language as creating a fully elaborated symbol system that allows for complex and novel propositions. As Blondin Masse et al. state, this gap is created either by a lack of capacity - they simply don't have the genes/cells for it (e.g. their biology can't support something like UG) - or by a lack of motivation. The former is probably a problem for the neuroscientists and biologists; the latter is a more philosophical question - for cognitive scientists and evolutionary theorists.

    So, if it is a question of motivation, what's missing? If we follow the idea that language allowed organisms to categorize more quickly and efficiently, with less risk, and convey those categories to kin, making them more fit, to the point that lingually capable categorizers out-competed all other conspecifics, that 'motivation' is natural selection. Any 'desire' to create novel propositions and categories (expanding language) was born of adaptive advantage - even if it was consciously experienced, and still is - as a 'want', a creative drive, or 'fun' (ever made up slang for your specific group? it's a bool).

    If we assume the motivation for other organisms to develop language is similar - i.e. motivation = adaptive advantage - then which species in which contexts would feel this selective pressure? In my mind, there are two candidate groups: 1) highly social animals that live in developed, intergenerational groups and collaborate to survive (perhaps particularly omnivores (because they have more food options to assess and claim as safe or not), and species that migrate/are nomadic (because they deal with novel objects often)); or 2) domesticated animals who live in close, dependent contact with humans i.e. pets.

    Personally, I see the first category as having more potential than the second - if I had to guess, this is where language will arise first. Following Tomasello's work, for this to happen these groups will have to be truly cooperative - beyond the level of hunting together that chimps show. Further, I think the intergenerational component of social groups is an enhancing component that would make the development of language more likely - if you follow the logic of the mushroom sim, teaching is part of how language became advantageous - older conspecifics would be more likely to have knowledge to pass on - things to show and tell.

    I understand that this is all evolutionary speculation, yet I find it relevant to the case of Bunny - and animal communication as a whole. We all like the idea of a talking dog, and the individual dog is motivated to talk; the species is not - we are not going to stop taking care of non-talking dogs, and therefore not all dogs, or even most, will be pushed to talk. Further, we seem to care more about imitation (the T1 toy level of speech) than we do about dogs categorizing. As such, if motivation is the issue, I don't see it coming soon.

    ReplyDelete
