Pullum on Chomsky at UCL

20 November 2011

On 10 October, Noam Chomsky gave a controversial talk at University College London (UCL). Nothing unusual about that. (The video can be viewed by clicking here.)

Today, Geoffrey Pullum sent out through Linguist List a response to certain parts of Chomsky’s talk. It is a short but extremely interesting document. I paste it below. At the end I will add a brief comment in Chomsky’s favor. My comment, I should make clear, is based not so much on the content of the controversy as on the stance Chomsky takes.

About a month ago (10 October 2011) Noam Chomsky spoke at an invitation-only seminar at University College London (UCL).  I attended along with about 90 other British linguists.  The announced title was: “On the poverty of the stimulus”.  The video of both the talk and the question period is available: (http://www.ucl.ac.uk/psychlangsci/research/linguistics/news-events/latest-news/n_chomsky; henceforth, UCL video). In what follows I summarize some of the content of Chomsky’s London talk and its question session, and explain some of my reactions.
Chomsky’s remarks in London were not very different in tone from things he has said elsewhere: the UCL presentation was extremely similar to a lecture given at Carleton University in Canada last April (http://www.youtube.com/watch?v=XbjVMq0k3uc), and echoed themes from Chomsky’s talk at the symposium on the biology of language at the 2011 Cognitive Science Society conference in Boston last July, and journal articles such as “Language and other cognitive systems” (Chomsky 2011), and particularly the paper “Poverty of the stimulus revisited” (Berwick et al. 2011, henceforth BPYC-2011). These recent talks and papers share a steadfast refusal to engage with anything that might make the debate about the poverty of the stimulus (POS) an empirical one.  They issue blanket dismissals of nearly all modern cognitive/linguistic science as worthless, and sweep aside whole genres of work on the basis of what seems to be extremely shallow acquaintance.  Claims about parallels in the natural sciences feature prominently, as does a preference for authority over evidence.  I will discuss a selection of topics, without attempting to be very systematic.
Two aspects of the way Chomsky chose to deal with the topic of stimulus poverty struck me as startling.  The first was that he stuck entirely with the version of the argument from POS that the late Barbara Scholz used to call the rocks-and-kittens version.
A child’s pet kitten (so the argument goes), exposed to the same primary linguistic data as the child, learns no language at all, and is indistinguishable from a rock in this regard.  Since the linguistic inputs are the same, an innate interspecies difference in language readiness and capacity for language acquisition must be involved; therefore linguistic nativism is true.  (This is not parody, as I scarcely need to document: Chomsky has happily repeated his views on kittens and the like many times.  A Google search on a pattern as specific as {Chomsky granddaughter rock kitten innate} will yield tens of thousands of hits, nearly all relevant ones.  See Smith 1999: 169-170, or Stemmer 1999, or Chomsky 2000: 50 for quotable quotes in print.)
At UCL Chomsky didn’t really give even this much of an argument: he just noted that humans had a genetic endowment that permitted them to learn language, and stipulated that he would call it Universal Grammar (UG). (Compare, e.g., “The faculty of language then is a special property that enables my granddaughter but not her pet kitten or chimpanzee to attain a specific I-language on exposure to appropriate data…”)
He even admitted that “intellectually … there’s just nothing to it — [it’s] a truism” (UCL video, 3:42); but he went on to argue that there is “a kind of pathology in the cognitive sciences” (UCL video, 4:24) in that its practitioners obdurately refuse to accept the simple point involved.
The real trouble, of course, is that everyone accepts it — nobody doubts that there is something special about humans as opposed to kittens and rocks — but they do not recognize it as a scientific result concerning human beings or their capacities.
What I had imagined would be under discussion in this seminar is the specific view about the character of human first language acquisition that is known as linguistic nativism.  This is a substantive thesis asserting that language acquisition is largely guided by an intricate, complex, human-specific, internal mechanism that is (crucially) independent of general cognitive developmental capacities.  This assertion seems to me worthy of serious and lengthy discussion.  The rocks-and-kittens claim is surely not.  We all agree that kittens and rocks can’t acquire language, and that it’s not because they don’t get sufficient exposure.  But that hardly amounts to support for linguistic nativism over general nativism (Scholz & Pullum 2002: 189).
It’s not that Chomsky doesn’t recognize the distinction between linguistic nativism and general nativism.  He says (Chomsky 2000: 50, reproduced at http://www.chomsky.info/books/architecture01.htm):
“Now a question that could be asked is whether whatever is
innate about language is specific to the language faculty or
whether it is just some combination of the other aspects of
the mind. That is an empirical question and there is no reason
to be dogmatic about it; you look and you see. What we seem
to find is that it is specific.”
But to say that you simply look and see, when the question is as subtle and difficult as this one and concerns mechanisms inaccessible to the tools we currently have, is surely not a responsible characterization of what science involves.
The second striking choice Chomsky made was to address the poverty of the stimulus without ever mentioning the stimulus at all.  This was POS without the S.  One would expect that when someone claims that the child’s input is too poverty-stricken to support language acquisition through ordinary learning from experience, they would treat empirical observations about the nature of that input as potentially relevant.  It would give a POS argument some empirical bite if one could specify ways in which the child’s input was demonstrably too thin to support learning of particular features of language from experience of language use.  That would seem worthy of attention.  The rocks-and-kittens version does not.  I was very surprised that Chomsky stuck to it so firmly (though that does explain his lack of interest in the child’s input: the rocks-and-kittens argument doesn’t need anything to be true or false of the input).
The POS issue is going to take a long time to resolve if we can’t even focus on roughly similar versions of the purported argument. Yet Chomsky regards it as crucial that it be resolved.  He began his talk, in fact, with some alarmist remarks about the prospects for linguistics (“the future of the field depends on resolving it”: UCL video, 4:38).  If we do not settle this question of stimulus poverty, he claimed, we are doomed to seeing our subject shut down.  So he portrays current skepticism among cognitive scientists about linguistic nativism as not just obtuse, but actively harmful, a threat to our whole discipline.
This is an interesting (if rather risky) new way of stoking enthusiasm for linguistic nativism: appeal to linguists’ self-interest and desire for security (you don’t want to be shut down, do you?).  But it’s hard to take seriously.  Linguistics is not going to die just because a fair number of its practitioners now have at least some interest in machine learning, evolutionary considerations, computational models of acquisition, and properties of the child’s input, and are becoming acquainted with probability theory, corpus use, computer simulation, and psychological experimentation — as opposed to waving all such techniques contemptuously aside.
Chomsky went on to remind us all of the linguists and psychologists in the 1950s who (allegedly) stuck so rigidly to corpus data that they regarded experiments going beyond the corpus data as almost a betrayal of science.  And he stressed that the work of people today who work on Bayesian learning of patterns or regularities from raw data has no value at all (“zero results”).  He compared their modeling of phenomena to physicists making statistical models to predict the movements of medium-sized physical objects seen outside in the street (UCL video, 36:41).
I think such a blanket dismissal overlooks a crucial conceptual contribution that Bayesian thinking makes to theoretical linguists, one that has nothing to do with the statistical modeling on which Chomsky pours such scorn.  Many linguists have given the impression that they think it is impossible to learn from positive data that something is not grammatical.  Lightfoot (1998: 585) suggests, for example, that although you can perhaps learn from experience that auxiliary reduction is optional in the interior of a clause, you cannot possibly learn that it is forbidden at the end of a clause; hence linguistic nativism has to be true. This reasoning is flawed, and Bayes’ Theorem teaches us why.
The lesson is that probability of a generalization G being correct given a body of evidence E is not dependent merely on whether E contains crucial evidence confirming G over its rivals.  The probability of G is proportional to the product of the antecedent probability of G’s being true with something else: the probability that the evidence would look like E if G were true.  That means that what is absent from experience can be crucial evidence concerning what the grammar has to account for. For example, all the thousands of times you’ve heard clause-final auxiliary verbs uncontracted strengthen the probability that they’re not allowed to contract.
The argument from absence of stimulus is pretty much demolished by this Bayesian insight: the argument form simply is not valid.  And for people who use the phrase “the logical problem of language acquisition” (as linguistic nativists have been doing since 1981), that ought to mean something.  It certainly seems to me sufficient to justify including at least a brief introduction to Bayesian statistical reasoning in the education of every theoretical linguist.
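A minimal numerical sketch of the point about absence as evidence, under assumptions that are not in the text above. It compares two hypothetical grammars: `G_optional` (auxiliary contraction is allowed everywhere, and a clause-final auxiliary is contracted at an assumed rate of 0.3) and `G_restricted` (clause-final contraction is forbidden). The function name, the prior, and the contraction rate are all illustrative choices. The evidence E is simply n clause-final auxiliaries heard, none of them contracted.

```python
def posterior_restricted(n_uncontracted, prior_restricted=0.5, contraction_rate=0.3):
    """Posterior probability of G_restricted after hearing n clause-final
    auxiliaries, all uncontracted. All parameter values are illustrative."""
    prior_optional = 1.0 - prior_restricted
    # Under G_restricted an uncontracted clause-final auxiliary is certain.
    likelihood_restricted = 1.0
    # Under G_optional each one would be uncontracted with prob. 1 - rate.
    likelihood_optional = (1.0 - contraction_rate) ** n_uncontracted
    evidence = (prior_restricted * likelihood_restricted
                + prior_optional * likelihood_optional)
    # Bayes' Theorem: prior times likelihood, normalized by the evidence.
    return prior_restricted * likelihood_restricted / evidence


if __name__ == "__main__":
    for n in (0, 1, 10, 100, 1000):
        print(n, posterior_restricted(n))
```

No negative datum ever appears in E, yet the posterior for the restrictive grammar rises steadily with n, because the permissive grammar keeps failing to produce the contracted forms it predicts should sometimes occur.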
Suppose, though, that it ultimately turns out that the current fashion for constructing Bayesian computational models of learning is something of a dead end.  It still doesn’t follow that it is deleterious.  Much can be learned by watching models ultimately fail.  There is no threat to the discipline here: linguistics is not so fragile that it will collapse just because one possibly false trail was followed.
The people interested in Bayesian modeling and similar computational lines of research are smart enough to eventually perceive its inadequacy (if indeed it is inadequate), and will move to something that looks more interesting.  People get bored in dead-end ventures.  I talked to Roger Brown in 1968 and he told me that the reason he had abandoned Skinnerian behaviorism ten years before had nothing to do with any revolutionary new ideas in scientific thinking about cognition or the impact of Chomsky’s famous review of Skinner: he was just bored with the work that behaviorism demanded, and wanted to try something more interesting. Intellectually agile people want to move on.
About half-way through his talk, Chomsky made some claims about the probability of success with proposals to the NSF to fund research projects on Universal Grammar (UG).  He said: “If you want a grant from the National Science Foundation, you better not include that [the phrase “UG”] in your proposal; it will be knocked out before it even reaches the review board” (UCL video, 30:35).
He warmed to this theme: “If you want to get a grant approved, you have to have the phrase ‘sophisticated Bayesian’ in it, and you also have to ask for an fMRI, especially if you have nothing whatever to do with it” (he chuckled here and there was general laughter) “… if you meet those two conditions, you might make it through the granting procedures” (UCL video, 31:02).
Then he returned to the claim that “UG” will doom your proposal: “But if you use a dirty word like UG, and you say there’s something special about humans and we’ve got to find out what it is, that pretty much rules it out” (UCL video, 31:18). And then, with no chuckling, he added: “I’m not joking; I have concrete cases in mind … of good work that just can’t get funded, because it doesn’t meet these conditions…  Right at MIT in fact” (UCL video, 31:28).
Since award details are public information, it is trivial to find out whether the NSF is making awards for purely theoretical study of UG in a Chomskyan perspective.  And it is. Željko Bošković’s grant “On the Traditional Noun Phrase: Comparing Languages With and Without Articles” (BCS-0920888) is an example.  And MIT is not left out.  For example, David Pesetsky obtained Doctoral Dissertation Research grant no. BCS-1122426 for a project “Argument licensing and agreement”; the abstract begins: “Which properties of human language are universal, and which may vary across languages?  Answering these questions will help us understand the unique human capacity for language, through which we hope to gain insight into the overall architecture of the human mind.”  And Chomsky must know that his co-author Robert Berwick received grant BCS-0951620 for a “Workshop on Rich Grammars from Poor Inputs” at MIT in 2009.
Naturally, many NSF proposals mentioning UG will go unfunded — the majority, given that across the board less than 25% of grant proposals get funded.  But (of course) proposals are sent out for peer review whether they mention UG or not, and whether they mention Bayes or not.
It seems a strange strategy to make claims of this sort to an audience of linguistics professionals in a foreign country who would have little knowledge of the NSF, and send out the message to young investigators internationally that following Chomsky’s theoretical line will blight their careers by dooming their chances of NSF funding. Even if this were true, it would give the impression of a fractious field that has bad relations with its most important Federal funding agency.  But it is much stranger to make such statements when they are easily discovered to be false.
In the question period there was an extremely unfortunate interaction when the computational learning experimenter Alexander Clark tried to ask a question.  Chomsky interrupted and began his answer before Clark had managed to make his point.  The question Clark wanted to put was roughly the following (I knew enough to see where he was going, and he has confirmed to me that this was what he meant).
A paper Clark had published with Eyraud (2007) on learning some kinds of context-free grammars (CFGs) from positive data is dismissed in BPYC-2011 as useless.  Chomsky repeated that dismissal in his talk. But Clark’s more recent work has focused on languages in the much larger context-sensitive family that are generated by minimalist grammars as formalized by Edward Stabler.  These are strongly equivalent to the Multiple Context-Free Grammars (MCFGs) that were invented by Seki et al. (1991), as Clark tried to begin to explain.  He was not attempting to say anything about CFGs, but to raise the issue of learning the languages of minimalist grammars, or equivalently MCFGs.  This is a wildly different class, vastly larger than the class of CFGs.  It corresponds to the infinite union, for all natural numbers N, of a hierarchy of classes of languages (each definable in several ways) in which the first few steps are these:
N = 0      finite languages
N = 1      regular (finite-state) languages
N = 2      context-free languages
N = 3      tree adjoining languages
N = 4      …
There has been much relevant mathematical work on these matters between 1984 and the present by people like Gerald Gazdar, Henk Harkema, Aravind Joshi, Greg Kobele, Jens Michaelis, Carl Pollard, Kelly Roach, James Rogers, Edward Stabler, K. Vijay-Shanker, and David Weir (it is easily findable; I will not try to give even a brief bibliography here.)  If Stabler has accurately captured the intent of the hints in the “minimalist program” about Merge and feature-checking, then minimalism embraces an enormous proper superset of the context-free languages.  (I say “if” because Chomsky declines to refer to any of Stabler’s work, so we don’t know whether the formalization is acceptable as a precise reconstruction of the minimalist program as he conceives of it.)
Clark was trying to get Chomsky’s reaction to recent results (see e.g. Clark 2010) exhibiting efficient algorithms for learning various subclasses of the MCFGs, including some fairly large classes going well beyond CFGs.
Chomsky interrupted the question and began to talk about CFGs.  But he misspoke, and talked about having proved in 1959 that CFGs are equivalent to linear bounded automata (they aren’t; LBAs are equivalent to context-sensitive grammars).  Even if CFGs had been equivalent to LBAs, and even if Chomsky had been responsible for results on LBAs in 1959 (he wasn’t, it was Kuroda five years later), CFGs had nothing to do with the observation Clark was trying to make about MCFGs.  And Chomsky had in any case never proved any theorems about learnability, which was what Clark was trying to ask about. Clark’s question not only was never answered, it was not even heard, hence of course not understood.
After Clark’s question, there were only a few more. I was lucky enough to be allocated time to ask two brief questions before the session ended.  Chomsky had condemned language evolution work wholesale (“a burgeoning literature, most of which in my view is total nonsense”: UCL video, 27:08), and I asked him to speak more directly about Simon Kirby’s research on iterated learning of initially randomly structured finite languages, which he has shown leads to the rapid evolution of morphological regularity.
Chomsky’s answer was that it is not at all interesting if successive generations of learners regularize the language they are trying to learn: the regularity emerges only because human intelligence and linguistic competence is utilized in the task, and if you gave the same task to computers the same evolution would not happen.
Kirby’s group has in fact addressed both those points, and both claims appear to be false.  It seems to be the cognitive bottleneck of memory limitation that forces the emergence of regularity (decrease in Kolmogorov complexity) in the language over learning generations, not human linguistic capacity or intelligence (note the remark of Kirby, Cornish, & Smith 2008: 10685, that “if participants were merely stamping their own linguistic knowledge onto the data that they were seeing, there would be no reason we would find rampant structured underspecification in the first experiment and a system of morphological concatenation in the second”).  And the effect of weak learning bias being amplified by cultural transmission through iterated learning does indeed turn up when the learner is simulated on a computer (see e.g. Kirby, Dowman, and Griffiths 2007).
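To make the idea of bias amplification concrete, here is a toy calculation. It is not the model of Kirby, Dowman, and Griffiths (2007), only a sketch under assumptions introduced for illustration: two candidate languages ("regular" and "irregular"), a learner with a weak prior of 0.6 for the regular one, a bottleneck of n = 2 noisy utterances per generation, and a learner that adopts whichever language has the higher posterior. The function names and all parameter values are hypothetical.

```python
from math import comb


def p_adopt_regular(teacher_is_regular, n=2, noise=0.3, prior_regular=0.6):
    """Probability that the learner adopts the regular language after hearing
    n utterances from the teacher (the learner picks the higher posterior)."""
    # Chance that one utterance looks regular, given the teacher's language.
    p_reg = 1.0 - noise if teacher_is_regular else noise
    total = 0.0
    for k in range(n + 1):  # k = number of regular-looking utterances heard
        p_data = comb(n, k) * p_reg**k * (1.0 - p_reg) ** (n - k)
        score_regular = prior_regular * (1.0 - noise) ** k * noise ** (n - k)
        score_irregular = (1.0 - prior_regular) * noise**k * (1.0 - noise) ** (n - k)
        if score_regular > score_irregular:
            total += p_data
    return total


def stationary_regular(n=2, noise=0.3, prior_regular=0.6):
    """Long-run share of generations speaking the regular language in the
    two-state chain teacher -> learner -> next learner."""
    stay = p_adopt_regular(True, n, noise, prior_regular)     # regular -> regular
    switch = p_adopt_regular(False, n, noise, prior_regular)  # irregular -> regular
    # Assumes the chain can move in at least one direction (denominator > 0).
    return switch / (switch + (1.0 - stay))


if __name__ == "__main__":
    print(stationary_regular())
```

With these assumed values the chain spends about 85% of generations on the regular language even though each learner's prior preference for it is only 0.6, and no human participant is involved anywhere in the loop.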
There is an opportunity for substantive discussion here.  And since both Chomsky and Kirby are invited speakers at the upcoming EvoLang conference in Kyoto (http://kyoto.evolang.org/), there will be a forum where it could happen.  I hope it will.  But maybe I’m too optimistic: I see the current integration of computationally-assisted cognitive science with careful syntactic description and theorizing as precisely what should inspire confidence that the language sciences in the 21st century have a bright future rather than spelling doom to linguistics.
The other topic I was able to ask about was the scientific plausibility of a view that has a remarkable genetic quirk arising between 50,000 and 200,000 years ago, giving a single developing hominid species an unprecedented innate UG that permits articulate linguistic capacities, and then remaining absolutely fixed in all of its details until the present.
A very few linguists (they include James McCawley, Geoffrey Sampson, and Philip Lieberman) have pointed out this prediction of genetically determined variation in UG between widely separated human groups.  Lieberman notes that dramatic evolutionary developments like disappearance of lactose intolerance or radical alteration in the ability to survive in high-altitude low-oxygen environments can take place in under 3000 years; yet (as Chomsky stresses) the evidence that any human being can learn any human language is strong, suggesting that UG shows no genetic variation at all.
Why would UG remain so astonishingly resistant to minor mutations for so many tens of thousands of years?  There is no selection pressure that would make it disadvantageous for Australian aborigines to have different innate constraints on movement or thematic role assignment from European or African populations; yet not a hint of any such genetic diversity in innate linguistic capacities has ever been identified, at least in grammar.  Why not?
Chomsky’s response is basically that it just happened.  He robustly insists that this kind of thing happens all the time in genetics: all sorts of developments in evolution occur once and then remain absolutely fixed, like the architecture of our visual perception mechanism.  Human beings, he told me solemnly, are not going to develop an insect visual system over the coming 50,000 years.
This was his final point before his schedule required him to leave, and I had to agree with him (so let’s not have any loose talk about kneejerk disagreement, OK?) — we’re not going to develop insect eyes.  But I couldn’t help thinking that this hardly answered the question. There are parts of our genome that remain identical for hundreds of millions of years, like HOX genes; but generally they cause catastrophic effects on the organism if incorrectly expressed.  Even with the visual system, arbitrary changes could put an organism in real trouble.  For widely separated populations of humans to have different constraints on remnant movement wouldn’t do any damage at all, and it would offer dramatic support for the view that there is a genetically inherited syntax module (though the “U” of UG would now not be so appropriate).
So it was just as with the rocks-and-kittens POS argument: I agree with the starting observations, as everyone must; but the broader conclusions that Chomsky defends, and more generally his extremely negative attitude to computer simulation work, human-subject experimentation, evolutionary investigations, and data-intensive research don’t seem to follow.
I am not pessimistic enough to believe that contemporary experimental research in the cognitive and linguistic sciences, Bayesian and connectionist work included, will prove to be some kind of toxic threat to our discipline.  I think it represents an encouragingly lively and stimulating contribution. I think we have a responsibility as academics to acknowledge such work and do our best to appreciate its methods and results. It won’t do anything to clarify our understanding of language if we simply condemn it all out of hand.
Geoff Pullum
University of Edinburgh
Berwick, Robert; Paul Pietroski; Beracah Yankama; and Noam Chomsky (2011). [BPYC-2011]  Poverty of the stimulus revisited.  Cognitive Science 35: 1207–1242.
Chomsky, Noam (2000). The Architecture of Language.  New Delhi: Oxford University Press.
Chomsky, Noam (2011). Language and other cognitive systems: What is special about language?  Language Learning and Development 7 (4): 263-278.  http://dx.doi.org/10.1080/15475441.2011.584041
Clark, Alexander (2010).  Efficient, correct, unsupervised learning of context-sensitive languages.  Proceedings of the Fourteenth Conference on Computational Natural Language Learning, 28-37. Uppsala, Sweden: Association for Computational Linguistics. http://www.cs.rhul.ac.uk/home/alexc/papers/conll2010.pdf
Clark, Alexander, and Remi Eyraud (2007). Polynomial time identification in the limit of substitutable context-free languages. Journal of Machine Learning Research, 8, 1725–1745.
Kirby, Simon; Michael Dowman; and Thomas Griffiths (2007). Innateness and culture in the evolution of language.  Proceedings of the National Academy of Sciences, 104 (12): 5241-5245.
Kirby, Simon; Hannah Cornish; and Kenny Smith (2008). Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language. Proceedings of the National Academy of Sciences, 105 (31): 10681-10686.
Lightfoot, David (1998).  Promises, promises: general learning algorithms. Mind and Language 13: 582-587.
Seki, Hiroyuki; Takashi Matsumura; Mamoru Fujii; and Tadao Kasami (1991).  On multiple context-free grammars.  Theoretical Computer Science 88: 191-229.
Smith, Neilson Voyne (1999).  Chomsky: Ideas and Ideals.  Cambridge: Cambridge University Press.
Stabler, Edward (1997).  Derivational minimalism.  In Christian Retoré (ed.), Logical Aspects of Computational Linguistics (Lecture Notes in Artificial Intelligence, 1328), 68-95.  Berlin: Springer Verlag.
Stemmer, Brigitte (1999).  An on-line interview with Noam Chomsky: On the nature of pragmatics and related issues.  Brain and Language 68 (3): 393-401.

Although I do not agree with the quasi-separatist stance that Chomsky seems to promote toward other methods and perspectives on the study of language, I think it is necessary to recognize the value of his position.

As is well known, generative grammar classically assumes that language is something like a “mental organ”, an aspect of cognition that is largely defined by evolutionary principles (whatever they may be). Thanks to this conception, thanks to this framework, the study of grammar became something scientific. (Of course, I am not saying that it was not scientific before, but rather that the Chomskyan conception of language allowed the grammarian to move to the forefront of the discussion about the nature of the human mind.) When one does grammar, one is no longer only describing a grammatical phenomenon of some language X; one is developing theories about how that organ, language, works. Every grammatical analysis involves a definite conception of certain aspects of the Faculty of Language.

Chomsky’s stance is directed against those who hold, to a greater or lesser degree, that it is an exaggeration to regard language as an organ. If language is not an organ, if language is only a set of routines acquired through a general stochastic process, then the grammarian’s role loses much of its point. If the grammarian’s object of study ceases to be a mental organ whose existence is assumed to be real, then his or her analyses are merely momentarily valid observations about a set of probabilistic transitions.

I also share Chomsky’s pessimism about several kinds of probabilistic studies: statistics can be useful for predicting data, but it is useless for explaining the nature of the Faculty of Language to us. As long as we assume that language is something more than an acquired behavior and that it has characteristics of its own, the stochastic prediction of linguistic behavior will be of little value (at least scientifically).

5 comments

  1. With all due respect, I do not like this: “If language is not an organ, if language is only a set of routines acquired through a general stochastic process, then the grammarian’s role loses much of its point”. If the grammarian’s role loses its point, or must be limited to description, tough luck. Many things have lost their point, and that is the cost of scientific research. The consequences that (generative) linguistics may have to suffer cannot be grounds for giving the Chomskyan position special protection, because otherwise the discussion turns, beneath its scientific pretensions, into a guild dispute.
    On the other hand, the lack of evidence for acquisition/learning (a lack that has diminished considerably since the late 1950s) can never be enough for Chomsky to argue from it in favor of nativism. That strategy commits what informal logic calls the ad ignorantiam fallacy: claiming to be right on the basis of the other side’s lack of empirical evidence (a lack of evidence that is, of course, also highly debatable).
    Anyway, that is all: I see a great distance between language and an axiomatic system.

  2. Evidently Pullumgate had quite an impact, because (i) the blog had an unusual number of visitors for a Sunday (almost all of them arriving at this post via Saint Google), and (ii) other blogs have covered the story (Replicated Typo focused on the aspects related to Chomsky’s comments on language evolution studies, and Mr. Verb also alluded to the topic).

    Regarding the comment above, I will say three things:

    (i) Long live the guild dispute. No, seriously: I love that Chomsky gets paternalistic and acts as an ultra-defender of the framework. It may be open to criticism from outside, but to me it seems praiseworthy. The guy knows that nowadays the value of his figure is mostly symbolic, and he makes use of that: he complains that his disciples are not given enough funding (and, as Pullum points out, it is not that they are not given it, it is just that he wants more). Is that wrong? I am not going to ask Chomsky, after everything he has done for linguistics, to come now and argue with anyone the way he did against Skinner or against Piaget. Whatever debates need to be raised now will have to be carried forward by other people. And, Chomsky aside, I am not aware of other linguists interested in picking fights with people who work in statistics or with connectionism.

    (ii) I do not know much about acquisition, but I was not aware that there was conclusive evidence against nativism. Well, actually, it depends on what we mean by “nativism”: something about language HAS to be innate, because otherwise there is no explaining why other species do not develop language (the example of the cats and the rocks). But we all agree that this kind of nativism is, at the very least, not very interesting. Other uses of the term “nativism” could be:

    *NATIVISM: involves language-specific aspects that are genetically determined.

    *NATIVISM: involves the above plus those aspects of language that arise universally as a correlate of the normal development of cognition.

    And I can think of a few more… So besides finding evidence against nativism, one has to see which nativism the evidence is against.

    (iii) Something that is definitely not innate is adopting nativism as a theory. I remember that when I took General Linguistics it seemed an aberration to me to think that language was innate (“it is OBVIOUS that it is learned!”). I assumed there had to be some way of explaining acquisition in Piagetian terms (I was a big Piaget fan). However, I then went on learning grammar… and I ran into things like floating quantifiers, binding relations, island phenomena, parasitic gaps… and I understood the reach of the poverty of the stimulus argument: language involves too many things for it to be inferable. I know that what I am saying sounds horrible (“nativism is for initiates only”), but it is a bit like that: one ends up adopting this kind of theory because one realizes in practice that the logical alternatives are thoroughly unviable.

  3. UPDATE: today (21 November) there was a reply to Pullum from Mark Brenchley and David J. Lobina. I paste it below.

    It seems to us that there is one aspect of Noam Chomsky’s talk that really stands out (and this includes the papers Geoffrey Pullum mentions; namely, Chomsky 2011 and Berwick et al. [BPYC] 2011): scholars need to stop and reflect upon what they are doing.

    (1) Nowhere is this more true than in the case of whether machine learning is relevant (and if so, how relevant) for linguistic theory in general and language acquisition in particular. Whilst it is true that the field of mathematical linguistics has yielded many interesting results (some of which were initiated by Chomsky himself), Chomsky has been rather adamant regarding their limited relevance to the study of language qua biological system. This is undoubtedly true, in our opinion.

    When someone states that language is mildly context-sensitive, surely they do not mean it in a literal sense (how could they?). Rather, what scholars really mean by statements like these is that the expressive power of language, when the latter is described in terms of strings of symbols that stand for terminals and non-terminals, is mildly context-sensitive (that is, generable by the right collection of rewriting rules); a slightly different matter. Thus, even if an ‘efficient, correct’ algorithm (to reference Clark, 2011; cited by Pullum in his discussion piece) is shown to successfully acquire multiple context-free grammars, this is not ipso facto a demonstration that is directly relatable to the acquisition of natural language.

    As many authors have pointed out before, the expressive power of a (formal) language and its place within the so-called Chomsky Hierarchy constitute a fact about what has come to be known as ‘weak generativity’ (i.e. string-generation), but what the linguist ought to be studying is the generation and conceptualization of structure (i.e., strong generativity). Consequently, whilst it may be true that Chomsky misunderstood/misheard Clark’s question, Clark misses the point that we ought to be interested in strong generativity, and not in the weak equivalence between strings of symbols and the structures they supposedly stand for.
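
    The weak/strong distinction can be made concrete with a minimal Python sketch. The two toy grammars and all function names below are hypothetical, chosen only for illustration and not taken from any of the cited papers. Both grammars generate exactly the strings a, aa, aaa, …, so they are weakly equivalent, yet they assign different trees to the same string, so they are not strongly equivalent.

```python
# Toy illustration (hypothetical grammars, not from the cited papers).
# Grammar R: S -> a S | a   (right-branching)
# Grammar L: S -> S a | a   (left-branching)
# Trees are nested tuples; a leaf is the string "a".

def right_tree(n):
    """Derivation tree of a^n under grammar R."""
    if n == 1:
        return ("S", "a")
    return ("S", "a", right_tree(n - 1))

def left_tree(n):
    """Derivation tree of a^n under grammar L."""
    if n == 1:
        return ("S", "a")
    return ("S", left_tree(n - 1), "a")

def terminal_yield(tree):
    """Concatenate the leaves of a tree, left to right."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    return "".join(terminal_yield(child) for child in children)

for n in range(1, 5):
    r, l = right_tree(n), left_tree(n)
    # Same string: the grammars are weakly equivalent on this input.
    assert terminal_yield(r) == terminal_yield(l) == "a" * n

# Different structure for n >= 2: they are not strongly equivalent.
print(right_tree(3))  # ('S', 'a', ('S', 'a', ('S', 'a')))
print(left_tree(3))   # ('S', ('S', ('S', 'a'), 'a'), 'a')
```

    A result stated purely over the string set cannot tell these two grammars apart, which is the sense in which weak generativity underdetermines structure.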

    We are certain that both Pullum and Clark are aware of this, but some of their publications appear to show the suspension (temporary, we hope) of belief in these facts. In Rogers & Pullum (2011), we find a very careful analysis of the different grammars and languages of the Chomsky Hierarchy, but there is much at fault when these authors seek to identify the ‘psychological correlates’ that would show, in an experimental setting, what system subjects are employing/have internalized. The supposed connection between these cognitive abilities (e.g. the ability to recognize that every A is immediately followed by a B versus the ability to detect that at least one B was present somewhere) and the expressive power of an underlying grammar tells us very little indeed about mental properties and principles. Plausibly, the psychological correlates they list are the result of hierarchical mechanisms that operate over hierarchical (mental) representations, and the cognitive science literature contains myriad examples of theories that explicitly make use of these two components. Miller et al.’s (1960) TOTE units, or those studies that focus on Control operations (such as Simon 1962, Newell 1980 or Pylyshyn 1984) are some of the clearest examples we can think of. Crucially, these complex systems bear no relation whatsoever to formal grammars or languages. Much like in natural language, the key notion here is structure (incidentally, Miller & Chomsky 1963 already pointed to the analogy between TOTE units and the syntactic trees linguists postulated for sentences, something they did not consider coincidental).
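
    For reference, the two example abilities mentioned in this paragraph can be written as simple checks over strings. This is a minimal Python sketch with hypothetical function names. The remarks in the docstrings about how much of the string each check needs to inspect are an informal gloss and an assumption here, not a summary of Rogers & Pullum's experiments.

```python
def every_a_followed_by_b(s):
    """True if each 'A' in s is immediately followed by a 'B'.

    Decidable by looking only at adjacent pairs of symbols (a local check).
    """
    for i, symbol in enumerate(s):
        if symbol == "A":
            if i + 1 >= len(s) or s[i + 1] != "B":
                return False
    return True

def contains_some_b(s):
    """True if at least one 'B' occurs anywhere in s.

    No single fixed-size window settles this; the whole string matters.
    """
    return "B" in s

for s in ["ABAB", "ABBA", "CCC", "CABC"]:
    print(s, every_a_followed_by_b(s), contains_some_b(s))
```

    Both checks accept or reject flat strings; neither says anything about the structure a subject might assign to those strings, which is the point at issue here.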

    In a way, computational linguists are hostage to the fact that strong generativity has so far resisted formalization and that, therefore, their results do not appear to be directly relatable to the careful descriptions and explanations linguists propose; a fortiori, their formulae do not tell us much about the psychological facts of human cognition. In our opinion, then, Chomsky’s analysis does not show an ‘extremely shallow acquaintance’ with computational models, but a principled opposition to them because of what these models assume and attempt to show.

    (2) We also take issue with Pullum’s comment that the aforementioned papers ‘share a steadfast refusal to engage with anything that might make the debate about the poverty of the stimulus (POS) an empirical one.’ We think this is both false and not a little unfair.

    It is true, of course, that Chomsky seems to have little interest in what we might call empirical “number crunching” with respect to POS (e.g. quantifying the actual syntactic patterns in the child’s environmental input and relating these quantifications to the actual frequencies of equivalent patterns within the child’s developing output). However, the fact that he himself has not undertaken such research is entirely orthogonal to the claim that he has not provided empirical grounds for debating the POS. On the contrary, the last fifty-plus years have seen Chomsky build up a substantial body of actual natural language analysis. And it is this analysis which we would argue constitutes a clear empirical contribution to POS arguments.

    In particular, it seems to us that what Chomsky’s work does (or, at least, looks to do) is provide an explication which is grounded in the study of natural language syntax, thereby attempting to establish the nature of human syntactic knowledge. As such, it necessarily establishes a framework within which all learning models must operate, defining the particular target structures that these models are to converge on. So, for example, whatever learning model/algorithm is eventually worked out – a task we believe to be both important and non-trivial – it must account for the fact that languages are hierarchical in structure; for it indeed seems to be an empirical fact that human languages have such structure (unlike, say, the linear strings of formal language theory; see BPYC for evidence to this effect). If a proposed general learning model does not produce such structures, it necessarily fails to provide a viable account of language acquisition, and does so precisely because it fails to match the empirically established account of natural language structure.

    And, indeed, if you listen to the talk, this seems to be precisely the grounds on which Chomsky criticizes the computational cognitive science research literature raised in the Q+A session. So, when he criticizes the Perfors article in the talk, he does so because the researchers’ specific approach simply fails to capture the syntactic knowledge that (some) linguistic theory has not only argued for, but argued for through detailed empirical analyses of natural language. Hence, perforce, their work fails outright to constitute an adequate rebuttal to POS (UCL video, 65:00; see also the relevant section in BPYC).

    A similar point applies to his comments regarding Clark’s question (or, rather, what he takes to be Clark’s question; not at all, as Pullum points out, the same thing). That is, Chomsky seems to argue against it (past it?) because the approach does not provide a realistic model of human syntactic knowledge. And the approach is not realistic because it doesn’t stand up to (what he believes to be) the independent, viable and empirically established account of what this knowledge consists of (UCL video, 69:00; see Chomsky 2011 and BPYC for a brief recapitulation of certain pertinent features of this account). Hence, it couldn’t possibly constitute a genuine POS counterargument.

    The basic schema of the argument would, therefore, seem to be something like this: (1) As linguists, we are interested in the nature of human linguistic knowledge. (2) Our analyses of actual natural language syntax lead us to believe certain facts to be true of this knowledge (e.g. structure dependent movement), which we account for in a certain way (e.g. Merge). (3) The computational cognitive science literature has so far failed to provide domain-general learning models that adequately capture these facts about human language. (4) Therefore, they do not constitute POS counterarguments.

    Now, whilst this may of course turn out to be a bad argument, perhaps even a terrible one, it is prima facie one that looks to ground itself in empirically-derived content; content that Chomsky has surely been instrumental in contributing to.

    Mark Brenchley
    David J. Lobina

    Berwick, R. C., Pietroski, P., Yankama, B., & Chomsky, N. (2011) Poverty of the stimulus revisited. Cognitive Science, 35, 1207-1242.

    Chomsky, N. (2011) Language and other cognitive systems. What is special about language? Language Learning and Development, 7, 263-278.

    Miller, G. A. & Chomsky, N. (1963) Finitary models of language users. Handbook of Mathematical Psychology, vol. 2, John Wiley and Sons, Inc., 419-492.

    Miller, G. A., Galanter, E. & Pribram, K. H. (1960) Plans and the Structure of Behaviour. Holt, Rinehart and Winston, Inc.

    Newell, A. (1980) Physical symbol systems. Cognitive Science, 4, 135-183.

    Pylyshyn, Z. (1984) Computation and Cognition. The MIT Press.

    Rogers, J. & Pullum, G. K. (2011) Aural Pattern Recognition Experiments and the Subregular Hierarchy. Journal of Logic, Language and Information, 20, 329-42.

    Simon, H. (1962) The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467-82.

  4. Curious, on the other hand, is Chomsky's rejection of brain imaging studies, which are today the closest thing we have to studying the “organ” of Universal Grammar precisely as an organ. Of course, until someone carries out detailed tracking of the brain activity of babies listening and learning to speak, on an industrial scale and in real time… that does not seem close; or maybe it is, who knows, poor kids. The sense in which one can speak of a language organ has to do with the immature development of the human brain (probably due to neoteny), which means that a brain still “unfinished” is thrown into the social and linguistic world. There the brain adapts to the form of language, which in itself is a pure syntactic form, as Chomsky says; but one should not forget that there is a “Merge” more fundamental than the linguistic one, a source for generating forms and meanings that is surely what makes the linguistic capacity possible “to begin with”, and that fortunately remains active in adults; I am referring to conceptual blending, which is an element that upsets all formalist grammars.

  5. I really was looking for this; it is great to come across pages like this one. I am about to start a piece of work that has a lot to do with it.

