Sunday, May 10, 2009

Newspeak = Fail

Here's an excerpt from George Orwell's 1984 about Newspeak, a modified version of English.

Big Brother is watching you.
In your heart you'd prefer to stick to Oldspeak, with all its vagueness and its useless shades of meaning. You don't grasp the beauty of the destruction of words. Do you know that Newspeak is the only language in the world whose vocabulary gets smaller every year? ... Don't you see that the whole aim of Newspeak is to narrow the range of thought? In the end we will make thoughtcrime literally impossible, because there will be no words to express it.

One thing I think people don't get about language, maybe the biggest misconception English speakers have, Orwell included... The size of a language's vocabulary is not the same thing as its expressive power. In fact, the two things may be inversely related. What really gives a language expressive power is how freely, logically, and consistently its small pieces can be combined into larger ones. As you might imagine, the more pre-built words you have, the less demand there seems to be for clear, productive word formation rules. But as soon as you want to express something there's no English word for, you might get a little tripped up having to use a whole sentence or two to make sure you're understood, which interrupts the flow of your speech or writing. So most people just find a rough equivalent to what they were trying to communicate and move on, generally without even noticing. (When you study a very different language like Japanese, by the way, you notice a lot of really useful things that aren't so easy to say in English.)

Most of our ability to create new words comes from Latin, actually. You can coin new words (or neologisms) pretty easily by following the word formation rules of Latin and using Latin roots and affixes. Scientists make extensive use of Latin in this way, but most everyone else won't feel comfortable using a given word unless they know it's listed in a dictionary somewhere, regardless of how clear the meaning is. New terms built from Germanic roots (such as "Newspeak") sound silly and childish whereas new terms from Latinate origins (such as "neoloquium") sound stilted and technical (and most people aren't literate enough to figure out the meanings very easily anyway). This uneasiness with new words is particularly prominent in English, and it actually makes clear communication in "Oldspeak" problematic.

Sure, the relative inflexibility of English word formation can make it difficult to express certain things from time to time, but an even bigger problem is all the connotation that gets attached to the words we do use frequently enough that they "sound right" (and not highfalutin or otherwise weird). A good example is the idea of "paranoia". If you say somebody's "paranoid", you're not only saying they think people are out to get them but also that the paranoid person is mistaken. What's it called when the paranoid person is right? Justified paranoia? Not really, that still implies the person is mistaken, just understandably so. If your goal is to make thoughtcrime impossible, words like "paranoid" are a good start.

It's interesting to note that Orwell's Newspeak, which was designed to limit the expressive power of English, actually turned out to be really useful in labeling some ideas that weren't particularly easy to communicate in standard English: thoughtcrime, doublethink, unperson, etc. That's because he not only simplified the vocabulary of English but also simplified (and clearly delineated) its word formation rules. The way to limit expressive power is to eliminate or obscure virtually all word and sentence formation rules, which means you'd need quite a number of atomic, unanalyzable words to talk about anything.
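To make the point about word formation rules concrete, here is a minimal sketch in Python. The prefixes "un", "plus", and "doubleplus" come from the novel itself; the three roots and the names `ROOTS` and `derive_words` are just my own picks for illustration, not anything Orwell specified.

```python
from itertools import product

ROOTS = ["good", "cold", "light"]        # a deliberately tiny root vocabulary (my choice)
INTENSITY = ["", "plus", "doubleplus"]   # degree prefixes from the novel
NEGATION = ["", "un"]                    # negation prefix from the novel

def derive_words(roots):
    """Combine every root with every intensity and negation prefix."""
    return [
        f"{intensity}{negation}{root}"
        for root, intensity, negation in product(roots, INTENSITY, NEGATION)
    ]

words = derive_words(ROOTS)
print(len(ROOTS) + 3, "pieces yield", len(words), "words")
print(words[:6])
```

Three roots and three prefixes give eighteen words, and every new root adds six more. Fully regular rules multiply a small vocabulary, which is the opposite of what you'd want if the goal were to shrink what can be said.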

If you could just extend people's attitudes about coining new words a little bit to make them uneasy generating new sentences they've never heard before, you'd really be controlling their minds. But that kind of language would look nothing at all like Newspeak. For one thing, almost every sentence would consist of a single word. Lame. Right?


Monday, March 23, 2009

Getting Started in Spanish

If you're currently trying to learn Spanish, certainly a useful venture for any English speaker, I've got some links for you!

The BBC has a nice free set of online lessons, complete with video and activities, called "Talk Spanish". If you don't already have RealPlayer installed, you'll have to go to Real.com and get the free player before the videos will work. Just click the blue button at the top of the page that says "Get RealPlayer 11 - FREE". The BBC has several other useful tools for learning Spanish as well, so you might want to look around.

Spanish Pronto is a pretty small site, but it has a couple of really useful references: the Basic Study Reference and a categorized list of the 175 most common Spanish words. The latter can also be used as an informal evaluation tool to see how much you already know.

David Luton has built up a nice collection of basic Spanish lessons that amount to a small textbook. It's all free, though he does offer some print books as well.

One last website I'd like to mention is SpanishDict.com. As you might have guessed, it has an online Spanish-English dictionary. It also has several other useful features, though some of them are unavailable right now due to maintenance issues. It looks like they're expanding the site to include free online courses.

Now that I've pointed you to all this free stuff, I'd be remiss if I didn't link you to what I feel is the best way to learn a new language in a hurry (besides getting romantically involved with a native speaker): audio-based programs that make you answer back. Here are a few good ones you can buy from Amazon:

If something looks too expensive, look for other sellers under the main listing. You might be able to save a lot of cash! Just watch out for those cheap programs like Learn in Your Car. If all you're doing is hearing and repeating vocabulary, you're never going to get fluent.


Sunday, November 30, 2008

The Quotable Bickerton

Derek Bickerton

I was in the downtown El Paso library a few weeks ago, and a book caught my eye. It seemed familiar somehow and had a curse word right on the cover, so I had to see what it was about. Creoles! I'd almost forgotten about them! I'd had a fascination with that sort of thing years ago, but then I got distracted and forgot. Bastard Tongues helped me to rekindle that fascination, with a vengeance.

The book is essentially a narrative, but it leads you through Derek Bickerton's theories (and their origins and development) along the way. It's an extremely easy read; I think it's primarily for lay-people. That said, it does seem to summarize all his previous work on creoles pretty well, pointing out problems and changes in theory as it goes. Even if you're more into sociology than linguistics, this book will not disappoint. Creoles are largely the result of "the infernal machine" of slavery, so their very existence is a bit embarrassing to the powers that be. Maybe that's why we don't hear much about these languages in mainstream linguistics. They don't get no respect.

I won't try to summarize the whole book here. I'll just drop in some quotes that I find particularly amusing and/or enlightening. I've included page numbers in case you want to look any of these up and see the context.

One of the differences between linguists and people is that people like words better than grammar and linguists like grammar better than words . . . (37)
The downside of higher education is that it gives you the confidence to maintain baseless fantasies in defiance of common sense. (221)
To really get at the heart of something, you can't have too little training. (148)
Fact may be the flesh of science; idealization is its lifeblood. (44)
. . . for some reason, the near certainty of causing harm by doing nothing is outweighed, for most people, by a remote risk of causing harm by doing something. (244)
And at the same time [the infernal machine of slavery] developed an essential ingredient of our modern world, the work discipline and the system of organization that, replacing the whip with economic necessity, kept countless millions working at sterile and repetitive tasks throughout their lifetimes. (165)
At all costs, you must stop the slobs from saying "ain't"! (198)
. . . I had to get force-fed with all the latest convoluted syntactic analyses from the Chomsky bunch. And I've wondered since if I wouldn't have been better off learning to read Dutch . . . (187)

Based primarily on his experience with pidgins and creoles, Bickerton proposes "the Bioprogram", a sort of weak Universal Grammar.

All the bioprogram did was make sure that any deficiencies in the input would be filled--that whatever was needed for a full human language would be available. And it did this by providing not a smorgasbord of choices, but a simple list of preferred options. (111-112)

Bickerton not only describes his present view on what a creole is and how it works, but also past views and views of others, highlighting their problems. One thing I found particularly interesting is how his view changed on the different strands of a given creole that typically exist, ranging from very close to standard English (or whatever the superstrate language may be) to very far and more in line with the Bioprogram. It's not that the creole changed for some speakers to become more like English over time; in a typical plantation situation, there would actually be two instances of creolization: the first mixing English with various African languages to create a fairly English-like creole, and the next mixing this creole with the languages of many, many more slaves during the plantation's nearly inevitable expansion period.

A minority of relatively privileged slaves (house slaves and artisans) may have kept the original contact language alive among themselves, . . . (164)

The result, over time, is a sort of continuum of dialects of the creole, making the whole thing seem even more like "bad English". Another issue that makes it tough to collect data on creoles is that most of the speakers these days are bilingual and might code-switch between the creole and the superstrate language. So the whole thing's pretty hard to sort out, but what a payoff if you can sort it out! A pretty clear picture of the Bioprogram, our innate ideas about what a language is and how it works.

[Creoles] are the purest expression we know of the human capacity for language. (247)
References:


Thursday, October 16, 2008

Language Acquisition Without Positive Evidence

One of my professors recently lent me a copy of Language Creation and Language Change, a compilation edited by Michel DeGraff. Lately, I've had something of an intellectual infatuation going on with Derek Bickerton, so I skipped right to his chapter (which happened to be chapter 2, so I didn't skip much). I'd like to summarize his main points (as I see them) here.

When adults of different linguistic backgrounds come together and have to figure out how to communicate with each other quickly and effectively (without the luxury of language classes), something called a pidgin might develop. This is basically just a mishmash of content words without much in the way of syntax. Pidgin utterances, much like the language of children under two, ". . . are devoid of hierarchical structure, . . . are extremely brief, . . . are incapable of expanding phrases, . . . regularly omit subcategorized constituents, . . . depend heavily on contextual and pragmatic clues for interpretation, and . . . are characterized by slow delivery punctuated by frequent pauses and hesitations," (DeGraff 2001: 64). Pidgins are highly variable both grammatically and lexically, and they tend to leave out pretty much all inflectional affixes and "grammar words".

These adult pidgin speakers also have access to their native language and presumably speak it when they can find anybody around who'll understand it. But because the pidgin is what they have to use for day-to-day communication, and perhaps more importantly because this is what caretakers use when communicating with their children, this is what the children hear most often and, consequently, what should become their native language.

But the children aren't satisfied with this. They seem to have some pretty specific ideas of what they'd like to be able to express already in their heads, semantic distinctions like [+/-anterior], [+/-realis], [+/-punctual], [+/-specific], and [+/-accomplished] (DeGraff 2001: 59). Bickerton calls these innate ideas "the Bioprogram". They use the inconsistent mess they hear around them mostly as a source of forms that they can adapt to express what they already want to. In the verb system, for example, it's ". . . as if they knew in advance what the parameter setting should be (nonpunctual) and then cast around for a word in the input that would appropriately express the setting. . . . These settings are not triggered lexically--they are triggered by the absence of TMA markers," (DeGraff 2001: 59-61).

Actually, the child is "under no particular pressure" to find words for these ideas if something similar enough is represented consistently in the target language (DeGraff 2001: 59), so these innate preferences don't necessarily show up as "unmarked parameter settings" in the world's languages (DeGraff 2001: 56). The language that these children of pidgin speakers develop is called a creole. Amazingly, the grammars of creoles around the world are essentially the same, though often quite different from that of their substrate and superstrate languages.

Now that I've given some explanation of what pidgins and creoles are and how they come to be, I'll cut to the chase. Somewhere "around the close of [their] twenty-fifth month," children progress very quickly from short, simple, pidgin-like utterances to longer, more complex utterances that make use of embedding and grammatical morphemes. Certain aspects of the grammar (such as irregular verb forms and proper distribution of pronouns and anaphors) take years to perfect, but the core of the grammar seems to be there almost overnight (DeGraff 2001: 61). At the same time, children experience unprecedented expansion of their vocabulary, but this does not seem to trigger their transition from a very limited communication system to a real language. If anything, the new grammar seems to trigger the growth of the vocabulary (DeGraff 2001: 63).

Actually, I'd love to see if this is indeed the case by looking at studies comparing normal to delayed acquisition. Do children who aren't exposed to any language until, say, their fourth year go through the same pidgin-like stage as normal children under two or do they progress much more rapidly having already developed their "language organ"?

So that's about the gist of it. Bickerton argues that ". . . the course of syntactic acquisition is consistent with the abrupt coming-on-line of a specific neurological module devoted to syntactic processing--a module that must be triggered within quite a narrow temporal window . . ." (DeGraff 2001: 65). Adults are unable to develop pidgins into creoles because they, like children under two, lack this Bioprogram for language. This also helps to explain why adults are such bad second language learners: they need "a rich and robust source of well-formed data" supplemented by "general cognitive capacities," (DeGraff 2001: 64).

This book was first published in 1999, and I'm not sure how long before that Bickerton's article was actually written, so it's just a bit dated. Bickerton's latest work, Bastard Tongues, was published earlier this year, and I highly recommend it as an introduction to and general overview of pidgins, creoles, and what they can teach us about how children learn language and what language actually is anyway.

References:


Monday, September 8, 2008

Learning to Communicate (3)

How Does It Happen?

A few days after birth, infants show signs of being able to discriminate between their native language and a foreign language and even between some pairs of foreign languages. Tests such as low-pass filtering and syllable scrambling have shown that utterance-level prosodic information (such as changing pitch or volume) allows infants to make such distinctions (Guasti 2004). Apparently, the earliest linguistic information that children acquire pertains to whole constructions, generally at the utterance level. No information about constituent words or phonemes is present at this point, only a very basic representation of constructions as tone contours. This makes sense as such constructions vary a great deal in their words and phonemes but stay relatively consistent in their tone contours. As weeks go by, infants seem to lose their ability to distinguish foreign language pairs (Guasti 2004). Early on, the infants probably lacked a sufficient amount of linguistic input to identify their native language with much certainty. If given a choice between two streams of audio, an infant might choose the one that sounds like it could be the target language.

Babbling is an important first step in language production. It gives infants a chance to practice forming the phonemes, syllables, and tone contours of the language they hear around them. Many adult (or semi-adult) words can be produced as a simple by-product of babbling ("mama", "dada", "baba", etc.). According to O'Grady (2005), "[. . .] 'mama'-like sounds have been detected in children's vocalizations starting from as early as two weeks of age up to around five months, usually in a 'wanting' context (wanting to be picked up, wanting food, and so on)." It seems that as young as two weeks old, many children are able to understand adult speech for its most basic intentional meaning: "Pay attention to me!" The simple act of parroting back known syllables with the goal of gaining someone's attention represents a very basic form of semantic knowledge.

In order for children to gain a better idea of what exactly is being said, certain strategies, or "operating principles", have been suggested (Aitchison 1983). Among them are instructions to associate only one form with each unit of meaning, pay attention to the order of words, and avoid interruptions. Another obvious cue is prosodic stress; the most salient words in a sentence are often the most vital to its meaning. Young children seem especially good at understanding words referring to objects in their current attentional frame, which account for many of their first semantic associations (Tomasello 2003). There seems to be an inborn bias urging children to look for objects, as described by Elizabeth Spelke (Bloom 2002). Such objects possess cohesion, continuity, solidity, and (for inanimate objects) contact-driven motion. This object bias even seems to exist in such verb-heavy languages as Japanese and Korean. Tomasello feels that the object bias actually comes from the child's theory of mind, and Quine goes so far as to suggest that we get our very idea of objects from language (Bloom 2002). This seems a bit strong, but language certainly seems to influence our perception of reality. Bloom gives the example of one third of a line being called a "zoop" and the rest of it a "moop". If you were then shown three equally spaced dots under the line and told to put them into two groups, you would most likely group the two "moop" dots together, though no perceptual rationale exists for this decision.

Based on children's natural speech, there seem to be two basic styles of word learning: analytic and gestalt (O'Grady 2005). Children using the analytic style look for salient words in the utterances they hear and often assign them more meaning than they would possess in normal adult speech. For example, the word "up" could be used to mean, "Pick me up!" Those of the gestalt style attempt to mimic entire adult utterances. This results in unanalyzed chunks, which may sound fairly adult-like, as in /gimidat/ for "Give me that!" The difference is that such an utterance consists of a single morpheme in the child's lexicon. It may be a long time before the child figures out that /gimidat/ can be broken down as "Gimme that!" or even "Give me that!"

According to Tomasello (2003), children start by learning a few performatives (such as "hello", "please", and "no"), add a large number of nouns, and finally move on to verbs and modifiers. It is actually not true "nouns" and "verbs" that the child is learning but merely words referring to objects and words referring to actions. Adult-like categories emerge only later after a sufficient number of constructions are acquired. And even in adult language, many words, such as "kiss", actually belong to two or more of these categories at once. Children between the ages of 12 and 18 months are generally capable of uttering only one word or holophrase at a time (Tomasello 2003). In choosing which word to produce they follow the "Informativeness Principle" and choose the word that would most clearly convey their intended message (O'Grady 2005). Because subjects often consist of information understood by context, they are generally omitted.

Early constructions are item-based, the earliest containing only one variable element. This might be called the "two-word stage", but in the idiosyncratic language of young children, "words" are especially difficult to identify. The length of a child's utterance is traditionally measured by counting its morphemes (as defined by the adult language), which simply doesn't work for unanalyzed chunks and the gestalt style of learning. A better indication of the number of morphemes present in a given utterance comes from careful analysis of the child's naive grammar. Full utterances clearly possess meaning, so anything uttered in isolation is sure to comprise at least one morpheme. Furthermore, any element of a construction that the child replaces or omits with the apparent intention of affecting the meaning of the utterance as a whole can be said to possess meaning and comprise at least one morpheme. It is not a given that when a string of syllables meets the criteria for morpheme-hood in one construction, it will in other constructions.
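The two criteria above (uttered in isolation; replaced or omitted within an otherwise constant construction) can be sketched roughly in Python. This is only an illustration under strong assumptions: the utterances are made up, they arrive already split into candidate chunks by spaces, and the code has no way to judge the child's "apparent intention", so any attested replacement or omission counts. The name `morpheme_candidates` is mine.

```python
def morpheme_candidates(utterances):
    """Return chunks that meet at least one of the two morpheme criteria."""
    parsed = {tuple(u.split()) for u in utterances}
    morphemes = set()
    for chunks in parsed:
        if len(chunks) == 1:
            # Criterion 1: anything uttered in isolation carries meaning.
            morphemes.add(chunks[0])
            continue
        for i, chunk in enumerate(chunks):
            rest = chunks[:i] + chunks[i + 1:]
            # Criterion 2a: the same utterance also occurs without this chunk.
            omitted = rest in parsed
            # Criterion 2b: another utterance differs only at this position.
            replaced = any(
                len(other) == len(chunks)
                and other != chunks
                and other[:i] == chunks[:i]
                and other[i + 1:] == chunks[i + 1:]
                for other in parsed
            )
            if omitted or replaced:
                morphemes.add(chunk)
    return morphemes

sample = ["gimidat", "allgone", "more cookie", "more drink"]
found = morpheme_candidates(sample)
print(sorted(found))
```

On this toy sample, "gimidat" comes out as a single morpheme, where a count based on the adult language would see three. Note that "more" is not flagged here: nothing in the sample shows the child swapping it out or dropping it, so these criteria alone don't yet establish it as a unit.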

Item-based constructions (usually centered around verbs) linger on in child speech long after the two-word stage. Children under three generally have trouble with tasks requiring the use of generalized verb patterns, but at around the age of four, children come to realize that verbs in English tend to follow the general pattern of "subject + verb + direct object" (O'Grady 2005). Many words such as "because", "tomorrow", "morning", and "of" are used only in frozen or at least formulaic phrases long before their meaning is extracted through generalization across such phrases (Tomasello 2003).

Once the basic constructions of a language have been learned and major patterns have been discovered through analysis of these constructions, all that remains is to increase vocabulary size and correct the many minor imperfections undoubtedly present in the child's speech. One method of correction that parents seem to find intuitively appealing involves repeating what the child has just said but adjusting it to be grammatically acceptable. However, such recasts have been shown to be effective only when the child is already using the desired form at least half the time (O'Grady 2005). A type of error correction that might be able to get children to this halfway mark is "indirect negative evidence" (Pinker 1996). The conspicuous absence of a particular form may be enough to make children avoid using it. To some degree, this process of error correction continues into adulthood.

Another way a person's language might be adjusted is through an imbalance in linguistic trust. During a lecture to a full classroom, one of my professors recently produced the sentence "They developed that own code of behavior themselves." I found this to be an interesting construction and decided to write it down. As I did so, a student in the class responded to what the professor had just said with "According to that own philosophy . . ." followed by a jumble of words, which I can't remember. This situation illustrates how even an adult speaker's grammar can be subconsciously affected by others. The professor most likely wanted to convey the idea that a certain group of people developed their own code of behavior while stressing that it was the same code of behavior just mentioned. In general, the phrase "that own" is simply not used in this way. "Own" must be preceded by a possessive such as "their", "his", or "our" to make sense to the average listener. The addition of the word "themselves" to the end of the sentence served to give "own" meaning but felt clumsy and unplanned. However, because the professor was in a position of intellectual authority compared to the student, the "that own" construction was not viewed as a mistake by the responding student and was in fact (if only for a short while) incorporated into the student's own grammar.

If language acquisition is described as the process of learning the words and constructions of a given language, then a language is never fully acquired. Not only are there simply too many words in a given natural language for any one person to know them all, but new words, and occasionally new syntactic constructions, are added all the time. And because of constant shifts in accepted usage, even the formal speech of well-educated adults may be challenged as ungrammatical. For a person to speak any language well, ey must never stop learning to communicate.

References:


Saturday, September 6, 2008

Learning to Communicate (2)

How Is It Possible?

A linguist's approach to the problem of language acquisition is generally predictable from eir view of language in general. Those viewing language as a collection of constructions formed over time by historic processes tend to believe that language is acquired through general learning and social skills not specific to language (Tomasello 2003). Those insisting on a strict separation of language into elements of meaning and meaningless rules for combining them assert that our species needs some level of inborn knowledge to acquire these rules (Pinker 1996). Such dual-process models of language acquisition posit that words and elements of "the periphery" are acquired through normal learning processes, but a language's "core grammar" is unlearnable and therefore innate.

Nativist theories of language acquisition often make use of the "continuity assumption", which states that a child in the process of acquiring a language makes use of the same inborn grammatical knowledge as any adult. This leads to the view that the acquisition of a language actually completes at some point, namely when all the right switches have flipped in the brain resulting in full knowledge of the core grammar. If empirically feasible, it seems more continuous to assume that children learn new words and constructions in the same basic way as adults, only with a much greater incentive and less mental interference. This single process applies to morphology, syntax, and discourse all at once, an idea supported by the properties of language change over time (Tomasello 2003).

Before a baby even leaves the womb, ey use very powerful pattern recognition skills, which are not specific to language and are observable in various primates as well as adult humans (Armstrong 1999, Tomasello 2003), to recognize the rhythm and tone contours of eir native language (Guasti 2004). These same skills will later be used to recognize frequently occurring elements in fluid speech and thereby segment input into meaningful units (Bloom 2002). Suprasegmental features such as stress help to make the most important elements the most salient, thereby lowering the cognitive load (Pinker 1996). Such elements may be mimicked with very little understanding of their purpose but not without the understanding that they have a purpose.

Early on, children develop a naive theory of mind, an understanding that each person has a mental life which guides eir actions (Bloom 2002). This leads to an assumption that everything a person does, ey do for a reason. This understanding is crucial to the acquisition of language. If an adult looks at a given object, a child might look in the same direction, assuming that the adult must be looking at something of interest, a process known as "attention-sharing" (Tomasello 2003). According to a study conducted by Johnson, Slaughter, and Carey, 12-month-old babies will even follow the "gaze" (the front, reactive side) of a faceless robot, just so long as the robot interacts with them in some meaningful way (Bloom 2002). It seems that prelinguistic infants readily apply their theory of mind to non-humans, even non-living creatures, if the situation warrants.

As a part of their ever-expanding theory of mind, children make the crucial assumption that words have different forms because they differ in meaning (Tomasello 2003). If people are rational beings, as infants seem to assume, then it only makes sense that two noises would not be used interchangeably to do a job that a single noise could do. Certainly, some words are very similar in meaning, but there is generally some minor difference, at least in pragmatic function. This "principle of contrast" allows language learners to assume that if one object is called a "necklace", it is probably not also a "bracelet", despite the obvious physical similarity (Bloom 2002). A child may recognize the difference between a dog and a cat but, knowing only the word "dog", still refer to a cat in this way. Not being very confident in eir use of the word "dog" for this fuzzy little creature, the child quickly switches to "cat" once ey hear it used to reference the animal in question (O'Grady 2005). The more contrasts are made, the stronger the subconscious assumption that the "right words" have been found becomes, which may help to explain why adult learners of foreign languages have so much difficulty.

Once a child has made the assumption that the vocal noises of adults occur for a reason, ey are still left with the task of figuring out what this reason could be. The most basic motivation for speech is gaining, and often directing, the attention of others. Additionally, some sort of action on the part of the listener is often requested (or forbidden). Using subtle social cues such as eye gaze, facial expression, and tone of voice, children are able to read the communicative intentions of others surprisingly well (Tomasello 2003). When parsing speech, children attempt to find the smallest linguistic unit that can be consistently linked to a given intention. In this way, the child is faced with the two-step process of learning spoken forms and subsequently linking these forms to some sort of meaning (Guasti 2004). Most communicative intentions should be linked to the utterance level or higher, but a fairly good correlation can be found between the speaker’s intentions and certain words and set phrases. This is especially true of child-directed speech. According to Bloom (2002), words and facts given in the same context are learned with about the same proficiency; there is really nothing language-specific about the inductive reasoning skills used to learn words.
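As a rough sketch of what "consistently linked to a given intention" could mean, here is a toy Python version. Everything in it is an assumption made for illustration: the utterances and intention labels are invented, words stand in for "linguistic units", and the intentions are handed to the learner ready-made, which is exactly the part that real children have to work out from social cues.

```python
from collections import Counter

def consistent_units(observations):
    """Map each intention to the words present in every utterance paired with it."""
    shared = {}
    for utterance, intention in observations:
        words = set(utterance.lower().split())
        shared[intention] = shared.get(intention, words) & words
    return shared

def distinctive_units(shared):
    """Drop words tied to more than one intention, since they signal none of them."""
    counts = Counter(word for words in shared.values() for word in words)
    return {
        intention: {w for w in words if counts[w] == 1}
        for intention, words in shared.items()
    }

observations = [
    ("look at the doggie", "attend to dog"),
    ("see the doggie", "attend to dog"),
    ("give me the ball", "hand over ball"),
    ("pass the ball", "hand over ball"),
]
links = distinctive_units(consistent_units(observations))
print(links)
```

With only four utterances, "doggie" and "ball" are the only words left standing, while "the" drops out because it shows up with both intentions. Nothing here is specific to language; it's plain set intersection, which fits Bloom's point about general inductive reasoning.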

At this point, our hypothetical child is already able to understand the purpose of language, guess the approximate meanings of some adult utterances, and make the connection between certain words and their communicative function. What happens next is called "cultural learning" (Tomasello 2003) and is often confused with simple imitation or mimicry. If an adult turns a doorknob and the door opens, a child will assume that these two actions are related and perhaps attempt to open the door eirself. This differs from simple imitation in that the intentions of the adult are actually being replicated in the child. The child recognizes that the adult wants the door open, observes eir actions, and mimics these actions when the child has a similar desire for the door to open. Through the work of Meltzoff, Carpenter, Akhtar, and Tomasello, very young children have been shown to imitate the intentions of others, replicating purposeful behavior and ignoring accidents (Tomasello 2003). Thus, imitation through cultural learning might even lead to different actions than those modeled if the adult was unsuccessful in eir task.

Linguistic analysis going from the level of a whole utterance down to its component parts (as opposed to the reverse) has been shown to occur in all languages and is the "normal case" for synthetic languages such as Inuktitut (Tomasello 2003). For instance, the meanings of individual nouns must be extracted by generalizing across noun phrases, which typically serve as verbal participants (Bloom 2002). Vihman (1996) has even proposed that young children make use of holistic phonological templates, from which phonemes are later extracted. Certain phonemes are rarely (if ever) uttered in isolation, and the component features of phonemes couldn't possibly exist except as a part of a complete speech sound, so this whole-to-parts theory of phonological development is logically appealing.

Much research exists showing that children's early one-word utterances are really "holophrases" meant to convey complete communicative intentions (Tomasello 2003). Logically, such utterances would serve no purpose if this weren't the case. As an increasing number of holophrases (consisting of either single words or unanalyzed chunks) are learned, the processes of analogy, extraction, and categorization come into play. By noticing similarities between several set phrases such as "more cookie", "more drink", and "more play", the child forms a very simple construction: "more X", where X is the thing being requested. The words allowed to fill the X slot are thus extracted and put in an item-specific category (consisting of "cookie", "drink", and "play"). By comparison with another newly formed construction such as "allgone Y" and its item-specific category Y (consisting of "cereal", "cookie", "drink", and "Daddy"), a more general category Z (consisting of "cereal", "cookie", "drink", "play", and "Daddy") might be postulated, allowing the novel utterance "More cereal!" In this way, all the morphemes, words, set phrases, and complex constructions dictated by the historical tradition of a given language can be acquired.
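The analogy-and-categorization step described above can be sketched in a few lines of Python. This is only a toy illustration, not a model taken from Tomasello or anyone else: the function names, the two-word "pivot + filler" format, and the rule that two slot categories merge once they share at least two fillers (`min_shared`) are all assumptions made for the example.

```python
from collections import defaultdict

def learn_constructions(utterances):
    """Group two-word utterances by first word, e.g. "more X" -> {fillers of X}."""
    slots = defaultdict(set)
    for utterance in utterances:
        pivot, filler = utterance.split()
        slots[pivot].add(filler)
    return dict(slots)

def general_category(slots, min_shared=2):
    """Merge the item-specific categories of any two constructions that
    share at least min_shared fillers (a hypothetical threshold)."""
    pivots = list(slots)
    merged = set()
    for i, first in enumerate(pivots):
        for second in pivots[i + 1:]:
            if len(slots[first] & slots[second]) >= min_shared:
                merged |= slots[first] | slots[second]
    return merged

def novel_utterances(slots, category):
    """Utterances licensed by the general category but never actually heard."""
    heard = {f"{pivot} {filler}" for pivot, fillers in slots.items() for filler in fillers}
    possible = {f"{pivot} {filler}" for pivot in slots for filler in category}
    return sorted(possible - heard)

heard = [
    "more cookie", "more drink", "more play",
    "allgone cereal", "allgone cookie", "allgone drink", "allgone Daddy",
]
slots = learn_constructions(heard)     # item-specific categories X and Y
category_z = general_category(slots)   # the more general category Z
novel = novel_utterances(slots, category_z)
print(novel)  # ['allgone play', 'more Daddy', 'more cereal']
```

Notice that the same merge that allows "more cereal" also allows "more Daddy" and "allgone play". A real child would need further evidence to rein in generalizations like these, which this sketch makes no attempt to model.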

Saturday, August 30, 2008

Learning to Communicate (1)

What Must Be Learned?

In order to present an adequate description and analysis of language acquisition, language itself must first be defined. In debates over how language is acquired and whether it can be taught to non-humans, opposing sides often disagree on exactly what qualifies as language and what is merely gesturing, mimicry, or some non-linguistic communication system. Aitchison (1983) defines language by means of ten necessary features: semanticity, creativity, displacement, arbitrariness in symbol-meaning relationships, organization into two or more layers, structure-dependence, spontaneous usage, turn-taking, cultural transmission, and even use of the vocal-auditory channel, which necessarily excludes written communication and signed languages. Aside from being a bit cumbersome, this definition of language simply excludes too much. What you are reading right now fails to meet at least two of the ten requirements.

It may seem obvious that languages have words, which can be used to form sentences, which express ideas, but defining such things as "words", "sentences", and "ideas" proves surprisingly difficult. For instance, how many words does the sentence "The soon-to-be-married cab driver bought 103 candles," contain in spoken English? Would there be any difference if "103" were spelled out as "a hundred and three"? A similar problem exists in defining sentences. Spoken language does not include punctuation indicating where one sentence ends and the next begins. Obviously, defining language in such nebulous terms as words and sentences is asking for trouble.
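To make the counting problem concrete, here is a rough Python sketch. The two counting rules (split on spaces only, or split on hyphens as well) are illustrative assumptions, not standard linguistic definitions, and both lean entirely on spelling, which is exactly what spoken English doesn't have.

```python
import re

def count_space_separated(sentence):
    """Count 'words' as chunks of text between spaces."""
    return len(sentence.split())

def count_hyphens_split(sentence):
    """Count 'words' as unbroken runs of letters or digits,
    so that hyphenated compounds are broken apart."""
    return len(re.findall(r"[A-Za-z0-9]+", sentence))

with_digits = "The soon-to-be-married cab driver bought 103 candles"
spelled_out = "The soon-to-be-married cab driver bought a hundred and three candles"

counts = {
    "digits, spaces only": count_space_separated(with_digits),     # 7
    "digits, hyphens split": count_hyphens_split(with_digits),     # 10
    "spelled, spaces only": count_space_separated(spelled_out),    # 10
    "spelled, hyphens split": count_hyphens_split(spelled_out),    # 13
}
for rule, total in counts.items():
    print(f"{rule}: {total}")
```

The same spoken sentence comes out as 7, 10, or 13 "words" depending on nothing but spelling conventions.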

A much more easily defined linguistic unit is the signal. A signal is any bit of sound, text, or gesturing that, through social convention, conveys some sort of information. This includes morphemes, words, constructions, sentences, and even long speeches. For the purposes of this paper, the word "language" will refer to any system of communication in which two or more signals can be combined to create a larger signal with a new meaning that is somewhat (if not entirely) predictable from the original signals and the method used to combine them. This definition does not go so far as to include traffic lights (contra Aitchison 1983), but it does lump certain types of animal communication into the same category as human language. This alone is enough to make many researchers balk.

The purpose of language is communication. When performing any sort of linguistic analysis, it must be kept in mind that an ordinary language learner does not try to form a system that can generate "all the possible utterances" of the language (contra Aitchison 1983); ey simply try to make eirself understood. Eir primary concerns are guessing people's communicative intentions more accurately and making eir own intentions known more easily. The social conventions of language exist purely to serve these ends.

Despite the impressive communication systems of such species as dolphins and bonobos, humans seem to have a special capacity for language. Certain apes have been somewhat successful in learning human language systems (Washoe, Koko, and Kanzi, for example), but none have done so anywhere near as effortlessly and completely as an average human. Thus, humans must possess some degree of innate linguistic ability (Pinker 1996).

If we are genetically predisposed to use language and our ancestors weren't, this predisposition must have developed gradually. Language is a cultural phenomenon, not a physical adaptation that could be of any use to an individual in isolation. A random change in an individual's genetic code resulting in enhanced communication skills would prove useless unless some sort of cultural communication system were already in use. Therefore, the earliest form of language must have been possible without any sort of language-specific genetic adaptation. Perhaps we used to walk on all fours, occasionally standing up to make hand gestures. As gesturing became more of a necessity, those who could walk upright habitually would be at a distinct advantage. It is this practice of walking upright that probably led to the descent of the human larynx, making complex vocal communication much more feasible (Armstrong 1999). So, if our linguistic abilities developed gradually, then our view of language should reflect this. The communication systems of the great apes should be viewed as likely precursors to our own.

Of course, not everyone feels this way. Noam Chomsky proposed that human language is possible because of a genetic adaptation that endows our species with a "universal grammar" (Aitchison 1983). Chomsky explains the similarities across the natural languages of the world by positing a single, underlying grammar, which every child is born knowing and uses to acquire eir native language. The drawback of this view is that natural languages do show immense variation in both form and structure. In order for the hypothesized grammatical rules of a given language to stay (in a sense) universal, they must sometimes become incredibly complex (Pinker 1996). Another complication is that the hypothesized "language organ" of the brain has yet to be found. So-called "language genes" have been shown to affect language only indirectly (O'Grady 2005). The transformations used in Chomsky's original Standard Theory have been shown to be unrealistic (Aitchison 1983). Perhaps the strongly nativist idea of an inborn, universal grammar will prove similarly dubious.

On the opposite end of this debate is Benjamin Whorf, who viewed the world's languages as different right down to the concepts they encode. Stressing that the categories and types we encounter do not "stare every observer in the face" (Bloom 2002), Whorf proposed that the way a human views the world is dependent on eir native language. This idea seems to be verified to a degree by the work of Choi and Bowerman (1991), who observed that the spatial perceptions of two-year-olds are affected by the language they are in the process of acquiring. In this way, the acquisition of language guides cognitive development (Tomasello 2003). People everywhere possess the intuitive feeling that the objects and creatures of our world fall into natural categories. This "naive essentialism" seems to be universal, but it is scientifically ungrounded (Bloom 2002). Such classifications as colors, animal species, and the states of matter are ultimately man-made, no more a part of our natural environment than the words we use to describe them.

The linguistic distinctions between morphology, syntax, and discourse are similarly artificial. It is useful to put animals that can mate and produce fertile offspring into a single category and differentiate them from animals of other species, but separating the various systems of signal combination within a given language makes less sense. Morphology builds words from the smallest units of meaning, syntax builds sentences from words, and discourse builds larger constructions out of sentences. So the distinctions between these systems hinge on the ideas of "words" and "sentences". In simpler terms, all three systems combine signals to create larger signals.

The only reliable way to separate syntactic and morphological processes is through analysis of the language's written component. But even today, many languages see no reason to include word boundaries in their writing systems. Are the particles of Japanese really postpositions, or are they suffixes? Another problem with this method is that writing systems are based much more on tradition than on logic. Is the Spanish command "¡Dámelo!" ("Give me it!") really one word as it is written, or should it be broken down into "da", "me", and "lo"? Certainly, words are seen as holding a more complete meaning than their component morphemes, but sentences convey a more complete message still. And in any case, part of knowing the meanings of such words as "a" and "some" is knowing how these words interact with nouns (Bloom 2002). There is so much overlap between morphology and syntax that Dabrowska (2000) has referred to syntactic constructions as simply "big words". Within a language, many systems can be described as operating primarily on their own set of rules (such as number generation, proper names and titles, and verb phrase construction), but these should all be viewed as subsystems of the language's grammar.

When language is viewed in historical terms, the systems of morphology, syntax, and discourse blur together even more. As Talmy Givon (1979) puts it, yesterday's discourse is today's syntax. And in fact, yesterday's syntax is today's morphology. Loose discourse is syntacticized by such processes as concatenation, reduction, and reanalysis (Tomasello 2003). Pertinent examples can be found in contemporary American speech. The separate phrases of the utterance "If you would, sign here please," can, through imperfect imitation or massive repetition, be combined into the single tone contour of "If you would sign here please." This phrase is then reanalyzed as a simple conditional that functions as a request. This allows such phrases as "If I could get you to sign here please," to serve as complete sentences. In response to "What's the problem?", a person might answer, "What the problem is is a cow's on the tracks." Over time, such a common nominal construction as "what the problem is" can be reduced to simply "the problem is", resulting in the conventionally ungrammatical sentence "The problem is is a cow's on the tracks." For some reason, the repetition of the word "is" is noticed and applied to such statements as "The thing is is a cow's on the tracks," when a similar message is intended. And so, English phrases are being modified and reanalyzed into new grammatical structures even now.

If a language's grammar is nothing more than the cumulative result of millennia's worth of discursive practices becoming formalized, then perhaps the so-called "universals" of language can be explained by our cultural, and not necessarily biological, similarities as human beings (Tomasello 2003).

Tuesday, August 26, 2008

A Gender-Neutral Pronoun

The English language currently lacks a third-person singular pronoun that can refer to either a male or a female human being. Historically, this role was played by the masculine "he", but this usage has gradually fallen out of favor. Today, the plural "they" is often used informally in this way, but this is nonstandard and often awkward.

In formal discussions of first language acquisition, the lack of a gender-neutral pronoun presents a special problem, as numerous references must be made to "the child". Phrases like "he or she" get the job done but can be bothersome, especially in sentences with numerous pronouns such as, "He or she hurt himself or herself on his or her bike."

Books on language acquisition tackle the problem in different ways. Some stick to the traditional masculine form, some use constructions like "he or she" and just try to keep them to a minimum (Pinker 1996), some alternate between "he" and "she" randomly (Aitchison 1983) or by chapter (O'Grady 2005), and others make all references to "the child" feminine, using the masculine only for contrast (Tomasello 2003).

I choose a different solution. Following the pattern of the third-person plural, I use the word "ey" to mean "he or she", "em" as the object form ("him or her"), and "eir" as the possessive ("his or her"). This is essentially the system popularized by Michael Spivak (1982) in The Joy of TeX.

Thursday, June 19, 2008

The Easiest Way to Learn a Language Fast

Because I study linguistics and teach English, people have often asked me, "What's the best way to learn a foreign language?" Of course, the best way is to move to a country where that language is spoken, immerse yourself in it, and struggle until you get by (taking private lessons from a native speaker all the while). Since most of us can't really afford to learn a language that way, I reinterpret the question as, "What's the easiest way to learn a language fast?" In that case, I'd go with audio lessons.

A year or two ago, I found the kind of all-audio course I was looking for in a library. The only problem was that it was for French, and I really wanted to learn Japanese. I listened anyway, and it was great! It was by Michel Thomas and was basically just him teaching two students with little pauses inserted to give you a chance to answer before they do. (Michel suggests pausing the CD as needed.)

Anyway, now that I knew there was something like that on the market, I looked for the Japanese equivalent. (Michel Thomas doesn't make one.) I first got a Learn in Your Car product, which I regretted. It was just one translation after another with no participation required.

The English speaker says a word or phrase (twice), and then the Japanese speaker translates it (twice). That's all! This is by no means useless, but it's not a very good way to go about learning Japanese. I'd recommend this system for reviewing your vocabulary and filling in some gaps. For this to work, you should actually listen to the CDs multiple times. It's the sort of thing you would play while you sleep for subliminal learning.

Conversational Japanese: Learn to Speak and Understand Japanese with Pimsleur Language Programs actually engages you and makes you participate. You're asked questions non-stop, so you can't tune out even if you want to. When you hear a new term, you're not expected to grasp it all at once. It's broken down so you can clearly hear what it is you're supposed to repeat. Plus, you get real conversation at normal speed thrown in. I love it!

I'm not throwing out my LIYC audio, but I'd definitely recommend Pimsleur first. It's a great way to step up your Japanese (or Spanish or French or German...) while you drive or do laundry or whatever.

Sunday, May 11, 2008

It's About Bleedin' Time!

You've probably heard about Amazon Kindle, right? I had this idea years ago. Probably lots of people did. There might've been similar products that came out in the past without much notice because the technology just wasn't good enough to replace good ol' books yet.

Now, the screen is like paper, you can download books and stuff wirelessly basically anywhere (not just in WiFi hotspots), and the whole thing's really thin and light... like a little book or something. Oh yeah, and the books are totally cheap since there aren't any printing or storage costs. If you haven't seen it yet, check it out. It's pretty slick.
