How Is It Possible?
A linguist's approach to the problem of language acquisition is usually predictable from eir view of language in general. Those viewing language as a collection of constructions formed over time by historic processes tend to believe that language is acquired through general learning and social skills not specific to language (Tomasello 2003). Those insisting on a strict separation of language into elements of meaning and meaningless rules for combining them assert that our species needs some level of inborn knowledge to acquire these rules (Pinker 1996). Such dual-process models of language acquisition posit that words and elements of "the periphery" are acquired through normal learning processes, but a language's "core grammar" is unlearnable and therefore innate.
Nativist theories of language acquisition often make use of the "continuity assumption", which states that a child in the process of acquiring a language makes use of the same inborn grammatical knowledge as any adult. This leads to the view that the acquisition of a language actually completes at some point, namely when all the right switches have flipped in the brain resulting in full knowledge of the core grammar. If empirically feasible, it seems more continuous to assume that children learn new words and constructions in the same basic way as adults, only with a much greater incentive and less mental interference. This single process applies to morphology, syntax, and discourse all at once, an idea supported by the properties of language change over time (Tomasello 2003).
Before a baby even leaves the womb, ey use very powerful pattern-recognition skills to recognize the rhythm and tone contours of eir native language (Guasti 2004). These skills are not specific to language and are observable in various primates as well as adult humans (Armstrong 1999, Tomasello 2003). The same skills will later be used to recognize frequently occurring elements in fluid speech and thereby segment input into meaningful units (Bloom 2002). Suprasegmental features such as stress help to make the most important elements the most salient, thereby lowering the cognitive load (Pinker 1996). Such elements may be mimicked with very little understanding of their purpose but not without the understanding that they have a purpose.
Early on, children develop a naive theory of mind, an understanding that each person has a mental life which guides eir actions (Bloom 2002). This leads to an assumption that everything a person does, ey do for a reason. This understanding is crucial to the acquisition of language. If an adult looks at a given object, a child might look in the same direction, assuming that the adult must be looking at something of interest, a process known as "attention-sharing" (Tomasello 2003). According to a study conducted by Johnson, Slaughter, and Carey, 12-month-old babies will even follow the "gaze" (the front, reactive side) of a faceless robot, so long as the robot interacts with them in some meaningful way (Bloom 2002). It seems that prelinguistic infants readily apply their theory of mind to non-humans, even non-living things, if the situation warrants.
As a part of their ever-expanding theory of mind, children make the crucial assumption that words have different forms because they differ in meaning (Tomasello 2003). If people are rational beings, as infants seem to assume, then it only makes sense that two noises would not be used interchangeably to do a job that a single noise could do. Certainly, some words are very similar in meaning, but there is generally some minor difference, at least in pragmatic function. This "principle of contrast" allows language learners to assume that if one object is called a "necklace", it is probably not also a "bracelet", despite the obvious physical similarity (Bloom 2002). A child may recognize the difference between a dog and a cat but, knowing only the word "dog", still refer to a cat in this way. Not being very confident in eir use of the word "dog" for this fuzzy little creature, the child quickly switches to "cat" once ey hear it used to reference the animal in question (O'Grady 2005). The more contrasts are made, the stronger the subconscious assumption becomes that the "right words" have been found, which may help to explain why adult learners of foreign languages have so much difficulty.
Once a child has made the assumption that the vocal noises of adults occur for a reason, ey are still left with the task of figuring out what this reason could be. The most basic motivation for speech is gaining, and often directing, the attention of others. Additionally, some sort of action on the part of the listener is often requested (or forbidden). Using subtle social cues such as eye gaze, facial expression, and tone of voice, children are able to read the communicative intentions of others surprisingly well (Tomasello 2003). When parsing speech, children attempt to find the smallest linguistic unit that can be consistently linked to a given intention. In this way, the child is faced with the two-step process of learning spoken forms and subsequently linking these forms to some sort of meaning (Guasti 2004). Most communicative intentions should be linked to the utterance level or higher, but a fairly good correlation can be found between the speaker’s intentions and certain words and set phrases. This is especially true of child-directed speech. According to Bloom (2002), words and facts given in the same context are learned with about the same proficiency; there is really nothing language-specific about the inductive reasoning skills used to learn words.
At this point, our hypothetical child is already able to understand the purpose of language, guess the approximate meanings of some adult utterances, and make the connection between certain words and their communicative function. What happens next is called "cultural learning" (Tomasello 2003) and is often confused with simple imitation or mimicry. If an adult turns a doorknob and the door opens, a child will assume that these two actions are related and perhaps attempt to open the door eirself. This differs from simple imitation in that the intentions of the adult are actually being replicated in the child. The child recognizes that the adult wants the door open, observes eir actions, and mimics these actions when the child has a similar desire for the door to open. Through the work of Meltzoff, Carpenter, Akhtar, and Tomasello, very young children have been shown to imitate the intentions of others, replicating purposeful behavior and ignoring accidents (Tomasello 2003). Thus, imitation through cultural learning might even lead to different actions than those modeled if the adult was unsuccessful in eir task.
Linguistic analysis going from the level of a whole utterance down to its component parts (as opposed to the reverse) has been shown to occur in all languages and is the "normal case" for synthetic languages such as Inuktitut (Tomasello 2003). For instance, the meanings of individual nouns must be extracted by generalizing across noun phrases, which typically serve as verbal participants (Bloom 2002). Vihman (1996) has even proposed that young children make use of holistic phonological templates, from which phonemes are later extracted. Certain phonemes are rarely (if ever) uttered in isolation, and the component features of phonemes couldn't possibly exist except as a part of a complete speech sound, so this whole-to-parts theory of phonological development is logically appealing.
Much research exists showing that children's early one-word utterances are really "holophrases" meant to convey complete communicative intentions (Tomasello 2003). Logically, such utterances would serve no purpose if this weren't the case. As an increasing number of holophrases (consisting of either single words or unanalyzed chunks) are learned, the processes of analogy, extraction, and categorization come into play. By noticing similarities between several set phrases such as "more cookie", "more drink", and "more play", the child forms a very simple construction: "more X", where X is the thing being requested. The words allowed to fill the X slot are thus extracted and put in an item-specific category (consisting of "cookie", "drink", and "play"). By comparison with another newly formed construction such as "allgone Y" and its item-specific category Y (consisting of "cereal", "cookie", "drink", and "Daddy"), a more general category Z (consisting of "cereal", "cookie", "drink", "play", and "Daddy") might be postulated, allowing the novel utterance "More cereal!" In this way, all the morphemes, words, set phrases, and complex constructions dictated by the historical tradition of a given language can be acquired.
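The "more X" and "allgone Y" example above can be loosely sketched in code. The following is only a toy illustration, not a model proposed by any of the cited authors: it assumes every set phrase is exactly two words with the fixed word first, and the function names and the `min_shared` threshold (how many shared fillers justify pooling two categories) are hypothetical choices made for this sketch.

```python
from collections import defaultdict


def extract_constructions(utterances):
    """Group two-word phrases by their first word, e.g. "more X",
    and collect the words heard in the X slot (item-specific categories)."""
    slots = defaultdict(set)
    for utterance in utterances:
        pivot, filler = utterance.split()
        slots[pivot].add(filler)
    return dict(slots)


def merge_categories(slots, min_shared=2):
    """Pool the slot fillers of any two constructions whose item-specific
    categories overlap in at least `min_shared` words (assumed threshold)."""
    merged = {pivot: set(fillers) for pivot, fillers in slots.items()}
    pivots = list(slots)
    for i, first in enumerate(pivots):
        for second in pivots[i + 1:]:
            if len(slots[first] & slots[second]) >= min_shared:
                general = slots[first] | slots[second]
                merged[first] |= general
                merged[second] |= general
    return merged


def novel_utterances(slots, merged):
    """List phrases licensed by the general category but never heard."""
    return {
        f"{pivot} {filler}"
        for pivot, fillers in merged.items()
        for filler in fillers - slots[pivot]
    }


# The set phrases from the example in the text.
heard = [
    "more cookie", "more drink", "more play",
    "allgone cereal", "allgone cookie", "allgone drink", "allgone Daddy",
]
slots = extract_constructions(heard)
merged = merge_categories(slots)
novel = novel_utterances(slots, merged)
print(sorted(novel))
```

With this input, the pooled category for "more" matches category Z from the text, and "more cereal" appears among the novel phrases. The sketch also licenses "more Daddy" and "allgone play"; whether and when a child would actually produce such forms is a question the toy does not address.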