Lexis (linguistics)
In linguistics, a lexis (from the Greek: λέξις "word") is the total word-stock or lexicon, comprising items of lexical rather than grammatical meaning. This notion contrasts starkly with the Chomskian proposition of a “Universal Grammar” as the prime mover for language; grammar still plays an integral role in lexis, but it is the result of accumulated lexis, not its generator.
Lexicon
In short, the lexicon is
- Formulaic: it relies on partially-fixed expressions and highly probable word combinations
- Idiomatic: it follows conventions and patterns for usage
- Metaphoric: concepts such as time and money, business and sex, systems and water all share a large portion of the same vocabulary
- Grammatical: it uses rules based on sampling of the Lexicon
- Register-specific: it uses the same word differently and/or less frequently in different contexts
A major area of study in psycholinguistics and neurolinguistics involves the question of how words are retrieved from the mental lexicon in online language processing and production. For example, the cohort model seeks to describe lexical retrieval in terms of segment-by-segment activation of competing lexical entries.[1][2]
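The segment-by-segment narrowing that the cohort model describes can be illustrated with a small sketch. This is a simplification rather than an implementation of the model itself: letters stand in for speech segments, the toy lexicon is invented for illustration, and the function name `cohort_trace` is hypothetical.

```python
def cohort_trace(segments, lexicon):
    """Return the surviving candidate words after each incoming segment.

    Assumption: letters stand in for phonemes, and a candidate stays in
    the cohort only while it matches every segment heard so far.
    """
    cohort = sorted(lexicon)
    trace = []
    for i, segment in enumerate(segments):
        cohort = [w for w in cohort if len(w) > i and w[i] == segment]
        trace.append((segments[: i + 1], cohort))
    return trace


# Toy lexicon, invented for illustration only.
lexicon = ["stranger", "strange", "strand", "string", "stand", "possibility"]

for heard, candidates in cohort_trace("strang", lexicon):
    print(heard, candidates)
```

Each printed line shows the input heard so far and the lexical entries still competing; the set shrinks as more segments arrive.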
Formulaic language
In recent years, the compilation of language databases using real samples from speech and writing has enabled researchers to take a fresh look at the composition of languages. Among other things, statistical research methods offer reliable insight into the ways in which words interact. The most interesting findings concern the dichotomy between language use (how language is used) and language usage (how language could be used).
Language use shows which occurrences of words and their partners are most probable. The major finding of this research is that language users rely to a very high extent on ready-made “lexical chunks” of language, which can be easily combined to form sentences. This eliminates the need for the speaker to analyze each sentence grammatically, while still dealing with the situation effectively. Typical examples include “I see what you mean” or “Could you please hand me the…” or “Recent research shows that…”
Language usage, on the other hand, is what takes place when the ready-made chunks do not fulfill the speaker’s immediate needs; in other words, a new sentence is about to be formed and must be analyzed for correctness. Grammar rules have been internalized by native speakers, allowing them to determine the viability of new sentences. Language usage might be defined as a fall-back position when all other options have been exhausted.
Context and co-text
When analyzing the structure of language statistically, a useful place to start is with high-frequency context words, or so-called Key Words in Context (KWICs). After millions of samples of spoken and written language have been stored in a database, these KWICs can be sorted and analyzed for their co-text, that is, the words which commonly co-occur with them. Valuable principles with which KWICs can be analyzed include:
- Collocation: words and their co-occurrences (examples include “fulfill needs” and “fall-back position”)
- Semantic prosody: the connotation words carry (“pay attention” can be neutral or remonstrative, as when a teacher says to a pupil: “Pay attention!”, with an implied “or else”)
- Colligation: the grammar that words use (while “I hope that suits you” sounds natural, “I hope that you are suited by that” does not).
- Register: the text style a word is used in (“President vows to support allies” is most likely found in news headlines, whereas in speech the noun “vows” most likely refers to marriage, and the verb “vow” is most likely used in the sense of “promise”).
(partially adapted from Lewis, 1997)
Once data has been collected, it can be sorted to determine the probability of co-occurrences. One common way to do this is with a concordance: the KWIC is centered and shown with dozens of examples of it in use, as with the example for “possibility” below.
Concordance for possibility
bout to be put on looks a real possibility. Now that Benn is no longer
Hiett, says that remains a real possibility: As part of the PLO, the PLF
Graham added. That's a possibility as well," Whitlock admitted.
Severe pain was always a possibility. Early in the century, both
that, when possible, every other possibility, including speeches by outside
that we can, that we use every possibility, including every possibility of
could be let separately. Another possibility is `constructive vandalism'
a people reject violence and the possibility of violence can the possibility
the French vote and now enjoy the possibility of winning two seats in the
immediately investigate the possibility of criminal charges and that her
Sri Lankan sources say that the possibility of negotiating with the Tamil
Sheikhdoms too there might be the possibility of encouraging agitation.
the twelve member states on the possibility of their threatening to
Marie had already looked into the possibility of persuading the [f]
a function of dependency, but the possibility of capitalist development,
were almost defenceless. The possibility of an invasion had been apparent
oddly and are worried about the possibility of drug use, say so. Tell them
was first convened to discuss the possibility of a coup d'état to return the
in the mi5 line and in the possibility of the state being used to smear
reasons behind the move was the possibility of a new market. Cheap terminals
be assessed individually. The possibility of genetic testing brings that
given the privilege. The other possibility, of course, is that the jaunt
All this undermines the possibility of economic reform and requires
get. (Knowing that there is no possibility of attempting coitus takes the
who was openly cynical about the possibility of achieving socialism 5
so that they can perceive the possibility of being citizens engaged in
poisoning and fire, facing the possibility of their own death just to be
hearing yesterday that the possibility of using the agency to gather
in 1903, and I don't foresee any possibility replacing that. The car we
a genetic factor at work here, a possibility supported by at least a few
refused even to entertain the possibility that any of the nations of the
has a long history, there is the possibility that the recent upsurge in
Police are investigating the possibility that she was seen a short time
any doctors who think there is a possibility that they may have been infected
are in a store, there is a good possibility that you are wearing moisturizer
living must be made. The possibility that a young adult will be
he'd completed his account of the possibility that there was a drug-smuggling
has been devoted to exploring the possibility that so-called ancient peoples
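A display of this kind can be produced with very little code. The following is a minimal sketch, not the tool used to build the concordance above: it assumes the corpus is available as a single plain-text string, and the function name `kwic` and the `width` parameter are hypothetical. The sample text reuses two fragments from the concordance lines above.

```python
import re


def kwic(text, keyword, width=30):
    """Return one line per occurrence of keyword, aligned on the keyword.

    Assumption: `text` is a single-line string; `width` is the number of
    characters of co-text shown on each side of the keyword.
    """
    lines = []
    pattern = r"\b" + re.escape(keyword) + r"\b"
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left co-text so every keyword starts in the same column.
        lines.append(f"{left:>{width}}{match.group(0)}{right}")
    return lines


sample = (
    "Severe pain was always a possibility. Early in the century. "
    "Police are investigating the possibility that she was seen."
)

for line in kwic(sample, "possibility", width=20):
    print(line)
```

Sorting the resulting lines by the text to the right or left of the keyword groups similar patterns together, which is how recurring co-text such as “the possibility of” or “the possibility that” becomes visible.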
Once such a concordance has been created, the co-occurrences of other words with the KWIC can be analyzed. This is done by means of a t-score. If we take for example the word stranger (comparative adjective and noun), a t-score analysis starts from the frequency of each word in the corpus: words such as “no” and “to” are, not surprisingly, very frequent; a word such as “controversy” is much less so. It then takes the number of occurrences of that word together with the KWIC (the “joint frequency”) to determine whether the combination is unusually common, in other words, whether it occurs significantly more often than would be expected from the frequencies of the individual words alone. If so, the collocation is considered strong and is worth paying closer attention to.
In this example, “no stranger to” is a very frequent collocation; so are words such as “mysterious”, “handsome”, and “dark”. This comes as no surprise. More interesting, however, is “no stranger to controversy”. Perhaps the most interesting example, though, is the idiomatic “perfect stranger”. Such a word combination could not be predicted on its own, as it does not mean “a stranger who is perfect”, as we would expect. Its unusually high frequency shows that the two words collocate strongly and, as an expression, are highly idiomatic.
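The comparison of joint frequency with expected frequency can be made concrete. The sketch below uses one common formulation of the collocation t-score, (observed − expected) / √observed, where the expected frequency is the product of the two word frequencies divided by the corpus size; other formulations adjust for the size of the co-text window. The function name `t_score` is hypothetical, and all counts are invented for illustration and are not taken from any real corpus.

```python
import math


def t_score(joint_freq, node_freq, collocate_freq, corpus_size):
    """Collocation t-score in one common formulation (an assumption here).

    expected = node_freq * collocate_freq / corpus_size
    t        = (joint_freq - expected) / sqrt(joint_freq)
    """
    if joint_freq <= 0:
        raise ValueError("joint_freq must be positive")
    expected = node_freq * collocate_freq / corpus_size
    return (joint_freq - expected) / math.sqrt(joint_freq)


# Hypothetical counts for the node word "stranger" in a one-million-word corpus.
CORPUS_SIZE = 1_000_000
STRANGER_FREQ = 100

# collocate -> (collocate frequency in corpus, joint frequency with "stranger")
hypothetical_counts = {
    "no": (5000, 30),
    "controversy": (50, 8),
}

for word, (freq, joint) in hypothetical_counts.items():
    score = t_score(joint, STRANGER_FREQ, freq, CORPUS_SIZE)
    print(f"{word}: t = {score:.2f}")
```

A higher score indicates that the pair occurs together more often than the individual frequencies would predict. With this formulation, frequent pairs tend to score highest, which matches the observation that “no stranger to” ranks above rarer combinations.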
The study of corpus linguistics provides us with many insights into the real nature of language, as shown above. In essence, the lexicon seems to be built on the premise that language use is best approached as an assembly process, whereby the brain links together ready-made chunks. Intuitively this makes sense: it is a natural short-cut to alleviate the burden of having to “re-invent the wheel” every time we speak. Additionally, using well-known expressions conveys a great deal of information rapidly, as the listener does not need to break down an utterance into its constituent parts. In "Words and Rules", Steven Pinker shows this process at work with regular and irregular verbs: we collect the former, which provide us with rules we can apply to unknown words (for example, the ‑ed ending for past tense verbs allows us to inflect the neologism “to google” into “googled”). Other patterns, the irregular verbs, we store separately as unique items to be memorized.
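The division of labor between stored items and a general rule can be sketched as a lookup with a fallback. This is a toy illustration of the idea, not a model taken from Pinker's book: the small irregular table and the function name `past_tense` are assumptions, and the rule ignores spelling details such as consonant doubling and the change of final y to i.

```python
# Irregular forms are stored as individual memorized items (toy sample).
IRREGULAR_PAST = {"go": "went", "sing": "sang", "bring": "brought"}


def past_tense(verb):
    """Return a stored irregular form if one exists, else apply the -ed rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Regular rule: add -ed, or just -d when the verb already ends in e.
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


for verb in ["go", "walk", "google"]:
    print(verb, "->", past_tense(verb))
```

Because the rule is the fallback, a word that has never been encountered before, such as “google”, is handled without any new entry in the table.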