Re-exploring O. Henry’s Short Stories: A Corpus-Based Pilot Study
Abstract: This article applies a corpus-based method to a stylistic interpretation of O. Henry’s short story collection The Four Million. It shows that the overall statistics computed by corpus software provide a more detailed descriptive basis for widely accepted literary interpretations of his stories. With regard to story settings and general themes, the collocation and frequency information of recurrent sequences identifies valuable linguistic features that literary critics seem not to have noticed.
Key words: corpus; O. Henry; setting; theme
1. Introduction
O. Henry has been called the American Guy de Maupassant. Both authors wrote twist endings, but O. Henry’s stories are much more playful and optimistic. Previous studies of O. Henry’s short stories agree that his works are marked by surprise endings, the use of coincidence or chance to create humor, ingenious and exquisite plotting, smile-through-tears irony and so forth. Despite this detailed literary discussion, little work has been done on his linguistic style, and none has offered quantitative data as evidence. Given how well established the description of his style is, a corpus-based method may seem unlikely to find anything original. Nevertheless, the stylistic analysis in the present paper aims to illustrate the value of empirical corpus methods in exploring literary style. On the one hand, statistical data help to confirm the canonical view of O. Henry’s short stories; on the other hand, they tie stylistic claims to the linguistic features of his works.
2. Data and Methodology
This paper investigates the linguistic style of O. Henry’s works empirically, applying both quantitative and qualitative methods. The study uses two corpora. The first is built from O. Henry’s The Four Million, a collection published in 1906 whose short stories are set in New York City in the early years of the 20th century. The machine-readable version available on the internet (http://www.literaturepage.com/read/thefourmillion.html) is used to build a small working corpus for investigation. The second is the Brown corpus, which serves as the reference corpus.
The concordance software used in this study is WordSmith Tools, which supports detailed frequency analysis of concordance items and the extraction of collocational information. Using the software, words with significant keyness in The Four Million are identified first, and then concordance lines containing each keyword and its collocates are extracted. The corpus data are then processed statistically.
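The keyness step described above can be illustrated with a minimal sketch. The paper does not state which keyness statistic or thresholds were set in WordSmith Tools, so the code below assumes the log-likelihood statistic in its common two-term form; the function names and the `min_freq` cut-off are hypothetical and serve only as illustration.

```python
import math
from collections import Counter


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Two-term log-likelihood (G2) for one word in a study and a reference corpus."""
    combined = freq_study + freq_ref
    total = size_study + size_ref
    # Frequencies expected if the word were equally common in both corpora.
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    if freq_study > 0:
        g2 += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        g2 += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * g2


def keywords(study_tokens, ref_tokens, min_freq=2):
    """Rank words that are relatively more frequent in the study corpus.

    Both token lists are assumed to be non-empty; min_freq is an assumed cut-off.
    """
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, freq in study.items():
        if freq < min_freq:
            continue
        # Keep positive keywords only (over-represented in the study corpus).
        if freq / n_study <= ref[word] / n_ref:
            continue
        scored.append((word, log_likelihood(freq, n_study, ref[word], n_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Called with the tokenized working corpus as `study_tokens` and the tokenized reference corpus as `ref_tokens`, the function returns (word, score) pairs in descending order of keyness. A word with the same relative frequency in both corpora scores zero.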
3. Overall Statistics
Overall statistics are an essential starting point for a systematic corpus-based textual analysis. WordSmith Tools is used to compute the overall statistics of the two corpora, which are compared in Table 3.1.
Table 3.1 Comparison of Overall Statistics between the Two Corpora
text file                             Overall of Mini   Overall of Brown
tokens (running words) in text        52,770            1,390,505
types (distinct words)                8,251             47,146
type/token ratio (TTR)                16                4
standardised TTR                      46.97             39.07
standardised TTR basis                1,000.00          1,000.00
mean sentence length (in words)       15                23
mean word length (in characters)      4                 5
word length std.dev.                  2.24              2.52
In terms of the number of tokens, the corpus for the present study is small compared with Brown. However, the standardised type/token ratio (TTR) observed in the working corpus (46.97) is higher than that of Brown (39.07). The higher the ratio, the greater the lexical variation. The result is not surprising, since O. Henry was noted for the diversity of his expressions. The mean sentence length (15 words against 23 in Brown) shows that O. Henry wrote shorter sentences than those found in general written texts. In each of his short stories, several incidents and clues are condensed into merely 750 words or so. Naturally, shorter sentences deliver new information quickly, so readers need less time to follow the development of the dramatic plot. Fundamentally a product of his time, O. Henry’s work provides one of the best examples in English of capturing the entire flavor of an age. Whether roaming the cattle-lands of Texas, exploring the art of the ‘gentle grafter’, or investigating the tensions of class and wealth in turn-of-the-century New York, O. Henry had an inimitable hand for isolating some element of society and describing it with an incredible economy and grace of language.
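The two ratios in Table 3.1 can be sketched as follows. This is an approximation under stated assumptions, not WordSmith Tools’ own implementation: the tokenizer is a simple hypothetical regular expression, and the standardised TTR is taken to be the mean of the TTRs of consecutive 1,000-token chunks, with any incomplete final chunk ignored.

```python
import re


def tokenize(text):
    """Lower-case word tokens; apostrophes inside words are kept (assumed rule)."""
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())


def ttr(tokens):
    """Type/token ratio as a percentage."""
    return 100.0 * len(set(tokens)) / len(tokens)


def standardised_ttr(tokens, basis=1000):
    """Mean TTR over consecutive chunks of `basis` tokens.

    Returns None when the text is shorter than one full chunk.
    """
    chunks = [tokens[i:i + basis]
              for i in range(0, len(tokens) - basis + 1, basis)]
    if not chunks:
        return None
    return sum(ttr(chunk) for chunk in chunks) / len(chunks)
```

As a check against the table, 8,251 types over 52,770 tokens gives a raw TTR of about 15.6 per cent, which Table 3.1 reports rounded to 16. Because the raw ratio falls as a text grows longer, the standardised figure is the fairer basis for comparing a 52,770-token corpus with a much larger reference corpus.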
4. Keyword Analysis for Settings
Setting is an indispensable constituent of storytelling. It comprises several closely related aspects: the time at which the event or action takes place, the place where it takes place, and the social environment of the characters, that is, the manners, customs and moral values that govern their society. In the present study, information on the first two aspects is obtained by sorting the keywords of the texts, as Table 4.1 shows.
Table 4.1 Keywords Denoting Places and Time
Indoor places: room, restaurant, Bogle’s, Cypher’s, store, flat, door, window, dresser, table, skylight, floor, desk
Outdoor places: bench, street, corner, sidewalk, Broadway, avenue, park
Denoting time: evening, night, o’clock
It is striking that two of the three temporal keywords, ‘evening’ and ‘night’, refer explicitly to the time at or after sunset. More interestingly, almost all the occurrences of ‘o’clock’ indicate a time between late afternoon and late night. The following concordance lines provide evidence.
1. s were few. The time was barely 10 o’clock at night, but chilly gusts of wind
2. ey Donovan’s paper-box girl. At 10 o’clock the jolly round face of "Big Mike"
3. heatres are stupid, anyway." At 11 o’clock that night somebody tapped lightly
4. ant to overshadow her friend. At 4 o’clock on the afternoon of the third day M
5. upper windows lighted? Well, at 6 o’clock I stood in that house with the youn
6. a strike-breaker’s motor car. At 6 o’clock the waiter brought her dinner and c
7. e to thinking. One evening about 6 o’clock my mistress ordered him to get busy
8. lar and eighty-seven cents?" At 7 o’clock the coffee was made and the frying-
9. ging mistress. It was most times 7 o’clock when he returned in the evening. At
10. coddled, praised and kissed at 7 o’clock. Art is an engaging mistress. It wa
11. We were married last evening at 8 o’clock in the Little Church Around the Cor
12. ew samples this morning. It’s 9.45 o’clock, and not a single picture hat or pi
13. g happiness to your son." At eight o’clock the next evening Aunt Ellen took a
14. by we moved out, for ’twas eleven o’clock, and stands a bit upon the sidewalk
15. d you burn your hand, Dele?" "Five o’clock, I think," said Dele, plaintively.
16. be happening?" asked Ikey. "Nine o’clock," said Mr. McGowan. "Supper’s at se
17. ook to make him think it!" At nine o’clock Dulcie took a tin box of crackers a
18. he announced. "It was exactly ten o’clock when we parted here at the restaura
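Concordance lines of this kind, with the node word centred and a fixed number of characters on either side, can be produced with a short key-word-in-context (KWIC) routine. The sketch below is a hypothetical stand-in for the concordancer in WordSmith Tools; the function name `kwic` and the default context width are assumptions.

```python
import re


def kwic(text, node, width=35):
    """Return concordance lines with `width` characters of context on each side."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width):match.start()]
        right = flat[match.end():match.end() + width]
        # Right-align the left context so the node word lines up in one column.
        lines.append(f"{left:>{width}}{match.group(0)}{right}")
    return lines
```

Cutting the context at a fixed character count is what leaves truncated words at both edges of each line. The apostrophe in the node passed to `kwic` must match the one used in the corpus file (straight or curly).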
Further inspection of the files in the corpus leads to the conclusion that all stories except one take place between dusk and late night. Is this a coincidence? Perhaps in the daytime the common people of New York were too busy earning their livelihood. Only after work or after supper did they have time for social activities; words such as ‘dinner’, ‘ball’, ‘café’, ‘park’ and ‘play’ appear in the context of the three temporal keywords. This reflects the social and cultural context of America in the early 1900s. People, whether poor or rich, liked to take a cup of coffee in a café or have a drink at a pub. Young girls dreamed of dancing with their beloved partners at the balls of the lower classes.
The noun keywords for the physical places where the events take place fall roughly into two groups: indoor places and outdoor places. Unexpected incidents might happen in a ‘room’, ‘restaurant’, ‘flat’ or ‘store’, on the second ‘floor’, or even near a ‘dresser’, ‘table’ or ‘desk’. These are such common, everyday places that one would not expect any conspicuous incident to happen there. However, the author deliberately set his unusual stories in quite ordinary surroundings. Other adventures might unfold on the ‘street’, at a ‘corner’, on the ‘sidewalk’, on Second ‘Avenue’, or even on a ‘bench’ in a little ‘park’. In O. Henry’s stories there are no places of imposing scale or unique style, but the places match the stories of common people perfectly.
Whether indoors, in a ‘restaurant’, ‘store’ or ‘flat’, or outdoors, on a ‘street’, ‘corner’ or ‘sidewalk’, these places hold unknown possibilities for the events to come. It is in these inconspicuous, ordinary places that unexpected coincidences keep occurring. O. Henry is good at writing about ordinary people in everyday circumstances. He is quoted as once saying, ‘There are stories in everything. I’ve got some of my best yarns from park benches, lampposts and newspaper stands.’
5. Exploring Themes by Noun Keywords
A range of frequent noun keywords helps to identify the major themes in O. Henry’s works. To investigate what these noun keywords reveal, it is necessary to categorize them further according to their denotations. Table 5.1 shows the categorization.
Table 5.1 Frequent Noun Keywords in the Working Corpus
Denoting people
  Genders: man, lady, girl, Miss, Mr., Missis
  Professions: policeman, palmist, waiter, broker, lumberman, typewriter
  Characteristics: cosmopolite, friend, longnecker, prince, gentleman
Denoting things: cab, fare, hat, cents, twenty, odour, coat, cigar, dollars, dandelions, two, doggie, card, clock, cards, umbrella, money, art, adventure, hop, leaf, Rubberneck, dinner
Noun keywords expose explicit topics rather than the underlying themes of the texts. As Table 5.1 shows, many of the nouns denote people, distinguishing the characters’ genders or professions or describing their special traits. Notably, such professions were usually taken up by people in the middle or lower social classes. Only a few of them were lucky enough to make their fortune and control their own lives, whereas most, who barely earned a living, were controlled by their miserable fates. Most of O. Henry’s stories take place in New York City and deal for the most part with ordinary people: clerks, policemen and waitresses. The author’s rich life experience and the pressures he endured were the source for depicting such a variety of professions.
Another, larger group of noun keywords denotes things and is scattered across different stories. These keywords are closely tied to the theme of each individual story, so it is difficult to generalize from them. However, the concordance lines of the numerals ‘twenty’ and ‘two’ reveal distinctive collocational sequences that deserve attention. Both frequently co-occur with temporal nouns. In the case of ‘twenty’, more than 40 per cent of its occurrences (19 out of 42) are accompanied by temporal words such as ‘years’, ‘minutes’, ‘hours’, ‘days’ and ‘seconds’. The word ‘two’ also tends to appear with temporal nouns such as ‘weeks’, ‘hours’, ‘days’, ‘years’ and ‘minutes’. Such sequences of numerals and temporal nouns indicate that most of O. Henry’s short stories unfold over a short period. Some events take weeks or days to develop, and other anecdotes come to an end in a moment. The longest passage of time turns out to be ‘twenty years’, which is tied entirely to the story ‘After Twenty Years’. This reflects another realistic feature: in his stories events usually happen at a specific time. Such concrete description of time shortens the distance between the characters inside the story and the readers outside.
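The share of a numeral’s occurrences that are followed by a temporal noun can be counted with a few lines of code. The sketch below is illustrative only: the paper does not state the collocation span used, so a span of one word to the right is assumed, and the list of temporal nouns (including the singular forms) and the function name are hypothetical.

```python
import re

# Assumed list of temporal nouns; the singular forms are added for illustration.
TEMPORAL = {"second", "seconds", "minute", "minutes", "hour", "hours",
            "day", "days", "week", "weeks", "year", "years"}


def temporal_share(text, numeral, span=1):
    """Count occurrences of `numeral` and how many have a temporal noun
    within `span` tokens to the right. Returns (hits, total)."""
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    hits = total = 0
    for i, token in enumerate(tokens):
        if token != numeral:
            continue
        total += 1
        if any(t in TEMPORAL for t in tokens[i + 1:i + 1 + span]):
            hits += 1
    return hits, total
```

For the figures reported above, 19 hits out of 42 occurrences of ‘twenty’ correspond to roughly 45 per cent.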
There is true love in his stories. ‘Love’ stands out as a keyword in praise of the touching love between destitute couples. In both ‘The Gift of the Magi’ and ‘A Service of Love’, a couple in humble circumstances help and love each other unselfishly and wholeheartedly. Their perfect and innocent love, bound by the fetters of money, seems all the more impressive and moving.
6. Conclusion
Using a corpus-based method, this paper has explored the style of one of O. Henry’s collections of short stories. As Stubbs (2005: 21-22) proposes, ‘… observational data can provide more systematic evidence for unavoidable subjective interpretation’. Unlike previous studies of O. Henry’s works, the present paper has attempted to delay interpretation and prolong analysis, giving weight to quantitative investigation. The keyword and concordance functions of WordSmith Tools help to identify or reconfirm some distinctive features of O. Henry’s works. The paper has come to the following conclusions:
(1) The higher standardised TTR in the working corpus is consistent with O. Henry’s lexical diversity and witty narration. The shorter mean sentence length suggests that his stories progress rapidly and show a rigid economy of words; the stories are customarily short, most running to about eight hundred words.
(2) The frequent noun keywords give a clue to the settings of O. Henry’s short stories. The temporal keywords show that all but one story take place in the evening and that the time span is short. Some incidents even unfold within a few hours. The location keywords denote ordinary places in a city where, through the author’s witty arrangement, something accidental might occur.
(3) Another group of noun keywords casts light on the general themes of O. Henry’s short stories. Some identify people by their genders, professions and characteristics. Others identify things that carry information about the themes of individual stories.
This is a tentative corpus-based stylistic analysis of one of O. Henry’s collections of short stories. The basic statistical information computed by WordSmith Tools can help to reassess subjective assertions in stylistic criticism. Further corpus-based study is expected to explore his style in terms of perspective, characterization and language features. Moreover, the corpus-based method can be introduced into literature classes to help students identify the distinctive stylistic features of an author.