This study aims to show that Hyunbae Choi’s 『Frequency Research of Korean Vocabulary Use』 (1956) is a valuable achievement, carried out long before the emergence of modern computer- and corpus-based Korean language informatics.
First, we analyzed and described the structure and characteristics of the vocabulary list, the vocabulary classification system, and the methods used to distinguish meanings in 『Frequency Research of Korean Vocabulary Use』 (1956). Based on these results, we quantitatively analyzed the characteristics of the vocabulary of the 1950s. To compare the quantitative features of the 1950s vocabulary represented by this data with those of written and spoken Korean at the end of the 20th century, we used the three corpora of the ‘New Yonsei Corpus (nYsc)’: one written language corpus and two spoken language corpora, each containing one million words.
『Frequency Research of Korean Vocabulary Use』 (1956) and New Yonsei Corpus 1 (nYsc1) both represent written language, while New Yonsei Corpus 2 (nYsc2) and New Yonsei Corpus 3 (nYsc3) are both spoken language corpora.
Using these data sets, we analyzed the quantitative characteristics of Korean from two perspectives: the distribution and composition of words by part of speech, and the distribution by etymology (word origin type).
Comparing the distribution of vocabulary by part of speech in terms of the number of words and its ratio, we found remarkable differences in nouns, pronouns, adverbs, and interjections. In the written language data, 『Frequency Research of Korean Vocabulary Use』 (1956) and nYsc1, the proportion of nouns was higher than in the spoken language corpora (nYsc2, nYsc3). By contrast, the ratios of pronouns, adverbs, and interjections were higher in the spoken language corpora from the end of the 20th century (nYsc2, nYsc3) than in the written language data (1956, nYsc1).
In terms of the sum of frequencies and its ratio by part of speech, large differences are again found in nouns, pronouns, adverbs, and interjections. The pattern resembles that for the number of words, but the differences are much larger. A new feature that emerges from the sum of frequencies is that the ratio of particles is higher in written language (1956, nYsc1) than in spoken language. As for the etymological distribution, in terms of the number of words and its ratio, the ratio of Chinese words was significantly higher than that of Korean or foreign words. The ratio of Chinese words was higher in written language than in spoken language, and the ratio of Korean words was higher in spoken language than in written language.
However, the opposite pattern appears in terms of the sum of frequencies and its ratio: the relationship between Korean and Chinese words by etymology is reversed. In terms of the number of words alone, Chinese words are almost twice as numerous as Korean words, but in terms of the sum of frequencies of actual use, Korean words appear at a rate three times that of Chinese words.
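The distinction between the ratio by number of words and the ratio by sum of frequencies can be illustrated with a minimal sketch. The entries below are invented toy values chosen only to mimic the reported pattern; they are not figures from the 1956 list or from nYsc, and the function name and category labels are hypothetical.

```python
from collections import defaultdict

def type_and_token_ratios(entries):
    """Return {category: (ratio by number of words, ratio by sum of frequencies)}.

    entries: iterable of (word, category, frequency) tuples.
    """
    word_counts = defaultdict(int)   # number of distinct words per category
    freq_sums = defaultdict(int)     # summed frequency of use per category
    for _word, category, freq in entries:
        word_counts[category] += 1
        freq_sums[category] += freq
    total_words = sum(word_counts.values())
    total_freq = sum(freq_sums.values())
    return {
        cat: (word_counts[cat] / total_words, freq_sums[cat] / total_freq)
        for cat in word_counts
    }

# Invented toy data (NOT from the study): a few very frequent Korean words
# and a larger number of less frequent Chinese words.
toy_entries = [
    ("w1", "korean", 90),
    ("w2", "korean", 60),
    ("w3", "chinese", 20),
    ("w4", "chinese", 15),
    ("w5", "chinese", 10),
    ("w6", "chinese", 5),
]

ratios = type_and_token_ratios(toy_entries)
for category, (word_ratio, freq_ratio) in ratios.items():
    print(f"{category}: words {word_ratio:.2f}, frequency {freq_ratio:.2f}")
```

In this toy list, Chinese words are twice as numerous as Korean words, yet Korean words account for three times as much of the total frequency, which is the kind of reversal the comparison describes.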
Another remarkable result is that the difference between written and spoken language is very clear: in the spoken language corpora, Korean words appear at a much higher rate than in the written language, whether measured by the number of words or by the sum of frequencies.
Through this analysis, we reaffirm the value of 『Frequency Research of Korean Vocabulary Use』 (1956) for Korean language informatics, as well as the importance of analyzing the characteristics of 1950s Korean reflected in this data.