KCI-indexed

Correlation plot for a contingency table

Chong Sun Hong, Tae Gyu Oh
  • Publisher: Korean Statistical Society
  • Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3
  • Type: Serial publication
  • Date: May 2021
  • Pages: 295-305 (11 pages)


Table of contents

1. Introduction
2. Correlation coefficients for a contingency table
3. Correlation plot for the correlation coefficients matrix
4. Correlation plot for an illustrated example
5. Correlation plot for high-dimensional contingency tables
6. Conclusions
References

Abstract

Most graphical representation methods for two-dimensional contingency tables are based on frequencies, probabilities, association measures, or goodness-of-fit statistics. In this work, a method is proposed to represent the correlation coefficients for each of the two selected levels of the row and column variables. From these correlation coefficients, one can obtain a matrix of vectors whose angles correspond to the individual cells. These vectors are then drawn on a unit circle according to their angles. The resulting display is called a CC plot, a correlation plot for a contingency table. When the CC plot is used alongside other graphical methods and statistical models, more advanced analyses can be derived, including the relationships among the cells of the row or column variables.
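
The abstract does not give the formulas, so the following is only a rough sketch of how a correlation could be turned into an angle on a unit circle. It assumes, purely for illustration, that the correlation for cell (i, j) is the phi coefficient between the indicator of row level i and the indicator of column level j, and that the angle is the arccosine of that correlation. The function name `cc_angles` is hypothetical and the paper's actual definition may differ.

```python
import math


def cc_angles(table):
    """Return (correlation, angle in degrees) for every cell of a two-way table.

    Assumption (not from the paper): the correlation for cell (i, j) is the
    phi coefficient between the indicator of row level i and the indicator of
    column level j. Every row and column margin must lie strictly between
    0 and 1, otherwise the standard deviation below is zero.
    """
    n = sum(sum(row) for row in table)
    row_p = [sum(row) / n for row in table]
    col_p = [sum(row[j] for row in table) / n for j in range(len(table[0]))]

    result = []
    for i, row in enumerate(table):
        result_row = []
        for j, count in enumerate(row):
            # Covariance and standard deviations of the two 0/1 indicators.
            cov = count / n - row_p[i] * col_p[j]
            sd = math.sqrt(row_p[i] * (1 - row_p[i]) * col_p[j] * (1 - col_p[j]))
            # Clamp to [-1, 1] to guard against floating-point overshoot.
            r = max(-1.0, min(1.0, cov / sd))
            # The vector (cos(angle), sin(angle)) lies on the unit circle.
            result_row.append((r, math.degrees(math.acos(r))))
        result.append(result_row)
    return result
```

Under this reading, a cell whose row and column levels are independent sits at 90 degrees, a perfectly positively associated cell at 0 degrees, and a perfectly negatively associated cell at 180 degrees.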

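Publication information

  • Field: Natural sciences > Statistics
  • Indexing: KCI-indexed
  • Frequency: Bimonthly
  • ISSN: 2287-7843
  • eISSN: 2383-4757
  • Type: Academic journal
  • Format: Serial publication
  • Coverage: 1994-2021
  • : 1950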



Articles in the latest issue: Vol. 28, No. 4 (July 2021)

KCI-indexed

1. Generalized Bayes estimation for a SAR model with linear restrictions binding the coefficients

Authors: Anoop Chaturvedi, Sandeep Mishra

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 4; Year: 2021; Pages: pp. 315-327 (13 pages)

Abstract

Spatial autoregressive (SAR) models have drawn considerable attention in the recent econometrics literature because of their ability to model spatial spillovers in a feasible way. In the Bayesian analysis of these models, one may face a lack of robustness with respect to the underlying prior assumptions. Generalized Bayes estimators provide a viable way to incorporate prior belief and are more robust with respect to those assumptions. The present paper considers the SAR model with a set of linear restrictions binding the regression coefficients and derives a restricted generalized Bayes estimator for the coefficient vector. The minimaxity of the restricted generalized Bayes estimator is established. A simulation study demonstrates that the estimator dominates the restricted least squares estimator as well as the restricted Stein-rule estimators.

KCI-indexed

2. Is it possible to forecast KOSPI direction using deep learning methods?

Authors: Songa Choi, Jongwoo Song

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 4; Year: 2021; Pages: pp. 329-338 (10 pages)

Abstract

Deep learning methods have been applied in various fields and have shown outstanding performance in many cases. Many studies have used deep learning methods to predict daily stock returns, a classic example of time-series data. We also applied deep learning methods to Korea's stock market data. We used Korea's stock market index (KOSPI) and several individual stocks to forecast daily returns and directions. We compared several deep learning models with other machine learning methods, including random forest and XGBoost. In regression, the long short-term memory (LSTM) and gated recurrent unit (GRU) models are better than the other prediction models. For the classification applications, there is no clear winner. However, even the best deep learning models cannot predict significantly better than the simple base model. We believe that it is challenging to predict daily stock return data even with the latest deep learning methods.

KCI-indexed

3. Identification of risk factors and development of the nomogram for delirium

Authors: Min-seok Shin, Ji-eun Jang, Jea-young Lee

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 4; Year: 2021; Pages: pp. 339-350 (12 pages)

Abstract

In medical research, the risk factors associated with human diseases need to be identified to predict the incidence rate and determine the treatment plan. Logistic regression analysis is primarily used to select risk factors. However, individuals who are unfamiliar with statistical output have trouble using these methods. In this study, we develop a nomogram that graphically represents the numerical association between the disease and its risk factors in order to identify the risk factors for delirium and to interpret and use the results more effectively. Using the logistic regression model, we identify risk factors related to delirium, construct a nomogram, and predict incidence rates. Additionally, we verify the developed nomogram using a receiver operating characteristic (ROC) curve and a calibration plot. Nursing home, stroke/epilepsy, metabolic abnormality, hemodynamic instability, and analgesics were selected as risk factors. In validation, the nomogram achieved statistically significant AUC values of 0.893 on the training set and 0.717 on the test set. The calibration plot gave a coefficient of determination of 0.820. Using the nomogram developed in this paper, health professionals can easily predict the incidence rate of delirium for individual patients. Based on this information, the nomogram could serve as a useful tool for establishing an individual's treatment plan.

KCI-indexed

4. An alternative method for estimating lognormal means

Author: Yeil Kwon

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 4; Year: 2021; Pages: pp. 351-368 (18 pages)

Abstract

For probabilistic models of positively skewed data, the lognormal distribution is one of the key distributions. Lognormal models can be found in various areas, such as medical science, engineering, and finance. In this paper, we propose a new estimator for a lognormal mean and evaluate its performance in terms of the relative mean squared error (RMSE) against Shen's estimator (Shen et al., 2006), which is considered the best of the existing methods. The proposed estimator includes a tuning parameter. By finding the optimal value of the tuning parameter, we can improve the average performance of the proposed estimator over the typical range of σ². The bias reduction of the proposed estimator tends to exceed the increase in variance, which results in a smaller RMSE than Shen's estimator. A numerical study reveals that the proposed estimator performs comparably to Shen's estimator when σ² is small and exhibits a meaningful decrease in the RMSE for moderate and large values of σ².

KCI-indexed

5. vlda: An R package for statistical visualization of multidimensional longitudinal data

Authors: Bo-hui Lee, Seongwon Ryu, Yong-seok Choi

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 4; Year: 2021; Pages: pp. 369-391 (23 pages)

Abstract

vlda is an R (R Development Core team et al., 2011) package that provides functions for visualizing multidimensional longitudinal data. In particular, the package was developed to help produce a plot that more effectively expresses changes over time for two different data layouts (long format and wide format), and it uses a consistent calling scheme for longitudinal data. Its main features allow us to identify the relationship between categories and objects using an indicator matrix with object information, as well as to cluster objects. The package can be used to understand trends in observations over time in addition to identifying relative relationships at a simple visualization level. It also offers a new interactive implementation for additional interpretation, which makes it useful for the visual analysis of longitudinal data. Because the existing VLDA plot and the interactive features complement each other, the user can examine the visual aspects of the VLDA plot layout in finer detail. Furthermore, the package allows supplementary information (supplementary objects and variables), which often occurs in longitudinal data, to be projected onto the graphs. In this study, practical examples are provided to illustrate the implemented methods in real applications.

KCI-indexed

6. On the models for the distribution of examination score for projecting the demand for Korean Long-Term Care Insurance

Authors: Sophia Nicole Javal, Hyuk-sung Kwon

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 4; Year: 2021; Pages: pp. 393-409 (17 pages)

Abstract

The Korean Long-Term Care Insurance (K-LTCI) provides financial support for long-term care services to people who need various types of assistance with daily activities. As the number of elderly people in Korea is expected to increase, the demand for long-term care insurance will also increase over time. Projecting future expenditure on K-LTCI depends on the number of beneficiaries in each grade of the K-LTCI grading system, which is based on the test scores of applicants. This study investigated the suitability of mixture distributions for modeling the K-LTCI score distribution using recent empirical data on K-LTCI provided by the National Health Insurance Service (NHIS). Based on the developed mixture models, the number of beneficiaries in each grade and its variability under the current grading system were estimated by simulation. It was observed that a mixture model is suitable for the K-LTCI score distribution and may prove useful in devising a funding plan for K-LTCI benefit payments and in investigating the effects of any possible revision of the K-LTCI grading system.

Other articles in the same issue

KCI-indexed

1. Predicting movie audience with stacked generalization by combining machine learning algorithms

Authors: Junghoon Park, Changwon Lim

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 217-232 (16 pages)

Abstract

The Korean film industry has matured, and the number of movies watched per capita has reached the highest level in the world. Since then, the growth rate of the movie industry has been decreasing, and total movie sales per year even decreased slightly in 2018. The number of moviegoers is the primary driver of sales in the movie industry and also an important factor influencing additional sales. Thus it is important to predict the number of movie audiences. In this study, we predict the cumulative audience of films using stacking, an ensemble method that combines all the algorithms used in the prediction. We use box office data from the Korea Film Council and web comment data from Daum Movie (www.movie.daum.net). This paper describes the process of collecting and preprocessing the explanatory variables and explains the regression models used in stacking. The final stacking model shows superior prediction on the test set in terms of RMSE.

KCI-indexed

2. Comparison of accuracy between LC model and 4-PFM when COVID-19 impacts mortality structure

Author: Janghoon Choi

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 233-250 (18 pages)

Abstract

This paper examines whether the accuracy of mortality models (the LC model vs. the 4-parametric factor model) deteriorates when the mortality structure changes due to the impact of COVID-19. The LC model (LCM) uses dimension reduction to fit the log mortality matrix, so the dimension reduction method may perform poorly when the matrix structure changes. The 4-parametric factor model (4-PFM), on the other hand, is designed to use factors to fit log mortality data by age group, so it should be less affected by a change in the mortality structure. In fact, the forecast accuracy of LCM is better than that of 4-PFM when life tables are used, whereas that of 4-PFM is better when the mortality structure changes. This result shows that the performance of 4-PFM is more reliable under structural changes in mortality. To support the accuracy changes of LCM, the functional aspect is explained by computing the eigenvalues produced by singular value decomposition.

KCI-indexed

3. Semi closed-form pricing autocallable ELS using Brownian Bridge

Authors: Minha Lee, Jimin Hong

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 251-265 (15 pages)

Abstract

This paper discusses the pricing of autocallable structured products with a knock-in (KI) feature using the exit probability obtained with the Brownian Bridge technique. The explicit pricing formula for autocallable ELS derived in the existing paper handles the part involving the minimum of the Brownian motion using the inclusion-exclusion principle. Its disadvantage is that the pricing formula is complicated because of the probability involving the minimum value, and the computational volume increases dramatically as the number of autocall chances increases. To solve this problem, we applied an efficient and robust simulation method called the Brownian Bridge technique, which provides the probability of touching a predetermined barrier when the initial and terminal values of a process following Brownian motion over a certain interval are specified. We rewrite the existing pricing formula and provide a brief theoretical background and computational algorithm for the technique. We also provide several numerical examples computed in three different ways: the explicit pricing formula, the crude Monte Carlo simulation method, and the Brownian Bridge technique.
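
The barrier-touching probability described above has a well-known closed form for a Brownian motion with constant volatility pinned at both ends of an interval. The following is a minimal sketch under stated assumptions: a lower barrier, values expressed in the coordinates where the process is a Brownian motion (log prices for a geometric Brownian motion), and a constant volatility. The function name and parameter names are illustrative and are not taken from the paper.

```python
import math


def bridge_hit_probability(x_start, x_end, barrier, sigma, dt):
    """Probability that a Brownian path touches a lower barrier within dt.

    The path has constant volatility `sigma` and is conditioned to start at
    `x_start` and end at `x_end` over an interval of length `dt`. If either
    endpoint is already at or below the barrier, the barrier has been touched.
    """
    if x_start <= barrier or x_end <= barrier:
        return 1.0
    # Reflection-principle result for a Brownian bridge; the drift cancels
    # once both endpoints are fixed.
    exponent = -2.0 * (x_start - barrier) * (x_end - barrier) / (sigma ** 2 * dt)
    return math.exp(exponent)
```

In a Monte Carlo pricer, a probability of this kind can be evaluated between consecutive observation dates instead of simulating the path on a fine grid, which is one way to read the efficiency claim in the abstract.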

KCI-indexed

4. Fused inverse regression with multi-dimensional responses

Authors: Youyoung Cho, Hyoseon Han, Jae Keun Yoo

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 267-279 (13 pages)

Abstract

Regression with multi-dimensional responses is quite common in the so-called big data era. In such regression, to relieve the curse of dimensionality caused by high-dimensional responses, dimension reduction of the predictors is essential in the analysis. Sufficient dimension reduction provides effective tools for the reduction, but there are few sufficient dimension reduction methodologies for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the number of clusters or slices and improve the estimation results over existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. Real data analysis confirms the practical usefulness of the proposed methods.

KCI-indexed

5. Stable activation-based regression with localizing property

Authors: Jae-Kyung Shin, Jae-Hwan Jhong, Ja-Yong Koo

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 281-294 (14 pages)

Abstract

In this paper, we propose an adaptive regression method based on a single-layer neural network structure. We adopt a symmetric activation function as the units of the structure. The activation function has a flexible, parametrized form and a localizing property that is useful for improving the quality of estimation. To provide a spatially adaptive estimator, we regularize the coefficients of the activation functions via ℓ1-penalization, which removes the activation functions regarded as unnecessary. In implementation, an efficient coordinate descent algorithm is applied to the proposed estimator. To obtain stable estimation results, we present an initialization scheme suited to our structure. A model selection procedure based on the Akaike information criterion is described. The simulation results show that the proposed estimator performs favorably in relation to existing methods and recovers the local structure of the underlying function based on the sample.
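
To illustrate how ℓ1-penalization combined with coordinate descent removes unnecessary terms, here is a minimal sketch for the generic linear case. It is the textbook cyclic coordinate descent for the objective 0.5 * ||y - Xb||² + lam * ||b||₁, not the paper's algorithm for activation functions. The function names are hypothetical, and every column of `X` is assumed to be nonzero.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses. Assumes no column of X is
    entirely zero, since each update divides by the column's sum of squares.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # sum of squares of column j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            # Exact minimizer of the objective in coordinate j.
            beta[j] = soft_threshold(rho, lam) / z
    return beta
```

A coefficient whose partial correlation `rho` falls within `[-lam, lam]` is set exactly to zero, which is the mechanism by which the penalty drops terms from the fit.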

KCI-indexed

6. Correlation plot for a contingency table

Authors: Chong Sun Hong, Tae Gyu Oh

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 295-305 (11 pages)

KCI-indexed

7. Non-identifiability and testability of missing mechanisms in incomplete two-way contingency tables

Authors: Yousung Park, Seung Mo Oh, Tae Yeon Kwon

Publisher: Korean Statistical Society; Journal: CSAM (Communications for Statistical Applications and Methods), Vol. 28, No. 3; Year: 2021; Pages: pp. 307-314 (8 pages)

Abstract

We showed that any missing mechanism is reproduced by EMAR or MNAR with equal fit for the observed likelihood if there are non-negative solutions of the maximum likelihood equations. This is a generalization of Molenberghs et al. (2008) and Jeon et al. (2019). Nonetheless, as MCAR becomes a nested model of MNAR, a natural question is whether MNAR and MCAR are testable using the three well-known statistics: the likelihood ratio (LR), Wald, and score test statistics. Through simulation studies, we compared these three statistics. We investigated to what extent the boundary solution affects testing MCAR against MNAR, which is the only testable pair of missing mechanisms based on the observed likelihood. We showed that all three statistics are useful as long as the boundary proximity is far from 1.
