A prediction of runs using big data analysis and Markov chain in baseball game
This paper presents a new model for predicting the number of runs scored in a baseball game, based on big data analysis and a Markov chain. A database model was first designed to manage the large volume of baseball game data systematically. MapReduce in the Hadoop framework, a technique widely used in big data analysis, was used to store and manage the game score records, which consist of unstructured text. To configure the Markov chain-based prediction model efficiently, the probabilities of advancing and hitting were redefined so that they simulate real game situations more accurately. From these probabilities we obtained the run distribution and the number of batters for each inning, and constructed a Markov chain model that predicts the runs scored in each game. A t-test was used to verify the difference in the advancing and hitting probabilities between right- and left-handed pitchers, and a prediction model reflecting the characteristics of each was constructed. Real game data from Korean professional baseball were used to demonstrate the efficiency of the proposed model experimentally, including the variant that applies separate advancing and hitting probabilities for left- and right-handed pitchers. The proposed model is expected to be useful for setting strategy, such as deciding the batting order, and for improving game performance through efficient prediction of runs scored and winning odds.
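The abstract does not give the paper's transition probabilities or its redefined advancing rules, so the following is only a minimal sketch of how a Markov chain over (outs, bases) states can yield a per-inning run distribution. The event probabilities in EVENT_PROBS, the function names, and the advancing rule (every runner moves up by the size of the hit; a walk moves only forced runners) are all illustrative assumptions, not the authors' values or method.

```python
from collections import defaultdict

# Hypothetical per-plate-appearance event probabilities (illustrative only,
# not taken from the paper's data). They must sum to 1.
EVENT_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.16,
               "double": 0.05, "triple": 0.005, "homer": 0.025}

HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def advance(bases, event):
    """Return (new_bases, runs_scored) for a non-out event.

    bases is a 3-bit mask: bit 0 = first, bit 1 = second, bit 2 = third.
    Assumed simplified rule: on a hit, the batter and every runner move up
    by the number of bases of the hit; on a walk, only forced runners move.
    """
    if event == "walk":
        if not bases & 1:
            return bases | 1, 0      # first base was empty
        if not bases & 2:
            return bases | 3, 0      # runner forced from first to second
        if not bases & 4:
            return 7, 0              # runners forced up to load the bases
        return 7, 1                  # bases loaded: runner on third scores
    # Put the batter at position 0 and runners at positions 1..3, then shift.
    moved = ((bases << 1) | 1) << HIT_BASES[event]
    runs = bin(moved >> 4).count("1")    # positions 4 and above have scored
    return (moved >> 1) & 7, runs


def inning_run_distribution(probs=EVENT_PROBS, tol=1e-12):
    """Propagate the chain batter by batter until three outs are made."""
    live = {(0, 0, 0): 1.0}          # (outs, bases, runs) -> probability
    final = defaultdict(float)       # runs -> probability at end of inning
    while sum(live.values()) > tol:
        nxt = defaultdict(float)
        for (outs, bases, runs), p in live.items():
            for event, q in probs.items():
                if event == "out":
                    if outs == 2:
                        final[runs] += p * q          # absorbing state
                    else:
                        nxt[(outs + 1, bases, runs)] += p * q
                else:
                    new_bases, scored = advance(bases, event)
                    nxt[(outs, new_bases, runs + scored)] += p * q
        live = nxt
    return dict(final)


dist = inning_run_distribution()
expected_runs = sum(r * p for r, p in dist.items())
print(f"P(0 runs) = {dist[0]:.3f}, expected runs per inning = {expected_runs:.3f}")
```

Under these assumptions, a pitcher-handedness variant would amount to calling inning_run_distribution with a different probability table for left- and right-handed pitchers, and a game-level distribution could be approximated by convolving the per-inning distributions if innings are treated as independent.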