from statistical theory, which is suitable for solving
small-sample, nonlinear and high-dimensional
pattern recognition problems (Li & Wang, 2018).
Machine learning was applied in the domestic
financial field more than 10 years ago. Wang Dong
(Wang, 2007) used an SVM model and a BP neural
network to predict the SSE 50 Index. The results
showed that the SVM model had a smaller prediction
deviation than the BP model and a higher directional
prediction accuracy. Since then, scholars have
continued to explore and strengthen the application
of machine learning to stock forecasting. A study of
short-term price prediction for 10 stocks using an
SVM model with the RBF kernel function (Liu et al.,
2020) found that the SVM-based prediction model
achieves higher accuracy and a better prediction
effect than the original prediction algorithm.
The XGBoost model is a machine learning model
proposed by Chen Tianqi (Chen & Guestrin, 2016). It
is a decision tree algorithm based on the idea of
boosting, with the advantages of fast training, high
training accuracy, and resistance to overfitting.
Many scholars have applied the XGBoost model to the
financial field to test whether it can achieve better
results. Li Xiang (Li, 2017) applied the XGBoost
model to quantitative stock selection. The results
show that the designed quantitative stock selection
program can outperform the market return, and the
total return of the selected stock portfolio is 287%.
Huang and Xie (2018) compared the prediction effects
of neural network, SVM, and XGBoost models on
1-minute high-frequency financial data, using CSI 300
stock index futures as the research data. They found
that the predictive ability of the XGBoost model is
better than that of the traditional neural network
and SVM models. Li and Zhang (2019) applied the
XGBoost model to stock selection and constructed a
dynamically weighted multi-factor stock selection
strategy. The results
indicate that the model can improve the performance
of multi-factor stock selection strategies. Yan Wang
and Yuankai Guo (Wang & Guo, 2020) proposed the
DWT-ARIMA-GSXGB hybrid model, which improves on the
XGBoost model. They found that, compared with the
original model, the hybrid model has better
approximation ability and generalization ability in
stock price prediction. Yang Yang (Yang, 2021)
proposed a predictive model of stock trading
behaviour selection and hyperparameter optimization
based on the XGBoost model. The study found that the
model can effectively analyse attributes of different
dimensions and predict stock prices.
Stock index futures play an important role in risk
hedging in investment strategies. They serve the
functions of hedging, price discovery and investment
arbitrage. The CSI 300 Index covers a wide range of
stocks. Its price changes are mainly affected by
systemic risks, so it is easier to predict than
individual stocks. Changes in stock index prices play
an important guiding role in the investment of
individual stocks and futures. Therefore, it is of
great practical significance to make accurate
judgments on the rise and fall of stock indexes. This
paper uses the XGBoost model to predict the rise and
fall of CSI 300 Index Futures prices. Then, according
to the forecast results, an investment strategy is
constructed to trade the main CSI 300 contracts. In
the design of the prediction task, price movement is
classified into three cases, namely "rising",
"falling", and "fluctuating", rather than only rises
and falls. Empirical testing is then used to study
the prediction effect of the XGBoost model on these
three types of price movement in CSI 300 Index
Futures and the profitability of the constructed
investment strategy.
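The three-way labelling described above can be sketched as follows. This
is a minimal illustration, not the labelling rule used in this paper: the
function name `label_moves`, the use of next-period simple returns, and
the 0.1% threshold separating "fluctuating" from "rising" or "falling"
are all assumptions chosen for the example.

```python
from typing import List


def label_moves(prices: List[float], threshold: float = 0.001) -> List[str]:
    """Label each period by the simple return to the next price.

    Hypothetical rule (an assumption, not taken from the paper):
      return >  threshold  -> "rising"
      return < -threshold  -> "falling"
      otherwise            -> "fluctuating"
    """
    labels = []
    for prev, nxt in zip(prices, prices[1:]):
        ret = (nxt - prev) / prev  # simple return over one period
        if ret > threshold:
            labels.append("rising")
        elif ret < -threshold:
            labels.append("falling")
        else:
            labels.append("fluctuating")
    return labels


# Made-up prices for illustration only; not real CSI 300 futures data.
example_prices = [4000.0, 4010.0, 4009.0, 3990.0, 3990.5]
example_labels = label_moves(example_prices)
print(example_labels)
```

Labels of this kind could serve as the target of a three-class
classifier. The choice of threshold controls how many samples fall into
the "fluctuating" class and would need to be tuned on the data.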
The structure of this paper is as follows. Chapter 1
is the introduction. It briefly describes the
research background and status of applying machine
learning to the financial field, as well as the main
research content and structure of this paper.
Chapter 2 introduces the relevant theory of the
XGBoost model. Chapter 3 describes the construction
and optimization of the model and explains the
designed investment strategy. Chapter 4 uses
historical data to train the model, empirically tests
it in the actual financial market, and analyses the
experimental results. Chapter 5 summarizes the paper.
2 XGBoost MODEL
The XGBoost model follows the Boosting idea of
combining base classifiers sequentially. Each
iteration adds a new tree fitted in the direction of
the negative gradient of the loss from the previous
round of training. After training, the sum of the
scores from all trees is taken as the sample
prediction value. The goal of the algorithm is to
achieve good generalization ability while keeping the
error of the predicted value small. Because the
XGBoost model is an engineering implementation of the
GBDT decision tree algorithm, it is also an additive
model composed of multiple decision trees. When
each leaf node of the tree is split, the model
enumerates all different tree structures. It uses a