fulfilled. With regard to the first objective
(identifying the most effective feature set), we
extracted several groups of features, namely:
demographic features, “global” statistics computed
over the entire observation window, dynamic monthly
statistics for each month in the observation window,
a vector containing daily purchase amounts, and
another vector containing the number of purchases
made under each MCC. The best results were achieved
when utilising all the features except for the
demographic features.
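To make the structure of this feature set concrete, the following is a minimal sketch of how such a vector could be assembled for a single customer. It is an illustration under stated assumptions, not the implementation used in this study: the function name `build_features`, the `(day, amount, mcc)` tuple layout, the 30-day month, and the particular statistics chosen (count, total, mean, monthly total, monthly active days) are all hypothetical.

```python
from collections import Counter
from statistics import mean


def build_features(transactions, window_days, mcc_codes, days_per_month=30):
    """Build one feature vector for one customer (illustrative only).

    transactions: iterable of (day, amount, mcc) tuples, where day is a
                  0-based offset inside the observation window (assumed layout).
    window_days:  length of the observation window in days.
    mcc_codes:    ordered list of MCCs defining the MCC count vector.
    """
    daily = [0.0] * window_days   # daily purchase amounts
    mcc_counts = Counter()        # number of purchases per MCC
    amounts = []
    for day, amount, mcc in transactions:
        if 0 <= day < window_days:  # ignore purchases outside the window
            daily[day] += amount
            mcc_counts[mcc] += 1
            amounts.append(amount)

    # "Global" statistics over the whole observation window.
    global_stats = [
        float(len(amounts)),
        sum(amounts),
        mean(amounts) if amounts else 0.0,
    ]

    # Dynamic monthly statistics: total spend and active days per month.
    monthly = []
    for start in range(0, window_days, days_per_month):
        chunk = daily[start:start + days_per_month]
        monthly.append(sum(chunk))
        monthly.append(float(sum(1 for x in chunk if x > 0)))

    mcc_vector = [float(mcc_counts[code]) for code in mcc_codes]
    return global_stats + monthly + daily + mcc_vector
```

Demographic features are deliberately left out of the sketch, mirroring the best-performing configuration reported above. Shortening `window_days` yields the smaller observation windows considered under the third objective.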
We fulfilled the second objective (build a setup
that can model and predict customer churn) by
applying the Artificial Neural Network and the
Gradient Boosting Model to this problem. The GBM
classifier was the best-performing model in this
study, obtaining an AUROC score of 0.6927. In
addition, we observed that our learning framework
correctly identifies 70% of “Churners”, potentially
making it a suitable solution in Customer
Relationship Management.
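For readers unfamiliar with the two figures quoted here, the sketch below shows how an AUROC score and the fraction of correctly identified churners can be computed from predicted scores. It is a minimal illustration, not the evaluation code of this study: it assumes label 1 denotes a “Churner”, interprets the 70% figure as recall on the churner class, and uses a hypothetical decision threshold of 0.5. The function names `auroc` and `churner_recall` are likewise our own.

```python
def auroc(labels, scores):
    """AUROC via the rank (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen churner (label 1)
    receives a higher score than a randomly chosen non-churner (label 0),
    counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def churner_recall(labels, scores, threshold=0.5):
    """Fraction of true churners whose score reaches the threshold.

    The threshold value is an assumption made for illustration.
    """
    churner_scores = [s for y, s in zip(labels, scores) if y == 1]
    if not churner_scores:
        raise ValueError("no churners in labels")
    hits = sum(1 for s in churner_scores if s >= threshold)
    return hits / len(churner_scores)
```

The pairwise loop is quadratic in the number of customers and is meant for clarity only; a sort-based rank computation gives the same value on larger datasets.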
For the third objective (to determine the minimum
amount of customer activity needed in order to
predict a customer’s likelihood of churning),
experimental results show that decreasing the
observation window to one month does not
substantially affect the predictive performance of
the classifier, allowing a trade-off between
prediction accuracy and the amount of data observed.
For our final objective (attempting to handle the
cold-start problem using a customer’s demographic
features that can be made accessible upon
registration), we attempted to predict customer
churn using only demographic information and, over
time, to incorporate any new purchase data. This
experiment showed that, for the current dataset,
customer demographics alone (the customer’s age,
country and currency information) are nowhere near
sufficient to predict whether a newly registered
customer is going to churn in the coming month.
The work described in this paper can be further
improved by augmenting the constructed framework
with a tree-based model in order to extract
meaningful behavioural rules. These can be used to
capture the actual characteristics of churning
customers. Furthermore, having addressed the problem
of customer churn prediction, it now makes sense to
tackle the problem of predicting customers’ next
purchases. The
approaches performed in collaborative recommenda-
tion systems can be adopted and tweaked to our pur-