Research Article: Churn prediction of mobile and online casual games using play log data

Date Published: July 5, 2017

Publisher: Public Library of Science

Author(s): Seungwook Kim, Daeyoung Choi, Eunjung Lee, Wonjong Rhee, Wei-Xing Zhou.

http://doi.org/10.1371/journal.pone.0180735

Abstract

Internet-connected devices, especially mobile devices such as smartphones, have become widely accessible in the past decade. Interaction with such devices has evolved into frequent and short-duration usage, and this phenomenon has resulted in a pervasive popularity of casual games in the game sector. On the other hand, development of casual games has become easier than ever as a result of the advancement of development tools. With the resulting fierce competition, now both acquisition and retention of users are the prime concerns in the field. In this study, we focus on churn prediction of mobile and online casual games. While churn prediction and analysis can provide important insights and action cues on retention, its application using play log data has been primitive or very limited in the casual game area. Most of the existing methods cannot be applied to casual games because casual game players tend to churn very quickly and they do not pay periodic subscription fees. Therefore, we focus on the new players and formally define churn using observation period (OP) and churn prediction period (CP). Using the definition, we develop a standard churn analysis process for casual games. We cover essential topics such as pre-processing of raw data, feature engineering including feature analysis, churn prediction modeling using traditional machine learning algorithms (logistic regression, gradient boosting, and random forests) and two deep learning algorithms (CNN and LSTM), and sensitivity analysis for OP and CP. Play log data of three different casual games are considered by analyzing a total of 193,443 unique player records and 10,874,958 play log records. While the analysis results provide useful insights, the overall results indicate that a small number of well-chosen features used as performance metrics might be sufficient for making important action decisions and that OP and CP should be properly chosen depending on the analysis goal.

Partial Text

The global game market continues to grow, and it is expected to grow 9.6 percent per year between 2013 and 2018 with a stable revenue stream [1]. As the number of published games increases exponentially, the game market is becoming too complicated to understand as a single sector. Therefore, researchers often divide the market into multiple sectors and analyze each sector separately. One way of categorizing games is by platform: console, handheld, personal computer, online, smartphone, tablet, or television [1]. Another way is to base the categorization on complexity and intensity. According to [2], hardcore games are the traditional, complex ones that require players to spend substantial time and resources, whereas casual games, such as Tetris, Pac-Man, and Wii, are simple ones that are easy to play and require little time and few resources. According to [3], casual games are characterized by short and less complex gameplay sessions, ease of learning and playing, and platforms that are ordinary devices such as personal computers and mobile phones. According to [4], casual games are a ‘normalization of digital play’ and have interfaces that use only a few buttons. While it is difficult, if not impossible, to define casual games unambiguously and concisely, they have clearly become very popular, with a high growth rate since 2000 [5], and they are the primary subject of this study.

Early user churn prediction research was conducted in a variety of commercial fields such as telecommunications, newspapers, banking, credit cards, and insurance [20] [21] [22] [23]. In each field, churn prediction has evolved in accordance with the development of the industry.

We focus on new players of casual games, which have no subscription. To study churn, we predict whether a new player will stop playing after a certain number of days from the first day of play. As addressed in the introduction, this definition of churn is quite different from the case of targeting ‘the entire user base’, where the question is whether an existing user, regardless of the length of the in-service period, will still pay in the following month. To formally define a churner for casual games, we first introduce some notation. For a player $u$, let $s_i$ and $t_i^{abs}$ represent the score and timestamp, respectively, of the player’s $i$th play. Because we focus on new players only, it is convenient to define $t_i^{rel}$, the time of the $i$th play relative to the player’s first play, that is, $t_i^{rel} = t_i^{abs} - t_1^{abs}$. For simplicity of notation, we use $t_i$ to represent $t_i^{rel}$ in the rest of the paper.
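The relative timestamps and a churn label built on them can be sketched in a few lines of Python. This is only an illustration: the exact churn rule is not reproduced in this excerpt, so the sketch assumes that a player is a churner when no play falls in the window (OP, OP + CP] measured in days from the first play. The function names, the day-based units, and the sample player are hypothetical.

```python
from typing import List, Sequence

SECONDS_PER_DAY = 86_400


def relative_times(abs_timestamps: Sequence[float]) -> List[float]:
    """Convert absolute play timestamps (seconds) to days since the first play."""
    first = min(abs_timestamps)
    return sorted((t - first) / SECONDS_PER_DAY for t in abs_timestamps)


def is_churner(rel_days: Sequence[float], op_days: float, cp_days: float) -> bool:
    """Assumed rule (not quoted from the paper): no play in (OP, OP + CP]."""
    return not any(op_days < t <= op_days + cp_days for t in rel_days)


# Hypothetical player who plays on day 0, day 1.5, and day 9.
start = 1_000_000.0
plays = [start, start + 1.5 * SECONDS_PER_DAY, start + 9 * SECONDS_PER_DAY]
rel = relative_times(plays)

print(rel)
print(is_churner(rel, op_days=5, cp_days=3))   # window (5, 8] contains no play
print(is_churner(rel, op_days=5, cp_days=10))  # window (5, 15] contains day 9
```

The same player is labeled a churner under one (OP, CP) pair and a non-churner under another, which is why the choice of the two periods matters for any downstream analysis.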

The three casual games chosen for churn prediction are Dodge the Mud, a casual racing game with an undisclosed title (per the data owner’s request), and TagPro. Dodge the Mud is a casual action game on Android. In the game, a player dodges mud falling from the sky by touching buttons on the phone screen or tilting the phone; the player receives a score proportional to the number of mud pieces dodged. Game #2 is a casual racing game on Android and iOS. A player rides a vehicle over challenging terrain within a specified time limit and then receives game money, which can be used to upgrade the vehicle. Lastly, TagPro is a casual CTF (capture-the-flag) game played in a web browser. It is a team game in which a player seizes the other team’s flag and brings it back to his or her own team. The player then receives individual play scores and team points, the latter of which can be earned only when the player’s team wins a game. Table 1 summarizes the three games, with data descriptions of each game.

For traditional machine learning algorithms to work well, appropriate feature engineering must be performed. When done properly, feature engineering not only improves algorithm performance but also enables the discovery of important insights. In this section, intuitively meaningful features are extracted first, and their importance is then evaluated by ranking single features and by identifying the best feature combinations.
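As a rough illustration of extracting features from play logs, the sketch below summarizes one player's plays inside the observation period. The feature names are hypothetical examples of "intuitively meaningful" quantities and are not the feature set used in the study. It assumes each play is a pair of relative time in days and score, and that every player has a first play at time 0.

```python
from statistics import mean
from typing import Dict, List, Tuple


def op_features(plays: List[Tuple[float, float]], op_days: float) -> Dict[str, float]:
    """Summarize one player's plays that fall inside the observation period.

    `plays` holds (days since first play, score) pairs. The features are
    illustrative only; they are not taken from the paper.
    """
    in_op = sorted((t, s) for t, s in plays if t <= op_days)
    times = [t for t, _ in in_op]
    scores = [s for _, s in in_op]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "play_count": float(len(in_op)),
        "active_days": float(len({int(t) for t in times})),
        "best_score": max(scores),
        "mean_score": mean(scores),
        # With a single play there is no gap; fall back to the OP length.
        "mean_gap_days": mean(gaps) if gaps else op_days,
        "days_since_last_play": op_days - times[-1],
    }


# Hypothetical player; the play on day 9 lies outside a 5-day OP and is ignored.
log = [(0.0, 10.0), (0.2, 30.0), (1.5, 20.0), (9.0, 50.0)]
features = op_features(log, op_days=5)
for name, value in features.items():
    print(f"{name}: {value}")
```

Restricting the summary to plays with time at most OP keeps information from the prediction window out of the features, which would otherwise leak the label.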

Churn prediction performance can be affected by the choice of machine learning algorithm and the choice of OP and CP in the churn definition. In this section, we investigate the effects of algorithm choice and OP/CP choice on the AUC performance.
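For readers unfamiliar with the metric, AUC can be computed directly from its probabilistic meaning: the chance that a randomly chosen churner receives a higher predicted score than a randomly chosen non-churner. The sketch below is a minimal pairwise implementation for illustration, not the evaluation code of the study. The labels and the two sets of model scores are made up.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve by pairwise comparison.

    Counts, over all (churner, non-churner) pairs, how often the churner
    (label 1) has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one churner and one non-churner")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = churner) and predicted churn scores from two models.
labels = [1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.8, 0.3, 0.4, 0.7, 0.2]  # ranks every churner above every non-churner
model_b = [0.9, 0.4, 0.5, 0.4, 0.7, 0.2]  # misorders one pair and ties another

print(f"model A AUC: {auc(labels, model_a):.3f}")
print(f"model B AUC: {auc(labels, model_b):.3f}")
```

The pairwise loop is quadratic in the number of players, so it suits small examples; for data at the scale described in the abstract, a sort-based computation or a library routine would be the practical choice.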

The behavior of game players differs from that of telecom or newspaper subscribers. The behavior of casual game players can differ even more because of the light and transient nature of casual play. People play for fun and quit easily, without worrying about a contract or a subscription fee. By examining the play log data of three casual games in depth, we revealed key insights that characterize casual games. The results are discussed further in this section.

In this work, we analyzed three casual games by performing churn predictions. Through the analysis, we demonstrated how play log data can be used to conduct a fundamental set of churn analyses. Churn was carefully defined for casual games, and a standard process for analysis was established. The definition and process are directly applicable not only to casual games but also to any new-user analysis of non-subscription-based services. Overall, the main findings of this study stem from the simple nature of casual games. If the findings hold true for most casual games, choosing an appropriate churn definition and identifying a few core features can be the most important parts of understanding and preventing churn.

