My name is Dominic Teo and I'm currently an analytics manager with the Ministry of National Development (MND). I'm interested in the intersection of public policy and technology as well as the application of Big Data and Civic Tech in the public sphere.
I graduated Cum Laude from Sciences Po Paris with a BA in Social Sciences & Economics, followed by an MS in Computational Analysis and Public Policy (MSCAPP) at the University of Chicago. The MSCAPP is a two-year dual degree offered by the Schools of Computer Science and Public Policy.
Take a look at my resumé and various projects! (click on titles of projects for more details)
This project was developed by Dominic Teo, Grace Prakaisriroj and Alessandro Luciano as our capstone group project for the undergraduate final year Statistics course ST309.
The more detailed report on the entire project can be found in the pdf document. Unfortunately, the Rmarkdown file has been lost.
The global app economy is estimated to be worth 6.3 trillion USD by 2021, up from 1.3 trillion USD in 2018 (AppAnnie, 2019). Over this period, the user base will almost double, from 3.4 billion people using apps to around 6.3 billion (AppAnnie, 2019). However, the majority of developers still struggle to break even (AppSurvey, 2013). For unsuccessful app developers, a clear analysis of the characteristics of existing successful apps would provide useful insight into creating apps that users want (Tian, 2015).
Therefore, the main goal of this project is to identify the characteristics that successful apps share and investigate which of these factors is important for success.
Our final cleaned dataset used in modelling has 13 predictor variables (Android.Ver was removed during EDA) in 8 dimensions and 6738 observations in total.
The 8 dimensions we used for our analysis are:
Rating: The rating factor is the overall user rating of the app. Higher-rated apps have been shown to have more downloads (Lanza, 2012). Apps are rated from 1 to 5.
Size: The “Size of App” factor captures several kinds of information about the app. Large apps might contain more features or better functionality, so they might have better ratings. On the other hand, larger apps are also more likely to contain a bug and might therefore have lower ratings (Zimmermann, 2007).
Category: We chose to recode the category each app belongs to so that there were just 4 categories: “Hobbies”, “Entertainment”, “Lifestyle”, and “Productivity”. We coded these 4 categories as 4 mutually exclusive and collectively exhaustive binary variables.
Reviews: For the number of reviews, we simply coded this variable as a numeric value. There is a positive correlation between the number of reviews and the number of app downloads.
Days Since Last Updated: We expect this variable to be significant because a recently updated app can appear in the ‘newly updated’ tab of the Google Play Store, which attracts more traffic (particularly as the number of app store users grows over time) and could therefore increase the number of installs. As the original “Last Updated” variable is of character class, we converted it into a date and created a new variable, “DaysSinceLastup”. This variable records how many days before 10/12/18, the day we downloaded the dataset, the app was last updated, expressed as a negative number of days.
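The Category and Days Since Last Updated recodings described above can be sketched in a few lines of Python. This is a minimal illustration and not the project's original code, which was written in Rmarkdown and has been lost. The raw category names and their assignment to the four groups are hypothetical, the function names are invented for this example, the input date format is assumed to be ISO (year-month-day), and the download date 10/12/18 is read here as 10 December 2018, which is also an assumption.

```python
from datetime import date, datetime

# The four recoded groups named in the text.
GROUPS = ("Hobbies", "Entertainment", "Lifestyle", "Productivity")

# Hypothetical mapping from raw store categories to the four groups.
# The project's actual assignment is not stated; these pairs are illustrative.
RAW_TO_GROUP = {
    "SPORTS": "Hobbies",
    "GAME": "Entertainment",
    "BEAUTY": "Lifestyle",
    "TOOLS": "Productivity",
}

# Assumption: "10/12/18" is day/month/year, i.e. 10 December 2018.
REFERENCE_DATE = date(2018, 12, 10)


def encode_category(raw_category):
    """Return four 0/1 indicators, exactly one of which is 1."""
    group = RAW_TO_GROUP[raw_category]  # KeyError flags an unmapped category
    return {name: int(name == group) for name in GROUPS}


def days_since_last_update(last_updated, fmt="%Y-%m-%d"):
    """Parse a date string and return its offset from the reference date.

    The result is zero or negative for updates on or before the reference
    date, matching the "negative days" convention described in the text.
    The ISO input format is an assumption about the raw data.
    """
    parsed = datetime.strptime(last_updated, fmt).date()
    return (parsed - REFERENCE_DATE).days


if __name__ == "__main__":
    print(encode_category("GAME"))
    print(days_since_last_update("2018-11-10"))  # -30
```

Because every app falls into exactly one group, the four indicators always sum to 1. A regression model with an intercept would therefore normally drop one of them as the reference level.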
Our project has two research questions:
We applied 3 different classification modeling approaches to the data in order to find significant variables in determining the success of mobile apps.
Logistic Regression
Decision Trees and Pruned Decision Trees
Random Forest
From our study and use of the 4 different models (logistic regression, decision tree, pruned decision tree and random forest), we conclude that the random forest model is the most effective in predicting the success of mobile applications. Our random forest model shows that Price, Rating, DaysSinceLastUpdated and Size are the most important variables used in the construction of the model.
This matters for developers who want to build successful mobile applications. They should develop free applications and aim for high ratings in the Google Play Store. Developers trying to make money from their apps could therefore make the apps free and concentrate on in-app monetization opportunities instead. Charging for the app itself seems to be a major deterrent to downloads.
More interestingly, developers should also update their apps continuously and consistently, as we found that more recently updated apps tend to be more successful. We also found that, overall, larger apps were more successful (although size was less significant than the other three variables). We think that size could be a good proxy for complexity and sophistication, so apps that are more complex and more fully developed tend to be more popular.