Research Article: A comparative study on machine learning based algorithms for prediction of motorcycle crash severity

Date Published: April 4, 2019

Publisher: Public Library of Science

Author(s): Lukuman Wahab, Haobin Jiang, Yanyong Guo.


Motorcycle crash severity is under-researched in Ghana. Thus, the probable risk factors and the association between these factors and motorcycle crash severity outcomes are not known. Traditional statistical models have intrinsic assumptions and pre-defined correlations that, if violated, can generate inaccurate results. In this study, machine learning based algorithms were employed to predict and classify motorcycle crash severity. Machine learning based techniques are non-parametric models that do not presume relationships between endogenous and exogenous variables. The main aim of this research is to evaluate and compare different approaches to modeling motorcycle crash severity and to investigate the effect of risk factors on the injury outcomes of motorcycle crashes. A motorcycle crash dataset covering 2011 to 2015 was extracted from the National Road Traffic Crash Database at the Building and Road Research Institute (BRRI) in Ghana. The dataset was classified into four injury severity categories: fatal, hospitalized, injured, and damage-only. Three machine learning based models were developed to model the severity of injury in a motorcycle crash: the J48 Decision Tree Classifier, Random Forest (RF), and Instance-Based learning with parameter k (IBk). These machine learning algorithms were validated using the 10-fold cross-validation technique. The three machine learning based algorithms were compared with one another and with a statistical model, the multinomial logit model (MNLM). In addition, a relative importance analysis of the attributes was conducted to determine their impact on injury severity outcomes. The results of the study reveal that the predictions of the machine learning algorithms are superior to those of the MNLM in accuracy and effectiveness, and that the RF-based algorithm shows the best overall agreement with the experimental data among the three machine learning algorithms, owing to its global optimization and extrapolation ability. Location type, time of the crash, settlement type, collision partner, collision type, road separation, road surface type, day of the week, and road shoulder condition were found to be the critical determinants of motorcycle crash injury severity.

Partial Text

Globally, injuries resulting from road traffic crashes are a significant cause of death and disability, with a disproportionate number occurring in African countries. The World Health Organization (WHO) stated that upwards of 1.2 million people die each year on the world’s roads, nearly half of whom are those with the least protection: motorcyclists, cyclists, and pedestrians [1]. Fast economic growth in low- and middle-income nations has been associated with a surge in motorization and road traffic injuries [2]. In Ghana, studies have shown that road traffic crashes are one of the leading causes of death and injury, and that most of these crashes occur in rural areas [3].

Over the years, numerous studies have applied a range of methodological techniques to explore the relationship between motor vehicle crash severity and its contributing factors [37]. The methods used include binary logit models [38], the binary probit model [39], the ordered logit model [40], the ordered probit model [41], the bivariate probit model [42], multinomial logit [43], random parameter logit [44], artificial neural networks [45], Bayesian methods [46,47], and semi-nonparametric methods [48] (see [49,50] for comprehensive reviews of these models).

In this study, classification algorithms were applied to model motorcycle crash severity. The injury severity attribute is used as the class attribute, and it takes four target values. The distribution of these values in the dataset is presented in Table 1.
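The classification setup described here can be illustrated with a minimal Python sketch: an instance-based (k-nearest-neighbour) classifier, in the spirit of IBk, scored with 10-fold cross-validation on a four-valued severity class. This is not the authors' implementation. The toy records, the labelling rule, the Hamming distance, and the choice of k = 3 are all assumptions made for illustration, and none of the data comes from the BRRI database.

```python
import random
from collections import Counter

SEVERITY_CLASSES = ["fatal", "hospitalized", "injured", "damage-only"]


def hamming_distance(a, b):
    """Count the attribute positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the class by majority vote among the k closest training records."""
    order = sorted(range(len(train_rows)),
                   key=lambda i: hamming_distance(train_rows[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, folds=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into `folds` disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[f::folds] for f in range(folds)]


def cross_validated_accuracy(rows, labels, folds=10, k=3, seed=0):
    """Hold each fold out once, train on the rest, and return overall accuracy."""
    correct = 0
    for test_idx in k_fold_indices(len(rows), folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(rows)) if i not in held_out]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        for i in test_idx:
            if knn_predict(train_rows, train_labels, rows[i], k) == labels[i]:
                correct += 1
    return correct / len(rows)


def make_toy_records(copies=10):
    """Build made-up records (NOT BRRI data) with an arbitrary labelling rule."""
    rows, labels = [], []
    for location in ("junction", "non-junction"):
        for time_of_day in ("day", "night"):
            for collision in ("head-on", "rear-end", "side-swipe"):
                # Hypothetical rule, chosen only so that all four classes occur.
                if collision == "head-on" and time_of_day == "night":
                    label = "fatal"
                elif collision == "head-on":
                    label = "hospitalized"
                elif location == "junction":
                    label = "injured"
                else:
                    label = "damage-only"
                rows.extend([(location, time_of_day, collision)] * copies)
                labels.extend([label] * copies)
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_records()
    print("class distribution:", Counter(labels))
    print("10-fold CV accuracy:", cross_validated_accuracy(rows, labels))
```

Every record is used for testing exactly once and is never part of the training set that predicts it, which is the property that makes the cross-validated accuracy an out-of-sample estimate. Counting the class labels, as in the last lines, gives the kind of class distribution that Table 1 reports for the real dataset.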

Traffic crash analysis is a crucial task for road safety organizations. Machine learning methods are non-parametric techniques that have been widely used in transportation research but are still relatively underutilized in motorcycle crash severity analysis. A thorough literature review revealed a methodological gap in published research on motorcycle crash injury severity: most studies rely on traditional statistical methods, whereas this study focuses on machine learning techniques.



