Identification of rear-end crash patterns on instrumented freeways: a data mining approach

2 Author(s)
Pande, A. ; Dept. of Civil & Environ. Eng., Central Florida Univ., Orlando, FL, USA ; Abdel-Aty, M.

Data mining is the analysis of large "observational" datasets to find unsuspected relationships that might be useful to the data owner. It typically involves analysis in which the objectives of the mining exercise have no bearing on the data collection strategy. Freeway traffic surveillance data collected through underground loop detectors is one such "observational" database, maintained for various ITS (intelligent transportation systems) applications such as travel time prediction. In this research, a data mining process is used to relate this surrogate measure of traffic conditions (data from freeway loop detectors) to the occurrence of rear-end crashes on freeways. The results of this analysis are envisioned as the first step in the development of a functional proactive traffic management system. The dataset under consideration includes information on crashes and the corresponding traffic data collected from detectors neighboring the crash locations just prior to the time of the crash. The problem is set up as a binary classification problem: whether a crash is rear-end or not. Three types of classification tree, each using a different splitting criterion, were tried for variable selection. The classification tree with the chi-square test as the splitting criterion produced the most inclusive list of variables. Variable selection was followed by two neural network architectures, namely the RBF (radial basis function) and MLP (multi-layer perceptron), to model the binary target variable. The two neural network models were then combined based on their outputs in an attempt to improve classification accuracy. It was found, however, that the classification tree model with the chi-square test as the splitting criterion (more than 65% classification accuracy) performed better than any of the individual or combined neural network models (54-55% classification accuracy). Since the decision tree model also provides simple, interpretable rules for classifying the data in a real-time application, it was recommended as the final classification model.
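To make the chi-square splitting criterion mentioned in the abstract concrete, the sketch below scores candidate thresholds on a single detector variable by the Pearson chi-square statistic of the resulting 2x2 table (split branch vs. rear-end or not). It is only an illustration under stated assumptions: the function names, the choice of a single numeric variable, and the toy values are hypothetical and are not taken from the paper, whose actual software and tree settings are not given here.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table[i][j] = number of crashes on branch i of a candidate split
    (0 = left, 1 = right) with class j (0 = not rear-end, 1 = rear-end).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty margins to avoid division by zero
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def best_threshold(values, labels):
    """Return (threshold, statistic) for the split `value <= threshold`
    that gives the largest chi-square statistic."""
    best = (None, 0.0)
    # The largest value is skipped: it would put every record on one branch.
    for t in sorted(set(values))[:-1]:
        table = [[0, 0], [0, 0]]
        for v, y in zip(values, labels):
            table[0 if v <= t else 1][y] += 1
        stat = chi_square_2x2(table)
        if stat > best[1]:
            best = (t, stat)
    return best


# Hypothetical toy data (not from the paper): one detector reading per crash,
# with label 1 = rear-end and 0 = not rear-end.
readings = [20, 25, 30, 55, 60, 65]
is_rear_end = [1, 1, 1, 0, 0, 0]
print(best_threshold(readings, is_rear_end))
```

A full tree would repeat this search over every candidate variable and recurse on each branch. Variables that are never chosen for a split are the ones a tree-based variable selection step would drop.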

Published in:

Intelligent Transportation Systems, 2005. Proceedings. 2005 IEEE

Date of Conference:

13-15 Sept. 2005