Edition: Final report
Abstract: The effectiveness of traditional incident detection is often limited by sparse sensor coverage, and reporting incidents to emergency response systems
is labor-intensive. This research project mines tweet texts to extract incident information on both highways and arterials as an efficient and cost-effective
alternative to existing data sources. This research report presents a methodology to crawl, process and filter tweets that are accessible by
the public for free. Tweets are acquired from Twitter using the REST API in real time. The process of adaptive data acquisition establishes a
dictionary of important keywords and their combinations that can imply traffic incidents (TI). A tweet is then mapped into a high dimensional binary
vector in a feature space formed by the dictionary, and classified as either TI-related or not. All the TI tweets are then geocoded to determine their
locations, and further classified into one of the five incident categories. We apply the methodology in two regions, the Pittsburgh and Philadelphia
Metropolitan Areas. Overall, mining tweets holds great potential to complement existing traffic incident data at very low cost. A small sample of
tweets acquired from the Twitter API covers most of the incidents reported in the existing data set, and additional incidents can be identified by
analyzing tweet text. Twitter also provides ample additional information with reasonable coverage of arterials. Tweets that are both TI-related and
geocodable account for approximately 10% of all the acquired tweets. Of those geocodable TI tweets, the majority are posted by influential users
(IU), namely public Twitter accounts owned by public agencies and media, while a small number is contributed by individual users. There is more
incident information provided by Twitter on weekends than on weekdays. Within the same day, both individuals and IUs tend to report incidents more
frequently during the daytime than at night, especially during traffic peak hours. Individual tweets are more likely to report incidents near the center of
a city, and the volume of information decays significantly outward from the center. We develop a prototype web application that allows users to extract
both real-time and historical incident information and visualize it on a map. The web application will be tested in PennDOT transportation
management centers.
Author ORCID information: http://orcid.org/0000-0001-8716-8989
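The abstract's step of mapping a tweet into a binary vector over a keyword dictionary can be illustrated with a minimal sketch. The dictionary entries, the function name `to_binary_vector`, and the text normalization below are assumptions made for illustration only; the report's actual dictionary, keyword combinations, and classifier are not given in this record.

```python
import re

# Hypothetical keyword dictionary (an assumption, not the report's list).
DICTIONARY = ["accident", "crash", "lane closed", "disabled vehicle", "traffic"]


def to_binary_vector(tweet, dictionary=DICTIONARY):
    """Return a 0/1 list with one entry per dictionary term.

    An entry is 1 when the term occurs in the tweet as a whole word
    or whole phrase, and 0 otherwise.
    """
    # Lowercase, replace punctuation with spaces, collapse whitespace.
    text = re.sub(r"[^a-z0-9 ]+", " ", tweet.lower())
    text = " ".join(text.split())
    # Pad with spaces so that "crash" does not match inside "crashing".
    padded = f" {text} "
    return [1 if f" {term} " in padded else 0 for term in dictionary]


if __name__ == "__main__":
    example = "Crash on I-376 EB, left lane closed near Squirrel Hill"
    print(to_binary_vector(example))  # [0, 1, 1, 0, 0]
```

A vector of this kind would then be passed to a classifier that labels the tweet as TI-related or not; that stage is left out here because the record does not specify it.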