Welcome to ROSA P
Big data analytics : predicting traffic flow regimes from simulated connected vehicle messages using data analytics and machine learning.
  • Published Date:
    2016-12-25
  • Language:
    English
  • Filetype:
    PDF (3.42 MB)


Details:
  • Alternative Title:
    Predicting traffic flow regimes from simulated connected vehicle messages using data analytics and machine learning.
  • Corporate Creators:
  • Publication/ Report Number:
  • Resource Type:
  • Geographical Coverage:
  • TRIS Online Accession Number:
    1644091
  • Abstract:
    The key objectives of this study were to: (1) develop advanced analytical techniques that use a dynamically configurable connected vehicle message protocol to predict traffic flow regimes in near-real time in a virtual environment, and examine their accuracy at various levels of market penetration; and (2) examine the tradeoff between information insight and the cost of data processing and management. Data from a virtual (simulated) testbed for the I-405 corridor in Seattle were used to conduct the study. The field data and VISSIM simulation model were obtained from WSDOT. The simulation model went through a rigorous calibration and validation process as part of a separate study conducted by Noblis for the FHWA Traffic Analysis Tools Program. The Trajectory Conversion Algorithm (TCA V2.3), an open source tool developed by Noblis for the USDOT, was used to emulate SAE J2735 Basic Safety Messages (BSMs). Using the simulated BSMs, traffic flow regimes (free flow, speed at capacity, and congested) were predicted one hour ahead, at 5-minute intervals, for 100’ x 100’ boxes overlaid on the I-405 traffic network. The study used Apache Spark’s machine learning libraries for Logistic Regression, Decision Tree, and Random Forest to develop models that predict the traffic flow regimes. The computational resources and analytic environment for this work were provisioned in the Microsoft Azure cloud. The computing cluster consisted of four nodes: two head nodes for job submission and management and two worker nodes for computation.
    Prediction accuracy was tested for two communication technologies (cellular and Dedicated Short Range Communications (DSRC)), two market penetrations (20% and 75%), and six traffic operational conditions. The three algorithms were tested with 6, 8, and 11 principal components. In addition, the Decision Tree and Random Forest algorithms were tested with two node impurity metrics (entropy and Gini), and Random Forest was tested with ensembles of 10, 250, and 1000 trees. The Random Forest model with 11 principal components, a 250-tree ensemble, and the Gini node impurity metric gave the best results, with an average F1 score of 0.83 over all scenarios. The F1 scores were 0.87 for the free flow regime, 0.67 for the at-capacity regime, and 0.95 for the congested regime. The model took 6 to 16 minutes to process an hour’s worth of BSMs into the 100’ x 100’ grid boxes and predict the following hour, at 5-minute intervals, for each box.
  • Format:
  • Main Document Checksum:
  • Supporting Files:
    No Additional Files
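
The F1 scores quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that the reported 0.83 average is the unweighted (macro) mean of the three per-regime F1 scores; the abstract does not say how the average was computed. The function names `f1_score` and `macro_average` are illustrative and do not come from the study.

```python
from typing import Mapping


def f1_score(precision: float, recall: float) -> float:
    """Standard F1 definition: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(scores: Mapping[str, float]) -> float:
    """Unweighted mean of per-class scores (an assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


# Per-regime F1 scores as reported in the abstract. The abstract does not
# report precision or recall, so f1_score is shown only for reference.
reported_f1 = {"free flow": 0.87, "at capacity": 0.67, "congested": 0.95}

average_f1 = macro_average(reported_f1)
print(f"macro-average F1: {average_f1:.2f}")
```

Under that assumption the three per-regime scores average to 0.83 after rounding, which is consistent with the figure quoted in the abstract. The at-capacity regime, at 0.67, is the one pulling the average down.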