Modeling Crash Severity and Collision Types Using Machine Learning [Supporting Dataset]
-
2022-01-01
Abstract: Traffic safety analysis is a fundamental step in reducing the economic, social, and environmental costs of traffic accidents. The essence of traffic safety is understanding the factors that affect crash occurrence, injury severity, and collision type, along with their underlying relationships, in order to predict and prevent future crashes. Past crash injury severity studies have used numerous statistical, econometric, Machine Learning (ML), and Artificial Intelligence (AI) tools to extract the underlying relationship between crash causal factors and the resulting severity or collision type. This study explores Multi-Label Classification (MLC), a tool from the AI domain, for classification problems in traffic safety. MLC has primarily been applied to protein function, semantic scene, and music categorization problems. In the real world, multiple heterogeneous, subjective factors determine the extent of damage or severity of a particular crash. Theoretically, collision type and crash severity can be correlated, so it is intuitive to model them simultaneously. The ability of MLC to assign an entity to more than one label, correlated or uncorrelated, gives the approach an edge over single-class (binary) or multi-class classification. The MLC-based classification model was calibrated and tested using historical crash data extracted for the state of Texas. The study area was selected using a link-level, unsupervised, principal component analysis-based clustering approach. A similar clustering approach was also tested at the county level to understand the spatial behavior, and thus the transferability, of the MLC approach to other key cities in the state. The performance of the proposed approach was tested, quantified, and compared with the conventional binary/multi-class classification tools used in the traffic safety domain. Inferences from the preliminary numerical analysis indicate that the proposed multi-label classification approach performs promisingly compared with the traditional classification approaches found in the traffic safety literature.
The total size of the described zip file is 16.6 MB. Files with the .xlsx extension are Microsoft Excel spreadsheet files, which can be opened in Excel or in open-source spreadsheet programs. The .csv (Comma-Separated Values) file is a simple format designed for a database table and supported by many applications. Because the format is open, .csv files are often used to move tabular data between different programs. Any text editor or spreadsheet program will open .csv files.
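The multi-label framing described in the abstract can be illustrated with a minimal sketch. It is not the study's model: the label names, the two crash records, and the predictions below are all made up for illustration, and the dataset's actual categories may differ. Each crash carries a severity label and a collision-type label at once, encoded as a 0/1 indicator vector, and predictions are scored with two common multi-label measures.

```python
# Hypothetical label vocabulary (an assumption, not taken from the dataset).
LABELS = [
    "severity:injury",
    "severity:no_injury",
    "collision:rear_end",
    "collision:angle",
]


def encode(label_set):
    """Turn a set of labels into a 0/1 indicator vector ordered like LABELS."""
    return [1 if name in label_set else 0 for name in LABELS]


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that were predicted incorrectly."""
    wrong = sum(
        t != p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return wrong / (len(y_true) * len(LABELS))


def subset_accuracy(y_true, y_pred):
    """Fraction of crashes whose entire label vector was predicted exactly."""
    exact = sum(row_true == row_pred for row_true, row_pred in zip(y_true, y_pred))
    return exact / len(y_true)


# Two made-up crash records and made-up predictions for them.
truth = [
    encode({"severity:injury", "collision:rear_end"}),
    encode({"severity:no_injury", "collision:angle"}),
]
preds = [
    encode({"severity:injury", "collision:rear_end"}),    # fully correct
    encode({"severity:no_injury", "collision:rear_end"}),  # wrong collision type
]

print(hamming_loss(truth, preds))
print(subset_accuracy(truth, preds))
```

Hamming loss gives partial credit when only one of the two labels is wrong, while subset accuracy counts a crash as correct only if severity and collision type are both right. That difference is one reason multi-label results are not directly comparable with single-label accuracy figures.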
-
Content Notes: National Transportation Library (NTL) Curation Note: As this dataset is preserved in a repository outside U.S. DOT control, as allowed by the U.S. DOT's Public Access Plan (https://doi.org/10.21949/1503647) Section 7.4.2 Data, the NTL staff has performed NO additional curation actions on this dataset. The current level of dataset documentation is the responsibility of the dataset creator. NTL staff last accessed this dataset at its repository URL on 2022-11-11. If, in the future, you have trouble accessing this dataset at the host repository, please email NTLDataCurator@dot.gov describing your problem. NTL staff will do its best to assist you at that time.