Rescuing Legacy Data: Using Optical Character Recognition Technologies to Make Airline Consumer Data Accessible
-
2025-06-16
Abstract: Since 1971, the U.S. Department of Transportation (USDOT) has produced Air Travel Consumer Report data tables as physical documents and online as PDFs. These documents contain information and data tables, compiled by USDOT, that tabulate grievances from consumer letters and filings against airlines. These data tables are now being extracted and converted into an accessible, tabular format for publication in the Repository and Open Science Access Portal (ROSA P). Using ABBYY FineReader PDF software, this project rescues PDF-locked data tables by converting them into machine-readable formats, ensuring greater accessibility and usability for researchers and the public. This accessibility not only adheres to the FAIR principles but also makes the data more usable with screen readers. Through rescue efforts such as this one, data professionals can efficiently carry out other legacy data projects to improve data accessibility.
This poster was presented at the Open Repositories 2025 Conference in Chicago, Illinois, on June 16, 2025.