ContextVLM: Zero-Shot and Few-Shot Context Understanding for Autonomous Driving using Vision Language Models [IEEE ITSC 2024] [supporting dataset]
2024-09-01
Details
Creators: -
Corporate Creators: -
Corporate Contributors: -
Subject/TRT Terms: -
Resource Type: -
Rights Statement: -
Geographical Coverage: -
Corporate Publisher: -
Abstract: In this work, we introduce DrivingContexts, a dataset for detecting driving contexts relevant to autonomous driving. We also propose zero-shot and few-shot approaches that use vision language models such as LLaVa and ViLT to detect these contexts. Our approach reduces both the need for fully supervised training on large annotated datasets and the need for hand-crafted methods to recognize specific contexts of importance.
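Example (for context only): a minimal zero-shot sketch of the kind of VLM query the abstract describes, using the Hugging Face transformers LLaVA interface. The checkpoint name, image path, prompt wording, and candidate context below are illustrative assumptions, not the exact prompts, labels, or models used by ContextVLM; see the linked GitHub repository for the authors' code.

from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Assumed public LLaVA checkpoint; ContextVLM may use a different model or version.
model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# Hypothetical camera frame from the vehicle.
image = Image.open("driving_scene.jpg")

# Zero-shot: ask a yes/no question about one candidate driving context.
context = "heavy rain"
prompt = (f"USER: <image>\nIs the vehicle driving in {context}? "
          "Answer yes or no. ASSISTANT:")

inputs = processor(images=image, text=prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=10)
print(processor.batch_decode(output, skip_special_tokens=True)[0])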
Content Notes: This item is made available under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) license, https://creativecommons.org/publicdomain/zero/1.0/. External Repository Note: This dataset includes additional data and software in a GitHub repository, accessible online at https://github.com/ssuralcmu/ContextVLM.git
Format: -
Funding: -
Collection(s): -
Main Document Checksum: urn:sha-512:afef8c9690e9b4c1c77408b79bb8ed04bf726739f46816bf06cb10506d2cb4e71245f6fb32de2a1fa398a7e5deebd418156f1bde666f6c6e49e6a014ef635455
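Example (for context only): a short sketch, assuming the main document has been downloaded locally, of verifying the file against the SHA-512 checksum above; the filename is a placeholder.

import hashlib

# Placeholder filename; replace with the path of the downloaded main document.
path = "contextvlm_main_document.pdf"

# Hex digest from the "Main Document Checksum" field (the part after "urn:sha-512:").
expected = ("afef8c9690e9b4c1c77408b79bb8ed04bf726739f46816bf06cb10506d2cb4e7"
            "1245f6fb32de2a1fa398a7e5deebd418156f1bde666f6c6e49e6a014ef635455")

h = hashlib.sha512()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        h.update(chunk)

print("checksum OK" if h.hexdigest() == expected else "checksum MISMATCH")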
Download URL: -
File Type: