A large-scale dual-view driving video dataset for understanding the influence of road and traffic conditions on the ego vehicle’s driving behavior in complex (dense, heterogeneous, and unstructured) traffic scenarios. This enables explainable driving decision-making for safe and efficient navigation of intelligent vehicle systems. The dataset contains 3634 annotated driving scenarios in 1140 untrimmed videos. The annotations include 697K important object bounding boxes (9K object tracks), 1-12 objects per driving scenario covering 10 categories, and 19 explanation label categories.

The dataset consists of images collected in unstructured road scenarios while driving in adverse weather conditions: rain, fog, low light, and snow. Each RGB image is paired with a more detailed near-infrared (NIR) image captured simultaneously. The images were collected using a JAI FS-3200D-10GE camera.

Curated for identifying missing traffic signs on roads, the MTSVD consists of 1590 videos with traffic-sign bounding box annotations, 10,000+ unique track IDs spread across more than 70 traffic sign categories, and 7000+ frame-level context interval markings for various traffic sign categories.

8 - MissingTSMini

I - MissingTSMini (2.2 GB)

II - MissingTSMini Test (280 MB)

MissingTSMiniTest is the test set for a subset of a large video dataset named 'Missing Traffic Signs Video Dataset (MTSVD)'. MissingTSMiniTest has 200 images for each task.

Autonomous driving technology has made significant strides over the past decade, but most advancements have been tailored to structured environments. However, the real world is far from structured; it's chaotic, unpredictable, and diverse. This is where IDD-3D comes into play. This groundbreaking dataset aims to bridge the gap by providing a comprehensive collection of unstructured driving scenarios, enabling more robust and adaptable autonomous driving systems.

The Fine-Grained Vehicle Detection (FGVD) dataset was captured in the wild from a moving camera mounted on a car. It contains 5502 scene images with 210 unique fine-grained labels of multiple vehicle types organized in a three-level hierarchy.

5 - IDD Temporal

I - IDD Temporal Train - Part I (23.7 GB)

II - IDD Temporal Train - Part II (21.1 GB)

III - IDD Temporal Train - Part III (14.2 GB)

IV - IDD Temporal Train - Part IV (12.1 GB)

V - IDD Temporal Val (9.6 GB)

VI - IDD Temporal Test - Part I (10.8 GB)

VII - IDD Temporal Test - Part II (7.8 GB)

The IDD Temporal dataset consists of the temporally nearby frames (±15 frames) from the IDD Segmentation and Detection datasets. For many of the frames in the IDD Segmentation dataset, the 30 nearby frames are provided in the correspondingly named subfolders of the IDD Temporal dataset.
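As a minimal sketch of what the ±15 frame window means, the snippet below enumerates the neighbouring frame indices around an annotated frame. The function names and the idea of integer frame indices are illustrative assumptions; the actual file naming and subfolder layout of the download are not described here and are not modelled.

```python
def neighbor_offsets(radius: int = 15) -> list[int]:
    """Offsets of the nearby frames relative to the annotated frame.

    The annotated frame itself (offset 0) is excluded, so a radius of 15
    yields 30 offsets: -15..-1 and +1..+15.
    """
    return [off for off in range(-radius, radius + 1) if off != 0]


def neighbor_indices(center: int, radius: int = 15) -> list[int]:
    """Hypothetical helper: absolute frame indices around `center`.

    Assumes frames are addressed by consecutive integers, which is an
    assumption made for illustration only.
    """
    return [center + off for off in neighbor_offsets(radius)]


if __name__ == "__main__":
    window = neighbor_indices(100)
    print(len(window), window[0], window[-1])
```

With the default radius, each annotated frame has 15 preceding and 15 following frames, which matches the 30 nearby frames mentioned above.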

4 - IDD Multimodal

I - IDD Multimodal - Primary (6.5 GB)

II - IDD Multimodal - Secondary (6.6 GB)

III - IDD Multimodal - Supplement (3.0 GB)

The IDD Multimodal Primary, Secondary, and Supplement sets contain (i) stereo images from the front camera at 15 fps, (ii) GPS points (latitude and longitude) at 15 Hz, and (iii) 16-channel LIDAR and OBD data.

Subsampled version of IDD for use in resource-constrained training/deployment and architecture search.

40,000 images with bounding box annotations; released in 2018.

1 - IDD Segmentation

I - IDD Segmentation (IDD 20k Part I) (18.5 GB)

10,000 images with fine semantic segmentation annotations; released in 2018.

II - IDD Segmentation (IDD 20k Part II) (5.5 GB)

20,000 images with fine semantic segmentation annotations (14K Train, 2K Val, 4K Test) from 350 drive sequences, split across two downloads: this data and the data released in 2018 as 'IDD - Segmentation (IDD 20k Part I)'.
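The split sizes quoted above can be checked with a few lines of arithmetic. This is a small sketch that treats the rounded figures (14K, 2K, 4K) as exact counts, which is an assumption; the helper name is illustrative.

```python
def split_fractions(splits: dict[str, int]) -> tuple[int, dict[str, float]]:
    """Return the total image count and each split's share of that total."""
    total = sum(splits.values())
    return total, {name: count / total for name, count in splits.items()}


if __name__ == "__main__":
    # Rounded counts as quoted in the description (assumed exact here).
    idd_20k = {"train": 14_000, "val": 2_000, "test": 4_000}
    total, fractions = split_fractions(idd_20k)
    print(total, fractions)
```

Under that assumption the three splits sum to the stated 20,000 images, in a 70/10/20 ratio of train, validation, and test.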