Songdo Vision: Vehicle Annotations from High-Altitude BeV Drone Imagery in a Smart City

Authors: Fonod, Robert; Cho, Haechan; Yeo, Hwasoo; Geroliminis, Nikolaos

Dates: 2025-03-21; 2025-03-21; 2025-07-16; 2025-03-21; 2025-03-17

DOI: 10.5281/zenodo.13828408

URL: https://infoscience.epfl.ch/handle/20.500.14299/248163

Abstract: The Songdo Vision dataset provides high-resolution (4K, 3840×2160 pixels) RGB images annotated with categorized axis-aligned bounding boxes (BBs) for vehicle detection from a high-altitude bird's-eye view (BeV) perspective. Captured over Songdo International Business District, South Korea, the dataset consists of 5,419 annotated video frames featuring approximately 300,000 vehicle instances categorized into four classes:

- Car (including vans and light-duty vehicles)
- Bus
- Truck
- Motorcycle

This dataset can serve as a benchmark for aerial vehicle detection, supporting research and real-world applications in intelligent transportation systems, traffic monitoring, and aerial vision-based mobility analytics. It was developed in the context of a multi-drone experiment aimed at enhancing geo-referenced vehicle trajectory extraction.

Language: en

Keywords: Computer vision; Aerial Vehicle Detection; Drone Imagery; Bird's-Eye View (BeV); Traffic Monitoring; Smart City Analytics; Deep learning; Object Detection Dataset; Bounding Box Annotations; High-Altitude UAV; Vehicle Detection; COCO Dataset Format; YOLO Annotations; Pascal VOC Format; Urban Traffic Analysis; Multi-Class Object Detection; Machine Learning Dataset

Type: dataset

Identifier: f7f0f4b8-53b5-492f-80e3-a388078ce1cc
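
Note on annotation formats: the keywords name the COCO, YOLO, and Pascal VOC formats. As a minimal sketch of how one axis-aligned box maps between those three conventions on a 3840×2160 frame, the Python below converts a corner-format box (xmin, ymin, xmax, ymax, as in Pascal VOC) to the COCO layout (x, y, width, height) and to the normalized YOLO layout (center x, center y, width, height). The function names and the class-id order are assumptions made for illustration; they are not taken from the dataset's files, whose actual layout should be checked first.

```python
from typing import Tuple

# Frame size stated in the dataset description (4K).
IMG_W, IMG_H = 3840, 2160

# Hypothetical class-id order; the dataset's real mapping may differ.
CLASSES = ["car", "bus", "truck", "motorcycle"]

Box = Tuple[float, float, float, float]


def voc_to_coco(xmin: float, ymin: float, xmax: float, ymax: float) -> Box:
    """Corner box in pixels -> COCO (x, y, width, height) in pixels."""
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def voc_to_yolo(xmin: float, ymin: float, xmax: float, ymax: float,
                img_w: int = IMG_W, img_h: int = IMG_H) -> Box:
    """Corner box in pixels -> YOLO (cx, cy, w, h), each normalized to [0, 1]."""
    cx = (xmin + xmax) / 2.0 / img_w
    cy = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return (cx, cy, w, h)


def yolo_to_voc(cx: float, cy: float, w: float, h: float,
                img_w: int = IMG_W, img_h: int = IMG_H) -> Box:
    """Normalized YOLO box -> corner box in pixels."""
    xmin = (cx - w / 2.0) * img_w
    ymin = (cy - h / 2.0) * img_h
    xmax = (cx + w / 2.0) * img_w
    ymax = (cy + h / 2.0) * img_h
    return (xmin, ymin, xmax, ymax)


if __name__ == "__main__":
    # Illustrative box only; not a real annotation from the dataset.
    box = (960, 540, 1920, 1080)
    print("COCO:", voc_to_coco(*box))
    print("YOLO:", voc_to_yolo(*box))
```

A YOLO label line would then be the class index followed by the four normalized values, for example `CLASSES.index("car")` for a car under the assumed ordering.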