
DiverSim: an Inclusive Simulator for Diverse Synthetic Pedestrian Data Generation

One of the critical challenges in autonomous driving and AI-based systems is the lack of diverse and representative datasets. Existing datasets often fail to reflect the broad range of pedestrians found in real-world scenarios, particularly underrepresented groups such as people with disabilities or members of ethnic minorities. This gap in representation can lead to AI systems that are not equipped to handle the full diversity of pedestrians, potentially compromising the safety and inclusivity of autonomous driving technologies.

At Vicomtech, we have developed DiverSim, a cutting-edge simulator that addresses this issue by generating synthetic pedestrian data with a strong focus on diversity and inclusion. Built on Unreal Engine 5, known for its photorealistic rendering, and using the AirSim simulator as its foundation, the tool creates balanced pedestrian datasets. It simulates a wide variety of individuals, with equal proportions of genders, different ethnicities, and people with various disabilities, such as those using wheelchairs, crutches, or white canes for visual impairments. In this way, no minority is underrepresented, avoiding the biases common in most openly available datasets.
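
As a rough illustration of how such balanced generation could work (the attribute categories and function below are hypothetical sketches, not DiverSim's actual interface), each pedestrian attribute can simply be drawn uniformly over its categories:

```python
import random
from dataclasses import dataclass

# Hypothetical attribute categories; DiverSim's actual taxonomy may differ.
GENDERS = ["female", "male"]
ETHNICITIES = ["group_a", "group_b", "group_c", "group_d"]
MOBILITY_AIDS = ["none", "wheelchair", "crutches", "white_cane"]

@dataclass
class PedestrianSpec:
    gender: str
    ethnicity: str
    mobility_aid: str

def sample_balanced_pedestrians(n: int, seed: int | None = None) -> list[PedestrianSpec]:
    """Draw each attribute uniformly so every category is equally likely,
    unlike real-world datasets where minorities are underrepresented."""
    rng = random.Random(seed)
    return [
        PedestrianSpec(
            gender=rng.choice(GENDERS),
            ethnicity=rng.choice(ETHNICITIES),
            mobility_aid=rng.choice(MOBILITY_AIDS),
        )
        for _ in range(n)
    ]

if __name__ == "__main__":
    for spec in sample_balanced_pedestrians(5, seed=42):
        print(spec)
```

Uniform sampling of this kind guarantees that every category, including groups that are small minorities in real-world footage, appears in roughly equal proportions in the generated data.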

Users can also configure various simulation parameters, including atmospheric conditions, time of day, and the proportions of pedestrian types in specific scenarios. To keep the dataset visually diverse as well, some elements of the simulation are randomized on each run, such as the size and shape of the background buildings, the positions of cars, and the angle of the sun. The system supports two key use cases: crosswalks and parking areas. A customizable fisheye camera model is also integrated, a feature often missing in other simulators despite its prevalence in ADAS (Advanced Driver Assistance Systems). All generated data is annotated in the ASAM OpenLABEL format.
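
As a sketch of how such a run might be parameterized, and of the general shape of the annotation output, consider the following example. The configuration keys and file layout here are illustrative assumptions rather than DiverSim's documented interface; only the overall annotation structure follows the ASAM OpenLABEL schema:

```python
import json

# Hypothetical run configuration; key names are illustrative and may not
# match DiverSim's actual parameter interface.
config = {
    "scenario": "crosswalk",            # supported use cases: crosswalk, parking area
    "weather": {"rain": 0.2, "fog": 0.0},
    "time_of_day": "17:30",
    "pedestrian_mix": {                 # proportions of pedestrian types
        "no_aid": 0.25,
        "wheelchair": 0.25,
        "crutches": 0.25,
        "white_cane": 0.25,
    },
    "camera": {"model": "fisheye", "fov_deg": 180},
    "randomize": ["buildings", "car_positions", "sun_angle"],
}

# Minimal OpenLABEL-style annotation fragment for a single frame, following
# the general shape of the ASAM OpenLABEL schema (bounding boxes are given
# as center x, center y, width, height in pixels).
annotation = {
    "openlabel": {
        "metadata": {"schema_version": "1.0.0"},
        "objects": {
            "0": {"name": "pedestrian_0", "type": "pedestrian"},
        },
        "frames": {
            "0": {
                "objects": {
                    "0": {
                        "object_data": {
                            "bbox": [{"name": "shape", "val": [640, 360, 80, 200]}]
                        }
                    }
                }
            }
        },
    }
}

with open("frame_0000.json", "w") as f:
    json.dump(annotation, f, indent=2)
```

Because scene properties are exposed as parameters in this way, users can deliberately oversample rare cases, for instance generating a split in which half of all pedestrians use a mobility aid.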

Moreover, the simulator and its data generation tools are now open source, allowing researchers and developers worldwide to use them freely. All assets and animations used in the simulator have been carefully chosen so that their licenses permit use for AI training and validation. This means the generated data is readily usable for training and validating AI models designed to handle a wide range of pedestrian types and scenarios.

As future work, we aim to expand the simulator's capabilities with 3D annotations and LiDAR sensor recordings, enabling even more detailed data generation. We also plan to add further use cases, such as the VRU (Vulnerable Road User) pick-up scenario, broadening the range of pedestrian interactions that can be simulated.

The simulator is designed to contribute to addressing dataset representation issues within the AWARE2ALL project, which aims to develop inclusive HMI (Human-Machine Interface) and safety systems for autonomous vehicles. The diverse, balanced datasets it produces will help re-train and validate AI models so that they better accommodate the needs of diverse individuals. This effort is integral to enhancing the safety and reliability of autonomous driving technologies for all users.
