Challenges of Driver Monitoring Systems

Driver Monitoring Systems (DMS) are "intelligent" algorithms that monitor the state of the occupants of a vehicle, and they can serve safety, comfort, and entertainment purposes. As a researcher in this field, I find it fascinating to explore the different functionalities being developed among the many possibilities one can think of. The priority right now is safety-related algorithms that prevent the most common dangerous situations, such as driver fatigue or distraction.

I put “intelligent” in quotation marks at the beginning of this text because we are preparing these algorithms to monitor something that is actually intelligent: humans. And in this case, intelligence can also mean stubbornness and arbitrary behaviour, making it an arduous task. A competition arises between the human driver and the algorithm.

DMS face the challenge of monitoring something that constantly reminds them how rudimentary they still are. Humans can be unpredictable and even rebellious. A blueberry detector in a greenhouse may enjoy more inner peace; the blueberry won’t change its mood, won’t try to defy the machine, and won’t get mad at it in any case. The detector may encounter occlusions (when the subject under analysis is not visible because it is covered or unrecognizable) and fail, but it can plead that occlusions are a common and well-known challenge in computer vision. DMS, on the other hand, face the occlusion of our minds; they can’t know what we are thinking or what our intentions are. Humans can get lost in their own thoughts, get distracted, and cause an accident. We may be wondering how much Brad Pitt earns a year and not notice a red light, a non-intentional mistake but definitely something worth mitigating.

Euro NCAP brought some clarity when it defined distraction as not looking at the road for 3 seconds or more, or as accumulating 10 seconds of short distractions (glances away from the road lasting less than 3 seconds) within a 30-second period[1]. This gave some relief to the algorithms; they could attribute distraction to humans just by measuring their gaze. Many algorithms are well prepared for this: they base their calculations on the pupil and the cornea, estimating a vector that represents the direction of the driver’s gaze[2]. Others simply learn which region the driver is looking at (e.g., left mirror, front, centre mirror)[3]. Over the years, these algorithms have become very precise. But then, humans wore glasses.
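As a rough illustration, these timing rules can be sketched as a simple check over off-road glance intervals. This is only a sketch: the function name and the interval representation are my own, and extracting the intervals from raw gaze estimates is assumed to happen upstream.

```python
def is_distracted(off_road_glances, long_thresh=3.0, cum_thresh=10.0, window=30.0):
    """Flag distraction following the Euro NCAP-style timing rules.

    off_road_glances: list of (start_time_s, duration_s) intervals during
    which the driver's gaze was off the road (illustrative representation).
    """
    # Rule 1: any single off-road glance of 3 s or more is a long distraction.
    if any(d >= long_thresh for _, d in off_road_glances):
        return True
    # Rule 2: short glances (< 3 s) accumulating to 10 s within a 30 s window.
    shorts = sorted((s, d) for s, d in off_road_glances if d < long_thresh)
    for start_i, _ in shorts:
        # Sum the short glances that fall entirely inside this 30 s window.
        cum = sum(d for s, d in shorts if start_i <= s and s + d <= start_i + window)
        if cum >= cum_thresh:
            return True
    return False
```

A single 3.5-second glance away trips rule 1, while six 2-second glances spread over 15 seconds trip rule 2, even though no individual glance is long.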

An alternative is to detect the driver’s actions to assess distraction[4]. There are algorithms trained to identify actions that can lead to an accident or at least indicate distraction, such as texting, eating, or smoking, among others. And in this “among others”, a million others could fit. I encourage the reader to think of all the situations beyond those three that they have done and that could pose the same risk. I can think of some: blowing your nose, scratching that mosquito bite on your ankle, holding your significant other’s hand while driving, maybe trying to give them a quick kiss, or preparing the money to pay the toll; and I could go on, but I have a one-page limit.

Human creativity in getting distracted exceeds the number of classes an algorithm can take on. Today’s algorithms may have demonstrated that they can handle hundreds of classes, but DMS are required to run at low latency and, please, be very light so they can share hardware with the light controller. To give them a break, the industry is focusing on detecting the most common human distractions that cause accidents. Additionally, object detection algorithms are being included to detect distracting objects (e.g., phones, food, cigarettes) instead.

Fatigue detection is also a complex task. Microsleeps at high speed can lead to lethal accidents, so much effort is going into detecting them. Analysing the level of aperture of the eyes over time can help estimate drowsiness, measured with the PERCLOS (PERcentage of eyelid CLOSure) metric[5][6], which indicates the proportion of time in a minute that the eyes are at least 80 percent closed, or the level of sleepiness (e.g., the Karolinska Sleepiness Scale)[7]. Euro NCAP defines microsleeps as episodes of eye closure lasting less than 3 seconds, typically about 1 or 2 seconds long; anything longer is considered sleep. As can be inferred, algorithms must be precise in calculating the level of aperture of the eyes to perform such analysis, and do it fast, in the blink of an eye.
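The PERCLOS metric itself is simple once a per-frame eye-aperture estimate is available. A minimal sketch, assuming openness values in [0, 1] produced upstream (e.g., from eyelid landmarks; the function name and parameters are illustrative):

```python
def perclos(eye_openness, fps, closed_thresh=0.2, window_s=60.0):
    """PERCLOS: fraction of frames in the most recent window during which
    the eye is at least 80% closed (openness <= 0.2).

    eye_openness: per-frame eye aperture in [0, 1], 1.0 = fully open,
    as estimated by an upstream eyelid-tracking step (assumed here).
    """
    n = max(1, int(window_s * fps))   # number of frames covering the window
    recent = eye_openness[-n:]        # most recent window of frames
    closed = sum(1 for o in recent if o <= closed_thresh)
    return closed / len(recent)
```

Running the same sliding window at a finer scale, looking for runs of closed frames lasting 1 to 2 seconds, is one way the microsleep episodes mentioned above could be flagged.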

Emotions are starting to enter the conversation around these algorithms[8]; knowing humans’ emotional state can help interpret stress levels and provide better assistance to the driver. But are we really aware of our own emotions? My psychologist might disagree with me. There are other health-related functionalities[9] that could indicate whether the driver has had a health incident that prevents them from driving, including impairment due to drug or alcohol consumption[10].

We make it difficult to be monitored. Since a seatbelt-buckle detector is not sufficient, algorithms are also being trained to recognize the belt itself, because drivers may have it buckled but not actually be wearing it[11]. Humans are complex, and as they continue to get comfortable in their cars, algorithms must constantly push their limits and learn from human behaviour. This involves improving not only their intelligence but also their perception, by incorporating cameras, radars, wearables, smart steering wheels or seats, microphones, and other specific sensors.

This human-machine cooperation can bring safety advantages and, in the future, comfort and entertainment functionalities as well. We just need to continue researching humans, understanding and including our diversity. In the Aware2All project, there are people developing technology for people, with an approach rooted in human-centred design. Together with algorithms, we can make more intelligent and safer vehicles accessible 2All.

[1] Euro NCAP. (2022). Euro NCAP Assessment Protocol – SA (Safe Driving) v10.01.

[2] Ghosh, S., Dhall, A., Hayat, M., Knibbe, J., & Ji, Q. (2021). Automatic Gaze Analysis: A Survey of Deep Learning based Approaches. arXiv preprint arXiv:2108.05479.

[3] Naqvi, R., Arsalan, M., Batchuluun, G., Yoon, H., & Park, K. (2018). Deep Learning-Based Gaze Detection System for Automobile Drivers Using a NIR Camera Sensor. Sensors, 18(2), 456.

[4] Cañas, P., Ortega, J., Nieto, M., & Otaegui, O. (2021). Detection of Distraction-related Actions on DMD: An Image and a Video-based Approach Comparison. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications – Volume 4: VISAPP (pp. 458-465). ISBN 978-989-758-488-6. DOI: 10.5220/0010244504580465

[5] Junaedi, S., & Akbar, H. (2018). Driver drowsiness detection based on face feature and PERCLOS. Journal of Physics: Conference Series, 1090(1), 012037.

[6] Lock, N. K., Ng, W. M., Jusoh, N. A., Kamarudin, N. H., Ramli, R., & Zulkoffli, Z. (2022). Drowsiness detection for safe driving using PERCLOS and YOLOv2 method. In S. A. Bakar, M. A. M. Radzi, & M. F. M. Yusof (Eds.), Technological advancement in instrumentation & human engineering (pp. 97-107). Springer Singapore.

[7] Akerstedt, T., & Gillberg, M. (1990). Subjective and objective sleepiness in the active individual. International Journal of Neuroscience, 52(1-2), 29-37.

[8] Oh, G., Ryu, J., Jeong, E., Yang, J. H., Hwang, S., Lee, S., & Lim, S. (2021). DRER: Deep Learning–Based Driver’s Real Emotion Recognizer. Sensors, 21(6), 2166.

[9] Babusiak, B., Hajducik, A., Medvecky, S., Lukac, M., & Klarak, J. (2021). Design of Smart Steering Wheel for Unobtrusive Health and Drowsiness Monitoring. Sensors, 21(16), 5285.

[10] Attia, H. A., Takruri, M., & Ali, H. Y. (2016, December). Electronic monitoring and protection system for drunk driver based on breath sample testing. In 2016 5th International Conference on Electronic Devices, Systems and Applications (ICEDSA) (pp. 1-4). IEEE.

[11] Khan, M. A., & Kim, J. (2020). Seat Belt Fastness Detection Based on Image Analysis from Vehicle In-Cabin Camera. In 2020 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia) (pp. 1-4). IEEE.

Written by Paola Natalia Cañas