Objects Motion Detection in Domain-adapted Assisted Driving

Francesco Rundo¹ ᵃ, Roberto Leotta and Sebastiano Battiato² ᵇ

¹ STMicroelectronics, ADG Central R&D, Catania, Italy

² Department of Mathematics and Computer Science, University of Catania, Catania, Italy

Keywords:

ADAS, Automotive, Deep Learning, Road Classification, Intelligent Suspension.

Abstract:

Modern Advanced Driver Assistance Systems (ADAS) have contributed to reducing road accidents caused by driver inexperience or unexpected scenarios. ADAS technologies enable intelligent monitoring of the driving scenario. Recently, the estimation of visual saliency, i.e. the part of the visual scene on which the driver focuses most visual attention, has received significant research interest. This work makes further contributions to video saliency investigation for automotive applications. The difficulty of collecting robust labeled data, together with the variety of driving scenarios, requires the use of domain adaptation methods. A new approach to Gradient-Reversal domain adaptation in deep architectures is proposed. More in detail, the proposed pipeline enables intelligent identification and segmentation of motion-salient objects in different driving scenarios and domains. The test results confirm the effectiveness of the overall proposed pipeline.

1 INTRODUCTION

The term Advanced Driver Assistance Systems (ADAS) covers different types of intelligent solutions, including systems that provide driver assistance, advice and warnings, autonomous driving and so on (Okuda et al., 2014). In-car information systems and LiDAR/RADAR-based applications also fall within this context (Spelt and Tufano, 1998). Recent ADAS enhancements include Intelligent Speed Adaptation systems, collision warning systems, car driver drowsiness monitoring and pedestrian tracking systems (Ogitsu and Mizoguchi, 2015; Wang et al., 2019; Ganin and Lempitsky, 2015). ADAS warnings may be auditory, visual or haptic, depending on the automation level defined by the automotive standards (Zhan et al., 2020). Deep Learning solutions have significantly improved the ability of algorithms to address several issues in the automotive and ADAS fields.

Often, the lack of labeled data can impact the performance of such artificial-intelligence-based solutions. To address this issue, ad-hoc intelligent domain adaptation approaches have been proposed in the scientific literature (Ganin and Lempitsky, 2015). The proposed pipeline embeds an innovative domain adaptation approach based on the Gradient Reversal algorithm. More in detail, the authors propose an overall ADAS system that combines a physio-based car driver drowsiness tracking system with a domain-adapted intelligent risk assessment of the associated driving scenario. Specifically, by means of the designed domain adaptation method, the proposed pipeline is able to detect and track the moving objects in the driving scene, providing an associated overall driving risk assessment. Different solutions have been proposed for object motion estimation; a summary of the state of the art is reported below.

ᵃ https://orcid.org/0000-0003-1766-3065

ᵇ https://orcid.org/0000-0001-6127-2470

In (Zheng et al., 2018) the authors proposed an odometry-based approach for object motion estimation that can be extended to the automotive market. The reported performance confirmed the effectiveness of the implemented pipeline. In (Barjenbruch et al., 2015) the authors implemented a motion detection pipeline based on the Doppler effect in radar technologies. Although the investigated approach showed very interesting results, the need for radar equipment was highlighted as a drawback. In (Hee Lee et al., 2013) the authors proposed a visual ego-motion estimation algorithm for a self-driving car equipped with a commercial multi-camera system. The results obtained over a large dataset confirmed the robustness of the proposed architecture (Hee Lee et al., 2013).

Rundo, F., Leotta, R. and Battiato, S.

Objects Motion Detection in Domain-adapted Assisted Driving.

DOI: 10.5220/0010973100003209

In Proceedings of the 2nd International Conference on Image Processing and Vision Engineering (IMPROVE 2022), pages 101-108

ISBN: 978-989-758-563-0; ISSN: 2795-4943


Further supervised and unsupervised deep solutions have been widely investigated with the aim of addressing efficient object tracking and motion estimation in driving scenes (Grigorescu et al., 2019; Alletto et al., 2018; Wang et al., 2021). Unsupervised and semi-supervised domain adaptation techniques have attracted considerable interest.

In (Singh et al., 2021) the authors proposed a semi-supervised domain adaptation approach which leverages limited labeled target samples together with unlabeled data to manage the distribution shift between the source and target domains. The approach contributed significantly to bridging the domain gap, as confirmed by the experimental results reported in (Singh et al., 2021). The survey reported in (Carré et al., 2018) showed the considerable advantages that domain adaptation techniques entail for ADAS and automotive applications. Regarding driver attention monitoring systems, the authors of the pipeline proposed herein have investigated the issue in depth (Vinciguerra et al., 2018; Conoci et al., 2018; Rundo et al., 2018a; Rundo et al., 2018b; Trenta et al., 2019; Rundo et al., 2019a; Rundo et al., 2020a; Rundo et al., 2020b). More in detail, this investigation has confirmed that the car driver's physiological signals, especially the Photoplethysmography (PPG) signal, can be efficiently used for real-time monitoring of the subject's drowsiness (Rundo et al., 2020b; Rundo et al., 2019b; Lee et al., 2019). The proposed full solution can be placed among the intelligent driving assistance approaches for different driving scenarios and domains.

2 DOMAIN ADAPTATION FOR MOTION ASSESSMENT

As introduced, the core of the proposed approach is a domain-adapted driving risk assessment system for ad-hoc object motion tracking. Fig. 1 shows the overall diagram of the proposed pipeline.

The system reported in Fig. 1 allows a robust driving risk level assessment through intelligent processing of the sampled driving visual frames. The sampled driving scene frames are processed by an ad-hoc Semantic Segmentation Fully Convolutional Neural Network embedding a Gradient Reversal Layer (SS-FCN-GRL) (Ganin and Lempitsky, 2015). Through a semantic segmentation of the captured driving visual frames, the saliency map of the analyzed source driving scene is reconstructed. This saliency map is fed as input to the downstream driving safety assessment sub-system to retrieve the correlated motion dynamics.

The proposed SS-FCN-GRL architecture is described in detail below. The designed encoder block (Encoding), composed of 5 blocks, processes the visual features of the captured driving frames. Each of the first two blocks embeds two separable convolution layers with 3 × 3 kernel filters followed by batch normalization, a ReLU layer and a downstream 2 × 2 max-pooling layer. Each of the remaining three blocks includes two separable convolution layers with 3 × 3 kernel filters followed by batch normalization, another convolutional layer with 3 × 3 kernel, batch normalization and ReLU, with a downstream 2 × 2 max-pooling layer. The Decoder stage of the proposed pipeline mirrors the encoder structure, i.e. it up-samples the encoded visual features through ad-hoc decoding processing. The decoder is composed of five blocks including 2D convolutional layers with 3 × 3 kernel, batch normalization layers and ReLU. Classical skip-connections through convolutional blocks are embedded in the backbone. On the decoder side, up-sampling blocks (with bi-cubic interpolation) are interleaved to adjust the size of the generated feature maps.
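As a rough illustration of how the five 2 × 2 max-pooling stages shrink the feature maps, the following Python sketch traces the spatial size after each encoder block. It assumes, since the text does not state it, that the 3 × 3 convolutions preserve the spatial size and that pooling uses stride 2; the function name and the 256 × 256 input frame are hypothetical.

```python
def encoder_feature_sizes(height, width, num_blocks=5, pool=2):
    """Return the (height, width) of the feature map after each encoder block.

    Assumptions (not stated in the paper): the 3x3 convolutions keep the
    spatial size ("same" padding) and each 2x2 max-pooling has stride 2 with
    floor rounding, so only the pooling layers change the resolution.
    """
    sizes = []
    for _ in range(num_blocks):
        height, width = height // pool, width // pool
        sizes.append((height, width))
    return sizes


if __name__ == "__main__":
    # Hypothetical input resolution; the paper does not give the frame size.
    for block, (h, w) in enumerate(encoder_feature_sizes(256, 256), start=1):
        print(f"encoder block {block}: {h} x {w}")
```

Under these assumptions the encoder reduces each spatial dimension by a factor of 32, which is the factor the five decoder up-sampling stages would have to recover.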

To improve the domain adaptation capability, the authors have embedded the mentioned Gradient Reversal Layer block (Ganin and Lempitsky, 2015) as per Fig. 1. The designed model works with input samples $x \in X$, where $X$ is the input space, and labels $y$ from the label space $Y$. We consider classification problems where $Y$ is a finite set ($Y = \{1, 2, \ldots, L\}$), although the approach can handle any output label space. We further assume that there exist two distributions $S(x, y)$ and $T(x, y)$ on $X \otimes Y$, which will be referred to as the source distribution and the target distribution (or the source domain and the target domain). Both distributions are assumed to be unknown, and specifically $S$ is “shifted” from $T$ through an unspecified domain re-mapping. Our objective is to predict the labels $y$ given the input $x$ for the target distribution. At training time, we assume that a large set of training samples $\{x_1, \ldots, x_N\}$ is collected from both the source and the target domains, distributed according to the distributions $S(x)$ and $T(x)$.

We denote with $d_i$ the binary variable (domain label) for the $i$-th example, which means that $x_i \sim S(x)$ if $d_i = 0$ and $x_i \sim T(x)$ if $d_i = 1$. We now define a custom deep feed-forward architecture which, for each input $x$, predicts its label $y \in Y$ and its domain label $d \in \{0, 1\}$. The input $x$ is first mapped by the function $G_f$ (a feature extractor) to a $D$-dimensional feature vector $f \in \mathbb{R}^D$. The proposed feature mapping includes feed-forward layers, and we denote the vector of parameters of all layers in this mapping as $\theta_f$, i.e. $f = G_f(x; \theta_f)$. Then, the feature vector $f$ is mapped by a mapping $G_y$ (label predictor) to the label $y$, and we denote the parameters of this mapping with $\theta_y$. Finally, the same feature vector $f$ is mapped to the domain label $d$ by a mapping $G_d$ (domain classifier) with parameters $\theta_d$ (Fig. 1).

Figure 1: The proposed domain adapted motion assessment pipeline.

During the learning session, the implemented deep system minimizes the label prediction loss on the labeled part (i.e. the source part) of the training set. The parameters of both the feature extractor and the label predictor are thus optimized in order to minimize the empirical loss for the source-domain samples. The proposed GRL approach is formalized as follows:

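$$E(\theta_f, \theta_y, \theta_d) = \sum_{i=1 \ldots N} L_y\big(G_y(G_f(x_i; \theta_f); \theta_y), y_i\big) - \lambda \sum_{i=1 \ldots N} L_d\big(G_d(G_f(x_i; \theta_f); \theta_d), d_i\big) = \sum_{i=1 \ldots N} L_y^i(\theta_f, \theta_y) - \lambda \sum_{i=1 \ldots N} L_d^i(\theta_f, \theta_d) \quad (1)$$

where $L_y(\cdot,\cdot)$ is the loss for the label prediction while $L_d(\cdot,\cdot)$ is the loss for the domain classification. The terms $L_y^i$ and $L_d^i$ denote the corresponding loss functions evaluated at the $i$-th training input frame. Based on our proposed architecture, we seek the parameters $\hat{\theta}_f, \hat{\theta}_y, \hat{\theta}_d$ that deliver a saddle point of Eq. 1:

$$(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d) \quad (2)$$

$$\hat{\theta}_d = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d) \quad (3)$$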

At the saddle point, the parameters $\theta_d$ of the domain classifier minimize the domain classification loss, while the parameters $\theta_y$ of the label predictor minimize the label prediction loss. The feature mapping parameters $\theta_f$ minimize the label prediction loss while maximizing the domain classification loss (i.e. the features are domain-invariant). The parameter $\lambda$ controls the trade-off between the two objectives. The authors who designed the GRL approach have shown that classical Stochastic Gradient Descent (SGD) learning is able to find the needed saddle point (Ganin and Lempitsky, 2015).
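The saddle-point search can be made concrete with a toy example. The sketch below is not the SS-FCN-GRL itself: it assumes a scalar model (one weight each for the feature extractor, label predictor and domain classifier), a squared-error label loss and a cross-entropy domain loss, all chosen only for illustration. It shows the one thing that gradient reversal changes in an SGD step: the domain-loss gradient reaches the feature extractor multiplied by -λ. The function names and the `has_label` flag (used to skip the label loss for unlabeled target samples) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grl_sgd_step(theta_f, theta_y, theta_d, x, y, d, lam, lr, has_label=True):
    """One SGD step on a scalar toy model with gradient reversal.

    Toy model (an assumption made for illustration):
      f     = theta_f * x                feature extractor
      y_hat = theta_y * f                label predictor, loss 0.5 * (y_hat - y)**2
      d_hat = sigmoid(theta_d * f)       domain classifier, cross-entropy loss
    """
    f = theta_f * x

    # Label branch: only samples that carry a label contribute.
    if has_label:
        err = theta_y * f - y
        grad_ly_theta_y = err * f
        grad_ly_theta_f = err * theta_y * x
    else:
        grad_ly_theta_y = 0.0
        grad_ly_theta_f = 0.0

    # Domain branch: derivative of cross-entropy w.r.t. the logit is (d_hat - d).
    d_err = sigmoid(theta_d * f) - d
    grad_ld_theta_d = d_err * f
    grad_ld_theta_f = d_err * theta_d * x

    # Gradient reversal: the feature extractor descends the label loss but
    # ascends the domain loss, scaled by lam.
    new_theta_f = theta_f - lr * (grad_ly_theta_f - lam * grad_ld_theta_f)
    new_theta_y = theta_y - lr * grad_ly_theta_y
    new_theta_d = theta_d - lr * grad_ld_theta_d
    return new_theta_f, new_theta_y, new_theta_d


if __name__ == "__main__":
    # Unlabeled target-domain sample (d = 1): only the domain branch acts.
    print(grl_sgd_step(1.0, 1.0, 1.0, x=1.0, y=0.0, d=1, lam=1.0, lr=0.1, has_label=False))
```

With λ = 0 the feature extractor ignores the domain classifier entirely; larger values push it harder towards features that the domain classifier cannot separate.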

After the SGD-based learning, the label predictor $y(x) = G_y(G_f(x; \theta_f); \theta_y)$ can be used to predict labels for samples from the target domain (as well as from the source domain). The output of the so designed SS-FCN-GRL is the feature saliency map of the sampled driving frame. Specifically, a non-zero saliency map is generated in the presence of moving objects, while a null saliency map is generated for slow-moving or static objects in the sampled driving scenario frames.

Through the action of the GRL, the proposed solution is able to train the deep network with both labeled source data and target data (frames from different driving scenarios), so as to retrieve a robust domain-invariant saliency map of the moving objects and characterize the driving risk accordingly.

In Fig. 2 the authors report some instances of the generated saliency maps for moving and slow-motion objects in the analyzed driving scenario.

Figure 2: Intelligent Driving Objects Tracking and Motion Detection: (a) High moving saliency objects tracking; (b) Slow motion objects tracking and detection (Null saliency map).

The proposed SS-FCN-GRL architecture has been validated and tested on the DHF1K dataset (Min and Corso, 2019). The proposed solution showed acceptable performance on the DHF1K dataset (Rundo et al., 2019b) (Area Under the Curve: 0.892; Similarity: 0.415; Correlation Coefficient: 0.480; Normalized Scanpath Saliency: 2.598) with respect to similar intelligent backbones. A careful porting of the proposed deep solution as per Fig. 1 runs on ad-hoc hardware with graphics/AI acceleration provided by STMicroelectronics (an MCU framework based on STA1295A Accordo5 and SPC5x Chorus devices) (Rundo et al., 2021).

3 THE PHYSIO-BASED CAR DRIVER DROWSINESS MONITORING

As introduced, the proposed ADAS solution embeds a robust intelligent car driver drowsiness monitoring system. Specifically, we propose a car driver attention level monitoring based on the driver's Photoplethysmographic (PPG) signal.

The PPG signal is a non-invasive physiological signal suitable for monitoring the cardiovascular dynamics of a human subject. Heart pulse and respiratory rate, as well as vascular and cardiac disorders, may be monitored by means of ad-hoc analysis of the PPG dynamics (Ganin and Lempitsky, 2015). Through the PPG signal the authors were able to perform a non-invasive measurement of the subject's blood volume changes. A classical PPG waveform embeds a pulsatile ('AC') physiological signal, correlated to cardiac-synchronous changes in the blood volume, superimposed on a slowly varying ('DC') component containing lower-frequency sub-signals correlated to respiration and other physiological parameters. The change in volume caused by the periodic heart pressure pulses can be tracked by illuminating the skin of the subject and then measuring the amount of light either transmitted or back-scattered, by means of an ad-hoc combined detector (Ogitsu and Mizoguchi, 2015; Wang et al., 2019). More details about PPG pattern formation can be found in (Okuda et al., 2014; Spelt and Tufano, 1998; Panagiotopoulos and Dimitrakopoulos, 2019; Ogitsu and Mizoguchi, 2015; Wang et al., 2019; Ganin and Lempitsky, 2015; Zhan et al., 2020). For the proposed pipeline, the authors have used a PPG sampling probe embedding a Silicon Photomultiplier (SiPM) device provided by STMicroelectronics (Vinciguerra et al., 2018; Conoci et al., 2018; Rundo et al., 2018a).
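To make the 'AC'/'DC' description of a PPG waveform concrete, the following sketch separates a sampled signal into a slowly varying trend and a pulsatile residual. A centered moving average is used here purely for illustration; it is not the filtering used in the proposed pipeline, and the sampling rate, window length and synthetic signal are hypothetical.

```python
import math


def split_ac_dc(samples, window):
    """Split a sampled signal into a pulsatile 'AC' part and a slow 'DC' part.

    The DC part is a centered moving average over about `window` samples
    (shortened at the borders); the AC part is the residual. The moving
    average is an illustrative stand-in, not the paper's filter.
    """
    n = len(samples)
    half = window // 2
    dc = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        dc.append(sum(samples[lo:hi]) / (hi - lo))
    ac = [s - m for s, m in zip(samples, dc)]
    return ac, dc


if __name__ == "__main__":
    fs = 100  # hypothetical sampling rate in Hz
    # Synthetic PPG-like signal: baseline + slow 0.2 Hz drift + 1.2 Hz pulse.
    signal = [
        2.0
        + 0.3 * math.sin(2 * math.pi * 0.2 * t / fs)
        + 0.1 * math.sin(2 * math.pi * 1.2 * t / fs)
        for t in range(10 * fs)
    ]
    ac, dc = split_ac_dc(signal, window=fs)
    print(f"AC peak-to-peak: {max(ac) - min(ac):.3f}, DC mean: {sum(dc) / len(dc):.3f}")
```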

The proposed PPG sensing probe includes large-area n-on-p Silicon Photomultipliers (SiPMs) fabricated at STMicroelectronics (Conoci et al., 2018; Rundo et al., 2018a). The SiPMs have a total area of 4.0 × 4.5 mm² and 4871 square microcells with 60 µm pitch. The devices have a geometrical fill factor of 67.4% and are packaged in a surface mount housing (SMD) with a total area of about 5.1 × 5.1 mm² (Conoci et al., 2018). A Pixelteq dichroic bandpass filter, with a pass band centered at about 840 nm, a Full Width at Half Maximum (FWHM) of 70 nm and an optical transmission higher than 90-95% in the pass band range, was glued on the SMD package using a Loctite 352TM adhesive. With the dichroic filter, at 3 V overvoltage (OV) the SiPM has a maximum detection efficiency of about 30% at 565 nm and a photon detection efficiency (PDE) of about 27.5% at 830 nm (central wavelength in the filter pass band). The dichroic filter reduces the absorption of environmental light by more than 60% when the detector works in the linear range in Geiger mode above its breakdown voltage (∼27 V).

As described, the designed PPG probe embeds a set of OSRAM LT M673 LEDs in SMD package, emitting at 830 nm and based on InGaN technology (Conoci et al., 2018). The LED devices have an area of 2.3 × 1.5 mm², a viewing angle of 120°, a spectral bandwidth of 33 nm and a low emission power (in the mW range) in the standard operation range. The authors designed an embedded motherboard populated by a 4 V portable battery, a power management circuit, a conditioning circuit for the output SiPM signals, several USB connectors for the PPG probes and related SMA output connectors (Conoci et al., 2018; Rundo et al., 2018b). Several PPG sensing probes are designed to be embedded in the car steering wheel.


Figure 3: The designed PPG sensing platform.

Fig. 3 reports an overall scheme of the proposed PPG sensing framework. The filtering and stabilization of the collected raw PPG signal are performed by the developed algorithms running as firmware on the SPC5x Chorus MCU (Conoci et al., 2018; Rundo et al., 2018b; Trenta et al., 2019; Rundo et al., 2019a; Rundo et al., 2020a; Rundo et al., 2020b; Rundo et al., 2019b).

The designed hyper-filtering approach (Rundo et al., 2018b; Rundo et al., 2019b) is applied to the collected steady-state raw PPG data in order to retrieve discriminative features to be correlated to the driver attention level.

More in detail, the idea behind the hyper-filtering approach was inspired by the hyper-spectral method usually applied to 2D data (Rundo et al., 2019b). Basically, the authors investigated the discrimination level of the features retrieved by the "hyper-filtering" of the source car driver PPG signal. Instead of applying a single filter setup (low-pass and high-pass) with a well-defined cut-off frequency, we have analyzed a range of frequencies in which the PPG signal shows useful information. Considering that the useful frequency range is 0.5-10 Hz, we have investigated the performance of a hyper-filtered PPG-based classification system in which the signal frequency spectrum (0.5-10 Hz) was divided into several sub-bands. We have configured two hyper-filtering layers. The first layer varies the cut-off frequency of the low-pass filter while keeping the cut-off frequency of the high-pass filter fixed (Hyper low-pass filtering layer); vice versa, the second layer varies the cut-off frequency of the high-pass filter while keeping the cut-off frequency of the low-pass filter fixed (Hyper high-pass filtering layer). Owing to their efficient noise modulation in the bandwidth, we adopted Butterworth filters in both hyper-filtering layers (Rundo et al., 2020a; Rundo et al., 2020b; Rundo et al., 2019b). Through a Reinforcement Learning algorithm (Rundo et al., 2020b; Rundo et al., 2019b) we optimized the setup of the hyper-filtering layers to be applied to the collected car driver PPG signal. This setup is reported in Table 1 and Table 2.

Figure 4: The proposed downstream PPG deep 1D-CNN classifier.

Table 1: Hyper Low-pass filtering setup (in Hz).

F    F1   F2   F3   F4   F5   F6   F7   F8   F9   F10  F11
HP   0.5  /    /    /    /    /    /    /    /    /    /
LP   0.0  1.2  3.3  3.5  3.6  3.8  4.0  4.2  5.0  5.1  6.1

Table 2: Hyper High-pass filtering setup (in Hz).

F    F1   F2   F3   F4   F5   F6   F7   F8   F9   F10  F11
HP   0.0  1.2  2.3  2.6  3.1  3.5  4.0  4.3  5.0  5.5  6.2
LP   0.6  /    /    /    /    /    /    /    /    /    /

At this point, once the hyper-filtering configuration has been completed, the raw car driver PPG signal, as it is progressively sampled, is processed according to the configured hyper-filtering framework. For each sample of a single PPG waveform, a set of hyper-filtered signals is generated, each with its own temporal dynamics at that sample. Formally, if we indicate with $W_{PPG}(t, k)$ the single segmented compliant PPG waveform of each hyper-filtered PPG time series, we compute for each sample $s(t, k)$ of the waveform a signal-pattern describing how that sample $s(t, k)$ changes in intensity across the previously generated hyper-filtered signals.
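One way to read the signal-pattern construction above is as a regrouping of the hyper-filtered waveforms by sample index. The sketch below follows that reading, which is an interpretation rather than the authors' code: given the same segmented waveform after each hyper-filter, it returns, for every sample index, the list of values that sample takes across the filters. The function name and the tiny input are hypothetical.

```python
def signal_patterns(filtered_waveforms):
    """Build one signal-pattern per sample index.

    `filtered_waveforms` holds the same segmented PPG waveform after each
    hyper-filter: one list of samples per filter, all of equal length.
    The pattern for sample index t collects the value of that sample in
    every hyper-filtered version, in filter order.
    """
    lengths = {len(w) for w in filtered_waveforms}
    if len(lengths) != 1:
        raise ValueError("all hyper-filtered waveforms must have the same length")
    return [list(values) for values in zip(*filtered_waveforms)]


if __name__ == "__main__":
    # Two hypothetical hyper-filtered versions of a 3-sample waveform.
    print(signal_patterns([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]))
```

With the 11 filters of each hyper-filtering layer, every sample would yield a pattern of 11 values per layer under this reading.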

Through the hyper-filtering pipeline detailed above, a large dataset of hyper-filtered signals is generated and collected. The above RL algorithm confirmed that a good optimization is reached with 11 sub-bands of hyper-filtering frequencies. The generated signal-patterns are fed into the Deep Learning block described in Fig. 4.

The signal-patterns generated by the previous hyper-filtering PPG processing pipeline (Rundo et al., 2019b) are preliminarily encoded by a Long Short-Term Memory (LSTM) network. The LSTM, composed of 6 layers with a vanilla-unit backbone (Trenta et al., 2019), encodes the hyper-filtered signal-patterns in order to provide a folding-embedding of the input signals. These features are fed as input to the downstream deep 1D classifier (Rundo et al., 2018b; Trenta et al., 2019), a Deep 1D Temporal Dilated Convolutional Neural Network (1D-CNN) with residual blocks (Rundo et al., 2020b). Specifically, we have implemented a 1D-CNN embedding 36 blocks with a downstream softmax layer. The output of the deep network is a classification of the input hyper-filtered PPG patterns, i.e. a score classifying the driver as drowsy (0.0 to 0.5) or wakeful (0.51 to 1.0).

Each 1D-CNN block consists of a dilated convolution layer with 3 × 3 kernel filters, a spatial dropout layer, another dilated convolution layer, a ReLU layer and a final spatial dropout layer. The dilation size starts from 2 and increases (by powers of 2) at each block up to the maximum value of 32. A softmax layer completes the proposed pipeline. The designed Deep Learning framework is able to estimate and monitor the car driver drowsiness level. As reported in Fig. 4, the proposed deep classifier runs on the STA1295A Accordo5 MCU with an ad-hoc graphics accelerator and a software framework based on Linux YOCTO and OpenCV (Rundo et al., 2020b; Rundo et al., 2019b).
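The dilation rule can be written down in a few lines. The text states that the dilation starts at 2 and doubles up to 32, but not what happens in the remaining blocks of the 36-block network; the sketch below assumes the dilation is simply held at the maximum, which is only one possible reading (cycling back to 2 would be another). The function name is hypothetical.

```python
def dilation_schedule(num_blocks=36, start=2, max_dilation=32):
    """Return one dilation size per block.

    The dilation doubles from `start` at each block until it reaches
    `max_dilation`. Holding it at the cap afterwards is an assumption;
    the paper does not specify the behaviour beyond the maximum.
    """
    dilations = []
    dilation = start
    for _ in range(num_blocks):
        dilations.append(dilation)
        dilation = min(dilation * 2, max_dilation)
    return dilations


if __name__ == "__main__":
    print(dilation_schedule()[:8])
```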

4 EXPERIMENTAL RESULTS AND CONCLUSION

We tested the proposed pipeline by first validating each of the implemented sub-systems and then arranging a combined testing scenario. Specifically, we have considered the following risk assessment in relation to the tracking of the salient moving objects: a detected non-zero saliency map for salient moving objects (medium/high driving risk) against a scenario with slow-moving salient objects and an associated null saliency map (low-risk driving scenario).

Therefore, the proposed full pipeline provides an overall driving risk assessment by comparing the saliency-motion-based risk evaluation with the PPG physio-based drowsiness monitoring, checking whether the driver's attention level is adequate or not.

More in detail, if a high or medium risk level is detected, the proposed driving monitoring system checks whether the designed 1D-TCNN detects a corresponding "wakeful driver" classification. Otherwise, an acoustic alert signal is generated. For the scenario in which the driver's PPG signal is not available for some reason, the authors have developed a Visual-to-PPG replacement algorithm (Trenta et al., 2019).
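The decision rule described above can be sketched as a small piece of glue logic. The code below is a hedged illustration, not the deployed firmware: it treats the saliency map as a nested list of numbers, maps any non-zero entry to a medium/high risk, and assumes that every classifier score above 0.5 counts as wakeful (the text gives the ranges 0.0 to 0.5 and 0.51 to 1.0 and leaves the gap between them open). All function names are hypothetical.

```python
def driver_state(score):
    """Map the classifier output score in [0, 1] to a driver state.

    Treating every score above 0.5 as wakeful is an assumption that
    closes the gap between the two ranges given in the text.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return "wakeful" if score > 0.5 else "drowsy"


def scene_risk(saliency_map):
    """Return 'medium/high' for a non-zero saliency map, 'low' for a null map."""
    has_motion = any(value != 0 for row in saliency_map for value in row)
    return "medium/high" if has_motion else "low"


def acoustic_alert(saliency_map, score):
    """Raise the alert when the scene is risky and the driver is not wakeful."""
    return scene_risk(saliency_map) == "medium/high" and driver_state(score) != "wakeful"


if __name__ == "__main__":
    moving = [[0, 0], [0, 1]]  # hypothetical non-zero saliency map
    print(acoustic_alert(moving, 0.2), acoustic_alert(moving, 0.9))
```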

Regarding the physio-based car driver drowsiness assessment, we have validated the proposed pipeline by collecting several PPG measurements from different subjects in different scenarios (drowsy driver vs wakeful driver) under the authorization of the Ethical Committee CT1 (authorization Nr. 113 / 2018 / PO). The PPG sampling sessions were supervised by experienced physicians. We collected data from 70 patients of different ages, sex, and so on (Rundo et al., 2019a). We used the PPG hardware setup introduced herein with a sampling frequency of 1 kHz. For each condition (drowsy vs wakeful subject) we collected 5 minutes of PPG signal. The acquired PPG time series were arranged as follows: 70% for the training and validation phase of the Deep Learning framework, while the remaining 30% were used for testing. We used different driving scenarios: some labeled driving scenarios and some unlabeled target-domain scenarios for testing the proposed GRL approach. The results reported in Table 3 confirm that the physio-sensing system for car driver drowsiness monitoring outperforms similar pipelines in terms of accuracy.

Table 3: Car Driver Drowsiness monitoring performance.

                            Driver Drowsiness Monitoring
Method                      Drowsy Driver    Wakeful Driver
Proposed                    99.76%           99.89%
1D-Temporal CNN w/o LSTM    98.71%           99.03%
(Rundo et al., 2019b)       96.50%           98.40%

The SS-FCN-GRL and the 1D-TCNN were trained with a classic SGD algorithm with a dropout factor of 0.75 and an initial learning rate of 0.001. The LSTM layer was trained with an initial learning rate of 0.002. Table 3 shows the performance of the proposed pipeline compared with a similar deep-learning-based pipeline (Rundo et al., 2019b), both in labeled and in target domain-adapted driving scenarios. We consider intervals of about 8-12 seconds of PPG sampling in order to show the near real-time performance of the proposed pipeline. Finally, we tested the combined full system. Specifically, we validated the proposed low/high risk assessment of the analyzed driving scenarios. As highlighted by the experimental results reported in Table 4, the architecture that embeds the domain adaptation GRL layer shows higher performance in risk assessment than the benchmark architectures. The use of the GRL significantly improves the characterization of the motion of the tracked objects and therefore the consequent risk assessment (average accuracy of about 96% against about 91% for the system without GRL).

The collected results confirm the robustness and the effectiveness of the proposed overall approach.


Table 4: Proposed Intelligent Saliency-Motion Driving risk assessment system.

                                                            Intelligent Domain Adapted Risk Assessment Performance
Method                                                      Low Risk (Static Salient Objects)    High Risk (Salient Moving Objects)
Proposed                                                    96.78%                               96.66%
Proposed w/o GRL                                            91.87%                               91.09%
Classic SS-FCN With Attention (Rundo et al., 2021)          91.65%                               90.90%
Classic SS-FCN DenseNet Backbone                            89.30%                               90.11%
Classic SS-FCN ResNet-101 backbone (Min and Corso, 2019)    87.79%                               90.01%

Future work aims at embedding features of a deep LSTM with Attention to further improve the performance of the overall pipeline (Rundo, 2019).

ACKNOWLEDGEMENTS

The authors thank the physiologists belonging to the Department of Biomedical and Biotechnological Sciences (BIOMETEC) of the University of Catania, who collaborated in this work in the context of the clinical study Ethical Committee CT1 authorization n.113 / 2018 / PO. This research was funded by the National Funded Program 2014-2020 under grant agreement n. 1733 (ADAS + Project). The reported information is covered by the following registered patents: IT Patent Nr. 102017000120714, 24 October 2017; IT Patent Nr. 102019000005868, 16 April 2018; IT Patent Nr. 102019000000133, 07 January 2019.

REFERENCES

Alletto, S., Abati, D., Calderara, S., Cucchiara, R., and Rigazio, L. (2018). Self-supervised optical flow estimation by projective bootstrap. IEEE Transactions on Intelligent Transportation Systems, 20(9):3294–3302.

Barjenbruch, M., Kellner, D., Klappstein, J., Dickmann, J., and Dietmayer, K. (2015). Joint spatial- and Doppler-based ego-motion estimation for automotive radars. In 2015 IEEE Intelligent Vehicles Symposium (IV), pages 839–844. IEEE.

Carré, M., Exposito, E., and Ibanez-Guzman, J. (2018). Challenges for the self-safety in autonomous vehicles. In 2018 13th Annual Conference on System of Systems Engineering (SoSE), pages 181–188. IEEE.

Conoci, S., Rundo, F., Fallica, G., Lena, D., Buraioli, I., and Demarchi, D. (2018). Live demonstration of portable systems based on silicon sensors for the monitoring of physiological parameters of driver drowsiness and pulse wave velocity. In 2018 IEEE Biomedical Circuits and Systems Conference (BioCAS), pages 1–3. IEEE.

Ganin, Y. and Lempitsky, V. (2015). Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning, pages 1180–1189. PMLR.

Grigorescu, S. M., Trasnea, B., Marina, L., Vasilcoi, A., and Cocias, T. (2019). NeuroTrajectory: A neuroevolutionary approach to local state trajectory learning for autonomous vehicles. IEEE Robotics and Automation Letters, 4(4):3441–3448.

Hee Lee, G., Fraundorfer, F., and Pollefeys, M. (2013). Motion estimation for self-driving cars with a generalized camera. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2746–2753.

Lee, H., Lee, J., and Shin, M. (2019). Using wearable ECG/PPG sensors for driver drowsiness detection based on distinguishable pattern of recurrence plots. Electronics, 8(2):192.

Min, K. and Corso, J. J. (2019). TASED-Net: Temporally-aggregating spatial encoder-decoder network for video saliency detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2394–2403.

Ogitsu, T. and Mizoguchi, H. (2015). A study on driver training on advanced driver assistance systems by using a driving simulator. In 2015 International Conference on Connected Vehicles and Expo (ICCVE), pages 352–353. IEEE.

Okuda, R., Kajiwara, Y., and Terashima, K. (2014). A survey of technical trend of ADAS and autonomous driving. In Technical Papers of 2014 International Symposium on VLSI Design, Automation and Test, pages 1–4. IEEE.

Panagiotopoulos, I. and Dimitrakopoulos, G. (2019). Cognitive infotainment systems for intelligent vehicles. In 2019 10th International Conference on Information, Intelligence, Systems and Applications (IISA), pages 1–8. IEEE.

Rundo, F. (2019). Deep LSTM with reinforcement learning layer for financial trend prediction in FX high frequency trading systems. Applied Sciences, 9(20):4460.

Rundo, F., Conoci, S., Battiato, S., Trenta, F., and Spampinato, C. (2020a). Innovative saliency based deep driving scene understanding system for automatic safety assessment in next-generation cars. In 2020 AEIT International Conference of Electrical and Electronic Technologies for Automotive (AEIT AUTOMOTIVE), pages 1–6. IEEE.

Rundo, F., Conoci, S., Ortis, A., and Battiato, S. (2018a). An advanced bio-inspired photoplethysmography (PPG) and ECG pattern recognition system for medical assessment. Sensors, 18(2):405.

Rundo, F., Leotta, R., and Battiato, S. (2021). Real-time deep neuro-vision embedded processing system for saliency-based car driving safety monitoring. In 2021 4th International Conference on Circuits, Systems and Simulation (ICCSS), pages 218–224. IEEE.


Rundo, F., Petralia, S., Fallica, G., and Conoci, S. (2018b). A nonlinear pattern recognition pipeline for PPG/ECG medical assessments. In Convegno Nazionale Sensori, pages 473–480. Springer.

Rundo, F., Rinella, S., Massimino, S., Coco, M., Fallica, G., Parenti, R., Conoci, S., and Perciavalle, V. (2019a). An innovative deep learning algorithm for drowsiness detection from EEG signal. Computation, 7(1):13.

Rundo, F., Spampinato, C., Battiato, S., Trenta, F., and Conoci, S. (2020b). Advanced 1D temporal deep dilated convolutional embedded perceptual system for fast car-driver drowsiness monitoring. In 2020 AEIT International Conference of Electrical and Electronic Technologies for Automotive (AEIT AUTOMOTIVE), pages 1–6. IEEE.

Rundo, F., Spampinato, C., and Conoci, S. (2019b). Ad-hoc shallow neural network to learn hyper filtered photoplethysmographic (PPG) signal for efficient car-driver drowsiness monitoring. Electronics, 8(8):890.

Singh, A., Doraiswamy, N., Takamuku, S., Bhalerao, M., Dutta, T., Biswas, S., Chepuri, A., Vengatesan, B., and Natori, N. (2021). Improving semi-supervised domain adaptation using effective target selection and semantics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2709–2718.

Spelt, P. and Tufano, D. (1998). An in-vehicle information system for ITS information management. In 17th DASC. AIAA/IEEE/SAE. Digital Avionics Systems Conference. Proceedings (Cat. No. 98CH36267), volume 2, pages I31–1. IEEE.

Trenta, F., Conoci, S., Rundo, F., and Battiato, S. (2019). Advanced motion-tracking system with multi-layers deep learning framework for innovative car-driver drowsiness monitoring. In 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), pages 1–5. IEEE.

Vinciguerra, V., Ambra, E., Maddiona, L., Romeo, M., Mazzillo, M., Rundo, F., Fallica, G., di Pompeo, F., Chiarelli, A. M., Zappasodi, F., et al. (2018). PPG/ECG multisite combo system based on SiPM technology. In Convegno Nazionale Sensori, pages 353–360. Springer.

Wang, C., Sun, Q., Guo, Y., Fu, R., and Yuan, W. (2019). Improving the user acceptability of advanced driver assistance systems based on different driving styles: A case study of lane change warning systems. IEEE Transactions on Intelligent Transportation Systems, 21(10):4196–4208.

Wang, K., Jiasheng, N., and Yanqiang, L. (2021). A robust LiDAR state estimation and map building approach for urban road. In 2021 IEEE 2nd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), pages 502–506. IEEE.

Zhan, H., Wan, D., and Huang, Z. (2020). On the responsible subjects of self-driving cars under the SAE system: An improvement scheme. In 2020 IEEE International Symposium on Circuits and Systems (ISCAS), pages 1–5. IEEE.

Zheng, F., Tang, H., and Liu, Y.-H. (2018). Odometry-vision-based ground vehicle motion estimation with SE(2)-constrained SE(3) poses. IEEE Transactions on Cybernetics, 49(7):2652–2663.
