and identically distributed (i.i.d.) with respect to the
training set. Instead, one ought to use one or several
datasets for testing and evaluation that differ
systematically from the training dataset, making
them out-of-distribution (o.o.d.).
As mentioned previously, this is infeasible here,
which means that the final model evaluation is
susceptible to shortcut learning. The evaluation data is
related to the training data in two ways. First, the
evaluation data comes from the same dataset, making
it i.i.d. with respect to the motions it contains.
Second, the noise used to generate the input motions
is not the actual noise introduced by a webcam-based
pose estimation pipeline, but the same noise
estimate that was used when training the network.
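
The two dependencies described above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the Gaussian noise model, the noise level, the split fraction, the toy data, and the names `add_synthetic_noise` and `make_split` are all assumptions made for illustration. The point is only that both sets are drawn from one dataset and corrupted by one noise function, so the evaluation set is i.i.d. with respect to the training set.

```python
import random


def add_synthetic_noise(motion, sigma, rng):
    """Perturb every coordinate with Gaussian noise (assumed noise model)."""
    return [[x + rng.gauss(0.0, sigma) for x in frame] for frame in motion]


def make_split(clips, eval_fraction, rng):
    """Shuffle clips from ONE dataset and split into train and evaluation sets."""
    shuffled = clips[:]
    rng.shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


rng = random.Random(0)

# Toy dataset: 10 clips, each with 5 frames of 3 coordinates (placeholder values).
clips = [[[float(c + f + j) for j in range(3)] for f in range(5)] for c in range(10)]

# Dependency 1: training and evaluation clips come from the same dataset.
train_clips, eval_clips = make_split(clips, 0.2, rng)

# Dependency 2: the same synthetic noise model corrupts both sets.
SIGMA = 0.05  # assumed noise level, identical for training and evaluation
train_pairs = [(add_synthetic_noise(m, SIGMA, rng), m) for m in train_clips]
eval_pairs = [(add_synthetic_noise(m, SIGMA, rng), m) for m in eval_clips]
```

An o.o.d. evaluation would replace both ingredients for the evaluation set: clips from a different dataset, and inputs corrupted by a real pose estimation pipeline rather than by `add_synthetic_noise`.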
As such, the validation data used represents the
best possible effort given the limited data availability
for this task. However, should datasets of skinned
human motion augmentation become generally available,
it would be desirable to re-evaluate the
model on those o.o.d. datasets.
An LSTM-based prediction model is constructed and
shown to be competitive with prior work on the task
of predicting human motion. The same approach is
then used to train an augmentation model that is
capable of cleaning up and merging two noisy motions
into a single motion. This shows that an LSTM-based
architecture is viable for augmenting human motions
when evaluated on generated data. The lack of
annotated data to evaluate on means that it is unclear
how the model performs on real-life data. Overcoming
this limitation and implementing various potential
improvements is a topic for future work.
Neural Network-based Human Motion Smoother