This avoids the need to change large parts
of the implementation when replacing the DB.
8 SUMMARY
At the beginning of the paper, we showed that specific
access patterns arise when DBs are integrated into DSP
applications. Based on this knowledge, we investigated
which DBs are well suited for which types of access and
how well they interact with Apache Flink for sensor data
processing use cases. We also examined the impact of
windowing mechanisms on data processing and the
usefulness of Hazelcast, which we used as a data cache
and write buffer.
The results indicate that the suitability of a DB
depends heavily on the access pattern that is typical
for the particular use case. Benchmarking realistic
heatmap queries, for example, showed throughput
differences of up to a factor of 46.2 (MariaDB: 55,135
tuples/s, MongoDB: 1,193 tuples/s). Therefore, we cannot
make a general recommendation for a particular DB.
Instead, we advise determining the most frequent access
pattern of the use case under consideration and choosing
the DB accordingly.
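The factor quoted above follows directly from the two reported throughput values. The following minimal sketch reproduces the arithmetic; the function name `throughput_factor` and the dictionary layout are illustrative assumptions and not part of the benchmark code used in the paper.

```python
# Throughput values reported in the summary for the heatmap queries (tuples/s).
throughput = {"MariaDB": 55_135, "MongoDB": 1_193}


def throughput_factor(results):
    """Return (fastest, slowest, factor) for a mapping of DB name to throughput.

    Hypothetical helper for illustration only.
    """
    fastest = max(results, key=results.get)
    slowest = min(results, key=results.get)
    return fastest, slowest, results[fastest] / results[slowest]


fastest, slowest, factor = throughput_factor(throughput)
print(f"{fastest} vs. {slowest}: factor {factor:.1f}")
```

With the two values above, the ratio 55,135 / 1,193 rounds to 46.2.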
Using Hazelcast as a data cache brought hardly any
advantage for read access in our use case, but it often
achieved higher throughput for write access. Whether
Hazelcast is beneficial depends on a large number of
mutually influencing factors and should therefore always
be examined for the specific use case.
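To illustrate why a write buffer can change the access pattern seen by the DB, the following sketch collects individual records in memory and hands them to the DB in batches. It is a generic, simplified illustration and does not use the Hazelcast API; the class `WriteBuffer`, the `Record` type, and the `batch_size` parameter are assumptions made for this example.

```python
from typing import Callable, List, Tuple

Record = Tuple[str, float]  # hypothetical (sensor_id, value) pair


class WriteBuffer:
    """Collects records in memory and passes them on in batches."""

    def __init__(self, flush: Callable[[List[Record]], None], batch_size: int = 100):
        self._flush = flush            # callback that performs the actual DB write
        self._batch_size = batch_size  # number of records per batch write
        self._pending: List[Record] = []

    def put(self, record: Record) -> None:
        """Buffer one record; flush once the batch size is reached."""
        self._pending.append(record)
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Write all pending records as one batch, if there are any."""
        if self._pending:
            batch, self._pending = self._pending, []
            self._flush(batch)


# Usage: a list stands in for the DB and records each batch write.
batches: List[List[Record]] = []
buf = WriteBuffer(flush=batches.append, batch_size=3)
for i in range(7):
    buf.put((f"sensor-{i}", float(i)))
buf.flush()  # write the remaining records
print([len(b) for b in batches])
```

In this sketch, seven single writes reach the DB as three batch writes. Whether such batching pays off depends on how the particular DB handles batched versus single writes, which is consistent with the use-case-specific results described above.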
Finally, we presented a set of recommendations for the
integration of DBs into DSP applications, based on the
knowledge we gained during our analyses. These can help
to avoid implementation problems from the very beginning
and to achieve quick optimization gains if the
implementation does not meet the performance requirements
of the use case.
8.1 Future Work
We found that the use of Hazelcast changed the access
patterns to the DB systems, which had positive or
negative consequences for throughput depending on the
particular database used. In future research, we plan to
investigate this interaction between the components in
more detail to determine whether higher processing
performance can be achieved for specific access patterns
through the targeted use of in-memory data grids. It may
also be possible to achieve a (partial) decoupling of the
database access patterns from the cyclic processes in DSP
with this approach.
ACKNOWLEDGEMENTS
This work is financed by the German Federal Ministry
of Transport and Digital Infrastructure (BMVI) within
the research initiative mFUND (FKZ: 19F2011A).
We would like to thank Hannes Hilbert, who provided us
with great support in the implementation and execution
of the experiments. Finally, we would like to thank the
Center for Information Services and High Performance
Computing (ZIH) for providing the servers used for the
measurements.
CLOSER 2022 - 12th International Conference on Cloud Computing and Services Science