First we consider the case where the attacker has
access to a BF encoded database E and the same
dataset in plaintext P. Clearly their overlap rate is
100%. Table 5 shows results for the graph matching
attack in this scenario. It is apparent that
• without salting, the attacker could re-identify
records with a maximum F-measure approaching
or exceeding 98%;
• With salting, the performance of the attacks drops,
regardless which measure is used for comparison.
Interestingly, this drop decreases with increasing
sample size.
However, a stable salt value, almost unique for each
record, could effectively thwart the graph-matching
attack. Salt-ALL reduced the maximum F-measure
of the re-identification from over 97% to below 0.5%.
Among the other salt variants, salt-YOB performs
best, especially for smaller samples.
As second example, we consider a case where
r
overlap
6= 100% but r
overlap
> 80%. Table 6 shows the
importance of the overlap rate for the success of the
graph matching attack. Given an overlap rate above
80%, both the maximum number of correct record
re-identifications and the maximum F-measure drops
strongly compared to the previous example given per-
fect overlap. For example, the maximum F-measure
drops from above 98% to about 1% ∼ 4%.
8 CONCLUSION
We studied the effect of salting on the resilience
against pattern mining and graph matching attacks in
PPRL. Salting was shown to be an effective counter-
measure against pattern mining attacks while less ef-
fective against graph matching attacks (unless a sta-
ble salt that is almost unique to each record value is
available). As an additional safeguard, salting is rec-
ommended for perturbation-based PPRL whenever a
stable salt is available.
REFERENCES
Antoni, M. and Schnell, R. (2019). The past, present
and future of the German Record Linkage Cen-
ter (GRLC). Jahrb
¨
ucher f
¨
ur National
¨
okonomie und
Statistik, 239(2):319 – 331.
Bloom, B. H. (1970). Space/time trade-offs in hash coding
with allowable errors. Communications of the ACM,
13:422–426.
Boyd, J., Randall, S., and Ferrante, A. (2015). Application
of privacy preserving techniques in operational record
linkage centres. In Medical Data Privacy Handbook,
pages 267–287. Springer, Netherlands.
Christen, P., Ranbaduge, T., and Schnell, R. (2020). Linking
Sensitive Data. Methods and Techniques for Practical
Privacy-Preserving Information Sharing. Springer,
Cham, Switzerland.
Christen, P., Ranbaduge, T., Vatsalan, D., and Schnell, R.
(2019). Precise and fast cryptanalysis for Bloom fil-
ter based privacy-preserving record linkage. IEEE
Transactions on Knowledge and Data Engineering,
31(11):2164–2177.
Christen, P., Vidanage, A., Ranbaduge, T., and Schnell, R.
(2018). Pattern-mining based cryptanalysis of Bloom
filters for privacy-preserving record linkage. In Ad-
vances in Knowledge Discovery and Data Mining,
pages 530–542, Cham. Springer Int. Publishing.
Cover, T. M. and Thomas, J. A. (2006). Elements of Infor-
mation Theory. Wiley-Interscience, USA.
Culnane, C., Rubinstein, B. I. P., and Teague, V. (2017).
Vulnerabilities in the use of similarity tables in com-
bination with pseudonymisation to preserve data pri-
vacy in the UK Office for National Statistics’ privacy-
preserving record linkage. CoRR, abs/1712.00871.
Durham, E. A., Kantarcioglu, M., Xue, Y., Toth, C., Kuzu,
M., and Malin, B. (2014). Composite Bloom filters for
secure record linkage. IEEE Transactions on Knowl-
edge and Data Engineering, 26(12):2956–2968.
Franke, M., Sehili, Z., Rohde, F., and Rahm, E. (2021).
Evaluation of hardening techniques for privacy-
preserving record linkage. In Proceedings of the
24th International Conference on Extending Database
Technology, EDBT 2021, Nicosia, Cyprus, March 23 -
26, 2021, pages 289–300. OpenProceedings.org.
Hand, D. and Christen, P. (2017). A note on using the
F-measure for evaluating record linkage algorithms.
Statistics and Computing, 28:539–547.
Kuzu, M., Kantarcioglu, M., Durham, E., and Malin, B.
(2011). A constraint satisfaction cryptanalysis of
Bloom filters in private record linkage. In Privacy En-
hancing Technologies 11th International Symposium,
PETS 2011 Waterloo, ON, Canada, July 27-29, 2011,
volume 6794, pages 226–245, Heidelberg. Springer.
Niedermeyer, F., Steinmetzer, S., Kroll, M., and Schnell, R.
(2014). Cryptanalysis of basic Bloom filters used for
privacy preserving record linkage. Journal of Privacy
and Confidentiality, 6(2):59–69.
Schnell, R., Bachteler, T., and Reiher, J. (2009). Privacy-
preserving Record Linkage Using Bloom filters. BMC
Medical Informatics & Decision Making, 9:41.
Schnell, R., Bachteler, T., and Reiher, J. (2011). A
novel error-tolerant anonymous linking code. Work-
ing Paper WP-GRLC-2011-02, German Record Link-
age Center, Duisburg.
Vatsalan, D., Christen, P., and Verykios, V. S. (2013). A
taxonomy of privacy-preserving record linkage tech-
niques. Information Systems, 38(6):946–969.
Vidanage, A., Christen, P., Ranbaduge, T., and Schnell,
R. (2020). A graph matching attack on privacy-
preserving record linkage. In CIKM ’20: The 29th
ACM International Conference on Information and
Knowledge Management, Virtual Event, Ireland, Oc-
tober 19-23, 2020, pages 1485–1494. ACM.
HEALTHINF 2022 - 15th International Conference on Health Informatics
360