but during training to generate pseudo-labels. This increases the precision of examples labeled not only at the root, but also at inner nodes of the class hierarchy.
To implement this self-supervised learning scheme, we describe several possible strategies for deciding which candidate pseudo-labels are reliable enough for training. These strategies employ heuristic, structural, and statistical criteria. Our experiments show that an accuracy increase of around one percentage point can be expected simply by using one of our self-supervised strategies on top of CHILLAX. This improvement requires neither tuning of unrelated hyperparameters nor undue computational effort.
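As an illustration of such a reliability criterion, the following sketch selects a pseudo-label as the most specific hierarchy node whose predicted confidence clears a threshold; less confident predictions fall back to more general inner nodes. The hierarchy, scores, and threshold here are hypothetical, and the code only conveys the general idea, not the exact strategies evaluated with CHILLAX:

```python
# Hypothetical parent map for a tiny class hierarchy:
# root -> {animal, vehicle}, animal -> {dog, cat}
PARENTS = {"animal": "root", "vehicle": "root", "dog": "animal", "cat": "animal"}

def node_confidence(leaf_scores, node):
    """Confidence of a node = sum of the scores of all leaves at or below it."""
    if node == "root":
        return sum(leaf_scores.values())
    total = 0.0
    for leaf, score in leaf_scores.items():
        n = leaf
        while n is not None:  # walk up from the leaf towards the root
            if n == node:
                total += score
                break
            n = PARENTS.get(n)
    return total

def pseudo_label(leaf_scores, threshold=0.9):
    """Return the most specific node whose aggregated confidence exceeds
    the threshold; None if even the root is not confident enough."""
    # Candidates ordered from most to least specific (deepest first).
    for node in ("dog", "cat", "vehicle", "animal", "root"):
        if node_confidence(leaf_scores, node) >= threshold:
            return node
    return None
```

With scores split between `dog` and `cat`, neither leaf alone clears the threshold, but their common ancestor `animal` does, so the example receives an inner-node pseudo-label rather than an unreliable leaf label.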
Future Work. In the future, these methods could also be applied to semi-supervised learning tasks in general, e.g., by assigning a root label to the unlabeled images as long as a closed-world scenario can be assumed. Furthermore, the individual heuristics could be combined into a meta-heuristic. Conversely, relaxing the closed-world assumption is another important research direction. Asking a hierarchical classifier for its confidence in the root node is a first step towards open-set models from a semantic perspective, as long as the predicted confidence has a reasonable basis. A fixed hierarchy is a further limiting assumption, which could be relaxed, e.g., in a lifelong learning setting.
The research on semantically imprecise data in general could be expanded to domains beyond natural images. For example, we expect source code to have a stronger feature-semantic correspondence, which is crucial for the hierarchical classifier. In particular, human-made hierarchies such as the Common Weakness Enumeration (CWE; The MITRE Corporation, 2021) explicitly consider certain features of program code to determine categories. Even in the visual domain, there are efforts to construct more visual-feature-oriented hierarchies, e.g., accompanying WikiChurches (Barz and Denzler, 2021).
ACKNOWLEDGMENTS
The computational experiments were performed on resources of Friedrich Schiller University Jena supported in part by DFG grants INST 275/334-1 FUGG and INST 275/363-1 FUGG.
REFERENCES
Abassi, L. and Boukhris, I. (2020). Imprecise label aggregation approach under the belief function theory. In Abraham, A., Cherukuri, A. K., Melin, P., and Gandhi, N., editors, Intelligent Systems Design and Applications, pages 607–616. Springer International Publishing.
Ambroise, C., Denœux, T., and Govaert, G. (2001). Learning from an imprecise teacher: probabilistic and evidential approaches. Applied Stochastic Models and Data Analysis, page 6.
Barz, B. and Denzler, J. (2021). WikiChurches: A fine-grained dataset of architectural styles with real-world challenges. In Neural Information Processing Systems (NeurIPS).
Brust, C.-A., Barz, B., and Denzler, J. (2020). Making every label count: Handling semantic imprecision by integrating domain knowledge. In International Conference on Pattern Recognition (ICPR).
Brust, C.-A. and Denzler, J. (2019). Integrating domain knowledge: using hierarchies to improve deep classifiers. In Asian Conference on Pattern Recognition (ACPR).
Deng, J., Ding, N., Jia, Y., Frome, A., Murphy, K., Bengio, S., Li, Y., Neven, H., and Adam, H. (2014). Large-scale object classification using label relation graphs. In European Conference on Computer Vision (ECCV).
Deng, J., Krause, J., Berg, A. C., and Fei-Fei, L. (2012). Hedging your bets: Optimizing accuracy-specificity trade-offs in large scale visual recognition. In Computer Vision and Pattern Recognition (CVPR).
Harispe, S., Ranwez, S., Janaqi, S., and Montmain, J. (2015). Semantic similarity from natural language and ontology analysis. Synthesis Lectures on Human Language Technologies, 8(1):1–254.
Kolesnikov, A., Zhai, X., and Beyer, L. (2019). Revisiting self-supervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
Lee, D.-H. (2013). Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. In International Conference on Machine Learning Workshops (ICML-WS), page 6.
McAuley, J. J., Ramisa, A., and Caetano, T. S. (2013). Optimization of robust loss functions for weakly-labeled image taxonomies. International Journal of Computer Vision (IJCV), 104(3):343–361.
Silla, C. N. and Freitas, A. A. (2011). A survey of hierarchical classification across different application domains. Data Mining and Knowledge Discovery, 22(1):31–72.
Sohn, K., Berthelot, D., Li, C.-L., Zhang, Z., Carlini, N., Cubuk, E. D., Kurakin, A., Zhang, H., and Raffel, C. (2020). FixMatch: Simplifying semi-supervised learning with consistency and confidence. In Neural Information Processing Systems (NeurIPS).
The MITRE Corporation (2021). Common Weakness Enumeration (CWE).
Van Horn, G., Branson, S., Farrell, R., Haber, S., Barry,
J., Ipeirotis, P., Perona, P., and Belongie, S. (2015).
VISAPP 2022 - 17th International Conference on Computer Vision Theory and Applications