
VISAPP 2018 - International Conference on Computer Vision Theory and Applications