
dealing with a more challenging dataset and supported
it with further experiments. The extension feeds M
feature vectors of size p into the recurrent structure
instead of a single feature vector. The proposed
framework can readily be adapted to other scenarios, such as
multi-label image classification, without adding extra
layers to the network architecture.
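To make the shape bookkeeping of this extension concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a standard LSTM cell (Hochreiter and Schmidhuber, 1997) with randomly initialised weights, and it treats the single-vector case as a sequence of length 1. The function name `lstm_over_features`, the sizes `M = 4`, `p = 8`, `hidden = 16`, and the gate ordering are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_over_features(features, W, U, b):
    """Run a standard LSTM over a sequence of feature vectors.

    features: array of shape (M, p), one row per feature vector.
    W: (4 * hidden, p), U: (4 * hidden, hidden), b: (4 * hidden,).
    Returns the final hidden state, shape (hidden,).
    """
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:  # one recurrent step per feature vector
        z = W @ x + U @ h + b
        # Assumed gate order: input, forget, output, candidate.
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


rng = np.random.default_rng(0)
M, p, hidden = 4, 8, 16  # hypothetical sizes, not taken from the paper

W = rng.normal(scale=0.1, size=(4 * hidden, p))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

single = rng.normal(size=(1, p))  # one feature vector of size p
multi = rng.normal(size=(M, p))   # extension: M feature vectors of size p

h_single = lstm_over_features(single, W, U, b)
h_multi = lstm_over_features(multi, W, U, b)
print(h_single.shape, h_multi.shape)
```

In this sketch the recurrent weights are unchanged when moving from one vector to M vectors; only the number of recurrent steps grows, and the final hidden state keeps the same size in both cases.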
This work has been partially supported by the Spanish
project TIN2016-74946-P (MINECO/FEDER,
UE) and CERCA Programme / Generalitat de Cata-
CRN: End-to-end Convolutional Recurrent Network Structure Applied to Vehicle Classification