DM-L Based Feature Extraction and Classifier Ensemble for Object Recognition

DOI: 10.4236/jsip.2018.92006, pp. 92-110

Keywords: Deep Learning, Object Recognition, CNN, Deep Multi-Layer Feature Extraction, Principal Component Analysis, Classifier Ensemble, Caltech-101 Benchmark Database


Abstract:

Deep Learning is a powerful technique that is widely applied to Image Recognition and Natural Language Processing, among many other tasks. In this work, we propose an efficient technique for using pre-trained Convolutional Neural Network (CNN) architectures to extract powerful image features for object recognition. We build on the existing concept of transferring the learning of pre-trained CNNs to new databases through layer activations, and extend it by considering multiple deep layers. We exploit the progressive learning that occurs at the intermediate layers of a CNN to construct Deep Multi-Layer (DM-L) feature extraction vectors that achieve excellent object recognition performance. Two popular pre-trained CNN architectures, VGG_16 and VGG_19, are used to extract feature sets from three deep fully connected layers, namely “fc6”, “fc7” and “fc8”. Using Principal Component Analysis (PCA), the dimensionality of the DM-L feature vectors is reduced to form compact, powerful feature vectors, which are fed to an external Classifier Ensemble for classification instead of the Softmax classification layers of the two original pre-trained CNN models. The proposed DM-L technique is applied to the benchmark Caltech-101 object recognition database. Conventional wisdom might suggest that features extracted from the deepest layer, “fc8”, would outperform those from “fc6”, but our results prove otherwise for the two models considered: in our experiments, the “fc6”-based feature vectors achieved the best recognition performance. State-of-the-art recognition performances of 91.17% and 91.35% were achieved using the “fc6”-based feature vectors for the VGG_16 and VGG_19 models, respectively. These results were obtained using 30 sample images per class; the proposed system can achieve further improved performance when all sample images per class are used. Our research shows that for CNN-based feature extraction, multiple layers should be considered, and the layer that maximizes recognition performance should then be selected.
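The pipeline described above — deep-layer activations reduced with PCA and classified by an external ensemble in place of the CNN's Softmax layer — can be sketched as follows. This is a minimal illustration, not the paper's implementation: random cluster vectors stand in for the 4096-dimensional “fc6” activations (extracting the real ones requires a pre-trained VGG_16/VGG_19 model), and the class counts, PCA size, and classifier choices are all assumptions for demonstration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, per_class, fc6_dim = 5, 30, 4096  # VGG "fc6" activations are 4096-D

# Stand-in for "fc6" activations: one cluster of feature vectors per class.
# In the real pipeline these would come from a pre-trained VGG_16/VGG_19.
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(per_class, fc6_dim))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# PCA compresses the 4096-D layer activations into compact feature vectors.
pca = PCA(n_components=32).fit(X_tr)
Z_tr, Z_te = pca.transform(X_tr), pca.transform(X_te)

# An external classifier ensemble replaces the CNN's Softmax layer.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("svm", SVC(probability=True))],
    voting="soft",
).fit(Z_tr, y_tr)

accuracy = ensemble.score(Z_te, y_te)
print(f"ensemble accuracy on held-out stand-in data: {accuracy:.2f}")
```

The same fit/transform/ensemble steps would apply unchanged to activations taken from “fc7” or “fc8”, which is how the layer comparison in the abstract can be carried out: extract each layer's activations, reduce each with PCA, and keep the layer whose ensemble score is highest.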

