Peer-Reviewed

Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size

Received: 30 November 2020    Accepted: 17 December 2020    Published: 31 December 2020
Abstract

The principal challenges in deploying CNNs/DNNs on ADAS platforms are the limited computation and memory resources and the tight efficiency budget. Design space exploration of CNNs/DNNs, training and testing from scratch, hyperparameter tuning, and implementation with different optimizers all contributed to the efficiency and performance improvements of the Shallow SqueezeNext architecture. The architecture is computationally efficient, inexpensive, and requires minimal memory. It achieves better model size and speed than counterparts such as AlexNet, VGGNet, SqueezeNet, and SqueezeNext, each trained and tested from scratch on the CIFAR-10 and CIFAR-100 datasets. Its smallest variant reaches a 272 KB model size with 82% accuracy at 9 seconds per epoch on CIFAR-10; across all variants, the best accuracy is 91.41%, the best model size 0.272 MB, and the best speed 4 seconds per epoch. Memory is a critical resource on real-time systems and platforms, where it is usually scarce. To verify that Shallow SqueezeNext can be deployed on a real-time platform, NXP's Bluebox2.0 was used. The Bluebox2.0 deployment of the architecture achieved 90.50% accuracy with an 8.72 MB model size at 22 seconds per epoch. A further variant offers a stronger trade-off, attaining a 0.5 MB model size with 87.30% accuracy at 11 seconds per epoch, trained and tested from scratch on CIFAR-10.
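To make the reported size figures concrete, the sketch below shows a SqueezeNext-style bottleneck block in PyTorch together with a back-of-the-envelope model-size estimate (4 bytes per float32 parameter, so 272 KB corresponds to roughly 70,000 parameters). It is a minimal illustration assuming only the standard torch and torch.nn APIs; the names SqNxtBlock and model_size_kb and the layer widths are assumptions for illustration, not the authors' released code.

    # A minimal sketch, assuming PyTorch; block name, helper name, and layer
    # widths are illustrative assumptions, not the authors' released code.
    import torch
    import torch.nn as nn

    class SqNxtBlock(nn.Module):
        # SqueezeNext-style bottleneck: two 1x1 reductions, separable 3x1 and
        # 1x3 convolutions, a 1x1 expansion, and a residual skip connection.
        def __init__(self, in_ch, out_ch, stride=1):
            super().__init__()
            red = max(in_ch // 2, 1)          # first 1x1 reduction width
            half = max(red // 2, 1)           # second 1x1 reduction width
            self.body = nn.Sequential(
                nn.Conv2d(in_ch, red, 1, stride=stride, bias=False),
                nn.BatchNorm2d(red), nn.ReLU(inplace=True),
                nn.Conv2d(red, half, 1, bias=False),
                nn.BatchNorm2d(half), nn.ReLU(inplace=True),
                nn.Conv2d(half, red, (3, 1), padding=(1, 0), bias=False),
                nn.BatchNorm2d(red), nn.ReLU(inplace=True),
                nn.Conv2d(red, red, (1, 3), padding=(0, 1), bias=False),
                nn.BatchNorm2d(red), nn.ReLU(inplace=True),
                nn.Conv2d(red, out_ch, 1, bias=False),
                nn.BatchNorm2d(out_ch),
            )
            if stride == 1 and in_ch == out_ch:
                self.skip = nn.Identity()     # cheap residual path
            else:
                self.skip = nn.Sequential(
                    nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                    nn.BatchNorm2d(out_ch),
                )
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.body(x) + self.skip(x))

    def model_size_kb(model):
        # 4 bytes per float32 parameter; ignores serialization overhead.
        return 4 * sum(p.numel() for p in model.parameters()) / 1024

    block = SqNxtBlock(64, 64)
    x = torch.randn(1, 64, 32, 32)            # CIFAR-10-sized feature map
    print(block(x).shape)                     # torch.Size([1, 64, 32, 32])
    print(f"{model_size_kb(block):.1f} KB")   # parameter memory of one block

Stacking only a few such blocks (hence "shallow") and narrowing the reduction widths is what drives the parameter count, and with it the serialized model size, down into the hundreds of kilobytes.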

Published in Journal of Electrical and Electronic Engineering (Volume 8, Issue 6)
DOI 10.11648/j.jeee.20200806.11
Page(s) 127-136
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2020. Published by Science Publishing Group

Keywords

Deep Neural Network (DNN), Design Space Exploration (DSE), PyTorch Implementation, Real-time Deployment, RTMaps, SqueezeNext, Shallow SqueezeNext

Cite This Article
  • APA Style

    Duggal, J. K., & El-Sharkawy, M. (2020). Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size. Journal of Electrical and Electronic Engineering, 8(6), 127-136. https://doi.org/10.11648/j.jeee.20200806.11

  • ACS Style

    Duggal, J. K.; El-Sharkawy, M. Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size. J. Electr. Electron. Eng. 2020, 8(6), 127-136. doi: 10.11648/j.jeee.20200806.11

  • AMA Style

    Duggal JK, El-Sharkawy M. Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size. J Electr Electron Eng. 2020;8(6):127-136. doi: 10.11648/j.jeee.20200806.11

  • @article{10.11648/j.jeee.20200806.11,
      author = {Jayan Kant Duggal and Mohamed El-Sharkawy},
      title = {Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size},
      journal = {Journal of Electrical and Electronic Engineering},
      volume = {8},
      number = {6},
      pages = {127-136},
      doi = {10.11648/j.jeee.20200806.11},
      url = {https://doi.org/10.11648/j.jeee.20200806.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.jeee.20200806.11},
      abstract = {The principal challenges in deploying CNNs/DNNs on ADAS platforms are the limited computation and memory resources and the tight efficiency budget. Design space exploration of CNNs/DNNs, training and testing from scratch, hyperparameter tuning, and implementation with different optimizers all contributed to the efficiency and performance improvements of the Shallow SqueezeNext architecture. The architecture is computationally efficient, inexpensive, and requires minimal memory. It achieves better model size and speed than counterparts such as AlexNet, VGGNet, SqueezeNet, and SqueezeNext, each trained and tested from scratch on the CIFAR-10 and CIFAR-100 datasets. Its smallest variant reaches a 272 KB model size with 82% accuracy at 9 seconds per epoch on CIFAR-10; across all variants, the best accuracy is 91.41%, the best model size 0.272 MB, and the best speed 4 seconds per epoch. Memory is a critical resource on real-time systems and platforms, where it is usually scarce. To verify that Shallow SqueezeNext can be deployed on a real-time platform, NXP's Bluebox2.0 was used. The Bluebox2.0 deployment of the architecture achieved 90.50% accuracy with an 8.72 MB model size at 22 seconds per epoch. A further variant offers a stronger trade-off, attaining a 0.5 MB model size with 87.30% accuracy at 11 seconds per epoch, trained and tested from scratch on CIFAR-10.},
      year = {2020}
    }
    

  • TY  - JOUR
    T1  - Shallow SqueezeNext: Real Time Deployment on Bluebox2.0 with 272KB Model Size
    AU  - Jayan Kant Duggal
    AU  - Mohamed El-Sharkawy
    Y1  - 2020/12/31
    PY  - 2020
    N1  - https://doi.org/10.11648/j.jeee.20200806.11
    DO  - 10.11648/j.jeee.20200806.11
    T2  - Journal of Electrical and Electronic Engineering
    JF  - Journal of Electrical and Electronic Engineering
    JO  - Journal of Electrical and Electronic Engineering
    SP  - 127
    EP  - 136
    PB  - Science Publishing Group
    SN  - 2329-1605
    UR  - https://doi.org/10.11648/j.jeee.20200806.11
    AB  - The principal challenges in deploying CNNs/DNNs on ADAS platforms are the limited computation and memory resources and the tight efficiency budget. Design space exploration of CNNs/DNNs, training and testing from scratch, hyperparameter tuning, and implementation with different optimizers all contributed to the efficiency and performance improvements of the Shallow SqueezeNext architecture. The architecture is computationally efficient, inexpensive, and requires minimal memory. It achieves better model size and speed than counterparts such as AlexNet, VGGNet, SqueezeNet, and SqueezeNext, each trained and tested from scratch on the CIFAR-10 and CIFAR-100 datasets. Its smallest variant reaches a 272 KB model size with 82% accuracy at 9 seconds per epoch on CIFAR-10; across all variants, the best accuracy is 91.41%, the best model size 0.272 MB, and the best speed 4 seconds per epoch. Memory is a critical resource on real-time systems and platforms, where it is usually scarce. To verify that Shallow SqueezeNext can be deployed on a real-time platform, NXP's Bluebox2.0 was used. The Bluebox2.0 deployment of the architecture achieved 90.50% accuracy with an 8.72 MB model size at 22 seconds per epoch. A further variant offers a stronger trade-off, attaining a 0.5 MB model size with 87.30% accuracy at 11 seconds per epoch, trained and tested from scratch on CIFAR-10.
    VL  - 8
    IS  - 6
    ER  - 

Author Information
  • Jayan Kant Duggal, Internet of Things Collaboratory, Purdue School of Engineering and Technology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, USA

  • Mohamed El-Sharkawy, Internet of Things Collaboratory, Purdue School of Engineering and Technology, Indiana University Purdue University Indianapolis (IUPUI), Indianapolis, USA
