Peer-Reviewed

Survey of Big Data Storage Technology

Received: 21 June 2016    Published: 21 June 2016
Abstract

Big data storage is the foundation of big data processing and analysis. This paper surveys the main data storage technologies in four areas: distributed file systems, NoSQL databases, database appliances, and the newer storage technology based on the MPP (Massively Parallel Processing) architecture. It also offers recommendations for different environments, to help readers understand the state of storage technology from several angles. For distributed file systems, the paper summarizes file segmentation, suitable scenarios, and strengths and weaknesses. For NoSQL databases, it analyzes the principles and suitable scenarios of four data storage models. It then reviews in detail the development and features of database appliances, and outlines the MPP architecture. Finally, it discusses research trends in storage technology as a reference for further research on big data storage.

Published in Internet of Things and Cloud Computing (Volume 4, Issue 3)
DOI 10.11648/j.iotcc.20160403.13
Page(s) 28-33
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Big Data Storage, NoSQL, Distributed File System, Database All-in-One Machine, MPP Architecture

Cite This Article
  • APA Style

    Wang Weichen, Gao Jing, Cao Rui. (2016). Survey of Big Data Storage Technology. Internet of Things and Cloud Computing, 4(3), 28-33. https://doi.org/10.11648/j.iotcc.20160403.13


  • ACS Style

    Wang Weichen; Gao Jing; Cao Rui. Survey of Big Data Storage Technology. Internet Things Cloud Comput. 2016, 4(3), 28-33. doi: 10.11648/j.iotcc.20160403.13


  • AMA Style

    Wang Weichen, Gao Jing, Cao Rui. Survey of Big Data Storage Technology. Internet Things Cloud Comput. 2016;4(3):28-33. doi: 10.11648/j.iotcc.20160403.13


  • BibTeX

    @article{10.11648/j.iotcc.20160403.13,
      author = {Wang Weichen and Gao Jing and Cao Rui},
      title = {Survey of Big Data Storage Technology},
      journal = {Internet of Things and Cloud Computing},
      volume = {4},
      number = {3},
      pages = {28-33},
      doi = {10.11648/j.iotcc.20160403.13},
      url = {https://doi.org/10.11648/j.iotcc.20160403.13},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.iotcc.20160403.13},
     year = {2016}
    }
    


  • RIS

    TY  - JOUR
    T1  - Survey of Big Data Storage Technology
    AU  - Wang Weichen
    AU  - Gao Jing
    AU  - Cao Rui
    Y1  - 2016/06/21
    PY  - 2016
    N1  - https://doi.org/10.11648/j.iotcc.20160403.13
    DO  - 10.11648/j.iotcc.20160403.13
    T2  - Internet of Things and Cloud Computing
    JF  - Internet of Things and Cloud Computing
    JO  - Internet of Things and Cloud Computing
    SP  - 28
    EP  - 33
    PB  - Science Publishing Group
    SN  - 2376-7731
    UR  - https://doi.org/10.11648/j.iotcc.20160403.13
    VL  - 4
    IS  - 3
    ER  - 


Author Information
  • Wang Weichen, Gao Jing, Cao Rui: College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
