Peer-Reviewed

DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data

Received: 8 September 2017    Accepted: 8 October 2017    Published: 30 November 2017
Abstract

Identifying protein post-translational modifications (PTMs) is an important step in studies of disease-associated mutations. Although a protein undergoes many chemical alterations after translation, the addition of a succinyl group to a lysine residue plays a vital role in regulating cellular metabolism and thus in disease. A common approach is to apply a classification algorithm to features derived from structural, physicochemical or biochemical information about the protein, which can yield satisfactory results up to a point. Researchers have already developed many computational methods to identify whether a lysine residue is modified with a succinyl group after translation, but most of them focus on improving a single decision from a single method, on enriching the features, or on developing a benchmark dataset. There is therefore still scope to improve the characterisation of lysine residues in a protein sequence by considering multiple predictors at once. In this study, an ensemble-based approach called DV-iSucLys is designed to characterise lysine residues by adapting three well-known and conceptually different classifiers and combining their decisions. In addition, a benchmark succinylation dataset was compiled from existing benchmark datasets and recently updated succinylation data from the UniProt Consortium, both to evaluate the proposed approach and to support further research. Rigorous cross-validation results show that DV-iSucLys characterises succinylated lysine residues better than existing predictors.
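
The idea of combining the decisions of three different classifiers can be illustrated with a minimal majority-vote sketch. The abstract does not name the three classifiers or the exact voting rule, so everything below is an assumption for illustration: `rule_a`, `rule_b` and `rule_c` are hypothetical stand-ins for trained models, and plain majority voting is used as one plausible form of decision voting.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A classifier is any function mapping a feature vector to a label
# (1 = succinylated lysine, 0 = non-succinylated lysine).
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: List[Classifier], features: Sequence[float]) -> int:
    """Return the label chosen by most classifiers.

    With an odd number of binary classifiers a tie cannot occur.
    """
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical toy decision rules standing in for three trained models.
def rule_a(x: Sequence[float]) -> int:
    return 1 if x[0] > 0.5 else 0


def rule_b(x: Sequence[float]) -> int:
    return 1 if x[1] > 0.5 else 0


def rule_c(x: Sequence[float]) -> int:
    return 1 if sum(x) / len(x) > 0.5 else 0


ensemble = [rule_a, rule_b, rule_c]
print(majority_vote(ensemble, [0.9, 0.2, 0.7]))  # two of three rules vote 1
```

In this toy run, `rule_a` and `rule_c` vote for label 1 while `rule_b` votes for label 0, so the ensemble returns 1. An ensemble of this kind is only useful when the individual predictors make different errors, which is why conceptually different classifiers are a natural choice.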

Published in American Journal of Biomedical and Life Sciences (Volume 5, Issue 6)
DOI 10.11648/j.ajbls.20170506.15
Page(s) 135-143
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Lysine Succinylation, AAC, CKSAAP, Binary Encoding, PSAAP, AAindex, Ensemble Classifier
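
Three of the sequence encodings named in the keywords (AAC, CKSAAP and binary encoding) can be sketched as follows. This is a minimal illustration based on the commonly used definitions of these encodings, not the authors' implementation; the function names, the example peptide window and the handling of non-standard residues are assumptions.

```python
from itertools import product
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(peptide: str) -> List[float]:
    """Amino acid composition: fraction of each standard residue."""
    n = len(peptide)
    return [peptide.count(a) / n for a in AMINO_ACIDS]


def cksaap(peptide: str, k: int) -> Dict[str, float]:
    """Composition of residue pairs separated by exactly k other residues."""
    total = len(peptide) - k - 1  # number of k-spaced pairs in the window
    if total <= 0:
        raise ValueError("peptide is too short for this value of k")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(total):
        pair = peptide[i] + peptide[i + k + 1]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: c / total for pair, c in counts.items()}


def binary_encoding(peptide: str) -> List[int]:
    """One-hot encoding: 20 bits per position, all zeros if non-standard."""
    return [int(res == a) for res in peptide for a in AMINO_ACIDS]


window = "AKKLA"  # hypothetical peptide window, far shorter than a real one
print(aac(window)[AMINO_ACIDS.index("K")])
print(cksaap(window, 0)["KK"])
print(len(binary_encoding(window)))
```

In practice such encodings are computed on a fixed-length window centred on the candidate lysine, and the resulting vectors are concatenated or passed to separate classifiers.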

Cite This Article
  • APA Style

    Md. Khaled Ben Islam, Md. Nazrul Islam Mondal, Julia Rahman, Md. Al Mehedi Hassan. (2017). DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data. American Journal of Biomedical and Life Sciences, 5(6), 135-143. https://doi.org/10.11648/j.ajbls.20170506.15

    ACS Style

    Md. Khaled Ben Islam; Md. Nazrul Islam Mondal; Julia Rahman; Md. Al Mehedi Hassan. DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data. Am. J. Biomed. Life Sci. 2017, 5(6), 135-143. doi: 10.11648/j.ajbls.20170506.15

    AMA Style

    Md. Khaled Ben Islam, Md. Nazrul Islam Mondal, Julia Rahman, Md. Al Mehedi Hassan. DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data. Am J Biomed Life Sci. 2017;5(6):135-143. doi: 10.11648/j.ajbls.20170506.15

  • @article{10.11648/j.ajbls.20170506.15,
      author = {Md. Khaled Ben Islam and Md. Nazrul Islam Mondal and Julia Rahman and Md. Al Mehedi Hassan},
      title = {DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data},
      journal = {American Journal of Biomedical and Life Sciences},
      volume = {5},
      number = {6},
      pages = {135-143},
      doi = {10.11648/j.ajbls.20170506.15},
      url = {https://doi.org/10.11648/j.ajbls.20170506.15},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajbls.20170506.15},
      year = {2017}
    }
    

  • TY  - JOUR
    T1  - DV-iSucLys: Decision Voting to Improve Protein Lysine Succinylation Site Identification from Sequence Data
    AU  - Md. Khaled Ben Islam
    AU  - Md. Nazrul Islam Mondal
    AU  - Julia Rahman
    AU  - Md. Al Mehedi Hassan
    Y1  - 2017/11/30
    PY  - 2017
    N1  - https://doi.org/10.11648/j.ajbls.20170506.15
    DO  - 10.11648/j.ajbls.20170506.15
    T2  - American Journal of Biomedical and Life Sciences
    JF  - American Journal of Biomedical and Life Sciences
    JO  - American Journal of Biomedical and Life Sciences
    SP  - 135
    EP  - 143
    PB  - Science Publishing Group
    SN  - 2330-880X
    UR  - https://doi.org/10.11648/j.ajbls.20170506.15
    VL  - 5
    IS  - 6
    ER  - 

Author Information
  • Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh; Department of Computer Science & Engineering, Pabna University of Science & Technology, Pabna, Bangladesh

  • Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh

  • Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh

  • Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi, Bangladesh
