Using Multi-Feature Weak Consensus Model to Discover Essential Proteins.
Abstract
Essential proteins play a vital role in cell survival and replication. A growing number of computational methods have been developed to identify essential proteins, overcoming the time-consuming, costly and inefficient nature of biological experimental methods. To improve the recognition rate, some recent methods fuse multiple features, but they seldom consider the connections among those features. After analyzing a large number of methods based on multi-feature fusion, a phenomenon among features, called weak consensus, is identified, and a weak consensus model is proposed to fuse these features. After analyzing the relationship between a protein and its neighbors in protein-protein interaction networks, a new centrality, namely neighborhood aggregation centrality (NAC), is developed. A Max-Min strategy is then used to integrate NAC with the Pearson correlation coefficient and the Jaccard similarity coefficient based on gene expression data to obtain a local importance score. In addition, an orthologous feature score is used to measure protein conservation. Finally, by using the weak consensus model to fuse the orthologous feature score with the local importance score, a new method, WOL, is proposed. Experiments on S. cerevisiae data show that WOL achieves a higher recognition rate than WDC, PeC, ION, JDC, NCCO and E_POC.
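As an illustration of the two standard similarity measures named above, the following minimal Python sketch computes a Pearson correlation coefficient between two expression profiles and a Jaccard similarity coefficient between two sets. It is not the WOL implementation: the abstract does not specify NAC, the Max-Min strategy or the weak consensus model, so none of them is reproduced here. The protein names, the toy expression values and the choice to apply the Jaccard coefficient to neighbor sets are assumptions made only for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        # A constant profile has no defined correlation; return 0 by convention.
        return 0.0
    return cov / (dev_x * dev_y)


def jaccard(a, b):
    """Jaccard similarity coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy data (not taken from the paper).
expression = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],
    "P3": [4.0, 3.0, 2.0, 1.0],
}
# Assumption: neighbor sets in a small protein-protein interaction network.
neighbors = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
}

pcc_p1_p2 = pearson(expression["P1"], expression["P2"])
pcc_p1_p3 = pearson(expression["P1"], expression["P3"])
jac_p1_p2 = jaccard(neighbors["P1"], neighbors["P2"])

print(f"PCC(P1, P2) = {pcc_p1_p2:.3f}")
print(f"PCC(P1, P3) = {pcc_p1_p3:.3f}")
print(f"Jaccard(N(P1), N(P2)) = {jac_p1_p2:.3f}")
```

In this toy setting, co-expressed proteins (P1 and P2) receive a correlation near 1 and anti-correlated ones (P1 and P3) a value near -1; how such edge-level scores are combined with NAC is left to the full method description.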