Comparison of sequence- and structure-based protein-protein interaction sites


Computational protein-protein interaction (PPI) prediction is a diverse field in which multiple paradigms yield informative predictions of interaction interfaces. The shortcomings of one approach are often the strengths of another, so establishing the agreement between methodologies is valuable for the development of novel PPI prediction techniques. This study represents the first large-scale comparison of PPI sites determined through a sequence-based method (PIPE-Sites) and a structure-based method (PiSITEs). A set of interactions (n = 3,109) amenable to analysis by both methods was examined. Interestingly, the size distributions of the interaction sites from the two methods have similar means and identical medians. Using the Sørensen-Dice similarity coefficient and independent randomization testing, we determined the degree of agreement between the sites of interaction from the two methods to be statistically significant (p <= 0.001). Finally, applying the hypergeometric test and Q-Analysis, we identified 491 interactions with significantly heightened agreement (p <= 0.002). These interactions represent a broad range of biological functions, including transcriptional regulation, cell proliferation, cytoskeletal dynamics, and apoptosis. These findings support the joint application of these paradigms in future PPI prediction studies.
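As an illustration of the two agreement measures named above, the following minimal Python sketch computes a Sørensen-Dice coefficient and a hypergeometric tail probability for two sites on the same protein. It assumes each site is represented as a set of residue positions and that the null model draws residues uniformly from the protein; the function names, the set representation, and the example positions are hypothetical and are not taken from the study, whose exact formulation is not given here.

```python
from math import comb


def dice_coefficient(site_a, site_b):
    """Sørensen-Dice similarity between two sets of residue positions."""
    a, b = set(site_a), set(site_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sites agree fully
    return 2 * len(a & b) / (len(a) + len(b))


def hypergeom_overlap_pvalue(protein_length, size_a, size_b, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(N=protein_length, K=size_a, n=size_b).

    Assumed null model: site B is a uniform random draw of size_b residues
    from the protein, and X counts how many of them fall inside site A.
    """
    total = comb(protein_length, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(protein_length - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical sites on a 300-residue protein (illustrative positions only).
sequence_site = set(range(10, 30))   # 20 residues
structure_site = set(range(20, 40))  # 20 residues
shared = len(sequence_site & structure_site)

print(dice_coefficient(sequence_site, structure_site))
print(hypergeom_overlap_pvalue(300, len(sequence_site), len(structure_site), shared))
```

A Dice value near 1 indicates that the two sites cover nearly the same residues, while a small tail probability indicates that the observed overlap would be unlikely under the assumed uniform null model.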

2016 IEEE EMBS International Student Conference (ISC)