2.6 Protein Complex Similarity

is part of: Machine Learning under Resource Constraints - Applications

Chapter Abstract:


Proteins have manifold functions in living cells, including structural integrity, transport, defense against pathogens, and message transmission, to name but a few. Recent advances in Machine Learning appear to have solved the protein folding problem, i.e., how to obtain the three-dimensional functional protein structure from the amino acid sequence of the protein. However, proteins rarely act alone; instead, they perform their functions together with other proteins in so-called protein complexes. Quantifying the similarity between two protein complexes is essential for numerous applications, e.g., for database searches of complexes that are similar to a given input complex. While similarity measures have been extensively studied on single proteins and on protein families, there has been little work so far on modeling and computing the similarity between protein complexes. Because protein complexes can be naturally modeled as graphs, graph similarity measures may be used, but these are often computationally hard to obtain and do not take typical properties of protein complexes into account. We introduce a parametric family of similarity measures based on Weisfeiler-Leman labeling (see "The Weisfeiler-Leman Algorithm for Machine Learning with Graphs", Section 4.2 in Volume 1). Based on simulated complexes, we show that the defined family of similarity measures is in good agreement with edit similarity, a similarity measure derived from graph edit distance, though it can be computed more efficiently. Moreover, in contrast to graph edit similarity, the proposed measures allow for an efficient similarity search in large volumes of protein complex data. They can therefore be used as a basis for large-scale machine learning applications.
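
To make the idea of a similarity based on Weisfeiler-Leman labeling more concrete, the following is a minimal sketch in Python. It is not the chapter's actual family of measures: the function names, the `rounds` parameter, the dictionary-based graph representation, and the choice of a multiset Jaccard index are all assumptions made for illustration. Each protein in a complex is a node carrying an initial label, each refinement round replaces a node's label by the pair of its own label and the sorted labels of its neighbors, and two complexes are compared by the overlap of their label multisets.

```python
from collections import Counter


def wl_label_multiset(adj, labels, rounds):
    """Collect the labels seen in rounds 0..rounds of Weisfeiler-Leman refinement.

    adj:    dict mapping each node to a list of its neighbors (hypothetical format)
    labels: dict mapping each node to its initial label, e.g. a protein name
    rounds: number of refinement rounds (an illustrative parameter)
    """
    current = dict(labels)
    bag = Counter(current.values())
    for _ in range(rounds):
        # New label = (own label, sorted tuple of neighbor labels).
        current = {
            v: (current[v], tuple(sorted(current[u] for u in adj[v])))
            for v in adj
        }
        bag.update(current.values())
    return bag


def multiset_jaccard(a, b):
    """Multiset Jaccard index: sum of minimum counts over sum of maximum counts."""
    union = sum((a | b).values())
    return sum((a & b).values()) / union if union else 1.0


def wl_similarity(adj1, labels1, adj2, labels2, rounds=1):
    """Illustrative similarity in [0, 1] between two labeled graphs."""
    return multiset_jaccard(
        wl_label_multiset(adj1, labels1, rounds),
        wl_label_multiset(adj2, labels2, rounds),
    )


# Two toy "complexes" over the same three proteins A, B, C.
labels = {0: "A", 1: "B", 2: "C"}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # all three proteins interact
path = {0: [1], 1: [0, 2], 2: [1]}            # A-B and B-C only

print(wl_similarity(triangle, labels, triangle, labels))  # identical complexes
print(wl_similarity(triangle, labels, path, labels))      # one interaction missing
```

Because each complex is reduced to a multiset of labels that can be computed once and stored, a comparison like this needs only a pass over two label counts per pair of complexes, whereas graph edit distance requires solving a hard optimization problem for every pair.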

Page(s): 85 - 102

DOI: 10.1515/9783110785982
