Atomic Motif Recognition in (Bio)Polymers: Benchmarks From the Protein Data Bank

Helfrecht, Benjamin Aaron; Gasparotto, Piero; Giberti, Federico; Ceriotti, Michele

doi:10.3389/fmolb.2019.00024

research article

2019

Frontiers in Molecular Biosciences

Rationalizing the structure and structure–property relations for complex materials such as polymers or biomolecules relies heavily on the identification of local atomic motifs, e.g., hydrogen bonds and secondary structure patterns, that are seen as building blocks of more complex supramolecular and mesoscopic structures. Over the past few decades, several automated procedures have been developed to identify these motifs in proteins given the atomic structure. Being based on a very precise understanding of the specific interactions, these heuristic criteria formulate the question in a way that implies the answer, by defining a list of motifs based on those that are known to be naturally occurring. This makes them less likely to identify unexpected phenomena, such as the occurrence of recurrent motifs in disordered segments of proteins, and less suitable to be applied to different polymers whose structure is not driven by hydrogen bonds, or even to polypeptides when appearing in unusual, non-biological conditions. Here we discuss how unsupervised machine learning schemes can be used to recognize patterns based exclusively on the frequency with which different motifs occur, taking high-resolution structures from the Protein Data Bank as benchmarks. We first discuss the application of a density-based motif recognition scheme in combination with traditional representations of protein structure (namely, interatomic distances and backbone dihedrals). Then, we proceed one step further toward an entirely unbiased scheme by using as input a structural representation based on the atomic density and by employing supervised classification to objectively assess the role played by the representation in determining the nature of atomic-scale patterns.
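To make the idea of frequency-based motif recognition on backbone dihedrals concrete, here is a minimal sketch. It is not the scheme used in the article; it is a simplified grid-based mode-seeking procedure written for illustration only. The function names (`grid_density`, `climb_to_mode`, `assign_motifs`), the 10-degree bin width, and the use of an unsmoothed histogram as the density estimate are all assumptions. Each (phi, psi) pair is binned on a periodic grid, and every point is assigned to the local density maximum reached by steepest ascent.

```python
import numpy as np


def grid_density(angles, n_bins=36):
    """Count (phi, psi) pairs, given in degrees, on a periodic n_bins x n_bins grid.

    Returns the count grid and the integer bin index of every input point.
    """
    width = 360.0 / n_bins
    # Map angles in [-180, 180] to bin indices; the modulo wraps +180 onto -180.
    idx = np.floor((np.asarray(angles, dtype=float) + 180.0) / width).astype(int) % n_bins
    hist = np.zeros((n_bins, n_bins))
    np.add.at(hist, (idx[:, 0], idx[:, 1]), 1.0)
    return hist, idx


def climb_to_mode(hist, start):
    """Move to the densest of the 8 periodic neighbours until no neighbour is denser."""
    n = hist.shape[0]
    i, j = start
    while True:
        best = (i, j)
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = (i + di) % n, (j + dj) % n
                if hist[ni, nj] > hist[best]:
                    best = (ni, nj)
        if best == (i, j):
            return best  # local maximum of the gridded density
        i, j = best


def assign_motifs(angles, n_bins=36):
    """Label each (phi, psi) pair by the density mode it climbs to.

    Hypothetical helper: modes are numbered in order of first appearance.
    """
    hist, idx = grid_density(angles, n_bins)
    modes = {}
    labels = np.empty(len(idx), dtype=int)
    for k, (i, j) in enumerate(idx):
        mode = climb_to_mode(hist, (int(i), int(j)))
        labels[k] = modes.setdefault(mode, len(modes))
    return labels, modes
```

In this sketch the motifs are not specified in advance: they emerge as whichever regions of dihedral space are most populated. A raw histogram is noisy for real data, so a practical version would need a smoothed, periodic density estimate before searching for modes.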

Name: helf+19fmb-pdb_pamm.pdf
Type: Publisher's version
Access type: Open access
License condition: CC BY
Size: 5.4 MB
Format: Adobe PDF
Checksum (MD5): ef69e1230d4a16b817ebb7d53cfb7d50