Infoscience

research article

DiffPaSS-high-performance differentiable pairing of protein sequences using soft scores

Lupo, Umberto; Sgarbossa, Damiano; Milighetti, Martina; et al.
December 26, 2024
Bioinformatics (Oxford, England)

MOTIVATION: Identifying interacting partners from two sets of protein sequences has important applications in computational biology. Interacting partners share similarities across species due to their common evolutionary history, and feature correlations in amino acid usage due to the need to maintain complementary interaction interfaces. Thus, the problem of finding interacting pairs can be formulated as searching for a pairing of sequences that maximizes a sequence similarity or a coevolution score. Several methods have been developed to address this problem, applying different approximate optimization methods to different scores.

RESULTS: We introduce Differentiable Pairing using Soft Scores (DiffPaSS), a differentiable framework for flexible, fast, and hyperparameter-free optimization for pairing interacting biological sequences, which can be applied to a wide variety of scores. We apply it to a benchmark prokaryotic dataset, using mutual information and neighbor graph alignment scores. DiffPaSS outperforms existing algorithms for optimizing the same scores. We demonstrate the usefulness of our paired alignments for the prediction of protein complex structure. DiffPaSS does not require sequences to be aligned, and we also apply it to nonaligned sequences from T-cell receptors.

AVAILABILITY AND IMPLEMENTATION: A PyTorch implementation and installable Python package are available at https://github.com/Bitbol-Lab/DiffPaSS.
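
To make the formulation in the abstract concrete, the pairing problem can be illustrated on a toy example. The sketch below is not the DiffPaSS implementation and does not use its API. The function names (`mutual_information`, `pairing_score`, `best_pairing`) and the toy sequences are hypothetical, and the score (summed mutual information between every column of one alignment and every column of the other) is assumed here as one simple coevolution-style score. The maximum is found by exhaustive search, which is only feasible for a handful of sequences.

```python
from collections import Counter
from itertools import permutations
from math import log


def mutual_information(col_a, col_b):
    """Empirical mutual information (in nats) between two equal-length columns."""
    n = len(col_a)
    joint = Counter(zip(col_a, col_b))
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    # Sum over observed symbol pairs of p(a, b) * log(p(a, b) / (p(a) * p(b))).
    return sum(
        (count / n) * log(count * n / (freq_a[a] * freq_b[b]))
        for (a, b), count in joint.items()
    )


def pairing_score(seqs_a, seqs_b, pairing):
    """Score the pairing that matches seqs_a[i] with seqs_b[pairing[i]]."""
    paired_b = [seqs_b[j] for j in pairing]
    cols_a = list(zip(*seqs_a))   # columns of the first toy alignment
    cols_b = list(zip(*paired_b)) # columns of the reordered second alignment
    return sum(mutual_information(ca, cb) for ca in cols_a for cb in cols_b)


def best_pairing(seqs_a, seqs_b):
    """Exhaustive search over all permutations (toy sizes only)."""
    candidates = permutations(range(len(seqs_b)))
    return max(candidates, key=lambda p: pairing_score(seqs_a, seqs_b, p))


# Hypothetical toy data: row i of seqs_a is meant to interact with row i of seqs_b.
seqs_a = ["AK", "AK", "CE", "CE"]
seqs_b = ["GL", "GL", "TV", "TV"]

best = best_pairing(seqs_a, seqs_b)
print("best pairing:", best)
print("score of best pairing:", pairing_score(seqs_a, seqs_b, best))
print("score of a mismatched pairing:", pairing_score(seqs_a, seqs_b, (0, 2, 1, 3)))
```

Two points follow from this toy setup. First, the number of candidate pairings grows factorially with the number of sequences, so exhaustive search breaks down quickly, which is why approximate optimization methods are used for this problem. Second, a mutual information score cannot distinguish between pairings that only swap identical sequences, so the maximum is generally not unique in a toy example like this one.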

Name: btae738.pdf
Type: Main Document
Version: Published version
Access type: Open access
License Condition: CC BY
Size: 2.82 MB
Format: Adobe PDF
Checksum (MD5): b8ebc66a4cc9bb9c8664d8ece469b450

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.