Infoscience

research article

Deep Non-Rigid Structure-From-Motion: A Sequence-to-Sequence Translation Perspective

Deng, Hui
•
Zhang, Tong  
•
Dai, Yuchao
2024
IEEE Transactions on Pattern Analysis and Machine Intelligence

Directly regressing the non-rigid shape and camera pose from individual 2D frames is ill-suited to the Non-Rigid Structure-from-Motion (NRSfM) problem. Such a frame-by-frame 3D reconstruction pipeline overlooks the inherent spatial-temporal nature of NRSfM, i.e., reconstructing a 3D sequence from an input 2D sequence. In this paper, we propose to solve deep sparse NRSfM from a sequence-to-sequence translation perspective, where the input 2D keypoint sequence is taken as a whole to reconstruct the corresponding 3D keypoint sequence in a self-supervised manner. First, we apply a shape-motion predictor to the input sequence to obtain an initial sequence of shapes and corresponding motions. Then, we propose the Context Layer, which enables the deep learning framework to impose sequence-level constraints based on the structural characteristics of non-rigid sequences. The Context Layer is built around multi-head attention (MHA), which imposes the self-expressiveness regularity on non-rigid sequences, combined with temporal encoding; the two act together to constrain non-rigid sequences within the deep framework. Experimental results on datasets including Human3.6M, CMU Mocap, and InterHand demonstrate the superiority of our framework. The code will be made publicly available.
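
The idea of imposing self-expressiveness with multi-head attention and temporal encoding can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the sinusoidal form of the temporal encoding, the random (untrained) projection matrices, and the choice to use the raw shapes as attention values are all assumptions made for illustration. The sketch only shows how each frame's shape can be re-expressed as a weighted combination of the shapes in the whole sequence.

```python
import numpy as np


def temporal_encoding(num_frames, dim):
    """Sinusoidal encoding of frame indices (assumed form, as in standard Transformers)."""
    pos = np.arange(num_frames)[:, None]            # (F, 1)
    i = np.arange(dim)[None, :]                     # (1, D)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)         # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_expressive_attention(shapes, num_heads=2, seed=0):
    """Re-express each frame's shape as a convex combination of all frames.

    shapes: (F, D) array, one flattened 3D shape (D = 3 * keypoints) per frame.
    Returns refined shapes (F, D) and attention weights (num_heads, F, F).
    """
    num_frames, dim = shapes.shape
    assert dim % num_heads == 0, "dim must be divisible by num_heads"
    d_head = dim // num_heads
    rng = np.random.default_rng(seed)

    # Hypothetical, untrained query/key projections; a real model would learn these.
    w_q = rng.standard_normal((dim, dim)) / np.sqrt(dim)
    w_k = rng.standard_normal((dim, dim)) / np.sqrt(dim)

    # Queries and keys see the frame order through the temporal encoding.
    x = shapes + temporal_encoding(num_frames, dim)

    def split_heads(a):
        return a.reshape(num_frames, num_heads, d_head).transpose(1, 0, 2)

    q, k = split_heads(x @ w_q), split_heads(x @ w_k)
    v = split_heads(shapes)                         # values are the raw initial shapes

    # Each row of `weights` sums to 1: frame t is written as a mix of all frames.
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))
    refined = (weights @ v).transpose(1, 0, 2).reshape(num_frames, dim)
    return refined, weights


# Usage: 8 frames, 4 keypoints in 3D (flattened to 12 values per frame).
initial_shapes = np.random.default_rng(1).standard_normal((8, 12))
refined_shapes, attn = self_expressive_attention(initial_shapes)
print(refined_shapes.shape, attn.shape)
```

Using the initial shapes directly as values keeps the self-expressive reading explicit, since every refined frame is a weighted average of frames in the same sequence. A trained model would typically also learn value and output projections.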

Type
research article
DOI
10.1109/TPAMI.2024.3443922
Scopus ID

2-s2.0-85201443380

PubMed ID

39150802

Author(s)
Deng, Hui

Northwestern Polytechnical University

Zhang, Tong  

École Polytechnique Fédérale de Lausanne

Dai, Yuchao

Northwestern Polytechnical University

Shi, Jiawei

Northwestern Polytechnical University

Zhong, Yiran

Shanghai Artificial Intelligence Laboratory

Li, Hongdong

The Australian National University

Date Issued

2024

Published in
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume

46

Issue

12

Start page

10814

End page

10828

Subjects

Non-rigid structure-from-motion (NRSfM) • self-expressiveness • self-attention • sequence-to-sequence • temporal encoding

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
IVRL  
Funding(s)

Fundamental Research Funds for the Central Universities

National Natural Science Foundation of China (Grant Numbers: 61871325, 62271410)

Swiss National Science Foundation (Grant Number: CRSII5-180359)

Available on Infoscience
January 24, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/243535

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved