Occlusion Resilient 3D Human Pose Estimation
Occlusions remain one of the key challenges in 3D body pose estimation from single-camera video sequences. Temporal consistency has been extensively used to mitigate their impact, but the existing algorithms in the literature do not explicitly model them. Here, we apply this by representing the deforming body as a spatio-temporal graph. We then introduce a refinement network that performs graph convolutions over this graph to output 3D poses. To ensure robustness to occlusions, we train this network with a set of binary masks that we use to disable some of the edges, as in drop-out techniques. In effect, we simulate the fact that some joints can be hidden for periods of time and train the network to be immune to that. We demonstrate the effectiveness of this approach compared to state-of-the-art techniques that infer poses from single-camera sequences.
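The edge-masking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how masks are sampled or how the graph is encoded, so the function names (`build_edges`, `sample_occlusion_mask`, `apply_mask`), the toy skeleton, and the choice to hide one joint for one contiguous run of frames are all assumptions made for illustration.

```python
import random

def build_edges(num_frames, num_joints, bones):
    """Undirected edges of a spatio-temporal graph over nodes (frame, joint)."""
    edges = set()
    for t in range(num_frames):
        # Spatial edges: bones connecting joints within the same frame.
        for a, b in bones:
            edges.add(((t, a), (t, b)))
        # Temporal edges: the same joint in consecutive frames.
        if t + 1 < num_frames:
            for j in range(num_joints):
                edges.add(((t, j), (t + 1, j)))
    return edges

def sample_occlusion_mask(num_frames, num_joints, max_len, rng):
    """Binary mask[t][j]; 0 marks joint j as hidden in frame t (assumed scheme)."""
    mask = [[1] * num_joints for _ in range(num_frames)]
    joint = rng.randrange(num_joints)
    length = rng.randint(1, min(max_len, num_frames))
    start = rng.randrange(num_frames - length + 1)
    for t in range(start, start + length):
        mask[t][joint] = 0
    return mask

def apply_mask(edges, mask):
    """Disable every edge that touches a hidden node."""
    return {(u, v) for (u, v) in edges
            if mask[u[0]][u[1]] and mask[v[0]][v[1]]}

# Toy example: a 3-joint chain observed over 4 frames.
rng = random.Random(0)
edges = build_edges(num_frames=4, num_joints=3, bones=[(0, 1), (1, 2)])
mask = sample_occlusion_mask(num_frames=4, num_joints=3, max_len=2, rng=rng)
kept = apply_mask(edges, mask)
print(len(edges), len(kept))
```

In a training loop, a fresh mask would be drawn per sample so that the graph convolutions see many different patterns of temporarily hidden joints; how this is actually done in the paper is not stated in the abstract.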
2-s2.0-85196744350
École Polytechnique Fédérale de Lausanne
École Polytechnique Fédérale de Lausanne
Samsung AI Center Toronto
EPFL
2024
9798350362459
1198
1207
REVIEWED
EPFL
| Event name | Event acronym | Event place | Event date |
| --- | --- | --- | --- |
| | | Davos, Switzerland | 2024-03-18 - 2024-03-21 |