Infoscience
conference paper

Enhancing Hand and Object Detection for Monitoring Patients with Upper-Limb Impairment: A Study on the Impact of Input Size in Foundation Models

Izadmehr, Yasaman • Aminian, Kamiar • Perez-Uribe, Andres
2023
Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
IEEE International Conference on Systems, Man, and Cybernetics

The choice of input image size can significantly affect the performance of state-of-the-art algorithms. Algorithms can always be customized by training and fine-tuning them on a given dataset, but this is time-consuming. Foundation models are increasingly used instead, yet in our application of monitoring patients, hand detection, object detection, and hand-object interaction detection all yielded mediocre performance. This study investigated the significance of input size for detecting hand-object interaction in two datasets: the patient dataset (captured in super-view mode) and the EpicKitchen dataset (captured in normal-view mode). The results showed that using different input sizes with the same foundation model can lead to a significant improvement in performance. In the patient dataset, cropping and resizing the original images to frames of 300 × 300 pixels (px) and 256 × 256 px led to more successful hand detection. Furthermore, resizing frames with a video processing tool such as FFmpeg, rather than passing the original images to the MediaPipe model for resizing, resulted in a 33% improvement. In the EpicKitchen dataset, with normal-view mode, successful hand detection was obtained by padding and cropping the original images and resizing the frames to rectangles of 256 px and 300 px. Overall, the study emphasizes the significance of input size in hand-object interaction detection for monitoring patients with upper-limb impairment. The combination analysis within each dataset showed that the most effective hand-object interaction detection is achieved by applying the MediaPipe model to an input image size of 300 × 300 px (for super-view mode) or 256 × 256 px (for normal-view mode) and combining its result with that of the YOLOv7 model at an input image size of 1920 × 1920 px. With this combination, a 100% success rate was achieved on both datasets.
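The crop-then-resize preprocessing the abstract describes can be sketched as below. This is a minimal illustration, not the paper's exact pipeline: the function names, the centered-crop choice, and the 1920 × 1080 source resolution are all assumptions made for the example.

```python
# Illustrative sketch of crop-then-resize preprocessing before hand detection;
# the centered-crop choice and all names here are assumptions, not the paper's
# exact method.

def center_crop_box(width, height):
    """Largest centered square crop of a width x height frame: (x, y, side)."""
    side = min(width, height)
    return (width - side) // 2, (height - side) // 2, side

def preprocess_plan(width, height, target):
    """Plan a centered square crop followed by a resize to target x target px."""
    x, y, side = center_crop_box(width, height)
    return {"crop": (x, y, side, side), "resize": (target, target)}

# A hypothetical 1920 x 1080 frame cropped and resized to 300 x 300 px
# (the super-view input size reported in the abstract):
plan = preprocess_plan(1920, 1080, 300)
print(plan)  # {'crop': (420, 0, 1080, 1080), 'resize': (300, 300)}

# The equivalent step with FFmpeg's crop and scale filters would be roughly:
#   ffmpeg -i in.mp4 -vf "crop=1080:1080:420:0,scale=300:300" out.mp4
```

Frames prepared this way would then be passed to the hand-detection model; per the abstract, resizing externally (e.g. with FFmpeg) rather than letting MediaPipe resize internally is what yielded the reported 33% improvement.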

Type
conference paper
DOI
10.1109/SMC53992.2023.10393880
Scopus ID
2-s2.0-85187269135

Author(s)
Izadmehr, Yasaman (École Polytechnique Fédérale de Lausanne)
Aminian, Kamiar (École Polytechnique Fédérale de Lausanne)
Perez-Uribe, Andres (University of Applied Sciences Western Switzerland)

Date Issued
2023
Publisher
Institute of Electrical and Electronics Engineers Inc.
Published in
Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
ISBN of the book
9798350337020
Start page
4702
End page
4707
Editorial or Peer reviewed
REVIEWED
Written at
EPFL

EPFL units
PH-STI  
Event name
IEEE International Conference on Systems, Man, and Cybernetics
Event place
Hybrid, Honolulu, United States
Event date
2023-10-01 - 2023-10-04

Available on Infoscience
January 26, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/244849