conference paper

Dated: Guidelines for Creating Synthetic Datasets for Engineering Design Applications

Picard, Cyril • Schiffmann, Jurg • Ahmed, Faez
January 1, 2023
Proceedings of ASME 2023 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-CIE2023, Vol 3A
ASME International Design Engineering Technical Conferences / Computers and Information in Engineering Conference (IDETC-CIE) / 49th Design Automation Conference (DAC)

Exploiting the recent advancements in artificial intelligence, showcased by ChatGPT and DALL-E, in real-world applications necessitates vast, domain-specific, and publicly accessible datasets. Unfortunately, the scarcity of such datasets poses a significant challenge for researchers aiming to apply these breakthroughs in engineering design. Synthetic datasets emerge as a viable alternative. However, practitioners are often uncertain about generating high-quality datasets that accurately represent real-world data and are suitable for the intended downstream applications. This study aims to fill this knowledge gap by proposing comprehensive guidelines for generating, annotating, and validating synthetic datasets. The trade-offs and methods associated with each of these aspects are elaborated upon. Further, the practical implications of these guidelines are illustrated through the creation of a turbo-compressors dataset. The study underscores the importance of thoughtful sampling methods to ensure the appropriate size, diversity, utility, and realism of a dataset. It also highlights that design diversity does not equate to performance diversity or realism. By employing test sets that represent uniform, real, or task-specific samples, the influence of sample size and sampling strategy is scrutinized. Overall, this paper offers valuable insights for researchers intending to create and publish synthetic datasets for engineering design, thereby paving the way for more effective applications of AI advancements in the field. The code and data for the dataset and methods are made publicly accessible at https://github.com/cyrilpic/radcomp.
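The abstract's point that design diversity does not equate to performance diversity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional design variable, the saturating `toy_performance` function, and the bin-coverage measure are hypothetical and are not taken from the paper or from the radcomp code.

```python
import math


def coverage(values, n_bins, lo=0.0, hi=1.0):
    """Fraction of equal-width bins on [lo, hi] that contain at least one value."""
    occupied = set()
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        occupied.add(idx)
    return len(occupied) / n_bins


def toy_performance(x):
    """Hypothetical saturating response: most designs map to values near 1."""
    return 1.0 - math.exp(-20.0 * x)


# Evenly spaced designs on [0, 1]: maximally diverse in design space.
n_designs = 100
designs = [(i + 0.5) / n_designs for i in range(n_designs)]
performance = [toy_performance(x) for x in designs]

# Compare how well each space is covered by the same set of samples.
design_cov = coverage(designs, n_bins=20)
perf_cov = coverage(performance, n_bins=20)

# Share of samples that land in the top 5% of the performance range.
top_share = sum(p >= 0.95 for p in performance) / n_designs

print(f"design coverage:      {design_cov:.2f}")
print(f"performance coverage: {perf_cov:.2f}")
print(f"share in top 5%:      {top_share:.2f}")
```

In this toy setting the designs occupy every design-space bin, while the performance values leave part of the performance range empty and pile up near the saturation value. A dataset sampled this way would be diverse in design but unbalanced in performance, which is the kind of mismatch the abstract warns about.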

Type
conference paper
Web of Science ID

WOS:001221579000015

Author(s)
Picard, Cyril
Schiffmann, Jurg  
Ahmed, Faez
Corporate authors
American Society of Mechanical Engineers
Date Issued

2023-01-01

Publisher

American Society of Mechanical Engineers

Publisher place

New York

Published in
Proceedings of ASME 2023 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, IDETC-CIE2023, Vol 3A
ISBN of the book

978-0-7918-8730-1

Subjects

Technology

•

Model

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LAMD  
Event name
ASME International Design Engineering Technical Conferences / Computers and Information in Engineering Conference (IDETC-CIE) / 49th Design Automation Conference (DAC)
Event place
Boston, MA
Event date
August 20-23, 2023

Funder
Swiss National Science Foundation
Grant Number
P500PT_206937

Available on Infoscience
June 19, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/208593