Infoscience

conference paper

Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

Sakota, Marija • Peyrard, Maxime • West, Robert
January 1, 2024
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, WSDM 2024
17th ACM International Conference on Web Search and Data Mining (WSDM)

Generative language models (LMs) have become omnipresent across data science. For a wide variety of tasks, inputs can be phrased as natural language prompts for an LM, from whose output the solution can then be extracted. LM performance has consistently been increasing with model size, but so has the monetary cost of querying the ever larger models. Importantly, however, not all inputs are equally hard: some require larger LMs for obtaining a satisfactory solution, whereas for others smaller LMs suffice. Based on this fact, we design a framework for cost-effective language model choice, called "Fly-swat or cannon" (FORC). Given a set of inputs and a set of candidate LMs, FORC judiciously assigns each input to an LM predicted to do well on the input according to a so-called meta-model, aiming to achieve high overall performance at low cost. The cost-performance tradeoff can be flexibly tuned by the user. Options include, among others, maximizing total expected performance (or the number of processed inputs) while staying within a given cost budget, or minimizing total cost while processing all inputs. We evaluate FORC on 14 datasets covering five natural language tasks, using four candidate LMs of vastly different size and cost. With FORC, we match the performance of the largest available LM while achieving a cost reduction of 63%. Via our publicly available library, researchers as well as practitioners can thus save large amounts of money without sacrificing performance.
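The budget-constrained option described in the abstract can be illustrated with a small sketch. This is not the FORC library's API and not necessarily the assignment strategy used in the paper: it is a simple greedy heuristic chosen for illustration. The function name `assign_models`, the score table `predicted` (standing in for meta-model outputs), and the per-query `cost` values are all hypothetical.

```python
from typing import Dict, List


def assign_models(
    predicted: List[Dict[str, float]],  # predicted[i][m]: assumed meta-model score of model m on input i
    cost: Dict[str, float],             # cost[m]: assumed cost of one query to model m
    budget: float,
) -> List[str]:
    """Greedy sketch: put every input on the cheapest model, then repeatedly
    apply the upgrade with the best predicted gain per extra unit of cost."""
    cheapest = min(cost, key=cost.get)
    assignment = [cheapest] * len(predicted)
    spent = cost[cheapest] * len(predicted)
    if spent > budget:
        raise ValueError("budget cannot cover the cheapest model for all inputs")

    while True:
        best = None  # (gain per extra cost, input index, model name)
        for i, scores in enumerate(predicted):
            current = assignment[i]
            for model, score in scores.items():
                extra = cost[model] - cost[current]
                gain = score - scores[current]
                # Skip moves that do not help or that would exceed the budget.
                if extra <= 0 or gain <= 0 or spent + extra > budget:
                    continue
                ratio = gain / extra
                if best is None or ratio > best[0]:
                    best = (ratio, i, model)
        if best is None:
            return assignment
        _, i, model = best
        spent += cost[model] - cost[assignment[i]]
        assignment[i] = model


# Hypothetical scores: only the second input benefits much from the large model.
predicted = [
    {"small": 0.9, "large": 0.95},
    {"small": 0.2, "large": 0.9},
    {"small": 0.5, "large": 0.6},
]
cost = {"small": 1.0, "large": 10.0}
print(assign_models(predicted, cost, budget=12.0))  # -> ['small', 'large', 'small']
```

With a budget of 12, only one upgrade is affordable, and the heuristic spends it on the input where the predicted gain is largest. A greedy rule like this is not guaranteed to be optimal; an exact formulation would treat the assignment as a constrained optimization problem.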

Type
conference paper
DOI
10.1145/3616855.3635825
Web of Science ID

WOS:001182230100069

Author(s)
Sakota, Marija  
Peyrard, Maxime  
West, Robert  
Corporate authors
Association for Computing Machinery
Date Issued

2024-01-01

Publisher

Association for Computing Machinery

Publisher place

New York

Published in
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, WSDM 2024
ISBN of the book

979-8-4007-0371-3

Start page

606

End page

615

Subjects

Technology • Generative Models • Cost-Performance Tradeoff • Meta-Modelling

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
DLAB  
Event name

17th ACM International Conference on Web Search and Data Mining (WSDM)

Event place

Mérida, Mexico

Event date

March 4-8, 2024

Funders (grant number)

Swiss National Science Foundation (200021_185043)

Microsoft Swiss Joint Research Center

Google

Available on Infoscience
May 1, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/207609

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved