Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

Sakota, Marija • Peyrard, Maxime • West, Robert

January 1, 2024

Proceedings of the 17th ACM International Conference on Web Search and Data Mining, WSDM 2024

17th ACM International Conference on Web Search and Data Mining (WSDM)

Generative language models (LMs) have become omnipresent across data science. For a wide variety of tasks, inputs can be phrased as natural language prompts for an LM, from whose output the solution can then be extracted. LM performance has consistently been increasing with model size, but so has the monetary cost of querying the ever-larger models. Importantly, however, not all inputs are equally hard: some require larger LMs to obtain a satisfactory solution, whereas for others smaller LMs suffice. Based on this fact, we design a framework for cost-effective language model choice, called "Fly-swat or cannon" (FORC). Given a set of inputs and a set of candidate LMs, FORC judiciously assigns each input to an LM predicted to do well on the input according to a so-called meta-model, aiming to achieve high overall performance at low cost. The cost-performance tradeoff can be flexibly tuned by the user. Options include, among others, maximizing total expected performance (or the number of processed inputs) while staying within a given cost budget, or minimizing total cost while processing all inputs. We evaluate FORC on 14 datasets covering five natural language tasks, using four candidate LMs of vastly different size and cost. With FORC, we match the performance of the largest available LM while achieving a cost reduction of 63%. Via our publicly available library, researchers as well as practitioners can thus save large amounts of money without sacrificing performance.
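
The budget-constrained option described in the abstract can be illustrated with a small sketch. This is not the FORC library's API or the authors' solver; it is a simple greedy heuristic, swapped in for illustration, that assumes the meta-model's predicted scores and the per-query costs are already available as plain lists. The names `assign_models`, `pred`, `cost`, and `budget` are hypothetical.

```python
from typing import List


def assign_models(pred: List[List[float]],
                  cost: List[List[float]],
                  budget: float) -> List[int]:
    """Greedy sketch (an assumption, not the paper's method).

    pred[i][m]: hypothetical meta-model score predicted for model m on input i.
    cost[i][m]: hypothetical cost of querying model m on input i.
    Returns one model index per input, with total cost <= budget.
    """
    n = len(pred)
    # Start every input on its cheapest model so that all inputs are processed.
    choice = [min(range(len(cost[i])), key=lambda m: cost[i][m])
              for i in range(n)]
    spent = sum(cost[i][choice[i]] for i in range(n))
    if spent > budget:
        raise ValueError("budget too small to process all inputs")

    while True:
        best = None  # (gain per unit of extra cost, input index, model index)
        for i in range(n):
            cur = choice[i]
            for m in range(len(pred[i])):
                gain = pred[i][m] - pred[i][cur]
                extra = cost[i][m] - cost[i][cur]
                # Skip moves that do not help or that exceed the budget.
                if gain <= 0 or spent + extra > budget:
                    continue
                # A better model at no extra cost is always worth taking.
                ratio = gain / extra if extra > 0 else float("inf")
                if best is None or ratio > best[0]:
                    best = (ratio, i, m)
        if best is None:
            return choice  # no affordable improvement remains
        _, i, m = best
        spent += cost[i][m] - cost[i][choice[i]]
        choice[i] = m


# Toy example: model 0 is small and cheap, model 1 is large and expensive.
pred = [[0.90, 0.95],   # easy input: the small model is nearly as good
        [0.20, 0.90]]   # hard input: the large model helps a lot
cost = [[1.0, 10.0],
        [1.0, 10.0]]
print(assign_models(pred, cost, budget=11.0))  # [0, 1]
```

With a budget that covers only one expensive query, the sketch spends it on the hard input and leaves the easy input on the small model. An exact formulation, for instance as an integer program, would replace the greedy loop; the heuristic is used here only to keep the example short.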

Type

conference paper

DOI

10.1145/3616855.3635825

Web of Science ID

WOS:001182230100069

Author(s)

Sakota, Marija

Peyrard, Maxime

West, Robert

Corporate authors

Association for Computing Machinery

Date Issued

2024-01-01

Publisher

Association for Computing Machinery

Publisher place

New York

Published in

Proceedings of the 17th ACM International Conference on Web Search and Data Mining, WSDM 2024

ISBN of the book

979-8-4007-0371-3

Start page

606

End page

615

Subjects

Technology • Generative Models • Cost-Performance Tradeoff • Meta-Modelling

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

DLAB

Event name	Event place	Event date
17th ACM International Conference on Web Search and Data Mining (WSDM)	Mérida, Mexico	March 4-8, 2024

Funder	Grant Number
Swiss National Science Foundation	200021_185043
Microsoft Swiss Joint Research Center
Google

Available on Infoscience

May 1, 2024

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/207609