Infoscience

conference paper not in proceedings

Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors

Nie, Fan • Feng, Lan • Ye, Haotian
April 7, 2025
CONFERENCE ON LANGUAGE MODELING 2025

Efficiently leveraging the capabilities of contemporary large language models (LLMs) is increasingly challenging, particularly when direct fine-tuning is expensive and often impractical. Existing training-free methods, including manually or automatically designed workflows, typically demand substantial human effort or yield suboptimal results. This paper proposes Weak-for-Strong Harnessing (W4S), a novel framework that customizes smaller, cost-efficient language models to design and optimize workflows for harnessing stronger models. W4S formulates workflow design as a multi-turn Markov decision process and introduces reinforcement learning for agentic workflow optimization (RLAO) to train a weak meta-agent. Through iterative interaction with the environment, the meta-agent learns to design increasingly effective workflows without manual intervention. Empirical results demonstrate the superiority of W4S: our 7B meta-agent, trained with just one GPU hour, outperforms the strongest baseline by 2.9% to 24.6% across eleven benchmarks, successfully elevating the performance of state-of-the-art models such as GPT-3.5-Turbo and GPT-4o. Notably, W4S exhibits strong generalization capabilities across both seen and unseen tasks, offering an efficient, high-performing alternative to directly fine-tuning strong models. Code is available here.
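
The multi-turn interaction described in the abstract can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the executor, the two candidate workflows, the validation set, the stub policy, and the choice of validation accuracy as the reward are all hypothetical. The abstract does not specify the reward, and the policy-update step that RLAO would perform on the collected trajectory is omitted.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a strong executor: an adder that is wrong
# whenever the true sum is odd. A real setup would call an LLM instead.
def executor(question: str) -> int:
    a, b = (int(x) for x in question.split("+"))
    total = a + b
    return total if total % 2 == 0 else total + 1

# Two hypothetical candidate workflows built around the executor.
def direct(question: str) -> int:
    return executor(question)

def verify_and_fix(question: str) -> int:
    answer = executor(question)
    a, b = (int(x) for x in question.split("+"))
    # Toy parity check on the executor's answer; repair it if the check fails.
    return answer if answer % 2 == (a + b) % 2 else answer - 1

WORKFLOWS: Dict[str, Callable[[str], int]] = {
    "direct": direct,
    "verify_and_fix": verify_and_fix,
}

# Hypothetical validation set of (question, expected answer) pairs.
VALIDATION: List[Tuple[str, int]] = [("1+1", 2), ("1+2", 3), ("2+3", 5), ("4+4", 8)]

def evaluate(name: str) -> float:
    """Environment feedback: accuracy of a workflow on the validation set."""
    workflow = WORKFLOWS[name]
    correct = sum(workflow(q) == y for q, y in VALIDATION)
    return correct / len(VALIDATION)

def meta_agent(history: List[Tuple[str, float]]) -> str:
    """Stub policy: propose the first untried workflow, else the best so far."""
    tried = {name for name, _ in history}
    for name in WORKFLOWS:
        if name not in tried:
            return name
    return max(history, key=lambda turn: turn[1])[0]

def run_episode(turns: int) -> List[Tuple[str, float]]:
    """Multi-turn loop: state = history, action = workflow, reward = accuracy."""
    history: List[Tuple[str, float]] = []
    for _ in range(turns):
        action = meta_agent(history)   # meta-agent proposes a workflow
        reward = evaluate(action)      # environment scores it
        history.append((action, reward))
    return history

trajectory = run_episode(turns=3)
best_name, best_score = max(trajectory, key=lambda turn: turn[1])
print(trajectory, best_name, best_score)
```

In this sketch the state is the history of proposed workflows and their scores, and each action is a workflow choice. A learned meta-agent would replace the stub policy and generate workflows rather than pick from a fixed table.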

Name: 2504.04785v1.pdf
Type: Main Document
Version: http://purl.org/coar/version/c_71e4c1898caa6e32
Access type: openaccess
License Condition: CC BY
Size: 2.49 MB
Format: Adobe PDF
Checksum (MD5): 1eb46935ddea31f3bc7f0cdd3a57ae95


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved