Infoscience

conference paper

Multiplex: Unifying conventional and speculative thread-level parallelism on a chip multiprocessor

Ooi, Chong-Liang
•
Kim, Seon Wook
•
Park, Ill
2001
Proceedings of the International Conference on Supercomputing

Recent proposals for Chip Multiprocessors (CMPs) advocate speculative, or implicit, threading, in which the hardware employs prediction to peel off instruction sequences (i.e., implicit threads) from the sequential execution stream and speculatively executes them in parallel on multiple processor cores. These proposals augment a conventional multiprocessor, which employs explicit threading, with the ability to handle implicit threads. Current proposals focus only on implicitly-threaded code sections. This paper identifies, for the first time, the issues in combining explicit and implicit threading. We present the Multiplex architecture to combine the two threading models. Multiplex exploits the similarities between implicit and explicit threading, and provides unified support for the two threading models without additional hardware. Multiplex groups a subset of protocol states in an implicitly-threaded CMP to provide a write-invalidate protocol for explicit threads. Using a fully-integrated compiler infrastructure for automatic generation of Multiplex code, this paper presents a detailed performance analysis for entire benchmarks, instead of just implicitly-threaded sections, as done in previous papers. We show that neither threading model alone performs consistently better than the other across the benchmarks. A CMP with four dual-issue CPUs achieves speedups of 1.48 and 2.17 over one dual-issue CPU, using implicit-only and explicit-only threading, respectively. Multiplex matches or outperforms the better of the two threading models for every benchmark, and a four-CPU Multiplex achieves a speedup of 2.63. Our detailed analysis indicates that the dominant overheads in an implicitly-threaded CMP are speculation state overflow due to limited L1 cache capacity, and load imbalance and data dependences in fine-grain threads.
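
To put the three reported speedups on a common footing, the following minimal sketch derives two quantities from them: parallel efficiency (speedup divided by the number of CPUs) and the relative gain of Multiplex over each single-model configuration. Only the three speedup figures and the four-CPU count come from the abstract. The function names and the choice of derived metrics are illustrative assumptions, not something the paper reports.

```python
# Speedups over one dual-issue CPU, as stated in the abstract for a four-CPU CMP.
speedups = {
    "implicit-only": 1.48,
    "explicit-only": 2.17,
    "Multiplex": 2.63,
}
NUM_CPUS = 4  # four dual-issue CPUs, per the abstract


def parallel_efficiency(speedup: float, cpus: int = NUM_CPUS) -> float:
    """Speedup per CPU: a standard derived metric, not a figure from the paper."""
    return speedup / cpus


def relative_gain(new: float, old: float) -> float:
    """Fractional improvement of speedup `new` over speedup `old`."""
    return new / old - 1.0


for name, s in speedups.items():
    print(f"{name}: speedup {s:.2f}, efficiency {parallel_efficiency(s):.2f}")

for base in ("implicit-only", "explicit-only"):
    gain = relative_gain(speedups["Multiplex"], speedups[base])
    print(f"Multiplex vs {base}: {gain:.0%} higher speedup")
```

Under these assumptions, Multiplex's speedup is roughly 78% higher than implicit-only threading and roughly 21% higher than explicit-only threading. These are ratios of the aggregate figures quoted in the abstract and say nothing about individual benchmarks.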

Type: conference paper
DOI: 10.1145/377792.377863
Author(s): Ooi, Chong-Liang; Kim, Seon Wook; Park, Ill; Eigenmann, Rudolph; Falsafi, Babak; Vijaykumar, T. N.
Date Issued: 2001
Publisher place: Yorktown Heights, NY, USA
Published in: Proceedings of the International Conference on Supercomputing
Start page: 368
End page: 380
Editorial or Peer reviewed: REVIEWED
Written at: OTHER
EPFL units: PARSA
Available on Infoscience: April 6, 2009
Use this identifier to reference this record: https://infoscience.epfl.ch/handle/20.500.14299/36923
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved