Infoscience
semester or other student projects

LaMBERT: Light and Multigranular BERT

Milosheski, Ljupche  
July 1, 2020

Pre-training complex language models is essential to the success of recent methods such as BERT or OpenAI GPT. Their size makes not only the pre-training phase but also subsequent applications computationally expensive. BERT-like models excel at token-level tasks because they provide reliable token embeddings, but they fall short for sentence- or higher-level structure embeddings, since they lack a built-in mechanism that explicitly produces such representations. We introduce Light and Multigranular BERT (LaMBERT), which has a number of parameters similar to BERT but is about 3 times faster. This is achieved by modifying the input representation, which in turn changes the attention mechanism and, because segment representation is one of our training objectives, also yields reliable segment embeddings. The model we publish achieves 70.7% on the MNLI task, which is promising given that two major issues remained with it.
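
The abstract does not spell out how the modified input representation exposes segment-level embeddings, so the following is only a minimal PyTorch sketch of the general idea of producing segment embeddings alongside token embeddings by pooling tokens within each segment. All names (MultigranularEmbedding, segment_proj) are hypothetical; this is not the published LaMBERT architecture.

```python
import torch
import torch.nn as nn

class MultigranularEmbedding(nn.Module):
    """Toy illustration: token embeddings plus pooled segment embeddings.

    Hypothetical sketch of a multigranular input representation; it does not
    reproduce the LaMBERT model described in the record.
    """

    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden_size)
        self.segment_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, token_ids: torch.Tensor, segment_ids: torch.Tensor):
        # token_ids, segment_ids: (batch, seq_len); segment_ids assigns each
        # token to a segment index (e.g. the sentence it belongs to).
        tok = self.token_emb(token_ids)                       # (B, T, H)
        num_segments = int(segment_ids.max().item()) + 1
        seg = torch.zeros(tok.size(0), num_segments, tok.size(2))
        for s in range(num_segments):
            # Mean-pool the token embeddings that fall inside segment s.
            mask = (segment_ids == s).unsqueeze(-1).float()   # (B, T, 1)
            seg[:, s] = (tok * mask).sum(1) / mask.sum(1).clamp(min=1.0)
        # Return both granularities: token-level and segment-level embeddings.
        return tok, self.segment_proj(seg)

if __name__ == "__main__":
    model = MultigranularEmbedding(vocab_size=30522, hidden_size=64)
    ids = torch.randint(0, 30522, (2, 10))
    segs = torch.tensor([[0] * 5 + [1] * 5] * 2)
    tok_emb, seg_emb = model(ids, segs)
    print(tok_emb.shape, seg_emb.shape)  # (2, 10, 64) and (2, 2, 64)
```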

Files
  • Name: LaMBERT.pdf
  • Type: Preprint
  • Version: Submitted version (Preprint)
  • Access type: openaccess
  • Size: 171.53 KB
  • Format: Adobe PDF
  • Checksum (MD5): 0b16d2adf69378965524ace34b29ef84
