This paper introduces a new pre-training method for bidirectional transformers that improves performance on a variety of language understanding problems.
Cloze-Driven Pretraining of Self-Attention…