Although only released in 2017, the Transformer architecture (Vaswani et al.) has become the de facto standard approach for sequence-to-sequence problems such as translation or abstractive text summarization. Despite this success, the relatively slow inference process, as well as the general lack of flexibility caused by the autoregressive nature of the generation process, can quickly become a problem. The Levenshtein Transformer (Gu et al., 2019) proposes to overcome these flaws by not modelling the output sequence directly, but instead modelling two operations on top of a given input: insertions and deletions. This effectively turns generation into a non-autoregressive process and allows the model to refine a given sequence over and over again in a dynamic and flexible way. In this talk, we will give a short introduction to language modelling and common implementations such as recurrent models and Transformers, before turning to the Levenshtein Transformer itself. Finally, we will briefly discuss possible applications beyond the scope of the original paper.
Modern neural sequence generation models are built to either generate tokens step-by-step from scratch or (iteratively) modify a sequence of tokens bounded by a fixed length. In this work, we develop Levenshtein Transformer, a new partially autoregressive model devised for more flexible and amenable sequence generation. Unlike previous approaches, the atomic operations of our model are insertion …
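To make the insertion/deletion refinement loop concrete, here is a minimal toy sketch of the control flow. In the real model, the deletion and insertion decisions are made by learned policy networks that never see the target; the hand-written "policies" below simply use the target as an oracle, and all function names are illustrative, not taken from the paper's code.

```python
# Toy illustration of the Levenshtein Transformer's iterative refinement
# loop (Gu et al., 2019): alternately delete and insert tokens until the
# sequence stops changing. In the actual model these decisions come from
# learned classifiers; here they are trivial oracle stand-ins.

def delete_policy(seq, target):
    # Toy deletion: keep only tokens that greedily form a subsequence
    # of the target, dropping everything else.
    out, i = [], 0
    for t in seq:
        j = i
        while j < len(target) and target[j] != t:
            j += 1
        if j < len(target):
            out.append(t)
            i = j + 1
    return out

def insert_policy(seq, target):
    # Toy insertion: insert at most one missing token per pass,
    # mimicking iterative (rather than one-shot) refinement.
    for i, t in enumerate(target):
        if i >= len(seq) or seq[i] != t:
            return seq[:i] + [t] + seq[i:]
    return seq

def refine(seq, target, max_iters=20):
    # Alternate deletion and insertion until the sequence is stable.
    for _ in range(max_iters):
        new = insert_policy(delete_policy(seq, target), target)
        if new == seq:
            break
        seq = new
    return seq

print("".join(refine(list("hxlzlo"), list("hello"))))  # prints "hello"
```

The point of the sketch is the *loop*, not the policies: generation starts from an arbitrary (even empty) sequence and converges through repeated edits, which is what gives the model its non-autoregressive flexibility.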