35

I am a new learner in NLP. I am interested in the sentence generating task. As far as I am concerned, one state-of-the-art method is the CharRNN, which uses RNN to generate a sequence of words.

However, BERT has come out several weeks ago and is very powerful. Therefore, I am wondering whether this task can also be done with the help of BERT? I am a new learner in this field, and thank you for any advice!

kmario23
  • 105
  • 5
ch271828n
  • 453
  • 1
  • 4
  • 11

3 Answers3

37

For newbies, NO.

Sentence generation requires sampling from a language model, which gives the probability distribution of the next word given previous contexts. But BERT can't do this due to its bidirectional nature.


For advanced researchers, YES.

You can start with a sentence of all [MASK] tokens, and generate words one by one in arbitrary order (instead of the common left-to-right chain decomposition). Though the text generation quality is hard to control.

Here's the technical report BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model, its errata and the source code.


In summary:

  • If you would like to do some research in the area of decoding with BERT, there is a huge space to explore
  • If you would like to generate high quality texts, personally I recommend you to check GPT-2.
soloice
  • 531
  • 4
  • 7
10

this experiment by Stephen Mayhew suggests that BERT is lousy at sequential text generation:

http://mayhewsw.github.io/2019/01/16/can-bert-generate-text/

although he had already eaten a large meal, he was still very hungry

As before, I masked “hungry” to see what BERT would predict. If it could predict it correctly without any right context, we might be in good shape for generation.

This failed. BERT predicted “much” as the last word. Maybe this is because BERT thinks the absence of a period means the sentence should continue. Maybe it’s just so used to complete sentences it gets confused. I’m not sure.

One might argue that we should continue predicting after “much”. Maybe it’s going to produce something meaningful. To that I would say: first, this was meant to be a dead giveaway, and any human would predict “hungry”. Second, I tried it, and it keeps predicting dumb stuff. After “much”, the next token is “,”.

So, at least using these trivial methods, BERT can’t generate text.

stuart
  • 201
  • 2
  • 5
2

No. Sentence generating is directly related to language modelling (given the previous words in the sentence, what is the next word). Because of bi-directionality of BERT, BERT cannot be used as a language model. If it cannot be used as language model, I don't see how you can generate a sentence using BERT.

nbro
  • 42,615
  • 12
  • 119
  • 217
Astariul
  • 371
  • 3
  • 15