Is there a popular diffusion-model-based framework for language modelling? If not, is it because of the difficulty of sampling from discrete distributions?
2 Answers
1
The problem with current diffusion models is that, while they are great density models, they require the size of the sample (i.e. the size of the image) to be known a priori, which is definitely not the case for text.
However, you might get around this by first training an autoencoder: an encoder that maps text to a fixed-size vector, and a decoder that maps that vector back to text. You can then run the diffusion model in the latent space of that autoencoder.
However, most likely it won't give great results, for the same reasons Seq2Seq is not the SoTA for text generation.
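To make the fixed-size point concrete, here is a minimal sketch of the idea above, not a real implementation. `toy_encode` and `LATENT_DIM` are hypothetical stand-ins for a learned text encoder and its latent size, and the linear beta schedule (1e-4 to 0.02) is just an assumed choice. The learned denoiser and the text decoder are left out entirely; the sketch only shows that texts of different lengths end up as same-shaped latents that can be noised with the standard Gaussian forward process.

```python
import math
import random

LATENT_DIM = 8  # hypothetical fixed latent size (assumption)


def toy_encode(text, dim=LATENT_DIM):
    """Hypothetical stand-in for a learned encoder.

    Maps text of any length to a unit-norm vector of fixed size `dim`.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Product of (1 - beta_s) for s = 1..t under an assumed linear schedule."""
    prod = 1.0
    for s in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (s - 1) / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def forward_diffuse(z0, t, num_steps, rng):
    """Sample z_t from q(z_t | z_0) = N(sqrt(a_bar) * z_0, (1 - a_bar) * I)."""
    a_bar = alpha_bar(t, num_steps)
    return [
        math.sqrt(a_bar) * z + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for z in z0
    ]


rng = random.Random(0)
short_z = toy_encode("hi")
long_z = toy_encode("a much longer sentence than the first one")

# Both latents have the same size, so the diffusion model never sees text length.
noisy_short = forward_diffuse(short_z, t=500, num_steps=1000, rng=rng)
noisy_long = forward_diffuse(long_z, t=500, num_steps=1000, rng=rng)
print(len(noisy_short), len(noisy_long))
```

In a real system the encoder and decoder would be trained jointly, and a denoising network would be trained to reverse `forward_diffuse`; the quality of the generated text is then bounded by how well the decoder can reconstruct text from a single fixed-size vector.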
Alberto
0
Examples of language modelling using diffusion:
- CodeFusion: A Pre-trained Diffusion Model for Code Generation. Mukul Singh, José Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, Gust Verbruggen. EMNLP 2023.
- Step-unrolled Denoising Autoencoders for Text Generation. Nikolay Savinov, Junyoung Chung, Mikolaj Binkowski, Erich Elsen, Aaron van den Oord. ICLR 2022.
Franck Dernoncourt