
I am looking for a multi-modal model that can generate a mix of images and text in response to a text-only query.

Please find a sample expected input and output below.

Are there any models that can achieve this? Please suggest.

I have seen that NExT-GPT can generate images from a prompt. However, I am wondering whether there are models that can decide for themselves whether part of a text answer would be better presented as an image, and then include that image in the output.

e.g.

Input - What was the market share of smartphones in the year 2008?

Output - The market share of smartphones, broken down by manufacturer, is listed below:

- Nokia: 265.6148 million (32.5% market share)
- Motorola: 144.9204 million (17.7% market share)
- Samsung: 103.7536 million (12.7% market share)
- LG: 54.9246 million (6.7% market share)
- Sony Ericsson: 51.7738 million (6.3% market share)
- Siemens: 28.5906 million (3.5% market share)
- Others: 166.9851 million (20.6% market share)
- Total: 816.5629 million

pie chart

Source - Wikipedia
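
To make the "decide if a text can be presented as an image" step more concrete, here is a minimal sketch of the kind of check I have in mind. It is not taken from any existing model. The names `parse_shares` and `looks_like_pie_chart`, and the rule "percentages that add up to roughly 100 suggest a pie chart", are my own assumptions for illustration only.

```python
import re

# Sample text output copied from the example above.
ANSWER = (
    "Nokia: 265.6148 million (32.5% market share) "
    "Motorola: 144.9204 million (17.7% market share) "
    "Samsung: 103.7536 million (12.7% market share) "
    "LG: 54.9246 million (6.7% market share) "
    "Sony Ericsson: 51.7738 million (6.3% market share) "
    "Siemens: 28.5906 million (3.5% market share) "
    "Others: 166.9851 million (20.6% market share) "
    "Total: 816.5629 million"
)

# Matches "<name>: <units> million (<share>% market share)".
# The "Total" entry has no percentage, so it is skipped.
ENTRY = re.compile(r"([A-Za-z ]+?): ([\d.]+) million \(([\d.]+)% market share\)")


def parse_shares(text):
    """Return {manufacturer: share in percent} for every matching entry."""
    return {name.strip(): float(share) for name, _units, share in ENTRY.findall(text)}


def looks_like_pie_chart(shares, tolerance=1.0):
    """Hypothetical heuristic: at least two parts that sum to about 100%."""
    return len(shares) >= 2 and abs(sum(shares.values()) - 100.0) <= tolerance


shares = parse_shares(ANSWER)
if looks_like_pie_chart(shares):
    print("This answer could be shown as a pie chart:", sorted(shares))
```

A model of the kind I am asking about would make this decision itself and return the rendered chart alongside the text, rather than relying on a hand-written rule like this one.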

nbro
