🔗 Reference

April 13, 2022 - Hierarchical text-conditional image generation with CLIP latents

https://arxiv.org/pdf/2204.06125.pdf

https://openai.com/research/hierarchical-text-conditional-image-generation-with-clip-latents

✍🏻 Background


The Images API provides three methods for interacting with images:

  1. Creating images from scratch based on a text prompt
  2. Creating edits of an existing image based on a new text prompt
  3. Creating variations of an existing image
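The three methods above correspond to three REST endpoints of the Images API. The following is a minimal sketch, not an official client: it only builds (and does not send) a JSON request for the first method, text-to-image generation. The helper name `build_generation_request`, the `IMAGE_ENDPOINTS` dictionary, the example prompt, and the `YOUR_API_KEY` placeholder are illustrative assumptions and not part of the original notes.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"

# The three Images API operations and their REST paths.
IMAGE_ENDPOINTS = {
    "generate": "/images/generations",  # new image from a text prompt
    "edit": "/images/edits",            # edit an existing image with a new prompt
    "variation": "/images/variations",  # variations of an existing image
}


def build_generation_request(prompt, api_key, n=1, size="1024x1024"):
    """Build (but do not send) a JSON request for text-to-image generation.

    Hypothetical helper for illustration only.
    """
    body = json.dumps({"prompt": prompt, "n": n, "size": size}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + IMAGE_ENDPOINTS["generate"],
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key: a real key is needed before the request can be sent.
request = build_generation_request("a corgi playing a trumpet", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) requires a valid API key. The edit and variation endpoints expect the source image as a `multipart/form-data` upload rather than a JSON body, so they would need a different request builder than the one sketched here.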


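<aside> 🌟 The mapping is not one-to-one: using the diffusion model alone, any number of images can be generated, differing only in their details.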

</aside>

⏰ Timeline

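| Date (MM/YY) | Model | Organization | Notes |
| --- | --- | --- | --- |
| 01/21 | DALLE | OpenAI | 12 billion parameters |
| 05/21 | CogView | Tsinghua University | Supports image generation from Chinese text |
| 11/21 | Nvwa (NÜWA, 女娲) | Microsoft and Peking University | Images and short videos |
| 12/21 | GLIDE | OpenAI | Basis for DALLE 2 |
| 12/21 | ERNIE-VILG | Baidu | Supports Chinese; 10 billion parameters |
| 04/22 | DALLE 2 | OpenAI | |
| 04/22 | CogView2 | Tsinghua University | |
| 05/22 | Cog Video | Tsinghua University | Targets video generation |
| 05/22 | Imagen | Google | Simple model, similar to DALLE; uses a diffusion model |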