Okay, this is pretty cool.
DreamFusion, Google's next-gen, AI-powered text-to-3D-image generator, is here.
Well, sort of. A proof-of-concept paper is here, at least. DreamFusion is an evolution of Dream Fields, a text-to-3D-image generator revealed by Google back in 2021. And like Dream Fields, DreamFusion creates its 3D images by combining a Neural Radiance Field (NeRF), a neural network that can synthesize 3D scenes from a partial set of 2D images, with a pre-trained text-to-image model.
The twist? Unlike Dream Fields, which utilized OpenAI's CLIP technology as that latter pre-trained model, DreamFusion now uses its own: Imagen, Google's DALL-E 2 competitor.
So, basically, Google booted OpenAI's tech and figured out how to use its own. Keeping things in-house. Smart.
"Happy to announce DreamFusion, our new method for Text-to-3D!" Ben Poole, a research scientist at Google Brain and co-author of the proof-of-concept paper, wrote on Twitter. "We optimize a NeRF from scratch using a pre-trained text-to-image diffusion model. No 3D data needed!"
Happy to announce DreamFusion, our new method for Text-to-3D! https://t.co/4xI2VHcoQW
We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed!
Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron #dreamfusion pic.twitter.com/YeG0zaFxuu
— Ben Poole (@poolio) September 29, 2022
Ghost Eating a Hamburger
While the DreamFusion models aren't totally realistic, they're admittedly pretty impressive. As the creators explain in the paper, the AI-generated forms shown off on the project's website are "coherent, with high-quality normals, surface geometry and depth, and are relightable with a Lambertian shading model."
In other words, while they might not be as convincingly realistic as some of those photorealistic DALL-E 2 images (yet), they have all of the right elements. The proportions are right, the depth makes sense, and so on. And not to shade OpenAI, but this next version of the tech is certainly a visual improvement over its first iteration.
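For the curious, "Lambertian" just means matte: a surface's brightness depends only on the angle between its normal and the incoming light. Here's a minimal Python sketch of that idea. It's an illustration under our own assumptions, not the DreamFusion code; the function name `lambertian_shade` and its parameters are made up for this example.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(x / length for x in v)


def lambertian_shade(albedo, normal, light_dir,
                     light_color=(1.0, 1.0, 1.0),
                     ambient=(0.0, 0.0, 0.0)):
    """Shade one surface point with a basic Lambertian (diffuse) model.

    albedo      -- RGB base color of the surface, each channel in [0, 1]
    normal      -- surface normal (any length, normalized here)
    light_dir   -- direction from the surface toward the light
    light_color -- RGB intensity of the light (assumed white by default)
    ambient     -- RGB ambient term (assumed zero by default)
    """
    n = normalize(normal)
    l = normalize(light_dir)
    # Cosine of the angle between normal and light, clamped so that
    # surfaces facing away from the light receive no diffuse light.
    cos_theta = max(0.0, sum(a * b for a, b in zip(n, l)))
    return tuple(a * (lc * cos_theta + am)
                 for a, lc, am in zip(albedo, light_color, ambient))


# Same gray surface point, lit head-on and then from behind.
front = lambertian_shade((0.5, 0.5, 0.5), (0, 0, 1), (0, 0, 1))
back = lambertian_shade((0.5, 0.5, 0.5), (0, 0, 1), (0, 0, -1))
```

Because the base color and the lighting are kept separate, the same surface point can be re-lit just by passing a different `light_dir`, which is roughly what "relightable" means here.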
It's unclear when DreamFusion — or whatever comes next — will be available to the public, though we can definitely see a number of applications already. Just think of the value to indie game developers alone! And according to Twitter, it's already been used to 3D-print a ghost eating a hamburger, so cheers to that.