jd7h / zero123plusplus

Turn an image into a set of images from different 3D angles

  • Public
  • 6.5K runs

Run time and cost

This model runs on Nvidia A40 (Large) GPU hardware. Predictions typically complete within 117 seconds. The predict time for this model varies significantly based on the inputs.

Readme

Zero123++

Zero123++ is a single-image-to-consistent-multi-view diffusion base model. The input image must be square, and the recommended resolution is at least 320x320 pixels.
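One way to meet the square-input requirement is to pad a non-square image onto a square canvas, upscaling first if it falls below the recommended size. The sketch below only computes the geometry. The function name `plan_square_input` and the pad-and-center strategy are illustrative assumptions, not part of Zero123++ or this model's API; cropping would work as well.

```python
MIN_SIDE = 320  # recommended minimum side length from the README


def plan_square_input(width, height, min_side=MIN_SIDE):
    """Plan how to fit a width x height image onto a square canvas.

    Hypothetical helper: returns (side, left, top, scale). Resize the
    image by `scale`, then paste it at (left, top) on a side x side canvas.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    # Upscale only when the longest edge is below the recommended minimum.
    scale = max(1.0, min_side / longest)
    new_w = round(width * scale)
    new_h = round(height * scale)
    side = max(new_w, new_h)
    # Center the resized image on the square canvas.
    left = (side - new_w) // 2
    top = (side - new_h) // 2
    return side, left, top, scale


if __name__ == "__main__":
    print(plan_square_input(640, 480))
    print(plan_square_input(100, 200))
```

The returned values can be passed to any image library to do the actual resize and paste.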

Output views are a fixed set of camera poses relative to the input view:

  • Azimuth (degrees): 30, 90, 150, 210, 270, 330.
  • Elevation (degrees): 30, -20, 30, -20, 30, -20.
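The two lists pair up index by index, so the six output views correspond to six (azimuth, elevation) poses. As a rough sketch, the poses can be converted to camera positions on a sphere around the object. The z-up axis convention, the unit radius, and the name `camera_positions` are assumptions made for illustration; the README does not specify a coordinate convention.

```python
import math

# Pose angles in degrees, taken from the lists above.
AZIMUTHS_DEG = [30, 90, 150, 210, 270, 330]
ELEVATIONS_DEG = [30, -20, 30, -20, 30, -20]


def camera_positions(radius=1.0):
    """Return a list of (azimuth, elevation, (x, y, z)) for the six views.

    Assumes a z-up frame with azimuth measured in the x-y plane from the
    x axis. This convention is an illustrative choice, not the model's.
    """
    poses = []
    for az, el in zip(AZIMUTHS_DEG, ELEVATIONS_DEG):
        az_r = math.radians(az)
        el_r = math.radians(el)
        x = radius * math.cos(el_r) * math.cos(az_r)
        y = radius * math.cos(el_r) * math.sin(az_r)
        z = radius * math.sin(el_r)
        poses.append((az, el, (x, y, z)))
    return poses


if __name__ == "__main__":
    for az, el, (x, y, z) in camera_positions():
        print(f"az={az:3d} el={el:3d} -> ({x:+.3f}, {y:+.3f}, {z:+.3f})")
```

Note that the views alternate between looking down from 30 degrees and looking up from 20 degrees below the horizon while the azimuth advances in 60-degree steps.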

Paper

If you find Zero123++ helpful, please cite the paper:

@misc{shi2023zero123plus,
      title={Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model}, 
      author={Ruoxi Shi and Hansheng Chen and Zhuoyang Zhang and Minghua Liu and Chao Xu and Xinyue Wei and Linghao Chen and Chong Zeng and Hao Su},
      year={2023},
      eprint={2310.15110},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}