AI Summary
- Introduced BLIP3-o, a unified multimodal model for image understanding and generation.
- Found that using CLIP image features as the generation target, trained with a flow matching objective, is effective for image generation.
- Sequential training (image understanding first, then image generation) yielded the best results.
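
To make the flow matching idea in the summary concrete, here is a minimal sketch of a single-sample flow matching loss. It assumes one common straight-path (rectified-flow style) formulation; the summary does not say which variant BLIP3-o uses. The `feature` vector is a hypothetical stand-in for a CLIP image feature, and `oracle` and `zero_model` are illustrative toy predictors, not part of the model.

```python
import random

def interpolate(noise, target, t):
    """Point on the straight path from noise (t=0) to target (t=1)."""
    return [(1.0 - t) * n + t * x for n, x in zip(noise, target)]

def target_velocity(noise, target):
    """Velocity of the straight path: constant and equal to target - noise."""
    return [x - n for n, x in zip(noise, target)]

def flow_matching_loss(predict_velocity, target, rng):
    """Mean squared error between predicted and true velocity at a random t."""
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    t = rng.random()  # uniform in [0, 1)
    x_t = interpolate(noise, target, t)
    v_pred = predict_velocity(x_t, t)
    v_true = target_velocity(noise, target)
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_true)) / len(target)

# Hypothetical stand-in for a CLIP image feature (real features are far larger).
feature = [0.5, -1.0, 2.0, 0.0]

def oracle(x_t, t):
    # Toy predictor that already knows the target: (target - x_t) / (1 - t).
    return [(x - xt) / (1.0 - t) for x, xt in zip(feature, x_t)]

def zero_model(x_t, t):
    # Toy predictor that always outputs zero velocity.
    return [0.0 for _ in x_t]

# Same seed for both calls, so both models see the same noise and time step.
oracle_loss = flow_matching_loss(oracle, feature, random.Random(0))
zero_loss = flow_matching_loss(zero_model, feature, random.Random(0))
print(oracle_loss, zero_loss)
```

In practice the predictor would be a trained network and the loss would be averaged over many samples and time steps; the oracle only shows that the loss vanishes when the predicted velocity matches the straight-path velocity.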