A generative AI approach to cost-effective advertisement based on synthetic images
Advertising relies heavily on compelling visuals to engage audiences across sectors. Recent advances in AI-driven text-to-image generation, particularly diffusion models such as Stable Diffusion, offer novel opportunities for hyper-personalized and context-aware advertising content. However, challenges remain in precise control over image composition, segmentation robustness, and semantic consistency. In this work, we enhance the state-of-the-art Anywhere-Multi-Agent framework by replacing the original RMBG segmentation module with the Segment Anything Model (SAM), integrated via an interactive web interface that enables user-guided mask refinement. We further improve generation fidelity through prompt engineering with language models and explore multiple ControlNet conditioning strategies, including Canny edge, depth, and combined modalities. Our experiments demonstrate significant gains in segmentation accuracy, object placement, and background coherence, facilitating flexible and precise image composition suitable for real-world advertising workflows. These modular improvements pave the way for scalable, controllable generative pipelines that better align AI outputs with user intent.
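As a rough illustration of the two building blocks the abstract names, the sketch below pairs point-prompted SAM segmentation (standing in for a click forwarded by the interactive web interface) with a multi-ControlNet Stable Diffusion pipeline combining Canny and depth conditioning. This is not the authors' released code: the checkpoint names, file paths, click coordinates, prompt, and conditioning weights are illustrative assumptions, and the depth map is assumed precomputed.

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from segment_anything import SamPredictor, sam_model_registry

# --- 1. User-guided product segmentation with SAM (replaces RMBG) ----------
product = np.array(Image.open("product.jpg").convert("RGB"))  # assumed input image
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")  # assumed checkpoint
predictor = SamPredictor(sam)
predictor.set_image(product)

# One foreground click, as a web UI would forward it (label 1 = foreground).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # hypothetical click position
    point_labels=np.array([1]),
    multimask_output=True,
)
mask = masks[int(scores.argmax())]  # keep the highest-scoring candidate mask
# In a full pipeline, `mask` would be used to paste the untouched product
# onto the generated background after diffusion.

# --- 2. Background generation conditioned on Canny + depth -----------------
gray = cv2.cvtColor(product, cv2.COLOR_RGB2GRAY)
canny = cv2.Canny(gray, 100, 200)
canny_image = Image.fromarray(np.stack([canny] * 3, axis=-1))
depth_image = Image.open("depth.png").convert("RGB")  # assumed precomputed depth map

controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16),
]
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnets, torch_dtype=torch.float16
).to("cuda")

result = pipe(
    "a perfume bottle on a marble table, soft studio lighting, advertisement photo",
    image=[canny_image, depth_image],
    controlnet_conditioning_scale=[0.5, 0.5],  # relative weight per conditioning modality
    num_inference_steps=30,
).images[0]
result.save("composited_background.png")
```

Passing a list of ControlNets to the pipeline is the standard diffusers route to combined conditioning; setting one scale to zero recovers the single-modality (Canny-only or depth-only) variants compared in the experiments.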
École Polytechnique Fédérale de Lausanne
2025-09-16
Proceedings; 13605
54
REVIEWED
EPFL
| Event name | Event acronym | Event place | Event date |
| --- | --- | --- | --- |
| | | San Diego, United States | 2025-08-03 - 2025-08-08 |