Composite Relationship Fields with Transformers for Scene Graph Generation

Adaimi, George; Mizrahi, David; Alahi, Alexandre

doi:10.1109/WACV56688.2023.00014

conference paper

2023

2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2023)

Scene graph generation (SGG) methods extract relationships between objects. While most methods focus on improving top-down approaches, which build a scene graph based on detected objects from an off-the-shelf object detector, there is a limited amount of work on bottom-up approaches, which jointly detect objects and their relationships in a single stage. In this work, we present a novel bottom-up SGG approach by representing relationships using Composite Relationship Fields (CoRF). CoRF turns relationship detection into a dense regression and classification task, where each cell of the output feature map identifies surrounding objects and their relationships. Furthermore, we propose a refinement head that leverages Transformers for global scene reasoning, resulting in more meaningful relationship predictions. By combining both contributions, our method outperforms previous bottom-up methods on the Visual Genome dataset by 26% while preserving real-time performance.
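
To make the idea of a dense relationship field more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical output layout that the abstract does not specify: a per-predicate confidence map `conf[p][y][x]`, and two per-cell offset maps `subj_off[y][x]` and `obj_off[y][x]` that point from the cell center to the subject and object centers in pixels. The function names, the `stride` and `thresh` parameters, and the nearest-center matching rule are all assumptions made for illustration.

```python
import math


def nearest(point, centers):
    """Return the index of the detected object center closest to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def decode_field(conf, subj_off, obj_off, centers, stride=16, thresh=0.5):
    """Decode a dense relationship field into (subject, predicate, object, score).

    Assumed (hypothetical) layout:
      conf[p][y][x]   confidence that cell (y, x) carries predicate p
      subj_off[y][x]  (dx, dy) from the cell center to the subject center
      obj_off[y][x]   (dx, dy) from the cell center to the object center
      centers         list of (x, y) centers of detected objects
    """
    triplets = []
    for p, plane in enumerate(conf):
        for y, row in enumerate(plane):
            for x, score in enumerate(row):
                if score <= thresh:
                    continue  # this cell does not predict predicate p
                # Center of the cell in image coordinates.
                cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
                sdx, sdy = subj_off[y][x]
                odx, ody = obj_off[y][x]
                # Follow the regressed offsets, then snap to the nearest object.
                s = nearest((cx + sdx, cy + sdy), centers)
                o = nearest((cx + odx, cy + ody), centers)
                if s != o:  # skip self-relations
                    triplets.append((s, p, o, score))
    return triplets
```

In this sketch each confident cell votes for one subject-predicate-object triplet, which mirrors the abstract's description of cells that identify surrounding objects and their relationships. The Transformer-based refinement head mentioned above is not modeled here.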

Name

0455.pdf

Type

Postprint

Version

Accepted version

Access type

open access

License Condition

copyright

Size

5.05 MB

Format

Adobe PDF

Checksum (MD5)

bf0fd16ed8310b3c9417781d6faaea81