Visualizing Promptable and Open-Vocabulary Segmentation Across Multiple Datasets

Tags: framework
DATE POSTED: November 13, 2024

:::info Authors:

(1) Zhaoqing Wang, The University of Sydney and AI2Robotics;

(2) Xiaobo Xia, The University of Sydney;

(3) Ziye Chen, The University of Melbourne;

(4) Xiao He, AI2Robotics;

(5) Yandong Guo, AI2Robotics;

(6) Mingming Gong, The University of Melbourne and Mohamed bin Zayed University of Artificial Intelligence;

(7) Tongliang Liu, The University of Sydney.

:::

Table of Links

Abstract and 1. Introduction

2. Related works

3. Method and 3.1. Problem definition

3.2. Baseline and 3.3. Uni-OVSeg framework

4. Experiments

4.1. Implementation details

4.2. Main results

4.3. Ablation study

5. Conclusion

6. Broader impacts and References

A. Framework details

B. Promptable segmentation

C. Visualisation

C. Visualisation

We present a wide range of visualisations of promptable segmentation and open-vocabulary segmentation across multiple datasets.

Figure 7. Box-promptable segmentation performance. We compare our method with SAM-ViT/L [34] on a wide range of datasets. Given a ground-truth box as the visual prompt, we compute the IoU of each output mask with the ground-truth mask and select the output mask with the highest IoU. We report 1-pt IoU for all datasets.
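The selection step described in the caption can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the names `mask_iou` and `select_max_iou` are hypothetical, masks are assumed to be same-sized 2D binary grids, and the IoU of two empty masks is assumed to be 0 by convention.

```python
from typing import List, Sequence, Tuple

# Assumption: a mask is a 2D grid of 0/1 values, 1 marking foreground pixels.
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, gt: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumption: two empty masks get IoU 0 rather than raising an error.
    return inter / union if union else 0.0


def select_max_iou(candidates: Sequence[Mask], gt: Mask) -> Tuple[int, float]:
    """Return (index, IoU) of the candidate that best matches the ground truth."""
    if not candidates:
        raise ValueError("at least one candidate mask is required")
    ious: List[float] = [mask_iou(c, gt) for c in candidates]
    best = max(range(len(ious)), key=ious.__getitem__)
    return best, ious[best]


# Toy example: three candidate masks for one prompt on a 2x2 image.
gt_mask = [[1, 1], [0, 0]]
candidate_masks = [
    [[1, 0], [0, 0]],  # covers half of the object
    [[1, 1], [1, 0]],  # covers the object plus one extra pixel
    [[0, 0], [1, 1]],  # misses the object entirely
]
best_index, best_iou = select_max_iou(candidate_masks, gt_mask)
```

Because the best mask is chosen using the ground truth, a score computed this way measures the quality of the best available output rather than the model's own ranking of its outputs.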

Figure 8. Point-promptable segmentation performance. We compare our method with SAM-ViT/L [34] on the SegInW datasets [87]. Given a 20 × 20 point grid as the visual prompt, we compute the IoU of each output mask with the ground-truth mask and select the output mask with the highest IoU. We report 1-pt IoU for all datasets.
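For illustration, a 20 × 20 point grid could be generated as below. The caption does not say how the points are placed, so this sketch assumes evenly spaced points located at the centres of a 20 × 20 partition of the image; `build_point_grid` and the 640 × 480 image size are hypothetical.

```python
from typing import List, Tuple


def build_point_grid(n_per_side: int, width: int, height: int) -> List[Tuple[float, float]]:
    """Return n_per_side**2 (x, y) pixel coordinates, ordered row by row.

    Assumption: each point sits at the centre of one cell of an
    n_per_side x n_per_side partition of the image.
    """
    if n_per_side <= 0:
        raise ValueError("n_per_side must be positive")
    offset = 1.0 / (2 * n_per_side)  # half a cell, in normalised units
    coords = [offset + i / n_per_side for i in range(n_per_side)]
    return [(x * width, y * height) for y in coords for x in coords]


# 400 point prompts for a hypothetical 640x480 image.
grid = build_point_grid(20, width=640, height=480)
```

Each point can then be passed to the model as a prompt, and the resulting masks scored against the ground truth by IoU.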

Figure 9. Box-promptable segmentation performance. We compare our method with SAM-ViT/L [34] on the SegInW datasets [87]. Given a ground-truth box as the visual prompt, we compute the IoU of each output mask with the ground-truth mask and select the output mask with the highest IoU. We report 1-pt IoU for all datasets.

Table 5. Segmentation datasets used to evaluate promptable segmentation with point and box prompts. The 11 datasets cover a broad range of domains, as indicated under “image type”.

Figure 10. Visualisation of open-vocabulary segmentation, comparing the baseline with our Uni-OVSeg.

Figure 11. Visualisation of open-vocabulary segmentation, comparing the baseline with our Uni-OVSeg.

Figure 12. Visualisation of open-vocabulary segmentation, comparing the baseline with our Uni-OVSeg.

Figure 13. Visualisation of promptable segmentation, comparing SAM-ViT/L with our Uni-OVSeg.

Figure 14. Visualisation of promptable segmentation, comparing SAM-ViT/L with our Uni-OVSeg.

Figure 15. Visualisation of promptable segmentation, comparing SAM-ViT/L with our Uni-OVSeg.

Figure 16. Visualisation of promptable segmentation, comparing SAM-ViT/L with our Uni-OVSeg.

Figure 17. Visualisation of promptable segmentation, comparing SAM-ViT/L with our Uni-OVSeg.


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::
