Transforming the Interactive Segmentation forMedical Imaging
	
	
	
	  
            
                    | 
              
                    1CMIC, Shanghai Jiao Tong University
                
                 | 
                
                  
                        2Shanghai AI Laboratory
                    
                     | 
        
	
	
	
      
       Abstract 
 
      
      
	      The goal of this paper is to interactively refine the automaticsegmentation on challenging structures that fall behind human perfor-mance, either due to the scarcity of available annotations or the difficultynature of the problem itself, for example, on segmenting cancer or smallorgans. Specifically, we propose a novelTransformer-based architectureforInteractiveSegmentation (TIS), that treats the refinement task as aprocedure for grouping pixels with similar feature representation to thoseclicks given by the end users. Our proposed architecture is composedof two variants of Transformer Decoder, which naturally fulfills featurecomparison with the attention mechanisms. In contrast to existing ap-proaches, our proposed TIS is not limited to binary segmentations, andallows the user to edit the masks of arbitrary number of categories. Tovalidate the proposed approach, we conduct extensive experiments onthree challenging datasets and demonstrate superior performance overthe existing state-of-the-art methods.
      
	
      
        Method 
 
      
        Structure of our method. (a) Overview of the whole framework. (b) Structureof click encoding module. (d) Structure of label assignment module. 
	
        
      
      
      
      Results
      R1: Compared with baselines  
      
	      Performances of different methods on three datasets with 10 clicks.
      
      
      
	      Improvements in one interactive sequence of different methods on three datasets(four structures).
      
      
      
      
      R2: Visualization  
	
      
	      Visualization of qualitative comparison on three datasets (four structures).
      
      
      
      
      
        Acknowledgements 
 
       
	      Based on a template by Phillip Isola and Richard Zhang.