At train time, we embed each pixel of the ground truth image SI as the mean of predefined guide functions f over the pixels of the instance it belongs to, resulting in embeddings e(S, Ψ). We then train the ...
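The per-instance averaging described above can be illustrated with a minimal NumPy sketch. Everything named here is an assumption made for illustration: the function name `instance_mean_embedding` is hypothetical, the guide functions are taken to be plain pixel coordinates (the source does not say which functions f are used), and the role of Ψ is not modeled.

```python
import numpy as np

def instance_mean_embedding(labels, guides):
    """Embed each pixel as the mean guide vector of its instance.

    labels: (H, W) integer instance ids (ground truth segmentation).
    guides: (H, W, K) values of K guide functions at each pixel (assumed input).
    Returns an (H, W, K) array of embeddings.
    """
    h, w, k = guides.shape
    flat_labels = labels.reshape(-1)
    flat_guides = guides.reshape(-1, k)

    # Map every pixel to a compact instance index 0..n-1.
    ids, inverse = np.unique(flat_labels, return_inverse=True)
    inverse = inverse.reshape(-1)

    # Sum guide values per instance, then divide by the instance pixel count.
    sums = np.zeros((len(ids), k))
    np.add.at(sums, inverse, flat_guides)
    counts = np.bincount(inverse, minlength=len(ids))
    means = sums / counts[:, None]

    # Broadcast each instance mean back to all of its pixels.
    return means[inverse].reshape(h, w, k)

# Toy example: two instances on a 2x3 grid; guide functions are (row, col)
# coordinates, chosen here only as a stand-in for the unspecified f.
labels = np.array([[0, 0, 1],
                   [0, 1, 1]])
rows, cols = np.meshgrid(np.arange(2), np.arange(3), indexing="ij")
guides = np.stack([rows, cols], axis=-1).astype(float)
emb = instance_mean_embedding(labels, guides)
```

With coordinate guides, every pixel of an instance receives that instance's centroid, so the embedding is constant within an instance and differs between instances.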