Important Note: This repository implements SVG-T2I, a text-to-image diffusion framework that performs visual generation directly in Visual Foundation Model (VFM) representation space, rather than ...
Abstract: Recent advancements in sensor technologies, including camera-based systems integrated with computer vision and deep learning, have significantly transformed Advanced Driving Assistance ...
2023.06.2 Update the pre-trained and fine-tuned Chinese scene text spotting model (78.3% 1-NED on ICDAR 2019 ReCTS). 2023.05.31 The extension paper (DeepSolo++) has been submitted to arXiv. The code and ...
Abstract: Although text-guided infrared-visible image fusion helps improve content understanding under extreme illumination, existing methods usually ignore semantic differences between textual and ...