Text Encoder and Decoder

SVG-T2I: Scaling up Text-to-Image Latent Diffusion Model

Important Note: This repository implements SVG-T2I, a text-to-image diffusion framework that performs visual generation directly in Visual Foundation Model (VFM) representation space, rather than ...

IEEE

VITA: Revolutionizing Traffic Analysis for Autonomous Vehicles with CVT and Encoder-Decoder Integration

Abstract: Recent advancements in sensor technologies, including camera-based systems integrated with computer vision and deep learning, have significantly transformed Advanced Driving Assistance ...

GitHub

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

2023.06.2: Updated the pre-trained and fine-tuned Chinese scene text spotting model (78.3% 1-NED on ICDAR 2019 ReCTS). 2023.05.31: The extension paper (DeepSolo++) was submitted to arXiv. The code and ...

IEEE

Text-Guided Semantic Alignment Network With Spatial-Frequency Interaction for Infrared-Visible Image Fusion Under Extreme Illumination

Abstract: Although text-guided infrared-visible image fusion helps improve content understanding under extreme illumination, existing methods usually ignore semantic differences between textual and ...
