CASIA-IVA-Lab

DANet Public

Dual Attention Network for Scene Segmentation (CVPR2019)

Python 2.5k 484

VALOR Public

[TPAMI2024] Codes and Models for VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

Python 306 18

VAST Public

[NIPS2023] Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset

Jupyter Notebook 297 18

MRES Public

This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation", accepted by CVPR 2024.

72

ChatBridge Public

ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without relying on all combinations of paired data.

Python 54 1

VideoNIAH Public

VideoNIAH: A Flexible Synthetic Method for Benchmarking Video MLLMs

Python 54 1
