SA-DNet is an end-to-end semantic-aware, on-demand registration-and-fusion network. Guided by semantic information, it restricts registration and fusion to the regions that match a text prompt, realizing registration and fusion on demand. Furthermore, the `sam`, `reg`, and `fuse` modules are decoupled from one another, so new functionality can be integrated into any of them. The architecture currently implemented in this project is as follows:
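As a rough illustration of the decoupled design described above, the three stages can be composed as below. This is a minimal sketch with hypothetical callables, not the repository's actual API:

```python
# Minimal sketch of the decoupled sam -> reg -> fuse pipeline.
# All callables here are hypothetical stand-ins, not SA-DNet's real API.

def sa_dnet_pipeline(ir, vi, prompt, sam, reg, fuse):
    """Run semantic-aware, on-demand registration and fusion.

    sam  : maps (ir, vi, prompt) -> region masks for both images
    reg  : registers ir to vi inside the masked region
    fuse : fuses the registered pair inside the masked region
    """
    ir_mask, vi_mask = sam(ir, vi, prompt)  # 1. perceive the semantic region
    ir_reg = reg(ir, vi, ir_mask, vi_mask)  # 2. register within that region
    return fuse(ir_reg, vi, vi_mask)        # 3. fuse within that region
```

Because the stages are decoupled, any of the three callables can be swapped for a new implementation without touching the other two.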
In this project, no safeguard has been implemented for the HOL operators in `reg`: when SAM perceives regions that are too large, HOL incurs excessive computation, which can cause the system to freeze.
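Until such a safeguard exists, callers can guard against oversized regions themselves. The sketch below is illustrative only; the function and the 25% threshold are not part of this project or the paper:

```python
import numpy as np

# Illustrative guard against oversized SAM regions; the 25% threshold is an
# arbitrary example value, not one taken from this project or the paper.
MAX_AREA_RATIO = 0.25

def region_too_large(mask: np.ndarray, max_ratio: float = MAX_AREA_RATIO) -> bool:
    """Return True if the binary mask covers more than max_ratio of the image."""
    return float(np.count_nonzero(mask)) / mask.size > max_ratio
```

A caller could then skip HOL-based registration, or downsample the masked region first, whenever `region_too_large(mask)` is true.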
```shell
git clone git@github.com:Meng-Sang/SA-DNet.git
cd SA-DNet
```

### Step 1: Configure the Python environment

This project requires Python 3.8 or higher. It also depends on the libraries listed in `requirements.txt` and on two GitHub repositories (Grounding DINO and Segment Anything). Configure the environment as follows:

```shell
conda create -n sa-dnet python=3.8
conda activate sa-dnet
pip install -r requirements.txt
```

### Step 2: Configure Grounding DINO
Install Grounding DINO:

For installation instructions, please refer to the [Grounding DINO](https://github.com/IDEA-Research/GroundingDINO) repository.
Download the Grounding DINO weights:

```shell
# If model/sam/grounding_dino/weight does not exist,
# first run: mkdir -p model/sam/grounding_dino/weight
# then:
cd model/sam/grounding_dino/weight
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth
```

If the network is unavailable, the following error may occur (it likely means the Hugging Face tokenizer that Grounding DINO loads at startup could not be downloaded):

```
OSError: Can't load tokenizer
```

### Step 3: Configure Segment Anything
Install Segment Anything:

```shell
git clone git@github.com:facebookresearch/segment-anything.git
cd segment-anything
pip install -e .
```

or:

```shell
pip install git+https://github.com/facebookresearch/segment-anything.git
```

Download the Segment Anything weights:

```shell
# If model/sam/segment_anything/weight does not exist,
# first run: mkdir -p model/sam/segment_anything/weight
# then:
cd model/sam/segment_anything/weight
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
```

### Usage

```python
import cv2

from utils.predict import sa_dnet

# Read the infrared and visible image pair
ir_image = cv2.imread("assets/ir/006393.jpg")
vi_image = cv2.imread("assets/vi/006393.jpg")
# Register and fuse only the regions matching the text prompt "car"
_, fuse_image = sa_dnet(ir_image, vi_image, "car", is_mask=True)
cv2.imwrite("assets/demo.jpg", fuse_image)
```

This project depends on the following repositories:

- [Grounding DINO](https://github.com/IDEA-Research/GroundingDINO)
- [Segment Anything](https://github.com/facebookresearch/segment-anything)
If you find our work helpful for your research, please consider citing the following BibTeX entry:

```bibtex
@article{sang2025end,
  title={An end-to-end semantic-guided infrared and visible registration-fusion network for advanced visual tasks},
  author={Sang, Meng and Xie, Housheng and Meng, Jingrui and Zhang, Yukuan and Qiu, Junhui and Zhao, Shan and Yang, Yang},
  journal={Engineering Applications of Artificial Intelligence},
  volume={149},
  pages={110489},
  year={2025},
  publisher={Elsevier}
}
```