Finlay GC Hudson, William AP Smith
University of York
This repository contains the implementation of Iterative Cluster-free Re-ranking (ICFRR) for refining image retrieval results. While it does not include model training code, it supports running ICFRR using either our pretrained models or custom user models.
The approach involves a query image from domain A and a gallery of images from domain B. These images are processed through a feature extractor, producing an initial ranking based on the similarity between the query and each gallery image. However, direct visual features alone may not always establish meaningful connections (as shown in Fig. 1 of our paper); a semantic link is often required. This is where ICFRR comes into play, enabling more effective re-ranking.
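For intuition, here is a minimal sketch of that initial ranking step, assuming query and gallery features have already been extracted; the tensor names and sizes are illustrative only:

import torch

# Hypothetical pre-extracted features: 5 domain A queries, 100 domain B gallery images.
query_feats = torch.randn(5, 512)
gallery_feats = torch.randn(100, 512)

# Euclidean distance between every query and every gallery image.
qg_dists = torch.cdist(query_feats, gallery_feats)  # shape [5, 100]

# Initial ranking: gallery indices sorted by ascending distance, per query.
initial_ranking = torch.argsort(qg_dists, dim=1)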
Clone this repository
Create a virtual environment in your preferred format; these instructions use venv:
python3 -m venv icfrr_venv
source icfrr_venv/bin/activate
pip install -r requirements.txt
If using your own model, install any additional dependencies. If using our pretrained model, install the required package:
pip install timm==0.6.11
Datasets can be downloaded from this Google Drive link. The original datasets are licensed under CC BY 4.0, and the original dataset websites are:
- TU-Berlin
- Sketchy: we could not locate the original dataset page, so we have linked to the Original Paper and Extended Paper
Once downloaded, please extract the zip files into your chosen data_root.
We provide the pretrained models described in our paper.
They can be downloaded from
this Google Drive link
Once downloaded, their root dir should be set as pretrained_root (mentioned in arguments below).
We provide different running options according to use case.
Shared arguments between all running files:
--batchsize # batchsize to use
--gpu_id # gpu id to use
--per_class # number of images per class (-1 means all)
--dataset # dataset to use
--data_root # Parent directory of all datasets
--debug # enable debug mode
--cache_root # Parent directory in which to store caches
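For example, an illustrative invocation combining these shared arguments (all paths and values here are placeholders):

PYTHONPATH=. python src/rerank.py --dataset tuberlin --data_root /path/to/datasets --cache_root /path/to/caches --gpu_id 0 --batchsize 128 --per_class -1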
We provide query-gallery (qg) and gallery-gallery (gg) distances, as well as some ground-truth class labels, in examples to showcase the running of our rerank.py file.
PYTHONPATH=. python src/rerank.py --dataset custom --cache_root examples --data_root examples --n_times 20 --KG 10 --KQ 15 --rerank_results_dir examples --query_idxs_to_vis 0
Run: PYTHONPATH=. python src/run_through_model.py
Args:
--pretrained_root # Parent directory of all pretrained model checkpoints
--no_cache # disable caching of data
--force_cache # whether to force recalculation and caching of data
You need to generate the features in our cacher format, with these components:
- qg_dists (torch.Tensor): of shape [len(query), len(gallery)]. Created with torch.cdist(features["query"], features["gallery"]). The Euclidean distance between each query and all gallery images.
- gg_dists (torch.Tensor): of shape [len(gallery), len(gallery)]. Created with torch.cdist(features["gallery"], features["gallery"]). The Euclidean distance between each gallery image and all gallery images.
- OPTIONAL: labels (dict): dict with keys query and gallery, each containing the integer class labels. Can be None if no ground truth is available.
from src.icfrr.cache_utils import Cacher
cacher = Cacher(<OUT_DIR_PATH>)
cacher.write_caches(qg_dists, gg_dists, labels)
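For reference, a minimal end-to-end sketch of producing and writing these caches, assuming features["query"] and features["gallery"] come from your own model (the random tensors below are placeholders for real features, and the output directory name is hypothetical):

import torch
from src.icfrr.cache_utils import Cacher

# Placeholder features standing in for your own model's output.
features = {"query": torch.randn(5, 512), "gallery": torch.randn(100, 512)}

# Distances in the cacher format described above.
qg_dists = torch.cdist(features["query"], features["gallery"])    # [len(query), len(gallery)]
gg_dists = torch.cdist(features["gallery"], features["gallery"])  # [len(gallery), len(gallery)]

# Optional integer class labels; pass None if you have no ground truth.
labels = {"query": torch.randint(0, 10, (5,)), "gallery": torch.randint(0, 10, (100,))}

cacher = Cacher("icfrr_caches")  # hypothetical output directory
cacher.write_caches(qg_dists, gg_dists, labels)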
Run: PYTHONPATH=. python src/rerank.py
Args:
--KG # limit on how many domain B to domain B (gallery-gallery) matches we count as strong
--KQ # how many domain A to domain B (query-gallery) matches we count as strong
--beta # factor controlling the strength of the re-ranking effect
--n_times # how many times to run the reranking
--limited_memory # If you have limited memory, try enabling this to chunk up some operations
--cpu_as_metric_device # calculate metrics on the CPU; for huge datasets CPU memory is often larger than GPU memory, so this should be used
# Results permuting and visualisation
--rerank_results_dir # Directory to store reranking results
--query_idxs_to_vis # Indices of the queries to visualise after re-ranking; if left blank, no visualisation occurs
--num_gallery_ims_to_vis # If visualising with query_idxs_to_vis, how many of the top query-gallery results to show
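To make the roles of --KQ, --KG, --beta, and --n_times concrete, below is a heavily simplified toy sketch of iterative neighbourhood-based re-ranking. It is written for intuition only and is not the ICFRR implementation from the paper; all function and variable names are hypothetical:

import torch

def toy_rerank(qg_dists, gg_dists, KQ=15, KG=10, beta=0.5, n_times=20):
    # A toy neighbourhood-propagation loop for intuition only; NOT the
    # ICFRR algorithm from the paper.
    qg_dists = qg_dists.clone()
    # Strong gallery-gallery matches: the KG nearest gallery neighbours of each gallery image.
    gg_nn = torch.argsort(gg_dists, dim=1)[:, :KG]
    for _ in range(n_times):
        # Strong query-gallery matches: the KQ nearest gallery images per query.
        qg_nn = torch.argsort(qg_dists, dim=1)[:, :KQ]
        new_dists = qg_dists.clone()
        for q in range(qg_dists.shape[0]):
            for g in qg_nn[q].tolist():
                # Gallery images that are strong neighbours of a strong match
                # inherit a blended distance, with beta controlling the effect.
                propagated = qg_dists[q, g] + gg_dists[g, gg_nn[g]]
                new_dists[q, gg_nn[g]] = torch.minimum(
                    new_dists[q, gg_nn[g]],
                    (1 - beta) * qg_dists[q, gg_nn[g]] + beta * propagated,
                )
        qg_dists = new_dists
    return qg_dists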
PYTHONPATH=. python src/run_through_model.py --dataset tuberlin
Note: check the args listed above to change data directories, cache directories, etc.
PYTHONPATH=. python src/rerank.py --dataset tuberlin --limited_memory --KG 512 --KQ 512
PYTHONPATH=. python src/run_through_model.py --dataset sketchy_zs2
Note: check the args listed above to change data directories, cache directories, etc.
PYTHONPATH=. python src/rerank.py --dataset sketchy_zs2 --limited_memory --KG 125 --KQ 100

