Currently, checking for duplicate images is done in QueueEventExtractor.process_label (quite late in the pipeline).
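If the input data contains duplicate frames (e.g. a file that has already been analyzed), checking for duplicate images earlier in the pipeline would improve performance, because the intermediate steps would no longer run on frames that end up being discarded.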
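A minimal sketch of what an earlier check could look like, assuming frames are available as raw bytes near the start of the pipeline. The names `frame_digest` and `skip_duplicate_frames` are hypothetical and not part of the existing code. The sketch also assumes that "duplicate" means byte-identical, which may differ from the comparison currently done in QueueEventExtractor.process_label.

```python
import hashlib
from typing import Iterable, Iterator, Optional, Set


def frame_digest(frame_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the raw frame bytes (hypothetical helper)."""
    return hashlib.sha256(frame_bytes).hexdigest()


def skip_duplicate_frames(
    frames: Iterable[bytes],
    seen: Optional[Set[str]] = None,
) -> Iterator[bytes]:
    """Yield only frames whose digest has not been seen before.

    `seen` can be pre-populated with digests of frames from earlier runs,
    so that a file which was already analyzed is skipped entirely.
    """
    if seen is None:
        seen = set()
    for frame in frames:
        digest = frame_digest(frame)
        if digest in seen:
            # Duplicate: drop it before any later pipeline stage runs.
            continue
        seen.add(digest)
        yield frame
```

For example, `list(skip_duplicate_frames([b"a", b"b", b"a"]))` keeps only the first two frames. Exact hashing only catches byte-identical frames; if the existing check tolerates small differences (e.g. re-encoded images), a different comparison would be needed here.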