@Chandraveersingh1717
Summary

Split coco_detection_dataset() into two specialized datasets for object detection and instance segmentation tasks.

Problem

  • The current implementation keeps a 500MB+ annotation object in memory for the entire dataset lifetime
  • The documentation is confusing, mixing detection and segmentation use cases
  • The two tasks never run together (they use different model architectures)
  • Cache organization is poor, with large files that are hard to identify

Solution

coco_detection_dataset() - Object Detection Only

  • Returns: boxes, labels, area, iscrowd
  • Memory: ~250MB (50% reduction)
  • Use: Faster R-CNN, YOLO, SSD

coco_segmentation_dataset() - Instance Segmentation (NEW)

  • Returns: boxes, labels, area, iscrowd, segmentation, masks
  • Memory: ~250MB
  • Use: Mask R-CNN, DeepLab

Cache Organization: Files now stored in /coco subdirectory for better identification
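To illustrate why the split saves memory (this is a hypothetical Python sketch, not the package's actual code): segmentation polygons dominate the size of COCO-style annotation records, so a detection-only dataset can simply drop that field.

```python
def strip_segmentation(annotations):
    """Return detection-only copies of COCO-style annotation dicts,
    dropping the large 'segmentation' field. Hypothetical helper for
    illustration; field names follow the COCO annotation format."""
    keep = ("bbox", "category_id", "area", "iscrowd")
    return [{k: ann[k] for k in keep if k in ann} for ann in annotations]

anns = [
    {"bbox": [10, 20, 30, 40], "category_id": 1, "area": 1200,
     "iscrowd": 0, "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]]},
]
detection_only = strip_segmentation(anns)
assert "segmentation" not in detection_only[0]
assert detection_only[0]["bbox"] == [10, 20, 30, 40]
```

The segmentation dataset keeps the full records, which is why both variants end up at roughly half the original footprint each.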

Breaking Change

Segmentation users must migrate:

```r
# Before
coco_detection_dataset(..., target_transform = target_transform_coco_masks)

# After
coco_segmentation_dataset(..., target_transform = target_transform_coco_masks)
```
