Open
Description
When I converted the image data from tfrecord format to jpg format, I found that each jpg file is actually 4 square images concatenated together. The FileBasedDataset does nothing about that, and I don't see the FSNSLocalizationNet doing separate localization for these 4 images. How should I understand this? The only place I found that deals with the 4 images is this snippet:
if self.uses_original_data:
    # handle each individual view as increase in batch size
    batch_size, num_channels, height, width = images.shape
    images = F.reshape(images, (batch_size, num_channels, height, 4, -1))
    images = F.transpose(images, (0, 3, 1, 2, 4))
    images = F.reshape(images, (batch_size * 4, num_channels, height, width // 4))
Does this treat the 4 different images as an additional dimension for the localization?
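To check what the snippet does to the shapes, here is a minimal NumPy sketch of the same reshape, transpose, reshape sequence. It assumes that `F.reshape` and `F.transpose` follow NumPy semantics for shapes and element order, and that the 4 views sit side by side along the width axis. The function name `split_views` and the toy batch are made up for illustration and are not part of the repository.

```python
import numpy as np


def split_views(images, num_views=4):
    """Fold side-by-side views into the batch axis (hypothetical helper)."""
    batch_size, num_channels, height, width = images.shape
    # Split the width axis into (view index, width of one view).
    x = images.reshape(batch_size, num_channels, height, num_views, -1)
    # Move the view axis next to the batch axis: (B, V, C, H, W / V).
    x = x.transpose(0, 3, 1, 2, 4)
    # Merge batch and view axes, so every view becomes its own batch entry.
    return x.reshape(batch_size * num_views, num_channels, height, width // num_views)


# Toy batch: 2 images, 3 channels, height 5, width 8 (4 views of width 2).
# Every pixel of view v is filled with the value v so the views are easy to track.
batch = np.zeros((2, 3, 5, 8), dtype=np.float32)
for v in range(4):
    batch[:, :, :, v * 2:(v + 1) * 2] = v

views = split_views(batch)
print(views.shape)  # (8, 3, 5, 2)
```

Under these assumptions, entry `b * 4 + v` of the result is view `v` of image `b`, so the 4 views are not kept as a separate dimension. They become extra entries in the batch, and whatever network runs afterwards processes each view as an independent image.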