The goal of this project is to propose a method to automatically detect and classify standard planes (SPs) in liver ultrasound (US) videos. The operator - commonly a nurse - should detect the most informative images within each US video, to later provide them to physicians for diagnostic purposes. Such frames are known as standard planes and are identified by the presence of specific anatomical structures within the image. Given the nature of this imaging technique (highly noisy and dependent on device settings and on the manual skills of the operator) and the resulting difficulty of recognising anatomical structures (often not clearly visible even to expert physicians), the standard plane detection task is non-trivial and strongly operator-dependent. Nonetheless, one aspect that seems to aid expert users is the temporal evolution of the data within the performed motion scan (combined with some prior background knowledge of human anatomy). Our aim is hence to develop a deep learning pipeline for the automatic classification of SPs from single frames and from sequences of frames within US videos.
We start with a 2D approach based on a 2D CNN architecture named SonoNet [1], which achieved state-of-the-art results on the fetal US standard plane detection task. As a first approach to exploiting temporal information, we then propose a 3D CNN model that uses both spatial and temporal information on a short timescale. Specifically, we implemented a 3D extension of the SonoNet architecture. Extending convolutions to the third (temporal) dimension should help the network resolve ambiguous situations in which parts of the anatomical structures are not clearly visible (or partly occluded) within a single frame but appear in nearby frames. Based on [2], we also implemented a SonoNet(2+1)D model. It is a 3D version of SonoNet2D in which each 2D convolution layer is replaced with a SpatioTemporal block, consisting of a 2D convolution layer followed by a 1D convolution layer. In this way, we obtain a model that is comparable to SonoNet2D in terms of trainable parameters, but with twice as many non-linear operations as the 3D model, potentially leading to the best results.
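As a rough illustration of this decomposition, the following is a minimal PyTorch sketch of a (2+1)D SpatioTemporal block. The class name, the choice of intermediate channels, and the single ReLU after each step are assumptions made for illustration; the actual block used in this project is defined in sononet3d/models.py.

```python
import torch
import torch.nn as nn

class SpatioTemporalBlock(nn.Module):
    """Illustrative (2+1)D block: a 3D convolution decomposed into a
    2D (spatial) convolution followed by a 1D (temporal) convolution,
    with a non-linearity after each of the two steps."""

    def __init__(self, in_channels, out_channels, mid_channels=None):
        super().__init__()
        # Intermediate channel count is an assumption; in [2] it is chosen
        # so that the (2+1)D block matches the parameter count of the
        # corresponding full 3D convolution.
        mid_channels = mid_channels or out_channels
        # Spatial convolution: 1 x 3 x 3 kernel over (T, H, W) volumes.
        self.spatial = nn.Conv3d(in_channels, mid_channels,
                                 kernel_size=(1, 3, 3), padding=(0, 1, 1))
        # Temporal convolution: 3 x 1 x 1 kernel along the time axis.
        self.temporal = nn.Conv3d(mid_channels, out_channels,
                                  kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # x has shape (batch, channels, clip_len, height, width).
        x = self.relu(self.spatial(x))
        return self.relu(self.temporal(x))
```

Splitting the convolution in two is what yields the doubled number of non-linearities mentioned above: each (2+1)D block applies two activations where a single 3D convolution would apply one.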
This folder contains the main scripts for defining and training 2D-SonoNet architectures, as well as 3D-SonoNet and (2+1)D-SonoNet extensions. See the "usage" note at the beginning of each of them.
- sononet2d-traintest.py: trains and tests the 2D SonoNet-16/32/64 model.
- sononet2d-traintest_3d_comparable.py: trains and evaluates the 2D SonoNet-16/32/64 model using the same dataset as the 3D models for direct comparison.
- sononet3d-traintest.py: trains and tests the 3D SonoNet-16/32/64 model.
- temporal_test.py: loads a test video and visualises the predictions of different models for temporal comparison.
- 2d_vs_3d.py: computes per-video accuracy on the test set for both 2D and 3D models, and reports the average accuracy.
These scripts use code from the following Python packages:
utils: This folder contains Python files with many general-purpose utility functions.
- augments.py: defines data augmentation methods for US images.
- datareader.py: defines a class for loading either the 2D or the 3D version of our dataset.
- datasplit.py: defines functions for splitting the dataset into training and validation sets. Note that the splitting logic is tailored to our specific scenario, so using your own custom split method is recommended.
- iterators.py: defines basic training and testing loops for a single epoch.
- runner.py: defines train and test functions.
- visualize.py: defines a useful function for plotting a confusion matrix and saving it as a PNG image.
_IMPORTANT:_ change the methods split3d_train_validation and split2d_train_validation according to your data and needs.
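As a reference for writing your own split, the sketch below shows one possible video-level strategy. The function name and arguments are hypothetical; the actual logic lives in utils/datasplit.py and is tailored to our specific scenario.

```python
import random

def split_train_validation(video_names, val_fraction=0.2, seed=21):
    """Hypothetical video-level split: whole videos are assigned to either
    the training or the validation set, so frames of the same video never
    end up in both sets."""
    names = sorted(video_names)
    random.Random(seed).shuffle(names)
    n_val = max(1, int(len(names) * val_fraction))
    val_videos = set(names[:n_val])
    train_videos = set(names[n_val:])
    return train_videos, val_videos
```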
sononet2d: This folder contains the 2D implementation of the SonoNet-16/32/64 model.
- models.py: defines the SonoNet2D class. The number of features in the hidden layers of the network can be set by choosing among 3 configurations (16, 32, and 64). The network can be used in "classification" mode (the adaptation layer provides the output) or in "feature extraction" mode (no adaptation layer is defined, and the output is the set of features from the last convolutional layer). The latter is obtained by setting the features_only parameter to True (useful to check on which image parts the network is focusing its attention). Finally, by setting the train_classifier_only parameter to True, it is possible to freeze learning in all convolutional layers (only the adaptation layer will be trained). A hedged usage sketch is given after this list.
- remap-weights.py: converts SonoNet weights (downloaded from the reference repository) to make them compatible with our implementation of the model.
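The snippet below is a minimal usage sketch of the modes described above. The exact constructor signature and the expected input size are assumptions; check models.py for the actual interface.

```python
import torch
from sononet2d.models import SonoNet2D  # import path is an assumption

# Classification mode: the adaptation layer produces the class scores.
model = SonoNet2D(num_features=32)           # constructor arguments are assumptions
x = torch.randn(1, 1, 224, 288)              # (batch, channels, height, width); size is illustrative
logits = model(x)

# Feature-extraction mode: no adaptation layer, output of the last conv block.
feat_model = SonoNet2D(num_features=32, features_only=True)
features = feat_model(x)

# Train only the adaptation layer, freezing all convolutional layers.
cls_only = SonoNet2D(num_features=32, train_classifier_only=True)
```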
sononet3d: This folder contains the 3D and (2+1)D extensions of the standard SonoNet-16/32/64 model implementation.
- models.py: defines the SonoNet3D and SonoNet(2+1)D classes. In SonoNet3D, all 2D convolutional and pooling layers are changed to their 3D extensions. In the (2+1)D model, instead, the 3D convolutional layers are replaced by a SpatioTemporal block, where the standard convolution is decomposed into a 2D convolution followed by a 1D convolution. As in the 2D case, the number of features in the hidden layers of the network can be set by choosing among 3 configurations (16, 32, and 64).
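For orientation, this is a hedged sketch of how a 3D model might be instantiated and what input shape it consumes. The import path and constructor arguments are assumptions; only the 5D clip layout is standard for 3D convolutions in PyTorch.

```python
import torch
from sononet3d.models import SonoNet3D  # import path is an assumption

model = SonoNet3D(num_features=32)  # 16, 32 or 64 initial features

# 3D models consume short clips: (batch, channels, clip_len, height, width).
clip = torch.randn(1, 1, 10, 224, 288)  # clip_len=10 as in the example commands; spatial size is illustrative
logits = model(clip)
```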
logs / weights4sononet2d / FetalDB: pretrained weights of all SonoNet configurations (16, 32, and 64 initial features) trained on the FetalDB dataset. Each configuration has its own folder (SonoNet-16, SonoNet-32, and SonoNet-64) where the weights are stored in the "ckpt_best_loss.pth" file. These files were obtained from the ones denoted as "old", which are provided in this repository (same weights, but not directly compatible with our model definition).
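To reuse these weights outside the training scripts, loading a checkpoint might look like the following. This sketch assumes ckpt_best_loss.pth stores a plain state_dict and assembles the path from the folder structure described above; adapt it to the checkpoint's actual content.

```python
import torch
from sononet2d.models import SonoNet2D  # import path is an assumption

# Configuration must match the checkpoint (here: SonoNet-32).
model = SonoNet2D(num_features=32)
state_dict = torch.load('logs/weights4sononet2d/FetalDB/SonoNet-32/ckpt_best_loss.pth',
                        map_location='cpu')
model.load_state_dict(state_dict)
model.eval()
```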
sononet2d-traintest.py:
python sononet2d-traintest.py -data_dir 'path_to_data' -log_dir 'logs/sononet2d' -gpu 0 -num_features 32 -batch_size 128 -lr 0.00001 -max_num_epochs 200 -patience 10 -lr_sched_patience 4 -weight_decay 0.0001 -seed 21 --sampler --augmentation

sononet2d-traintest_3d_comparable.py:
python sononet2d-traintest_3d_comparable.py -data_dir 'path_to_data' -log_dir 'logs/sononet2d_3d_comparable' -gpu 0 -num_features 32 -clip_len 10 -batch_size 128 -lr 0.00001 -max_num_epochs 200 -patience 10 -lr_sched_patience 4 -weight_decay 0.0001 -seed 21 --sampler --augmentation

sononet3d-traintest.py:
python sononet3d-traintest.py -data_dir 'path_to_data' -log_dir 'logs/sononet3d' -gpu 0 -num_features 32 -clip_len 10 -batch_size 128 -lr 0.00001 -max_num_epochs 200 -patience 10 -lr_sched_patience 4 -weight_decay 0.0001 -seed 21 --sampler --augmentation
If you want to use the SonoNet (2+1)D version, add the argument --modify_3d to the command line.
temporal_test.py:
python temporal_test.py -data_dir 'path_to_data' -log_dir 'logs/temporal_test' -model_dir_2d 'logs/sononet2d_3d_comparable' -model_dir_3d 'logs/sononet3d' -model_dir_2_1d 'logs/sononet_2_1d' -gpu 0 -num_features 32 -clip_len 10
2d_vs_3d.py:
python 2d_vs_3d.py -data_dir 'path_to_data' -log_dir 'logs/2d_vs_3d' -model_dir_2d 'logs/sononet2d_3d_comparable' -model_dir_3d 'logs/sononet3d' -model_dir_2d1d 'logs/sononet_2_1d' -gpu 0 -num_features 32 -clip_len 10 -batch_size 128
To run the experiments correctly, the dataset directory must follow the structure below:
data/
│
├── classes.json
│
├── train/
│ ├── labels/
│ │ ├── <video_name_1>/
│ │ │ ├── <video_name_1>_<frame_idx>.txt
│ │ │ ├── <video_name_1>_<frame_idx>.txt
│ │ │ └── ...
│ │ └── <video_name_2>/
│ │ └── ...
│ │
│ └── videos/
│ ├── <video_name_1>/
│ │ ├── <video_name_1>_<frame_idx>.<ext>
│ │ ├── <video_name_1>_<frame_idx>.<ext>
│ │ └── ...
│ └── <video_name_2>/
│ └── ...
│
└── test/
├── labels/
│ ├── <video_name_1>/
│ │ ├── <video_name_1>_<frame_idx>.txt
│ │ ├── <video_name_1>_<frame_idx>.txt
│ │ └── ...
│ └── <video_name_2>/
│ └── ...
│
└── videos/
├── <video_name_1>/
│ ├── <video_name_1>_<frame_idx>.<ext>
│ ├── <video_name_1>_<frame_idx>.<ext>
│ └── ...
└── <video_name_2>/
└── ...
- classes.json: this file must be located directly inside the data/ directory. It contains a dictionary where:
  - each key is a class name (string),
  - each value is a unique integer ID.
- labels/: contains one subfolder per video, named exactly as the video. Each subfolder includes one .txt file per frame. File naming format: <video_name>_<frame_idx>.txt
- videos/: contains one subfolder per video, using the same video name as in labels/. Each subfolder includes all frames of the video. The file naming format matches the labels: <video_name>_<frame_idx>.png
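For illustration, the snippet below writes a classes.json with the expected structure. The class names used here are placeholders, not the actual classes of our dataset.

```python
import json

# Hypothetical class-name-to-ID mapping; replace with your own classes.
classes = {
    "background": 0,
    "plane_A": 1,
    "plane_B": 2,
}

# classes.json must sit directly inside the data/ directory.
with open('data/classes.json', 'w') as f:
    json.dump(classes, f, indent=4)
```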
[1] Baumgartner, C. F., Kamnitsas, K., Matthew, J., Fletcher, T. P., Smith, S., Koch, L. M., Kainz, B., & Rueckert, D. (2017). SonoNet: Real-time detection and localisation of fetal standard scan planes in freehand ultrasound. IEEE Transactions on Medical Imaging, 36(11), 2204-2215. [link]
[2] Tran, D., Wang, H., Torresani, L., Ray, J., LeCun, Y., & Paluri, M. (2018). A closer look at spatiotemporal convolutions for action recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 6450-6459). [link]