This repository provides the code used for training and evaluating the LAN CNN. The model is designed to reconstruct high-quality RGB images from the Bayer-filtered RAW output of a smartphone sensor, replacing the hand-crafted Image Signal Processing (ISP) pipeline found in digital cameras with a single deep learning model. The model is trained on pairs of images captured with the Sony IMX586 smartphone sensor and the Fujifilm GFX100 camera.
- Overview
- Requirements
- First steps
- Training
- Inference - Full-Resolution Images
- Inference - Numerical Evaluation
- Acknowledgements
## Requirements

- `imageio=2.9.0` for loading .png images
- `numpy=1.21.2` for general matrix operations
- `pillow=8.3.1` for image resizing operations
- `rawpy=0.16.0` for loading .raw images
- `six=1.16.0` for downloads
- `tensorflow-gpu=2.3.0` for the whole NN training and inference
- `tqdm=4.62.1` for nice progress bars
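The dependencies can be installed with pip, for example (a sketch; the exact `tensorflow-gpu` build that works on your machine may depend on your CUDA/cuDNN setup):

```bash
# Versions taken from the list above; adjust the tensorflow-gpu build to your CUDA/cuDNN setup.
pip install imageio==2.9.0 numpy==1.21.2 pillow==8.3.1 rawpy==0.16.0 six==1.16.0 \
    tensorflow-gpu==2.3.0 tqdm==4.62.1
```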
## First steps

- Download the dataset from the MAI'21 Learned Smartphone ISP Challenge website (registration needed). The dataset directory (default name: `raw_images/`) should contain three subfolders: `train/`, `val/` and `test/`.
- Download the pre-trained VGG-19 model (Mirror) and put it into the `vgg_pretrained/` folder created at the root of the directory.
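A possible layout after these steps is sketched below; the nesting of `fujifilm/`, `mediatek_raw/` and `mediatek_raw_normal/` is an assumption inferred from the default arguments in the next sections and may differ from the actual dataset packaging:

```
raw_images/
├── train/
│   ├── mediatek_raw/          # RAW patches (phone_dir) - placement assumed
│   └── fujifilm/              # RGB ground-truth patches (dslr_dir) - placement assumed
├── val/
│   ├── mediatek_raw/
│   └── fujifilm/
└── test/
    └── mediatek_raw_normal/   # full-resolution RAW images - placement assumed
vgg_pretrained/
└── imagenet-vgg-verydeep-19.mat
```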
## Training

The `train_model.py` file can be invoked as follows: `python train_model.py <args>`, where the arguments are defined in `utils.py` and can be any of the following (default values in bold):
- `dataset_dir`: **`raw_images/`** - path to the folder with the dataset
- `vgg_dir`: **`vgg_pretrained/imagenet-vgg-verydeep-19.mat`** - path to the pre-trained VGG-19 network
- `dslr_dir`: **`fujifilm/`** - path to the folder with the RGB data
- `phone_dir`: **`mediatek_raw/`** - path to the folder with the Raw data
- `model_dir`: **`models/`** - path to the folder with the model to be restored or saved
- `restore_iter`: **`None`** - iteration to restore (defaults to the last iteration)
- `patch_w`: **`256`** - width of the training images
- `patch_h`: **`256`** - height of the training images
- `batch_size`: **`32`** - batch size [small values can lead to unstable training]
- `train_size`: **`5000`** - number of training patches randomly loaded every 1000 iterations
- `learning_rate`: **`5e-5`** - learning rate
- `eval_step`: **`1000`** - every `eval_step` iterations the accuracy is computed and the model is saved
- `num_train_iters`: **`100000`** - number of training iterations
- `optimizer`: **`radam`** - the optimizer used (`adam` is the other option)

The loss function is built as a weighted combination of the following losses (the value of each argument is the corresponding weight):

- `fac_mse`: **`0`** - Mean Squared Error loss
- `fac_l1`: **`0`** - Mean Absolute Error loss
- `fac_ssim`: **`0`** - Structural Similarity Index (SSIM) loss
- `fac_ms_ssim`: **`30`** - Multi-Scale Structural Similarity Index (MS-SSIM) loss
- `fac_uv`: **`100`** - loss between blurred UV channels (color loss)
- `fac_vgg`: **`0`** - VGG loss (perceptual loss)
- `fac_lpips`: **`10`** - LPIPS loss (perceptual loss)
- `fac_huber`: **`300`** - Huber loss
- `fac_charbonnier`: **`0`** - Charbonnier loss (a smooth approximation of the L1 loss)
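For example, a training run with the default settings made explicit might look like this (a sketch only; the `key=value` argument style is an assumption, so check `utils.py` for the exact parsing):

```bash
# Sketch: key=value argument style is an assumption; see utils.py for the real syntax.
python train_model.py dataset_dir=raw_images/ vgg_dir=vgg_pretrained/imagenet-vgg-verydeep-19.mat \
    batch_size=32 learning_rate=5e-5 num_train_iters=100000 optimizer=radam \
    fac_ms_ssim=30 fac_uv=100 fac_lpips=10 fac_huber=300
```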
## Inference - Full-Resolution Images

The `test_model.py` file can be invoked as follows: `python test_model.py <args>`, where the arguments are defined in `utils.py` and can be any of the following (default values in bold):
- `dataset_dir`: **`raw_images/`** - path to the folder with the dataset
- `result_dir`: **`<model_dir>`** - output images are saved under `results/full-resolution/<result_dir>`
- `phone_dir`: **`mediatek_raw_normal/`** - path to the folder with the Raw data
- `model_dir`: **`models/`** - path to the folder with the model to be restored or saved
- `restore_iter`: **`None`** - iteration to restore, defaults to the last iteration
- `img_w`: **`3000`** - width of the full-resolution images
- `img_h`: **`4000`** - height of the full-resolution images
- `use_gpu`: **`True`** - use the GPU for inference
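A full-resolution inference run might then look like this (same caveat: the `key=value` style is an assumption, so check `utils.py`):

```bash
# Sketch: key=value argument style is an assumption; see utils.py for the real syntax.
python test_model.py dataset_dir=raw_images/ phone_dir=mediatek_raw_normal/ model_dir=models/ \
    img_w=3000 img_h=4000 use_gpu=True
```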
## Inference - Numerical Evaluation

The `evaluate_model.py` file can be invoked as follows: `python evaluate_model.py <args>`, where the arguments are defined in `utils.py` and can be any of the following (default values in bold):
- `dataset_dir`: **`raw_images/`** - path to the folder with the dataset
- `vgg_dir`: **`vgg_pretrained/imagenet-vgg-verydeep-19.mat`** - path to the pre-trained VGG-19 network
- `dslr_dir`: **`fujifilm/`** - path to the folder with the RGB data
- `phone_dir`: **`mediatek_raw/`** - path to the folder with the Raw data
- `model_dir`: **`models/`** - path to the folder with the model to be restored or saved
- `restore_iter`: **`None`** - iteration to restore, defaults to the last iteration
- `img_w`: **`256`** - width of the evaluation patches
- `img_h`: **`256`** - height of the evaluation patches
- `use_gpu`: **`True`** - use the GPU for inference
- `batch_size`: **`10`** - batch size
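A numerical evaluation run might look like this (same caveat: the `key=value` style is an assumption, so check `utils.py`):

```bash
# Sketch: key=value argument style is an assumption; see utils.py for the real syntax.
python evaluate_model.py dataset_dir=raw_images/ model_dir=models/ \
    img_w=256 img_h=256 batch_size=10 use_gpu=True
```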
## Acknowledgements

- This repository is based on the mai21-learned-smartphone-isp repository;
- The LPIPS loss function was implemented using the code from alexlee-gk;
- The RAdam optimizer was implemented using the code from taki0112.

