Docker: gpu is not detected #20

@jalpianissimo

Hi, I wanted to use your program with Docker on Linux, but I ran into the following problems:

Issue 1:

Following the PaddleOCR environment setup instructions, I used the paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7 image, then installed paddlepaddle-gpu, paddleocr and videocr-PaddleOCR. When I tried to run videocr as in the Colab example, I got the following error:

ImportError: cannot import name 'shadow_var_between_sub_programs' from 'paddle.distributed.passes.pass_utils'

I fixed it by downloading the latest pass_utils.py from the PaddleOCR repo; after that, everything worked.

Issue 2:

Afterward I noticed that OCR was running on the CPU: nvidia-smi showed no active GPU usage, and the following snippet printed False:

import paddle
gpu_available = paddle.device.is_compiled_with_cuda()
print("GPU available:", gpu_available)
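
A note on the check above: is_compiled_with_cuda() reports whether the installed Paddle build includes CUDA support, not whether a GPU is actually in use. A slightly fuller check might look like the sketch below. The helper name paddle_gpu_report is made up for illustration, and the import is guarded so the script also runs where Paddle is not installed.

```python
def paddle_gpu_report():
    """Return basic GPU-related facts about the installed Paddle, or None.

    paddle_gpu_report is a hypothetical helper name, not part of Paddle.
    """
    try:
        import paddle
    except ImportError:
        # Paddle is not installed in this environment.
        return None
    return {
        "version": paddle.__version__,
        # True only if the installed build was compiled with CUDA support.
        "compiled_with_cuda": paddle.device.is_compiled_with_cuda(),
        # Device string Paddle currently reports, e.g. "cpu" or "gpu:0".
        "current_device": paddle.device.get_device(),
    }


if __name__ == "__main__":
    report = paddle_gpu_report()
    if report is None:
        print("paddle is not installed")
    else:
        for key, value in report.items():
            print(f"{key}: {value}")
```

Running this right after starting the container and again after each pip install should show at which step the CUDA build gets replaced.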

Since I had already installed the NVIDIA Container Toolkit, I started a fresh container from the same image to find out whether the problem was in the image or somewhere else:

docker stop ppocr
docker container remove ppocr
docker image ls -a
docker image rm [ID]
sudo docker run --gpus all --name ppocr -v $PWD:/paddle --shm-size=64G --network=host -it paddlepaddle/paddle:2.1.3-gpu-cuda10.2-cudnn7 /bin/bash

I immediately ran the Python snippet above and it returned True. Then I installed videocr-PaddleOCR and checked again; this time it returned False.
Next I tried installing videocr-PaddleOCR on a newer image pulled from Docker Hub (paddlepaddle/paddle:2.6.1-gpu-cuda12.0-cudnn8.9-trt8.6) and repeated the steps above, running the check after starting the container, after installing paddlepaddle-gpu, and after installing videocr. The results were the same as before (but this time with no ImportError): paddleocr alone passes the GPU check, but after installing videocr it no longer does.

I understand very little about programming, but here is my solution for running videocr on the GPU:

  1. Start a Docker container from the paddlepaddle/paddle:2.6.1-gpu-cuda12.0-cudnn8.9-trt8.6 image.
  2. Clone this repo (git clone https://github.com/devmaxxing/videocr-PaddleOCR) and edit requirements.txt so that it includes only:
paddlepaddle-gpu
paddleocr==2.7.0.2
charset-normalizer==3.2.0
colorama==0.4.6
Levenshtein==0.21.1
paddle-bfloat==0.1.7
python-Levenshtein==0.21.1
PyWavelets==1.4.1
thefuzz==0.19.0
  3. Install with python -m pip install . and run! Now it uses the GPU (the first snippet returns True even after installing videocr).
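
The requirements edit in step 2 could also be done with a small script instead of by hand. The sketch below only drops the CPU-only paddlepaddle entry and keeps every other line, which is a simpler variant of the trimmed list above. The function names are made up for illustration, and it assumes requirements.txt uses plain one-package-per-line entries.

```python
import re

# Matches the package name at the start of a requirements line.
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_name(line):
    """Return the normalized package name of a requirements line, or None."""
    stripped = line.strip()
    # Skip blank lines, comments, and pip options such as "-r other.txt".
    if not stripped or stripped.startswith(("#", "-")):
        return None
    match = _NAME_RE.match(stripped)
    if match is None:
        return None
    # Normalize like pip does: lowercase, runs of -, _ and . become "-".
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def drop_cpu_paddle(lines):
    """Remove the CPU-only paddlepaddle entry; paddlepaddle-gpu is kept."""
    return [line for line in lines if requirement_name(line) != "paddlepaddle"]


if __name__ == "__main__":
    # Assumes the script is run from the root of the cloned repo.
    with open("requirements.txt", encoding="utf-8") as handle:
        original = handle.read().splitlines()
    with open("requirements.txt", "w", encoding="utf-8") as handle:
        handle.write("\n".join(drop_cpu_paddle(original)) + "\n")
```

Matching on the exact normalized name matters here, because a plain substring test for "paddlepaddle" would also remove paddlepaddle-gpu.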

Note:

I reached this conclusion by chance, but if I had to give a reason, it would be the presence of paddlepaddle in the original requirements.txt file: I noticed that with paddlepaddle-gpu and paddlepaddle installed together, the GPU check returns False regardless of the Docker image.
To be sure, I included only the missing dependencies (i.e. on the clean Docker image I installed paddlepaddle-gpu, ran pip freeze, and cross-checked the output against the original requirements.txt to find everything that was missing), and pinned paddleocr==2.7.0.2 because of #16, since I hit the same issue with the latest paddleocr.
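
To confirm whether an environment is in the conflicting state described in this note, something like the following could be used. It is a minimal sketch that uses only the standard library (Python 3.8+) to ask which of the two Paddle distributions pip has recorded as installed. The function names are hypothetical.

```python
from importlib import metadata


def installed_paddle_builds(names=("paddlepaddle", "paddlepaddle-gpu")):
    """Map each installed distribution name in `names` to its version."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed; leave it out.
            pass
    return found


def has_conflict(found):
    """True when both the CPU and the GPU distribution are installed."""
    return "paddlepaddle" in found and "paddlepaddle-gpu" in found


if __name__ == "__main__":
    builds = installed_paddle_builds()
    print("Installed Paddle builds:", builds or "none")
    if has_conflict(builds):
        print("Both paddlepaddle and paddlepaddle-gpu are installed.")
```

If both show up, uninstalling both and reinstalling only paddlepaddle-gpu is probably the safest way back to a working state, since the two distributions install into the same paddle package directory.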

Note-bis:

Here is the code I used (taken from the Colab example) to test videocr:

from videocr import save_subtitles_to_file

#@title OCR parameters
input_file_path = "/home/test.mp4" 
output_file_path = "/home/out.srt" 
language_code = "ch" 
use_gpu = True 
start_time = "0:00" 
end_time = "" 
confidence_threshold = 75 
similarity_threshold = 80 
frames_to_skip = 0 
crop_x = None 
crop_y = None 
crop_width = None 
crop_height = None 

save_subtitles_to_file(input_file_path, output_file_path, lang=language_code,
                       time_start=start_time, time_end=end_time,
                       conf_threshold=confidence_threshold, sim_threshold=similarity_threshold,
                       use_gpu=use_gpu,
                       frames_to_skip=frames_to_skip,
                       crop_x=crop_x, crop_y=crop_y, crop_width=crop_width, crop_height=crop_height)

Note-last:

Thank you for this program; it has helped me a lot. I wanted to share my experience because I lost a good few hours on this, but it now seems to be fixed.
