GPU not detected by Tensorflow V1 - python-2.7

I'm currently using TF 1.1.0. I tried to list the available devices with the following command:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
It displayed only the CPU, not the GPU. When I ran the same command with TF 2.x, both the CPU and the GPU were displayed. Is there a way to make TF v1 detect my GPU? I'm not willing to switch to TF v2.
My model is being trained on the CPU because TF isn't detecting the GPU.

If you are not willing to switch to TensorFlow 2.x, you can try updating your GPU drivers and specifying which GPU to use when running TensorFlow. You can do this with the CUDA_VISIBLE_DEVICES environment variable. For example:
CUDA_VISIBLE_DEVICES=0 python your_script.py
This tells TensorFlow to use the first GPU (index 0) when running your script. You can change the index to point at a different GPU.
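Note that TensorFlow 1.x shipped GPU support as the separate tensorflow-gpu package, so a CPU-only install will never list a GPU no matter how the environment variables are set. As a quick check, here is a minimal sketch (assuming a reasonably recent TF 1.x build with matching CUDA/cuDNN installed):
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   # must be set before TensorFlow initialises the GPU

import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.test.is_built_with_cuda())        # False means a CPU-only build; no driver update will help
print(device_lib.list_local_devices())     # a working setup lists a /device:GPU:0 entry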

Related

Why isn't my colab notebook using the GPU?

When I run code on my colab notebook after having selected the GPU, I get a message saying "You are connected to a GPU runtime, but not utilizing the GPU". Now I understand similar questions have been asked before, but I still don't understand why. I am running PCA on a dataset over hundreds of iterations, for multiple trials. Without a GPU it takes about as long as it does on my laptop, which can be >12 hours, resulting in a time out on colab. Is colab's GPU restricted to machine learning libraries like tensorflow only? Is there a way around this so I can take advantage of the GPU to speed up my analysis?
Colab is not restricted to Tensorflow only.
Colab offers three kinds of runtimes: a standard runtime (with a CPU), a GPU runtime (which includes a GPU) and a TPU runtime (which includes a TPU).
"You are connected to a GPU runtime, but not utilizing the GPU" indicates that the user is conneted to a GPU runtime, but not utilizing the GPU, and so a less costly CPU runtime would be more suitable.
Therefore, you have to use a package that utilizes the GPU, such as Tensorflow or Jax. GPU runtimes also have a CPU, and unless you are specifically using packages that exercise the GPU, it will sit idle.
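PCA itself can be moved onto the GPU by doing the centering and the SVD with a GPU-aware library instead of NumPy/scikit-learn. A minimal sketch with TensorFlow (the matrix shape is made up; on a Colab GPU runtime with TensorFlow 2.x preinstalled, tf.linalg.svd is placed on the GPU automatically):
import tensorflow as tf

print(tf.config.list_physical_devices('GPU'))     # sanity check: should list the Colab GPU
x = tf.random.normal((20000, 512))                # stand-in for your data matrix
x -= tf.reduce_mean(x, axis=0)                    # center the columns
s, u, v = tf.linalg.svd(x, full_matrices=False)   # SVD-based PCA, executed on the GPU
components = v[:, :10]                            # top 10 principal directions
Running many trials this way keeps the heavy linear algebra on the GPU instead of falling back to a CPU-only code path.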

How to release the occupied GPU memory when calling keras model by Apache mod_wsgi django?

My server configuration is as follows:
Apache 2.4.23
mod_wsgi 4.5.9
We call a Keras deep learning model through the Django framework under the Apache server. After the model has been called successfully, it stays resident in GPU memory, and that memory cannot be released except by shutting down the Apache server.
So, is there any way to control the release of GPU memory when calling a Keras model through Apache + mod_wsgi + Django?
Thanks!
Runtime memory footprint screenshots
For people who fail to make K.clear_session() work, there is an alternative solution:
from numba import cuda
cuda.select_device(0)   # pick the GPU whose memory should be freed
cuda.close()            # tear down the CUDA context, releasing its allocations
Tensorflow is just allocating memory to the GPU, while CUDA is responsible for managing the GPU memory.
If CUDA somehow refuses to release the GPU memory even after you have cleared the whole graph with K.clear_session(), you can use the numba cuda bindings to control CUDA directly and free the memory.
from keras import backend as K
K.clear_session()
This will clear the current session (graph), so the stale model should be removed from the GPU. If that doesn't work, you might need to 'del model' and reload it again.
from numba import cuda
device = cuda.get_current_device()
device.reset()          # reset the whole device, dropping every allocation on it
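Putting the two answers together, the teardown order below is a minimal sketch (it assumes the Keras model object is bound to a name model and that the worker process can afford to lose its CUDA context and rebuild it on the next request):
import gc
from keras import backend as K
from numba import cuda

del model             # drop the Python reference to the Keras model
K.clear_session()     # tear down the TF graph/session that Keras keeps alive
gc.collect()          # let Python reclaim the wrapper objects

# If nvidia-smi still shows the memory as allocated, reset the CUDA context
# for GPU 0 directly; the process has to re-create it before the next inference.
cuda.select_device(0)
cuda.close()
In a mod_wsgi deployment it is often simpler to give the daemon process a limited lifetime (e.g. the maximum-requests option) so the process, and with it the GPU memory, is recycled automatically.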

Pytorch .backward() method without CUDA

I am trying to run the code in the PyTorch tutorial on the autograd module. However, when I run the .backward() call, I get the error:
cuda runtime error (38) : no CUDA-capable device is detected at torch/csrc/autograd/engine.cpp:359
I admittedly have no CUDA-capable device set up at the moment, but it was my understanding that this wasn't strictly necessary (at least I didn't find it specified anywhere in the tutorial). So I was wondering if there is a way to still run the code without a CUDA-enabled GPU.
You should transfer your network, inputs, and labels onto the CPU using: net.cpu(), Variable(inputs.cpu()), Variable(labels.cpu())
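On recent PyTorch versions Variable is no longer needed; plain tensors carry gradients. A minimal CPU-only sketch (the tiny network and random data here are illustrative, not taken from the tutorial):
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
inputs = torch.randn(4, 8)
labels = torch.tensor([0, 1, 0, 1])

net = net.cpu()                                      # explicit, although CPU is already the default
loss = nn.functional.cross_entropy(net(inputs), labels)
loss.backward()                                      # autograd runs fine without any CUDA-capable device
As long as nothing is moved with .cuda() or .to('cuda'), the backward pass never touches a GPU.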

Time Consuming Tensorflow C++ Session->Run - Images for Real-time Inference

[Tensorflow (TF) on CPU]
I am using the skeleton code provided for C++ TF inference from GitHub [label_image/main.cc] in order to run a frozen model I have created in Python. This model is an FC NN with two hidden layers.
In my current project's C++ code, I run the NN's frozen classifier for each single image (8x8 pixels). For each sample, a Session->Run call takes about 0.02 seconds, which is expensive in my application, since I can have 64000 samples that I have to run.
When I send a batch of 1560 samples, the Session->Run call takes about 0.03 seconds.
Are these time measurements normal for the Session->Run call? From the C++ end, should I send my frozen model batches of images rather than single samples? From the Python end, are there optimisation tricks to alleviate that bottleneck? Is there a way to do Session->Run calls concurrently in C++?
Environment info
Operating System: Linux
Installed version of CUDA and cuDNN: N/A
What other solutions have you tried?
I installed TF using the optimised instruction set for the CPU, but it does not seem to give me the huge time saving mentioned on StackOverflow.
Unified the session for the Graph I created.
EDIT
It seems that MatMul is the bottleneck -- any suggestions on how to improve that?
Should I use 'optimize_for_inference.py' script for my frozen graph?
How can you measure the time in Python with high precision?
Timeline for feeding an 8x8 sample and getting the result in Python
Timeline for feeding an 8x8 batch and getting the result in Python
For the record, I have done two things that significantly increased the speed of my application:
Compiled TF to work on the optimised ISA of my machine.
Applied batching to my data samples.
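Batching amortises the fixed per-call overhead of Session->Run over many samples, which is why one 1560-sample call (~0.03 s) is barely slower than one single-sample call (~0.02 s). The same idea, and a high-precision timing pattern, can be prototyped from Python; the sketch below is illustrative only (the file name, tensor names and input shape are placeholders for your own frozen graph):
import time
import numpy as np
import tensorflow as tf   # TF 1.x API, matching the frozen-graph workflow above

graph_def = tf.GraphDef()
with tf.gfile.GFile('frozen_model.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')
    x = graph.get_tensor_by_name('input:0')
    y = graph.get_tensor_by_name('output:0')

batch = np.random.rand(1560, 8, 8, 1).astype(np.float32)   # one batch instead of 1560 single calls

with tf.Session(graph=graph) as sess:
    sess.run(y, feed_dict={x: batch})                       # warm-up: graph optimisation, allocation
    start = time.perf_counter()                             # perf_counter gives high-resolution timing
    sess.run(y, feed_dict={x: batch})
    print('batched run: %.4f s' % (time.perf_counter() - start))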
Please feel free to comment here if you have questions about my answer.

How can I make sure Caffe is using the GPU?

Is there any way to make sure Caffe is using the GPU? I compiled Caffe after installing the CUDA driver and without the CPU_ONLY flag in cmake, and during configuration cmake logged that it detected CUDA 8.0.
But while training a sample, I doubt it is using the GPU, judging by the nvidia-smi output. How can I make sure?
For future caffe wanderers scouring around, this finally did the trick for me:
caffe.set_mode_gpu()   # switch the Caffe backend to GPU mode
caffe.set_device(0)    # use GPU 0 (the nvidia-smi index)
I did have solver_mode: GPU, and it would show the process on the GPU, but the 'GPU Memory Usage' reported by nvidia-smi was not enough to fit my model (so I knew something was wrong...).
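Putting the pycaffe calls together, here is a minimal sketch; the deploy.prototxt and weights.caffemodel names are placeholders for your own files:
import caffe

caffe.set_device(0)      # select GPU 0 (the index shown by nvidia-smi)
caffe.set_mode_gpu()     # must be set before the net is constructed

net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
# While this process is alive, nvidia-smi should list it together with its GPU memory usage.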
The surest way I know is to properly configure the solver.prototxt file.
Include the line
solver_mode: GPU
If you specify which engine individual layers of your model should use, you'll also want to make sure those settings refer to GPU-backed implementations.
You can use Caffe::set_mode(Caffe::GPU); in your program explicitly.
To make sure the process is using the GPU, you can use the nvidia-smi command in Ubuntu to see which processes are using the GPU.
As for me, I use MTCNN to do face detection (implemented with Caffe):
I use the nvidia-smi command to show the processes that use the GPU; if you want to refresh it at an interval, use watch nvidia-smi.
As the image below shows, the process mtcnn_c (which uses the Caffe backend) is using the GPU.