I am using python's pil library to display images. Now I have a sequence of frames to display as a video content. I have a np.array that contains the RGB values of all the frames.
Could not find a method similar to Mathlabs implay to display these frames.
I can use imshow in a loop, but thats would be very slow as I need to mention framerate.
Matplotlib animations work well, and is easy to use. For reasonable size images they typically run at 30fps, or around that. Matplotlib 1.1+ has a nice new animation interface: here are some examples and a tutorial.
Older versions of matplotlib aren't to hard to animate either (you basically just set the data directly and refresh the plot) but the animation depends a bit more on the backend, so you need to look for an appropriate example.
For a specific example, if images is your list of matplotlib images that you want to animate, you can simply do:
animation.ArtistAnimation(fig, images, interval=50, blit=True, repeat_delay=1000)
This, btw, is taken from this example, if you want to also see the code that generates test images. The code to animate is simply the line above.
I have implemented a handy script that just suits your need. Try it out here
An example to show lists of images in a directory will be
import os
import glob
from scipy.misc import imread
img_dir = 'YOUR-IMAGE-DIRECTORY'
img_files = glob.glob(os.path.join(video_dir, '*.jpg'))
def redraw_fn(f, axes):
img_file = img_files[f]
img = imread(img_file)
if not redraw_fn.initialized:
redraw_fn.im = axes.imshow(img, animated=True)
redraw_fn.initialized = True
else:
redraw_fn.im.set_array(img)
redraw_fn.initialized = False
videofig(len(img_files), redraw_fn, play_fps=30)
If you happen to already have a working OpenCV install built with OpenEXR – which if you don’t, it’s at least as much of an irritating time-sink to rebuild OpenCV as e.g. compiling SciPy from source in the first place – but so if the library is already there and working, you can use the Python bindings to quickly† view images with only a little bit of boilerplate. From this example:
import OpenEXR, Imath, cv
filename = "GoldenGate.exr"
exrimage = OpenEXR.InputFile(filename)
dw = exrimage.header()['dataWindow']
(width, height) = (dw.max.x - dw.min.x + 1, dw.max.y - dw.min.y + 1)
def fromstr(s):
mat = cv.CreateMat(height, width, cv.CV_32FC1)
cv.SetData(mat, s)
return mat
pt = Imath.PixelType(Imath.PixelType.FLOAT)
(r, g, b) = [fromstr(s) for s in exrimage.channels("RGB", pt)]
bgr = cv.CreateMat(height, width, cv.CV_32FC3)
cv.Merge(b, g, r, None, bgr)
cv.ShowImage(filename, bgr)
cv.WaitKey()
I believe the OpenCV matrix type implements the python interfaces for memoryview et al – don’t be scared off by those objects as they’re NumPy arrays with different socks on, if you will.
†) quickly, w/r/t both the developer sense of speed: you can use this stuff immediately instead of building SciPy addons or mucking about with the Python array view C interface; but also in the real sense, as everything that comprises the aforementioned stuff – the OpenCV matrix structs, their related Python C API underpinnings, the OpenEXR format, and the stock implementation of the interface to same – have been raked over the optimization coals for years, largely by notable and grant-backed squadrons of specialist scholar-nerds who know what they are doing in this arena.
Related
I use C++ code to read pictures from WMTS server using DGAL.
First I initialize GDAL once:
...
OGRRegisterAll();
etc.
But new connection is opened every time I want to read new image (different urls):
gdalDataset = GDALOpen(my_url, GA_ReadOnly);
URL example: https://sampleserver6.arcgisonline.com/arcgis/rest/services/Toronto/ImageServer/tile/12/1495/1145
Unfortunately I didn't find a way to read multiply images by same connection.
Is there such option in GDAL or in WMTS?
Are there other ways to improve timing (I read thousands of images)?
While GDAL can read PNG files, it doesn't add much since those lack any geographical metadata.
You probably want to interact with the WMS server instead, not the images directly. You can for example run gdalinfo on the main url to see the subdatasets:
gdalinfo https://sampleserver6.arcgisonline.com/arcgis/services/Toronto/ImageServer/WMSServer?request=GetCapabilities&service=WMS
The first layer seems to have an issue, I'm not sure, but the other ones seem to behave fine.
I hope you don't mind me using some Python code, but the c++ api should be similar. Or you could try using the command-line utilities first (gdal_translate), to get familiar with the service.
See the WMS driver for more information and examples:
https://gdal.org/drivers/raster/wms.html
You can for example retrieve a subset and store it with:
from osgeo import gdal
url = r"WMS:https://sampleserver6.arcgisonline.com:443/arcgis/services/Toronto/ImageServer/WMSServer?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap&LAYERS=Toronto%3ANone&SRS=EPSG:4326&BBOX=-79.454856,43.582524,-79.312167,43.711781"
bbox = [-79.35, 43.64, -79.32, 43.61]
filename = r'D:\Temp\toronto_subset.tif'
ds = gdal.Translate(filename, url, xRes=0.0001, yRes=0.0001, projWin=bbox)
ds = None
Which looks like:
import numpy as np
import matplotlib.pyplot as plt
ds = gdal.OpenEx(filename)
img = ds.ReadAsArray()
ds = None
mpl_extent = [bbox[i] for i in [0,2,3,1]]
fig, ax = plt.subplots(figsize=(5,5), facecolor="w")
ax.imshow(np.moveaxis(img, 0, -1), extent=mpl_extent)
Note that the data in native resolution for these type of services is often ridiculously large, so usually you want to specify a subset and/or limited resolution as the output.
This is main code which works on CPU machine. It loads all images and masks from folders, resizes them, and save as 2 numpy arrays.
from skimage.transform import resize as imresize
from skimage.io import imread
def create_data(dir_input, img_size):
img_files = sorted(glob(dir_input + '/images/*.jpg'))
mask_files = sorted(glob(dir_input + '/masks/*.png'))
X = []
Y = []
for img_path, mask_path in zip(img_files, mask_files):
img = imread(img_path)
img = imresize(img, (img_size, img_size), mode='reflect', anti_aliasing=True)
mask = imread(mask_path)
mask = imresize(mask, (img_size, img_size), mode='reflect', anti_aliasing=True)
X.append(img)
Y.append(mask)
path_x = dir_input + '/images-{}.npy'.format(img_size)
path_y = dir_input + '/masks-{}.npy'.format(img_size)
np.save(path_x, np.array(X))
np.save(path_y, np.array(Y))
Here is gcloud storage hierarchy
gs://my_bucket
|
|----inputs
| |----images/
| |-----masks/
|
|----outputs
|
|----trainer
dir_input should be gs://my_bucket/inputs
This doesn't work. What is the proper way to load images from that path on cloud, and save numpy array in the inputs folder?
Preferable with skimage, which is loaded in setup.py
Most Python libraries such as numpy don't natively support reading from and writing to object stores like GCS or S3. There are a few options:
Copy the data to local disk first (see this answer).
Try using the GCS python SDK (docs)
Use another library, like TensorFlow's FileIO abstraction. Here's some code similar to what you're trying to do (read/write numpy arrays).
The latter is particularly useful if you are using TensorFlow, but can still be used even if you are using some other framework.
I have trained some models using tensorflow 1.5.1 and I have the checkpoints for those models (including .ckpt and .meta files). Now I want to do inference in c++ using those files.
In python, I would do the following to save and load the graph and the checkpoints.
for saving:
images = tf.placeholder(...) // the input layer
//the graph def
output = tf.nn.softmax(net) // the output layer
tf.add_to_collection('images', images)
tf.add_to_collection('output', output)
for inference i restore the graph and the checkpoint then restore the input and output layers from collections like so:
meta_file = './models/last-100.meta'
ckpt_file = './models/last-100'
with tf.Session() as sess:
saver = tf.train.import_meta_graph(meta_file)
saver.restore(sess, ckpt_file)
images = tf.get_collection('images')
output = tf.get_collection('output')
outputTensors = sess.run(output, feed_dict={images: np.array(an_image)})
now assuming that I did the saving in python as usual, how can I do inference and restore in c++ with simple code like in python?
I have found examples and tutorials but for tensorflow versions 0.7 0.12 and the same code doesn't work for version 1.5. I found no tutorials for restoring models using c++ API on tensorflow website.
For the sake of this thread. I will rephrase my comment into an answer.
Posting a full example would require either a CMake setup or putting the file into a specific directory to run bazel. As I do favor the first way and it would burst all limits on this post to cover all parts I would like to redirect to a complete implementation in C99, C++, GO without Bazel which I tested for TF > v1.5.
Loading a graph in C++ is not much more difficult than in Python, given you compiled TensorFlow already from source.
Start by creating a MWE, which creates a very dump network graph is always a good idea to figure out how things work:
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[1, 2], name='input')
output = tf.identity(tf.layers.dense(x, 1), name='output')
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
saver = tf.train.Saver(tf.global_variables())
saver.save(sess, './exported/my_model')
There are probably tons of answers here on SO about this part. So I just let it stay here without further explanation.
Loading in Python
Before doing stuff in other languages, we can try to do it in python properly -- in the sense: we just need to rewrite it in C++.
Even restoring is very easy in python like:
import tensorflow as tf
with tf.Session() as sess:
# load the computation graph
loader = tf.train.import_meta_graph('./exported/my_model.meta')
sess.run(tf.global_variables_initializer())
loader = loader.restore(sess, './exported/my_model')
x = tf.get_default_graph().get_tensor_by_name('input:0')
output = tf.get_default_graph().get_tensor_by_name('output:0')
it is not helpful as most of these API endpoints do not exists in the C++ API (yet?). An alternative version would be
import tensorflow as tf
with tf.Session() as sess:
metaGraph = tf.train.import_meta_graph('./exported/my_model.meta')
restore_op_name = metaGraph.as_saver_def().restore_op_name
restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name
sess.run(restore_op, {filename_tensor_name: './exported/my_model'})
x = tf.get_default_graph().get_tensor_by_name('input:0')
output = tf.get_default_graph().get_tensor_by_name('output:0')
Hang on. You can always use print(dir(object)) to get the properties like restore_op_name, ... .
Restoring a model is an operation in TensorFlow like every other operation. We just call this operation and providing the path (a string-tensor) as an input. We can even write our own restore operation
def restore(sess, metaGraph, fn):
restore_op_name = metaGraph.as_saver_def().restore_op_name # u'save/restore_all'
restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name # u'save/Const'
sess.run(restore_op, {filename_tensor_name: fn})
Even this looks strange, it now greatly helps to do the same stuff in C++.
Loading in C++
Starting with the usual stuff
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/public/session_options.h>
#include <tensorflow/core/protobuf/meta_graph.pb.h>
#include <string>
#include <iostream>
typedef std::vector<std::pair<std::string, tensorflow::Tensor>> tensor_dict;
int main(int argc, char const *argv[]) {
const std::string graph_fn = "./exported/my_model.meta";
const std::string checkpoint_fn = "./exported/my_model";
// prepare session
tensorflow::Session *sess;
tensorflow::SessionOptions options;
TF_CHECK_OK(tensorflow::NewSession(options, &sess));
// here we will put our loading of the graph and weights
return 0;
}
You should be able to compile this by either put it in the TensorFlow repo and use bazel or simply follow the instructions here to use CMake.
We need to create such a meta_graph created by tf.train.import_meta_graph. This can be done by
tensorflow::MetaGraphDef graph_def;
TF_CHECK_OK(ReadBinaryProto(tensorflow::Env::Default(), graph_fn, &graph_def));
In C++ reading a graph from file is not the same as importing a graph in Python. We need to create this graph in a session by
TF_CHECK_OK(sess->Create(graph_def.graph_def()));
By looking at the strange python restore function above:
restore_op_name = metaGraph.as_saver_def().restore_op_name
restore_op = tf.get_default_graph().get_operation_by_name(restore_op_name)
filename_tensor_name = metaGraph.as_saver_def().filename_tensor_name
we can code the equivalent piece in C++
const std::string restore_op_name = graph_def.saver_def().restore_op_name()
const std::string filename_tensor_name = graph_def.saver_def().filename_tensor_name()
Having this in place, we just run the operation by
sess->Run(feed_dict, // inputs
{}, // output_tensor_names (we do not need them)
{restore_op}, // target_node_names
nullptr) // outputs (there are no outputs this time)
Creating the feed_dict is probably a post on its own and this answer is already long enough. It does only cover the most important stuff. I would like to redirect to a complete implementation in C99, C++, GO without Bazel which I tested for TF > v1.5. This is not that hard -- it just can get very long in the case of the plain C version.
I'd like to build and train a multi-layer LSTM model (stateIsTuple=True) in python, and then load and use it in C++. But I'm having a hard time figuring out how to feed and fetch states in C++, mainly because I don't have string names which I can reference.
E.g. I put the initial state in a named scope such as
with tf.name_scope('rnn_input_state'):
self.initial_state = cell.zero_state(args.batch_size, tf.float32)
and this appears in the graph as below, but how can I feed to these in C++?
Also, how can I fetch the current state in C++? I tried the graph construction code below in python but I'm not sure if it's the right thing to do, because last_state should be a tuple of tensors, not a single tensor (though I can see that the last_state node in tensorboard is 2x2x50x128, which sounds like it just concatenated the states as I have 2 layers, 128 rnn size, 50 mini batch size, and lstm cell - with 2 state vectors).
with tf.name_scope('outputs'):
outputs, last_state = legacy_seq2seq.rnn_decoder(inputs, self.initial_state, cell, loop_function=loop if infer else None)
output = tf.reshape(tf.concat(outputs, 1), [-1, args.rnn_size], name='output')
and this is what it looks like in tensorboard
Should I concat and split the state tensors so there is only ever one state tensor going in and out? Or is there a better way?
P.S. Ideally the solution won't involve hard-coding the number of layers (or rnn size). So I can just have four strings input_node_name, output_node_name, input_state_name, output_state_name, and the rest is derived from there.
I managed to do this by manually concatenating the state into a single tensor. I'm not sure if this is wise, since this is how tensorflow used to handle states, but is now deprecating that and switching to tuple states. Instead of setting state_is_tuple=False and risking my code being obsolete soon, I've added extra ops to manually stack and unstack the states to and from a single tensor. Saying that, it works fine both in python and C++.
The key code is:
# setting up
zero_state = cell.zero_state(batch_size, tf.float32)
state_in = tf.identity(zero_state, name='state_in')
# based on https://medium.com/#erikhallstrm/using-the-tensorflow-multilayered-lstm-api-f6e7da7bbe40#.zhg4zwteg
state_per_layer_list = tf.unstack(state_in, axis=0)
state_in_tuple = tuple(
# TODO make this not hard-coded to LSTM
[tf.contrib.rnn.LSTMStateTuple(state_per_layer_list[idx][0], state_per_layer_list[idx][1])
for idx in range(num_layers)]
)
outputs, state_out_tuple = legacy_seq2seq.rnn_decoder(inputs, state_in_tuple, cell, loop_function=loop if infer else None)
state_out = tf.identity(state_out_tuple, name='state_out')
# running (training or inference)
state = sess.run('state_in:0') # zero state
loop:
feed = {'data_in:0': x, 'state_in:0': state}
[y, state] = sess.run(['data_out:0', 'state_out:0'], feed)
Here is the full code if anyone needs it
https://github.com/memo/char-rnn-tensorflow
I am playing around with pyglet 1.2alpha-1 and Python 3.3. I have the following (extremely simple) application and cannot figure out what my issue is:
import pyglet
window = pyglet.window.Window()
#image = pyglet.resource.image('img1.jpg')
image = pyglet.image.load('img1.jpg')
label = pyglet.text.Label('Hello, World!!',
font_name='Times New Roman',
font_size=36,
x=window.width//2, y=window.height//2,
anchor_x='center', anchor_y='center')
#window.event
def on_draw():
window.clear()
label.draw()
# image.blit(0,0)
pyglet.app.run()
With the above code, my text label will appear as long as image.blit(0, 0) is commented out. However, if I try to display the image, the program crashes with the following error:
File "C:\Python33\lib\site-packages\pyglet\gl\lib.py", line 105, in errcheck
raise GLException(msg)
pyglet.gl.lib.GLException: b'invalid value'
I also get the above error if I try to use pyglet.resource.image instead of pyglet.image.load (the image and py file are in the same directory).
Any one know how I can fix this issue?
I am using Python 3.3, pyglet 1.2alpha-1, and Windows 8.
The code -including the image.blit- runs fine for me. I'm using python 2.7.3, pyglet 1.1.4
There's nothing wrong with the code. You might consider trying other python and pyglet versions for the time being (until pyglet has a new stable release)
This isn't a "fix", but might at least determine if it's fixable or not (mine was not). (From the Pyglet mailing group.)
You can verify whether the system does not even support Textures greater than 1024, by running this code (Python 3+):
from ctypes import c_long
from pyglet.gl import glGetIntegerv, GL_MAX_TEXTURE_SIZE
i = c_long()
glGetIntegerv(GL_MAX_TEXTURE_SIZE, i)
print (i) # output: c_long(1024) (or higher)
That is the maximum texture size your system supports. If it's 1024, then any larger pictures will raise an Exception. (And the only fix is, get a better system).