GdkPixbuf - Saving pixel data to file from Python - python-2.7

I am trying to save some pixels to a file using GdkPixbuf from Python in Windows. I am making use of the excellent PyGI AIO (3.14.0) binaries.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from gi.repository import Gtk, Gdk, GdkPixbuf
w, h, n = 4, 4, 4
data = bytearray(b'\x00\x00\x00\xff' * w * h)
#data = GLib.Bytes.new(b'\x00\x00\x00\xff' * w * h).get_data()
#import numpy as np
#data = np.zeros((w,h,n), np.uint8)
#data[:,:,3] = 255
#data = data.tostring()
options = {}
pixbuf = GdkPixbuf.Pixbuf.new_from_data(data, GdkPixbuf.Colorspace.RGB, True, 8, w, h, n*w, None, None)
pixbuf.savev('screenshot.bmp', 'bmp', options.keys(), options.values())
The zoomed-in result looks as follows (image omitted):
Clearly, the first couple of pixels are corrupted. The amount of broken pixels seems to vary with the image dimensions, yet some of the pixels stay intact. There must be an error in my code, or the memory is getting corrupted somehow. It is possible to encode a larger image, and the error always appears in the first few pixels. Could this be a string encoding problem or something?
Edit: I have tested the program on OS X and the error is very similar. Therefore, it seems to be a general issue with the Python bindings to GdkPixbuf, potentially related to this. Here is a bigger PNG produced by a modified version of the script. The grid of red and green lines is the expected output, whereas the pixels in the upper half of the image are just noise.
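A workaround that is commonly suggested for this kind of binding issue (a sketch, assuming gdk-pixbuf 2.32 or newer; not verified against this particular AIO build) is to hand the pixel buffer over as GLib.Bytes via Pixbuf.new_from_bytes, so the bindings never read from memory that has already been freed:
from gi.repository import GLib, GdkPixbuf
w, h, n = 4, 4, 4
# GLib.Bytes keeps the buffer alive on the C side
data = GLib.Bytes.new(b'\x00\x00\x00\xff' * w * h)
pixbuf = GdkPixbuf.Pixbuf.new_from_bytes(
    data, GdkPixbuf.Colorspace.RGB, True, 8, w, h, n * w)
pixbuf.savev('screenshot.png', 'png', [], [])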

Related

Read, process and show the pixels in .EXR format images

I want to read EXR format images and look at the pixel intensities at the corresponding locations. I also want to stack them together to feed into a neural network. How can I do normal image processing on this kind of format? Please help me with this!
I have tried the following using the OpenEXR package, but I am unable to proceed further.
import OpenEXR
file = OpenEXR.InputFile('file_name.exr')
I expected to see the usual image processing methods, like
file.size()
file.show()
file.write('another format')
file.min()
file.extract_channels()
file.append('another exr file')
OpenEXR seems to be lacking fancy image processing features such as displaying images or saving the image to a different format. For this I would suggest using OpenCV, which is full of image processing features.
What you may need to do is:
Read the EXR using OpenEXR only, then extract the channels and convert them to numpy arrays, e.g. rCh = np.asarray(rCh, dtype=np.uint8)
Create an RGB image from these numpy arrays, as in img_rgb = cv2.merge([b, g, r]).
Use OpenCV functions for your listed operations (see the combined sketch after this list):
Size: img_rgb.shape
Show: cv2.imshow("img", img_rgb) (imshow requires a window name as its first argument)
Write: cv2.imwrite("path/to/file.jpg", img_rgb)
Min: np.min(b), np.min(g), np.min(r)
Extract channels: b, g, r = cv2.split(img_rgb)
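A combined sketch of those steps (the file name 'input.exr' and the float-to-uint8 scaling here are illustrative; real HDR data usually needs proper tone mapping rather than a plain clip):
import OpenEXR
import Imath
import numpy as np
import cv2

exr = OpenEXR.InputFile('input.exr')
dw = exr.header()['dataWindow']
w, h = dw.max.x - dw.min.x + 1, dw.max.y - dw.min.y + 1

# read each channel as raw 32-bit floats and reshape to the image size
FLOAT = Imath.PixelType(Imath.PixelType.FLOAT)
r, g, b = [np.frombuffer(exr.channel(c, FLOAT), dtype=np.float32).reshape(h, w)
           for c in ('R', 'G', 'B')]

def to8(ch):
    # illustrative scaling to 8 bits; adjust for your data
    return np.clip(ch * 255, 0, 255).astype(np.uint8)

img_rgb = cv2.merge([to8(b), to8(g), to8(r)])  # OpenCV expects BGR order
print(img_rgb.shape)                           # size
cv2.imwrite('converted.jpg', img_rgb)          # write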
There is an example on the OpenEXR webpage:
import sys
import array
import OpenEXR
import Imath
if len(sys.argv) != 3:
    print "usage: exrnormalize.py exr-input-file exr-output-file"
    sys.exit(1)
# Open the input file
file = OpenEXR.InputFile(sys.argv[1])
# Compute the size
dw = file.header()['dataWindow']
sz = (dw.max.x - dw.min.x + 1, dw.max.y - dw.min.y + 1)
# Read the three color channels as 32-bit floats
FLOAT = Imath.PixelType(Imath.PixelType.FLOAT)
(R,G,B) = [array.array('f', file.channel(Chan, FLOAT)).tolist() for Chan in ("R", "G", "B") ]
After this, you should have three lists of floating-point data, one per channel. You can easily convert these to numpy arrays and proceed with OpenCV as user @ZdaR suggests.
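For instance, a quick sketch of that conversion, continuing from the variables above:
import numpy as np
# sz is (width, height), so reshape with the height first
R = np.array(R, dtype=np.float32).reshape(sz[1], sz[0])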

PIL - how to insert an index, or subscript, into text?

Like this (target image omitted). Calculating the coordinates manually does not look good; maybe there is a better way? This code works fine, but it is complicated to always calculate where to place the index for each letter.
from PIL import Image, ImageDraw, ImageFont
image = Image.new('I', (300, 100), "white").convert('RGBA')
font = ImageFont.truetype(font=r"C:\Windows\Fonts\Arial.ttf", size=39)
draw = ImageDraw.Draw(image, 'RGBA')
draw.text((10, 10), "P", fill="black", font=font, align="center")
font = ImageFont.truetype(font=r"C:\Windows\Fonts\Arial.ttf", size=20)
draw.text((25, 35), "2", fill="black", font=font, align="center")
image.save(output_folder + 'test.png')
One possibility for you might be to use ImageMagick, which understands the Pango Markup Language - it looks kind of like HTML.
So, at the command-line you could run this:
convert -background white pango:'<span size="49152">Formula: <b>2P<sub><small><small>2</small></small></sub>O<sub><small><small>5</small></small></sub></b></span>' formula.png
which produces this PNG file (image omitted):
Change to -background none to write on a piece of transparent canvas if you want to preserve whatever is underneath the text in your original image.
You can also put all the markup in a separate text file, called say "pango.txt" like this:
<span size="49152">Formula: <b>2P<sub><small><small>2</small></small></sub>O<sub><small><small>5</small></small></sub></b></span>
and pass that into ImageMagick like this:
convert pango:@pango.txt result.png
You could shell out and do this using subprocess.call().
Then you can easily load the resultant image and composite/paste it in where you want it - that would take about 3 lines of Python that you could put in a function.
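A minimal sketch of that route (the markup string, file names, and paste position below are placeholders):
import subprocess
from PIL import Image

markup = '<span size="49152"><b>P<sub><small>2</small></sub>O<sub><small>5</small></sub></b></span>'
subprocess.call(['convert', '-background', 'none', 'pango:' + markup, 'formula.png'])

base = Image.open('base.png').convert('RGBA')        # hypothetical source image
formula = Image.open('formula.png').convert('RGBA')
base.paste(formula, (10, 10), formula)               # alpha channel as the mask
base.save('result.png')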
Here is a further example of an image generated with Pango by Anthony Thyssen, so you can see some of the possibilities (image omitted).
There is loads of further information on Pango by Anthony here.
Note that there are also Python bindings for ImageMagick; I am not very familiar with them, but they may be cleaner than shelling out.
Keywords: Pango, PIL, Pillow, Python, markup, subscript, superscript, formula, chemical formulae, ImageMagick, image, image processing, SGML, HTML.
You can also do this sort of thing using Mathtext in Matplotlib:
#!/usr/bin/env python3
import matplotlib.pyplot as plt
plt.axes([0.025, 0.025, 0.95, 0.95])
# Some formula with superscripts, subscripts, square roots, fractions and integrals
eq = r"$ 2P_2 O_5 + H^{2j}$"
size = 50
x,y = 0.5, 0.5
alpha = 1
params = {'mathtext.default': 'regular' }
plt.rcParams.update(params)
plt.text(x, y, eq, ha='center', va='center', color="#11557c", alpha=alpha,
         transform=plt.gca().transAxes, fontsize=size, clip_on=True)
# Suppress ticks
plt.xticks(())
plt.yticks(())
# Save on transparent background
plt.savefig('result.png', transparent=True)
You can also save the output in a memory buffer (without going to disk) and then use that in your PIL-based image processing.
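For example, a sketch of the in-memory route, continuing from the script above (plt is already in scope):
import io
from PIL import Image

buf = io.BytesIO()
plt.savefig(buf, format='png', transparent=True)
buf.seek(0)
overlay = Image.open(buf)  # a regular PIL image you can paste/composite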
Note that I have explicitly named and assigned all the parameters (x, y, size and alpha) so you can play with them; that makes the code look longer and more complicated than it actually is.
Keywords: Python, PIL, Pillow, maths, mathematical symbols, formula with superscripts, subscripts, square roots, fractions and integrals.

Integrating an array in scipy with bounds

I am trying to integrate over an array of data, but with bounds. Therefore I planned to use simps (scipy.integrate.simps). Because simps itself does not support bounds, I decided to feed it only the selection of my data I want to integrate over. Yet this leads to strange results which are twice as big as the expected outcome.
What am I doing wrong, or what am I missing or misunderstanding?
# -*- coding: utf-8 -*-
from scipy import integrate
from scipy import interpolate
import numpy as np
import matplotlib.pyplot as plt
# my data
x = np.linspace(-10, 10, 30)
y = x**2
# but I only want to integrate from 3 to 5
f = interpolate.interp1d(x, y)
x_selection = np.linspace(3, 5, 10)
y_selection = f(x_selection)
# quad returns the expected result
print 'quad', integrate.quad(f, 3, 5), '<- the expected value (including error estimation)'
# but simps returns an unexpected result when using the selected data
print 'simps', integrate.simps(x_selection, y_selection), '<- twice as big'
print 'trapz', integrate.trapz(x_selection, y_selection), '<- also twice as big'
plt.plot(x, y, marker='.')
plt.fill_between(x, y, 0, alpha=0.5)
plt.plot(x_selection, y_selection, marker='.')
plt.fill_between(x_selection, y_selection, 0, alpha=0.5)
plt.show()
Windows 7, Python 2.7, SciPy 1.0.0
The arguments to simps() and trapz() are in the wrong order.
You have flipped the calling arguments: simps and trapz expect the y values first and the x values second, as per the docs. Once you have corrected this, you should obtain similar results. Note that your example function admits a trivial analytic antiderivative, which would be much cheaper to evaluate.
– N. Wouda
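With the arguments in the documented (y, x) order, the corrected calls are:
print 'simps', integrate.simps(y_selection, x_selection)
print 'trapz', integrate.trapz(y_selection, x_selection)
Both should now come out close to quad's result and to the exact value, which for x² integrated from 3 to 5 is 98/3 ≈ 32.67.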

Doing OCR to identify text written on trucks/cars or other vehicles

I am new to the world of Computer Vision.
I am trying to use Tesseract to detect numbers written on the side of trucks.
So for this example, I would like to see CMA CGM as the output.
I fed this image to Tesseract via command line
tesseract image.JPG out -psm 6
but it yielded a blank file.
Then I read the documentation of tesserocr (a Python wrapper for Tesseract) and tried the following code
from PIL import Image
from tesserocr import PyTessBaseAPI, RIL

image = Image.open('image.jpg')  # assumed: load the truck photo first
with PyTessBaseAPI() as api:
    api.SetImage(image)
    boxes = api.GetComponentImages(RIL.TEXTLINE, True)
    print 'Found {} textline image components.'.format(len(boxes))
    for i, (im, box, _, _) in enumerate(boxes):
        # im is a PIL image object
        # box is a dict with x, y, w and h keys
        api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
        ocrResult = api.GetUTF8Text()
        conf = api.MeanTextConf()
        print (u"Box[{0}]: x={x}, y={y}, w={w}, h={h}, "
               u"confidence: {1}, text: {2}").format(i, conf, ocrResult, **box)
and again it was not able to read any characters in the image.
My question is: how should I go about solving this problem? (I am not looking for ready-made code, but for an approach to solving it.)
Would I need to train Tesseract with sample images, or can I just write code using existing libraries to somehow detect the coordinates of the truck and do OCR only within the boundaries of the truck?
Tesseract expects document-style images, but you have non-document objects in your image. You need a sophisticated segmentation step (and then probably some image processing) before feeding it to Tesseract-OCR.
I have a three-step solution:
Take the part of the image you want to recognize
Apply Gaussian blur
Apply simple thresholding
You can use index ranges to take the part of the image. For instance, select the
height range from int(h/4) + 40 to int(h/2) - 20, and the
width range from int(w/2) to int((w*3)/4).
Result (intermediate images omitted): the cropped part, its Gaussian-blurred version, and the thresholded version. pytesseract then reads:
CMA CGM
Code:
import cv2
import pytesseract
img = cv2.imread('YizU3.jpg')
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(h, w) = gry.shape[:2]
# crop to the region containing the company name
gry = gry[int(h/4) + 40:int(h/2) - 20, int(w/2):int((w*3)/4)]
# smooth, then binarize the blurred crop with Otsu's threshold
blr = cv2.GaussianBlur(gry, (3, 3), 0)
thr = cv2.threshold(blr, 128, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
txt = pytesseract.image_to_string(thr)
print(txt)
cv2.imshow("thr", thr)
cv2.waitKey(0)

Poor Performance and Strange Array Behavior when run on Linux

So I am working on a script to do some video processing. It will read a video file searching for red dots of a certain size, then find the center of each and return the x/y coordinates. Initially I had it working great on my Windows machine, so I sent it over to the Raspberry Pi to see if I would encounter issues, and boy did I.
On Windows the script would run in real time, completing at the same time as the video. On the Raspberry Pi it is slowwwwwwww. I also noticed, when I looked into the structure of contours, that there is a huge array of 0s first, before my array of x/y coordinates. I have no idea what is creating this, but it doesn't happen on the Windows box.
I have the same version of Python and OpenCV installed on both boxes; the only difference is numpy 1.11 on Windows and numpy 1.12 on the Raspberry Pi. Note that I had to change the index in np.mean(contours[...]) to 1 to skip the initial array of 0s. What have I done wrong?
Here's a video I made for testing purposes if needed:
http://www.foxcreekwinery.com/video.mp4
import numpy as np
import cv2

def vidToPoints():
    cap = cv2.VideoCapture('video.mp4')
    while cap.isOpened():
        ret, image = cap.read()
        if ret:
            cv2.imshow('frame', image)
            if cv2.waitKey(1) == ord('q'):
                break
            # save frame as image
            cv2.imwrite('frame.jpg', image)
            # load the image
            image = cv2.imread('frame.jpg')
            # define the list of boundaries (BGR lower/upper bounds for red)
            boundaries = [
                ([0, 0, 150], [90, 90, 255])
            ]
            # loop over the boundaries
            for (lower, upper) in boundaries:
                # create NumPy arrays from the boundaries
                lower = np.array(lower, dtype="uint8")
                upper = np.array(upper, dtype="uint8")
                # find the colors within the specified boundaries
                mask = cv2.inRange(image, lower, upper)
                if 50 > cv2.countNonZero(mask) > 10:
                    # find contours
                    contours = cv2.findContours(mask, 0, 1)
                    # average the contour points to find the center
                    avg = np.mean(contours[1], axis=1)
                    x = int(round(avg[0, 0, 0]))
                    y = int(round(avg[0, 0, 1]))
                    print [x, y]
                    print cv2.countNonZero(mask)
            # skip ahead a few frames
            for l in range(5):
                cap.grab()
        else:
            break
    cap.release()
    cv2.destroyAllWindows()

vidToPoints()
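For what it's worth, a huge leading array of 0s is consistent with a known difference in the return value of cv2.findContours across OpenCV versions: 2.4 and 4.x return (contours, hierarchy), while 3.x returns (image, contours, hierarchy). A version-agnostic sketch for unpacking it (using the named equivalents of the 0 and 1 flags above):
# cv2.findContours returns a 2-tuple on OpenCV 2.4/4.x and a 3-tuple on 3.x;
# the contour list is either the first or the second element
res = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contours = res[0] if len(res) == 2 else res[1]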