How can I overlay a transparent PNG onto another image without losing its transparency using OpenCV in Python?
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png')
# Help please
cv2.imwrite('combined.png', background)
Desired output:
Sources:
Background Image
Overlay
import cv2
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png')
added_image = cv2.addWeighted(background, 0.4, overlay, 0.1, 0)
cv2.imwrite('combined.png', added_image)
The correct answer to this was far too hard to come by, so I'm posting it even though the question is really old. What you are looking for is "over" compositing, and the algorithm for it can be found on Wikipedia: https://en.wikipedia.org/wiki/Alpha_compositing
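For reference, the "over" operator from that article, with alpha normalized to 0.0-1.0 (C is one color channel, f the foreground, b the background):

alpha_out = alpha_f + alpha_b * (1 - alpha_f)
C_out = (C_f * alpha_f + C_b * alpha_b * (1 - alpha_f)) / alpha_out

The code below skips the division by alpha_out, which is safe when the background is fully opaque (alpha_b = 1 makes alpha_out = 1).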
I am far from an expert with OpenCV, but after some experimentation this is the most efficient way I have found to accomplish the task:
import cv2

# note: both images must be the same size and carry an alpha channel
background = cv2.imread("background.png", cv2.IMREAD_UNCHANGED)
foreground = cv2.imread("overlay.png", cv2.IMREAD_UNCHANGED)

# normalize alpha channels from 0-255 to 0-1
alpha_background = background[:, :, 3] / 255.0
alpha_foreground = foreground[:, :, 3] / 255.0

# set adjusted colors
for color in range(0, 3):
    background[:, :, color] = alpha_foreground * foreground[:, :, color] + \
        alpha_background * background[:, :, color] * (1 - alpha_foreground)

# set adjusted alpha and denormalize back to 0-255
background[:, :, 3] = (1 - (1 - alpha_foreground) * (1 - alpha_background)) * 255

# display the image
cv2.imshow("Composited image", background)
cv2.waitKey(0)
The following code uses the alpha channel of the overlay image to blend it correctly into the background image; use x and y to set the top-left corner of the overlay.
import cv2
import numpy as np

def overlay_transparent(background, overlay, x, y):
    background_width = background.shape[1]
    background_height = background.shape[0]

    if x >= background_width or y >= background_height:
        return background

    h, w = overlay.shape[0], overlay.shape[1]

    if x + w > background_width:
        w = background_width - x
        overlay = overlay[:, :w]

    if y + h > background_height:
        h = background_height - y
        overlay = overlay[:h]

    if overlay.shape[2] < 4:
        overlay = np.concatenate(
            [
                overlay,
                np.ones((overlay.shape[0], overlay.shape[1], 1), dtype=overlay.dtype) * 255
            ],
            axis=2,
        )

    overlay_image = overlay[..., :3]
    mask = overlay[..., 3:] / 255.0

    background[y:y+h, x:x+w] = (1.0 - mask) * background[y:y+h, x:x+w] + mask * overlay_image

    return background
This code mutates background, so create a copy first if you wish to preserve the original background image.
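A minimal usage sketch (the file names and coordinates are placeholders; the overlay must be read with IMREAD_UNCHANGED so its alpha channel survives loading):

import cv2

background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED)
result = overlay_transparent(background.copy(), overlay, 100, 50)
cv2.imwrite('combined.png', result)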
Been a while since this question appeared, but I believe this is the right simple answer, which could still help somebody.
background = cv2.imread('road.jpg')
overlay = cv2.imread('traffic sign.png')
rows,cols,channels = overlay.shape
overlay = cv2.addWeighted(background[250:250+rows, 0:0+cols], 0.5, overlay, 0.5, 0)
background[250:250+rows, 0:0+cols] = overlay
This will overlay the image over the background image such as shown here:
Ignore the ROI rectangles
Note that I used a background image of size 400x300 and an overlay image of size 32x32. The overlay appears in the x[0-32], y[250-282] region of the background, according to the coordinates I set: the blend is calculated first, and the result is then written into that part of the image.
(The overlay is loaded from disk, not taken from the background image itself. Unfortunately the overlay image has its own white background, so you can see that in the result too.)
If performance isn't a concern, you can iterate over each pixel of the overlay and apply it to the background. This isn't very efficient, but it helps in understanding how to work with a PNG's alpha channel.
slow version
import cv2

background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED)  # IMREAD_UNCHANGED => open image with the alpha channel

height, width = overlay.shape[:2]
for y in range(height):
    for x in range(width):
        overlay_color = overlay[y, x, :3]  # first three elements are color (BGR in OpenCV)
        overlay_alpha = overlay[y, x, 3] / 255  # 4th element is the alpha channel, convert from 0-255 to 0.0-1.0

        # get the color from the background image
        background_color = background[y, x]

        # combine the background color and the overlay color weighted by alpha
        composite_color = background_color * (1 - overlay_alpha) + overlay_color * overlay_alpha

        # update the background image in place
        background[y, x] = composite_color

cv2.imwrite('combined.png', background)
result:
fast version
I stumbled across this question while trying to add a png overlay to a live video feed. The above solution is way too slow for that. We can make the algorithm significantly faster by using numpy's vectorized operations.
note: This was my first real foray into numpy so there may be better/faster methods than what I've come up with.
import cv2
import numpy as np
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
# separate the alpha channel from the color channels
alpha_channel = overlay[:, :, 3] / 255 # convert from 0-255 to 0.0-1.0
overlay_colors = overlay[:, :, :3]
# To take advantage of the speed of numpy and apply transformations to the entire image with a single operation,
# the arrays need to be the same shape. However, the shapes currently look like this:
# - overlay_colors shape: (height, width, 3)  3 color values for each pixel (blue, green, red)
# - alpha_channel shape: (height, width)      1 single alpha value for each pixel
# We will construct an alpha_mask that has the same shape as overlay_colors by duplicating the alpha channel
# for each color, so there is one alpha value per color channel
alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))
# The background image is larger than the overlay so we'll take a subsection of the background that matches the
# dimensions of the overlay.
# NOTE: For simplicity, the overlay is applied to the top-left corner of the background(0,0). An x and y offset
# could be used to place the overlay at any position on the background.
h, w = overlay.shape[:2]
background_subsection = background[0:h, 0:w]
# combine the background with the overlay image weighted by alpha
composite = background_subsection * (1 - alpha_mask) + overlay_colors * alpha_mask
# overwrite the section of the background image that has been updated
background[0:h, 0:w] = composite
cv2.imwrite('combined.png', background)
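A sketch of an equivalent alternative (not from the original answer): numpy broadcasting can expand the single alpha channel on the fly, so the explicit np.dstack mask isn't strictly needed:

alpha = overlay[:, :, 3:] / 255.0  # keep the last axis: shape (h, w, 1) broadcasts against (h, w, 3)
composite = background[0:h, 0:w] * (1 - alpha) + overlay[:, :, :3] * alpha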
How much faster? On my machine, the slow method takes ~3 seconds and the optimized method takes ~30 ms. So about 100 times faster!
Wrapped up in a function
This function handles foreground and background images of different sizes and also supports negative and positive offsets that move the overlay across the bounds of the background image in any direction.
import cv2
import numpy as np

def add_transparent_image(background, foreground, x_offset=None, y_offset=None):
    bg_h, bg_w, bg_channels = background.shape
    fg_h, fg_w, fg_channels = foreground.shape

    assert bg_channels == 3, f'background image should have exactly 3 channels (RGB). found:{bg_channels}'
    assert fg_channels == 4, f'foreground image should have exactly 4 channels (RGBA). found:{fg_channels}'

    # center by default
    if x_offset is None: x_offset = (bg_w - fg_w) // 2
    if y_offset is None: y_offset = (bg_h - fg_h) // 2

    w = min(fg_w, bg_w, fg_w + x_offset, bg_w - x_offset)
    h = min(fg_h, bg_h, fg_h + y_offset, bg_h - y_offset)

    if w < 1 or h < 1: return

    # clip foreground and background images to the overlapping regions
    bg_x = max(0, x_offset)
    bg_y = max(0, y_offset)
    fg_x = max(0, x_offset * -1)
    fg_y = max(0, y_offset * -1)
    foreground = foreground[fg_y:fg_y + h, fg_x:fg_x + w]
    background_subsection = background[bg_y:bg_y + h, bg_x:bg_x + w]

    # separate alpha and color channels from the foreground image
    foreground_colors = foreground[:, :, :3]
    alpha_channel = foreground[:, :, 3] / 255  # 0-255 => 0.0-1.0

    # construct an alpha_mask that matches the image shape
    alpha_mask = np.dstack((alpha_channel, alpha_channel, alpha_channel))

    # combine the background with the overlay image weighted by alpha
    composite = background_subsection * (1 - alpha_mask) + foreground_colors * alpha_mask

    # overwrite the section of the background image that has been updated
    background[bg_y:bg_y + h, bg_x:bg_x + w] = composite
example usage:
background = cv2.imread('field.jpg')
overlay = cv2.imread('dice.png', cv2.IMREAD_UNCHANGED) # IMREAD_UNCHANGED => open image with the alpha channel
x_offset = 0
y_offset = 0
print("arrow keys to move the dice. ESC to quit")
while True:
    img = background.copy()
    add_transparent_image(img, overlay, x_offset, y_offset)

    cv2.imshow("", img)
    key = cv2.waitKey()
    # note: these key codes can vary by platform; print(key) to discover yours
    if key == 0: y_offset -= 10  # up
    if key == 1: y_offset += 10  # down
    if key == 2: x_offset -= 10  # left
    if key == 3: x_offset += 10  # right
    if key == 27: break  # escape
You need to open the transparent PNG image using the flag IMREAD_UNCHANGED
Mat overlay = cv::imread("dice.png", IMREAD_UNCHANGED);
Then split the channels, group the color channels back together, and use the alpha channel as a mask, like this:
/**
 * @brief Draws a transparent image over a frame Mat.
 *
 * @param frame the frame where the transparent image will be drawn
 * @param transp the Mat image with transparency, read from a PNG image, with the IMREAD_UNCHANGED flag
 * @param xPos x position of the frame image where the image will start.
 * @param yPos y position of the frame image where the image will start.
 */
void drawTransparency(Mat frame, Mat transp, int xPos, int yPos) {
    Mat mask;
    vector<Mat> layers;

    split(transp, layers);  // separate channels
    Mat rgb[3] = { layers[0], layers[1], layers[2] };
    mask = layers[3];  // png's alpha channel used as mask
    merge(rgb, 3, transp);  // put together the RGB channels, now transp isn't transparent
    transp.copyTo(frame.rowRange(yPos, yPos + transp.rows).colRange(xPos, xPos + transp.cols), mask);
}
It can be called like this:
drawTransparency(background, overlay, 10, 10);
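For Python users, a rough equivalent sketch of this mask-based approach (the function name is hypothetical; note it does hard binary masking via the alpha channel, so partially transparent pixels are not blended):

import cv2

def draw_transparency(frame, transp, x_pos, y_pos):
    # transp must be a 4-channel image read with cv2.IMREAD_UNCHANGED
    h, w = transp.shape[:2]
    bgr = transp[:, :, :3]
    alpha = transp[:, :, 3]
    roi = frame[y_pos:y_pos + h, x_pos:x_pos + w]
    # copy overlay pixels wherever alpha is non-zero (no partial blending)
    roi[alpha > 0] = bgr[alpha > 0]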
To overlay a PNG watermark image over a normal 3-channel JPEG image:
import cv2
import numpy as np

def logoOverlay(image, logo, alpha=1.0, x=0, y=0, scale=1.0):
    (h, w) = image.shape[:2]
    image = np.dstack([image, np.ones((h, w), dtype="uint8") * 255])

    overlay = cv2.resize(logo, None, fx=scale, fy=scale)
    (wH, wW) = overlay.shape[:2]
    output = image.copy()
    # blend the two images together using transparent overlays
    try:
        if x < 0: x = w + x
        if y < 0: y = h + y
        if x + wW > w: wW = w - x
        if y + wH > h: wH = h - y
        print(x, y, wW, wH)
        overlay = cv2.addWeighted(output[y:y + wH, x:x + wW], alpha, overlay[:wH, :wW], 1.0, 0)
        output[y:y + wH, x:x + wW] = overlay
    except Exception as e:
        print("Error: Logo position is overshooting image!")
        print(e)

    output = output[:, :, :3]
    return output
Usage:
background = cv2.imread('image.jpeg')
overlay = cv2.imread('logo.png', cv2.IMREAD_UNCHANGED)
print(overlay.shape) # must be (x,y,4)
print(background.shape) # must be (x,y,3)
# downscale logo by half and position on bottom right reference
out = logoOverlay(background,overlay,scale=0.5,y=-100,x=-100)
cv2.imshow("test",out)
cv2.waitKey(0)
import cv2
import numpy as np
background = cv2.imread('background.jpg')
overlay = cv2.imread('cloudy.png')
overlay = cv2.resize(overlay, (200,200))
# overlay = for_transparent_removal(overlay)
h, w = overlay.shape[:2]
shapes = np.zeros_like(background, np.uint8)
shapes[0:h, 0:w] = overlay
alpha = 0.8
mask = shapes.astype(bool)
# option first
background[mask] = cv2.addWeighted(shapes, alpha, shapes, 1 - alpha, 0)[mask]
cv2.imwrite('combined.png', background)
# option second
background[mask] = cv2.addWeighted(background, alpha, overlay, 1 - alpha, 0)[mask]
# NOTE: both options above will give you an image overlay, but the resulting effect will differ
cv2.imwrite('combined.1.png', background)
Use this function to place your overlay on any background image. If you want to resize the overlay, use overlay = cv2.resize(overlay, (200, 200)) and then pass the resized overlay into the function.
import cv2
import numpy as np

def image_overlay_second_method(img1, img2, location, min_thresh=0, is_transparent=False):
    h, w = img1.shape[:2]
    h1, w1 = img2.shape[:2]
    x, y = location
    roi = img1[y:y + h1, x:x + w1]

    gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, min_thresh, 255, cv2.THRESH_BINARY)
    mask_inv = cv2.bitwise_not(mask)
    img_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
    img_fg = cv2.bitwise_and(img2, img2, mask=mask)
    dst = cv2.add(img_bg, img_fg)
    if is_transparent:
        dst = cv2.addWeighted(img1[y:y + h1, x:x + w1], 0.1, dst, 0.9, None)
    img1[y:y + h1, x:x + w1] = dst
    return img1

if __name__ == '__main__':
    background = cv2.imread('background.jpg')
    overlay = cv2.imread('overlay.png')
    output = image_overlay_second_method(background, overlay, location=(800, 50), min_thresh=0, is_transparent=True)
    cv2.imwrite('output.png', output)
background.jpg
output.png
Apologies in advance, as I am a newbie to OpenCV-Python. I set myself the task of creating a passport-type image from a video capture.
Using a head-and-shoulders Haar cascade I was able to create a portrait photo, but I now want to turn the background white (leaving the head-and-shoulders portrait in the foreground).
I'm just not sure of the best way to do this. Any help would be welcome.
Many thanks in advance.
Here is the code:
import numpy as np
import cv2

# face file
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# eye file
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')
# head shoulders file
hs_cascade = cv2.CascadeClassifier('HS.xml')

cap = cv2.VideoCapture(1)

while 1:
    ret, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    headshoulders = hs_cascade.detectMultiScale(gray, 1.3, 3)

    # find the head and shoulders
    for (x, y, w, h) in headshoulders:
        # variable change to make portrait orientation
        x = int(x * 1.5)
        w = int(w / 1.5)
        cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
        # crop the image
        crop_img = img[y: y + h, x: x + w]

    # show original and crop
    cv2.imshow('crop', crop_img)
    cv2.imshow('img', img)

    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break
    elif k == ord('s'):
        # save out the portrait image
        cv2.imwrite('cropimage.png', crop_img)

# release the camera
cap.release()
cv2.destroyAllWindows()
I got it to work. Here is my solution.
PLEASE NOTE: This worked for HI-RES images (Nikon D7100 - JPEG). LOW-RES did NOT work when I tried a webcam (Logitech C615).
I used some of the code from a link that was suggested.
# import numpy
import numpy as np
# import cv2
import cv2
# import Matplotlib
from matplotlib import pyplot as plt

# Fill any holes function
def get_holes(image, thresh):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    im_bw = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)[1]
    im_bw_inv = cv2.bitwise_not(im_bw)

    # three return values is the OpenCV 3.x findContours signature
    im_bw_inv, contour, _ = cv2.findContours(im_bw_inv, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in contour:
        cv2.drawContours(im_bw_inv, [cnt], 0, 255, -1)

    nt = cv2.bitwise_not(im_bw)
    im_bw_inv = cv2.bitwise_or(im_bw_inv, nt)
    return im_bw_inv

# Remove background function
def remove_background(image, thresh, scale_factor=.25, kernel_range=range(1, 15), border=None):
    border = border or kernel_range[-1]

    holes = get_holes(image, thresh)
    small = cv2.resize(holes, None, fx=scale_factor, fy=scale_factor)
    bordered = cv2.copyMakeBorder(small, border, border, border, border, cv2.BORDER_CONSTANT)

    for i in kernel_range:
        #kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2*i+1, 2*i+1))
        kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (2*i+1, 2*i+1))
        bordered = cv2.morphologyEx(bordered, cv2.MORPH_CLOSE, kernel)

    unbordered = bordered[border: -border, border: -border]
    mask = cv2.resize(unbordered, (image.shape[1], image.shape[0]))
    fg = cv2.bitwise_and(image, image, mask=mask)
    return fg

# Load a colour image
img = cv2.imread('original/11.png')

# Start background removal -- parameters are <image> and <threshold level>
nb_img = remove_background(img, 180)

# Change black pixels to WHITE
nb_img[np.where((nb_img == [0, 0, 0]).all(axis=2))] = [255, 255, 255]

# resize the viewing size (as the images are too big for the screen)
small = cv2.resize(nb_img, (300, 400))

# Show the finished image
cv2.imshow('image', small)

k = cv2.waitKey(0) & 0xFF
if k == 27:  # wait for ESC key to exit
    # if ESC pressed close the camera windows
    cv2.destroyAllWindows()
elif k == ord('s'):  # wait for 's' key to save and exit
    # Save the img (greyscale version)
    cv2.imwrite('bg_removal/11.png', small)
    cv2.destroyAllWindows()
I want to find the HSV value of a LASER dot using OpenCV and Python. I got the code from here: http://opencv-srf.blogspot.com.au/2010/09/object-detection-using-color-seperation.html, but it is in C++, and since installing Visual Studio and OpenCV takes time, I rewrote the code in Python:
import cv2
import numpy as np

def callback(x):
    pass

cap = cv2.VideoCapture(0)
cv2.namedWindow('image')

ilowH = 0
ihighH = 179
ilowS = 0
ihighS = 255
ilowV = 0
ihighV = 255

# create trackbars for color change
cv2.createTrackbar('lowH', 'image', ilowH, 179, callback)
cv2.createTrackbar('highH', 'image', ihighH, 179, callback)
cv2.createTrackbar('lowS', 'image', ilowS, 255, callback)
cv2.createTrackbar('highS', 'image', ihighS, 255, callback)
cv2.createTrackbar('lowV', 'image', ilowV, 255, callback)
cv2.createTrackbar('highV', 'image', ihighV, 255, callback)

while True:
    ret, frame = cap.read()
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    cv2.imshow('hsv', hsv)
    lower_hsv = np.array([ilowH, ilowS, ilowV])
    higher_hsv = np.array([ihighH, ihighS, ihighV])
    mask = cv2.inRange(hsv, lower_hsv, higher_hsv)
    cv2.imshow('mask', mask)
    cv2.imshow('frame', frame)
    print(ilowH, ilowS, ilowV)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()
cap.release()
But this code does not threshold anything. It seems like the trackbars I created do not change the values of ilowH, ilowS, ilowV; I checked by printing those values inside the while loop. What could be the problem, or is there better Python code to find the HSV values of the LASER?
Thank you, any help is appreciated.
You can grab the trackbar values with cv2.getTrackbarPos(). Also note that sometimes it puts trackbars out of order, which is annoying, but at least they're labeled.
However, I don't think that these trackbars will work very well for live video feed. There's a lot of freezing issues. You'll have to have a super low framerate (works for me with cv2.waitKey(500) if you're actually trying to display it). This is mostly due to the trackbars sucking, not the thresholding operation, which is not that slow.
You need to add your trackbars after you create the named window. Then, for your while loop, try:
while True:
    # grab the frame
    ret, frame = cap.read()

    # get trackbar positions
    ilowH = cv2.getTrackbarPos('lowH', 'image')
    ihighH = cv2.getTrackbarPos('highH', 'image')
    ilowS = cv2.getTrackbarPos('lowS', 'image')
    ihighS = cv2.getTrackbarPos('highS', 'image')
    ilowV = cv2.getTrackbarPos('lowV', 'image')
    ihighV = cv2.getTrackbarPos('highV', 'image')

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    lower_hsv = np.array([ilowH, ilowS, ilowV])
    higher_hsv = np.array([ihighH, ihighS, ihighV])
    mask = cv2.inRange(hsv, lower_hsv, higher_hsv)
    frame = cv2.bitwise_and(frame, frame, mask=mask)

    # show thresholded image
    cv2.imshow('image', frame)
    k = cv2.waitKey(1000) & 0xFF  # large wait time to remove freezing
    if k == 113 or k == 27:
        break
and finally end the file with a cv2.destroyAllWindows()
As an aside, the maximum H value for HSV is 180, not 179.
Shameless plug: I happened to just finish a project doing precisely this, but on images. You can grab it on GitHub here. There is an example; try running it and then modifying as you need. It will let you change the colorspace and threshold inside each different colorspace, and it will print the final thresholding values that you ended on. Additionally it will return the output image from the operation for you to use, too. Hopefully it is useful for you! Feel free to send any issues or suggestions through GitHub for the project.
Here is an example of it running:
And as output it gives you:
Colorspace: HSV
Lower bound: [68.4, 0.0, 0.0]
Upper bound: [180.0, 255.0, 255.0]
as well as the binary image. I am currently working on getting this into a web application as well, but that probably won't be finished for a few days.
Use this code to find the masking range on real-time video! It might save you time. Below is the whole code; check it and run it to test.
import cv2
import numpy as np

camera = cv2.VideoCapture(0)

def nothing(x):
    pass

cv2.namedWindow('marking')
cv2.createTrackbar('H Lower', 'marking', 0, 179, nothing)
cv2.createTrackbar('H Higher', 'marking', 179, 179, nothing)
cv2.createTrackbar('S Lower', 'marking', 0, 255, nothing)
cv2.createTrackbar('S Higher', 'marking', 255, 255, nothing)
cv2.createTrackbar('V Lower', 'marking', 0, 255, nothing)
cv2.createTrackbar('V Higher', 'marking', 255, 255, nothing)

while True:
    _, img = camera.read()
    img = cv2.flip(img, 1)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    hL = cv2.getTrackbarPos('H Lower', 'marking')
    hH = cv2.getTrackbarPos('H Higher', 'marking')
    sL = cv2.getTrackbarPos('S Lower', 'marking')
    sH = cv2.getTrackbarPos('S Higher', 'marking')
    vL = cv2.getTrackbarPos('V Lower', 'marking')
    vH = cv2.getTrackbarPos('V Higher', 'marking')

    LowerRegion = np.array([hL, sL, vL], np.uint8)
    upperRegion = np.array([hH, sH, vH], np.uint8)

    redObject = cv2.inRange(hsv, LowerRegion, upperRegion)

    kernal = np.ones((1, 1), "uint8")

    red = cv2.morphologyEx(redObject, cv2.MORPH_OPEN, kernal)
    red = cv2.dilate(red, kernal, iterations=1)

    res1 = cv2.bitwise_and(img, img, mask=red)

    cv2.imshow("Masking", res1)

    if cv2.waitKey(10) & 0xFF == ord('q'):
        camera.release()
        cv2.destroyAllWindows()
        break
Thanks!
Hugs..
I'm trying to convert BGR to YUV with the cvCvtColor method AND then get a reference to each component.
The source image (IplImage1) has the following parameters:
depth = 8
nChannels = 3
colorModel = RGB
channelSeq = BGR
width = 1620
height = 1220
Convert and get the components after conversion:
IplImage* yuvImage = cvCreateImage(cvSize(1620, 1220), 8, 3);
cvCvtColor(IplImage1, yuvImage, CV_BGR2YCrCb);
yPtr = yuvImage->imageData;
uPtr = yPtr + height*width;
vPtr = uPtr + height*width/4;
I have a method that converts the YUV back to RGB and saves it to a file. When I create the YUV components manually (I create a blue image) it works, and when I open the image it's really blue. But when I create the YUV components using the method above, I get a black image. I think that maybe I'm getting the references to the YUV components wrongly:
yPtr = yuvImage->imageData;
uPtr = yPtr + height*width;
vPtr = uPtr + height*width/4;
What could be the problem?
If you really must use IplImage (e.g. in legacy code, or C), then use cvSplit. Note that cvCvtColor produces an interleaved (packed) image rather than planar data, so the pointer arithmetic in the question reads the wrong bytes:
IplImage* IplImage1 = something;
IplImage* ycrcbImage = cvCreateImage(cvSize(1620, 1220), 8, 3);
cvCvtColor(IplImage1, ycrcbImage, CV_BGR2YCrCb);
IplImage* yImage = cvCreateImage(cvSize(1620, 1220), 8, 1);
IplImage* crImage = cvCreateImage(cvSize(1620, 1220), 8, 1);
IplImage* cbImage = cvCreateImage(cvSize(1620, 1220), 8, 1);
cvSplit(ycrcbImage, yImage, crImage, cbImage, 0);
The modern approach would be to avoid the legacy API and use Mats:
cv::Mat matImage1(IplImage1);
cv::Mat ycrcb_image;
cv::cvtColor(matImage1, ycrcb_image, CV_BGR2YCrCb);
// Extract the Y, Cr and Cb channels into separate Mats
std::vector<cv::Mat> planes(3);
cv::split(ycrcb_image, planes);
// Now you have the Y image in planes[0],
// the Cr image in planes[1],
// and the Cb image in planes[2]
cv::Mat Y = planes[0]; // if you want
While RGB represents color as red, green, and blue, the YCbCr color model represents color as brightness and two color-difference signals. In YCbCr, Y is the brightness (luma), Cb is blue minus luma (B - Y), and Cr is red minus luma (R - Y).
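For reference, a sketch of the BT.601 formulas that OpenCV documents for COLOR_BGR2YCrCb on 8-bit images (delta = 128):

Y  = 0.299*R + 0.587*G + 0.114*B
Cr = (R - Y) * 0.713 + 128
Cb = (B - Y) * 0.564 + 128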
Here is the code for the same, in case you are using OpenCV 3.0.0:
import numpy as np
import cv2
#Obtaining and displaying the image
x = 'C:/Users/524316/Desktop/car.jpg'
img = cv2.imread(x, 1)
cv2.imshow("img",img)
#converting to YCrCb color space
YCrCb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
cv2.imshow("YCrCb",YCrCb)
#splitting the channels individually
Y, Cr, Cb = cv2.split(YCrCb)
cv2.imshow('Y_channel', Y)
cv2.imshow('Cr_channel', Cr)
cv2.imshow('Cb_channel', Cb)
cv2.waitKey(0)
cv2.destroyAllWindows()
Original image:
YCrCb image:
Y channel:
It is the same as the grayscale image.
Cr channel:
Cb channel:
Starting from an image, I would like to shift its content upward by 10 pixels, without changing the size, and fill the resulting sub-image (width x 10px) at the bottom with black.
For instance, the original:
And the shifted:
Is there any function to perform this operation with OpenCV?
You can simply use an affine transformation translation matrix (which is basically for shifting points). cv::warpAffine() with a proper transformation matrix will do the trick.
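The translation matrix is:

M = [ 1  0  tx ]
    [ 0  1  ty ]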
where:
tx is the shift along the image's x axis,
ty is the shift along the image's y axis.
Every single pixel in the image will be shifted like that.
You can use this function, which builds the translation matrix and shifts the image based on the offsetx and offsety parameters (returning the image is probably unnecessary for you):
Mat translateImg(Mat &img, int offsetx, int offsety){
    Mat trans_mat = (Mat_<double>(2,3) << 1, 0, offsetx, 0, 1, offsety);
    warpAffine(img, img, trans_mat, img.size());
    return img;
}
In your case - you want to shift image 10 pixels up, you call:
translateImg(image,0,-10);
And then your image will be shifted as you desire.
Is there a function to perform directly this operation with OpenCV?
https://github.com/opencv/opencv/issues/4413 (previously
http://web.archive.org/web/20170615214220/http://code.opencv.org/issues/2299)
or you would do this
cv::Mat out = cv::Mat::zeros(frame.size(), frame.type());
frame(cv::Rect(0,10, frame.cols,frame.rows-10)).copyTo(out(cv::Rect(0,0,frame.cols,frame.rows-10)));
This link may help with this question, thanks:
import cv2
import numpy as np
img = cv2.imread('images/input.jpg')
num_rows, num_cols = img.shape[:2]
translation_matrix = np.float32([ [1,0,70], [0,1,110] ])
img_translation = cv2.warpAffine(img, translation_matrix, (num_cols, num_rows))
cv2.imshow('Translation', img_translation)
cv2.waitKey()
tx and ty control the shift in pixels along the x and y directions respectively (70 and 110 in the translation matrix above).
Here is a function I wrote, based on Zaw Lin's answer, to do frame/image shift in any direction by any amount of pixel rows or columns:
enum Direction{
    ShiftUp=1, ShiftRight, ShiftDown, ShiftLeft
};

cv::Mat shiftFrame(cv::Mat frame, int pixels, Direction direction)
{
    //create a same sized temporary Mat with all the pixels flagged as invalid (-1)
    cv::Mat temp = cv::Mat::zeros(frame.size(), frame.type());

    switch (direction)
    {
    case(ShiftUp) :
        frame(cv::Rect(0, pixels, frame.cols, frame.rows - pixels)).copyTo(temp(cv::Rect(0, 0, temp.cols, temp.rows - pixels)));
        break;
    case(ShiftRight) :
        frame(cv::Rect(0, 0, frame.cols - pixels, frame.rows)).copyTo(temp(cv::Rect(pixels, 0, frame.cols - pixels, frame.rows)));
        break;
    case(ShiftDown) :
        frame(cv::Rect(0, 0, frame.cols, frame.rows - pixels)).copyTo(temp(cv::Rect(0, pixels, frame.cols, frame.rows - pixels)));
        break;
    case(ShiftLeft) :
        frame(cv::Rect(pixels, 0, frame.cols - pixels, frame.rows)).copyTo(temp(cv::Rect(0, 0, frame.cols - pixels, frame.rows)));
        break;
    default:
        std::cout << "Shift direction is not set properly" << std::endl;
    }

    return temp;
}
Since there's currently no Python solution here, and a Google search for shifting an image using Python brings you to this page, here's a Python solution using np.roll().
Shifting along the x-axis
import cv2
import numpy as np

image = cv2.imread('1.jpg')
shift = 40

for i in range(image.shape[1] - 1, image.shape[1] - shift, -1):
    image = np.roll(image, -1, axis=1)
    image[:, -1] = 0

cv2.imshow('image', image)
cv2.waitKey()
Shifting along the y-axis
import cv2
import numpy as np

image = cv2.imread('1.jpg')
shift = 40

for i in range(image.shape[0] - 1, image.shape[0] - shift, -1):
    image = np.roll(image, -1, axis=0)
    image[-1, :] = 0

cv2.imshow('image', image)
cv2.waitKey()
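A sketch of a shortcut (assuming the same image variable as above): np.roll accepts the whole shift in one call, so the per-pixel loop isn't necessary; just blank whatever wrapped around afterwards:

import numpy as np

shift = 40
image = np.roll(image, -shift, axis=1)  # shift left along the x-axis in one call
image[:, -shift:] = 0                   # blank the columns that wrapped around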
Is there a function to perform directly this operation with OpenCV?
http://code.opencv.org/issues/2299
or you would do this
cv::Mat out = cv::Mat::zeros(frame.size(), frame.type());
frame(cv::Rect(0,10, frame.cols,frame.rows-10)).copyTo(out(cv::Rect(0,0,frame.cols,frame.rows-10)));
The code above can only be used to shift to one side (to the left, and to the top). The code below is an extended version that can shift in any direction:
int shiftCol = 10;
int shiftRow = 10;

Rect source = cv::Rect(max(0, -shiftCol), max(0, -shiftRow), frame.cols - abs(shiftCol), frame.rows - abs(shiftRow));
Rect target = cv::Rect(max(0, shiftCol), max(0, shiftRow), frame.cols - abs(shiftCol), frame.rows - abs(shiftRow));

frame(source).copyTo(out(target));
h, w = img.shape  # for a gray image
shift = 100  # any legal number 0 < shift < h
img[:h-shift, :] = img[shift:, :]
img[h-shift:, :] = 0
My implementation uses the same approach as the accepted answer, however it can move in any direction...

using namespace cv;
//and whatever header 'abs' requires...

Mat offsetImageWithPadding(const Mat& originalImage, int offsetX, int offsetY, Scalar backgroundColour){
    cv::Mat padded = Mat(originalImage.rows + 2 * abs(offsetY), originalImage.cols + 2 * abs(offsetX), CV_8UC3, backgroundColour);
    originalImage.copyTo(padded(Rect(abs(offsetX), abs(offsetY), originalImage.cols, originalImage.rows)));
    return Mat(padded, Rect(abs(offsetX) + offsetX, abs(offsetY) + offsetY, originalImage.cols, originalImage.rows));
}

//example use with black borders along the right hand side and top:
Mat offsetImage = offsetImageWithPadding(originalImage, -10, 6, Scalar(0,0,0));

It's taken from my own working code, but some variables were changed; if it doesn't compile, very likely just a small thing needs changing - but you get the idea regarding the abs function...
You can use a simple 2d filter/convolution to achieve your goal:
Taken straight from the OpenCV documentation. You will need to filter with a kernel that has height (desired_displacement_y * 2 + 1) and width (desired_displacement_x * 2 + 1).
Then you will need to set the kernel to all zeros except for the relative pixel position from where you want to copy. So if your kernel center is (0,0) you would set (10,0) to 1 for a displacement of 10 pixels.
Take the sample code from the website, and replace the kernel code in the middle with the following:
/// Update kernel size for a normalized box filter
kernel_size = 1 + ind * 2; //Center pixel plus displacement diameter (=radius * 2)
kernel = Mat::zeros( kernel_size, kernel_size, CV_32F );
kernel.at<float>(ind * 2, ind) = 1.0f; // Indices are zero-based, not relative
/// Apply filter
filter2D(src, dst, ddepth , kernel, anchor, delta, BORDER_CONSTANT );
Notice BORDER_CONSTANT in filter2D! You should now run the example and see the picture scroll up by one pixel every 0.5 seconds. You could also draw the black pixels using drawing methods.
On why this works, see Wikipedia.
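Here is a self-contained Python sketch of the same convolution trick (the file name is a placeholder; it shifts the content up by 10 pixels and fills the bottom with black via BORDER_CONSTANT):

import cv2
import numpy as np

d = 10                                   # desired upward displacement in pixels
kernel = np.zeros((2 * d + 1, 2 * d + 1), np.float32)
kernel[2 * d, d] = 1.0                   # copy each output pixel from d rows below the anchor

src = cv2.imread('input.jpg')
dst = cv2.filter2D(src, -1, kernel, borderType=cv2.BORDER_CONSTANT)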
I first tried pajus_cz's answer, but it was quite slow in practice. Also, I cannot afford to make a temporary copy, so I came up with this:
void translateY(cv::Mat& image, int yOffset)
{
    int validHeight = std::max(image.rows - abs(yOffset), 0);
    int firstSourceRow = std::max(-yOffset, 0);
    int firstDestinationRow = std::max(yOffset, 0);

    memmove(image.ptr(firstDestinationRow),
            image.ptr(firstSourceRow),
            validHeight * image.step);
}
It's orders of magnitude faster than the warpAffine-based solution. (But this of course may be completely irrelevant in your case.)
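One caveat worth knowing: the memmove version leaves the vacated rows holding stale copies of the old data rather than black. A rough Python sketch of the same in-place idea, with the vacated rows blanked (the function name is mine, not from the answer):

import numpy as np

def translate_y_inplace(image, y_offset):
    h = image.shape[0]
    valid = max(h - abs(y_offset), 0)
    src = max(-y_offset, 0)
    dst = max(y_offset, 0)
    # .copy() sidesteps overlapping-slice hazards during the in-place move
    image[dst:dst + valid] = image[src:src + valid].copy()
    if y_offset > 0:
        image[:dst] = 0      # blank the rows vacated at the top
    elif y_offset < 0:
        image[valid:] = 0    # blank the rows vacated at the bottom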
Python code some might find useful.
h, w, c = image.shape
shift = 4  # set shift magnitude

img_shift_right = np.zeros(image.shape)
img_shift_down = np.zeros(image.shape)
img_shift_left = np.zeros(image.shape)
img_shift_up = np.zeros(image.shape)

img_shift_right[:, shift:w, :] = image[:, :w-shift, :]
img_shift_down[shift:h, :, :] = image[:h-shift, :, :]
img_shift_left[:, :w-shift, :] = image[:, shift:, :]
img_shift_up[:h-shift, :, :] = image[shift:, :, :]