While uploading the IMG files, I am thinking about uploading the respective bounding box data as well through Pascal VOC files. How can I do that on Vertex AI or do I need to do any transformation of the Pascal VOC data?
You can do object detection using AutoML Vision, which is available on Vertex AI.
Preparing a training dataset for object detection in AutoML Vision requires your data to be in .csv format.
You can refer to this link to convert your Pascal VOC files to Cloud AutoML Vision csv.
To prepare and format data for object detection:
Prepare your training data according to the supported image files and
bounding box requirements mentioned here.
After preparing the training data, you can then create a CSV file with bounding boxes.
For more details on how to build object detection models on AutoML Vision follow this quickstart.
You can refer to this link to convert your Pascal VOC files to Cloud
AutoML Vision csv.
This requires an upgraded roboflow.com membership. The following code does the conversion locally on your machine!
Note: It requires your images and annotations (xml files) to be stored in a GCS bucket.
from google.cloud import storage
import os
import csv
import xml.etree.ElementTree as ET

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path-to-credentials-json-file'
bucket_name = 'your-bucket-name'
client = storage.Client()
bucket = client.get_bucket(bucket_name)

# Download all xml files to the current working directory. Alternatively, copy
# the xml files there yourself and comment out the download part.
print('Downloading xml files...')
blobs = client.list_blobs(bucket_name, prefix=None, delimiter='/')
xml_files = [blob.name for blob in blobs if blob.name.endswith('.xml')]
for xml in xml_files:
    bucket.blob(xml).download_to_filename(xml)

allowed_extensions = ['jpg', 'png', 'jpeg', 'gif', 'bmp', 'ico']
csv_file_name = 'vertex_ai_annos.csv'
with open(csv_file_name, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    # list_blobs returns a one-shot iterator, so request the listing again
    for blob in client.list_blobs(bucket_name, prefix=None, delimiter='/'):
        ext = blob.name.split('.')[-1]
        if ext in allowed_extensions:
            # the annotation is expected to share the image's base name
            tree = ET.parse(f"{blob.name.rsplit('.', 1)[0]}.xml")
            width = int(tree.find('size').find('width').text)
            height = int(tree.find('size').find('height').text)
            for obj in tree.findall('object'):
                label = obj.find('name').text
                # normalize pixel coordinates to the [0, 1] range
                xmin = int(obj.find('bndbox').find('xmin').text) / width
                ymin = int(obj.find('bndbox').find('ymin').text) / height
                xmax = int(obj.find('bndbox').find('xmax').text) / width
                ymax = int(obj.find('bndbox').find('ymax').text) / height
                # one field per column, so no escaping of commas is needed
                csvwriter.writerow([f'gs://{bucket_name}/{blob.name}', label,
                                    xmin, ymin, '', '', xmax, ymax, '', ''])
print(f'{csv_file_name} has been created!')
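If you want to check the conversion itself without touching a bucket, here is a minimal sketch of the per-file step as a pure function. The function name voc_to_rows, the sample annotation and the gs:// path are made up for illustration; the row layout (uri, label, xmin, ymin, two empty fields, xmax, ymax, two empty fields) is simply the one the script above writes.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, only here so the sketch can be run as-is.
SAMPLE_XML = """
<annotation>
  <size><width>200</width><height>100</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>20</xmin><ymin>10</ymin><xmax>100</xmax><ymax>50</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_rows(xml_text, gcs_uri):
    """Return one CSV row (a list of fields) per object in a Pascal VOC annotation."""
    root = ET.fromstring(xml_text)
    width = float(root.findtext('size/width'))
    height = float(root.findtext('size/height'))
    rows = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        # Pascal VOC stores pixel coordinates; divide by the image size
        # to get relative coordinates between 0 and 1.
        xmin = float(box.findtext('xmin')) / width
        ymin = float(box.findtext('ymin')) / height
        xmax = float(box.findtext('xmax')) / width
        ymax = float(box.findtext('ymax')) / height
        rows.append([gcs_uri, obj.findtext('name'),
                     xmin, ymin, '', '', xmax, ymax, '', ''])
    return rows
```

Each returned list can be passed straight to csv.writer(...).writerow(...).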
In the google cloud documentation below:
https://cloud.google.com/vertex-ai/docs/training/using-managed-datasets#access_a_dataset_from_your_training_application
It says that the following environment variables are sent to the training container:
AIP_DATA_FORMAT: The format that your dataset is exported in. Possible values include: jsonl, csv, or bigquery.
AIP_TRAINING_DATA_URI: The location that your training data is stored at.
AIP_VALIDATION_DATA_URI: The location that your validation data is stored at.
AIP_TEST_DATA_URI: The location that your test data is stored at.
Where each of the URI values are wildcards that annotate training, validation, and test data files in .jsonl format as such:
gs://bucket_name/path/training-*
gs://bucket_name/path/validation-*
gs://bucket_name/path/test-*
Now, in your custom container that contains the python code, how do you actually access the contents of each of the files?
I've tried splitting the URI string with the following regex to obtain the bucket name and the prefix, and then attempted to fetch the files using bucket.list_blobs(delimiter='/', prefix=prefix[:-1]), but it returns nothing even though the files are definitely there. Here is a minimal example of the attempted code:
import os
import re
from google.cloud import storage
aip_training_data_uri = os.environ.get('AIP_TRAINING_DATA_URI')
match = re.match('gs://(.*?)/(.*)', aip_training_data_uri)
bucket_name, prefix = match.groups()
client = storage.Client()
bucket = client.bucket(bucket_name)
blobs = bucket.list_blobs(delimiter='/', prefix=prefix[:-1]) # "[:-1]" to remove wildcard asterisks
for blob in blobs:
    print(blob.download_as_string()) # This returns an empty string
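One way to narrow this down is to separate the string handling from the Cloud Storage calls. The sketch below is only that: split_gcs_pattern and matching_names are hypothetical helpers, and it assumes the wildcard in the URI follows shell-style glob rules (which the documentation quoted above does not spell out). It does not by itself explain why the listing above comes back empty.

```python
import fnmatch


def split_gcs_pattern(uri):
    """Split 'gs://bucket/path/training-*' into (bucket, literal_prefix, pattern)."""
    if not uri.startswith('gs://'):
        raise ValueError('expected a gs:// URI, got %r' % uri)
    bucket_name, _, pattern = uri[len('gs://'):].partition('/')
    # Everything before the first wildcard character is a literal prefix
    # that can be handed to list_blobs(prefix=...).
    cut = len(pattern)
    for wildcard in '*?[':
        index = pattern.find(wildcard)
        if index != -1:
            cut = min(cut, index)
    return bucket_name, pattern[:cut], pattern


def matching_names(names, pattern):
    """Keep only the object names that match the glob pattern."""
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]
```

The names could then come from something like client.list_blobs(bucket_name, prefix=literal_prefix) with no delimiter, and each matching blob can be downloaded as in the loop above.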
I am using the python library for querying Google Cloud Storage, and I am organizing information in Storage using a naming hierarchy. For example:
my_bucket/simulations/version_1/data...
my_bucket/simulations/version_2/data...
my_bucket/simulations/version_3/data...
my_bucket/other_data/more_data...
My question is: is it possible to query using list_blobs or some other method to retrieve a list that contains just the versions from the "simulations" directory, and not all of the blobs below simulations?
For reference, this returns all blobs in a paginated fashion:
cursor = bucket.list_blobs(prefix='simulations')
I've played around with the prefix and delimiter parameters of list_blobs method and this code worked:
from google.cloud import storage

def ls(bucket_name, prefix, delimiter):
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    cursor = bucket.list_blobs(prefix=prefix, delimiter=delimiter)
    # the iterator has to be consumed before cursor.prefixes is populated
    for blob in cursor:
        pass
    for prefix in cursor.prefixes:
        print(prefix)

ls(your_bucket_name, 'simulations/', '/')
output:
simulations/version-1/
simulations/version-2/
simulations/version-3/
Note that this will only display the names of the directories inside the simulations/ directory, the files will be omitted.
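To make the prefix/delimiter behaviour a bit more concrete, here is a small sketch that emulates it on a plain list of object names, without calling Cloud Storage at all. list_subprefixes is a hypothetical helper and the object names in the usage comment are made up; it only mirrors the "directories directly under the prefix" result shown above.

```python
def list_subprefixes(names, prefix, delimiter='/'):
    """Return the distinct 'directories' that sit directly under prefix."""
    found = set()
    for name in names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # Names without a further delimiter are plain files at this level,
        # so they are skipped, just like in the output above.
        if sep:
            found.add(prefix + head + delimiter)
    return sorted(found)


# Example with made-up object names:
# list_subprefixes(['simulations/version_1/a.csv',
#                   'simulations/version_2/b.csv',
#                   'other_data/c.csv'], 'simulations/')
```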
I have a trained weight matrix and would like to extract the features at each and every layer and store them in a file. How could I do that? Thanks.
Have a look at the Keras FAQ
One simple way is to create a new Model that will output the layers
that you are interested in:
from keras.models import Model
model = ... # create the original model
layer_name = 'my_layer'
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.get_layer(layer_name).output)
intermediate_output = intermediate_layer_model.predict(data)
Alternatively, you can build a Keras function that will return the
output of a certain layer given a certain input, for example:
from keras import backend as K
get_3rd_layer_output = K.function([model.layers[0].input],
                                  [model.layers[3].output])
layer_output = get_3rd_layer_output([X])[0]
Similarly, you could build a Theano and TensorFlow function directly.
Note that if your model has a different behavior in training and
testing phase (e.g. if it uses Dropout, BatchNormalization, etc.),
you will need to pass the learning phase flag to your function:
get_3rd_layer_output = K.function([model.layers[0].input, K.learning_phase()],
                                  [model.layers[3].output])
# output in test mode = 0
layer_output = get_3rd_layer_output([X, 0])[0]
# output in train mode = 1
layer_output = get_3rd_layer_output([X, 1])[0]
Then you just need to store your predictions in a file, e.g. with np.save('filename.npy', intermediate_output)
Firstly, I am sorry if the title is long. I am working on face detection using Python. I am trying to write a script that notifies the user when the same or almost the same picture/face is detected in two directories/folders.
Below is the script that I wrote so far.
import cv2
import glob, requests

def detect1():
    face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
    for img in glob.glob('/Users/Ling/Pythonfiles/Faces/*.jpg'):
        cv_img = cv2.imread(img)
        gray = cv2.cvtColor(cv_img, cv2.COLOR_BGR2GRAY)
        faces1 = face_cascade.detectMultiScale(gray, 1.3, 5)
        for (x,y,w,h) in faces1:
            cv2.rectangle(cv_img,(x,y),(x+w,y+h),(255,0,0),2)

def detect2():
    face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
    for image in glob.glob('/Users/Ling/Pythonfiles/testfolder/*.jpg'):
        cv_image = cv2.imread(image)
        gray = cv2.cvtColor(cv_image, cv2.COLOR_BGR2GRAY)
        faces2 = face_cascade.detectMultiScale(gray, 1.3, 5)
        for (x,y,w,h) in faces2:
            cv2.rectangle(cv_image,(x,y),(x+w,y+h),(255,0,0),2)

def notify():
    if detect2 == detect1:
        key = "<yourkey>"
        sandbox = "<yoursandbox>.mailgun.org"
        recipient = "<recipient's email>"
        request_url = 'https://api.mailgun.net/v2/{0}/messages'.format(sandbox)
        request = requests.post(request_url, auth=('api', key),
                                data={
                                    'from': "<sender's email>",
                                    'to': recipient,
                                    'subject': 'face detect',
                                    'text': 'common face detected'
                                })
        print('Status: {0}'.format(request.status_code))
        print('Body: {0}'.format(request.text))
There is no error, but there is no notification either. I have a folder with 10 pictures of random faces that I downloaded from Google Images (just for learning purposes) and another folder with 2 pictures of people whose faces also appear in some of the pictures in the first folder. The pictures with the same face are taken from different angles.
I wrote the script by following the tutorial at https://pythonprogramming.net/haar-cascade-face-eye-detection-python-opencv-tutorial/
and added some lines to send the notification if the program detects the same face in both folders.
My question is: how exactly do I notify the user when the same faces are detected? I believe this code is incomplete, and I am hoping someone can suggest what to add or edit, or what I should not write in this script.
Thank you in advance.
I don't know if I understand you correctly, but I think you're looking for face recognition, not only face detection.
The Haar feature-based cascade classifier has learned, in very general terms, what a face should look like. It detects the positions of a learned object/shape in a given input image and returns the bounding boxes.
So if you want to know whether the detected face matches a known face, you need to train a recognizer. OpenCV has 3 built-in face recognizers: EigenFaceRecognizer, FisherFaceRecognizer, LBPHFaceRecognizer (Local Binary Patterns Histograms Face Recognizer).
Use them with e.g. recognizer = cv2.face.createLBPHFaceRecognizer() (the OpenCV 3.1 name used in the script below; OpenCV 2.4 exposes it as cv2.createLBPHFaceRecognizer()).
You need a training set for your users. Maybe your training folder could look like:
1_001.jpg, 1_002.jpg, 1_003.jpg, 2_001.jpg 2_002.jpg, ..., n_xyz.jpg
where n is the label (user id -> unique for each user) and xyz is maybe a description or a sequence number.
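As a small illustration of that naming scheme, the label can be read back from the file name like this. The helper names label_from_filename and group_by_label are hypothetical, and the sketch assumes every file really follows the n_xyz.jpg pattern with an integer n.

```python
import os


def label_from_filename(path):
    """Return the integer label n from a file named like 'n_xyz.jpg'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    label, _, _ = stem.partition('_')
    return int(label)


def group_by_label(paths):
    """Map each label to the list of files that belong to that user."""
    groups = {}
    for path in paths:
        groups.setdefault(label_from_filename(path), []).append(path)
    return groups
```

Grouping the files this way makes it easy to check that every user has a reasonable number of training images before training the recognizer.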
Update:
I used the Faces94 benchmark dataset for testing. Therefore I packed them into the folder trainingSamples and two of them (same person but different face) into the folder testFaces relative to my python script.
To rename all images in a folder matching with the pattern above I used a bash command rename
eg. asamma.[1-20].jpg to 001_[1-20].jpg
rename 's/^asamma./001_/' *
import cv2
import numpy as np
import os

class FaceRecognizer:
    def __init__(self):
        self.cascadeClassifier = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
        self.faceRecognizer = cv2.face.createLBPHFaceRecognizer()
        if os.path.isfile('faceRecognizer.xml'):
            self.faceRecognizer.load('faceRecognizer.xml')
        else:
            images = []
            labels = []
            for file in os.listdir('trainingSamples/'):
                image = cv2.imread('trainingSamples/'+file, 0)
                images.append(image)
                labels.append(int(file.split('_')[0]))
                ## if you don't have pre-cropped profile pictures you need to detect the face first
                # faces = self.cascadeClassifier.detectMultiScale(image)
                # for (x, y, w, h) in faces:
                #     images.append(image[y:y+h, x:x+w])
                #     labels.append(int(file.split('_')[0]))
            self.faceRecognizer.train(images, np.array(labels))
            self.faceRecognizer.save('faceRecognizer.xml')

    def predict(self, image, filename):
        user, confidence = self.faceRecognizer.predict(image)
        if confidence < 100.0:
            print('found user with id {} in picture {} with a confidence of {}'.format(user, filename, confidence))
        ## if you don't have pre-cropped profile pictures you need to detect the face first
        # faces = self.cascadeClassifier.detectMultiScale(image)
        # for (x, y, w, h) in faces:
        #     user, confidence = self.faceRecognizer.predict(image[y:y+h, x:x+w])
        #     # confidence of 0.0 means perfect recognition (same images)
        #     if confidence < 100.0:
        #         print('found user with id {} in picture {} with a confidence of {}'.format(user, filename, confidence))

faceRecognizer = FaceRecognizer()
for file in os.listdir('testFaces/'):
    image = cv2.imread('testFaces/'+file, 0)
    faceRecognizer.predict(image, file)
The code produces the output:
found user with id 4 in picture 004_20.jpg with a confidence of 27.836526552656732
found user with id 1 in picture 001_6.jpg with a confidence of 22.473253497606876
So it correctly recognizes user 4 and user 1.
The code is tested with OpenCV 3.1-dev on Ubuntu 15.10 using Python 3.4.3 and Python 2.7.9.
I am attempting to extract images that are in a PDF. The file I am working with is 2+ pages. Page 1 is text and pages 2-n are images (one per page, or it may be a single image spanning multiple pages; I do not have control over the origin).
I am able to parse the text out of page 1, but when I try to get the images I get 3 images per image page. I cannot determine the image type, which makes saving them difficult. Additionally, trying to save each page's 3 pictures as a single image produces no usable result (the file cannot be opened via Finder on OS X).
Sample:
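from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTImage, LTFigure

# defined before the page loop so it exists when the loop calls it
def find_images_in_thing(outer_layout):
    for thing in outer_layout:
        if isinstance(thing, LTImage):
            save_image(thing)

fp = open('the_file.pdf', 'rb')
parser = PDFParser(fp)
document = PDFDocument(parser)
rsrcmgr = PDFResourceManager()
laparams = LAParams()
device = PDFPageAggregator(rsrcmgr, laparams=laparams)
interpreter = PDFPageInterpreter(rsrcmgr, device)
for page in PDFPage.create_pages(document):
    interpreter.process_page(page)
    pdf_item = device.get_result()
    for thing in pdf_item:
        if isinstance(thing, LTImage):
            save_image(thing)
        if isinstance(thing, LTFigure):
            find_images_in_thing(thing)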
save_image either writes a file per image in pageNum_imgNum format in 'wb' mode or a single image per page in 'a' mode. I have tried numerous file extensions with no luck.
Resources I've looked into:
http://denis.papathanasiou.org/posts/2010.08.04.post.html (outdated pdfminer version)
http://nedbatchelder.com/blog/200712/extracting_jpgs_from_pdfs.html
It's been a while since this question has been asked, but I'll contribute for the sake of the community, and potentially for your benefit :)
I've been using an image parser called pdfimages, available through the poppler PDF processing framework. It also outputs several files per image; it seems like a relatively common behavior for PDF generators to 'tile' or 'strip' the images into multiple images that then need to be pieced together when scraping, but appear to be entirely intact while viewing the PDF. The formats/file extensions that I have seen through pdfimages and elsewhere are: png, tiff, jp2, jpg, ccitt. Have you tried all of those?
Have you tried something like this?
from binascii import b2a_hex

def determine_image_type (stream_first_4_bytes):
    """Find out the image file type based on the magic number comparison of the first 4 (or 2) bytes"""
    file_type = None
    bytes_as_hex = b2a_hex(stream_first_4_bytes).decode()
    if bytes_as_hex.startswith('ffd8'):
        file_type = '.jpeg'
    elif bytes_as_hex == '89504e47':
        file_type = '.png'
    elif bytes_as_hex == '47494638':
        file_type = '.gif'
    elif bytes_as_hex.startswith('424d'):
        file_type = '.bmp'
    return file_type
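A check like this could be wired into the saving step roughly as follows. This is only a sketch: guess_extension and save_image_bytes are hypothetical names, the magic numbers are the same four as above written as bytes, and the '.bin' fallback is my own choice for streams that match none of them.

```python
import os

# Same signatures as in the function above, written as byte prefixes.
MAGIC_NUMBERS = [
    (b'\xff\xd8', '.jpeg'),
    (b'\x89PNG', '.png'),
    (b'GIF8', '.gif'),
    (b'BM', '.bmp'),
]


def guess_extension(data, default='.bin'):
    """Pick a file extension from the first bytes of the image data."""
    for magic, extension in MAGIC_NUMBERS:
        if data.startswith(magic):
            return extension
    return default


def save_image_bytes(data, stem, out_dir='.'):
    """Write the bytes to out_dir/stem + guessed extension and return the path."""
    path = os.path.join(out_dir, stem + guess_extension(data))
    with open(path, 'wb') as handle:
        handle.write(data)
    return path
```

With pdfminer the bytes would presumably come from the LTImage's stream (e.g. thing.stream.get_data()), and the stem could follow the pageNum_imgNum scheme from the question. Streams that end up as '.bin' are likely in one of the other encodings mentioned above and need separate handling.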
A (partial) solution for the image tiling problem is posted here: PDF: extracted images are sliced / tiled
I would use an image library to find the image type:
import io
from PIL import Image
image = Image.open(io.BytesIO(thing.stream.get_data()))
print(image.format)