Keras ImageDataGenerator: random transform - computer-vision

I'm interested in augmenting my dataset with random image transformations. I'm using Keras ImageDataGenerator, and I'm getting the following error when trying to apply random_transform to a single image:
--> x = apply_transform(x, transform matrix, img_channel_axis, fill_mode, cval)
>>> RuntimeError: affine matrix has wrong number of rows.
I found the source code for the ImageDataGenerator here. However, I'm not sure how to debug the runtime error. Below is the code I have:
from keras.preprocessing.image import img_to_array, load_img
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.inception_v3 import preprocess_input
image_path = './figures/zebra.jpg'
#data augmentation
train_datagen = ImageDataGenerator(
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,
fill_mode='nearest')
print "\nloading image..."
image = load_img(image_path, target_size=(299, 299))
image = img_to_array(image)
image = np.expand_dims(image, axis=0) # 1 x input_shape
image = preprocess_input(image)
train_datagen.fit(image)
image = train_datagen.random_transform(image)
The error occurs at the last line when calling random_transform.

The problem is that random_transform expects a 3D-array.
See the docstring:
def random_transform(self, x, seed=None):
"""Randomly augment a single image tensor.
# Arguments
x: 3D tensor, single image.
seed: random seed.
# Returns
A randomly transformed version of the input (same shape).
"""
So you'll need to call it before np.expand_dims.

Related

Applying Multi Otsu Threshold for my image

I have this image shown below
And, here I am trying to define the threshold to distinguish bimodal class by using the Otsu technique based on intensity and then visualise those in the histogram. So far I have written following codes:
import matplotlib.pyplot as plt
import numpy as np
from skimage import data, io, img_as_ubyte
from skimage.filters import threshold_multiotsu
# Read an image
image = io.imread("Fig_1.png")
# Apply multi-Otsu threshold
thresholds = threshold_multiotsu(image,classes=5)
# Digitize (segment) original image into multiple classes.
#np.digitize assign values 0, 1, 2, 3, ... to pixels in each class.
regions = np.digitize(image, bins=thresholds)
output = img_as_ubyte(regions) #Convert 64 bit integer values to uint8
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(10, 3.5))
# Plotting the original image.
ax[0].imshow(image, cmap='gray')
ax[0].set_title('Original')
ax[0].axis('off')
# Plotting the histogram and the two thresholds obtained from
# multi-Otsu.
ax[1].hist(image.ravel(), bins=255)
ax[1].set_title('Histogram')
for thresh in thresholds:
ax[1].axvline(thresh, color='r')
# Plotting the Multi Otsu result.
ax[2].imshow(regions, cmap='gray')
ax[2].set_title('Multi-Otsu result')
ax[2].axis('off')
plt.subplots_adjust()
plt.show()
This gives me the following result. Here As you can see Multi-Otsu result is totally black and does not show the two class of object present in the figure.
I choose classes=5 but this is bimodal hence putting classes=3 also giving me the same result.
Any advice on how to correct this? Thanks in advance.

Doing OCR to identify text written on trucks/cars or other vehicles

I am new to the world of Computer Vision.
I am trying to use Tesseract to detect numbers written on the side of trucks.
So for this example, I would like to see CMA CGM as the output.
I fed this image to Tesseract via command line
tesseract image.JPG out -psm 6
but it yielded a blank file.
Then I read the documentation of Tesserocr (python wrapper of Tesseract) and tried the following code
with PyTessBaseAPI() as api:
api.SetImage(image)
boxes = api.GetComponentImages(RIL.TEXTLINE, True)
print 'Found {} textline image components.'.format(len(boxes))
for i, (im, box, _, _) in enumerate(boxes):
# im is a PIL image object
# box is a dict with x, y, w and h keys
api.SetRectangle(box['x'], box['y'], box['w'], box['h'])
ocrResult = api.GetUTF8Text()
conf = api.MeanTextConf()
print (u"Box[{0}]: x={x}, y={y}, w={w}, h={h}, "
"confidence: {1}, text: {2}").format(i, conf, ocrResult, **box)
and again it was not able to read any characters in the image.
My question is how should I go about solving this problem? ( I am not looking for a ready made code, but approach on how to go about solving this problem).
Would I need to train tesseract with sample images or can I just write code using existing libraries to somehow detect the co-ordinates of the truck and try to do OCR only within the boundaries of the truck?
Tesseract expects document-only images, but you have non-document objects in your image. You need a sophisticated segmentation(then probably some image processing) process before feeding it to Tesseract-OCR.
I have a three-step solution
Take the part of the image you want to recognize
Apply Gaussian-blur
Apply simple-thresholding
You can use a range to get the part of the image.
For instance, if you select the
height range as: from (int(h/4) + 40 to int(h/2)-20)
width range as: from int(w/2) to int((w*3)/4)
Result
Take Part
Gaussian
Threshold
Pytesseract
CMA CGM
Code:
import cv2
import pytesseract
img = cv2.imread('YizU3.jpg')
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
(h, w) = gry.shape[:2]
gry = gry[int(h/4) + 40:int(h/2)-20, int(w/2):int((w*3)/4)]
blr = cv2.GaussianBlur(gry, (3, 3), 0)
thr = cv2.threshold(gry, 128, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
txt = pytesseract.image_to_string(thr)
print(txt)
cv2.imshow("thr", thr)
cv2.waitKey(0)

Loading files to perform Kmean using sklearn

I have 100 files that contain system call traces. Each files is presented as seen below:
setpgrp ioctl setpgrp ioctl ioctl ....
I am trying to load these files and perform kmean calculation on them to cluster them based on similarities. Based on a tutorial on the sklearn webpage I written the following:
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer
from sklearn import metrics
from sklearn.datasets import load_files
from sklearn.cluster import KMeans, MiniBatchKMeans
import numpy as np
# parse commandline arguments
op = OptionParser()
op.add_option("--lsa",
dest="n_components", type="int",
help="Preprocess documents with latent semantic analysis.")
op.add_option("--no-minibatch",
action="store_false", dest="minibatch", default=True,
help="Use ordinary k-means algorithm (in batch mode).")
op.add_option("--use-idf",
action="store_false", dest="use_idf", default=True,
help="Disable Inverse Document Frequency feature weighting.")
op.add_option("--n-features", type=int, default=10000,
help="Maximum number of features (dimensions)"
" to extract from text.")
op.add_option("--verbose",
action="store_true", dest="verbose", default=False,
help="Print progress reports inside k-means algorithm.")
print(__doc__)
op.print_help()
(opts, args) = op.parse_args()
if len(args) > 0:
op.error("this script takes no arguments.")
sys.exit(1)
print("Loading training data:")
trainingdata = load_files('C:\data\Training data')
print("%d documents" % len(trainingdata.data))
print()
print("Extracting features from the training trainingdata using a sparse vectorizer")
if opts.use_idf:
vectorizer = TfidfVectorizer(input="file",min_df=1)
X = vectorizer.fit_transform(trainingdata.data)
print("n_samples: %d, n_features: %d" % X.shape)
print()
if opts.n_components:
print("Performing dimensionality reduction using LSA")
# Vectorizer results are normalized, which makes KMeans behave as
# spherical k-means for better results. Since LSA/SVD results are
# not normalized, we have to redo the normalization.
svd = TruncatedSVD(opts.n_components)
lsa = make_pipeline(svd, Normalizer(copy=False))
X = lsa.fit_transform(X)
explained_variance = svd.explained_variance_ratio_.sum()
print("Explained variance of the SVD step: {}%".format(
int(explained_variance * 100)))
print()
However it seems that none of the files in the dataset directory get loaded into the memory when though all files are available. I get the following error when executing the program:
raise ValueError("empty vocabulary; perhaps the documents only"
ValueError: empty vocabulary; perhaps the documents only contain stop words
Can anyone tell me why the dataset is not being loaded? What am I doing wrong?
I finally managed to load the files. The approach to use Kmean in sklearn is to vectorize the training data (using tfidf or count_vectorizer), then transform your test data using the vectorization of your training data. Once that is done you can initialize the Kmean parameters, use the training data set vectors to create the kmean cluster. Finally you can cluster your test data around your training data centroid.
The following code does what is explained above.
#Read the data in a directory:
def readfile(dataDir):
data_set = []
for file in os.listdir(dataDir):
trainingfiles = os.path.join(dataDir, file)
if os.path.isfile(trainingfiles):
data = open(trainingfiles, 'r')
dataread=str.decode(data.read())
data_set.append(dataread)
return data_set
#fitting tfidf transfrom for training data
tfidf_vectorizer_trainingset = tfidf_vectorizer.fit_transform(readfile(trainingdataDir)).toarray()
#transform the test set based on the training set
tfidf_vectorizer_testset = tfidf_vectorizer.transform(readfile(testingdataDir)).toarray()
# Kmean Clustering parameters
kmean_parameters = KMeans(n_clusters=number_of_clusters, init='k-means++', max_iter=100, n_init=1)
#Cluster the training data based on the parameters
KmeanAnalysis_training = kmean_parameters.fit(tfidf_vectorizer_trainingset)
#transform the test data based on the clustering of the training data
KmeanAnalysis_test = kmean_parameters.transform(tfidf_vectorizer_testset)

error while detecting features with SIFT on training set

I have a set of images split in train and test and I'm trying to detect features using SIFT from the train set.
The problem is that with my code I'm getting:
TypeError: image is not a numpy array, neither a scalar
Here's my code:
import glob
from cv2 import SIFT
import numpy as np
#creating a list of images
images = []
for infile in glob.glob('path'):
images.append(infile)
np.random.shuffle(images)
my_set = images
#splitting my set in test and train parts
train = my_set[:120]
test = my_set[120:]
#get descriptors of train part
for image in train:
SIFT().detect(image)
I've tried to change the variables train and test like this:
train = np.array(my_set[:120])
but I get the same error.

scikit-learn PCA doesn't have 'score' method

I am trying to identify the type of noise based on that article:
Model selection with Probabilistic (PCA) and Factor Analysis (FA)
I am using scikit-learn-0.14.1.win32-py2.7 on win8 64bit
I know that it refers on version 0.15, however at the version 0.14 documentation it mentions that the score method is available for PCA so I guess it should normally work:
sklearn.decomposition.ProbabilisticPCA
The problem is that no matter which PCA I will use for the *cross_val_score*, I always get a type error message saying that the estimator PCA does not have a score method:
*TypeError: If no scoring is specified, the estimator passed should have a 'score' method. The estimator PCA(copy=True, n_components=None, whiten=False) does not.*
Any ideas why is that happening?
Many thanks in advance
Christos
X has 1000 samples of 40 features
here is a portion of the code:
import numpy as np
import csv
from scipy import linalg
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.cross_validation import cross_val_score
from sklearn.grid_search import GridSearchCV
from sklearn.covariance import ShrunkCovariance, LedoitWolf
#read in the training data
train_path = '<train data path>/train.csv'
reader = csv.reader(open(train_path,"rb"),delimiter=',')
train = list(reader)
X = np.array(train).astype('float')
n_samples = 1000
n_features = 40
n_components = np.arange(0, n_features, 4)
def compute_scores(X):
pca = PCA()
pca_scores = []
for n in n_components:
pca.n_components = n
pca_scores.append(np.mean(cross_val_score(pca, X, n_jobs=1)))
return pca_scores
pca_scores = compute_scores(X)
n_components_pca = n_components[np.argmax(pca_scores)]
Ok, I think I found the problem. it is not working with PCA, but it does work with PPCA
However, by not providing a cv number the cross_val_score automatically sets 3-fold cross validation
that created 3 sets with sizes 334, 333 and 333 (my initial training set contains 1000 samples)
Since nympy.mean cannot make a comparison between sets with different sizes (334 vs 333), python rises an exception.
thx