Python TensorFlow: summarize two histograms together in TensorBoard

I was wondering if there is a way to summarize two histograms together in TensorFlow and get something resembling the behavior of tf.summary.histogram. The reason is that I need to summarize batch logits for two different operations, and I need the histograms to be "superimposed" so that I can compare their dynamics relative to one another during training by looking only at the log files in TensorBoard.
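There is no built-in way to draw two histograms in a single chart, but one common workaround is to log both tensors under the same tag through two FileWriters pointing at different run directories; TensorBoard then groups the two runs under that one tag so they can be compared directly. A minimal TF1-style sketch (the logit values here are random stand-ins for the two operations' outputs):

import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x

# One histogram op, fed with whichever batch of logits we want to log
values = tf.placeholder(tf.float32, [None], name="logit_values")
hist_op = tf.summary.histogram("batch_logits", values)

# One writer per "run"; TensorBoard groups runs that share a tag
writer_a = tf.summary.FileWriter("logs/op_a")
writer_b = tf.summary.FileWriter("logs/op_b")

with tf.Session() as sess:
    for step in range(100):
        logits_a = np.random.randn(640)               # stand-in for op A's batch logits
        logits_b = np.random.randn(640) + step / 50.  # stand-in for op B's batch logits
        writer_a.add_summary(sess.run(hist_op, {values: logits_a}), step)
        writer_b.add_summary(sess.run(hist_op, {values: logits_b}), step)

writer_a.close()
writer_b.close()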

Related

How should samples be included in principal component analysis?

I intend to use PCA to identify the sources of contaminants in environmental samples. We have data from both environmental samples and suspected sources. We want to use PCA to check which sources have a chemical composition similar to the environmental samples and therefore could be the primary sources; the similarity in composition is depicted using the score plot. I can do the PCA in two ways: the first method is to include the data from both the samples and the sources and run a single PCA; the second method is to include only the sample data in the PCA and then use the fitted model to predict the scores for the sources. I do not know which method is correct, or whether both methods have limitations or restrictions such that the choice depends on the dataset.
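For concreteness, a minimal scikit-learn sketch of the two methods (the arrays here are random stand-ins for the chemical composition data):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
samples = rng.rand(50, 8)  # environmental samples x chemical variables
sources = rng.rand(5, 8)   # suspected sources x the same variables

# Method 1: pool samples and sources, fit one PCA on everything
pca_pooled = PCA(n_components=2)
pooled_scores = pca_pooled.fit_transform(np.vstack([samples, sources]))

# Method 2: fit PCA on the samples only, then project the sources
pca_samples = PCA(n_components=2).fit(samples)
sample_scores = pca_samples.transform(samples)
source_scores = pca_samples.transform(sources)  # scores in the samples' PC space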

Can we give the model the test data without labelling it?

I came across this snippet in the TensorFlow documentation, in MNIST For ML Beginners.
eval_data = mnist.test.images # Returns np.array
eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
Now I want to feed in my own test images without labelling them, and I would like the model to predict the labels. How do I achieve this?
Yes you can, but it would not be deep learning; it would be clustering (e.g. k-means clustering).
The basic idea is the following:
Create two placeholders, one for the input points and one for the centroids
Decide on a distance metric
Create the graph
Feed only the dataset (no labels) to run the graph, as in the sketch below
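A minimal TF1-style sketch of those steps (it assumes mnist has been loaded as in the tutorial; K, the number of clusters, is a choice, and the centroid update is done in NumPy for brevity):

import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x

K = 10  # e.g. one cluster per digit

# Step 1: placeholders for the input points and the current centroids
points = tf.placeholder(tf.float32, [None, 784])
centroids = tf.placeholder(tf.float32, [K, 784])

# Steps 2-3: squared Euclidean distance, nearest-centroid assignment
diff = tf.expand_dims(points, 1) - tf.expand_dims(centroids, 0)  # [N, K, 784]
sq_dist = tf.reduce_sum(tf.square(diff), axis=2)                 # [N, K]
assignments = tf.argmin(sq_dist, axis=1)                         # [N]

# Step 4: feed only the (unlabelled) dataset and iterate
data = mnist.test.images  # assumes mnist is loaded as in the tutorial
cents = data[np.random.choice(len(data), K, replace=False)]
with tf.Session() as sess:
    for _ in range(20):  # a few Lloyd iterations
        assign = sess.run(assignments, {points: data, centroids: cents})
        cents = np.array([data[assign == k].mean(axis=0)
                          if np.any(assign == k) else cents[k]
                          for k in range(K)])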

How to apply different ImageDataGenerators to different batches in Keras?

The training data is read from two .npy files. Say train_set is regarded as X and train_label is regarded as Y; therefore, it is not a multiple-input case. My task requires augmenting the image patches in different manners, so how can I define a different ImageDataGenerator for each patch? Although there could be a lot of patches, I use 3 patches as an example:
from keras.preprocessing.image import ImageDataGenerator

# one generator per patch, each with its own rotation range
datagen1 = ImageDataGenerator(rotation_range=20)  # patch 1
datagen2 = ImageDataGenerator(rotation_range=40)  # patch 2
datagen3 = ImageDataGenerator(rotation_range=60)  # patch 3
How do I apply the different generators to their respective patches, and how should I use model.fit(...) or model.fit_generator(...) in the described scenario?
Also, is there a way to rotate the images by one particular degree instead of a range?
Thanks!
I didn't do it myself, but I think one approach is to use the first datagen and pass the first group of training data to fit_generator with the chosen number of epochs. Then save the weights, switch to the second datagen and the second group, and call fit_generator again; you will need to set initial_epoch and load the saved weights. To generalize, what you need to do is resume training with the second datagen; see https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model. A rough sketch follows.
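A rough sketch of that two-stage idea (model, the patch arrays X1/Y1 and X2/Y2, and the epoch counts are assumptions standing in for your own setup):

from keras.preprocessing.image import ImageDataGenerator

datagen1 = ImageDataGenerator(rotation_range=20)
datagen2 = ImageDataGenerator(rotation_range=40)

# Stage 1: train on the first patch group with the first generator
model.fit_generator(datagen1.flow(X1, Y1, batch_size=32),
                    steps_per_epoch=len(X1) // 32, epochs=10)
model.save_weights('stage1.h5')

# Stage 2: reload the weights and resume with the second generator
model.load_weights('stage1.h5')
model.fit_generator(datagen2.flow(X2, Y2, batch_size=32),
                    steps_per_epoch=len(X2) // 32,
                    epochs=20, initial_epoch=10)

As for rotating by one particular degree instead of a range, one option is to pass a fixed-angle function as preprocessing_function:

from scipy.ndimage import rotate

def rotate_30(img):
    # fixed 30-degree rotation instead of a random range
    return rotate(img, angle=30, reshape=False)

datagen_fixed = ImageDataGenerator(preprocessing_function=rotate_30)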

Voted Perceptron on Weka

I am trying to run the "Voted Perceptron" on the iris dataset using Weka. However, when I load the data, Voted Perceptron refuses to run, although it runs fine on many other datasets such as ionosphere.arff and diabetes.arff.
Please help.
That is because VotedPerceptron only works on binary classification datasets, i.e. datasets whose class attribute has exactly two values. iris.arff has three classes, while diabetes.arff and ionosphere.arff only have two.
If you want it to work on iris, you'll have to remove every instance of one of iris.arff's three classes (for example with Weka's unsupervised RemoveWithValues instance filter).

SGDClassifier with HashingVectorizer and TfidfTransformer

I would like to understand if it is possible to train an online SGDClassifier (with partial_fit) using HashingVectorizer and TfidfTransformer. Simply joining them in a Pipeline will not work as TfidfTransformer is stateful so that would break the online learning process. This post says it's not possible to use tf-idf in an online fashion but a comment on this post suggests that it may somehow be possible: "In particular if you use stateful transformers as TfidfTransformer you will need to do several passes on your data". Is that possible without loading the whole training set into memory? If so, how? If not, is there an alternative solution to combine HashingVectorizer with tf-idf on large datasets?
Is that possible without loading the whole training set into memory?
No. TfidfTransformer needs to have the entire X matrix in memory. You'll need to roll your own tf-idf estimator, use that to compute per-term document frequencies in one pass over the data, then do another pass to produce tf-idf features and fit a classifier to them.
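A minimal sketch of that two-pass scheme (iter_batches is a hypothetical generator yielding (texts, labels) chunks; the idf smoothing follows TfidfTransformer's default formula):

import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

n_features = 2 ** 20
vec = HashingVectorizer(n_features=n_features, norm=None,
                        alternate_sign=False)  # raw term counts (non_negative in older versions)

# Pass 1: accumulate per-term document frequencies batch by batch
df = np.zeros(n_features)
n_docs = 0
for texts, _ in iter_batches():        # hypothetical batch iterator
    X = vec.transform(texts)
    df += (X > 0).sum(axis=0).A1       # count docs containing each hashed term
    n_docs += X.shape[0]

idf = np.log((1.0 + n_docs) / (1.0 + df)) + 1.0  # smoothed idf

# Pass 2: scale counts by idf and fit the classifier incrementally
clf = SGDClassifier(loss='log')
classes = np.array([0, 1])             # all labels must be known up front
for texts, labels in iter_batches():
    X_tfidf = vec.transform(texts).multiply(idf).tocsr()  # tf * idf
    clf.partial_fit(X_tfidf, labels, classes=classes)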