What is a Pipeline in scikit-learn? - python-2.7

I'm studying scikit-learn, but I can't understand how a Pipeline actually works. Here's the example from my study material. How does the Pipeline work here?

A Pipeline lets you perform multiple steps by chaining them together as a single estimator.
In your case, TfidfVectorizer() first generates feature vectors from the raw text, and those vectors are then "piped" to the naive Bayes classifier as training data.
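A minimal sketch of the kind of pipeline described above (the step names and the tiny training set here are made up for illustration, not taken from the original example):

from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# hypothetical training documents and labels
docs = ["free money now", "meeting at noon", "win a prize", "lunch tomorrow"]
labels = ["spam", "ham", "spam", "ham"]

# step 1 turns raw text into TF-IDF feature vectors,
# step 2 trains a naive Bayes classifier on those vectors
pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("nb", MultinomialNB()),
])

# fit() runs fit_transform() on every step except the last,
# then fit() on the final estimator with the transformed output
pipe.fit(docs, labels)

# predict() applies the same transform before classifying
print(pipe.predict(["claim your free prize"]))

This is why the vectorizer's output becomes the classifier's training data: the pipeline passes each step's transformed result on to the next step automatically.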

Related

AI Platform Built-in Image Classification Algorithm doesn't export a model at end of training

I've been training with the new AI Platform Built-in Image Classification Algorithm. Often, despite the training job completing successfully, no saved model is output to the gs:// job directory. There are no errors in the logs; the only indication that something is wrong is the absence of the following lines:
Performing best model export
SavedModel written to: jobDirSubDir/saved_model.pb
Export best SavedModel from jobDirSubDir to jobDirSubDir/model
The job simply completes after the final evaluation is finished.
Any hints on how to troubleshoot this built-in algorithm would be greatly appreciated. Or, if it's open source, please point me at the correct repo.
Thanks

How to do transfer learning in darknet for YoloV3

I want to do transfer learning with YOLOv3 in Darknet: start from the pre-trained YOLOv3 model that was trained on the COCO dataset, then further train it on my own dataset to detect additional objects. What steps should I follow? How can I label my data so that it can be used in Darknet? Please help me; this is the first time I've used Darknet and YOLO.
It's all explained here: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
Note that your annotations must be consistent. Any object left unannotated will result in poor learning and therefore poor predictions. A sketch of the expected label format follows below.
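For labeling, AlexeyAB's repo expects one .txt file per image (sharing the image's base name), with one line per object of the form "<class_id> <x_center> <y_center> <width> <height>", all normalized to the image dimensions. As a rough illustration, here is a small Python sketch that converts a pixel-coordinate box to that format (the file name and box values are hypothetical):

# convert a pixel-space bounding box to darknet's YOLO label format:
# "<class_id> <x_center> <y_center> <width> <height>", normalized to [0, 1]
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / float(img_w)
    height = (y_max - y_min) / float(img_h)
    return "%d %.6f %.6f %.6f %.6f" % (class_id, x_center, y_center, width, height)

# hypothetical box for class 0 in a 640x480 image;
# the label file must match the image name, e.g. img001.txt for img001.jpg
with open("img001.txt", "w") as f:
    f.write(to_yolo_line(0, 100, 150, 300, 400, 640, 480) + "\n")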
This question was answered in "Fine-tuning and transfer learning by the example of YOLO".
The answer given by gameon67 suggests the following:
If you are using AlexeyAB's darknet repo (not darkflow), he suggests doing fine-tuning instead of transfer learning by setting this param in the cfg file: stopbackward=1
Then run:
./darknet partial yourConfigFile.cfg yourWeightsFile.weights outPutName.LastLayer# LastLayer#
such as:
./darknet partial cfg/yolov3.cfg yolov3.weights yolov3.conv.81 81
This will create yolov3.conv.81 and freeze the lower layers; you can then train using the weights file yolov3.conv.81 instead of the original darknet53.conv.74.
References: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection

Caffe Batch processing no speedup

I would like to speed up the forward pass of CNN classification using Caffe.
I have tried batch classification in Caffe using the code provided here:
Modifying the Caffe C++ prediction code for multiple inputs
This solution lets me pass a vector of Mat, but it does not speed anything up, even though the input layer is modified.
I am processing pretty small images (3x64x64) on a powerful PC with two GTX 1080s, and there is no issue in terms of memory.
I also tried changing the deploy.prototxt, but I get the same result.
It seems that at some point the forward pass of the CNN becomes sequential.
Someone else has pointed this out here as well:
Batch processing mode in Caffe - no performance gains
Another similar thread, for Python: batch size does not work for caffe with deploy.prototxt
I have seen some mentions of MemoryDataLayer, but I am not sure it will solve my problem.
So I am somewhat lost on what to do exactly... does anyone have any information on how to speed up classification time?
Thanks for any help!
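For reference, batching through the deploy net's input blob (the approach the Python thread above deals with) looks roughly like this in pycaffe. This is only a sketch: the prototxt/caffemodel paths, the blob names "data" and "prob", and the random input are all placeholders:

import numpy as np
import caffe

caffe.set_mode_gpu()  # run on the GPU; use set_mode_cpu() otherwise
caffe.set_device(0)

# model definition and weights paths are hypothetical
net = caffe.Net("deploy.prototxt", "model.caffemodel", caffe.TEST)

batch_size = 64
# resize the input blob to hold the whole batch (N, C, H, W)
net.blobs["data"].reshape(batch_size, 3, 64, 64)
net.reshape()

# fill the batch with preprocessed images (stand-in random data here)
images = np.random.rand(batch_size, 3, 64, 64).astype(np.float32)
net.blobs["data"].data[...] = images

# a single forward pass classifies the whole batch at once
out = net.forward()
probs = out["prob"]  # per-image class probabilities, assuming an output blob named "prob"

If the forward pass still scales linearly with batch size despite this, the bottleneck is likely inside the network's layer implementations rather than in how the input is fed.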

What is the difference between a Job and a Transformation?

When creating new objects in Spoon there are two possibilities: Job and Transformation. They have different sets of available components (although with some overlap), and the XML that is generated looks very similar. What is the difference between the two?
This is what I had the most trouble understanding when I started with Pentaho as well.
A job has a single start point and executes one step at a time, with one flow through the steps.
A transformation has many possible start points, and all of its steps execute in parallel. If a step has a step before it, it takes the data coming from that step and uses it.
In my own use, I usually schedule jobs, which in turn run transformations that fetch and transform data.
This is a common question, so the answer is in the Pentaho FAQ:
Q: In Spoon I can make jobs and transformations, what's the difference between the two?
A: Transformations are about moving and transforming rows from source to target. Jobs are more about high level flow control: executing transformations, sending mails on failure, transferring files via FTP, ...
Another key difference is that all the steps in a transformation execute in parallel, but the steps in a job execute in order.

Read svm data and retrain with more data?

I am implementing facial expression recognition and am using an SVM to classify a given expression.
When I train, I use:
svm.train(myFeatureVector, myLabels, Mat(), Mat(), myParameters);
svm.save("myClassifier.yml");
Later, when I predict, I use:
response = svm.predict(incomingFeatureVector);
But when I want to train more than once (exiting the program and starting again), it seems to overwrite my previous SVM file. Is there any way to read the previous SVM file, add more data to it, and then resave it? I looked at the OpenCV documentation and found nothing. However, on this page there is a method called CvSVM::read; I don't know what it does or how to use it.
Hope anyone can help me :(
What you are trying to do is incremental learning, but unfortunately the Support Vector Machine is a batch algorithm: if you want to add more data, you have to retrain on the whole set again.
There are online learning alternatives, like Pegasos SVM, but I am not aware of any that are implemented in OpenCV.
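To illustrate what online learning looks like outside OpenCV, here is a minimal sketch using scikit-learn's SGDClassifier with hinge loss (a linear SVM trained by stochastic gradient descent, in the same family as Pegasos). The feature dimensions, class count, and random data are all hypothetical:

import numpy as np
from sklearn.linear_model import SGDClassifier

# hinge loss makes SGDClassifier behave as a linear SVM trained online
clf = SGDClassifier(loss="hinge")

# first batch of (hypothetical) 128-dim feature vectors and labels;
# all classes must be declared up front on the first partial_fit call
X_first = np.random.rand(100, 128)
y_first = np.random.randint(0, 7, size=100)  # e.g. 7 expression classes
clf.partial_fit(X_first, y_first, classes=np.arange(7))

# later, in another session, update with new data instead of retraining
X_more = np.random.rand(50, 128)
y_more = np.random.randint(0, 7, size=50)
clf.partial_fit(X_more, y_more)

prediction = clf.predict(np.random.rand(1, 128))

The trained classifier can be persisted between sessions (for example with joblib) and updated again with further partial_fit calls, which is the behavior the question is after.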