Is it possible to train a tensorflow production model? - c++

If I build a model, train it, and then deploy it, can I set it up to keep training on data at runtime? For example, I want a net that trains on a constant stream of input until I stop it and test it. Would I have to implement that by talking to the protobuf in C++?

The practical problem with neural networks in production is that you train on known outputs, but you deploy them in order to produce outputs you don't already know. That usually precludes in-production updates.
Yet there's no magic involved. If in production you can still obtain the desired output for a given input (even in hindsight), then you can backpropagate the resulting error term and adjust the network weights.
There's an additional challenge here: if you train the network in production, what data do you intend to train on? You can't train on just the first few samples from the field, as you'd badly overfit to them. So you'll need to ship the initial training set with the deployed solution and expand on it.
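As a rough illustration of that idea, here is a minimal Python sketch. It uses a single logistic unit instead of a full network, and the function names (`sgd_step`, `update_in_production`) and the `replay` parameter are made up for this example; nothing here is TensorFlow API. Each time a sample from the field gets its desired output, it is added to the stored training set and the model takes gradient steps on that sample plus a random draw from the stored set, so the weights are not pulled only toward the newest data.

```python
import math
import random

def predict(weights, bias, x):
    """Forward pass of a single logistic unit: probability of class 1."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def sgd_step(weights, bias, x, y, lr=0.1):
    """One gradient step on a single (input, desired output) pair."""
    error = predict(weights, bias, x) - y  # gradient of the log-loss w.r.t. z
    new_weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, bias - lr * error

def update_in_production(weights, bias, new_sample, training_set, replay=8, rng=random):
    """Store the newly labelled sample, then train on it together with a
    random draw from the stored set (hypothetical helper for this sketch)."""
    training_set.append(new_sample)
    batch = [new_sample] + rng.sample(training_set, min(replay, len(training_set)))
    for x, y in batch:
        weights, bias = sgd_step(weights, bias, x, y)
    return weights, bias
```

With a real network the same loop applies, only `sgd_step` becomes one optimizer step on the mixed batch.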

Related

Tensorflow.js constant retraining

I have an application where we gather and classify images triggered by motion detection. We have a model trained on a lot of images that works OK. I have converted it to TF.js format and am able to make predictions in the browser. So far so good.
However, we have cameras at many different locations, the lighting and surroundings vary from location to location, and we put up new cameras each year. So we would need to retrain the model often, and I am also afraid that the model will be too generic and not very accurate at any specific location.
All data we gather from the motion detection is uploaded to our server, and we use a web interface to classify the images ("false positive", "positive", etc.) and store everything in a MySQL database.
The best solution, I think, would be to have a generic model trained on a lot of data. This model would be deployed at each specific location. Then, while we manually interpret each image as we normally would, we would retrain the generic model so that it becomes specific to each location.
To do this, we have to serve the models on our server or on some host and be able to write to them, since many different people interpret the data in different browsers and on different computers.
Would this be possible, and would it be a good solution? I would love some input before I invest more time in this. I haven't found much information about serving writable models and reinforcement learning in TensorFlow.js.
So I was wondering whether it is possible to serve a TensorFlow.js model on our server that was trained on our data, and have the model "relearn" from the new image after every manual interpretation.

How can I make a prediction in data mining?

I have an Excel file containing the following data.
I want to apply the k-nearest neighbor classifier to it in Weka.
How can I make a prediction for a new instance?
How can I set the attributes of this instance to obtain a prediction for it?
I don't think you have enough data to work with here. Your model will be wildly inaccurate. If you are just starting with machine learning, I would recommend the Iris data set. I started with machine learning here.
If you want to start with Weka, I would use a dataset from researchers, like the MNIST database of handwritten digits, which can be found here, with a guide for it in Python here. On the same site there is a tutorial for the Weka GUI, if you look hard enough.
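To make clear what "predicting a new instance" means for k-nearest neighbors (Weka's k-NN classifier is called IBk), here is a minimal plain-Python sketch. This is not Weka code, and `knn_predict` is just a name chosen for the illustration. The new instance gets the same attributes as the training rows, only without a class label, and the prediction is the majority class among its k closest training rows.

```python
import math
from collections import Counter

def knn_predict(training_rows, new_instance, k=3):
    """Classify new_instance by majority vote among its k nearest training rows.

    training_rows: list of (feature_vector, label) pairs.
    new_instance:  feature vector with the same attributes, label unknown.
    """
    # Sort the training rows by Euclidean distance to the new instance.
    by_distance = sorted(training_rows, key=lambda row: math.dist(row[0], new_instance))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

With only a handful of rows, as in the question, a single odd neighbor can flip the vote, which is why more data matters.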

Specific topics on Tensorflow for CNN

I have a mini project for my new course in TensorFlow this semester, with a free choice of topic. Since I have some background in Convolutional Neural Networks, I intend to use one for my project. My computer can only run the CPU version of TensorFlow.
However, as a newbie, I realize that there are a lot of topics, such as MNIST, CIFAR-10, etc., so I don't know which one I should pick. I only have two weeks left. It would be great if the topic were not too complicated but also not too easy, to match my intermediate level.
In your experience, could you give me some advice about the specific topic I should choose for my project?
Moreover, it would be better if the topic let me provide my own data to test the trained model, because my professor said that this is a plus point towards an A grade for the project.
Thanks in advance,
I think that to answer this question you need to properly evaluate the marking criteria for your project. However, I can give you a brief overview of what you've just mentioned.
MNIST: MNIST is an optical character recognition task for individual digits 0-9 in 28px-square images. This is considered the "Hello World" of CNNs. It's pretty basic and might be too simplistic for your requirements. Hard to gauge without more information. Nonetheless, it will run pretty quickly with CPU TensorFlow, and the online tutorial is pretty good.
CIFAR-10: CIFAR is a much bigger dataset of objects and vehicles. The images are 32px square, so processing an individual image isn't too bad. But the dataset is very large and your CPU might struggle with it. It takes a long time to train. You could try training on a reduced dataset, but I don't know how that would go. Again, it depends on your course requirements.
Flowers-Poets: There is the TensorFlow for Poets retraining example, which might not be suitable for your course. Alternatively, you could use the flowers dataset to build your own model.
Build-your-own-model: You could use tf.layers to build your own network and experiment with it. tf.layers is pretty easy to use. Alternatively, you could look at the new Estimators API, which will automate a lot of the training process for you. There are a number of tutorials (of varying quality) on the TensorFlow website.
I hope that helps give you a run-down of what's out there. Other datasets to look at are PASCAL VOC and ImageNet (however, they are huge!). Models to experiment with may include VGG-16 and AlexNet.
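On the reduced-dataset idea mentioned for CIFAR-10: one simple way to shrink a labelled image set for CPU training is to keep the same number of randomly chosen images per class, so no class disappears from the subset. A minimal sketch in plain Python follows; `stratified_subset` and its parameters are names invented for this example, and loading the images is left out.

```python
import random
from collections import defaultdict

def stratified_subset(samples, per_class, seed=0):
    """Return up to per_class randomly chosen samples for each label.

    samples: list of (image, label) pairs; the image can be any object.
    """
    rng = random.Random(seed)
    # Group the samples by their class label.
    by_label = defaultdict(list)
    for image, label in samples:
        by_label[label].append((image, label))
    # Draw the same number of samples from every class.
    subset = []
    for label in sorted(by_label):
        group = by_label[label]
        subset.extend(rng.sample(group, min(per_class, len(group))))
    rng.shuffle(subset)  # avoid feeding the network one class at a time
    return subset
```

How small the subset can get before accuracy suffers is something you would have to measure; that could itself be part of the project.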

machine learning for any cancer diagnosis on image dataset with python

I am working on this project, assigned by my university as a final project. The issue is that I am not getting any help from the internet, so I thought asking here might solve it. I have read many articles, but they had no code or guidance, and I am confused about what to do. Basically, it is image processing work with machine learning. A dataset can be found easily; the issue is the Python machine learning algorithm and code.
I presume that if it's your final project, you have to create the program yourself rather than ripping it straight from the internet. If you want a good starting point which you can customise, TensorFlow from Google is very good. You'll want to understand how it works (i.e. how machine learning works), but as a first step there's a good example of image processing on the website in the form of digit recognition (which is also the "Hello World" of machine learning).
https://www.tensorflow.org/get_started/mnist/beginners
This also provides a good intro to machine learning with neural nets: https://www.youtube.com/watch?v=uXt8qF2Zzfo
One note on TensorFlow: you'll probably have to use Python 3.5+, as in my experience it can be difficult to get it working on 2.7.
First of all, I need to know what type of data you are using, because depending on whether it is an MRI, PET, or CT scan, there could be different suggestions for using machine learning in Python for detection.
However, supposing your main dataset consists of MR images, I am attaching an article which I found to be a great overview of different methods:
This project compares four different machine learning algorithms: Decision Tree, Majority, Nearest Neighbors, and Best Z-Score (an algorithm of my own design that is a slight variant of the Naïve Bayes algorithm)
https://users.soe.ucsc.edu/~karplus/abe/Science_Fair_2012_report.pdf
Here, breast cancer and colorectal cancer have been considered and the algorithms that performed best (Best Z-Score and Nearest Neighbors) used all features in classifying a sample. Decision Tree used only 13 features for classifying a sample and gave mediocre results. Majority did not look at any features and did worst. All algorithms except Decision Tree were fast to train and test. Decision Tree was slow, because it had to look at each feature in turn, calculating the information gain of every possible choice of cutpoint.
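To see why evaluating "every possible choice of cutpoint" is slow, here is a minimal Python sketch of that search for a single numeric feature. It is a generic textbook version, not the code from the linked report, and the function names are chosen for this example. A decision tree repeats this search for every feature at every node.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_cutpoint(values, labels):
    """Try every midpoint between consecutive distinct feature values and
    return (cutpoint, information_gain) for the best binary split."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cutpoint between equal values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Weighted entropy of the two sides after splitting at this cutpoint.
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (cut, gain)
    return best
```

For example, values `[1.0, 2.0, 3.0, 4.0]` with labels benign, benign, malignant, malignant split perfectly at 2.5, for a gain of one bit.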
My solution:
The Lung Image Database Consortium provides an open-access dataset of lung cancer images.
Download it, then apply any machine learning algorithm to classify whether the images contain tumor cells or not.
I attached a link to a reference paper. They applied a neural network to classify the images.
For the coding part, use Python with OpenCV for image pre-processing and segmentation.
For the classification part, use any machine learning library you are comfortable with (TensorFlow, Keras, Torch, scikit-learn, and many more) and perform classification with whichever algorithm performs best for you.
That's it.
Link for Reference Journal

Attribute selection weka during test data classification

Background:
I am using the KDD99 data set with the Weka library to predict IDS attacks.
Training works fine; the attack prediction is based on around 42 features. But in a real-time environment, when I use a sniffer to capture packets, I may not be able to fetch all 42 features from a packet, nor would that be required. I would be getting around 10 features.
I am new to data mining and the Weka library.
Now the problem is that I would have used all 42 features from the training data set to train the network, and then I have only 10 features in the test data.
Do I need to train the network with only the 10 features that are going to be captured, or is there a way to train the network with 42 features and ask it to consider only those 10 during classification? Is there a way to do attribute selection while classifying data?
Can anyone share a Java code snippet if there is a solution?
The warning about KDD99 being outdated is useful, and many thanks for it, but I am still wondering: what if I have fewer features in the test data than in the training data? What would be the ideal way to solve this in Weka?
Thanks in advance.
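For reference, the first option raised above (training only on the attributes the sniffer can supply) amounts to dropping the other columns from the training set before building the classifier, so that training and live data share one schema. Below is a minimal Python sketch of that projection. It is not Weka or Java code (in Weka, the `Remove` attribute filter plays this role), and the column names and rows used to try it are placeholders, not a claim about which attributes a sniffer can actually capture.

```python
def project(rows, all_names, kept_names):
    """Keep only the columns named in kept_names, in that order.

    rows:       list of feature rows, each ordered like all_names.
    all_names:  attribute names of the full training set.
    kept_names: attribute names that are also available at capture time.
    """
    indices = [all_names.index(name) for name in kept_names]
    return [[row[i] for i in indices] for row in rows]
```

The classifier is then trained on the projected rows, and each captured packet is turned into a row with the same `kept_names` order before classification.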