My data is a 19*793 matrix, and I want to learn the structure of a Bayesian network from it in Weka. I've run into trouble: my data consists only of the speeds of freeway segments, and Weka doesn't seem able to deal with it. How can I cope with this?
I have a large dataset, which makes my lmdb huge. For 16,000 samples my database is already 20 GB, but in total I have 800,000 images, which would end up as a huge amount of data. Is there any way to compress an lmdb? Or is it better to use HDF5 files? I would like to know whether anyone knows the best solution for this problem.
If you look inside the ReadImageToDatum function in io.cpp, you'll see that it can keep the image in either compressed (JPG/PNG) format or raw format. To use the compressed format, encode the loaded image with cv::imencode, set the datum's data to the encoded bytes, and set the encoded flag. Then you can store the datum in lmdb.
There are various techniques to reduce input size, but much of that depends on your application. For instance, the ILSVRC-2012 data set images can be resized to about 256x256 pixels without nasty effects on the training time or model accuracy. This reduces the data set from 240 GB to 40 GB. Can your data set tolerate a loss of fidelity from simple "physical" compression? How small does the data set need to be?
I'm afraid that I haven't worked with HDF5 files enough to have an informed opinion.
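To put rough numbers on the questions above, here is a back-of-the-envelope sketch in Python. It assumes the lmdb grows linearly with the sample count and that resized images would be stored raw as 8-bit, 3-channel arrays; both function names are made up for illustration, and database overhead is ignored.

```python
def projected_size_gb(current_gb, current_samples, total_samples):
    """Extrapolate database size, assuming size grows linearly with sample count."""
    return current_gb / current_samples * total_samples


def raw_size_gb(num_images, height, width, channels=3, bytes_per_value=1):
    """Size of uncompressed pixel data in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return num_images * height * width * channels * bytes_per_value / 1e9


# Figures taken from the question: 16,000 samples take 20 GB, 800,000 images in total.
projected = projected_size_gb(20, 16_000, 800_000)

# Assumption: every image is resized to 256x256 RGB and stored raw.
resized = raw_size_gb(800_000, 256, 256)

print(f"projected at current per-sample size: {projected:.0f} GB")
print(f"raw 256x256 RGB: {resized:.1f} GB")
```

With the figures from the question this projects to roughly 1000 GB at the current per-sample size, versus roughly 157 GB for raw 256x256 RGB images, before any JPG/PNG encoding is applied.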
After obtaining the image dataset, I construct a feature database for all images: each entry is a vector based on the mean and SD of the RGB and HSV color models for a portion of the image. How can I use an SVM to retrieve related images from the database once a query image is given?
Also, how can I use unsupervised learning for the above problem?
Assuming the query images are unlabeled, applying SVM would require a way of knowing the labels for dataset images since SVM is a form of supervised learning, which seeks to correctly determine class labels for unlabeled data. You would need another method for generating class labels, such as unsupervised learning, so this approach does not seem relevant if you only have feature vectors but no class labels.
A neural network allows for unsupervised learning with unlabeled data, but it is a rather complex approach and is the subject of academic research. You may want to consider a simpler machine learning approach such as k-Nearest Neighbors, which lets you obtain the k database samples closest to the query in your feature space. This algorithm is simple to implement and is found in many machine learning libraries. For example, in Python you can use scikit-learn.
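As a minimal sketch of the k-Nearest Neighbors retrieval described above, the following uses only the Python standard library and plain Euclidean distance. The image ids, the three-element feature vectors, and the function name `k_nearest` are hypothetical placeholders; real vectors would hold the mean and SD features from the question.

```python
import heapq
import math


def k_nearest(query, database, k=3):
    """Return the k (distance, image_id) pairs closest to `query`, nearest first."""
    scored = ((math.dist(query, vec), image_id) for image_id, vec in database.items())
    return heapq.nsmallest(k, scored)


# Hypothetical feature database: image id -> feature vector.
database = {
    "img_a": [0.9, 0.1, 0.1],
    "img_b": [0.1, 0.9, 0.1],
    "img_c": [0.8, 0.2, 0.1],
    "img_d": [0.1, 0.1, 0.9],
}

query = [0.9, 0.12, 0.1]  # feature vector of the query image
results = k_nearest(query, database, k=2)

for distance, image_id in results:
    print(f"{image_id}: distance {distance:.3f}")
```

If the features live on very different scales (for example hue versus standard deviation), it is usually worth standardizing each feature first so that no single one dominates the distance.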
I am unsure what type of images you are working with, but you might also want to explore using feature detector algorithms such as SIFT rather than just pixel intensities.
I was given a project on vehicle type identification with neural networks, and that is how I came to know the awesomeness of neural technology.
I am a beginner with this field, but I have sufficient materials to learn it. I just want to know some good places to start for this project specifically, as my biggest problem is that I don't have very much time. I would really appreciate any help. Most importantly, I want to learn how to match patterns with images (in my case, vehicles).
I'd also like to know if Python is a good language to start this in, as I'm most comfortable with it.
I have some images of cars as input, and I need to classify those cars by their model.
E.g.: Audi A4, Audi A6, Audi A8, etc.
You didn't say whether you can use an existing framework or need to implement the solution from scratch, but either way Python is an excellent language for coding neural networks.
If you can use a framework, check out Theano, which is written in Python and is the most complete neural network framework available in any language:
http://www.deeplearning.net/software/theano/
If you need to write your implementation from scratch, look at the book 'Machine Learning, An Algorithmic Perspective' by Stephen Marsland. It contains example Python code for implementing a basic multilayered neural network.
As for how to proceed, you'll want to convert your images into 1-D input vectors. Don't worry about losing the 2-D information; the network will learn 'receptive fields' on its own that extract 2-D features. Normalize the pixel intensities to a -1 to 1 range (or better yet, 0 mean with a standard deviation of 1). If the images are already centered and normalized to roughly the same size, then a simple feed-forward network should be sufficient. If the cars vary wildly in angle or distance from the camera, you may need to use a convolutional neural network, but that's much more complex to implement (there are examples in the Theano documentation). For a basic feed-forward network, try using two hidden layers, each with anywhere from 0.5 to 1.5 times as many units as there are pixels.
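The flattening and normalization step can be sketched as follows, using only the Python standard library. This is an illustration under assumptions: the image is a hypothetical 2x2 grayscale grid given as nested lists, and the statistics are computed per image (computing them over the whole training set is another common choice).

```python
import statistics


def flatten(image):
    """Turn a 2-D grid of pixel intensities into a 1-D list of floats."""
    return [float(pixel) for row in image for pixel in row]


def standardize(vector):
    """Rescale a vector to zero mean and unit standard deviation."""
    mean = statistics.fmean(vector)
    std = statistics.pstdev(vector)
    if std == 0:
        # A constant image carries no contrast; map it to all zeros.
        return [0.0] * len(vector)
    return [(x - mean) / std for x in vector]


# Hypothetical 2x2 grayscale image with intensities in 0..255.
image = [[0, 64], [128, 255]]
x = standardize(flatten(image))
print(x)
```

The resulting list `x` is what would be fed to the input layer, one value per pixel.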
Break your dataset into separate training, validation, and testing sets (perhaps with a 0.6, 0.2, 0.2 ratio respectively) and make sure each image only appears in one set. Train ONLY on the training set, and don't use any regularization until you're getting close to 100% of the training instances correct. You can use the validation set to monitor progress on instances that you're not training on. Performance should be worse on the validation set than the training set. Stop training when the performance on the validation set stops improving. Once you've accomplished this you can try different regularization constants and choose the one that results in the best validation set performance. The test set will tell you how well your final result is performing (but don't change anything based on test set results, or you risk overfitting to that too!).
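A minimal sketch of the split and the stopping rule described above, in plain Python. The function names, the `patience` parameter, and the validation errors are made up for illustration; in practice the errors would come from evaluating the network after each epoch.

```python
import random


def split_dataset(items, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle once, then cut into disjoint training, validation, and test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


def best_epoch(val_errors, patience=2):
    """Return the epoch with the lowest validation error, giving up once
    `patience` consecutive epochs have passed without an improvement."""
    best_error, best_index = float("inf"), -1
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_error, best_index = error, epoch
        elif epoch - best_index >= patience:
            break
    return best_index


# Hypothetical image ids and per-epoch validation errors.
train, val, test = split_dataset(range(10))
stop_at = best_epoch([0.9, 0.7, 0.6, 0.65, 0.66, 0.5])
print(len(train), len(val), len(test), stop_at)
```

Because each item is placed by slicing a single shuffled list, no image can end up in more than one set.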
If your car images are very complex and varied and you cannot get a basic feed-forward net to perform well, you might consider using 'deep learning'. That is, add more layers and pre-train them using unsupervised training. There's a detailed tutorial on how to do this here (though all the code examples are in MATLAB/Octave):
http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial
Again, that adds a lot of complexity. Try it with a basic feed-forward NN first.
I have read here:
Is there a way to use a Custom cross-sectional slicer of 3d image data?
... that the nrrd parser stores the image data as a 3D array. I want to be able to access this array in my scripts. How can this be done? I would like to use this data to do image statistics, and subsets of it to do region-of-interest statistics. I believe the data is a private variable that is only used by the slice function to create the volume slices; is that correct? If so, how can I save it for later use as a public variable, or as a property of the volume object?
Please explain as simply as possible how to proceed, as I am quite a novice at JavaScript.
Many thanks,
We don't yet store the array for all volume parsers, in order to keep memory usage down. This can certainly be added, since the infrastructure is there under the hood.
I assigned the issue to myself:
https://github.com/xtk/X/issues/84
I'm preparing for a project related to weather visualization. I have some .csv files (100 rows, 120 columns) that contain weather data measured in my country, where each cell represents a point with some specific value, e.g. temperature or wind speed. Those files can be seen as sets of latitude/longitude points that cover my whole country (in fact the values are measured every 5 miles). I'd like to transform the data and put the values on maps to make some weather visualizations, both 2D and 3D, especially heatmaps or heightmaps. I'm wondering which technology is best for that. I was thinking of OpenGL, but all the texture mapping and drawing operations seem a bit difficult to me (but maybe they're not). I'm also considering making Java applets, but there are many Java graphics libraries and I don't know which one is best for my requirements. Maybe all those things are possible in XNA, which would be great, because I program mainly in C#. I'd be glad if some more experienced programmers could recommend an appropriate language and libraries. What would you use to make the work easy and efficient?
OpenGL is just the pencil and paper for a program to draw things with. While it certainly is possible to use OpenGL for this, I strongly suggest using an off-the-shelf visualization tool, like Origin, Matlab Visualization, the visualization modules of R, etc.
If you're looking more in the direction of a visualization programming toolkit, have a look at VTK: http://www.vtk.org/
But bare OpenGL for that task is only advisable if you "speak OpenGL fluently".
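To give a sense of how little a basic 2D heatmap needs when no OpenGL is involved, here is a minimal sketch in plain Python. It is an illustration under assumptions, not a replacement for the tools above: the CSV is taken to be a rectangular grid of numbers, the blue-to-red color ramp is an arbitrary choice, the function names are made up, and the output is a text-based PPM image that most image viewers can open.

```python
import csv
import io


def read_grid(csv_text):
    """Parse CSV text into rows of floats (assumes a rectangular, all-numeric grid)."""
    return [[float(cell) for cell in row] for row in csv.reader(io.StringIO(csv_text)) if row]


def grid_to_rgb(grid):
    """Map each value linearly onto a blue (lowest) to red (highest) color ramp."""
    lowest = min(min(row) for row in grid)
    highest = max(max(row) for row in grid)
    span = (highest - lowest) or 1.0  # avoid dividing by zero on a constant grid
    pixels = []
    for row in grid:
        colored = []
        for value in row:
            red = int(255 * (value - lowest) / span)
            colored.append((red, 0, 255 - red))
        pixels.append(colored)
    return pixels


def to_ppm(pixels):
    """Serialize RGB pixels as a plain-text PPM (P3) image."""
    height, width = len(pixels), len(pixels[0])
    lines = ["P3", f"{width} {height}", "255"]
    for row in pixels:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"


# Hypothetical 2x2 temperature grid; a real file would have 100 rows and 120 columns.
sample_csv = "0,5\n10,10\n"
pixels = grid_to_rgb(read_grid(sample_csv))
ppm_text = to_ppm(pixels)
print(ppm_text)
```

Writing `ppm_text` to a file ending in `.ppm` gives one pixel per measurement point; overlaying it on a map or rendering it in 3D is where a toolkit such as VTK earns its keep.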