For reinforcement learning experiments, I often run independent repetitions for each hyperparameter setting. Ideally, I would visualize the average of these repetitions (per setting), including confidence intervals around the mean learning curve. I suppose many RL researchers have this issue.
I run my hyperparameter experiments with Ray Tune, which automatically visualizes each independent run in Tensorboard (which is very useful). It would be really helpful if I could automatically aggregate the results over the repetitions (with confidence intervals), and then compare the different hyperparameter settings (and plot them for papers). I could not find any method in Tune/Tensorboard to do this, nor an integration with another framework that can do this.
As an example, I would ideally get a curve like the one below, but generated directly in Tensorboard.
I suppose more people will have this issue, and I was curious whether anyone knows a package or quick solution to go from Ray Tune output to the above figure (without coding it manually).
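For reference, this is roughly what I mean by "coding it manually" (a minimal sketch, assuming each trial directory contains a progress.csv with training_iteration and episode_reward_mean columns; the path and column names are just an assumption):

import glob
import os
import pandas as pd
import matplotlib.pyplot as plt

# Assumed layout: one progress.csv per repetition of a single hyperparameter setting.
paths = glob.glob(os.path.expanduser('~/ray_results/my_experiment/*/progress.csv'))
runs = [pd.read_csv(p) for p in paths]
curves = pd.concat(
    [r.set_index('training_iteration')['episode_reward_mean'] for r in runs], axis=1)

mean = curves.mean(axis=1)
sem = curves.std(axis=1) / (curves.shape[1] ** 0.5)   # standard error of the mean

plt.plot(mean.index, mean, label='mean over repetitions')
plt.fill_between(mean.index, mean - 1.96 * sem, mean + 1.96 * sem, alpha=0.3)  # ~95% CI
plt.xlabel('training iteration')
plt.ylabel('episode reward')
plt.legend()
plt.show()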
Thanks a lot!
Best regards,
Thomas
Related
I have a binary image and I'm looking for a robust way to find the lines in the shape and the topology (how the lines connect).
I have experimented in Matlab (although what I'm really asking is which methods to use).
I've tried skeletonizing the binary image and then applying a Hough transform; it works sometimes, but it is not a robust solution, and I struggled with boundary disturbance.
Could anyone point me toward which methods to use here (and in what order)?
Binary file for testing
Frankly speaking, I've been monitoring this question for some time in the hope of seeing a useful answer. The task itself does not seem to be really complex (and I'll try to prove that), but an elegant solution is still a bit far from me.
Some time ago I solved a similar task, and it does look like, with my very basic home-grown solution, your initial example is easily traceable:
It is just a simple scan (1), identification of vertical and horizontal lines (2), and further analysis of the more complex areas (3). Having all areas analysed (3), it is not that hard to find intersection points and optimise them as well (4).
The result is pretty rough, but it confirms the feasibility of this approach.
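This is not my actual code, but a rough numpy sketch of what the scan stage (steps 1 and 2) could look like (the run-length threshold and scan step are only illustrative):

import numpy as np

def find_straight_runs(binary, min_len=20, step=1):
    """Scan rows and columns of a binary image and collect long foreground runs,
    which are candidates for horizontal/vertical line segments."""
    segments = []
    for kind, img in (('horizontal', binary), ('vertical', binary.T)):
        for i in range(0, img.shape[0], step):
            row = img[i].astype(np.int8)
            # +1 marks the start of a run of 1s, -1 marks the position just after it ends.
            diff = np.diff(np.concatenate(([0], row, [0])))
            starts, ends = np.where(diff == 1)[0], np.where(diff == -1)[0]
            for s, e in zip(starts, ends):
                if e - s >= min_len:
                    segments.append((kind, i, s, e))   # scan line index, run start, run end
    return segments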
I do understand this is a bit far from Matlab, but I just want to highlight several important points:
skeletonization will potentially break the initial geometry
further analysis of the skeleton seems to be a bit tricky and unreliable
with a bit of enhanced image quality, your images can be traced with a much simpler approach
BTW, in my approach different operations can be performed in parallel. The scan step is adjustable, and even with a reduced number of scans the result is pretty good:
With more steps intersection points can be identified more precisely:
I came to the conclusion that it is really important to use all the information provided by the initial image. Simplifications, etc. will remove valuable facts and increase the overall complexity of the task.
Update
Would this approach work if the figure did not have a majority of vertical and horizontal lines?
Those steps are pretty independent, so there is no strict requirement to have vertical or horizontal lines. Naturally, it is a more complex task to identify intersections, and some additional tweaking is needed to enhance accuracy.
It is easy to see that there are some significant errors introduced by the vertical lines at the beginning and at the end of the shape. A very straightforward optimisation gives us better results:
This is indeed a tricky problem: going from a binary image to a graph (i.e. topology). Fundamentally, it involves crossing from the discrete world of pixels and 2D image data to an abstract data structure of nodes and connectivity...
But what can provide the "glue" between the two? Quite an open question, I'm afraid, requiring a complex interpretation of visual data.
Fortunately, someone else has shared a good attempt here in python: http://planet.lengrand.fr/?post_id=267
(This obviously assumes a full Python installation with NetworkX and any other dependencies. I did this on a Mac with Homebrew via a few calls such as > brew install opencv; pip install networkx; brew install graph-tool; brew install graphviz. The ipython notebook below also makes use of http://scikit-image.org and http://mahotas.readthedocs.org/en/latest/ - so quite a heady cocktail of Computer Vision and Image Processing code! Finally: you'll of course need ipython installed...)
Here is an example (which first loads in the downloaded notebook from above - all run from: > ipython --pylab):
%run C8Skeleton_to_graph-01.ipynb
import scipy.io as sio      # Need this to load Matlab files...
import numpy as np          # (already in scope under --pylab; shown for completeness)
import matplotlib.pyplot as plt
import mahotas as mh        # Provides the thinning/skeletonization routine
import networkx as nx       # Graph data structure
mat = sio.loadmat('bw.mat')
img = np.zeros((30,70), np.uint8)  # Buffered image border
img[5:25,5:65] = mat['BW']         # Insert matrix data into the middle
skeleton = mh.thin(img)            # Do skeletonization...
graph = nx.MultiGraph()            # Graph we'll create
C8_Skeleton_To_Graph_01(graph, skeleton)  # Do it!
figure(1)
subplot(211)
plt.imshow(img, plt.get_cmap('gray'), vmin=0, vmax=1, origin='upper');
subplot(212)
plt.imshow(skeleton, plt.get_cmap('gray'), vmin=0, vmax=1, origin='upper');
figure(2)
nx.draw(graph)
Showing that skeletonization of your originally provided Matlab data (with buffer around edges):
Results in the following graph:
Notice that the graph topology and layout correspond to the structure of the thinned image - including the "spurs" created at the ends. This is an ongoing problem/research area with such an approach...
EDIT: but this could possibly be solved by removing stray arcs (those leading to leaf nodes with degree == 1) from the graph, e.g.
remove = [node for node, degree in graph.degree().items() if degree == 1]  # NetworkX 1.x API; in 2.x, iterate graph.degree() directly
graph.remove_nodes_from(remove)
nx.draw(graph)
I am trying to write a very good ANPR (automatic number plate recognition) system for Brazilian car plates. So far I have used the javaANPR method, which uses X and Y projections to find the ROI (the car plate). It works well, but not so well on images with a lot of shadow on the car. I am also using tesseract-ocr for character recognition.
I get about 80% success on really good images of stationary cars, and less than 60% on lower-quality images of moving cars.
I have been researching online, reading papers, etc. What do you think could help me improve it? Maybe merging two methods? Using template matching as well? I need a success rate of about 95-98%.
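(A minimal sketch of one shadow-robust binarization step that could precede the X/Y projection, using OpenCV adaptive thresholding; the file name, blur and threshold parameters are only illustrative.)

import cv2

# Adaptive thresholding computes a local threshold per neighbourhood, which tends
# to cope better with uneven shadows than a single global threshold.
img = cv2.imread('car.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.GaussianBlur(img, (5, 5), 0)                  # suppress sensor noise
binary = cv2.adaptiveThreshold(img, 255,
                               cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV,
                               25, 10)                  # blockSize=25, C=10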
I have seen anpronline.
Their demo: https://www.anpronline.net/demo.html
They have done a really good job. It worked on 100% of my images.
Do you know what OCR engine they use? Maybe that is a trade secret.
But can you point me in the right direction for improving my OCR?
I really appreciate any help.
Thanks
It's likely that any highly successful system has already been patented or is kept as a trade secret. If you are not doing this for a business, you could try to find the patent and replicate it from that. Alternatively, you could look in the scientific literature for state-of-the-art algorithms. Google Scholar is a pretty good search engine for that kind of search.
I was given a project on vehicle type identification with neural networks, and that is how I came to know the awesomeness of neural technology.
I am a beginner in this field, but I have sufficient materials to learn it. I just want to know some good places to start for this project specifically, as my biggest problem is that I don't have very much time. I would really appreciate any help. Most importantly, I want to learn how to match patterns with images (in my case, vehicles).
I'd also like to know if Python is a good language to start this in, as I'm most comfortable with it.
I have some images of cars as input, and I need to classify those cars by their model.
E.g.: Audi A4, Audi A6, Audi A8, etc.
You didn't say whether you can use an existing framework or need to implement the solution from scratch, but either way Python is an excellent language for coding neural networks.
If you can use a framework, check out Theano, which is written in Python and is the most complete neural network framework available in any language:
http://www.deeplearning.net/software/theano/
If you need to write your implementation from scratch, look at the book 'Machine Learning, An Algorithmic Perspective' by Stephen Marsland. It contains example Python code for implementing a basic multilayered neural network.
As for how to proceed, you'll want to convert your images into 1-D input vectors. Don't worry about losing the 2-D information; the network will learn 'receptive fields' on its own that extract 2-D features. Normalize the pixel intensities to a -1 to 1 range (or better yet, zero mean with a standard deviation of 1). If the images are already centered and normalized to roughly the same size, then a simple feed-forward network should be sufficient. If the cars vary wildly in angle or distance from the camera, you may need to use a convolutional neural network, but that's much more complex to implement (there are examples in the Theano documentation). For a basic feed-forward network, try using two hidden layers with anywhere from 0.5 to 1.5 times the number of pixels in each layer.
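A minimal sketch of that preprocessing step (plain NumPy rather than Theano; it assumes the images are already loaded as equally sized grayscale arrays):

import numpy as np

def images_to_vectors(images):
    """Flatten a list of equally sized grayscale images into 1-D input vectors,
    normalized to zero mean and unit standard deviation per image."""
    vectors = []
    for img in images:
        v = np.asarray(img, dtype=np.float64).ravel()   # 2-D image -> 1-D vector
        v = (v - v.mean()) / (v.std() + 1e-8)           # zero mean, unit std
        vectors.append(v)
    return np.vstack(vectors)                           # shape: (n_images, n_pixels)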
Break your dataset into separate training, validation, and testing sets (perhaps with a 0.6, 0.2, 0.2 ratio respectively) and make sure each image only appears in one set. Train ONLY on the training set, and don't use any regularization until you're getting close to 100% of the training instances correct. You can use the validation set to monitor progress on instances that you're not training on. Performance should be worse on the validation set than the training set. Stop training when the performance on the validation set stops improving. Once you've accomplished this you can try different regularization constants and choose the one that results in the best validation set performance. The test set will tell you how well your final result is performing (but don't change anything based on test set results, or you risk overfitting to that too!).
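And a sketch of the dataset split (assuming X is the matrix of input vectors from the previous sketch and y is an array of integer class labels):

import numpy as np

def split_dataset(X, y, seed=0):
    """Shuffle and split into 60% train / 20% validation / 20% test,
    so each image appears in exactly one set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train, n_val = int(0.6 * len(X)), int(0.2 * len(X))
    train, val, test = np.split(idx, [n_train, n_train + n_val])
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])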
If your car images are very complex and varied and you cannot get a basic feed-forward net to perform well, you might consider using 'deep learning'. That is, add more layers and pre-train them using unsupervised training. There's a detailed tutorial on how to do this here (though all the code examples are in MatLab/Octave):
http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial
Again, that adds a lot of complexity. Try it with a basic feed-forward NN first.
I'm doing some work on pathfinding.
So far I have tested my code on scenes composed of 2D cells.
I've also created a simple 3D scene to test my work on.
I'd like to test my work on more 3D scenes, but it is time-consuming to create them.
Does anyone know of any scene datasets that I could use to test my pathfinding algorithms on?
To get a better answer, you really need to specify the dimensionality of the configuration spaces that you want to consider. You aren't going to be tackling protein folding and docking problems (200+ degrees of freedom) with discrete graph searches. Even a relatively small planning problem (in terms of academic problems), with about 6 degrees of freedom, can quickly become intractable.
Most of the best examples for planning tend to be published in research papers first, and then make their way into more general use. Some of the best work tends to be published in IEEE journals, or at the Intelligent Robots and Systems (IROS) and International Conference on Robotics and Automation (ICRA) conferences. It may also be worth using the bibliography of a well known reference in the field, such as "Motion Planning" by LaValle as a starting point for further research (available in bibtex here)
Mark Overmars' work in the computational geometry and planning communities has made some of the problems considered in his publications very recognizable. It is worth checking whether any of his current grad students and collaborators have data sets available at the moment.
If you're happy to still be doing some work in 2D, and to manually convert an image to geometric data, Kris Beevers' website has a number of worked examples for a range of planners in 2D workspaces.
The Motion Strategy Library contains a number of classical motion planning problems for use in 2D and 3D workspaces, with varying dimensionality of configuration space depending on the problem. It includes:
L sections into a birdcage
trailers
multiple trailers
mazes
kinematic chains
non-holonomic cars
A more recent implementation of an academic motion planning library is The Open Motion Planning Library developed by the Kavraki lab. Because of licensing, I haven't checked personally, but I assume that they ship some examples and tests with their project.
A number of significantly more complex kinodynamic motion planning examples are now publicly available as part of the OpenRAVE project. Their gallery is eye opening.
When I need big 3D datasets, I usually use attractors or other dynamical systems. You simply have to iterate as many times as you want, and it will generate a nice set of 3D data.
Try this 'Peter de Jong Attractor':
X_{n+1} = sin(a * Y_n) - cos(b * X_n)
Y_{n+1} = sin(c * X_n) - cos(d * Y_n)
Where (for example): a = 1.4, b = -2.3, c = 2.4, d = -2.1
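A quick sketch of iterating it in Python (the starting point and iteration count are arbitrary):

import numpy as np

a, b, c, d = 1.4, -2.3, 2.4, -2.1
n = 100000
pts = np.empty((n, 2))
x, y = 0.1, 0.1                 # arbitrary starting point
for i in range(n):
    x, y = np.sin(a * y) - np.cos(b * x), np.sin(c * x) - np.cos(d * y)
    pts[i] = (x, y)
# pts is a 2-D point cloud; the iteration index (or a third coupled map) can serve
# as a z-coordinate if full 3-D test data is needed.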
I'm trying to build a screen-flashing application that flashes the screen according to the music (which will be frequencies, such as healing frequencies, etc.).
I have already made the player and know how I will make the screen flash, but I need to make the screen flash very fast in step with the music; for example, if the music speeds up, the screen should flash faster. I understand that I could achieve this with an FFT or DSP (as I only need to know when the frequency rises above some level, let's say 20 Hz, to change the color and make the screen flash).
But I've found that I understand NOTHING, let alone how to implement it in my application.
Can somebody help me learn either (or both) of them? My email is sismetic_chaos#hotmail.com. I really need help; I've been stuck for about 3 days, not coding or doing anything at all, just trying to understand, but I don't.
PS: My application is written in C++ with Qt.
PS: Thanks for taking the time to read this and for your willingness to help.
Edit: Thanks to all for the answers. The problem is in no way solved yet, but I appreciate all the answers; I didn't think I would get so many answers and so much info. Thanks to you all.
This is a difficult problem, requiring more than an FFT. I'll briefly describe how I implemented beat detection when I was writing software for professional DJ equipment.
First of all, you'll need to cut down the amount of data you're dealing with, since there are only two or three beats per second, but tens of thousands of samples. You'll also need to look at different frequency ranges, since some types of music carry the tempo in the bassline, and others in percussion or other instruments. So pass the signal through several band-pass filters (I chose 8 filters, each covering one octave, from low bass to high treble), and then downsample each band by averaging the power over a few hundred samples.
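This is not the original DJ-software code, but a rough Python sketch of that first stage (band-pass filtering into octave bands and downsampling the power; the band edges and window size are only illustrative):

import numpy as np
from scipy.signal import butter, sosfilt

def band_power_envelopes(samples, fs, n_bands=8, f_low=40.0, window=512):
    """Split a mono signal into octave bands and return a downsampled power
    envelope per band (one value per `window` samples)."""
    envelopes = []
    lo = f_low
    for _ in range(n_bands):
        hi = min(lo * 2, fs * 0.49)                       # one octave, capped below Nyquist
        sos = butter(4, [lo, hi], btype='bandpass', fs=fs, output='sos')
        band = sosfilt(sos, samples)
        power = band ** 2
        n_blocks = len(power) // window
        env = power[:n_blocks * window].reshape(n_blocks, window).mean(axis=1)
        envelopes.append(env)
        lo = hi
    return np.array(envelopes)                            # shape: (n_bands, n_blocks)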
Every few seconds, you'll have a thousand or so samples in each band. Your next tool is an autocorrelation, to identify repetitive patterns in the music. The peaks of the autocorrelation tell you what the beat is more or less likely to be; but you'll need to make up some heuristics to compare all the frequency bands to find a beat that you can be confident in, and to avoid misleading syncopations. If you can manage that, then you'll have a reasonable guess at the tempo, but no idea of the phase (i.e. exactly when to flash the screen).
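A matching sketch of the autocorrelation step for one band's envelope (the BPM range is an assumption; the cross-band heuristics described above are the hard part and are not shown):

import numpy as np

def estimate_bpm(env, block_rate, min_bpm=60, max_bpm=180):
    """Estimate tempo from one downsampled power envelope via autocorrelation.
    `block_rate` is the envelope sample rate, i.e. fs / window."""
    env = env - env.mean()
    ac = np.correlate(env, env, mode='full')[len(env) - 1:]   # keep lags >= 0
    lo = max(1, int(block_rate * 60.0 / max_bpm))             # shortest allowed beat period
    hi = min(int(block_rate * 60.0 / min_bpm), len(ac) - 1)   # longest allowed beat period
    lag = lo + np.argmax(ac[lo:hi])                           # strongest repeating period
    return 60.0 * block_rate / lag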
Now you can look at a smoothed version of the audio data for peaks, some of which are likely to correspond to beats. Initially, look for the strongest peak over the course of a few seconds and take that as a downbeat. In conjunction with the tempo you estimated in the first stage, you can predict when the next beat is due, measure where you actually saw something like a beat, and adjust your estimate to more closely match the data. You can also maintain a confidence level based on how well the predicted beats match the measured peaks; if that drops too low, then restart the beat detection from scratch.
There are a lot of fiddly details to this, and it took me some weeks to get it working nicely. It is a difficult problem.
Or for a simple visualisation effect, you could simply detect peaks and flash the screen for each one; it will probably look good enough.
The output of an FFT will give you the frequency spectrum of an audio sample, but extracting the tempo from the FFT output is probably not the way you want to go.
One thing you can do is to use peak detection to identify the volume "spikes" that typically occur on the "down-beats" of the music. If you can identify the down-beats, then you can use a resource like bpmdatabase.com to find the tempo of the song. The tempo will tell you how fast to flash and the peaks you detected will tell you when to start flashing. Have your app monitor your flashes to make sure that they generally occur at the same time as a peak (if the two start to diverge, then the tempo may have changed mid-song).
That may sound straightforward, but this is actually a very non-trivial thing to implement. You might want to read this SO question for more information. There are some quality links in the answers there.
If I'm completely mis-interpreting what you are trying to do and you need to do FFTs for something different, then you might want to look at using one of the existing FFT libraries to do the heavy lifting for you. Some examples are FFTW and KissFFT.
It sounds like maybe you're trying to get your visualizer to flash the screen in time with the music somehow. I don't think calculating the FFT is going to help you here. At any given instant, there will be many simultaneous frequency components, all over the audio spectrum (roughly 20 Hz to 20 kHz). But you're likely to be a lot more interested in the musical tempo (beats per minute -- more like 5 Hz or below), and that's not going to show up anywhere in an FFT of the raw audio signal.
You probably need something much simpler -- some sort of real-time peak detection. Whenever you see a peak greater than some threshold above the average volume, make your screen flash.
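A minimal sketch of that idea (plain Python/NumPy rather than C++/Qt; the threshold and smoothing factor are only illustrative):

import numpy as np

class PeakFlasher:
    """Flag a 'flash' whenever the energy of the latest audio block jumps well
    above a running average of recent energy."""
    def __init__(self, threshold=1.8, smoothing=0.95):
        self.avg_energy = 0.0
        self.threshold = threshold      # how far above average counts as a peak
        self.smoothing = smoothing      # how slowly the running average adapts

    def process(self, block):
        energy = float(np.mean(np.square(block)))
        is_peak = self.avg_energy > 0 and energy > self.threshold * self.avg_energy
        self.avg_energy = self.smoothing * self.avg_energy + (1 - self.smoothing) * energy
        return is_peak                  # caller flashes the screen when True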
Of course, more complicated visualizations might well take advantage of the FFT, but not the one you're describing.
My recommendation would be to find a library that does this for you. Unless you have a lot of mathematics to back you up, I think you will be wasting a ton of your time trying to learn FFTs when all you really want is some sort of 'bass hits per minute' number that you can adjust your graphics to accordingly.
Check out this similar post:
here
It took me about three weeks to understand the mathematics behind FFTs and then another week to write something in Matlab using those concepts. If you are discouraged after three days, don't try and roll your own.
I hope this is helpful advice and not discouraging.
-Brian J. Stinar-
As previous answers have noted, an FFT is probably not the tool you need in order to solve your problem, which requires tempo detection rather than spectral analysis.
For an example of what can be done using an FFT, and of how a particular FFT implementation was integrated into a Qt application, take a look at this blog post, which describes the spectrum analyzer demo I developed. Code for the demo is shipped with Qt itself, in the demos/spectrum directory.