I was trying to change the activation function of my neural net from sigmoid to ReLU (or, more specifically, SELU). Since that change gave me a lot of exploding gradients, I tried to use batch normalization. I calculated the gradients of my error function w.r.t. the learnable parameters \beta and \gamma, but they seem to be a bit different from the ones I saw in several (sadly Python-only) examples.
Here, for example, the code example at the bottom of the page says dbeta = np.sum(dout, axis=0), and I wonder what exactly this dout is.
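For context, here is a rough sketch (written in C++ purely for illustration) of what that line appears to compute, assuming dout denotes the upstream gradient dE/dy of the batch-norm output y = gamma * xhat + beta; the function name, shapes, and types are my own guesses, not anything taken from the linked example:

#include <cstddef>
#include <vector>

// dout[n][d]: assumed upstream gradient dE/dy for sample n, feature d.
// xhat[n][d]: normalized input (x - mean) / sqrt(var + eps).
void batchnorm_param_grads(const std::vector<std::vector<double>>& dout,
                           const std::vector<std::vector<double>>& xhat,
                           std::vector<double>& dgamma,
                           std::vector<double>& dbeta)
{
    const std::size_t N = dout.size();     // batch size
    const std::size_t D = dout[0].size();  // number of features
    dgamma.assign(D, 0.0);
    dbeta.assign(D, 0.0);
    for (std::size_t n = 0; n < N; ++n) {
        for (std::size_t d = 0; d < D; ++d) {
            dbeta[d]  += dout[n][d];               // dE/dbeta  = sum_n dE/dy       (np.sum(dout, axis=0))
            dgamma[d] += dout[n][d] * xhat[n][d];  // dE/dgamma = sum_n dE/dy * xhat
        }
    }
}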
My derivatives look like this:
Derivation of error function w.r.t \beta
What am I doing wrong in this derivation?
Thank you very much for your help.
I tried adding a batchnorm2d layer to a small CNN tested on MNIST, using Libtorch C++ code with or without GPU:
https://github.com/ollewelin/libtorch-GPU-CNN-test-MNIST-with-Batchnorm
The precision then increased a little. Search for "bn1" or "bn2" in that code to find the relevant layers.
Installation on Ubuntu with GPU, Libtorch, and OpenCV for C++ is described here:
https://github.com/ollewelin/torchlib-opencv-gpu
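For reference, this is roughly the pattern used. This is a minimal sketch, not code copied from the repository: the layer sizes are illustrative, and it assumes a Libtorch version that provides torch::nn::BatchNorm2d.

#include <torch/torch.h>

struct SmallCnn : torch::nn::Module {
    torch::nn::Conv2d conv1{nullptr};
    torch::nn::BatchNorm2d bn1{nullptr};
    torch::nn::Linear fc{nullptr};

    SmallCnn() {
        conv1 = register_module("conv1", torch::nn::Conv2d(
                    torch::nn::Conv2dOptions(1, 16, /*kernel_size=*/3).padding(1)));
        // batch norm over the 16 channels produced by conv1
        bn1 = register_module("bn1", torch::nn::BatchNorm2d(torch::nn::BatchNorm2dOptions(16)));
        fc  = register_module("fc", torch::nn::Linear(16 * 14 * 14, 10));
    }

    torch::Tensor forward(torch::Tensor x) {
        x = torch::relu(bn1->forward(conv1->forward(x)));  // conv -> batchnorm -> ReLU
        x = torch::max_pool2d(x, 2);                       // 28x28 -> 14x14 for MNIST
        x = x.view({x.size(0), -1});
        return fc->forward(x);
    }
};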
Problem:
I'm trying to build an image processing program which detects scratches on a module.
As seen in the image below, a few scratches can be found.
One of the problems is that I have only two samples with scratches.
Question:
What would be the best way to find the scratches, given the limited number of samples?
(If detection is too hard, accepted / not-accepted classification is also fine.)
What I tried:
I tried to detect the scratches using a GMM (Gaussian mixture model).
-> It didn't work because of too many features; GMM is only effective on objects such as textures.
I will try to implement Deep Learning, but I'm not sure if it will work or not.
Image Sample
It seems that dlib needs a loss layer that dictates how the layers most distant from the input layer are treated. I cannot find any documentation on the loss layers, but it seems there is no way to have just a summation layer.
Summing up all the values of the last layer would be exactly what I need for the regression, though (see also: https://deeplearning4j.org/linear-regression)
I was thinking along the lines of writing a custom loss layer but could not find information about this, either.
So, have I overlooked some corresponding layer here, or is there another way to get what I need?
The loss layers in dlib are listed in the menu on dlib's machine learning page. Look for the words "loss layers". There is plenty of documentation.
The current released version of dlib doesn't include a regression loss. However, if you get the current code from github you can use the new loss_mean_squared layer to do regression. See: https://github.com/davisking/dlib/blob/master/dlib/dnn/loss_abstract.h
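For illustration, a minimal regression network built on that loss might look roughly like the sketch below. It assumes a dlib checkout that includes loss_mean_squared; the layer widths and names are only placeholders, not a recommended architecture.

#include <dlib/dnn.h>
#include <vector>
using namespace dlib;

// One scalar output (fc<1>) fed into the mean-squared loss for regression.
using net_type = loss_mean_squared<
                     fc<1,
                     relu<fc<10,
                     input<matrix<float>>>>>>;

int main()
{
    net_type net;
    std::vector<matrix<float>> samples;  // each sample: a column vector of features
    std::vector<float> targets;          // one regression target per sample
    dnn_trainer<net_type> trainer(net);
    trainer.set_learning_rate(0.01);
    // trainer.train(samples, targets);  // uncomment once samples/targets are filled in
}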
Say I want to use some other product (R, Python, MATLAB, whatever) to create an MLP, but I want to run that network, i.e. just for prediction, under OpenCV. Assume that the parameters (e.g. activation function) are compatible between the training product and OpenCV.
How can I import my trained weights into the OpenCV MLP? Perhaps the training product uses an MxN matrix of weights for each layer, where M is the size of the input layer and N the size of the output layer (so W(i,j) would be the weight between input node i and output node j). Perhaps the biases are stored in a separate N-element vector. The specifics of the original format don't matter much, because as long as I know what the weights mean and how they are stored, I can transform them into whatever OpenCV needs.
So, given that, how do I import these weights into a (run-time, prediction-only) OpenCV MLP? What weight/bias (etc.?) format does OpenCV need, and how do I set its weights and biases?
I've just run into the same problem. I haven't looked at OpenCV's MLP class enough yet to know if there's an easier/simpler way, but OpenCV lets you save and load MLPs from .xml and .yml files. So if you make an ANN in OpenCV, you can save it to one of those formats, look at it to figure out the layout OpenCV wants, and then write your network into that format from R/Python/MATLAB, or at least into some format with a script to translate it into OpenCV's format. Once that is done, it should be as simple as instantiating OpenCV's MLP in the code where you actually want to predict and calling its load("filename") function. (I realize this is a year after the fact, so hopefully you found an answer or a workaround. If you found a better idea, tell me, I'd love to know.)
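To make the idea concrete, here is a rough sketch of that round trip. It assumes OpenCV 3.3+ (where cv::ml::ANN_MLP::load exists); the file names and layer sizes are just placeholders.

#include <opencv2/ml.hpp>

int main()
{
    // 1) Reference network, saved so you can see the YAML layout OpenCV expects.
    cv::Ptr<cv::ml::ANN_MLP> mlp = cv::ml::ANN_MLP::create();
    cv::Mat layerSizes = (cv::Mat_<int>(3, 1) << 4, 8, 2);  // 4 inputs, 8 hidden, 2 outputs
    mlp->setLayerSizes(layerSizes);
    mlp->setActivationFunction(cv::ml::ANN_MLP::SIGMOID_SYM);
    mlp->save("mlp_reference.yml");  // some OpenCV versions may want a trained model here

    // 2) After editing a copy of that file to hold your externally trained
    //    weights, load it back and use it for prediction only.
    cv::Ptr<cv::ml::ANN_MLP> trained = cv::ml::ANN_MLP::load("mlp_trained.yml");
    cv::Mat sample = (cv::Mat_<float>(1, 4) << 0.1f, 0.2f, 0.3f, 0.4f);
    cv::Mat response;
    trained->predict(sample, response);
    return 0;
}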
You must parse your model the same way the 'read' function of OpenCV's MLP parses the XML or YML. I think this will not be too hard.
I'm trying to align two images taken from a handheld camera.
At first, I was trying to use the OpenCV warpPerspective method based on SIFT/SURF feature points. The problem is that the feature extraction and matching process can be extremely slow when the image resolution is high (3000x4000). I tried scaling the images down before finding feature points, but the result is not as good as before. (The Mat generated by findHomography shouldn't be affected by scaling down the image, right?) And sometimes, due to a lack of good feature-point matches, the result is quite strange.
After searching on this topic, it seems that solving the problem in the Fourier domain would speed up the registration process. And I've found this question, which leads me to the code here.
The only problem is that the code is written in Python with NumPy (not even using OpenCV), which makes it quite hard to rewrite as C++ code using OpenCV (in OpenCV I can only find dft; there is no fftshift or fft, I'm not very familiar with NumPy, and I'm not brave enough to simply ignore the missing methods). So I'm wondering why there is no such Fourier-domain image registration implementation in C++.
Can you give me some suggestions on how to implement one, a link to an already implemented C++ version, or help turning the Python code into C++?
Big thanks!
I'm fairly certain that the FFT method can only recover a similarity transform, that is, only a (2d) rotation, translation and scale. Your results might not be that great using a handheld camera.
This is not quite a direct answer to your question, but, as a suggestion for a speed improvement, have you tried using a faster feature detector and descriptor? In OpenCV SIFT/SURF are some of the slowest methods they have for feature extraction/matching. You could try testing some of their other methods first, they all work quite well and are faster than SIFT/SURF. Especially if you use their FLANN-based matcher.
I've had to do this in the past with similar sized imagery, and using the binary descriptors OpenCV has increases the speed significantly.
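For example, a rough sketch of that approach with ORB (a binary descriptor) and a brute-force Hamming matcher is below; I use BFMatcher here for simplicity instead of the FLANN matcher mentioned above (FLANN needs an LSH index for binary descriptors), and the parameter values are only illustrative:

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat align(const cv::Mat& img1, const cv::Mat& img2)
{
    cv::Ptr<cv::ORB> orb = cv::ORB::create(2000);  // cap keypoints to keep matching fast
    std::vector<cv::KeyPoint> kp1, kp2;
    cv::Mat desc1, desc2;
    orb->detectAndCompute(img1, cv::noArray(), kp1, desc1);
    orb->detectAndCompute(img2, cv::noArray(), kp2, desc2);

    cv::BFMatcher matcher(cv::NORM_HAMMING, /*crossCheck=*/true);
    std::vector<cv::DMatch> matches;
    matcher.match(desc1, desc2, matches);

    std::vector<cv::Point2f> pts1, pts2;
    for (const auto& m : matches) {
        pts1.push_back(kp1[m.queryIdx].pt);
        pts2.push_back(kp2[m.trainIdx].pt);
    }
    // RANSAC rejects the bad matches that otherwise produce "strange" warps.
    cv::Mat H = cv::findHomography(pts1, pts2, cv::RANSAC, 3.0);

    cv::Mat warped;
    cv::warpPerspective(img1, warped, H, img2.size());
    return warped;
}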
If you only need a shift, you can use OpenCV's phaseCorrelate.
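A minimal sketch of that call (phaseCorrelate expects two single-channel CV_32F or CV_64F images of the same size):

#include <opencv2/opencv.hpp>

cv::Point2d estimate_shift(const cv::Mat& a, const cv::Mat& b)
{
    cv::Mat af, bf;
    a.convertTo(af, CV_32F);            // convert to float as required by phaseCorrelate
    b.convertTo(bf, CV_32F);
    return cv::phaseCorrelate(af, bf);  // (dx, dy) translation of b relative to a
}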
Sorry if this seems like a silly or lazy "I can't find it" question, but I've been trying for a few days now to find a paper, or anything of the like, explaining how to generate speckle noise (on 2D images). I have found that one of the simpler means of removing speckle noise is a mean filter (which I've already implemented), but nowhere can I find a way of generating the noise. Could someone please direct me to where I can learn to generate speckle noise? Furthermore, would it be a stretch to ask if there is a simple way to do it in OpenCV (a C++ image processing library)?
Thanks for any help you can provide.
Speckle noise is essentially a multiplicative noise, which may (or may not) have an additive noise as well (definitions vary depending upon circumstances). This paper provides a good overview of speckle noise, including descriptions and approaches to removing it.
Here is some simple Python code (using the legacy cv API) that can produce multiplicative speckle noise:
import cv  # legacy OpenCV Python bindings (pre-cv2 API)

im = cv.LoadImage('tree.jpg', cv.CV_LOAD_IMAGE_GRAYSCALE)

# one Gaussian random value per pixel, mean 1.0, standard deviation 0.1
mult_noise = cv.CreateImage((im.width, im.height), cv.IPL_DEPTH_32F, 1)
cv.RandArr(cv.RNG(6), mult_noise, cv.CV_RAND_NORMAL, 1, 0.1)

# multiplicative noise: scale each pixel by its noise value
cv.Mul(im, mult_noise, im)

cv.ShowImage("tree with speckle noise", im)
cv.WaitKey(0)
no noise:
with speckle noise:
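Since the question mentions OpenCV as a C++ library, here is a rough C++ equivalent of the snippet above using the modern cv::Mat API; the file name and noise parameters are only illustrative.

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat im = cv::imread("tree.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat imf;
    im.convertTo(imf, CV_32F, 1.0 / 255.0);   // work with floats in [0,1]

    cv::Mat mult_noise(imf.size(), CV_32F);
    cv::randn(mult_noise, 1.0, 0.1);          // Gaussian, mean 1, std-dev 0.1

    cv::Mat speckled = imf.mul(mult_noise);   // multiplicative (speckle) noise

    cv::imshow("tree with speckle noise", speckled);
    cv::waitKey(0);
    return 0;
}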
Speckle noise is linked to the physical imaging process, so I'm not sure it's easy (or even really possible) to simulate it in a general manner.
However, depending on the type of images you want, you can use other forms of noise to approximate it. I guess that a multiplicative salt-and-pepper noise should more or less do the trick for simulating a SAR image.
Another (probably better) possibility is to explore the websites of NASA / ESA and look for SAR images (look for programs like Pleiades, Cosmo-Skymed and SAR Lupe). Some gated laser imaging labs may also have publicly released some sample data.
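One possible interpretation of that multiplicative salt-and-pepper idea, as a rough OpenCV C++ sketch with purely illustrative parameters, is to drive a random fraction of pixels to zero and boost another fraction by a gain:

#include <opencv2/opencv.hpp>

// imgf is assumed to be a single-channel CV_32F image; fraction and gain are arbitrary choices.
void add_mult_salt_pepper(cv::Mat& imgf, double fraction = 0.05, double gain = 2.0)
{
    cv::Mat u(imgf.size(), CV_32F);
    cv::randu(u, 0.0, 1.0);                                     // uniform noise in [0,1)

    cv::Mat pepper_mask = u < fraction / 2;                     // "pepper": pixels driven to 0
    cv::Mat salt_mask = (u >= fraction / 2) & (u < fraction);   // "salt": pixels multiplied by gain

    imgf.setTo(0, pepper_mask);
    cv::Mat boosted = imgf * gain;
    boosted.copyTo(imgf, salt_mask);
}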
It can be just a matter of adding Gaussian noise to your image. cvRandArr seems like a good candidate.
You can also do something more sophisticated by weighting your noise by your signal, which is also easy since it's just a pixel-wise multiplication between the original image and your noise.