CS231n softmax gradients on CIFAR-10: what relative error would be considered accurate, i.e. no bugs in the code? - computer-vision

The loss function is computed by multiplying the training images by the weights, then applying softmax to get the "probabilities" of each class.
The loss is then the average of -log(probability_of_correct).
For the gradient check using finite differences with h=1e-5, I got:
numerical: -0.5030223143798196 analytic: -0.5030224447581528, relative error: 1.295949629044475e-07
numerical: 2.3656258338666802 analytic: 2.3656257281976703, relative error: 2.233426157970689e-08
numerical: -0.48364876714668265 analytic: -0.48364875518402267, relative error: 1.2367094609003816e-08
numerical: 2.652402151537281 analytic: 2.6524021167525063, relative error: 6.557221125801157e-09
numerical: 0.10394627025789303 analytic: 0.10394623860409008, relative error: 1.5226043077287554e-07
numerical: -4.134500781471928 analytic: -4.134500935000654, relative error: 1.856677879589216e-08
numerical: -0.28087999472958813 analytic: -0.28088004313830933, relative error: 8.617330877076604e-08
numerical: -1.9230911777912139 analytic: -1.9230912804536546, relative error: 2.6692036034498702e-08
numerical: -4.559967254835762 analytic: -4.5599673988207945, relative error: 1.5787945665579486e-08
numerical: -0.21220754542472517 analytic: -0.21220764976932824, relative error: 2.45855012380893e-07
But these relative errors are a lot bigger than the ones I saw when doing the SVM.
That is understandable since "-log" is involved here. However, they did not change much when I changed h (the input to grad_check_sparse) from 1e-5 to 1e-7.
I'm not sure whether this is normal or whether I made a mistake in my code.
My code:
num_classes = W.shape[1]
num_train = X.shape[0]
num_pics = W.shape[0]
scores = X @ W
denominators = np.zeros([num_train])
numerators = np.zeros([num_train])
possibilities = np.zeros([num_train])
for i in range(num_train):
    denominators[i] = np.sum(np.exp(scores[i]))
    numerators[i] = np.exp(scores[i][y[i]])
    possibilities[i] = numerators[i]/denominators[i]
    loss -= np.log(possibilities[i])
loss /= num_train
loss += reg * np.sum(W * W)
Or fully vectorized:
num_classes = W.shape[1]
num_train = X.shape[0]
num_pics = W.shape[0]
scores = X @ W
scores = np.exp(scores)
denominators = np.sum(scores, axis=1)
numerators = scores[np.arange(num_train), y]
possibilities = numerators / denominators
loss = -np.sum(np.log(possibilities)) / num_train
My gradient calculation:
if grad:
    for i in range(num_train):
        for k in range(num_pics):
            dW[k][y[i]] -= np.exp(scores[i][y[i]])*X[i][k]\
                /denominators[i]/possibilities[i]/num_train
            for j in range(num_classes):
                dW[k][j] += numerators[i]*np.exp(scores[i][j])*X[i][k]\
                    /denominators[i]/denominators[i]/possibilities[i]/num_train
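
For comparison, here is a minimal self-contained sketch of the same loss with a vectorized gradient and a centered-difference check on random data. It is not the CS231n `grad_check_sparse` itself: the helper names `softmax_loss` and `relative_errors`, the random data shapes, and the max-shift for numerical stability are assumptions made for illustration. As far as I recall, the CS231n notes treat relative errors around 1e-7 or smaller as fine for smooth objectives such as softmax.

```python
import numpy as np

def softmax_loss(W, X, y, reg):
    """Average cross-entropy loss and its gradient with respect to W."""
    num_train = X.shape[0]
    scores = X @ W
    scores -= scores.max(axis=1, keepdims=True)  # shift rows for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(num_train), y]).mean() + reg * np.sum(W * W)
    dscores = probs.copy()
    dscores[np.arange(num_train), y] -= 1.0      # p_j - 1{j == y_i}
    dW = X.T @ dscores / num_train + 2.0 * reg * W
    return loss, dW

def relative_errors(W, X, y, reg, h=1e-5, num_checks=10, seed=0):
    """Centered-difference check at a few randomly chosen entries of W."""
    rng = np.random.default_rng(seed)
    _, dW = softmax_loss(W, X, y, reg)
    errors = []
    for _ in range(num_checks):
        idx = tuple(rng.integers(0, s) for s in W.shape)
        old = W[idx]
        W[idx] = old + h
        f_plus, _ = softmax_loss(W, X, y, reg)
        W[idx] = old - h
        f_minus, _ = softmax_loss(W, X, y, reg)
        W[idx] = old                             # restore the original weight
        numerical = (f_plus - f_minus) / (2 * h)
        analytic = dW[idx]
        errors.append(abs(numerical - analytic) / (abs(numerical) + abs(analytic)))
    return errors

# Hypothetical stand-in data: 50 samples, 20 features, 10 classes
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 20))
y = rng.integers(0, 10, size=50)
W = rng.normal(size=(20, 10)) * 0.01
errors = relative_errors(W, X, y, reg=0.0)
print(max(errors))
```

If a check like this gives small errors on random data while the CIFAR-10 run gives larger ones, the difference may come from the data scale or precision rather than from a bug in the gradient.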

Related

Is Time Resolution always the best in Continuous Wavelet Transform (CWT)?

Does the wavelet move one sample point at a time in the CWT? If so, the transform would seem to have the best possible time resolution at every scale. In the example below the signal length is 2048, and for every scale value the calculated coef has length 2048.
import pywt
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook
# Define signal
fs = 1024.0
dt = 1 / fs
signal_frequency = 100
scale = np.arange(2, 256)
wavelet = 'morl'
t = np.linspace(0, 2, int(2 * fs))
y = np.sin(signal_frequency*2*np.pi*t)
# Calculate continuous wavelet transform
coef, freqs = pywt.cwt(y, scale, wavelet, dt)
# Show w.r.t. time and frequency
plt.figure(figsize=(6, 4))
plt.pcolor(t, freqs, coef, shading = 'flat', cmap = 'jet')
# Set yscale, ylim and labels
# plt.yscale('log')
# plt.ylim([0, 20])
plt.ylabel('Frequency (Hz)')
plt.xlabel('Time (sec)')
cbar = plt.colorbar()
cbar.set_label('Energy Intensity')
plt.show()
If my guess is right, it contradicts my previous understanding: we use a dilated wavelet for low-frequency components because we want good frequency resolution, at the cost of poor time resolution, in the low-frequency range, as shown in the picture.
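
The observation about the output length can be reproduced with a small NumPy-only sketch. This is a hand-rolled stand-in, not the PyWavelets implementation: the helper names `morlet` and `cwt_row`, the truncation `half_width`, and the three scales are assumptions. It shows that each row is evaluated at every one-sample shift, so it always has the signal's length, while the wavelet itself spans more samples as the scale grows.

```python
import numpy as np

def morlet(x):
    # Real Morlet-type mother wavelet, exp(-x^2 / 2) * cos(5x)
    return np.exp(-x**2 / 2.0) * np.cos(5.0 * x)

def cwt_row(signal, scale, half_width=4.0):
    # Dilated wavelet sampled on the signal's own grid; its support grows with scale
    n_half = int(np.ceil(half_width * scale))
    x = np.arange(-n_half, n_half + 1) / scale
    kernel = morlet(x) / np.sqrt(scale)
    # mode="same" evaluates the correlation at every one-sample shift
    row = np.convolve(signal, kernel[::-1], mode="same")
    return row, kernel.size

fs = 1024.0
t = np.linspace(0, 2, int(2 * fs))
y = np.sin(100 * 2 * np.pi * t)

scales = [2, 16, 64]
results = [cwt_row(y, s) for s in scales]
row_lengths = [row.size for row, _ in results]
kernel_lengths = [size for _, size in results]
print(row_lengths)     # [2048, 2048, 2048]
print(kernel_lengths)  # [17, 129, 513]
```

The number of output samples is therefore the same at every scale, but each coefficient at a large scale averages over a much longer stretch of the signal.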

Computational Physics, FFT analysis

I solved the following questions for a computational assignment and got a really bad grade on it (67%). I would like to understand how to do these questions properly, in particular Q1.b and Q3. Please be as detailed as possible; I would really like to understand my mistakes.
Generate data (sinusoidal functions). Use fft to analyze:
a) A superposition of three waves with constant, but different frequencies
b) A wave whose frequency depends on time
Plot the graphs, sample frequencies, amplitude and power spectra with appropriate axes.
Use the 3 waves from Exercise 1a), but change them to have the same frequency, phase and amplitude. Contaminate each of them with successively increasing amounts of random, Gaussian-distributed noise.
1) Perform an FFT on the superposition of the three noise-contaminated waves. Analyze and plot the output.
2) Filter the signal with a Gaussian function, plot the “clean” wave, and analyze the result. Is the resultant wave 100% clean? Explain.
#1(b)
tmin = -2*pi
tmax = 2*pi
delta = 0.01
t = arange(tmin, tmax, delta)
y = sin(2.5*t*t)
plot(t, y, '-')
title('Figure 2: Plotting a wave whose frequency depends on time ')
xlabel('Time (s)')
ylabel('Y(t)')
show()
#b.2
Fs = 150.0; # sampling rate
Ts = 1.0/Fs; # sampling interval
t = np.arange(0,1,Ts) # time vector
ff = 5; # frequency of the signal
y = np.sin(2*np.pi*ff*t)
n = len(y) # length of the signal
k = np.arange(n)
T = n/Fs
frq = k/T # two sides frequency range
frq = frq[range(n//2)] # one side frequency range
Y = np.fft.fft(y)/n # fft computing and normalization
Y = Y[range(n//2)]
#Time vs. Amplitude
plot(t,y)
title('Figure 2: Time vs. Amplitude')
xlabel('Time')
ylabel('Amplitude')
plt.show()
#Amplitude Spectrum
plot(frq,abs(Y),'r')
title('Figure 2a: Amplitude Spectrum')
xlabel('Freq (Hz)')
ylabel('amplitude spectrum')
plt.show()
#Power Spectrum
plot(frq,abs(Y)**2,'r')
title('Figure 2b: Power Spectrum')
xlabel('Freq (Hz)')
ylabel('power spectrum')
plt.show()
#Exercise 3:
#part 1
t = np.linspace(-0.5*pi,0.5*pi,1000)
#contaminating our waves with successively increasing white noise
y_1 = sin(15*t) + np.random.normal(0,0.2*pi,1000)
y_2 = sin(15*t) + np.random.normal(0,0.3*pi,1000)
y_3 = sin(15*t) + np.random.normal(0,0.4*pi,1000)
y = y_1 + y_2 + y_3 # superposition of three contaminated waves
#Plotting the figure
plot(t,y,'-')
title('A superposition of three waves contaminated with Gaussian Noise')
xlabel('Time (s)')
ylabel('Y(t)')
show()
delta = pi/1000.0
n = len(y) ## calculate frequency in Hz
freq = fftfreq(n, delta) # Computing the FFT
Freq = fftfreq(len(y), delta) #Using Fast Fourier Transformation to #calculate frequencies
N = len(Freq)
fr = Freq[1:len(Freq)//2]
A = fft(y)
XF = A[1:len(A)//2]/float(len(A[1:len(A)//2]))
# Amplitude spectrum for contaminated waves
plt.plot(fr, abs(XF))
title('Figure 3a : Amplitude spectrum with Gaussian Noise')
xlabel('frequency')
ylabel('Amplitude')
show()
# Power spectrum for contaminated waves
plt.plot(fr,abs(XF)**2)
title('Figure 3b: Power spectrum with Gaussian Noise')
xlabel('frequency(cycles/year)')
ylabel('Power')
show()
# part 2
F_v = exp(-(abs(freq)-2)**2/2*0.5**2)
spectrum = A*F_v #Applying the Gaussian Filter to clean our waves
new_y = ifft(spectrum) #Computing the inverse FFT
plot(t,new_y,'-')
title('A superposition of three waves after Noise Filtering')
xlabel('Time (s)')
ylabel('Y(t)')
show()
Something like the code/images below would have been expected. I deviated in the plot of the sum of the three noisy waves to show off all three waves and the sum. Note that in the intensity spectrum of the noisy wave you don't see much. For those cases it can be instructive to also plot the logarithm of the spectrum (np.log) so you can see the noise better.
In the last plot I plotted both the Gaussian filter and the spectrum (different sizes) w/o rescaling just to show where the filter applies. It is effectively a low pass filter (lets low frequencies through), removing the higher frequency noise by multiplying it with numbers close to zero.
import numpy as np
import matplotlib.pyplot as p
%matplotlib inline
#1(b)
p.figure(figsize=(20,16))
p.subplot(431)
t = np.arange(0,10, 0.001) #units in seconds
#cleaner to show the frequency change explicitly than y = sin(2.5*t*t)
f= 1+ t*0.1 # linear up chirp, i.e. frequency goes up , frequency units in Hz (1/sec)
y = np.sin(2* np.pi* f* t)
p.plot(t, y, '-')
p.title('Figure 2: Plotting a wave whose frequency depends on time ')
p.xlabel('Time (s)')
p.ylabel('Y(t)')
#b.2
Fs = 150.0; # sampling rate
Ts = 1.0/Fs; # sampling interval
t = np.arange(0,1,Ts) # time vector
ff = 5; # frequency of the signal
y = np.sin(2*np.pi*ff*t)
n = len(y) # length of the signal
k = np.arange(n) ## ok, the FFT has as many points in frequency space as the original has in time
T = n/Fs ## correct; T = total duration of the record, the frequency spacing is 1/T
frq = k/T # two sided frequency range
frq = frq[range(n//2)] # one sided frequency range
Y = np.fft.fft(y)/n # fft computing and normalization
Y = Y[range(n//2)]
# Amplitude vs. Time
p.subplot(434)
p.plot(t,y)
p.title('y(t)') # Amplitude vs Time is commonly said, but strictly not true, the amplitude is unchanging
p.xlabel('Time')
p.ylabel('Amplitude')
#Amplitude Spectrum
p.subplot(435)
p.plot(frq,abs(Y),'r')
p.title('Figure 2a: Amplitude Spectrum')
p.xlabel('Freq (Hz)')
p.ylabel('amplitude spectrum')
#Power Spectrum
p.subplot(436)
p.plot(frq,abs(Y)**2,'r')
p.title('Figure 2b: Power Spectrum')
p.xlabel('Freq (Hz)')
p.ylabel('power spectrum')
#Exercise 3:
#part 1
t = np.linspace(-0.5*np.pi,0.5*np.pi,1000)
# #contaminating our waves with successively increasing white noise
y_1 = np.sin(15*t) + np.random.normal(0,0.1,1000) # no need to get pi involved in this amplitude
y_2 = np.sin(15*t) + np.random.normal(0,0.2,1000)
y_3 = np.sin(15*t) + np.random.normal(0,0.4,1000)
y = y_1 + y_2 + y_3 # superposition of three contaminated waves
#Plotting the figure
p.subplot(437)
p.plot(t,y_1+2,'-',lw=0.3)
p.plot(t,y_2,'-',lw=0.3)
p.plot(t,y_3-2,'-',lw=0.3)
p.plot(t,y-6 ,lw=1,color='black')
p.title('A superposition of three waves contaminated with Gaussian Noise')
p.xlabel('Time (s)')
p.ylabel('Y(t)')
delta = np.pi/1000.0
n = len(y) ## calculate frequency in Hz
# freq = np.fft(n, delta) # Computing the FFT <-- wrong, you don't calculate the FFT from a number, but from a time dep. vector/array
# Freq = np.fftfreq(len(y), delta) #Using Fast Fourier Transformation to #calculate frequencies
# N = len(Freq)
# fr = Freq[1:len(Freq)/2.0]
# A = fft(y)
# XF = A[1:len(A)/2.0]/float(len(A[1:len(A)/2.0]))
# Why not do as before?
k = np.arange(n) ## ok, the FFT has as many points in frequency space as the original has in time
T = n*delta ## total duration of this record; the sampling interval here is delta, not 1/Fs
frq = k/T # two sided frequency range
frq = frq[range(n//2)] # one sided frequency range
Y = np.fft.fft(y)/n # fft computing and normalization
Y = Y[range(n//2)]
# Amplitude spectrum for contaminated waves
p.subplot(438)
p.plot(frq, abs(Y))
p.title('Figure 3a : Amplitude spectrum with Gaussian Noise')
p.xlabel('frequency')
p.ylabel('Amplitude')
# Power spectrum for contaminated waves
p.subplot(439)
p.plot(frq,abs(Y)**2)
p.title('Figure 3b: Power spectrum with Gaussian Noise')
p.xlabel('Freq (Hz)')
p.ylabel('Power')
# part 2
p.subplot(4,3,11)
F_v = np.exp(-(np.abs(frq)-2)**2/2*0.5**2) ## this is a Gaussian, plot it separately to see it; play with the values
cleaned_spectrum = Y*F_v #Applying the Gaussian Filter to clean our waves ## multiplication in FreqDomain is convolution in time domain
p.plot(frq,F_v)
p.plot(frq,cleaned_spectrum)
p.subplot(4,3,10)
new_y = np.fft.ifft(cleaned_spectrum) #Computing the inverse FFT of the cleaned spectrum to see the cleaned wave
p.plot(t[range(n//2)],new_y,'-')
p.title('A superposition of three waves after Noise Filtering')
p.xlabel('Time (s)')
p.ylabel('Y(t)')
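
As a compact cross-check of Exercise 3, the following is a minimal sketch using NumPy's real-input FFT helpers, without plotting. The sampling rate, the 5 Hz signal frequency (chosen so the record holds a whole number of periods), the filter width `sigma`, and the seed are all assumptions picked for illustration. They are not the values from the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 500.0                      # assumed sampling rate in Hz
n = 1000                        # 2 s of data
t = np.arange(n) / fs
f0 = 5.0                        # assumed signal frequency: exactly 10 periods in the record

clean = 3 * np.sin(2 * np.pi * f0 * t)
# Three identical waves, each with a larger noise standard deviation
y = sum(np.sin(2 * np.pi * f0 * t) + rng.normal(0, s, n) for s in (0.1, 0.2, 0.4))

freqs = np.fft.rfftfreq(n, d=1 / fs)   # one-sided frequency axis in Hz
Y = np.fft.rfft(y)
amplitude = 2 * np.abs(Y) / n          # one-sided amplitude spectrum
power = amplitude ** 2
peak_freq = freqs[np.argmax(amplitude)]

# Gaussian band-pass centred on the signal frequency, applied in the frequency domain
sigma = 1.0
gauss = np.exp(-(freqs - f0) ** 2 / (2 * sigma ** 2))
y_filtered = np.fft.irfft(Y * gauss, n=n)

rms_noisy = np.sqrt(np.mean((y - clean) ** 2))
rms_filtered = np.sqrt(np.mean((y_filtered - clean) ** 2))
print(peak_freq, rms_noisy, rms_filtered)
```

In this sketch the filtered wave is much closer to the clean one but not identical to it, because the noise components that fall inside the filter's passband are kept along with the signal.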

ordfilt2: Find requires variable sizing

I want to generate C++ code from the following MATLAB function (Harris corner detection), which detects corners in an image. My constraint is that I have to generate a static library in C++ without variable-sizing support.
So, I have disabled variable size support from settings and also selected target platform as unspecified 32 bit processor.
In this way I'll be able to use it in Vivado HLS for an FPGA project.
However, when I generate the code, the line containing ordfilt2 function throws an error that FIND requires variable sizing.
Please help me if there is a workaround for this problem. I have seen a similar question posted here, Matlab error "Find requires variable sizing", but I am not sure how it applies to my case. Thanks.
Here's the code:
function [cim] = harris(im , thresh)
dx = [-1 0 1; -1 0 1; -1 0 1]; % Derivative masks
dy = dx';
Ix = conv2(im, dx, 'same'); % Image derivatives
Iy = conv2(im, dy, 'same');
% Generate Gaussian filter of size 6*sigma (+/- 3sigma) and of
% minimum size 1x1.
sigma = 1.5;
g = fspecial('gaussian',max(1,fix(6*sigma)), sigma);
Ix2 = conv2(Ix.^2, g, 'same'); % Smoothed squared image derivatives
Iy2 = conv2(Iy.^2, g, 'same');
Ixy = conv2(Ix.*Iy, g, 'same');
cim = (Ix2.*Iy2 - Ixy.^2)./(Ix2 + Iy2 + eps); % Harris corner measure
% Extract local maxima by performing a grey scale morphological
% dilation and then finding points in the corner strength image that
% match the dilated image and are also greater than the threshold.
radius = 1.5;
sze = 2*radius+1; % Size of mask.
mx = ordfilt2(cim,sze^2,ones(sze)); % Grey-scale dilate.
cim = (cim==mx)&(cim>thresh); % Find maxima.
end

Complex cross spectral density

mlab.csd from matplotlib: http://matplotlib.org/api/mlab_api.html#matplotlib.mlab.csd can be used to get real valued cross spectral density. If I want to get the phase information from the spectral density, I need a csd calculation which returns complex values. Is there one?
This is discussed e.g. in this answer: https://stackoverflow.com/a/29306730/3920342
If you use csd from the mlab library you will get complex values, so you can calculate phase angles (and the real valued coherence). In the following code s1 and s2 contain the two signals (in the time domain) to be correlated.
from matplotlib import mlab
# First create power spectral densities for normalization
(ps1, f) = mlab.psd(s1, Fs=1./dt, scale_by_freq=False)
(ps2, f) = mlab.psd(s2, Fs=1./dt, scale_by_freq=False)
plt.plot(f, ps1)
plt.plot(f, ps2)
# Then calculate cross spectral density
(csd, f) = mlab.csd(s1, s2, NFFT=256, Fs=1./dt,sides='default', scale_by_freq=False)
fig = plt.figure()
ax1 = fig.add_subplot(1, 2, 1)
# Normalize cross spectral absolute values by auto power spectral density
ax1.plot(f, np.absolute(csd)**2 / (ps1 * ps2))
ax2 = fig.add_subplot(1, 2, 2)
angle = np.angle(csd, deg=True)
angle[angle<-90] += 360
ax2.plot(f, angle)
# zoom in on frequency with maximum coherence
ax1.set_xlim(9, 11)
ax1.set_ylim(0, 1e-0)
ax1.set_title("Cross spectral density: Coherence")
ax2.set_xlim(9, 11)
ax2.set_ylim(0, 90)
ax2.set_title("Cross spectral density: Phase angle")
Here the real and imaginary(!) part of the cross spectral density:
This code is taken from the question How to use the cross-spectral density to calculate the phase shift of two related signals to create two signals s1 and s2:
"""
Compute the coherence of two signals
"""
import numpy as np
import matplotlib.pyplot as plt
# make a little extra space between the subplots
plt.subplots_adjust(wspace=0.5)
nfft = 256
dt = 0.01
t = np.arange(0, 30, dt)
nse1 = np.random.randn(len(t)) # white noise 1
nse2 = np.random.randn(len(t)) # white noise 2
r = np.exp(-t/0.05)
cnse1 = np.convolve(nse1, r, mode='same')*dt # colored noise 1
cnse2 = np.convolve(nse2, r, mode='same')*dt # colored noise 2
# two signals with a coherent part and a random part
s1 = 0.01*np.sin(2*np.pi*10*t) + cnse1
s2 = 0.01*np.sin(2*np.pi*10*t) + cnse2
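
To make the link between the complex cross spectrum and the phase shift concrete, here is a NumPy-only sketch that averages conj(X) * Y over windowed segments. It is an illustration, not matplotlib's implementation: the helper name `cross_spectrum`, the Hann window, the non-overlapping segments, the missing density scaling, and the test signals with a known pi/4 shift are all assumptions.

```python
import numpy as np

def cross_spectrum(x, y, nfft=256, fs=1.0):
    """Average conj(X) * Y over non-overlapping Hann-windowed segments (unscaled)."""
    window = np.hanning(nfft)
    n_seg = len(x) // nfft
    acc = np.zeros(nfft // 2 + 1, dtype=complex)
    for i in range(n_seg):
        seg = slice(i * nfft, (i + 1) * nfft)
        X = np.fft.rfft(window * x[seg])
        Y = np.fft.rfft(window * y[seg])
        acc += np.conj(X) * Y
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, acc / n_seg

# Two hypothetical 10 Hz signals; s2 lags s1 by a known phase shift
rng = np.random.default_rng(0)
dt = 0.01
t = np.arange(0, 30, dt)
shift = np.pi / 4
s1 = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
s2 = np.sin(2 * np.pi * 10 * t - shift) + 0.1 * rng.standard_normal(t.size)

freqs, pxy = cross_spectrum(s1, s2, nfft=256, fs=1 / dt)
peak = np.argmax(np.abs(pxy))
phase_at_peak = np.angle(pxy[peak])   # close to -shift with this sign convention
print(freqs[peak], phase_at_peak)
```

With the conj(X) * Y convention a lag of the second signal shows up as a negative angle. The opposite convention would flip the sign.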

Error using scipy.optimize nonlinear solvers

I am trying to solve a set of M simultaneous equations in M variables. I pass an M x 2 matrix to my function as an initial guess, and it returns an M x 2 matrix in which every entry would equal zero if my guess were correct. Thus my function can be represented as f_k(u1,u2,...uN) = 0 for k=1,2,...N. Below is the code for my function (for simplicity's sake I have left out the modules that go with this code, e.g. p. or phi.; I am mainly wondering whether anyone else has had this error before).
M = len(p.x_lat)
def main(u_A):
    ## unpack u_A
    u_P = u_total[:,0]
    u_W = u_total[:,1]
    ## calculate phi_A for all monomeric species
    G_W = exp(-u_W)
    phi_W = zeros(M)
    phi_W[1:] = p.phi_Wb * G_W[1:]
    ## calculate phi_A for all polymeric species
    G_P = exp(-u_P)
    G_P[0] = 0.
    G_fwd = phi.fwd_propagator(G_P,p.Np,0) #(function that takes G_P and propagates outward)
    G_bkwd = phi.bkwd_propagator(G_P,p.Np,0) #(function that takes G_P and propagates inward)
    phi_P = phi.phi_P(G_fwd,G_bkwd,p.norm_graft_density,p.Np) #(function that takes the two propagators and combines them to calculate a segment density at each point)
    ## calculate u_A components
    u_intW = en.u_int_AB(p.chi_PW,phi_P,p.phi_Pb) + en.u_int_AB(p.chi_SW,p.phi_S,p.phi_Sb) #(fxn that calculates new potential from the new segment densities)
    u_intW[0] = 0.
    u_Wprime = u_W - u_intW
    u_intP = en.u_int_AB(p.chi_PW,phi_W,p.phi_Wb) + en.u_int_AB(p.chi_PS,p.phi_S,p.phi_Sb) #(fxn that calculates new potential from the new segment densities)
    u_intP[0] = 0.
    u_Pprime = u_P - u_intP
    ## calculate f_A
    phi_total = p.phi_S + phi_W + phi_P
    u_prime = 0.5 * (u_Wprime + u_Pprime)
    f_total = zeros( (M, 2) )
    f_total[:,0] = 1. - 1./phi_total + u_prime - u_Wprime
    f_total[:,1] = 1. - 1./phi_total + u_prime - u_Pprime
    return f_total
I researched ways of solving nonlinear equations such as this one using python. I came across the scipy.optimize library with the several options for solvers http://docs.scipy.org/doc/scipy-0.13.0/reference/optimize.nonlin.html. I first tried to use the newton_krylov solver and received the following error message:
ValueError: Jacobian inversion yielded zero vector. This indicates a bug in the Jacobian approximation.
I also tried broyden1 solver and it never converged but simply stayed stagnant. Code for implementation of both below:
sol = newton_krylov(main, guess, verbose=1, f_tol=10e-7)
sol = broyden1(main, guess, verbose=1, f_tol=10e-7)
My initial guess is given below here:
## first guess of u_A(x)
u_P = zeros(M)
u_P[1] = -0.0001
u_P[M-1] = 0.0001
u_W = zeros(M)
u_W[1] = 0.0001
u_W[M-1] = -0.0001
u_total = zeros( (M,2) )
u_total[:,0] = u_P
u_total[:,1] = u_W
guess = u_total
Any help would be greatly appreciated!
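
One thing that may be worth checking: `main` takes `u_A` as its argument but unpacks the global `u_total`. As far as I understand, solvers such as `newton_krylov` estimate Jacobian-vector products by finite differences of the residual, so a residual that does not depend on its argument would produce exactly zero. The sketch below uses hypothetical toy residuals (the names `residual_ignoring_input`, `residual_using_input` and `directional_derivative` are illustrative, not part of SciPy) to show the effect. Whether this is the actual cause of the error here is an assumption.

```python
import numpy as np

u_total = np.zeros((4, 2))  # hypothetical global, standing in for the initial guess

def residual_ignoring_input(u_A):
    # Reads the global instead of the argument, so the output never changes
    return u_total ** 2 - 1.0

def residual_using_input(u_A):
    # Uses the value the solver actually passes in
    u = np.asarray(u_A)
    return u ** 2 - 1.0

def directional_derivative(f, x, v, eps=1e-6):
    # Forward-difference estimate of J(x) @ v
    return (f(x + eps * v) - f(x)) / eps

x0 = np.ones((4, 2))
v = np.ones((4, 2))
jv_bad = directional_derivative(residual_ignoring_input, x0, v)
jv_good = directional_derivative(residual_using_input, x0, v)
print(np.abs(jv_bad).max())   # 0.0: the residual does not react to the input
print(jv_good[0, 0])          # close to 2.0, the derivative of u**2 at u = 1
```

If the same test gives all zeros for `main`, unpacking `u_A[:,0]` and `u_A[:,1]` instead of `u_total` would be the first thing to try.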