I am working with VGA on my Basys3 FPGA, and I currently want to draw a zone plate, for which the equation is (1 + cos(k*r^2)) / 2, where r is the distance from the plate center, and k=2*pi/lambda is the wavenumber, which determines the scale of the plate. I am guessing the best course of action would be to use a cosine LUT, but I really have no idea how to create one. I somewhat understand the idea behind it, but I don't know how to write one and what values it should contain.
This is the code I am trying to test out:
The only problem with this now is that I do not know what values to fill the memory_type :=() with, so that it will equal the k*r^2 from the formula.
architecture Behavioral of VGAdraw is
signal i : integer range 0 to 29:=0;
signal r : integer :=2;
type memory_type is array (0 to 29) of integer range -128 to 127;
signal cosine : memory_type :=();
begin
process(CLK)
begin
if (CLK'EVENT and CLK = '1') then
if (cntHor >= 0) AND (cntHor <= cstHorAL - 1) then
RED <= conv_std_logic_vector ((1 - cosine (i)) / 2, 8) (7 downto 4);
GREEN <= conv_std_logic_vector ((1 - cosine (i)) / 2, 8) (7 downto 4);
BLUE <= conv_std_logic_vector ((1 - cosine (i)) / 2, 8) (7 downto 4);
i <= i + 1;
else
RED <= "0000";
GREEN <= "0000";
BLUE <= "0000";
end if;
end if;
end process;
end Behavioral;
cntHor - horizontal counter
cstHorAL - nr of pixels on an active line
I cannot post the image itself due to lack of reputation, but this is what it should look like: http://handforgedvideo.com/wp-content/uploads/2013/02/1920x1080p24_Luma_Zone_Plate_Main.png
Any help is appreciated.
Thank you!
Your general code isn't too far off, but as Morten pointed out, you don't specify the format of the input (theta) or the outputs (sin_data and cos_data). Are they fixed point values? Where's the fractional point? Are they just integers?
You say:
I am guessing the best course of action would be to use a cosine LUT, but I really have no idea how to create one.
I presume you mean by "LUT" a generic "lookup table". The use of "LUT" is ambiguous in your question since you also mention the Basys3. In FPGA literature, LUT is a specific type of logic structure on an FPGA. It also means "lookup table", but the size and complexity are limited to a few digital inputs. There are no "cosine LUT" objects available. I just wanted to be clear that by "LUT" you mean a generic lookup table.
Now, your code isn't too far off. It is indeed a lookup table to pass in theta and output sin_data and cos_data. The questions are a) whether or not your outputs accurately represent the function and b) whether or not your implementation is the most efficient.
For the former, I'm not sure since you don't specify the input and output format. Also, you don't specify the mapping between ϴ and your function. Is ϴ the argument to cos()? Or is it k? Or is it λ?
For the latter, take a look at Xilinx UG901. It gives examples of how to infer ROMs (see the 'ROM HDL Coding Techniques' section). Your code, as written, would probably be the least efficient method. You need two lookup tables with 4K entries each (sin_data and cos_data), so two 4K x 12-bit ROMs. You'd be better off with a ROM built from a block RAM.
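To make the "what values should the table contain" part of the question concrete, here is a minimal sketch in C++, meant to be run on a PC to compute candidate table entries, not on the FPGA. It assumes the 30-entry table covers exactly one period of the cosine and that entries are scaled to ±127 so they fit the question's -128 to 127 range. The names make_cosine_table and zone_plate_level and the step parameter are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One full cosine period, scaled to fit a signed 8-bit range (-127..127).
// Entry n holds round(127 * cos(2*pi*n / entries)).
std::vector<int> make_cosine_table(int entries) {
    assert(entries > 0);
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<int> table(static_cast<std::size_t>(entries));
    for (int n = 0; n < entries; ++n) {
        table[static_cast<std::size_t>(n)] =
            static_cast<int>(std::lround(127.0 * std::cos(two_pi * n / entries)));
    }
    return table;
}

// 4-bit intensity (0..15) approximating (1 + cos(k * r^2)) / 2.
// The phase k * r^2 is reduced modulo one period by wrapping the table index,
// so k corresponds to 2*pi*step / table.size() (an assumed scaling).
int zone_plate_level(const std::vector<int>& table, long long r_squared, long long step) {
    assert(!table.empty() && r_squared >= 0 && step >= 0);
    const long long entries = static_cast<long long>(table.size());
    const std::size_t index = static_cast<std::size_t>((r_squared * step) % entries);
    return (127 + table[index]) * 15 / 254;  // maps -127..127 onto 0..15
}
```

The numbers returned by make_cosine_table(30) could be pasted into the VHDL array initializer; a larger table would give finer phase resolution.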
Here is my code, throttle comes out to -18 when I run the program, and when I do the math I get 77.941... which is what I'm looking for. I know this is an EDQ "Extremely Dumb Question", and I am most likely to experience a FIF, "Fist In Forehead" moment any minute but I am stuck on it for now. FYI, programming it on an Atmega 328P using Arduino IDE on Windows 10.
The following example prints -18, but according to my calculations it should be 77.941...
int throttle = (((800 - 270) * 100) / 680);
Serial.println(throttle);
This is the visualized code...
throttle = (((throttleSensor - oldMinValue) * (newMax - newMin)) / (oldMax - oldMin));
I am trying to do this, Convert a number range to another range, maintaining ratio
Also, I should add, it works fine when the result is below 47, above that it flips to a negative number.
The short answer is that (800 - 270) * 100 = 53000, which is too large for a 16-bit int (on the ATmega328P an int ranges from -32768 to 32767), so the intermediate result overflows.
so changing the code from this...
int throttle = (((800 - 270) * 100) / 680);
to this...
long largeValue = 100;
int throttle = (((800 - 270) * largeValue) / 680);
fixes the problem. The number 100, or the value of (newMax - newMin), has to be a long so that the multiplication is carried out in 32 bits; otherwise the 16-bit intermediate result overflows. Someone, please correct me on this if need be, or post a better answer if you have one. Also, if someone has a better suggestion for the title so that future people with the same problem can find it more easily, go ahead and comment below.
Thanks to the StackOverflow community for helping me solve this issue!
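For anyone who wants to reproduce the -18 on a desktop machine, where int is normally 32 bits and the original expression behaves correctly, here is a small sketch that imitates the 16-bit truncation by hand. It assumes the overflow wraps around to the low 16 bits, which matches the observed output; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 16 bits of v and reinterpret them as a signed value,
// imitating what a 16-bit int ends up holding after the overflow.
std::int32_t wrap_to_int16(std::int32_t v) {
    const std::uint32_t low = static_cast<std::uint32_t>(v) & 0xFFFFu;
    return (low & 0x8000u) ? static_cast<std::int32_t>(low) - 65536
                           : static_cast<std::int32_t>(low);
}

// Range scaling with the product truncated to 16 bits (the faulty behaviour).
std::int32_t scale_with_16bit_product(std::int32_t value, std::int32_t old_min,
                                      std::int32_t old_max, std::int32_t new_range) {
    assert(old_max != old_min);
    return wrap_to_int16((value - old_min) * new_range) / (old_max - old_min);
}

// Same scaling with the product kept in 32 bits (the fix).
std::int32_t scale_with_32bit_product(std::int32_t value, std::int32_t old_min,
                                      std::int32_t old_max, std::int32_t new_range) {
    assert(old_max != old_min);
    return (value - old_min) * new_range / (old_max - old_min);
}
```

With value 800, old range 270 to 950 and a new range of 100, the truncated product 53000 becomes -12536, and -12536 / 680 truncates to -18, while the 32-bit version yields 77.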
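As @edgar_wideman's answer suggests, your intermediate result (53000) does not fit into a 16-bit integer, whose range is -32768 to +32767. You can avoid using long by bit-shifting (dividing by a power of 2), at the cost of some precision whenever the shifted values are not exact multiples of that power, like this: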
int sh=1; // shift stuff so it fits 16 bit
int throttle = (((800 - 270) * (100>>sh)) / (680>>sh));
This might not be exactly what you are looking for, but there is a function for changing values from one range to another:
int y = map(value, minOld, maxOld, minNew, maxNew);
for example:
int y = map(10, 0, 50, 0, 100); // y would be 20
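The Arduino reference documents map() as doing its arithmetic in long, which is why it avoids the 16-bit overflow from the question. Below is a desktop sketch of the same formula, using std::int32_t as a stand-in for the 32-bit long on AVR; map_range is a hypothetical name used here so it does not clash with the real function.

```cpp
#include <cassert>
#include <cstdint>

// Linear re-mapping of x from [in_min, in_max] to [out_min, out_max],
// using integer arithmetic wide enough to hold the intermediate product.
std::int32_t map_range(std::int32_t x, std::int32_t in_min, std::int32_t in_max,
                       std::int32_t out_min, std::int32_t out_max) {
    assert(in_max != in_min);
    return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}
```

Like the original, this truncates the division, so the throttle example (800 in the range 270 to 950, mapped to 0 to 100) gives 77 rather than 77.941.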
I am using the following function written in C++, whose purpose is to take the integral of one array of data (y) with respect to another (x)
// Define function to perform numerical integration by the trapezoidal rule
double trapz (double xptr[], double yptr[], int Npoints)
{
// The trapzDiagFile object and associated output file are how I monitor what data the for loop actually sees.
std::ofstream trapzDiagFile;
trapzDiagFile.open("trapzDiagFile.txt",std::ofstream::out | std::ofstream::trunc);
double buffer = 0.0;
for (int n = 0; n < (Npoints - 1); n++)
{
buffer += 0.5 * (yptr[n+1] + yptr[n]) * (xptr[n+1] - xptr[n]);
trapzDiagFile << xptr[n] << "," << yptr[n] << std::endl;
}
trapzDiagFile.close();
return buffer;
}
I validated this function for the simple case where x contains 100 uniformly spaced points from 0 to 1, and y = x^2, and it returned 0.33334, as it should.
But when I use it for a different data set, it returns -3.431, which makes absolutely no sense. If you look in the attached image file, the integral I am referring to is the area under the curve between the dashed vertical lines.
It's definitely a positive number.
Moreover, I used the native trapz command in MATLAB on the same set of numbers and that returned 1.4376.
In addition, I translated the above C++ trapz function into MATLAB, line for line as closely as possible, and again got 1.4376.
I feel like there's something C++ related I'm not seeing here. If it is relevant, I am using minGW-w64.
Apologies for the vagueness of this post. If I knew more about what kind of issue I am seeing, it would be easier to be concise about it.
Plot of the dataset for which the trapz function (my homemade C++ version) returns -3.431:
Please check the value of xptr[Npoints - 1]. It may be less than xptr[Npoints - 2], and was not included in the values that you output.
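To illustrate the kind of problem this check would catch, here is a minimal sketch in which a single out-of-order final x value drags the integral negative. Whether that is what happened with the data in the question is not known; the vector-based signature and the helper first_non_increasing are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trapezoidal rule over all points of (x, y).
double trapz(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    double sum = 0.0;
    for (std::size_t n = 0; n + 1 < x.size(); ++n) {
        sum += 0.5 * (y[n + 1] + y[n]) * (x[n + 1] - x[n]);
    }
    return sum;
}

// Index of the first sample whose x is not greater than its predecessor,
// or x.size() if x is strictly increasing.
std::size_t first_non_increasing(const std::vector<double>& x) {
    for (std::size_t n = 1; n < x.size(); ++n) {
        if (x[n] <= x[n - 1]) {
            return n;
        }
    }
    return x.size();
}
```

For y = 1 everywhere, x = {0, 1, 2, 3} integrates to 3, whereas x = {0, 1, 2, -10} integrates to -10 because the last interval has a width of -12. Running first_non_increasing on the real data before integrating would flag such a sample, including the last one that the diagnostic file in the question never prints.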
I'm trying to do some calculations for my game, and I'm trying to calculate the distance between two points. Essentially, I'm using the equation of a circle to see if the points are inside of the radius that I define.
(x - x1)^2 + (y - y1)^2 <= r^2
My question is: how do I evaluate the conditional statement with SSE and interpret the results? So far I have this:
float distSqr4 = (pow(x4 - k->getPosition().x, 2) + pow(y4 - k->getPosition().y, 2));
float distSqr3 = (pow(x3 - k->getPosition().x, 2) + pow(y3 - k->getPosition().y, 2));
float distSqr2 = (pow(x2 - k->getPosition().x, 2) + pow(y2 - k->getPosition().y, 2));
float distSqr1 = (pow(x1 - k->getPosition().x, 2) + pow(y1 - k->getPosition().y, 2));
__m128 distances = _mm_set_ps(distSqr1, distSqr2, distSqr3, distSqr4);
__m128 maxDistSqr = _mm_set1_ps(k->getMaxDistance() * k->getMaxDistance());
__m128 result = _mm_cmple_ps(distances, maxDistSqr);
Once I get the result variable, I get lost. How do I use the result variable that I just got? My plan was, if the condition evaluated turned out to be true, to do some lighting calculations and then draw the pixel on the screen. How do I interpret true vs false in this case?
Any help towards the right direction is greatly appreciated!
My plan was, if the condition evaluated turned out to be true, to do some lighting calculations and then draw the pixel on the screen.
Then you really have little choice but to branch.
The big advantage of doing conditional tests using SSE is that it allows you to write branchless code, which can lead to massive speed improvements. But in your case, you pretty much have to branch because, if I'm understanding you correctly, you never want to output anything on the screen if the condition evaluated to false.
I mean, I guess you could do all of the calculations unconditionally (speculatively) and then just use the result of the conditional to twiddle bits in the pixel values, essentially causing you to draw off of the screen. That would give you branchless code, but it's pretty silly. There is a penalty for branch mispredictions, but it won't be anywhere near as expensive as all of the calculations and drawing code.
In other words, the parallelism you're using SIMD to exploit is exhausted once you have the final result. It's just a simple, scalar compare-and-branch. First you test whether the condition evaluated to true. If not, you'll jump over the code that does the lighting calculations and pixel-drawing. Otherwise, you'll just fall through to execute that code.
The tricky part is that the compiler won't let you use an __m128 variable in a regular old if statement, so you need to "convert" result to an integer that you can use as the basis for a conditional. The easiest way to do that would be the _mm_movemask_epi8 intrinsic.
So you would basically just do:
__m128 distances = _mm_set_ps(distSqr1, distSqr2, distSqr3, distSqr4);
__m128 maxDistSqr = _mm_set1_ps(k->getMaxDistance() * k->getMaxDistance());
__m128 result = _mm_cmple_ps(distances, maxDistSqr);
if (_mm_movemask_epi8(_mm_castps_si128(result)) == 0xFFFF)  // 16 mask bits, one per byte
{
// All distances were less-than-or-equal-to the maximum, so
// go ahead and calculate the lighting and draw the pixels.
CalcLightingAndDraw(…);
}
This works because _mm_cmple_ps sets each packed double-word to all 1s if the comparison is true, or all 0s if the comparison is false. _mm_movemask_epi8 then collapses that into a 16-bit mask, one bit per byte of the vector, and moves it to an integer value, so all four comparisons being true gives 0xFFFF. You then can use that integer value in a normal conditional statement.
Note: With Clang and ICC, you can get away with passing a __m128 value to the _mm_movemask_epi8 intrinsic. GCC insists upon a __m128i value. You can handle this with a cast, either _mm_movemask_epi8((__m128i)result) or, more portably, the _mm_castps_si128 intrinsic: _mm_movemask_epi8(_mm_castps_si128(result)).
Of course, I'm assuming here that you are only going to do the drawing if all of the distances are less-than-or-equal-to the maximum distance. If you want to treat each of the four distances independently, then you need to add more conditional tests on the mask:
__m128 distances = _mm_set_ps(distSqr1, distSqr2, distSqr3, distSqr4);
__m128 maxDistSqr = _mm_set1_ps(k->getMaxDistance() * k->getMaxDistance());
__m128 result = _mm_cmple_ps(distances, maxDistSqr);
// _mm_set_ps stores its first argument in the highest lane, so distSqr1
// maps to mask bits 12-15 and distSqr4 maps to mask bits 0-3.
unsigned condition = _mm_movemask_epi8(_mm_castps_si128(result));
if (condition != 0)
{
    // One or more of the distances were less-than-or-equal-to the maximum,
    // so we have something to draw.
    if ((condition & 0xF000) != 0)
    {
        // distSqr1 was less-than-or-equal-to the maximum
        CalcLightingAndDraw(distSqr1);
    }
    if ((condition & 0x0F00) != 0)
    {
        // distSqr2 was less-than-or-equal-to the maximum
        CalcLightingAndDraw(distSqr2);
    }
    if ((condition & 0x00F0) != 0)
    {
        // distSqr3 was less-than-or-equal-to the maximum
        CalcLightingAndDraw(distSqr3);
    }
    if ((condition & 0x000F) != 0)
    {
        // distSqr4 was less-than-or-equal-to the maximum
        CalcLightingAndDraw(distSqr4);
    }
}
This won't result in very efficient code, since you have to do so many conditional test-and-branch operations. You might be able to continue parallelizing some of the lighting calculations inside of the main if block. I can't say for sure if this is workable, since I don't have enough details about your algorithm/design.
Otherwise, if you can't see any way to wring more parallelism out of the drawing code, the use of explicit SSE intrinsics isn't buying you much here. You were able to parallelize one comparison (_mm_cmple_ps), but the overhead of setting up for that comparison (_mm_set_ps, which will probably compile into vinsertps or unpcklps+movlhps instructions, assuming the inputs are already in XMM registers) will more than cancel out any trivial gains you might get. You'd arguably be just as well off writing the code like so:
float maxDistSqr = k->getMaxDistance() * k->getMaxDistance();
if (distSqr1 <= maxDistSqr)
{
CalcLightingAndDraw(distSqr1);
}
if (distSqr2 <= maxDistSqr)
{
CalcLightingAndDraw(distSqr2);
}
if (distSqr3 <= maxDistSqr)
{
CalcLightingAndDraw(distSqr3);
}
if (distSqr4 <= maxDistSqr)
{
CalcLightingAndDraw(distSqr4);
}
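As an aside to the mask handling discussed in this answer, the _mm_movemask_ps intrinsic returns one bit per float lane, which gives a 4-bit mask and avoids both the cast and the per-byte bit patterns. The sketch below also computes the squared distances with SSE instead of four scalar pow calls. It assumes the four point coordinates are available as arrays; in_range_mask and its parameters are hypothetical names, not taken from the question's code base.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics

// Returns a 4-bit mask: bit n is set when point n (xs[n], ys[n]) lies within
// max_dist of the centre (cx, cy). Lane n of the vector is array element n.
int in_range_mask(const float* xs, const float* ys, float cx, float cy, float max_dist) {
    assert(max_dist >= 0.0f);
    const __m128 dx = _mm_sub_ps(_mm_loadu_ps(xs), _mm_set1_ps(cx));
    const __m128 dy = _mm_sub_ps(_mm_loadu_ps(ys), _mm_set1_ps(cy));
    const __m128 dist_sqr = _mm_add_ps(_mm_mul_ps(dx, dx), _mm_mul_ps(dy, dy));
    const __m128 result = _mm_cmple_ps(dist_sqr, _mm_set1_ps(max_dist * max_dist));
    return _mm_movemask_ps(result);  // sign bit of each lane, lane 0 in bit 0
}
```

A caller would test the mask against 0xF for "all four in range", against 0 for "none", or check individual bits such as (mask & 1) for the first point.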
Let's say I have a system that distributes 8820 values into 96 values, rounding with banker's rounding (call them pulses). The formula is:
pulse = BankerRound(8820 * i/96), with i in [0, 96)
Thus, this is the list of pulses:
0
92
184
276
368
459
551
643
735
827
919
1011
1102
1194
1286
1378
1470
1562
1654
1746
1838
1929
2021
2113
2205
2297
2389
2481
2572
2664
2756
2848
2940
3032
3124
3216
3308
3399
3491
3583
3675
3767
3859
3951
4042
4134
4226
4318
4410
4502
4594
4686
4778
4869
4961
5053
5145
5237
5329
5421
5512
5604
5696
5788
5880
5972
6064
6156
6248
6339
6431
6523
6615
6707
6799
6891
6982
7074
7166
7258
7350
7442
7534
7626
7718
7809
7901
7993
8085
8177
8269
8361
8452
8544
8636
8728
Now, suppose the system doesn't send me these pulses directly. Instead, it sends each pulse as a fraction of 8820 (call it a tick):
tick = pulse * 1/8820
The list of ticks I get becomes:
0
0.010430839
0.020861678
0.031292517
0.041723356
0.052040816
0.062471655
0.072902494
0.083333333
0.093764172
0.104195011
0.11462585
0.124943311
0.13537415
0.145804989
0.156235828
0.166666667
0.177097506
0.187528345
0.197959184
0.208390023
0.218707483
0.229138322
0.239569161
0.25
0.260430839
0.270861678
0.281292517
0.291609977
0.302040816
0.312471655
0.322902494
0.333333333
0.343764172
0.354195011
0.36462585
0.375056689
0.38537415
0.395804989
0.406235828
0.416666667
0.427097506
0.437528345
0.447959184
0.458276644
0.468707483
0.479138322
0.489569161
0.5
0.510430839
0.520861678
0.531292517
0.541723356
0.552040816
0.562471655
0.572902494
0.583333333
0.593764172
0.604195011
0.61462585
0.624943311
0.63537415
0.645804989
0.656235828
0.666666667
0.677097506
0.687528345
0.697959184
0.708390023
0.718707483
0.729138322
0.739569161
0.75
0.760430839
0.770861678
0.781292517
0.791609977
0.802040816
0.812471655
0.822902494
0.833333333
0.843764172
0.854195011
0.86462585
0.875056689
0.88537415
0.895804989
0.906235828
0.916666667
0.927097506
0.937528345
0.947959184
0.958276644
0.968707483
0.979138322
0.989569161
Unfortunately, between these ticks it also sends me fake ticks, which aren't multiples of the original pulses. One example is 0.029024943, which corresponds to 256, a value that isn't in the pulse list.
How can I find from this list which ticks are valid and which are fake?
I don't have the pulse list to compare with during the process, since 8820 will change over time, so I can't compare against a list step by step. I need to deduce it from the ticks at each iteration.
What's the best mathematical approach to this? Maybe reasoning only in ticks and not pulses.
I've thought of comparing the error between the nearest integer pulse and the previous/next tick. Here it is in C++:
double pulse = tick * 96.;
double prevpulse = (tick - 1/8820.) * 96.;
double nextpulse = (tick + 1/8820.) * 96.;
int pulseRounded=round(pulse);
int buffer=lrint(tick * 8820.);
double pulseABS = abs(pulse - pulseRounded);
double prevpulseABS = abs(prevpulse - pulseRounded);
double nextpulseABS = abs(nextpulse - pulseRounded);
if (nextpulseABS > pulseABS && prevpulseABS > pulseABS) {
// is pulse
}
but for example tick 0.0417234 (pulse 368) fails since the prev tick error seems to be closer than it: prevpulseABS error (0.00543795) is smaller than pulseABS error (0.0054464).
That's because this comparison doesn't care about rounding I guess.
NEW POST:
Alright. Based on what I now understand, here's my revised answer.
You have the information you need to build a list of good values. Each time you switch to a new track:
vector<double> good_list;
good_list.reserve(96);
for(int i = 0; i < 96; i++)
good_list.push_back(BankerRound(8820.0 * i / 96.0) / 8820.0);
Then, each time you want to validate the input:
auto iter = find(good_list.begin(), good_list.end(), input);
if(iter != good_list.end()) //It's a match!
cout << "Happy days! It's a match!" << endl;
else
cout << "Oh bother. It's not a match." << endl;
The problem with mathematically determining the correct pulses is the BankerRound() function which will introduce an ever-growing error the higher values you input. You would then need a formula for a formula, and that's getting out of my wheelhouse. Or, you could keep track of the differences between successive values. Most of them would be the same. You'd only have to check between two possible errors. But that falls apart if you can jump tracks or jump around in one track.
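One caveat with the list lookup above is that comparing doubles for exact equality is fragile, since the incoming ticks may be printed or transmitted with limited precision. Here is a hedged variant of the same idea that compares within a tolerance. It assumes the values per beat (8820 in the question) are known whenever the list is rebuilt, uses std::nearbyint as a stand-in for BankerRound (it rounds halves to even under the default rounding mode), and picks a tolerance of a quarter of one step, which is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Expected ticks for one setting of values_per_beat.
std::vector<double> make_good_list(double values_per_beat, int pulses = 96) {
    assert(values_per_beat > 0.0 && pulses > 0);
    std::vector<double> good;
    good.reserve(static_cast<std::size_t>(pulses));
    for (int i = 0; i < pulses; ++i) {
        // Ties such as 367.5 round to the even neighbour (368).
        good.push_back(std::nearbyint(values_per_beat * i / pulses) / values_per_beat);
    }
    return good;
}

// True if tick lies within 0.25 / values_per_beat of any expected tick.
bool is_listed(const std::vector<double>& good, double tick, double values_per_beat) {
    const double tolerance = 0.25 / values_per_beat;
    for (double g : good) {
        if (std::fabs(g - tick) < tolerance) {
            return true;
        }
    }
    return false;
}
```

With 8820 values per beat, the tick 0.041723356 (pulse 368) is accepted and the fake tick 0.029024943 (pulse 256) is rejected.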
OLD POST:
If I understand the question right, the only information you're getting should be coming in the form of (p/v = y) where you know 'y' (that's each element in your list of ticks you get from the device) and you know that 'p' is the Pulse and 'v' is the Values per Beat, but you don't know what either of them are. So, pulling one point of data from your post, you might have an equation like this:
p/v = 0.010430839
'v', in all the examples you've used thus far, is 8820, but from what I understand, that value is not a guaranteed constant. The next question then is: Do you have a way of determining what 'v' is before you start getting all these decimal values? If you do, you can work out mathematically what the smallest error can be (1/v) then take your decimal information, multiply it by 'v', round it to the nearest whole number and check to see if the difference between its rounded form and its non-rounded form falls in the bounds of your calculated error like so:
double input; //let input be elements in your list of doubles, such as 0.010430839
double allowed_error = 1.0 / values_per_beat;
double proposed = input * values_per_beat;
double rounded = std::round(proposed);
if(abs(rounded - proposed) < allowed_error){cout << "It's good!" << endl;}
If, however, you are not able to ascertain the values_per_beat ahead of time, then this becomes a statistical question. You must accumulate enough data samples, remove the outliers (the few that vary from the norm) and use that data. But that approach will not be realtime, which, given the terms you've been using (values per beat, bpm, the value 44100) it sounds like realtime might be what you're after.
Playing around with Excel, I think you want to multiply up to (what should be) whole numbers rather than looking for closest pulses.
Tick          Pulse         i               Error                     OK
              Tick*8820     Pulse*96/8820   ABS(i - INT(i + 0.05))    Error < 0.01
------------  ------------  --------------  ------------------------  ------------
0.029024943   255.9999973   2.786394528     0.786394528               FALSE
0.0417234     368.000388    4.0054464       0.0054464                 TRUE
0             0             0               0                         TRUE
0.010430839   91.99999998   1.001360544     0.001360544               TRUE
0.020861678   184           2.002721088     0.002721088               TRUE
0.031292517   275.9999999   3.004081632     0.004081632               TRUE
0.041723356   367.9999999   4.005442176     0.005442176               TRUE
0.052040816   458.9999971   4.995918336     0.004081664               TRUE
0.062471655   550.9999971   5.99727888      0.00272112                TRUE
0.072902494   642.9999971   6.998639424     0.001360576               TRUE
0.083333333   734.9999971   7.999999968     3.2E-08                   TRUE
The table shows your two "problem" cases (the real wrong value, 256, and the one your code gets wrong, 368) followed by the first few "good" values.
If both 8820s vary at the same time, then obviously they will cancel out, and i will just be Tick*96.
The Error term is the difference between the calculated i and the nearest integer; if this is less than 0.01, then it is a "good" value.
NOTE: the 0.05 and 0.01 values were chosen somewhat arbitrarily (aka inspired first time guess based on the numbers): adjust if needed. Although I've only shown the first few rows, all the 96 "good" values you gave show as TRUE.
The code (completely untested) would be something like:
double pulse = tick * 8820.0 ;
double i = pulse * 96.0 / 8820.0 ;
double error = std::fabs( i - std::floor( i + 0.05 ) ) ;  // needs <cmath>
if( error < 0.01 ) {  // same threshold as the OK column in the table
    // is pulse
}
I assume you're initializing your pulses in a for loop, using int i as the loop variable; then the problem is this line:
BankerRound(8820 * i/96);
8820 * i / 96 is an all integer operation and the result is integer again, cutting off the remainder (so in effect, always rounding towards zero already), and BankerRound actually has nothing to round any more. Try this instead:
BankerRound(8820 * i / 96.0);
Same problem applies if you are trying to calculate prev and next pulse, as you actually subtract and add 0 (again, 1/8820 is all integer and results in 0).
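The difference between the two expressions can be checked with a short sketch. It assumes i is an int loop variable, as described above, and uses std::nearbyint (ties to even under the default rounding mode) as a stand-in for BankerRound; both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// All-integer version: the division truncates before any rounding can happen.
long pulse_integer_math(int i) {
    assert(i >= 0 && i < 96);
    return 8820 * i / 96;
}

// Floating-point version: the fraction survives until the rounding step.
long pulse_floating_math(int i) {
    assert(i >= 0 && i < 96);
    return std::lround(std::nearbyint(8820 * i / 96.0));
}
```

For i = 1 the exact quotient is 91.875, so the integer version gives 91 while the floating-point version gives 92, the value in the pulse list. For i = 4 the quotient is 367.5, giving 367 versus 368.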
Edit:
From what I read in the comments, the 'system' is not modifiable, contrary to what I assumed previously. Actually, it calculates ticks in the form of n / 96.0, n ∊ [0, 96) in ℕ, however including some kind of internal rounding apparently independent of the sample frequency, so there is some difference from the true value of n/96.0, and the ticks multiplied by 96 do not deliver exactly the integral values in [0, 96) (thanks KarstenKoop). And some of the delivered samples are simply invalid...
So the task is to detect, if tick * 96 is close enough to an integral value to be accepted as valid.
So we need to check:
double value = tick * 96.0;
bool isValid
= value - floor(value) < threshold
|| ceil(value) - value < threshold;
with some appropriately defined threshold. Assuming the values really are calculated as
double tick = round(8820*i/96.0)/8820.0;
then the maximal deviation would be slightly greater than 0.00544 (see below for a more exact value), so a threshold such as 0.006, 0.0055 or 0.00545 might be a reasonable choice.
Rounding might be a matter of the number of bits used internally for the sensor value (if we have 13 bits available, ticks might actually be calculated as floor(8192 * i / 96.0) / 8192.0, with 8192 being 1 << 13 and floor accounting for integer division; just a guess...).
The exact value of the maximal deviation, using 8820 as factor, as exact as representable by double, was:
0.00544217687075132516838493756949901580810546875
The multiplication by 96 is actually not necessary, you can compare directly with the threshold divided by 96, which would be:
0.0000566893424036596371706764330156147480010986328125
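Put together, the check described in this answer can be sketched as a single function. Measuring the distance to the nearest integer with std::round is equivalent to the floor/ceil pair above. The default threshold of 0.006 is one of the candidate values mentioned, not a verified optimum, and is_valid_tick is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Accepts a tick when tick * 96 is within `threshold` of an integer.
bool is_valid_tick(double tick, double threshold = 0.006) {
    assert(threshold > 0.0 && threshold < 0.5);
    const double value = tick * 96.0;
    return std::fabs(value - std::round(value)) < threshold;
}
```

With the numbers from the question, 0.041723356 gives 4.00544 and is accepted, while the fake tick 0.029024943 gives 2.78639 and is rejected.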
I am creating a procedure that uses an if statement to perform a decision.
I have four variables: Altitude, Velocity, Angle and Temperature.
The procedure is as follows:
procedure Test3 is
begin
if (Integer(Status_System.Altitude_Measured)) >= Critical_Altitude
and (Integer(Status_System.Velocity_Measured)) >= Critical_Velocity
and (Integer(Status_System.Angle_Measured)) >= Critical_Angle
and (Integer(Status_System.Temperature_Measured)) >= Critical_Temperature
then
DT_Put ("message 1");
else
null;
end if;
end Test3;
This procedure basically takes the idea that if the critical values are met for each and every variable, then it will print a message.
I want a shorter way of pairing up the statements so I can do the following:
If I have 4 variables: Altitude, Velocity, Angle and Temperature,
and I want a statement that says: if at least 3 of these variables (it doesn't matter which three) are exceeding their critical values,
then display a message.
Is it even possible to do this?
I would hate to think that I would have to write each and every possible combination for the if statements.
In short, I want an if statement that says at least 3 of the variables shown are at their critical value, so print a message.
The same would be good for at least 2 of these variables as well.
First, you should try to use specific types for altitude, velocity, angle and temperature. By using different types you'll leverage strong typing provided by Ada and avoid mistakes such as mixing or comparing altitudes and temperatures. Your example suggests that all of them are Integer. One possible definition could be (there are many others):
type Temperature is digits 5 range -273.15 .. 300.0;
type Angle is digits 5 range -180.0 .. 180.0;
The advantage of such a definition is that you define both the range of values and the precision (sensors all have a finite precision).
Counting the number of errors is one way to do that. In Ada 2012, you could write:
Errors : Natural := 0;
...
Errors := Errors + (if Altitude_Measured > Critical_Altitude then 1 else 0);
Errors := Errors + (if Velocity_Measured > Critical_Velocity then 1 else 0);
Errors := Errors + (if Angle_Measured > Critical_Angle then 1 else 0);
Errors := Errors + (if Temperature_Measured > Critical_Temperature then 1 else 0);
if Errors >= 2 then
...
end if;
Boolean is an enumeration type with values (False, True). As with any enumeration type, the 'Pos attribute can be used to get the position of a value in the list of enumeration literals. Thus, Boolean'Pos(B) equals 0 if B is false, 1 if B is true.
Thus you could say
True_Count := Boolean'Pos(Integer(Status_System.Altitude_Measured) >= Critical_Altitude)
+ Boolean'Pos(Integer(Status_System.Velocity_Measured) >= Critical_Velocity)
+ Boolean'Pos(Integer(Status_System.Angle_Measured) >= Critical_Angle)
+ Boolean'Pos(Integer(Status_System.Temperature_Measured) >= Critical_Temperature);