Verilog, testing for Zero flag - bit-manipulation

module NOR31_1x1(Y,A);
input [31:0] A;
output Y;
wire [29:0] norWire;
nor nor1(norWire[0], A[0], A[1]);
nor nor2(norWire[1], norWire[0], A[2]);
nor nor3(norWire[2], norWire[1], A[3]);
nor nor4(norWire[3], norWire[2], A[4]);
nor nor5(norWire[4], norWire[3], A[5]);
nor nor6(norWire[5], norWire[4], A[6]);
nor nor7(norWire[6], norWire[5], A[7]);
nor nor8(norWire[7], norWire[6], A[8]);
nor nor9(norWire[8], norWire[7], A[9]);
nor nor10(norWire[9], norWire[8], A[10]);
nor nor11(norWire[10], norWire[9], A[11]);
nor nor12(norWire[11], norWire[10], A[12]);
nor nor13(norWire[12], norWire[11], A[13]);
nor nor14(norWire[13], norWire[12], A[14]);
nor nor15(norWire[14], norWire[13], A[15]);
nor nor16(norWire[15], norWire[14], A[16]);
nor nor17(norWire[16], norWire[15], A[17]);
nor nor18(norWire[17], norWire[16], A[18]);
nor nor19(norWire[18], norWire[17], A[19]);
nor nor20(norWire[19], norWire[18], A[20]);
nor nor21(norWire[20], norWire[19], A[21]);
nor nor22(norWire[21], norWire[20], A[22]);
nor nor23(norWire[22], norWire[21], A[23]);
nor nor24(norWire[23], norWire[22], A[24]);
nor nor25(norWire[24], norWire[23], A[25]);
nor nor26(norWire[25], norWire[24], A[26]);
nor nor27(norWire[26], norWire[25], A[27]);
nor nor28(norWire[27], norWire[26], A[28]);
nor nor29(norWire[28], norWire[27], A[29]);
nor nor30(norWire[29], norWire[28], A[30]);
nor result(Y, norWire[29], A[31]);
endmodule
Hi, I wrote the code above to test whether the Zero flag is set by NORing each bit against the next. The logic seems correct to me, but the result keeps returning 1 for the Zero flag regardless of the input. I ran a simulation and it seems that norWire already contains some value despite never being assigned one. Can I get some help debugging this, please? I'm having a hard time because I'm new to Verilog and the ModelSim simulator.

Let's limit this to three bits to illustrate the problem. If we label the bits A, B, and C what you're trying to express is:
!(A | B | C)
What you've written is:
!(!(A | B) | C)
where the output of one NOR (complete with the final negation) is being fed forward to the next stage.
So some judicious use of 'or' and 'not' primitives should get you there.

Related

Determining if a 16 bit binary number is negative or positive

I'm creating a library for a temperature sensor that returns a 16-bit binary value. I'm trying to find the best way to check whether that value is negative or positive. I'm curious whether I can check if the most significant bit is a 1 or a 0, whether that would be the best way to go about it, and how to implement it.
I know that I can convert it to decimal and check that way, but I was curious whether there is an easier way. I've seen it implemented with shifting values but I don't fully understand that method. (I'm super new to C++.)
float TMP117::readTempC(void)
{
int16_t digitalTemp; // Temperature stored in the TMP117 register
digitalTemp = readRegister(TEMP_RESULT); //Reads the temperature from the sensor
// Check if the value is a negative number
/*Insert code to check here*/
// Returns the digital temperature value multiplied by the resolution
// Resolution = .0078125
return digitalTemp*0.0078125;
}
I'm not sure how to check if the code works and I haven't been able to compile it and run it on the device because the new PCB design and sensor has not come in the mail yet.
I know that I can convert it to decimal and check that way
I am not sure what you mean. An integer is an integer; it is an arithmetic object, so you just compare it with zero:
if( digitalTemp < 0 )
{
// negative
}
else
{
// positive
}
You can as you suggest test the MSB, but there is no particular benefit, it lacks clarity, and will break or need modification if the type of digitalTemp changes.
if( digitalTemp & 0x8000 )
{
// negative
}
else
{
// positive
}
"Conversion to decimal" can only be interpreted as conversion to a decimal string representation of the integer, which does not make your task any simpler and is entirely unnecessary.
I'm not sure how to check if the code works and I haven't been able to compile it and run it on the device because the new PCB design and sensor has not come in the mail yet.
Compile and run it on a PC in a test harness with stubs for the hardware-dependent functions. Frankly, if you are new to C++, you are perhaps better off practising the fundamentals in a PC environment, which generally has better debug facilities and faster development/test iteration in any case.
In general
float TMP117::readTempC(void)
{
int16_t digitalTemp; // Temperature stored in the TMP117 register
digitalTemp = readRegister(TEMP_RESULT); //Reads the temperature from the sensor
// Check if the value is a negative number
if (digitalTemp < 0)
{
printf("Dang it is cold\n");
}
// Returns the digital temperature value multiplied by the resolution
// Resolution = .0078125
return digitalTemp*0.0078125;
}

Having some trouble with C++

I'm trying to write a program to check whether a number between 1 and 9999999 has digits that stay the same or increase from left to right, i.e. are non-decreasing (the variable and function names are in Vietnamese).
#include<stdio.h>
int daykhonggiam(int n)
{
while (n>=10)
{
int donvi=n%10;
n=n/10;
if(donvi<n%10)
{
return 0;
}
}
return 1;
}
int main(void)
{
for(int i=1;i<=9999999;i++)
{
if(daykhonggiam(i)==1)
printf("%d\n",i);
}
}
The problem is, when I compile and run the code, only some of the results are shown (the ones from 5555999 to 9999999). When I hit F9 I can see the results run from 1, but the final screen only shows 5555999 to 9999999. I tried an online compiler and all the results were shown.
So I'm guessing my Dev-C++ 5.11 is the problem here. Does anyone know why that's the case?
It looks like printf just fills the console buffer completely and old lines get removed. Try writing the results to a file, or increase the console buffer capacity somehow.
Open your terminal, click the top left corner, go to properties, and increase your buffer size. Save, and test. Repeat if needed.
That being said, I concur with previous comments that you should just use an output file.
Edit: After some testing, I could only get a max number of 9,000 lines to display concurrently. I'd pursue using an output file in your case.

Reading CF, PF, ZF, SF, OF

I am writing a virtual machine for my own assembly language. I want to be able to set the carry, parity, zero, sign and overflow flags as they are set in the x86-64 architecture when I perform operations such as addition.
Notes:
I am using Microsoft Visual C++ 2015 & Intel C++ Compiler 16.0
I am compiling as a Win64 application.
My virtual machine (currently) only does arithmetic on 8-bit integers
I'm not (currently) interested in any other flags (e.g. AF)
My current solution is using the following function:
void update_flags(uint16_t input)
{
Registers::flags.carry = (input > UINT8_MAX);
Registers::flags.zero = (input == 0);
Registers::flags.sign = (input < 0);
Registers::flags.overflow = (int16_t(input) > INT8_MAX || int16_t(input) < INT8_MIN);
// I am assuming that overflow is handled by truncation
uint8_t input8 = uint8_t(input);
// The parity flag
int ones = 0;
for (int i = 0; i < 8; ++i)
if ((input8 & (1 << i)) != 0) ++ones;
Registers::flags.parity = (ones % 2 == 0);
}
Which for addition, I would use as follows:
uint8_t a, b;
update_flags(uint16_t(a) + uint16_t(b));
uint8_t c = a + b;
EDIT:
To clarify, I want to know if there is a more efficient/neat way of doing this (such as by accessing RFLAGS directly)
Also my code may not work for other operations (e.g. multiplication)
EDIT 2 I have updated my code now to this:
void update_flags(uint32_t result)
{
Registers::flags.carry = (result > UINT8_MAX);
Registers::flags.zero = (result == 0);
Registers::flags.sign = (int32_t(result) < 0);
Registers::flags.overflow = (int32_t(result) > INT8_MAX || int32_t(result) < INT8_MIN);
Registers::flags.parity = (_mm_popcnt_u32(uint8_t(result)) % 2 == 0);
}
One more question: will my code for the carry flag work properly? I also want it to be set correctly for "borrows" that occur during subtraction.
Note: The assembly language I am virtualising is of my own design, meant to be simple and based on Intel's implementation of x86-64 (i.e. Intel 64), so I would like these flags to behave in mostly the same way.
TL:DR: use lazy flag evaluation, see below.
input is a weird name. Most ISAs update flags based on the result of an operation, not its inputs. You're looking at the 16-bit result of an 8-bit operation, which is an interesting approach. In C, you should just use unsigned int, which is guaranteed to be at least as wide as uint16_t. It will compile to better code on x86, where unsigned is 32-bit; 16-bit ops take an extra prefix and can lead to partial-register slowdowns.
That might help with the 8bx8b->16b mul problem you noted, depending on how you want to define the flag-updating for the mul instruction in the architecture you're emulating.
I don't think your overflow detection is correct. See this tutorial linked from the x86 tag wiki for how it's done.
This will probably not compile to very fast code, especially the parity flag. Do you need the ISA you're emulating/designing to have a parity flag? You never said you're emulating an x86, so I assume it's some toy architecture you're designing yourself.
An efficient emulator (esp. one that needs to support a parity flag) would probably benefit a lot from some kind of lazy flag evaluation. Save a value that you can compute flags from if needed, but don't actually compute anything until you get to an instruction that reads flags. Most instructions only write flags without reading them, and they just save the uint16_t result into your architectural state. Flag-reading instructions can either compute just the flag they need from that saved uint16_t, or compute all of them and store that somehow.
Assuming you can't get the compiler to actually read PF from the result, you might try _mm_popcnt_u32((uint8_t)x) & 1. Or, horizontally XOR all the bits together:
x = (x&0b00001111) ^ (x>>4)
x = (x&0b00000011) ^ (x>>2)
PF = (x&0b00000001) ^ (x>>1) // tweaking this to produce better asm is probably possible
I doubt any of the major compilers can peephole-optimize a bunch of checks on a result into LAHF + SETO al, or a PUSHF. Compilers can be led into using a flag condition to detect integer overflow, for example to implement saturating addition. But having the compiler figure out that you want all the flags, and actually use LAHF instead of a series of setcc instructions, is probably not possible. The compiler would need a pattern-recognizer for when it can use LAHF, and probably nobody's implemented that because the use cases are so vanishingly rare.
There's no C/C++ way to directly access flag results of an operation, which makes C a poor choice for implementing something like this. IDK if any other languages do have flag results, other than asm.
I expect you could gain a lot of performance by writing parts of the emulation in asm, but that would be platform-specific. More importantly, it's a lot more work.
I appear to have solved the problem by splitting the arguments to update_flags into an unsigned and a signed result, as follows:
void update_flags(int16_t unsigned_result, int16_t signed_result)
{
Registers::flags.zero = unsigned_result == 0;
Registers::flags.sign = signed_result < 0;
Registers::flags.carry = unsigned_result < 0 || unsigned_result > UINT8_MAX;
Registers::flags.overflow = signed_result < INT8_MIN || signed_result > INT8_MAX;
}
For addition (which should produce the correct result for both signed & unsigned inputs) I would do the following:
int8_t a, b;
int16_t signed_result = int16_t(a) + int16_t(b);
int16_t unsigned_result = int16_t(uint8_t(a)) + int16_t(uint8_t(b));
update_flags(unsigned_result, signed_result);
int8_t c = a + b;
And signed multiplication I would do the following:
int8_t a, b;
int16_t result = int16_t(a) * int16_t(b);
update_flags(result, result);
int8_t c = a * b;
And so on for the other operations that update the flags
Note: I am assuming here that int16_t(a) sign extends, and int16_t(uint8_t(a)) zero extends.
I have also decided against having a parity flag; my _mm_popcnt_u32 solution should work if I change my mind later.
P.S. Thank you to everyone who responded, it was very helpful. Also if anyone can spot any mistakes in my code, that would be appreciated.

large loop for timing tests gets somehow optimized to nothing?

I am trying to test a series of libraries for matrix-vector computations. For that I just make a large loop, and inside it I call the routine I want to time. Very simple. However, I sometimes see that when I increase the optimization level for the compiler, the time drops to zero no matter how large the loop is. See the example below, where I try to time a C macro that computes cross products. What is the compiler doing? How can I avoid this while still allowing maximum optimization of the floating-point arithmetic? Thank you in advance.
The example below was compiled using g++ 4.7.2 on a computer with an i5 intel processor.
Using optimization level 1 (-O1) it takes 0.35 seconds. For level two or higher it drops down to zero. Remember, I want to time this so I want the computations to actually happen even if, for this simple test, unnecessary.
#include<iostream>
#include<ctime>
using namespace std;
typedef double Vector[3];
#define VecCross(A,assign_op,B,dummy_op,C) \
( A[0] assign_op (B[1] * C[2]) - (B[2] * C[1]), \
A[1] assign_op (B[2] * C[0]) - (B[0] * C[2]), \
A[2] assign_op (B[0] * C[1]) - (B[1] * C[0]) \
)
double get_time(){
return clock()/(double)CLOCKS_PER_SEC;
}
int main()
{
unsigned long n = 1000000000u;
double start;
{//C macro cross product
Vector u = {1,0,0};
Vector v = {1,1,0};
Vector w = {1.2,1.2,1.2};
start = get_time();
for(unsigned long i=0;i<n;i++){
VecCross (w, =, u, X, v);
}
cout << "C macro cross product: " << get_time()-start << endl;
}
return 0;
}
Ask yourself, what does your program actually do, in terms of what is visible to the end-user?
It displays the result of a calculation: get_time()-start. The contents of your loop have no bearing on the outcome of that calculation, because you never actually use the variables being modified inside the loop.
Therefore, the compiler optimises out the entire loop since it is irrelevant.
One solution is to output the final state of the variables being modified in the loop, as part of your cout statement, thus forcing the compiler to compute the loop. However, a smart compiler could also figure out that the loop always calculates the same thing, and it can simply insert the result directly into your cout statement, because there's no need to actually calculate it at run-time. As a workaround to this, you could for example require that one of the inputs to the loop be provided at run-time (e.g. read it in from a file, command line argument, cin, etc.).
For more (and possibly better) solutions, check out this duplicate thread: Force compiler to not optimize side-effect-less statements

warning errors problem

Program:
void DibLaplacian8Direct(CDib sourceImg)
{
register int i,j;
int w = sourceImg.GetWidth();
int h = sourceImg.GetHeight();
CDib cpyImage = sourceImg;
BYTE** pSourceImg = sourceImg.GetPtr();
BYTE** pCpyImage = cpyImage.GetPtr();
float G;
for(j =1;j<h-1;j++)
{
for(i =1;i<w-1;i++)
{
G = -1*pCpyImage[j-1][i-1] + -1*pCpyImage[j-1][i] + (-1)*pCpyImage[j-1][i+1]+
(-1)*pCpyImage[j][i-1] + 8*pCpyImage[j][i] + (-1)*pCpyImage[j][i+1]+
-1*pCpyImage[j+1][i-1] + (-1)*pCpyImage[j+1][i] + -1*pCpyImage[j+1][i+1];
pSourceImg[j][i] = (BYTE)G;
}
}
}
warning error:
warning.. Can't convert from int to float..
Warning 1 warning C4819: The file contains a character that cannot be represented in the current code page (1257). Save the file in Unicode format to prevent data loss D:\2nd\imagetool\dibfilter.cpp 1 1 ImageTool
I don't understand why it is giving me the int-to-float warning.
And for warning 1:
I am using VS 2010. I don't know why I am getting a warning in the StdAfx.h include file.
Can anyone help me with this?
The first warning is due to the fact that a float has only about six to seven significant decimal digits, whereas an int can represent more. If the value needs more, accuracy is lost.
In general, you cannot convert an integer to floating point without possibly losing data. Also, you cannot convert from floating point back to integer without losing the decimal places, so you get a warning again.
A simple minimalistic code example of the above case:
#include<iostream>
using namespace std;
int main()
{
int a=10;
int b=3;
float c;
c=a/b; // integer division happens first, so c gets 3, not 3.33
cout << c << endl;
return 0;
}
If you are sure of the data being in the range and there wont be any loss of accuracy you can use typecasting to get rid of the warning.
G = (float) (.....)
Check this for the second warning.
To get rid of the second warning you need to save the file in Unicode format.
Go to file->advanced save options and under that select the new encoding you want to save it as. UTF-8 or UNICODE codepage 1200 are the settings you want.
It is important to understand what the compiler is telling you with that warning. The issue is that a float has only 24 bits of significand precision (23 stored), while a 32-bit int has 31 value bits. If your number is larger than 2^24, you will lose the low bits by storing it in a float.
Now your number here can never be that large, so you are fine. Still, it is important to know what is going on. There is a reason for that warning, and simply putting in the cast without understanding it may mess you up some day.
In your specific case, I am not at all clear on why you are using a float. You are adding nine integers, none of which can be greater than 2^11. You have plenty of precision in an int to do that. Using a float is just going to slow your program down. (Probably quite a bit.)
Finally, that last cast to BYTE is worrisome. What happens if your value is out of range? Probably not what you want. For example if BYTE is unsigned, and your float ends up -3.0, you will be storing 253 as the result.