Say I have the following:
always @(posedge clk) begin
...
if (~x) begin
y <= 1'b0;
end
...
end
And x, driven in another block, happens to be transitioning from 0 to 1 at the moment I'm checking it. How does if() evaluate in this case? Does it take the starting value of x (0) or the final value of x (1)?
I have looked over other answers to similar questions, but none discuss the particular case when the signal is transitioning.
Looking at the waves, the time precision does not allow me to discern if the signal x has settled to 0 or 1.
It depends on how x is driven.
If it is synchronous to the clock signal clk, then the value of x just before the clock edge (0) will be used. In a Verilog simulation, this means that x must be driven off the posedge clk and using a nonblocking assignment:
always @(posedge clk)
x <= some_expression;
Otherwise, you will have a race condition, and the value of x that is used will be indeterminate: it may be 0 or 1.
Related
I am quite new to C++ atomics and memory ordering and cannot wrap my head around one point. Consider the example taken directly from here: https://preshing.com/20120612/an-introduction-to-lock-free-programming/#sequential-consistency
std::atomic<int> X(0), Y(0);
int r1, r2;
void thread1()
{
X.store(1);
r1 = Y.load();
}
void thread2()
{
Y.store(1);
r2 = X.load();
}
There are sixteen possible outcomes in the total memory order:
thread 1 store -> thread 1 load -> thread 2 store -> thread 2 load
thread 1 store -> thread 2 store -> thread 1 load -> thread 2 load
...
Does a sequentially consistent program guarantee that if a particular store operation on some atomic variable happens before a load operation performed on the same atomic variable (but in another thread), the load will always see the latest value stored (i.e., the second point on the list above, where the two stores happen before the two loads in the total order)? In other words, if one put assert(r1 != 0 && r2 != 0) later in the program, could the assert fire? According to the article, such a situation cannot take place. However, here is a quote from another thread, Concurrency: Atomic and volatile in C++11 memory model, where Anthony Williams commented on that:
"The default memory ordering of std::memory_order_seq_cst provides a single global total order for all std::memory_order_seq_cst operations across all variables. This doesn't mean that you can't get stale values, but it does mean that the value you do get determines and is determined by where in this total order your operation lies.
Who is right, who is wrong? Or maybe it's only my misunderstanding and both answers are correct.
All the statements you quoted are correct. I think the confusion is coming from ambiguity in terms like "latest" and "stale", which could refer either to the sequentially consistent total order, or to ordering in real time. Those two orders do not have to be consistent with each other, and only the former is relevant to describing the program's observable behavior.
Let's start by looking at your program and then come back to the terminology afterwards.
There are sixteen possible outcomes in the total memory order:
No, there are only six. Let's call the operations XS, XL, YS, YL for the stores and loads to X and Y respectively. The total order has to be consistent with the sequencing (program order) of each thread, hence the name "sequential consistency". So XS has to precede YL in the total order, and YS has to precede XL.
Does a sequentially consistent program guarantee that if a particular store operation on some atomic variable happens before a load operation performed on the same atomic variable (but in another thread), the load will always see the latest value stored (i.e., the second point on the list above, where the two stores happen before the two loads in the total order)?
Careful, let's not use the phrase "happens before", as that refers to a different partial order in the memory model, which we do not have to consider when interested only in ordering of seq_cst atomic operations.
Sequential consistency does guarantee reading the "latest" value stored, where by "latest" we mean with respect to the total order. To be precise, each load L of a variable X takes its value from the unique store S to X which satisfies the following conditions: S precedes L, and every other store to X either precedes S or follows L. In other words, S is the last store to X before L in the total order. So in your program, XL will return 1 if it follows XS in the total order, otherwise it will return 0.
Thus here are the possible total orders, and the corresponding values returned by XL and YL (your r2 and r1 respectively):
1. XS, YL, YS, XL: here XL == 1 and YL == 0.
2. XS, YS, YL, XL: here XL == 1 and YL == 1.
3. XS, YS, XL, YL: here XL == 1 and YL == 1.
4. YS, XS, YL, XL: here XL == 1 and YL == 1.
5. YS, XS, XL, YL: here XL == 1 and YL == 1.
6. YS, XL, XS, YL: here XL == 0 and YL == 1.
Note there are no orderings resulting in XL == 0 and YL == 0. That would require XL to precede XS, and YL to precede YS. But program order already requires that XS precedes YL and YS precedes XL. That would make a cycle, which by definition of a total order is not allowed.
In other words, if one put assert(r1 != 0 && r2 != 0) later in the program, could the assert fire? According to the article, such a situation cannot take place.
I think you misread Preshing's article, or maybe you just have a typo in your question. Preshing is saying that r1 and r2 cannot both be zero, i.e., that assert(r1 != 0 || r2 != 0) would not fire. That is absolutely correct. But your assertion with && certainly could fire, in the case of orders 1 or 6 above.
"This doesn't mean that you can't get stale values, but it does mean that the value you do get determines and is determined by where in this total order your operation lies." [Anthony Williams]
Here Anthony means "stale" in the sense of real time. For instance, it is quite possible that XS executes at time 12:00:00.0000001 and XL executes at time 12:00:00.0000002, but XL still loads the value 0. There can be real-time "lag" before an operation becomes globally visible.
But if this happens, it means we are in a total ordering in which XL precedes XS. That makes the total ordering inconsistent with wall clock time, but that is allowed. What cannot happen is for such "lag" to reverse the ordering of visibility for two operations from the same thread. In this example, the machine might have to delay the execution of YL until time 12:00:00.0000003 so that it does not become visible before XS. The compiler would be responsible for inserting appropriate barrier instructions to ensure that this will happen.
(This sets aside the fact that on a modern CPU, it doesn't even make sense to talk about the "time" at which an instruction executes. An instruction can execute in several stages spanning many clock cycles, and even within a single core, this may be happening for several instructions at once. The machine is required to preserve the illusion of program order for the core observing its own operations, but not necessarily when they are observed by other cores.)
Because of the total order, it is actually valid to treat all seq_cst operations as happening at distinct ticks of some global "clock", where visibility and causality are preserved with respect to this clock. It's just that this clock may not always be running forwards in time with respect to the clock on your wall.
I'm making a simple clock divider in ModelSim.
When testing, I notice that one if statement is never executed. Any idea why?
It's the "if count > 3 then" statement. ModelSim shows the correct value of the counter integer (4, 5, 6, etc.) but never enters the if statement.
------------------------------------------------
-------- CLOCK DIVIDER
------------------------------------------------
entity clock_divider is
port ( clk,reset: in std_logic;
clock_out: out std_logic);
end clock_divider;
architecture bhv of clock_divider is
signal tmp : std_logic:='0';
begin
process(clk,reset,tmp)
variable count: integer:=1;
begin
if(reset='1') then
count := 0;
tmp <= '0';
elsif rising_edge(clk) then
count := count + 1;
if count > 3 then
tmp <= not(tmp);
count := 0;
end if;
end if;
clock_out <= tmp;
end process;
end bhv;
It's because you're using a variable for count. It should be a signal.
Each time the process is run in simulation, count will be set to 1.
Variables sometimes get wrongly seen as 'local signals', sort of like local variables in software, which they are not. A design using variables like this will behave differently in simulation from what is produced by a compiler during synthesis. The compiler will promote the variable to a signal, but hidden away, with a warning given amongst the other warnings.
Instead, use a signal for count. The HDL design really must imply the actual circuit to be produced, which here is DFFs for count.
Variables are useful in the clocked processes of synthesizable designs for (a) reshaping signal values, such as forming buses, or (b) naming nodes along a combinational tree. They are not for implying DFFs.
When I try to assign a variable x within the body of an if-statement I get unexpected results if the variable also occurs in the condition of the if-statement.
For example, the code
model algorithmTest_25p05p2021
Real x(start=0);
Real y(start=0);
algorithm
y := sin(time);
x := sin(time);
if x < 0 then // replace with y < 0 --> x is correctly truncated
x := 0;
end if;
end algorithmTest_25p05p2021;
results in
I used OMEdit in OpenModelica 1.17.0, simulation time 120s, maximum step time of 1s.
I can't wrap my head around what is going on here.
In my understanding, the algorithm section implies that x is initialized to its start value 0. After initialization, I thought the statements in the algorithm section are executed in order. So, before the if-statement, x is set to sin(time).
Afterwards, I expected that the if-statement would set x=0 if sin(time) < 0 and leave x to be x=sin(time) if sin(time)>=0.
You see what happens instead: x stays zero after the condition triggers for the first time.
What confuses me even more is that replacing the "x<0"-condition with a "y<0"-condition fixes the issue.
What have I missed here? Any pointers to the Modelica specification?
EDIT (27.05.2021):
As this behaviour appears to be a bug in OpenModelica 1.17.0, I posted it on their Github, see
https://github.com/OpenModelica/OpenModelica/issues/7484
It must be a bug.
Clearly there should be an event for x < 0, but the event logic only matters when x is close to zero and thus should have a minimal impact on the resulting plot.
The relevant sections of the specification I can find are:
If-statements are only evaluated if true
https://specification.modelica.org/maint/3.5/statements-and-algorithm-sections.html#if-statement
If the condition and the hidden state are inconsistent, an event is generated:
https://specification.modelica.org/maint/3.5/equations.html#events-and-synchronization
Conceptually x is initialized with its start value, but it doesn't matter since it is unconditionally assigned: https://specification.modelica.org/maint/3.5/statements-and-algorithm-sections.html#execution-of-an-algorithm-in-a-model
I got familiar with a little bit of Verilog at school and now, one year later, I bought a Basys 3 FPGA board. My goal is to learn VHDL.
I have been reading a free book called "Free Range VHDL" which assists greatly in understanding the VHDL language. I have also searched through github repos containing VHDL code for reference.
My biggest concern is the difference between sequential and concurrent execution. I understand the meaning of these two words, but I still cannot imagine why we can use "process" for combinational logic (e.g. a seven-segment decoder). I have implemented my seven-segment decoder as a conditional assignment of concurrent statements. What would be the difference if I implemented the decoder using a process and a switch statement? I do not understand the phrase "sequential execution" of a process when it comes to combinational logic. I would understand it if it were a sequential machine, i.e. a state machine.
Can somebody please explain this concept?
Here is my code for a seven-segment decoder:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity hex_display_decoder is
Port ( D: in STD_LOGIC_VECTOR (3 downto 0);
SEG_OUT : out STD_LOGIC_VECTOR (6 downto 0));
end hex_display_decoder;
architecture dataflow of hex_display_decoder is
begin
with D select
SEG_OUT <= "1000000" when "0000",
"1111001" when "0001",
"0100100" when "0010",
"0110000" when "0011",
"0011001" when "0100",
"0010010" when "0101",
"0000010" when "0110",
"1111000" when "0111",
"0000000" when "1000",
"0010000" when "1001",
"0001000" when "1010",
"0000011" when "1011",
"1000110" when "1100",
"0100001" when "1101",
"0000110" when "1110",
"0001110" when "1111",
"1111111" when others;
end dataflow;
Thank you,
Jake Hladik
My biggest concern is difference between sequential and concurrent
execution. I understand the meaning of these two words but I still
cannot imagine why we can use "process" for combinational logic (ex.
seven segment decoder).
You are confounding two things:
The type of logic, which can be sequential or combinational.
The order of execution of statements, which can be sequential or concurrent.
Types of logic
In logic design:
A combinational circuit is one that implements a pure logic function without any state. There is no need for a clock in a combinational circuit.
A sequential circuit is one that changes every clock cycle and that remembers its state (using flip-flops) between clock cycles.
The following VHDL process is combinational:
process(x, y) begin
z <= x or y;
end process;
We know it is combinational because:
It does not have a clock.
All its inputs are in its sensitivity list (the parenthesis after the process keyword). That means a change to any one of these inputs will cause the process to be re-evaluated.
The following VHDL process is sequential:
process(clk) begin
if rising_edge(clk) then
if rst = '1' then
z <= '0';
else
z <= z xor y;
end if;
end if;
end process;
We know it is sequential because:
It is only sensitive to changes on its clock (clk).
Its output only changes value on a rising edge of the clock.
The output value of z depends on its previous value (z is on both sides of the assignment).
Model of Execution
To make a long story short, processes are executed as follows in VHDL:
Statements within a process are executed sequentially (i.e. one after the other, in order).
Processes run concurrently relative to one another.
Processes in Disguise
So-called concurrent statements, essentially all statements outside a process, are actually processes in disguise. For example, this concurrent signal assignment (i.e. an assignment to a signal outside a process):
z <= x or y;
Is equivalent to this process:
process(x, y) begin
z <= x or y;
end process;
That is, it is equivalent to the same assignment within a process that has all of its inputs in the sensitivity list. And by equivalent, I mean the VHDL standard (IEEE 1076) actually defines the behaviour of concurrent signal assignments by their equivalent process.
What that means is that, even though you didn't know it, this statement of yours in hex_display_decoder:
SEG_OUT <= "1000000" when "0000",
"1111001" when "0001",
"0100100" when "0010",
"0110000" when "0011",
"0011001" when "0100",
"0010010" when "0101",
"0000010" when "0110",
"1111000" when "0111",
"0000000" when "1000",
"0010000" when "1001",
"0001000" when "1010",
"0000011" when "1011",
"1000110" when "1100",
"0100001" when "1101",
"0000110" when "1110",
"0001110" when "1111",
"1111111" when others;
is already a process.
Which, in turn, means
What would be the difference if I implemented the decoder using
process and a switch statement?
None at all.
I was reading Bjarne Stroustrup's C++11 FAQ and I'm having trouble understanding an example in the memory model section.
He gives the following code snippet:
// start with x==0 and y==0
if (x) y = 1; // thread 1
if (y) x = 1; // thread 2
The FAQ says there is not a data race here. I don't understand. The memory location x is read by thread 1 and written to by thread 2 without any synchronization (and the same goes for y). That's two accesses, one of which is a write. Isn't that the definition of a data race?
Further, it says that "every current C++ compiler (that I know of) gives the one right answer." What is this one right answer? Couldn't the answer vary depending on whether one thread's comparison happens before or after the other thread's write (or if the other thread's write is even visible to the reading thread)?
// start with x==0 and y==0
if (x) y = 1; // thread 1
if (y) x = 1; // thread 2
Since neither x nor y is true, the other won't be set to true either. No matter the order the instructions are executed, the (correct) result is always x remains 0, y remains 0.
The memory location x is ... written to by thread 2
Is it really? Why do you say so?
If y is 0 then x is not written to by thread 2. And y starts out 0. Similarly, x cannot be non-zero unless somehow y is non-zero "before" thread 1 runs, and that cannot happen. The general point here is that conditional writes that don't execute don't cause a data race.
This is a non-trivial fact of the memory model, though, because a compiler that is not aware of threading would be permitted (assuming y is not volatile) to transform the code if (x) y = 1; to int tmp = y; y = 1; if (!x) y = tmp;. Then there would be a data race. I can't imagine why it would want to do that exact transformation, but that doesn't matter, the point is that optimizers for non-threaded environments can do things that would violate the threaded memory model. So when Stroustrup says that every compiler he knows of gives the right answer (right under C++11's threading model, that is), that's a non-trivial statement about the readiness of those compilers for C++11 threading.
A more realistic transformation of if (x) y = 1 would be y = x ? 1 : y;. I believe that this would cause a data race in your example, and that there is no special treatment in the standard for the assignment y = y that makes it safe to execute unsequenced with respect to a read of y in another thread. You might find it hard to imagine hardware on which it doesn't work, and anyway I may be wrong, which is why I used a different example above that's less realistic but has a blatant data race.
There has to be a total ordering of the writes, because no thread can write to the variable x or y until some other thread has first written a 1 to one of them. In other words, you have basically three different scenarios:
thread 1 gets to write to y because x was written at some previous point before the if statement; if thread 2 comes later, it writes the same value of 1 to x, leaving its previous value of 1 unchanged.
thread 2 gets to write to x because y was changed at some point before the if statement; thread 1, if it comes later, writes the same value of 1 to y.
if there are only the two threads, then the if statements are skipped over, because x and y remain 0.
Neither of the writes occurs, so there is no race. Both x and y remain zero.
(This is talking about the problem of phantom writes. Suppose one thread speculatively did the write before checking the condition, then attempted to correct things after. That would break the other thread, so it isn't allowed.)
The memory model sets the supported sizes of the code and data areas. Before compiling and linking source code, we need to specify the memory model, which sets the size limits for the data and code.