How are statements executed concurrently in combinational logic using VHDL?

I wonder how signal assignment statements are executed concurrently in combinational logic in VHDL. In the following code, for example, the three statements are supposed to run concurrently. My doubt is this: when I run the simulation, the output signal 'y' changes immediately, yet if the statements really ran concurrently, 'y' should not see the effect of 'wire1' and 'wire2' unless the statements are executed more than once.
entity test1 is
  port (a, b, c, d : in bit; y : out bit);
end entity test1;
------------------------------------------------------
architecture basic of test1 is
  signal wire1, wire2 : bit;
begin
  wire1 <= a and b;
  wire2 <= c and d;
  y <= wire1 and wire2;
end architecture basic;

Since VHDL is used for simulating digital circuits, this must work similarly to the actual circuits, where (after a small delay usually ignored in simulations) circuits continuously follow their inputs.
I assume you wonder how the implementation achieves this behaviour:
The simulator keeps track of which signal depends on which other signals and re-evaluates an expression whenever one of its inputs changes.
So when a changes, wire1 will be updated, and in turn trigger an update to y. This will continue as long as combinatorial updates are necessary. So in the simulation the updates are indeed well ordered, although no simulation time has passed. The "time" between such updates is often called a "delta cycle".
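The update loop described above can be sketched in a few lines of Python. This is only a toy model of delta cycles, not how any particular simulator is built; the names ASSIGNMENTS and settle are made up for illustration, and the signals are assumed to start at 0.

```python
# Toy event-driven simulator for the three concurrent assignments in test1.
# ASSIGNMENTS and settle are hypothetical names, not part of any VHDL tool.

# Each concurrent assignment: target -> (signals it is sensitive to, function)
ASSIGNMENTS = {
    "wire1": (("a", "b"), lambda s: s["a"] & s["b"]),
    "wire2": (("c", "d"), lambda s: s["c"] & s["d"]),
    "y": (("wire1", "wire2"), lambda s: s["wire1"] & s["wire2"]),
}


def settle(signals, changed):
    """Run delta cycles until no signal changes; return how many were needed."""
    deltas = 0
    while changed:
        # Evaluate every assignment sensitive to a changed signal, using the
        # signal values as they were at the start of this delta cycle.
        scheduled = {
            target: fn(signals)
            for target, (inputs, fn) in ASSIGNMENTS.items()
            if changed & set(inputs)
        }
        # Apply all scheduled values together, then note which really changed.
        changed = {t for t, v in scheduled.items() if signals[t] != v}
        signals.update(scheduled)
        if changed:
            deltas += 1
    return deltas


# Assumed starting point: all inputs have just become 1, internal signals are 0.
signals = {"a": 1, "b": 1, "c": 1, "d": 1, "wire1": 0, "wire2": 0, "y": 0}
deltas = settle(signals, {"a", "b", "c", "d"})
print(signals["y"], deltas)  # prints: 1 2
```

In this model y reaches its final value after two delta cycles (wire1 and wire2 first, then y), while simulated time has not advanced at all.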

Related

Could you help me understand parallelism in VHDL?

I understand that in a process the instructions are executed sequentially and that the value of a signal is not updated until the end of the process, but I cannot understand the principle of parallelism. For example, in the following code I know that both instructions will be executed in parallel (at the same time), but I do not know whether Q will get the new value of Sig2 or the previous one. Also, when we calculate Sig2, do we use the new value of Sig1 or the previous one?
Sig1<=a and b;
Sig2<=Sig1 and a;
Q<=Sig2;
As VHDL uses event-driven semantics, nothing actually executes in parallel; it just has the appearance of parallelism. The concurrent assignments you show execute whenever their RHS operands change, and there is no implied ordering. If a changes from 1 to 0, you cannot depend on the order in which the first two statements execute. It's possible that the 2nd assignment executes first, then the 1st, then the 3rd (because Sig2 has changed), and then the 2nd again because Sig1 has changed.
Most tools will try to order the statements to minimize the number of assignment re-executions and may even optimize it as if you wrote:
Q <= a and b;
and eliminate Sig1 and Sig2 from the simulation.
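To see why the execution order does not affect the result, here is a small Python sketch. It is a simplification: it assigns values immediately instead of scheduling them for the next delta cycle as VHDL does, it assumes all signals start at 0, and STATEMENTS and run_until_stable are names invented for this example.

```python
from itertools import permutations

# The three concurrent assignments, as functions of the current signal values.
STATEMENTS = {
    "Sig1": lambda s: s["a"] & s["b"],
    "Sig2": lambda s: s["Sig1"] & s["a"],
    "Q": lambda s: s["Sig2"],
}


def run_until_stable(order, a, b):
    """Re-execute the statements in the given order until nothing changes."""
    s = {"a": a, "b": b, "Sig1": 0, "Sig2": 0, "Q": 0}  # assumed initial values
    executions = 0
    while True:
        before = dict(s)
        for name in order:
            s[name] = STATEMENTS[name](s)
            executions += 1
        if s == before:
            return s["Q"], executions


# Try every possible statement order with a = b = 1.
results = {order: run_until_stable(order, 1, 1) for order in permutations(STATEMENTS)}
for order, (q, executions) in results.items():
    print(order, "Q =", q, "after", executions, "statement executions")
```

Every order ends with the same Q (equal to a and b); only the number of re-executions differs, which is the cost a tool tries to minimize when it orders the statements.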

implementing a flip-flop with concurrent statement

It is stated in VHDL programming that for combinational circuits, concurrent statements are used while for sequential circuits, both concurrent and sequential statements are applicable. Now the question is:
What will happen if I write a sequential code in a concurrent form? For example, I don't use process and write a flip flop with when..else
architecture x of y is
begin
q <= '0' when rst=1 else
d when (clock'event and clock='1') else
q;
end;
Is that correct and synthesizable code? If it is incorrect, what exactly is wrong with it (apart from syntax errors)?
You say: "It is stated in VHDL programming that for combinational circuits, concurrent statements are used while for sequential circuits, both concurrent and sequential statements are applicable.". That is simply not true. You can model both combinational and sequential code using either concurrent or sequential statements.
It is unusual to model sequential logic using concurrent statements. (I say that because I see a lot of other people's code in my job and I almost never see it). However, it is possible. Your code does have a syntax error and a more fundamental error. This modified version of your code synthesises to a rising-edge triggered flip-flop with an asynchronous, active-high reset, as you expected:
q <= '0' when rst='1' else
d when clock'event and clock='1';
The syntax error was that you had rst=1 instead of rst='1'. The more fundamental error was the else q, which you don't need. It is unnecessary because signals in VHDL retain the value previously assigned until a new value is assigned. Therefore, in VHDL code modelling sequential logic, it is never necessary to write q <= q (or its equivalent). In your case, in the MCVE I constructed, q was an output, so your else q gave an error because you cannot read outputs (VHDL-2008 relaxes this restriction).
Here's the MCVE:
library IEEE;
use IEEE.std_logic_1164.all;
entity concurrent_flop is
port (clock, rst, d : in std_logic;
q : out std_logic);
end entity concurrent_flop;
architecture concurrent_flop of concurrent_flop is
begin
q <= '0' when rst='1' else
d when clock'event and clock='1';
end architecture concurrent_flop;
I wrote an MCVE to check what I was about to say was correct. You could have done the same. Doing so is a great way of learning VHDL. EDA Playground is often a good place to try things out (shameless plug), but was no good in this case, because one cannot synthesise VHDL on EDA Playground.
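For readers who want to reason about the behaviour without a synthesis tool, the conditional assignment in the MCVE can be modelled roughly like this in Python. It is only a behavioural sketch: the class name ConcurrentFlop is made up, and q is assumed to start at 0 rather than at the 'U' a std_logic signal would have.

```python
class ConcurrentFlop:
    """Models: q <= '0' when rst='1' else d when clock'event and clock='1';"""

    def __init__(self):
        self.q = 0            # assumed initial value (std_logic would start at 'U')
        self.prev_clock = 0

    def update(self, clock, rst, d):
        """Re-evaluate the assignment after any input change and return q."""
        rising_edge = self.prev_clock == 0 and clock == 1
        if rst == 1:
            self.q = 0        # asynchronous reset has priority
        elif rising_edge:
            self.q = d        # capture d on the rising clock edge
        # No final else: q simply keeps the value it already has.
        self.prev_clock = clock
        return self.q


flop = ConcurrentFlop()
stimulus = [(0, 0, 1), (1, 0, 1), (1, 0, 0), (0, 0, 0), (0, 1, 0)]  # (clock, rst, d)
trace = [flop.update(clock, rst, d) for clock, rst, d in stimulus]
print(trace)  # prints: [0, 1, 1, 1, 0]
```

The third and fourth steps show q holding its value with no explicit "else q", and the last step shows the reset acting without a clock edge.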

concurrent and conditional signal assignment (VHDL)

In VHDL, there are two types for signal assignment:
concurrent ----> when...else
           ----> select...when...else
sequential ----> if...else
           ----> case...when
The problem is that some say that when...else conditions are checked line by line (kind of sequential) while select...when...else conditions are checked once. See this reference for example.
I say that when..else is also a sequential assignment because the conditions are checked line by line. In other words, I say that there is no need to say that if..else within a process is equivalent to when..else. Why is when..else assumed to be a concurrent assignment?
What you are hinting at in your problem has nothing to do with concurrent assignments or sequential statements. It has more to do with the difference between if and case. Before we get to that, let's first understand a few equivalents. The concurrent conditional assignment:
Y <= A when ASel = '1' else B when BSel = '1' else C ;
Is exactly equivalent to a process with the following code:
process(A, ASel, B, BSel, C)
begin
  if ASel = '1' then
    Y <= A ;
  elsif BSel = '1' then
    Y <= B ;
  else
    Y <= C ;
  end if ;
end process ;
Likewise the concurrent selected assignment:
With MuxSel select
Y <= A when "00", B when "01", C when others ;
Is equivalent to a process with the following:
process(MuxSel, A, B, C)
begin
  case MuxSel is
    when "00" => Y <= A ;
    when "01" => Y <= B ;
    when others => Y <= C ;
  end case ;
end process ;
From a coding perspective, the sequential forms above have a little more coding capability than the assignment form because case and if allow blocks of code, where the assignment form only assigns to one signal. However other than that, they have the same language restrictions and produce the same hardware (as much as synthesis tools do that). In addition for many simple hardware problems, the assignment form works well and is a concise capture of the problem.
So where your thoughts are leading really comes down to the difference between if and case. If statements (and their equivalent conditional assignments) that have multiple "elsif" in (or implied in) them tend to create priority logic or at least cascaded logic, whereas case statements (and their equivalent selected assignments) tend to be well suited for things like multiplexers, and their logic structure tends to be more of a balanced tree structure.
Sometimes tools will refactor an if statement to allow it to be equivalent to a case statement. Also, for some targets (particularly LUT-based logic like Xilinx and Altera), the difference between them in terms of hardware efficiency does not show up until there are enough "elsif" branches.
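The difference between the two forms can be illustrated with a rough Python analogy. This says nothing about the hardware a synthesis tool would actually produce; the function names are invented here, and plain Python values stand in for the signals.

```python
def conditional_assign(a, a_sel, b, b_sel, c):
    """Y <= A when ASel = '1' else B when BSel = '1' else C;"""
    if a_sel == 1:        # tested first, so ASel has priority over BSel
        return a
    elif b_sel == 1:      # only reached when a_sel is not 1
        return b
    return c


def selected_assign(mux_sel, a, b, c):
    """with MuxSel select Y <= A when "00", B when "01", C when others;"""
    choices = {"00": a, "01": b}    # mutually exclusive choices, no priority
    return choices.get(mux_sel, c)  # "others" falls through to c


# When both selects are active, the first condition wins (priority).
print(conditional_assign("A", 1, "B", 1, "C"))  # prints: A
print(selected_assign("01", "A", "B", "C"))     # prints: B
```

The conditional form has to define what happens when several conditions are true at once, which is where the priority chain comes from; the selected form never faces that question because its choices cannot overlap.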
With VHDL-2008, the assignment forms are also allowed in sequential code. The transformation is the same except without the process wrapper.
Concurrent vs Sequential is about independence of execution.
A concurrent statement is simply a statement that is evaluated and/or executed independently of the code that surrounds it. Processes are concurrent. Component/Entity Instances are concurrent. Signal assignments and procedure calls that are done in the architecture are concurrent.
Sequential statements (other than wait) run when the code around it also runs.
Interesting note, while a process is concurrent (because it runs independently of other processes and concurrent assignments), it contains sequential statements.
Often when we write RTL code, the processes that we write are simple enough that it is hard to see the sequential nature of them. It really takes a state machine or a testbench to see the true sequential nature of a process.

Please, clarify the concept of sequential and concurrent execution in VHDL

I got familiar with a little bit of Verilog at school and now, one year later, I bought a Basys 3 FPGA board. My goal is to learn VHDL.
I have been reading a free book called "Free Range VHDL" which assists greatly in understanding the VHDL language. I have also searched through github repos containing VHDL code for reference.
My biggest concern is the difference between sequential and concurrent execution. I understand the meaning of these two words but I still cannot imagine why we can use "process" for combinational logic (i.e. seven segment decoder). I have implemented my seven segment decoder as conditional assignment of concurrent statements. What would be the difference if I implemented the decoder using process and a switch statement? I do not understand the word sequential execution of process when it comes to combinational logic. I would understand it if it was a sequential machine-a state machine.
Can somebody please explain this concept?
Here is my code for a seven-segment decoder:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity hex_display_decoder is
Port ( D: in STD_LOGIC_VECTOR (3 downto 0);
SEG_OUT : out STD_LOGIC_VECTOR (6 downto 0));
end hex_display_decoder;
architecture dataflow of hex_display_decoder is
begin
with D select
SEG_OUT <= "1000000" when "0000",
"1111001" when "0001",
"0100100" when "0010",
"0110000" when "0011",
"0011001" when "0100",
"0010010" when "0101",
"0000010" when "0110",
"1111000" when "0111",
"0000000" when "1000",
"0010000" when "1001",
"0001000" when "1010",
"0000011" when "1011",
"1000110" when "1100",
"0100001" when "1101",
"0000110" when "1110",
"0001110" when "1111",
"1111111" when others;
end dataflow;
Thank you,
Jake Hladik
My biggest concern is difference between sequential and concurrent
execution. I understand the meaning of these two words but I still
cannot imagine why we can use "process" for combinational logic (ex.
seven segment decoder).
You are confounding two things:
The type of logic, which can be sequential or combinational.
The order of execution of statements, which can be sequential or concurrent.
Types of logic
In logic design:
A combinational circuit is one that implements a pure logic function without any state. There is no need for a clock in a combinational circuit.
A sequential circuit is one that remembers its state (using flip-flops) between clock cycles and can change that state on each clock cycle.
The following VHDL process is combinational:
process(x, y) begin
z <= x or y;
end process;
We know it is combinational because:
It does not have a clock.
All its inputs are in its sensitivity list (the parentheses after the process keyword). That means a change to any one of these inputs will cause the process to be re-evaluated.
The following VHDL process is sequential:
process(clk) begin
  if rising_edge(clk) then
    if rst = '1' then
      z <= '0';
    else
      z <= z xor y;
    end if;
  end if;
end process;
We know it is sequential because:
It is only sensitive to changes on its clock (clk).
Its output only changes value on a rising edge of the clock.
The output value of z depends on its previous value (z is on both sides of the assignment).
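As a loose software analogy for the two kinds of logic, a combinational circuit behaves like a pure function and a sequential circuit like an object that carries state from one clock edge to the next. The names comb_or and ToggleRegister below are made up for illustration, and the register is assumed to start at 0.

```python
def comb_or(x, y):
    """Combinational: z <= x or y. The output depends only on current inputs."""
    return x | y


class ToggleRegister:
    """Sequential: on each rising clock edge, z <= '0' if rst else z xor y."""

    def __init__(self):
        self.z = 0  # assumed initial state

    def rising_edge(self, rst, y):
        """Apply one rising clock edge and return the new value of z."""
        self.z = 0 if rst == 1 else self.z ^ y
        return self.z


reg = ToggleRegister()
edges = [(0, 1), (0, 1), (0, 0), (0, 1), (1, 1)]  # (rst, y) at each clock edge
history = [reg.rising_edge(rst, y) for rst, y in edges]
print(comb_or(0, 1), comb_or(0, 1))  # prints: 1 1
print(history)                       # prints: [1, 0, 0, 1, 0]
```

The same inputs always give the same comb_or result, while the first two clock edges feed identical inputs to the register and get different outputs, because the result depends on the stored value of z.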
Model of Execution
To make a long story short, processes are executed as follows in VHDL:
Statements within a process are executed sequentially (i.e. one after the
other in order).
Processes run concurrently relative to one another.
Processes in Disguise
So-called concurrent statements, essentially all statements outside a process, are actually processes in disguise. For example, this concurrent signal assignment (i.e. an assignment to a signal outside a process):
z <= x or y;
Is equivalent to this process:
process(x, y) begin
z <= x or y;
end process;
That is, it is equivalent to the same assignment within a process that has all of its inputs in the sensitivity list. And by equivalent, I mean the VHDL standard (IEEE 1076) actually defines the behaviour of concurrent signal assignments by their equivalent process.
What that means is that, even though you didn't know it, this statement of yours in hex_display_decoder:
SEG_OUT <= "1000000" when "0000",
"1111001" when "0001",
"0100100" when "0010",
"0110000" when "0011",
"0011001" when "0100",
"0010010" when "0101",
"0000010" when "0110",
"1111000" when "0111",
"0000000" when "1000",
"0010000" when "1001",
"0001000" when "1010",
"0000011" when "1011",
"1000110" when "1100",
"0100001" when "1101",
"0000110" when "1110",
"0001110" when "1111",
"1111111" when others;
is already a process.
Which, in turn, means
What would be the difference if I implemented the decoder using
process and a switch statement?
None at all.
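The "none at all" can be checked informally in software. The Python sketch below copies the choice list from the decoder in the question and compares a single lookup (standing in for the selected assignment) with a version that tests the choices one after the other (standing in for sequential statements inside a process). The function names are invented, and this models behaviour only, not synthesis.

```python
# Choice list copied from the hex_display_decoder in the question.
CHOICES = [
    ("0000", "1000000"), ("0001", "1111001"), ("0010", "0100100"),
    ("0011", "0110000"), ("0100", "0011001"), ("0101", "0010010"),
    ("0110", "0000010"), ("0111", "1111000"), ("1000", "0000000"),
    ("1001", "0010000"), ("1010", "0001000"), ("1011", "0000011"),
    ("1100", "1000110"), ("1101", "0100001"), ("1110", "0000110"),
    ("1111", "0001110"),
]
TABLE = dict(CHOICES)


def decode_selected(d):
    """Models the selected assignment: one lookup, with a default for others."""
    return TABLE.get(d, "1111111")


def decode_process(d):
    """Models a process body: the choices are tested one after the other."""
    for pattern, segments in CHOICES:
        if d == pattern:
            return segments
    return "1111111"


all_inputs = [format(n, "04b") for n in range(16)]
print(all(decode_selected(d) == decode_process(d) for d in all_inputs))  # prints: True
```

Both versions compute the same stateless function of D, which is what makes the logic combinational regardless of how the description is executed.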

How/why do functional languages (specifically Erlang) scale well?

I have been watching the growing visibility of functional programming languages and features for a while. I looked into them and didn't see the reason for the appeal.
Then, recently I attended Kevin Smith's "Basics of Erlang" presentation at Codemash.
I enjoyed the presentation and learned that a lot of the attributes of functional programming make it much easier to avoid threading/concurrency issues. I understand the lack of state and mutability makes it impossible for multiple threads to alter the same data, but Kevin said (if I understood correctly) all communication takes place through messages and the messages are processed synchronously (again avoiding concurrency issues).
But I have read that Erlang is used in highly scalable applications (the whole reason Ericsson created it in the first place). How can it be efficient handling thousands of requests per second if everything is handled as a synchronously processed message? Isn't that why we started moving towards asynchronous processing - so we can take advantage of running multiple threads of operation at the same time and achieve scalability? It seems like this architecture, while safer, is a step backwards in terms of scalability. What am I missing?
I understand the creators of Erlang intentionally avoided supporting threading to avoid concurrency problems, but I thought multi-threading was necessary to achieve scalability.
How can functional programming languages be inherently thread-safe, yet still scale?
A functional language doesn't (in general) rely on mutating a variable. Because of this, we don't have to protect the "shared state" of a variable, because the value is fixed. This in turn avoids the majority of the hoop jumping that traditional languages have to go through to implement an algorithm across processors or machines.
Erlang takes it further than traditional functional languages by baking in a message passing system that allows everything to operate on an event based system where a piece of code only worries about receiving messages and sending messages, not worrying about a bigger picture.
What this means is that the programmer is (nominally) unconcerned that the message will be handled on another processor or machine: simply sending the message is good enough for it to continue. If it cares about a response, it will wait for it as another message.
The end result of this is that each snippet is independent of every other snippet. No shared code, no shared state, and all interactions coming from a message system that can be distributed among many pieces of hardware (or not).
Contrast this with a traditional system: we have to place mutexes and semaphores around "protected" variables and code execution. We have tight binding in a function call via the stack (waiting for the return to occur). All of this creates bottlenecks that are less of a problem in a shared nothing system like Erlang.
EDIT: I should also point out that Erlang is asynchronous. You send your message and maybe/someday another message arrives back. Or not.
Spencer's point about out of order execution is also important and well answered.
The message queue system is cool because it effectively produces a "fire-and-wait-for-result" effect which is the synchronous part you're reading about. What makes this incredibly awesome is that it means lines do not need to be executed sequentially. Consider the following code:
r = methodWithALotOfDiskProcessing();
x = r + 1;
y = methodWithALotOfNetworkProcessing();
w = x * y
Consider for a moment that methodWithALotOfDiskProcessing() takes about 2 seconds to complete and that methodWithALotOfNetworkProcessing() takes about 1 second to complete. In a procedural language this code would take about 3 seconds to run because the lines would be executed sequentially. We're wasting time waiting for one method to complete that could run concurrently with the other without competing for a single resource. In a functional language lines of code don't dictate when the processor will attempt them. A functional language would try something like the following:
Execute line 1 ... wait.
Execute line 2 ... wait for r value.
Execute line 3 ... wait.
Execute line 4 ... wait for x and y value.
Line 3 returned ... y value set, message line 4.
Line 1 returned ... r value set, message line 2.
Line 2 returned ... x value set, message line 4.
Line 4 returned ... done.
How cool is that? By going ahead with the code and only waiting where necessary we've reduced the waiting time to two seconds automagically! :D So yes, while the code is synchronous it tends to have a different meaning than in procedural languages.
EDIT:
Once you grasp this concept in conjunction with Godeke's post it's easy to imagine how simple it becomes to take advantage of multiple processors, server farms, redundant data stores and who knows what else.
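The "wait only where necessary" schedule described above can be written out explicitly. In Python, for instance, the overlap has to be requested by the programmer; it does not happen automatically. The sketch below uses concurrent.futures from the standard library, with short sleeps and made-up return values standing in for the disk and network work.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def method_with_a_lot_of_disk_processing():
    time.sleep(0.4)   # stands in for the slow disk task
    return 10         # made-up result


def method_with_a_lot_of_network_processing():
    time.sleep(0.2)   # stands in for the slow network task
    return 5          # made-up result


def compute():
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Start both slow calls; neither blocks the other.
        r_future = pool.submit(method_with_a_lot_of_disk_processing)
        y_future = pool.submit(method_with_a_lot_of_network_processing)
        x = r_future.result() + 1     # wait only where r is needed
        w = x * y_future.result()     # wait only where y is needed
    return w


start = time.perf_counter()
w = compute()
elapsed = time.perf_counter() - start
print(w, round(elapsed, 1))
```

With the two calls overlapped, the elapsed time should be close to the longer of the two sleeps rather than their sum.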
It's likely that you're mixing up synchronous with sequential.
The body of a function in erlang is being processed sequentially.
So what Spencer said about this "automagical effect" doesn't hold true for erlang. You could model this behaviour with erlang though.
For example you could spawn a process that calculates the number of words in a line.
As we're having several lines, we spawn one such process for each line and receive the answers to calculate a sum from it.
That way, we spawn processes that do the "heavy" computations (utilizing additional cores if available) and later we collect the results.
-module(countwords).
-export([count_words_in_lines/1]).
count_words_in_lines(Lines) ->
% For each line in lines run spawn_summarizer with the process id (pid)
% and a line to work on as arguments.
% This is a list comprehension and spawn_summarizer will return the pid
% of the process that was created. So the variable Pids will hold a list
% of process ids.
Pids = [spawn_summarizer(self(), Line) || Line <- Lines],
% For each pid receive the answer. This will happen in the same order in
% which the processes were created, because we saved [pid1, pid2, ...] in
% the variable Pids and now we consume this list.
Results = [receive_result(Pid) || Pid <- Pids],
% Sum up the results.
WordCount = lists:sum(Results),
io:format("We've got ~p words, Sir!~n", [WordCount]).
spawn_summarizer(S, Line) ->
% Create an anonymous function and save it in the variable F.
F = fun() ->
% Split line into words.
ListOfWords = string:tokens(Line, " "),
Length = length(ListOfWords),
io:format("process ~p calculated ~p words~n", [self(), Length]),
% Send a tuple containing our pid and Length to S.
S ! {self(), Length}
end,
% There is no return in erlang, instead the last value in a function is
% returned implicitly.
% Spawn the anonymous function and return the pid of the new process.
spawn(F).
% The Variable Pid gets bound in the function head.
% In erlang, you can only assign to a variable once.
receive_result(Pid) ->
receive
% Pattern-matching: the block behind "->" will execute only if we receive
% a tuple that matches the one below. The variable Pid is already bound,
% so we are waiting here for the answer of a specific process.
% N is unbound so we accept any value.
{Pid, N} ->
io:format("Received \"~p\" from process ~p~n", [N, Pid]),
N
end.
And this is what it looks like, when we run this in the shell:
Eshell V5.6.5 (abort with ^G)
1> Lines = ["This is a string of text", "and this is another", "and yet another", "it's getting boring now"].
["This is a string of text","and this is another",
"and yet another","it's getting boring now"]
2> c(countwords).
{ok,countwords}
3> countwords:count_words_in_lines(Lines).
process <0.39.0> calculated 6 words
process <0.40.0> calculated 4 words
process <0.41.0> calculated 3 words
process <0.42.0> calculated 4 words
Received "6" from process <0.39.0>
Received "4" from process <0.40.0>
Received "3" from process <0.41.0>
Received "4" from process <0.42.0>
We've got 17 words, Sir!
ok
4>
The key thing that enables Erlang to scale is related to concurrency.
An operating system provides concurrency by two mechanisms:
operating system processes
operating system threads
Processes don't share state – one process can't crash another by design.
Threads share state – one thread can crash another by design – that's your problem.
With Erlang, the virtual machine uses one operating system process, and the VM provides concurrency to Erlang programs not by using operating system threads but by providing Erlang processes; that is, Erlang implements its own timeslicer.
These Erlang processes talk to each other by sending messages (handled by the Erlang VM, not the operating system). The Erlang processes address each other using a process ID (PID) which has a three-part address <<N3.N2.N1>>:
process no N1 on
VM N2 on
physical machine N3
Two processes on the same VM, on different VM's on the same machine or two machines communicate in the same way – your scaling is therefore independent of the number of physical machines you deploy your application on (in the first approximation).
Erlang is only threadsafe in a trivial sense – it doesn't have threads. (The language that is, the SMP/multi-core VM uses one operating system thread per core).
You may have a misunderstanding of how Erlang works. The Erlang runtime minimizes context-switching on a CPU, but if there are multiple CPUs available, then all are used to process messages. You don't have "threads" in the sense that you do in other languages, but you can have a lot of messages being processed concurrently.
Erlang messages are purely asynchronous; if you want a synchronous reply to your message you need to explicitly code for that. What was possibly said was that the messages in a process's message box are processed sequentially. Any message sent to a process sits in that process's message box, and the process gets to pick one message from that box, process it, and then move on to the next one, in the order it sees fit. This is a very sequential act, and the receive block does exactly that.
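The combination of asynchronous sends and sequential handling can be imitated in Python with a thread that owns a queue. This is only an analogy: start_process is a made-up helper, the sketch handles messages strictly in arrival order (an Erlang receive can be more selective), and a Python thread is far heavier than an Erlang process.

```python
import queue
import threading


def start_process(handler):
    """Start a worker that owns a mailbox; returns (mailbox, thread)."""
    mailbox = queue.Queue()

    def loop():
        while True:
            message = mailbox.get()   # take one message at a time
            if message is None:       # sentinel: stop the worker
                break
            handler(message)          # finish this message before the next

    thread = threading.Thread(target=loop)
    thread.start()
    return mailbox, thread


handled = []
mailbox, thread = start_process(handled.append)
for n in range(5):
    mailbox.put(n)        # asynchronous send: returns without waiting
mailbox.put(None)         # ask the worker to stop
thread.join()
print(handled)  # prints: [0, 1, 2, 3, 4]
```

The sender never waits for a message to be handled, yet the worker handles them one at a time, so the handler needs no locks around its own state.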
Looks like you have mixed up synchronous and sequential as chris mentioned.
Referential transparency: See http://en.wikipedia.org/wiki/Referential_transparency_(computer_science)
In a purely functional language, order of evaluation doesn't matter - in a function application fn(arg1, .. argn), the n arguments can be evaluated in parallel. That guarantees a high level of (automatic) parallelism.
Erlang uses a process model where a process can run in the same virtual machine, or on a different processor; there is no way to tell. That is only possible because messages are copied between processes and there is no shared (mutable) state. Multi-processor parallelism goes a lot farther than multi-threading: since threads depend upon shared memory, there can only be 8 threads running in parallel on an 8-core CPU, while multi-processing can scale to thousands of parallel processes.