I have an array initialization based on an implied do loop, given an odd size N.
J=(N+1)/2
XLOC(1:N) = (/ (I-J, I=1,N) /)
In the context of F90+, is it recommended to use the (/ .. /) syntax, or is it more efficient to use a FORALL statement?
Example: for N=19 then XLOC=(-9,-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7,8,9)
How else would you initialize this array?
Edit 1
How would you initialize this array with more readable code?
For such a simple construct both are likely to lead to the same code, because compilers are good at optimizing. The FORALL statement is not so much a looping statement as an initialization statement, and it has many restrictions that can inhibit optimizations. If a simple loop will work, I'd use it.
Also see this previous answer: Do Fortran 95 constructs such as WHERE, FORALL and SPREAD generally result in faster parallel code?
There is no reason they should be less efficient than actual do loops. If you find a case where they are, report it as a missed optimization bug to your compiler vendor!
Related
I have a fortran code using a derived type as follows:
if(type%value(1).LT.0D0 .OR. type%value(1).GT.1D0) then
if(type%value(1).LT.0D0) then
do something
end if
deallocate(type%value)
end if
In this scenario the statement type%value(1).LT.0D0 is checked twice. Is there a way to avoid this? More generally, is there a better approach to validate this?
Well, you have three paths:
type % value(1) < 0
type % value(1) > 1
0 <= type % value(1) <= 1
In all three paths the statements will be different, so you need two if statements.
Do something then deallocate
Only deallocate
Nothing
How you branch between the three paths is up to you and the specifics of your code. The way you describe above is totally reasonable. If the evaluation were more computationally intensive than a simple floating point comparison, it might be useful to store the result of the comparison in a logical variable, but other than that, it's fine.
Other ways are:
if (type % value(1) .LT. 0D0) then
do something
deallocate(type % value)
elseif (type % value(1) .GT. 1D0) then
deallocate(type % value)
end if
In a sense, this repeats the deallocate statement, but it distinguishes the paths better. I'm testing this out and I don't know the exact wording of the Fortran Standard, but my understanding is that if the value is less than 0, the array is deallocated and the elseif condition is never evaluated, so it doesn't matter that the array is no longer allocated at that point.
Ultimately, your method is fine. Make sure that the code is easy to read for you or whoever has to read the code in the future. I don't see any performance reason to choose one over the other.
I need to write a function which takes as input
a = [12,39,48,36]
and produces as output
b=[4,4,4,13,13,13,16,16,16,12,12,12]
where the idea is to repeat each element three times or two times (this should be variable) and divide it by 3 or 2 accordingly.
I tried doing this:
c=[12,39,48,36]
a=size(c)
for i in a
repeat(c[i]/3,3)
end
You need to vectorize the division operator with a dot (.).
Additionally, I understand that you want the results to be Int; you can vectorize the conversion to Int too:
repeat(Int.(a./3), inner=3)
Przemyslaw's answer, repeat(Int.(a./3), inner=3), is excellent and is how you should write your code for conciseness and clarity. Let me in this answer analyze your attempted solution and offer a revised solution which preserves your intent. (I find that this is often useful for educational purposes).
Your code is:
c = [12,39,48,36]
a = size(c)
for i in a
repeat(c[i]/3, 3)
end
The immediate fix is:
c = [12,39,48,36]
output = Int[]
for x in c
append!(output, fill(x/3, 3))
end
Here are the changes I made:
You need an array to actually store the output. The repeat function, which you use in your loop, would produce a result, but this result would be thrown away! Instead, we define an initially empty output = Int[] and then append! each repeated block.
Your for loop specification is iterating over a size tuple (4,), which generates just a single number 4. (Probably, you misunderstand the purpose of the size function: it is primarily useful for multidimensional arrays.) To fix it, you could do a = 1:length(c) instead of a = size(c). But you don't actually need the index i, you only require the elements x of c directly, so we can simplify the loop to just for x in c.
Finally, repeat is designed for arrays. It does not work for a single scalar (this is probably the error you are seeing); you can use the more appropriate fill(scalar, n) to get [scalar, ..., scalar].
Hi I am new here and want to solve this problem:
do k=1,31
Data H(1,k)/0/
End do
do l=1,21
Data H(l,1)/0.5*(l-1)/
End do
do m=31,41
Data H(17,m)/0/
End do
do n=17,21
Data H(n,41)/0.5*(n-17)/
End do
I get an error for l and n saying that it is a syntax error in DATA statement. Does anyone know how to solve this problem?
You have three problems here, and not just with the "l" and "n" loops.
The first problem is that the values in a data statement cannot be arbitrary expressions. In particular, they must be constants; 0.5*(l-1) is not a constant.
The second problem is that the bounds in the object lists must also be constant (expressions); l is not a constant expression.
For the first, it's also worth noting that * in a data value list has a special meaning, and it isn't the multiplication operator. * gives a repeat count, and a repeat count of 0.5 is not valid.
You can fix the second point quite simply, by using such constructions as
data H(1,1:31) /31*0./ ! Note the repeat count specifier
outside a loop, or using an implied loop
data (H(1,k),k=1,31) /31*0./
To do something for the "l" loop is more tedious
data H(1:21,1) /0., 0.5, 1., 1.5, ... /
and we have to be very careful about the number of values specified. This cannot be dynamic.
The third problem is that you cannot specify explicit initialization for an element more than once. Look at your first two loops: if this worked you'd be initializing H(1,1) twice. Even though the same value is given, this is still invalid.
Well, actually you have four problems. The fourth is related to the point about dynamic number of values. You probably don't want to be doing explicit initialization. Whilst it's possible to do what it looks like you want to do, just use assignment where these restrictions don't apply.
do l=1,21
H(l,1) = 0.5*(l-1)
End do
Yes, there are times when complicated explicit initialization is a desirable thing, but in this case, in what I assume is new code, keeping things simple is good. An "initialization" portion of your code which does the assignments is far more "modern".
Hey I am a little bit confused over prolog recursion and iteration. I am giving code for sum of a list in recursion and iteration respectively and want to know if each of them is correct or not...
add_r([],0).
add_r([H|T],X) :- add_r(T,X1),X is H + X1.
add_i(List,Sum) :- add_i(List,0,Sum).
add_i([H|T],I,Sum) :- I1 is I + H , add_i(T,I1,Sum).
add_i([], I1, I1).
Here add_r is the recursive program and add_i is the iterative one (according to me); I may be wrong. Here "I" is used for iteration control.
Please correct me if I am wrong.
If you use the terminology of Abelson & Sussman (Structure and Interpretation of Computer Programs) you are quite correct.
In this case "iterative" means the state of the process is fully described by just a few variables, and "recursive" means the number of variables grows with each call. Also, a "recursive" process has 2 stages, growth and reduction, and while it grows it leaves "choice-points" etc. (all the differences are described in SICP).
In Prolog the term "tail recursion" is used more often than "iterative" in regard to your second example.
Strictly speaking, Prolog doesn't allow iteration, because variables are 'write once' (kind of...).
Both predicates are recursive, and seem correct to me.
The difference between them is that add_i is tail recursive (the recursive call appears last), and thus the compiler can optimize it (see last call optimization, or Tail Call), replacing the recursive call with a jump and so avoiding the linear stack space required by add_r.
The C++ comma operator is used to chain individual expressions, yielding the value of the last executed expression as the result.
For example the skeleton code (6 statements, 6 expressions):
step1;
step2;
if (condition) {
    step3;
    return step4;
} else
    return step5;
May be rewritten as: (1 statement, 6 expressions)
return step1,
step2,
condition?
step3, step4 :
step5;
I noticed that it is not possible to perform step-by-step debugging of such code, as the expression chain seems to be executed as a whole. Does it mean that the compiler is able to perform special optimizations which are not possible with the traditional statement approach (especially if the steps are const or inline)?
Note: I'm not talking about the coding style merit of that way of expressing sequence of expressions! Just about the possible optimisations allowed by replacing statements by expressions.
Most compilers will break your code down into "basic blocks", which are stretches of code with no jumps/branches in or out. Optimisations will be performed on a graph of these blocks: that graph captures all the control flow in the function. The basic blocks are equivalent in your two versions of the code, so I doubt that you'd get different optimisations. That the basic blocks are the same isn't entirely obvious: it relies on the fact that the control flow between the steps is the same in both cases, and so are the sequence points. The most plausible difference is that you might find in the second case there is only one block including a "return", and in the first case there are two. The blocks are still equivalent, since the optimiser can replace two blocks that "do the same thing" with one block that is jumped to from two different places. That's a very common optimisation.
It's possible, of course, that a particular compiler doesn't ignore or eliminate the differences between your two functions when optimising. But there's really no way of saying whether any differences would make the result faster or slower, without examining what that compiler is doing. In short there's no difference between the possible optimisations, but it doesn't necessarily follow that there's no difference between the actual optimisations.
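To make the equivalence concrete, here is a minimal sketch of the two forms side by side. The step helper, the trace string and the return values 40 and 50 are hypothetical stand-ins for step1..step5 from the question; they exist only so that the order of evaluation can be observed. It says nothing about what any particular compiler generates, only that both forms perform the same steps in the same order.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for step1..step5: records its name in a trace
// and returns a value, so evaluation order becomes observable.
static int step(std::string& trace, const char* name, int value) {
    trace += name;
    return value;
}

// Statement form: six expressions spread over several statements.
int statement_form(bool condition, std::string& trace) {
    step(trace, "1", 0);
    step(trace, "2", 0);
    if (condition) {
        step(trace, "3", 0);
        return step(trace, "4", 40);
    } else {
        return step(trace, "5", 50);
    }
}

// Comma form: the same six expressions in a single return statement.
// The parentheses around the middle operand are optional, added for clarity.
int comma_form(bool condition, std::string& trace) {
    return step(trace, "1", 0),
           step(trace, "2", 0),
           condition ? (step(trace, "3", 0), step(trace, "4", 40))
                     : step(trace, "5", 50);
}

// Checks that both forms return the same value and run the same steps
// in the same order for a given condition.
void check_equivalence(bool condition) {
    std::string a, b;
    int ra = statement_form(condition, a);
    int rb = comma_form(condition, b);
    assert(ra == rb);
    assert(a == b);
}
```

Calling check_equivalence(true) and check_equivalence(false) from a main function exercises both branches.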
The reason you can't single-step your second version of the code is just down to how the debugger works, not the compiler. Single-step usually means, "run to the next statement", so if you break your code into multiple statements, you can more easily debug each one. Otherwise, if your debugger has an assembly view, then in the second case you could switch to that and single-step the assembly, allowing you to see how it progresses. Or if any of your steps involve function calls, then you may be able to "do the hokey-cokey", by repeatedly doing "step in, step out" of the functions, and separate them that way.
Using the comma operator neither promotes nor hinders optimization in any circumstances I'm aware of, because the C++ standard guarantee is only that evaluation will be in left-to-right order, not that statement execution necessarily will be. (This is the same guarantee you get with statement line order.)
What it is likely to do, though, is turn your code into a confusing mess, since many programmers are unaware that the comma-as-operator even exists, and are apt to confuse it with commas used as parameter separators. (Want to really make your code unreadable? Call a function like my_func((++i, y), x).)
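As a small illustration of why that call is confusing, the sketch below shows that the inner parentheses turn the first comma into the comma operator, so the function still receives exactly two arguments. The body of my_func here is a hypothetical placeholder chosen so the arguments can be told apart.

```cpp
#include <cassert>

// Hypothetical two-parameter function standing in for my_func.
int my_func(int a, int b) {
    return a * 10 + b;
}

int call_with_comma_operator() {
    int i = 0, y = 2, x = 3;
    // (++i, y) is ONE argument: ++i is evaluated and its value discarded,
    // then y is passed. The outer comma merely separates the two arguments.
    int result = my_func((++i, y), x);
    assert(i == 1);  // the side effect of ++i still happened
    return result;   // same as my_func(2, 3)
}
```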
The "best" use of the comma operator I've seen is to work with multiple variables in the iteration statement of a for loop:
for (int i = 0, j = 0;
i < 10 && j < 12;
i += j, ++j) // each time through the loop we're tinkering with BOTH i and j
{
}
Very unlikely IMHO. The thing gets compiled down to assembler/machine code, then further low-level optimizations are done, so it probably turns out to be the same thing.
OTOH, if the comma operator is overloaded, the game changes completely. But I'm sure you know that. ;)
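A minimal sketch of what "the game changes" means: once operator, is overloaded for a type, a comma between two values of that type is a call to a user-defined function and no longer simply yields the right-hand operand. The Acc type and its adding behaviour are invented purely for illustration.

```cpp
#include <cassert>

// Hypothetical type whose comma operator adds its operands.
struct Acc {
    int value;
};

Acc operator,(const Acc& lhs, const Acc& rhs) {
    return Acc{lhs.value + rhs.value};
}

// Built-in comma: evaluates the left operand, then yields the right one.
int builtin_comma() {
    int a = 1, b = 2;
    return (a += 1, b);
}

// Overloaded comma: calls the user-defined operator, above.
int overloaded_comma() {
    Acc a{1}, b{2};
    return (a, b).value;
}
```

With the built-in operator the result is the right operand (2); with the overload it is whatever the function returns (3 here).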
The obligatory list:
Don't worry about rewriting almost equivalent code to gain performance
If you have a perf-problem, profile to see what the problem is
If you can't get it faster by algorithmic ops, look at the disassembly and see that the compiler does what you intended
If not, ask here and post source and disassembly for both versions. :)