I have a broadly used function foo(int a, int b) and I want to provide a special version of foo that performs differently if a is say 1.
a) I don't want to go through the whole code base and change all occurrences of foo(1, b) to foo1(b), because the rules on arguments may change and I don't want to go through the code base again every time they do.
b) I don't want to burden function foo with an "if (a == 1)" test because of performance issues.
It seems to me to be a fundamental skill of the compiler to call the right code based on what it can see in front of it. Or is this a missing feature of C++ that currently requires macros or something similar to handle?
Simply write
inline int foo(int a, int b)
{
    if (a==1) {
        // skip complex code and call easy code
        return call_easy(b);
    } else {
        // complex code here
        return do_complex(a, b);
    }
}
When you call
foo(1, 10);
the optimizer will/should simply insert a call_easy(10).
Any decent optimizer will inline the function and detect if the function has been called with a==1. Also I think that the entire constexpr mentioned in other posts is nice, but not really necessary in your case. constexpr is very useful, if you want to resolve values at compile time. But you simply asked to switch code paths based on a value at runtime. The optimizer should be able to detect that.
In order to detect that, the optimizer needs to see your function definition at all places where your function is called. Hence the inline requirement - although compilers such as Visual Studio have a "generate code at link time" feature, that reduces this requirement somewhat.
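A minimal sketch of this approach, assuming hypothetical helpers call_easy and do_complex whose bodies are invented purely for illustration. Because the definition of foo is visible at the call site, an optimizing compiler is free to resolve the a == 1 test when a is a literal; whether it actually does so depends on the compiler and its flags.

```cpp
#include <cassert>

// Hypothetical helpers: stand-ins for the "easy" and "complex" code paths.
inline int call_easy(int b) { return b + 1; }
inline int do_complex(int a, int b) { return a * b + 1; }

// Single entry point. The definition is visible to every caller,
// so the optimizer may resolve the branch when `a` is a constant.
inline int foo(int a, int b)
{
    if (a == 1) {
        return call_easy(b);   // special case
    }
    return do_complex(a, b);   // general case
}

// Call sites keep writing foo(1, b); nothing in the code base has to change.
inline void check_foo()
{
    assert(foo(1, 10) == call_easy(10));
    assert(foo(2, 10) == do_complex(2, 10));
}
```

The observable results are the same with or without optimization; only the generated code differs, which is what the disassembly would show.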
Finally, you might want to look at the C++20 attributes [[likely]] and [[unlikely]]. I haven't worked with them yet, but they are supposed to tell the compiler which execution path is more likely and so give a hint to the optimizer.
And why don't you experiment a little and look at the generated code in the debugger/disassemble. That will give you a feel for the optimizer. Don't forget that the optimizer is likely only active in Release Builds :)
Templates work at compile time, and you want to decide at run time, which is not possible with a template. If and only if you can really call your function with constexpr values, then you can change to a template, but the call becomes foo<1,2>() instead of foo(1,2); "performance issues"... that's really funny! If that single compare instruction is the performance problem... yes, then you have done everything else perfectly :-)
BTW: If you already call with constexpr values and the function is visible in the compilation unit, you can be sure the compiler already knows to optimize it away...
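To make the template alternative concrete, here is a small sketch that moves only the first argument into a template parameter. The helpers call_easy and do_complex are hypothetical stand-ins, and the explicit specialization for 1 is just one way to select the special path.

```cpp
#include <cassert>

// Hypothetical helpers standing in for the two code paths.
inline int call_easy(int b) { return b + 1; }
inline int do_complex(int a, int b) { return a * b + 1; }

// General case: `A` is a template parameter, so it must be known at compile time.
template <int A>
int foo(int b)
{
    return do_complex(A, b);
}

// Special version selected by the compiler whenever A is 1.
template <>
inline int foo<1>(int b)
{
    return call_easy(b);
}

inline void check_template_foo()
{
    assert(foo<1>(10) == 11);   // call syntax changes: foo<1>(b), not foo(1, b)
    assert(foo<2>(10) == 21);
}
```

The drawback mentioned above is visible here: every call site has to be rewritten, and a value known only at run time cannot be passed as A.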
But there is another way to handle such things if you really have constexpr values sometimes and your algorithm inside the function can be constexpr evaluated. In that case, you can decide inside the function if your function was called in a constexpr context. If that is the case, you can do a full compile time algorithm which also can contain your if ( a== 1) which will be fully evaluated in compile time. If the function is not called in constexpr context, the function is running as before without any additional overhead.
To make such a decision at compile time we need the current C++ standard (C++20)!
#include <iostream>
#include <type_traits>
constexpr int foo( int a, int)
{
if (std::is_constant_evaluated() )
{ // this part is fully evaluated in compile time!
if ( a == 1 )
{
return 1;
}
else
{
return 2;
}
}
else
{ // and the rest runs as before in runtime
if ( a == 0 )
{
return 3;
}
else
{
return 4;
}
}
}
int main()
{
constexpr int res1 = foo( 1,0 ); // fully evaluated during compile time
constexpr int res2 = foo( 2,0 ); // also full compile time
std::cout << res1 << std::endl;
std::cout << res2 << std::endl;
std::cout << foo( 5, 0) << std::endl; // here we go in runtime
std::cout << foo( 0, 0) << std::endl; // here we go in runtime
}
That code will return:
1
2
4
3
So we do not need to go with classic templates, no need to change the rest of the code but have full compile time optimization if possible.
@Sebastian's suggestion works, at least in the simple case, at all optimisation levels except -O0 with g++ 9.3.0 on Ubuntu 20.04 in C++20 mode. Thanks again.
The disassembly below shows main calling the correct subfunction func1 or func2 directly instead of the top-level func(). A similar disassembly at -O0 shows only the top-level func() being called, leaving the decision to run time, which is not desired.
I hope this will work in production code and perhaps with multiple hard coded arguments.
Breakpoint 1, main () at p1.cpp:24
24 int main() {
(gdb) disass /m
Dump of assembler code for function main():
6 inline void func(int a, int b) {
7
8 if (a == 1)
9 func1(b);
10 else
11 func2(a,b);
12 }
13
14 void func1(int b) {
15 std::cout << "func1 " << " " << " " << b << std::endl;
16 }
17
18 void func2(int a, int b) {
19 std::cout << "func2 " << a << " " << b << std::endl;
20 }
21
22 };
23
24 int main() {
=> 0x0000555555555286 <+0>: endbr64
0x000055555555528a <+4>: push %rbp
0x000055555555528b <+5>: push %rbx
0x000055555555528c <+6>: sub $0x18,%rsp
0x0000555555555290 <+10>: mov $0x28,%ebp
0x0000555555555295 <+15>: mov %fs:0x0(%rbp),%rax
0x000055555555529a <+20>: mov %rax,0x8(%rsp)
0x000055555555529f <+25>: xor %eax,%eax
25
26 X x1;
27
28 int b=1;
29 x1.func(1,b);
0x00005555555552a1 <+27>: lea 0x7(%rsp),%rbx
0x00005555555552a6 <+32>: mov $0x1,%esi
0x00005555555552ab <+37>: mov %rbx,%rdi
0x00005555555552ae <+40>: callq 0x55555555531e <X::func1(int)>
30
31 b=2;
32 x1.func(2,b);
0x00005555555552b3 <+45>: mov $0x2,%edx
0x00005555555552b8 <+50>: mov $0x2,%esi
0x00005555555552bd <+55>: mov %rbx,%rdi
0x00005555555552c0 <+58>: callq 0x5555555553de <X::func2(int, int)>
33
34 b=3;
35 x1.func(1,b);
0x00005555555552c5 <+63>: mov $0x3,%esi
0x00005555555552ca <+68>: mov %rbx,%rdi
0x00005555555552cd <+71>: callq 0x55555555531e <X::func1(int)>
36
37 b=4;
38 x1.func(2,b);
0x00005555555552d2 <+76>: mov $0x4,%edx
0x00005555555552d7 <+81>: mov $0x2,%esi
0x00005555555552dc <+86>: mov %rbx,%rdi
0x00005555555552df <+89>: callq 0x5555555553de <X::func2(int, int)>
39
40 return 0;
0x00005555555552e4 <+94>: mov 0x8(%rsp),%rax
0x00005555555552e9 <+99>: xor %fs:0x0(%rbp),%rax
0x00005555555552ee <+104>: jne 0x5555555552fc <main()+118>
0x00005555555552f0 <+106>: mov $0x0,%eax
0x00005555555552f5 <+111>: add $0x18,%rsp
0x00005555555552f9 <+115>: pop %rbx
0x00005555555552fa <+116>: pop %rbp
0x00005555555552fb <+117>: retq
0x00005555555552fc <+118>: callq 0x555555555100 <__stack_chk_fail@plt>
End of assembler dump.
I'm new to programming and have a question about using multiple operators on a single line.
Say, I have
int x = 0;
int y = 1;
int z = 2;
In this example, I can use a chain of assignment operators: x = y = z;
Yet how come I can't use: x < y < z;?
You can do that, but the results will not be what you expect.
bool can be implicitly converted to int. In that case, false becomes 0 and true becomes 1.
Let's say we have the following:
int x = -2;
int y = -1;
int z = 0;
Expression x < y < z will be evaluated as such:
x < y < z
(x < y) < z
(-2 < -1) < 0
(true) < 0
1 < 0
false
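The evaluation steps above can be spelled out in code. This is only an illustrative sketch: the function names are made up, and the chained form is written as two explicit steps so that it compiles without the -Wparentheses warning.

```cpp
#include <cassert>

// Same steps the compiler takes for x < y < z, i.e. (x < y) < z:
// the bool result of x < y is promoted to int before the second comparison.
inline bool chained_less(int x, int y, int z)
{
    const bool first = x < y;             // true or false
    return static_cast<int>(first) < z;   // 1 < z or 0 < z
}

// What is usually meant mathematically.
inline bool strictly_increasing(int x, int y, int z)
{
    return x < y && y < z;
}

inline void check_chained_less()
{
    assert(chained_less(-2, -1, 0) == false);        // (true) < 0 is 1 < 0
    assert(strictly_increasing(-2, -1, 0) == true);  // -2 < -1 and -1 < 0
}
```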
Operator = behaves differently: it groups right to left and returns its left-hand operand (after the assignment has been performed), so you can chain it:
x = y = z
x = (y = z)
//y holds the value of z now
x = (y)
//x holds the value of y now
gcc gives me the following warning after trying to use x < y < z:
prog.cc:18:3: warning: comparisons like 'X<=Y<=Z' do not have their mathematical meaning [-Wparentheses]
18 | x < y < z;
| ~~^~~
Which is pretty self-explanatory. It works, but not as one may expect.
Note: A class can define its own operator=, which may also do unexpected things when chained (nothing says "I hate you" better than an operator which doesn't follow basic rules and idioms). Fortunately, this cannot be done for primitive types like int.
class A
{
public:
A& operator= (const A& other)
{
n = other.n + 1;
return *this;
}
int n = 0;
};
int main()
{
A a, b, c;
a = b = c;
std::cout << a.n << ' ' << b.n << ' ' << c.n; //2 1 0, these objects are not equal!
}
Or even simpler:
class A
{
public:
void operator= (const A& other)
{
}
int n = 0;
};
int main()
{
A a, b, c;
a = b = c; //doesn't compile
}
x = y = z
You can think of the built-in assignment operator, =, for fundamental types returning a reference to the object being assigned to. That's why it's not surprising that the above works.
y = z returns a reference to y, then
x = y
x < y < z
The "less than" operator, <, returns true or false which would make one of the comparisons compare against true or false, not the actual variable.
x < y returns true or false, then
true or false < z where the boolean gets promoted to int which results in
1 or 0 < z
Workaround:
x < y < z should be written:
x < y && y < z
If you do this kind of manual BinaryPredicate chaining a lot, or have a lot of operands, it's easy to make mistakes and forget a condition somewhere in the chain. In that case, you can create helper functions to do the chaining for you. Example:
// matching exactly two operands
template<class BinaryPredicate, class T>
inline bool chain_binary_predicate(BinaryPredicate p, const T& v1, const T& v2)
{
return p(v1, v2);
}
// matching three or more operands
template<class BinaryPredicate, class T, class... Ts>
inline bool chain_binary_predicate(BinaryPredicate p, const T& v1, const T& v2,
const Ts&... vs)
{
return p(v1, v2) && chain_binary_predicate(p, v2, vs...);
}
And here's an example using std::less:
// bool r = 1<2 && 2<3 && 3<4 && 4<5 && 5<6 && 6<7 && 7<8
bool r = chain_binary_predicate(std::less<int>{}, 1, 2, 3, 4, 5, 6, 7, 8); // true
It is because you see those expressions as "chain of operators", but C++ has no such concept. C++ will execute each operator separately, in an order determined by their precedence and associativity (https://en.cppreference.com/w/cpp/language/operator_precedence).
(Expanded after C Perkins's comment)
James, your confusion comes from looking at x = y = z; as some special case of chained operators. In fact it follows the same rules as every other case.
This expression behaves the way it does because assignment = is right-to-left associative and returns its left-hand operand (as an lvalue, after the assignment). There are no special rules, so don't expect any for x < y < z.
By the way, x == y == z will not work the way you might expect either.
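A short sketch of why chained equality misleads. The function names are invented for illustration, and the chained form is written as two explicit steps that mirror how (x == y) == z is evaluated.

```cpp
#include <cassert>

// Same steps the compiler takes for x == y == z, i.e. (x == y) == z.
inline bool chained_equal(int x, int y, int z)
{
    const bool first = (x == y);
    return static_cast<int>(first) == z;   // compares 0 or 1 against z
}

// What is usually meant: all three values are equal.
inline bool all_equal(int x, int y, int z)
{
    return x == y && y == z;
}

inline void check_chained_equal()
{
    assert(chained_equal(2, 2, 2) == false);  // (2 == 2) is 1, and 1 != 2
    assert(chained_equal(5, 5, 1) == true);   // (5 == 5) is 1, and 1 == 1
    assert(all_equal(2, 2, 2) == true);
    assert(all_equal(5, 5, 1) == false);
}
```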
See also this answer.
C and C++ don't actually have the idea of "chained" operations. Each operation has a precedence, and they just follow the precedence using the results of the last operation like a math problem.
Note: I go into a low level explanation which I find to be helpful.
If you want to read a historical explanation, Davislor's answer may be helpful to you.
I also put a TL;DR at the bottom.
For example, std::cout isn't actually chained:
std::cout << "Hello!" << std::endl;
It relies on << grouping left to right and on each call returning a reference to the stream, so it actually does this:
std::ostream &tmp = std::operator<<(std::cout, "Hello!");
tmp.operator<<(std::endl);
(This is why printf is usually faster than std::cout in non-trivial outputs, as it doesn't require multiple function calls).
You can actually see this in the generated assembly (with the right flags):
#include <iostream>
int main(void)
{
std::cout << "Hello!" << std::endl;
}
clang++ --target=x86_64-linux-gnu -Oz -fno-exceptions -fomit-frame-pointer -fno-unwind-tables -fno-PIC -masm=intel -S
I am showing x86_64 assembly below, but don't worry, I documented it explaining each instruction so anyone should be able to understand.
I demangled and simplified the symbols. Nobody wants to read std::basic_ostream<char, std::char_traits<char> > 50 times.
# Logically, read-only code data goes in the .text section. :/
.globl main
main:
# Align the stack by pushing a scratch register.
# Small ABI lesson:
# Functions must have the stack 16 byte aligned, and that
# includes the extra 8 byte return address pushed by
# the call instruction.
push rax
# Small ABI lesson:
# On the System-V (non-Windows) ABI, the first two
# function parameters go in rdi and rsi.
# Windows uses rcx and rdx instead.
# Return values go into rax.
# Move the reference to std::cout into the first parameter (rdi)
# "offset" yields the address of the symbol as an immediate value;
# for most purposes, it is used for objects and literals
# in the same file.
mov edi, offset std::cout
# Move the pointer to our string literal into the second parameter (rsi/esi)
mov esi, offset .L.str
# rax = std::operator<<(rdi /* std::cout */, rsi /* "Hello!" */);
call std::operator<<(std::ostream&, const char*)
# Small ABI lesson:
# In almost all ABIs, member function calls are actually normal
# functions with the first argument being the 'this' pointer, so this:
# Foo foo;
# foo.bar(3);
# is actually called like this:
# Foo::bar(&foo /* this */, 3);
# Move the returned reference to the 'this' pointer parameter (rdi).
mov rdi, rax
# Move the address of std::endl to the first 'real' parameter (rsi/esi).
mov esi, offset std::ostream& std::endl(std::ostream&)
# rax = rdi.operator<<(rsi /* std::endl */)
call std::ostream::operator<<(std::ostream& (*)(std::ostream&))
# Zero out the return value.
# On x86, `xor dst, dst` is preferred to `mov dst, 0`.
xor eax, eax
# Realign the stack by popping to a scratch register.
pop rcx
# return eax
ret
# Bunch of generated template code from iostream
# Logically, text goes in the .rodata section. :/
.rodata
.L.str:
.asciz "Hello!"
Anyways, the = operator is a right to left operator.
struct Foo {
Foo();
// Why you don't forget Foo(const Foo&);
Foo& operator=(const Foo& other);
int x; // avoid any cheating
};
void set3Foos(Foo& a, Foo& b, Foo& c)
{
a = b = c;
}
This is equivalent to:
void set3Foos(Foo& a, Foo& b, Foo& c)
{
// a = (b = c)
Foo& tmp = b.operator=(c);
a.operator=(tmp);
}
Note: This is why the Rule of 3/Rule of 5 is important, and why inlining these is also important:
set3Foos(Foo&, Foo&, Foo&):
# Align the stack *and* save a preserved register
push rbx
# Backup `a` (rdi) into a preserved register.
mov rbx, rdi
# Move `b` (rsi) into the first 'this' parameter (rdi)
mov rdi, rsi
# Move `c` (rdx) into the second parameter (rsi)
mov rsi, rdx
# rax = rdi.operator=(rsi)
call Foo::operator=(const Foo&)
# Move `a` (rbx) into the first 'this' parameter (rdi)
mov rdi, rbx
# Move the returned Foo reference `tmp` (rax) into the second parameter (rsi)
mov rsi, rax
# rax = rdi.operator=(rsi)
call Foo::operator=(const Foo&)
# Restore the preserved register
pop rbx
# Return
ret
These "chain" because they all return the same type.
But < returns bool.
bool isInRange(int x, int y, int z)
{
return x < y < z;
}
It evaluates from left to right:
bool isInRange(int x, int y, int z)
{
bool tmp = x < y;
bool ret = (tmp ? 1 : 0) < z;
return ret;
}
isInRange(int, int, int):
# ret = 0 (we need manual zeroing because setl doesn't zero for us)
xor eax, eax
# (compare x, y)
cmp edi, esi
# ret = ((x < y) ? 1 : 0);
setl al
# (compare ret, z)
cmp eax, edx
# ret = ((ret < z) ? 1 : 0);
setl al
# return ret
ret
TL;DR:
x < y < z is pretty useless.
You probably want the && operator if you want to check x < y and y < z.
bool isInRange(int x, int y, int z)
{
return (x < y) && (y < z);
}
Because && short-circuits, this behaves like:
bool isInRange(int x, int y, int z)
{
if (!(x < y))
return false;
return y < z;
}
The historical reason for this is that C++ inherited these operators from C, which inherited them from an earlier language named B, which was based on BCPL, based on CPL, based on Algol.
Algol introduced “assignations” in 1968, which made assignments into expressions that returned a value. This allowed an assignment statement to pass its result along to the right-hand side of another assignment statement. This allowed chaining assignments. The = operator had to be parsed from right to left for this to work, which is the opposite of every other operator, but programmers had been used to that quirk since the ’60s. All the C-family languages inherited this, and C introduced a few others that work the same way.
The reason that serious bugs like if (euid = 0) or a < b < c compile at all is because of a simplification made in BCPL: truth values and numbers have the same type and can be used interchangeably. The B in BCPL stood for “Basic,” and the way it made itself so simple was to ditch the type system. All expressions were weakly-typed and the size of a machine register. Just one set of operators &, |, ^ and ~ did double duty for both integer and Boolean expressions, which let the language eliminate the Boolean type. Thus, a < b < c converts a < b into the numeric value of true or false, and compares that to c. In order for ~ to work as both bitwise and logical not, BCPL needed to define true as ~false, which is ~0. On most machines, that represents -1, but on some, it could be INT_MIN, a trap value, or -0. So, you could pass the “rvalue” of true to an arithmetic expression, but it wouldn’t be meaningful.
B, the predecessor of C, decided to keep the general idea, but go back to the Algol value of 1 for TRUE. This meant that ~ no longer changed TRUE to FALSE or vice versa. Since B didn’t have strong typing that could determine at compile time whether to use logical or bitwise not, it needed to create a separate ! operator. It also defined all nonzero integer values as truthy. It kept using bitwise & and |, even though these were now broken (1&2 is false even though both operands are truthy).
C added the && and || operators, to allow short-circuit optimization and, secondarily, to fix that problem with AND. It chose not to add a logical-xor, true to their philosophy of letting us shoot ourselves in the foot, so ^ breaks if we use it on a pair of different truthy numbers. (If you want a robust logical-xor, !!p ^ !!q.) Then, the designers made the very dubious choice not to add back a Boolean type, even though they had completely undone every benefit of eliminating it in the first place, and not having one now made the language more complicated, not less. Both C++ and the C standard library would later define bool, but by then it was too late. They were stuck with three more operators than they’d started with, and they had made typing = when you meant == into a deadly trap that has caused many security bugs.
Modern compilers try to mitigate the problems by assuming that any use of =, < and so on that violates most coding standards is probably a typo, and at least warning you about it. If you really meant to do that—one common example is if (errcode = library_call()) to both check if the call failed and save the error code in case it did—the convention is that an extra pair of parentheses tells the compiler you really meant it. So, a compiler would accept if ( 0 != (errcode = library_call()) ) without complaint. In C++17, you could also write if ( const auto errcode = library_call() ) or if ( const auto errcode = library_call(); errcode != 0 ). Similarly, the compiler would accept (foo < bar) < baz, but what you probably meant is foo < bar && bar < baz.
Even though it looks like you are assigning to multiple variables at the same time, it is actually a chain of sequential assignments. Specifically, y = z is evaluated first. The built-in = operator assigns the value of z to y and then returns an lvalue reference to y (source). That reference is then used to assign to x. So the code is basically equivalent to this
y = z;
x = y;
Applying the same logic to the comparison statement, with the difference that this one is evaluated left to right (source), we get the equivalent of
const bool first_comparison = x < y;
first_comparison < z;
Now, bool can be cast to int, but that is not what you want most of the time. As to why the language doesn't do what you want, it's because these operators are only defined as binary operators. Chained assignment just works because it can spare the return value so it was designed to return a reference to enable these semantics, but comparisons are required to return a bool and therefore they cannot be chained in a meaningful way without introducing new potentially breaking features to the language.
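A small sketch that checks the claim about the built-in assignment operator directly. The function name is made up; the point is that the result of y = z can bind to an int&, which only works because it is an lvalue referring to y.

```cpp
#include <cassert>

inline void check_assignment_chain()
{
    int x = 0, y = 1, z = 2;

    // y = z yields an lvalue referring to y, so it can bind to int&.
    int& ref = (y = z);
    assert(&ref == &y);
    assert(y == 2);

    // x = y = z groups as x = (y = z).
    y = 1;
    x = y = z;
    assert(x == 2 && y == 2 && z == 2);
}
```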
You can use x<y<z, but it does not give the result that you expect!
x<y<z is evaluated as (x<y)<z. Then x<y results in a boolean that will be either true or false. When you try to compare a boolean with the integer z, it gets integer promotion, with false being 0 and true being 1 (this is clearly defined by the C++ standard).
Demonstration:
int x=1,y=2,z=3;
cout << "x<y: "<< (x<y) << endl; // 1 since 1 is smaller than 2
cout << "x<y<z: "<< (x<y<z) <<endl; // 1 since boolean (x<y) is true, which is
// promoted to 1, which is smaller than 3
z=1;
cout << "x<y<z: "<< (x<y<z) <<endl; // 0 since boolean (x<y) is true, which is
// promoted to 1, which is not smaller than 1
You can use x=y=z, but it might not be what you expect either!
Be aware that = is the assignment operator and not the comparison for equality! = works right to left, copying the value on the right into the "lvalue" on the left. So here, it copies the value of z into y, then copies the value in y into x.
If you use this expression in a conditional (if, while, ...), it will be true if x ends up different from 0 and false otherwise; because x receives the value of z, the initial values of x and y do not matter.
Demonstration:
int x=1,y=2,z=3;
if (x=y=z)
cout << "Ouch! it's true and now all variables are 3" <<endl;
z=0;
if (x=y=z)
cout <<"Whatever"<<endl;
else
cout << "Ouch! it's false and now all the variables are 0"<<endl;
You can use x==y==z, but it might still not be what you expect!
Same as for x<y<z except that the comparison is for equality. So you end up comparing a promoted boolean with an integer value, and not checking at all that all values are equal!
Conclusions
If you want to compare more than 2 items in a chained way, just rewrite the expression comparing the terms two by two:
(x<y && y<z) // same truth value as the mathematical x<y<z
(x==y && y==z) // true if and only if all three terms are equal
Chaining the assignment operator is allowed, but tricky. It is sometimes used to initialize several variables at once. But it's not to be recommended as a general practice.
int i, j;
for (i=j=0; i<10 && j<5; j++) // trick !!
j+=2;
for (int i=0, j=0; i<10 && j<5; j++) // declaring both in the init-statement is cleaner
j+=2;
I can use x = y = z. Why not x < y < z?
You're essentially asking about syntax-idiomatic consistency here.
Well, just take consistency in the other direction: you should simply avoid using x = y = z. After all, it is not an assertion that x, y and z are equal; it is rather two consecutive assignments. And because it is reminiscent of a statement of equality, this double assignment is a bit confusing.
So, just write:
y = z;
x = y;
instead, unless there's a very particular reason to push everything into a single statement.
I am having a hard time understanding the scope in the following code:
(define (create-counter (x 1))
(let ([count 0])
(lambda()
(let ([temp count])
(set! count (+ x count)) temp))))
if I use:
(let ((c (create-counter ))) (+ (c) (c) (c) (c)))
the code works. However, if I try:
(+ (create-counter)(create-counter)(create-counter)(create-counter))
This does not work and gives me 0. Can someone please help me understand this thoroughly? If possible, please compare it to another language like C/C++; that would make it easier for me to grasp. Thanks
(define (create-counter (x 1))
(let ([count 0])
(lambda()
(let ([temp count])
(set! count (+ x count)) temp))))
Translates to:
auto create_counter(int x=1){
int count=0;
return [x,count]()mutable{
int r=count;
count+=x;
return r;
};
}
A simple C++14 function returning a closure object.
When you do this:
(let ((c (create-counter ))) (+ (c) (c) (c) (c)))
It is:
auto c = create_counter();
auto r = c()+c()+c()+c();
return r;
It creates one counter, then runs it 4 times, returning 0 1 2 3 and adding to 6.
In this case:
(+ ((create-counter))((create-counter))((create-counter))((create-counter)))
It is:
auto r = create_counter()()+create_counter()()+create_counter()()+create_counter()();
return r;
Which creates 4 counters, and runs each one once. The first time you run a counter you get 0. So this adds to 0.
The closure object has state. It returns a bigger number each time you call it.
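The two cases can be checked side by side. This sketch reuses the create_counter translation from above; the helper names sum_one_counter and sum_four_counters are invented for illustration, and loops are used instead of one long expression so the call order is explicit.

```cpp
#include <cassert>

// Same closure as in the translation above (C++14).
inline auto create_counter(int x = 1)
{
    int count = 0;
    return [x, count]() mutable {
        int r = count;
        count += x;
        return r;
    };
}

// One counter called four times: 0 + 1 + 2 + 3.
inline int sum_one_counter()
{
    auto c = create_counter();
    int total = 0;
    for (int i = 0; i < 4; ++i) total += c();
    return total;
}

// Four fresh counters, each called once: 0 + 0 + 0 + 0.
inline int sum_four_counters()
{
    int total = 0;
    for (int i = 0; i < 4; ++i) total += create_counter()();
    return total;
}

inline void check_counters()
{
    assert(sum_one_counter() == 6);
    assert(sum_four_counters() == 0);
}
```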
Now, you may not be familiar with C++11/14 lambdas.
auto create_counter(int x=1){
int count=0;
return [x,count]()mutable{
int r=count;
count+=x;
return r;
};
}
Is
struct counter {
int x,count;
int operator()(){
int r=count;
count+=x;
return r;
};
};
counter create_counter(int x=1){
return {x,0};
}
with some syntax sugar.
I fixed what seems to be a syntax error in your original code. I am no expert, so maybe I got it wrong.
As an aside, a briefer create counter looks like:
auto create_counter(int x=1){
return [=,count=0]()mutable{
int r=count;
count+=x;
return r;
};
}
When you call "create-counter", it creates a counter and then returns a procedure that refers to that particular counter. When you call "create-counter" four times, you're creating four separate counters; each procedure refers to its own counter. When you call "create-counter" once and then the resulting procedure four times, it's creating just one counter, and incrementing it four times.
It's a bit hard to compare this to C, since C and C++ are quite weak in the area of closures; it's not easy to return a function that's defined inside of another function.
The closest analog might be a "counter" object in C++; think of "create-counter" as the constructor for an object containing a single integer, and the resulting procedure as an "increment" method that increments the counter contained in that object. In your second example, then, you're creating four distinct objects, where in your first example, you're creating one object and calling its "increment" method four times.
I was reading up on the difference between tail recursion and traditional recursion and found it mentioned that "Tail recursion however is a form of recursion that doesn’t use any stack space, and thus is a way to use recursion safely."
I am struggling to understand how.
Comparing finding factorial of a number using the Traditional and tail recursion
Traditional recursion
/* traditional recursion */
fun(5);
int fun(int n)
{
if(n == 0)
return 1;
return n * fun(n-1);
}
Here, the call stack would look like
5 * fun(4)
|
4 * fun(3)
|
3 * fun(2)
|
2 * fun(1)
|
1 * fun(0)
|
1
Tail recursion
/* tail recursion */
fun(5,1)
int fun(int n, int sofar)
{
int ret = 0;
if(n == 0)
return sofar;
ret = fun(n-1,sofar*n);
return ret;
}
However, even here, the variable 'sofar' would hold 1, 5, 20, 60, 120 and 120 at different points.
But once return is reached in the base case, which is recursive invocation #5, it still has to return 120 to recursive invocation #4, then to #3, #2, #1 and back to main.
So I mean to say that the stack is used, and every time you return to the previous call, the variables at that point in time can be seen, which means they are being saved at each step.
Unless the tail recursion were written like below, I am not able to understand how it saves stack space.
/* tail recursion */
fun(5,1)
int fun(int n, int sofar)
{
int ret = 0;
if(n == 0)
return 'sofar' back to main function, stop recursing back; just a one-shot return
ret = fun(n-1,sofar*n);
return ret;
}
PS : I have read few threads on SO and came to understand what tail recursion is, however, this question is more related to why it saves stack space. I could not find a similar question where this was discussed.
The trick is that if the compiler notices the tail recursion, it can compile a goto instead. It will generate something like the following code:
int fun_optimized(int n, int sofar)
{
start:
if(n == 0)
return sofar;
sofar = sofar*n;
n = n-1;
goto start;
}
And as you can see, the stack space is reused for each iteration.
Note that this optimization can only be done if the recursive call is the very last action in the function, that is tail recursion (try doing it manually to the non-tail case and you'll see that's just impossible).
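As a sketch of that transformation done by hand, the tail-recursive function from the question can be compared with a plain loop that overwrites the same two variables. The name fun_loop is made up for this comparison; a compiler is not required to perform the optimization, so this only shows that the two forms compute the same thing.

```cpp
#include <cassert>

// Tail-recursive form from the question: the recursive call is the last action.
inline int fun(int n, int sofar)
{
    if (n == 0)
        return sofar;
    return fun(n - 1, sofar * n);
}

// Hand-written loop equivalent to the goto version: the same two
// variables are overwritten on every pass, so no new frame is needed.
inline int fun_loop(int n, int sofar)
{
    while (n != 0) {
        sofar = sofar * n;
        n = n - 1;
    }
    return sofar;
}

inline void check_fun()
{
    for (int n = 0; n <= 10; ++n)
        assert(fun(n, 1) == fun_loop(n, 1));
    assert(fun_loop(5, 1) == 120);
}
```

In the non-tail version, n * fun(n-1) still has a multiplication pending after the call returns, which is why the same rewrite does not apply there without introducing an accumulator.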
A function call is a tail call when the (recursive) call is performed as the final action. Since the current instance has finished executing at that point, there is no need to maintain its stack frame.
In this case, creating a stack frame on top of the current stack frame is nothing more than waste.
When the compiler recognizes a recursion to be a tail recursion, it does not create a nested stack frame for each call; instead it reuses the current stack frame. This is equivalent in effect to a goto statement, and it makes the function iterative rather than recursive.
Note that in traditional recursion, every recursive call has to complete before the multiplication can be performed:
fun(5)
5 * fun(4)
5 * (4 * fun(3))
5 * (4 * (3 * fun(2)))
5 * (4 * (3 * (2 * fun(1))))
5 * (4 * (3 * (2 * 1)))
120
Nested stack frames are needed in this case. Look at the wiki for more information.
In case of tail recursion, with each call of fun, variable sofar is updated:
fun(5, 1)
fun(4, 5)
fun(3, 20)
fun(2, 60)
fun(1, 120)
fun(0, 120)
120
No need to save stack frame of current recursive call.
I have a question from an exam in which I need to deduce the output of the following code:
int foo(int a) {
    print 'F';
    if (a <= 1) return 1;
    return bar(a, foo(a-1));
}

int bar(int x, int y) {
    print 'B';
    if (x > y) return baz(x, y);
    return baz(y, x);
}

int baz(int x, int y) {
    print 'Z';
    if (y == 0) return 0;
    return baz(x, y-1) + x;
}

void main() {
    foo(3);
}
My question is: what tactic is best for solving this kind of question? I'm not allowed to use a PC, of course.
P.S. You can use eager evaluation as in C++ or normal-order evaluation (the output will be different, of course, but I'm interested in tactics only). I tried to solve it using a stack, writing down each function as I call it, but it still gets complicated.
Thanks in advance for any help.
I would use a "bottom-to-top" approach:
baz is the function that is called but doesn't call other functions (except itself). It outputs 'Z' exactly y + 1 times, and the return value is x*y (you add x after each call).
bar is the "next higher" function. It outputs 'B' once and calls baz with its smaller argument as the second parameter, so its return value is x*y, too.
foo is the "top" function (right after main) and it's the most complicated one. It outputs 'F' not only once but a times (because of the foo(a-1) at the end, which is evaluated before the bar call). The bar call multiplies a and foo(a-1), which in turn multiplies a-1 and foo(a-2), and so on, until foo(1) is evaluated and returns 1. So the return value is a * (a-1) * ... * 2 * 1, that is, a!.
This is not a complete analysis (for example, we don't know in which order the characters will be output), but it is a rough scheme of what happens. As you and other people in the comments pointed out, this is what you want: tactics instead of a complete answer.
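For checking such an analysis afterwards (not during the exam, of course), the exam code can be turned into C++ under eager evaluation. This is a sketch with one assumption: `print` is modelled as appending a character to a string returned by the hypothetical helper output().

```cpp
#include <cassert>
#include <string>

// Collects what the exam code would print
// (assumption: `print` appends one character to this string).
inline std::string& output()
{
    static std::string out;
    return out;
}

int baz(int x, int y)
{
    output() += 'Z';
    if (y == 0) return 0;
    return baz(x, y - 1) + x;
}

int bar(int x, int y)
{
    output() += 'B';
    if (x > y) return baz(x, y);
    return baz(y, x);
}

int foo(int a)
{
    output() += 'F';
    if (a <= 1) return 1;
    return bar(a, foo(a - 1));   // foo(a - 1) is evaluated before bar runs
}

inline void check_trace()
{
    output().clear();
    const int result = foo(3);
    assert(result == 6);                 // 3! = 6
    assert(output() == "FFFBZZBZZZ");    // F x3, then B + Z x2, then B + Z x3
}
```

The trace matches the counts derived above: 'F' appears a times, each bar call prints one 'B', and each baz call prints 'Z' y + 1 times.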
What I'd probably do is to start with the main() function at the top left corner of the page, write down the first line executed, keeping track of local variables etc., then write the next line under it and so on.
But when a function is called, also move right by one column, writing down the function's name and the actual values of the input arguments for that invocation first, and then proceeding with the lines in that function.
When you return from the function, move left and write the return value between the two columns.
Also, keep a separate area for the "standard output", where all the printed text goes.
These steps should take you through most of "think like a computer" problems.