string strLine;//not constant
int index = 0;
while(index < strLine.length()){//strLine is not modified};
How many times is strLine.length() evaluated?
Do we need to assign strLine.length() to a variable such as nLength just before the loop and use that instead?
length() is evaluated every time you go through the loop. However, since length() is constant time (O(1)), it doesn't make much difference, and storing the value in a variable will probably have a negligible effect, at a small cost in readability (and it will break the code if the string is ever changed inside the loop).
length() is defined in headers that are included in your source file, so the compiler can inline it, and its internal calls can be inlined as well. If the compiler is able to detect that your string instance isn't changed in the loop, it can optimize access to the string's length so that it is evaluated only once.
In any case, I don't think storing the string's length is really necessary. It may save you a few nanoseconds, but your code will be bigger, and there is some risk if you later decide to change the string inside the loop.
It is called each time the while condition is evaluated.
If you are not changing the string length, you are better off with a temporary variable like:
string strLine;
int stringLength = strLine.length();
int index = 0;
while(index < stringLength){/* strLine is not modified */}
I think there is a second question lurking inside this, and that's "which implementation is more clear?"
If, semantically, you mean for the length of strLine to never change inside the body of the loop, make it obvious by assigning to a well named variable. I'd even make it const. This makes it clear to other programmers (and yourself) that the comparison value is never changing.
The other thing this does is make it easier to see what that value is when you're stepping through the code in a debugger. Hover-over works a lot better on a local than it does on a function call.
Saying, "leave it as a function call; the compiler will optimize it" strikes me as premature pessimization. Even though length() is O(1), if it is not inlined (you can't guarantee that optimizations aren't disabled) it is still a nontrivial function call. By using a local variable, you clarify your meaning, and you get a possibly non-trivial performance optimization.
Do what makes your intent most clear.
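To make the "well named const local" suggestion concrete, here is a minimal sketch. The function name countSpaces and its body are made up for illustration; the only point is that the loop bound is captured once in a const variable, which documents that it does not change inside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the spaces in a line without modifying it.
std::size_t countSpaces(const std::string& strLine) {
    // Captured once; const tells the reader (and the compiler) that the
    // bound is fixed for the whole loop.
    const std::size_t length = strLine.length();
    std::size_t spaces = 0;
    for (std::size_t index = 0; index < length; ++index) {
        if (strLine[index] == ' ') {
            ++spaces;
        }
    }
    assert(spaces <= length);  // sanity check: cannot count more spaces than characters
    return spaces;
}
```

Using std::size_t rather than int for the index also avoids the signed/unsigned comparison that the original loop has.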
strLine.length() will be evaluated every time the condition while( i < strLine.length() ) is checked.
Having said that, if the string is constant, most compilers will optimize this (with proper settings).
If you are going to use a temporary variable, use a const qualifier, so the compiler can add optimizations knowing that the value will not change:
string strLine;//not constant
int index = 0;
const int strLength = strLine.length();
while(index < strLength){//strLine is not modified};
Chances are that the compiler itself makes those optimizations when accessing the length() method anyway.
Edit: my assembly is a little rusty, but I think that the evaluation takes place just once.
Given this code:
int main()
{
std::string strLine="hello world";
for (int i=0; i < strLine.length(); ++i)
{
std::cout << strLine[i] <<std::endl;
}
}
Generates this assembly:
for (int i=0; i < strLine.length(); ++i)
0040104A cmp dword ptr [esp+20h],esi
0040104E jbe main+86h (401086h)
But for this code
std::string strLine="hello world";
const int strLength = strLine.length();
for (int i=0; i < strLength ; ++i)
{
std::cout << strLine[i] <<std::endl;
}
generates this one:
for (int i=0; i < strLength ; ++i)
0040104F cmp edi,esi
00401051 jle main+87h (401087h)
The same assembly is generated if a const qualifier is not used, so in this case it doesn't make a difference.
Tried with VSC++ 2005
As stated, since the string::length function is likely entirely defined in a header, and is required to be O(1), it's almost certain to evaluate to a simple member access, and get inlined into your code. Since you don't declare the string as volatile, the compiler is allowed to imagine that no outside code is going to change it, and optimize the call to a single memory access and leave the value in a register if it finds that that is a good idea.
By grabbing and caching the value yourself, you increase the chances that the compiler will be able to do the same thing. In many cases, the compiler will not even generate the code to write the string length into the stack, and just leave it in a register. Of course, if you call out to different functions that the compiler cannot inline, then the value will have to be written to the stack to prevent the function calls from turfing the register.
Since you are not changing the string, shouldn't you be using
const string strLine;
Just, because then the compiler gets some more information on what can and what cannot change - not sure exactly how smart a C++ compiler can get, though.
strLine.length() will be evaluated every time you go around the loop.
You're correct in that it would be more efficient to use nLength, especially if strLine is long.
What is wasted in the example from the Cpp Core Guidelines?
P.9: Don't waste time or space
[...]
void lower(zstring s)
{
for (int i = 0; i < strlen(s); ++i) s[i] = tolower(s[i]);
}
Yes, this is an example from production code. We leave it to the reader to figure out what's wasted.
from https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rp-waste
strlen is calculated at every iteration of the loop.
strlen is called every time the loop condition is checked, and takes O(n) time per call, so the total time for the loop is O(n^2).
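A minimal sketch of the straightforward fix, assuming zstring is simply a char* to a nul-terminated, writable buffer (the name lower_cached is made up here): the length is computed once before the loop, so the whole function makes a constant number of passes over the string instead of one pass per character.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical variant of lower() with the strlen call hoisted out of the loop.
void lower_cached(char* s) {
    assert(s != nullptr);
    const std::size_t n = std::strlen(s);  // one O(n) scan instead of one per iteration
    for (std::size_t i = 0; i < n; ++i) {
        // The cast to unsigned char avoids undefined behavior for negative char values.
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    }
}
```

Hoisting is safe here only because tolower never produces or removes a nul character, so the length cannot change while the loop runs.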
A lot of time is wasted, and a segmentation fault may occur, because the author of the code is incrementing s, not i, in the loop:
for (int i = 0; i < strlen(s); ++s)
//right here ^^^^
As other answers have already stated, strlen(s) is called multiple times because it is in the condition, implying that it should be cached and reused instead.
But strlen(s) does not actually need to be called at all! s is (or is implicitly convertible to) a nul-terminated char array, since that's what strlen expects. So we can just use this very property for our own loop.
void lower(zstring s) {
for (char *p = s; *p; ++p)
*p = std::tolower((unsigned char)*p);
}
Unless the zstring class has some very unintuitive semantics, the function in its current form is a complete waste of both time and space, as its "result" can't be used after the function: it is passed in by value, and isn't returned.
So to avoid wasting the time to uselessly compute the lowercase which can't be used, and the space in copying the passed parameter, I would pass by reference!
I've been using C/C++ for about three years and I can't believe I've never encountered this issue before!
This following code compiles (I've just tried using gcc):
#include <iostream>
int change_i(int i) {
int j = 8;
return j;
}
int main() {
int i = 10;
change_i(10);
std::cout << "i = " << i << std::endl;
}
And, the program prints i = 10, as you might expect.
My question is -- why does this compile? I would have expected an error, or at least a warning, saying there was a value returned which is unused.
Naively, I would consider this a similar case to when you accidentally forget the return call in a non-void function. I understand it's different and I can see why there's nothing inherently wrong with this code, but it seems dangerous. I've just spotted a similar error in some very old code of mine, representing a bug which goes back a long time. I obviously meant to do:
i = change_i(10);
But forgot, so it was never changed (I know this example is silly, the exact code is much more complicated). Any thoughts would be much appreciated!
It compiles because calling a function and ignoring the return result is very common. In fact, the last line of main does so too.
std::cout << "i = " << i << std::endl;
is actually short for:
(std::cout).operator<<("i =").operator<<(i).operator<<(std::endl);
... and you are not using the value returned from the final operator<<.
Some static checkers have options to warn when function returns are ignored (and then options to annotate a function whose returns are often ignored). Gcc has an option to mark a function as requiring the return value be used (__attribute__((warn_unused_result))) - but it only works if the return type doesn't have a destructor :-(.
Ignoring the return value of a function is perfectly valid. Take this for example:
printf("hello\n");
We're ignoring the return value of printf here, which returns the number of characters printed. In most cases, you don't care how many characters are printed. If compilers warned about this, everyone's code would show tons of warnings.
This is actually a specific case of ignoring the value of an expression, where in this case the value of the expression is the return value of a function.
Similarly, if you do this:
i++;
You have an expression whose value is discarded (i.e. the value of i before being incremented), however the ++ operator still increments the variable.
An assignment is also an expression:
i = j = k;
Here, you have two assignment expressions. One is j = k, whose value is the value of k (which was just assigned to j). This value is then used as the right hand side of another assignment to i. The value of the i = (j = k) expression is then discarded.
This is very different from not returning a value from a non-void function. In that case, the value returned by the function is undefined, and attempting to use that value results in undefined behavior.
There is nothing undefined about ignoring the value of an expression.
The short reason it is allowed is because that's what the standard specifies.
The statement
change_i(10);
discards the value returned by change_i().
The longer reason is that most expressions both have an effect and produce a result. So
i = change_i(10);
will set i to be 8, but the assignment expression itself also has a result of 8. This is why (if j is of type int)
j = i = change_i(10);
will cause both j and i to have the value of 8. This sort of logic can continue indefinitely - which is why expressions can be chained, such as k = i = j = 10. So - from a language perspective - it does not make sense to require that a value returned by a function is assigned to a variable.
If you want to explicitly discard the result of a function call, it is possible to do
(void)change_i(10);
and a statement like
j = (void)change_i(10);
will not compile, typically due to a mismatch of types (an int cannot be assigned the value of something of type void).
All that said, several compilers (and static code analysers) can actually be configured to give a warning if the caller does not use a value returned by a function. Such warnings are turned off by default - so it is necessary to compile with appropriate settings (e.g. command line options).
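Since C++17 the language itself also provides a standard attribute for this, [[nodiscard]]. The sketch below applies it to a variant of change_i from the question; the function demo is made up for illustration. Compilers are encouraged (not required) to warn when the result of such a function is discarded, and an explicit cast to void states that the discard is intentional.

```cpp
#include <cassert>

// C++17: ask the compiler to warn when a caller ignores the result.
[[nodiscard]] int change_i(int /*i*/) {
    int j = 8;
    return j;
}

// Hypothetical caller showing the three situations.
int demo() {
    int i = 10;
    // change_i(10);       // discarding the result here typically draws a warning
    (void)change_i(10);    // explicit discard: the cast signals that this is intended
    i = change_i(10);      // the use that was meant in the question
    assert(i == 8);
    return i;
}
```

With the attribute in place, the forgotten assignment from the question would most likely have been flagged at compile time.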
I've been using C/C++ for about three years
I can suppose that during these three years you used standard C function printf. For example
#include <stdio.h>
int main( void )
{
printf( "Hello World!\n" );
}
The function has a return type other than void. However, I am sure that in most cases you did not use its return value.
If the compiler were required to issue an error whenever the return value of a function is not used, code like the above would not compile, because the compiler does not have access to the source code of the function and cannot determine whether the function has a side effect.
Consider other standard C functions, such as the string functions.
For example function strcpy is declared like
char * strcpy( char *destination, const char *source );
If you have for example the following character arrays
char source[] = "Hello World!";
char destination[sizeof( source )];
then the function usually is called like
strcpy( destination, source );
There is no point in using its return value when you just need to copy a string. Moreover, for the example shown you cannot even write
destination = strcpy( destination, source );
The compiler will issue an error.
So, as you can see, it sometimes makes sense to ignore the return values of functions.
For your own example, the compiler could issue a message that the function has no side effect, so the call is redundant. In any case, it should issue a message that the function parameter is not used.
Take into account that sometimes the compiler does not see a function definition because it lives in another compilation unit or in a library, so the compiler is unable to determine whether the function has a side effect.
In most cases compilers deal with function declarations. Sometimes the function definitions are not available to compilers in C and C++.
Question #1: Is declaring a variable inside a loop a good practice or bad practice?
I've read the other threads about whether or not there is a performance issue (most said no), and that you should always declare variables as close to where they are going to be used. What I'm wondering is whether or not this should be avoided or if it's actually preferred.
Example:
for(int counter = 0; counter <= 10; counter++)
{
string someString = "testing";
cout << someString;
}
Question #2: Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
This is excellent practice.
By creating variables inside loops, you ensure their scope is restricted to inside the loop. It cannot be referenced nor called outside of the loop.
This way:
If the name of the variable is a bit "generic" (like "i"), there is no risk to mix it with another variable of same name somewhere later in your code (can also be mitigated using the -Wshadow warning instruction on GCC)
The compiler knows that the variable scope is limited to inside the loop, and therefore will issue a proper error message if the variable is by mistake referenced elsewhere.
Last but not least, some dedicated optimization can be performed more efficiently by the compiler (most importantly register allocation), since it knows that the variable cannot be used outside of the loop. For example, no need to store the result for later re-use.
In short, you are right to do it.
Note however that the variable is not supposed to retain its value between iterations. In that case, you may need to initialize it every time. You can also create a larger block, encompassing the loop, whose sole purpose is to declare variables which must retain their value from one iteration to the next. This typically includes the loop counter itself.
{
int i, retainValue;
for (i=0; i<N; i++)
{
int tmpValue;
/* tmpValue is uninitialized */
/* retainValue still has its previous value from previous loop */
/* Do some stuff here */
}
/* Here, retainValue is still valid; tmpValue no longer */
}
For question #2:
The variable is allocated once, when the function is called. In fact, from an allocation perspective, it is (nearly) the same as declaring the variable at the beginning of the function. The only difference is the scope: the variable cannot be used outside of the loop. It may even be possible that the variable is not allocated, just re-using some free slot (from other variable whose scope has ended).
With restricted and more precise scope come more accurate optimizations. But more importantly, it makes your code safer, with less states (i.e. variables) to worry about when reading other parts of the code.
This is true even outside of an if(){...} block. Typically, instead of :
int result;
(...)
result = f1();
if (result) then { (...) }
(...)
result = f2();
if (result) then { (...) }
it's safer to write :
(...)
{
int const result = f1();
if (result) then { (...) }
}
(...)
{
int const result = f2();
if (result) then { (...) }
}
The difference may seem minor, especially on such a small example.
But on a larger code base, it will help: now there is no risk of carrying some result value from the f1() block over to the f2() block. Each result is strictly limited to its own scope, making its role more precise. From a reviewer's perspective, it's much nicer, since there are fewer long-range state variables to worry about and track.
The compiler helps more, too: suppose that, in the future, after some erroneous change to the code, result is no longer initialized with f2(). The second version will simply refuse to compile, with a clear error message at compile time (way better than run time). The first version will not catch anything; the result of f1() will simply be tested a second time, mistaken for the result of f2().
Complementary information
The open-source tool CppCheck (a static analysis tool for C/C++ code) provides some excellent hints regarding optimal scope of variables.
In response to comment on allocation:
The above rule is true in C, but might not be for some C++ classes.
For standard types and structures, the size of variable is known at compilation time. There is no such thing as "construction" in C, so the space for the variable will simply be allocated into the stack (without any initialization), when the function is called. That's why there is a "zero" cost when declaring the variable inside a loop.
However, for C++ classes, there is this constructor thing which I know much less about. I guess allocation is probably not going to be the issue, since the compiler shall be clever enough to reuse the same space, but the initialization is likely to take place at each loop iteration.
Generally, it's a very good practice to keep it very close.
In some cases, there will be a consideration such as performance which justifies pulling the variable out of the loop.
In your example, the program creates and destroys the string each time. Some libraries use a small string optimization (SSO), so the dynamic allocation could be avoided in some cases.
Suppose you wanted to avoid those redundant creations/allocations, you would write it as:
for (int counter = 0; counter <= 10; counter++) {
// compiler can pull this out
const char testing[] = "testing";
cout << testing;
}
or you can pull the constant out:
const std::string testing = "testing";
for (int counter = 0; counter <= 10; counter++) {
cout << testing;
}
Do most compilers realize that the variable has already been declared and just skip that portion, or does it actually create a spot for it in memory each time?
It can reuse the space the variable consumes, and it can pull invariants out of your loop. In the case of the const char array (above) - that array could be pulled out. However, the constructor and destructor must be executed at each iteration in the case of an object (such as std::string). In the case of the std::string, that 'space' includes a pointer which contains the dynamic allocation representing the characters. So this:
for (int counter = 0; counter <= 10; counter++) {
string testing = "testing";
cout << testing;
}
would require redundant copying in each case, and dynamic allocation and free if the variable sits above the threshold for SSO character count (and SSO is implemented by your std library).
Doing this:
string testing;
for (int counter = 0; counter <= 10; counter++) {
testing = "testing";
cout << testing;
}
would still require a physical copy of the characters at each iteration, but the form could result in one dynamic allocation because you assign the string and the implementation should see there is no need to resize the string's backing allocation. Of course, you wouldn't do that in this example (because multiple superior alternatives have already been demonstrated), but you might consider it when the string or vector's content varies.
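As a minimal sketch of that last case, where the content varies per iteration, the following hypothetical function (totalLabelLength and the "item-" labels are made up for illustration) declares the string once outside the loop so that each assignment can reuse the existing buffer. Whether an allocation is actually saved depends on the standard library implementation and on SSO.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example: builds "item-<n>" labels in a loop and sums their lengths.
std::size_t totalLabelLength(int count) {
    assert(count >= 0);
    std::string label;             // declared outside the loop so its buffer can be reused
    std::size_t total = 0;
    for (int n = 0; n < count; ++n) {
        label = "item-";           // assignment may reuse the existing allocation
        label += std::to_string(n);
        total += label.size();
    }
    return total;
}
```

The price is a wider scope for label, so this is only worth doing once profiling shows the per-iteration construction matters.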
So what do you do with all those options (and more)? Keep it very close as a default -- until you understand the costs well and know when you should deviate.
I didn't post to answer JeremyRR's questions (as they have already been answered); instead, I posted merely to give a suggestion.
To JeremyRR, you could do this:
{
string someString = "testing";
for(int counter = 0; counter <= 10; counter++)
{
cout << someString;
}
// The variable is in scope.
}
// The variable is no longer in scope.
I don't know if you realize (I didn't when I first started programming), that brackets (as long they are in pairs) can be placed anywhere within the code, not just after "if", "for", "while", etc.
My code compiled in Microsoft Visual C++ 2010 Express, so I know it works; also, I have tried to use the variable outside of the brackets that it was defined in and I received an error, so I know that the variable was "destroyed".
I don't know if it is bad practice to use this method, as a lot of unlabeled brackets could quickly make the code unreadable, but maybe some comments could clear things up.
For C++ it depends on what you are doing.
OK, it is stupid code but imagine
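#include <chrono>
#include <iostream>
#include <thread>
using namespace std;

class myTimeEatingClass
{
public:
    // constructor: pretends to do 2 seconds of work
    myTimeEatingClass()
    {
        this_thread::sleep_for(chrono::seconds(2));
        ms_usedTime += 2;
    }
    // destructor: pretends to do 3 seconds of work
    ~myTimeEatingClass()
    {
        this_thread::sleep_for(chrono::seconds(3));
        ms_usedTime += 3;
    }
    // total seconds spent so far in constructors and destructors
    static unsigned int getTime()
    {
        return ms_usedTime;
    }
    static unsigned int ms_usedTime;
};
unsigned int myTimeEatingClass::ms_usedTime = 0;

void myFunc()
{
    for (int counter = 0; counter <= 10; counter++) {
        myTimeEatingClass timeEater; // constructed and destroyed on every iteration
        //do something
    }
    // timeEater is out of scope here, so read the static counter instead
    cout << "Creating class took " << myTimeEatingClass::getTime() << " seconds in total" << endl;
}

void myOtherFunc()
{
    myTimeEatingClass timeEater; // constructed once, destroyed when the function returns
    for (int counter = 0; counter <= 10; counter++) {
        //do something
    }
    cout << "Creating class took " << timeEater.getTime() << " seconds in total" << endl;
}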
You will wait 55 seconds until you get the output of myFunc, because each of the 11 loop iterations needs 5 seconds for the constructor and destructor together.
You will wait only 2 seconds until you get the output of myOtherFunc, since the constructor runs once, plus another 3 seconds for the destructor when the function returns.
Of course, this is a crazy example.
But it illustrates that repeating the same construction on every iteration can become a performance issue when the constructor and/or destructor takes some time.
Since your second question is more concrete, I'm going to address it first, and then take up your first question with the context given by the second. I wanted to give a more evidence-based answer than what's here already.
Question #2: Do most compilers realize that the variable has already
been declared and just skip that portion, or does it actually create a
spot for it in memory each time?
You can answer this question for yourself by stopping your compiler before the assembler is run and looking at the asm. (Use the -S flag if your compiler has a gcc-style interface, and -masm=intel if you want the syntax style I'm using here.)
In any case, with modern compilers (gcc 10.2, clang 11.0) for x86-64, they only reload the variable on each loop pass if you disable optimizations. Consider the following C++ program—for intuitive mapping to asm, I'm keeping things mostly C-style and using an integer instead of a string, although the same principles apply in the string case:
#include <iostream>
static constexpr std::size_t LEN = 10;
void fill_arr(int a[LEN])
{
/* *** */
for (std::size_t i = 0; i < LEN; ++i) {
const int t = 8;
a[i] = t;
}
/* *** */
}
int main(void)
{
int a[LEN];
fill_arr(a);
for (std::size_t i = 0; i < LEN; ++i) {
std::cout << a[i] << " ";
}
std::cout << "\n";
return 0;
}
We can compare this to a version with the following difference:
/* *** */
const int t = 8;
for (std::size_t i = 0; i < LEN; ++i) {
a[i] = t;
}
/* *** */
With optimization disabled, gcc 10.2 puts 8 on the stack on every pass of the loop for the declaration-in-loop version:
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
mov DWORD PTR -12[rbp], 8 ;✷
whereas it only does it once for the out-of-loop version:
mov DWORD PTR -12[rbp], 8 ;✷
mov QWORD PTR -8[rbp], 0
.L3:
cmp QWORD PTR -8[rbp], 9
ja .L4
Does this make a performance impact? I didn't see an appreciable difference in runtime between them with my CPU (Intel i7-7700K) until I pushed the number of iterations into the billions, and even then the average difference was less than 0.01s. It's only a single extra operation in the loop, after all. (For a string, the difference in in-loop operations is obviously a bit greater, but not dramatically so.)
What's more, the question is largely academic, because with an optimization level of -O1 or higher gcc outputs identical asm for both source files, as does clang. So, at least for simple cases like this, it's unlikely to make any performance impact either way. Of course, in a real-world program, you should always profile rather than make assumptions.
Question #1: Is declaring a variable inside a loop a good practice or
bad practice?
As with practically every question like this, it depends. If the declaration is inside a very tight loop and you're compiling without optimizations, say for debugging purposes, it's theoretically possible that moving it outside the loop would improve performance enough to be handy during your debugging efforts. If so, it might be sensible, at least while you're debugging. And although I don't think it's likely to make any difference in an optimized build, if you do observe one, you/your pair/your team can make a judgement call as to whether it's worth it.
At the same time, you have to consider not only how the compiler reads your code, but also how it comes off to humans, yourself included. I think you'll agree that a variable declared in the smallest scope possible is easier to keep track of. If it's outside the loop, it implies that it's needed outside the loop, which is confusing if that's not actually the case. In a big codebase, little confusions like this add up over time and become fatiguing after hours of work, and can lead to silly bugs. That can be much more costly than what you reap from a slight performance improvement, depending on the use case.
Once upon a time (pre C++98); the following would break:
{
for (int i=0; i<.; ++i) {std::string foo;}
for (int i=0; i<.; ++i) {std::string foo;}
}
with the warning that i was already declared (foo was fine as that's scoped within the {}). This is likely the WHY people would first argue it's bad. It stopped being true a long time ago though.
If you STILL have to support such an old compiler (some people are on Borland) then the answer is yes, a case could be made to put the i outside the loop, because not doing so makes it "harder" for people to put multiple loops in with the same variable, though honestly the compiler will still fail, which is all you want if there's going to be a problem.
If you no longer have to support such an old compiler, variables should be kept to the smallest scope you can get them so that you not only minimise the memory usage; but also make understanding the project easier. It's a bit like asking why don't you have all your variables global. Same argument applies, but the scopes just change a bit.
It's a very good practice. The answers above cover the theory well, so let me give a concrete piece of code. I was solving a DFS problem on GeeksforGeeks when I ran into this issue.
If you declare the integers outside the loop, the solution fails: once flag has been set to 1 it is never reset to 0, so the stack is never popped again.
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
int flag=0;
int top=0;
while(!st.empty()){
top = st.top();
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
Now declare the integers inside the loop, and you get the correct answer:
stack<int> st;
st.push(s);
cout<<s<<" ";
vis[s]=1;
// int flag=0;
// int top=0;
while(!st.empty()){
int top = st.top();
int flag = 0;
for(int i=0;i<g[top].size();i++){
if(vis[g[top][i]] != 1){
st.push(g[top][i]);
cout<<g[top][i]<<" ";
vis[g[top][i]]=1;
flag=1;
break;
}
}
if(!flag){
st.pop();
}
}
This reflects exactly what #justin was saying in the 2nd comment.
Try it here:
https://practice.geeksforgeeks.org/problems/depth-first-traversal-for-a-graph/1. Give it a shot and you will see. Hope this helps.
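For readers without access to the judge, here is a self-contained sketch of the same iterative DFS. The function name dfsOrder and the adjacency-list parameter are assumptions for illustration, and it returns the visit order instead of printing it. The per-iteration flag is declared inside the while loop, so it starts out false on every pass.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <vector>

// Hypothetical helper: iterative DFS over an adjacency list, returning visit order.
std::vector<int> dfsOrder(const std::vector<std::vector<int>>& g, int s) {
    assert(s >= 0 && static_cast<std::size_t>(s) < g.size());
    std::vector<int> order;
    std::vector<bool> vis(g.size(), false);
    std::stack<int> st;
    st.push(s);
    vis[s] = true;
    order.push_back(s);
    while (!st.empty()) {
        const int top = st.top();
        bool pushed = false;  // declared here, so it is fresh on every iteration
        for (int next : g[top]) {
            if (!vis[next]) {
                vis[next] = true;
                order.push_back(next);
                st.push(next);
                pushed = true;
                break;
            }
        }
        if (!pushed) {
            st.pop();  // no unvisited neighbour left: backtrack
        }
    }
    return order;
}
```

Because pushed cannot leak a stale value from an earlier pass, the backtracking step always runs when it should.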
Chapter 4.8 Block Structure in K&R's The C Programming Language 2.Ed.:
An automatic variable declared and initialized in a
block is initialized each time the block is entered.
I might have missed seeing the relevant description in the book like:
An automatic variable declared and initialized in a
block is allocated only one time before the block is entered.
But a simple test shows that the assumption holds:
#include <stdio.h>
int main(int argc, char *argv[]) {
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
int k;
printf("%p\n", &k);
}
}
return 0;
}
The two snippets below generate the same assembly.
// snippet 1
void test() {
int var;
while(1) var = 4;
}
// snippet 2
void test() {
while(1) int var = 4;
}
output:
test():
push rbp
mov rbp, rsp
.L2:
mov DWORD PTR [rbp-4], 4
jmp .L2
Link: https://godbolt.org/z/36hsM6Pen
So, unless profiling says otherwise or a computationally expensive constructor is involved, keeping the declaration close to its usage should be the default approach.
Can somebody explain why the following code outputs 'Z' 26 times instead of the range from 'A' to 'Z', and how I can output this array correctly? Here is the code:
wchar_t *allDrvs[26];
int count = 0;
for (int n=0; n<26; n++)
{
wchar_t t[] = {L'A' + n, '\0'};
allDrvs[n] = t;
count++;
}
int j;
for(j = 0; j < count; j++)
{
std::wcout << allDrvs[j] << std::endl;
}
The problem (at least one) is:
{
wchar_t t[] = {L'A' + n, '\0'};
allDrvs[n] = t; //allDrvs points to t
count++;
} //t is deallocated here
//allDrvs[n] is a dangling pointer
So, short answer - undefined behavior on the line std::wcout << allDrvs[j].
To get correct output, there's an ugly version involving dynamic allocation and copying between arrays.
Then there's the correct version, using a std::vector<std::wstring>.
Your t[] is on the stack; it only exists for one iteration of the loop at a time, and the next iteration appears to be reusing that space - not a behaviour that's required, but this seems to be what's happening based on your results. If you examine allDrvs[] with a debugger after the first loop completes, you'll probably see all the pointers point to the same memory location.
There's a variety of ways you could solve this. You can allocate a new t on the heap for each loop iteration (and delete them afterwards). You could do wchar_t allDrvs[26][2]; instead of wchar_t *allDrvs[26], and copy the contents of t over each iteration. You could display t right away in the first loop, instead of doing it later. You could use std::vector and std::wstring to manage things for you, instead of using arrays and pointers.
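As a minimal sketch of the second option (the two-dimensional array), the hypothetical function below fills a wchar_t[26][2] so that every entry owns its own storage and nothing points at a loop-local t. The name fillDriveLetters is made up for illustration.

```cpp
#include <cassert>

// Hypothetical helper: each row holds one letter plus its terminator.
void fillDriveLetters(wchar_t (&allDrvs)[26][2]) {
    // Like the original code, this assumes 'A'..'Z' are contiguous.
    assert(L'Z' - L'A' == 25);
    for (int n = 0; n < 26; n++) {
        allDrvs[n][0] = static_cast<wchar_t>(L'A' + n);
        allDrvs[n][1] = L'\0';
    }
}
```

Each row can then be printed with std::wcout << allDrvs[j] exactly as in the original second loop, because every row is a valid nul-terminated wide string of its own.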
Your code has undefined behavior. Your t has automatic storage duration, so as soon as you exit the upper loop, it ceases to exist. Your allDrvs contains 26 pointers to objects that have been destroyed by the time you use them in the second loop.
As it happens, it looks like (under the circumstances you're running it, with the compiler you're using, etc.) what's happening is that it's re-using the same storage space for t at every iteration of the loop, and when you use allDrvs in the second loop, that storage hasn't been overwritten, so you have 26 pointers to the same data.
Since you're using C++ anyway, I'd advise using std::wstring and probably std::vector instead -- for example, something on this general order:
std::vector<std::wstring> allDrvs;
for (wchar_t c = L'A'; c <= L'Z'; c++)
    allDrvs.push_back(std::wstring(1, c)); // a one-character string holding c
Technically, this isn't entirely portable -- it depends on 'A' .. 'Z' being contiguous, which isn't true with all character sets, IBM's EBCDIC being the obvious exception. Even in that case, it'll produce all the right outputs, but it'll also include a few additional items you didn't really want.
Nonetheless, the original depended on 'A'..'Z' being contiguous, and the code looks like it's probably intended for Windows anyway, so that's probably not really a big concern.
I have a somewhat unusual situation: I want to use a goto statement to jump into a loop, not out of it.
There are strong reasons to do so: this code must be part of a function which performs some calculations after the first call, returns with a request for new data, and needs one more call to continue. Function pointers (the obvious solution) can't be used because we need interoperability with code which does not support function pointers.
I want to know whether the code below is safe, i.e. whether it will be compiled correctly by all standard-compliant C/C++ compilers (we need both C and C++).
function foo(int not_a_first_call, int *data_to_request, ...other parameters... )
{
if( not_a_first_call )
goto request_handler;
for(i=0; i<n; i++)
{
*data_to_request = i;
return;
request_handler:
...process data...
}
}
I've studied the standards, but there isn't much information about such a use case. I also wonder whether replacing the for with an equivalent while would be beneficial from a portability point of view.
Thanks in advance.
UPD: Thanks to all who've commented!
to all commenters :) yes, I understand that I can't jump over initializers of local variables and that I have to save/restore i on each call.
about strong reasons :) This code must implement a reverse communication interface. Reverse communication is a coding pattern which tries to avoid using function pointers. Sometimes it has to be used because legacy code expects you to use it.
Unfortunately, r-comm-interface can't be implemented in a nice way. You can't use function pointers and you can't easily split work into several functions.
Seems perfectly legal.
From a draft of the C99 standard http://std.dkuug.dk/JTC1/SC22/WG14/www/docs/n843.htm in the section on the goto statement:
[#3] EXAMPLE 1 It is sometimes convenient to jump into the
middle of a complicated set of statements. The following
outline presents one possible approach to a problem based on
these three assumptions:
1. The general initialization code accesses objects only
visible to the current function.
2. The general initialization code is too large to
warrant duplication.
3. The code to determine the next operation is at the
head of the loop. (To allow it to be reached by
continue statements, for example.)
/* ... */
goto first_time;
for (;;) {
// determine next operation
/* ... */
if (need to reinitialize) {
// reinitialize-only code
/* ... */
first_time:
// general initialization code
/* ... */
continue;
}
// handle other operations
/* ... */
}
Next, we look at the for loop statement:
[#1] Except for the behavior of a continue statement in the
loop body, the statement
for ( clause-1 ; expr-2 ; expr-3 ) statement
and the sequence of statements
{
clause-1 ;
while ( expr-2 ) {
statement
expr-3 ;
}
}
Putting the two together with your problem tells you that you are jumping past
i=0;
into the middle of a while loop. You will execute
...process data...
and then
i++;
before flow of control jumps to the test in the while/for loop
i<n;
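The control flow described above can be sketched as a complete function. This is only an illustration under stated assumptions: the LoopState struct, the reply and n parameters, and the running sum that stands in for "...process data..." are all hypothetical, and the caller is assumed to zero the state block before the first call.

```cpp
#include <cassert>

// Hypothetical state block: everything that must survive between calls.
struct LoopState {
    int i;       // saved loop counter
    long total;  // stand-in result of "...process data..."
};

// Returns 1 if a new request was stored in *data_to_request, 0 when finished.
// 'reply' is the caller's answer to the previous request; it is ignored on
// the first call.
int foo(int not_a_first_call, int *data_to_request, int reply, int n,
        LoopState *st)
{
    int i;  // declared before the goto, so the jump skips no declaration
    assert(data_to_request && st);

    if (not_a_first_call) {
        i = st->i;             // restore the counter saved by the previous call
        goto request_handler;  // jump past "i = 0" into the loop body
    }

    for (i = 0; i < n; i++) {
        *data_to_request = i;
        st->i = i;             // save the counter before returning
        return 1;
request_handler:
        st->total += reply;    // "...process data..."
    }                          // i++ and the i < n test run after the label
    return 0;
}
```

With n equal to 3, the first call requests 0, the next two calls request 1 and 2, and the fourth call processes the last reply and returns 0.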
Yes, that's legal.
What you're doing is nowhere near as ugly as e.g. Duff's Device, which also is standard-compliant.
As @Alexandre says, don't use goto to skip over variable declarations with non-trivial constructors.
I'm sure you're not expecting local variables to be preserved across calls, since automatic variable lifetime is so fundamental. If you need some state to be preserved, functors (function objects) would be a good choice (in C++). C++0x lambda syntax makes them even easier to build. In C you'll have no choice but to store state into some state block passed in by pointer by the caller.
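As a rough sketch of the function-object idea in C++, the loop counter can live in a data member instead of an automatic variable. The class name RequestStepper and its members are hypothetical, and summing the replies is only a stand-in for the real processing.

```cpp
#include <cassert>

// Hypothetical function object: it holds the state that an automatic
// variable would lose between calls.
class RequestStepper {
public:
    explicit RequestStepper(int n) : n_(n), i_(0), started_(false), total_(0) {
        assert(n >= 0);
    }

    // Processes the reply to the previous request (if there was one) and
    // stores the next request in *data_to_request.
    // Returns false once all n requests have been handled.
    bool operator()(int reply, int *data_to_request) {
        if (started_ && i_ >= n_) return false;  // already finished
        if (started_) {
            total_ += reply;   // stand-in for "...process data..."
            ++i_;
        }
        started_ = true;
        if (i_ >= n_) return false;
        *data_to_request = i_;
        return true;
    }

    long total() const { return total_; }

private:
    int n_;         // number of requests to issue
    int i_;         // current request index
    bool started_;  // false until the first call
    long total_;    // accumulated result
};
```

The C equivalent would pass a pointer to a struct holding the same four fields.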
First, I need to say that you should reconsider doing this some other way. I've rarely seen anyone use goto these days except for error handling.
But if you really want to stick with it, there are a few things you'll need to keep in mind:
Jumping from outside the loop to the middle won't make your code loop. (check the comments below for more info)
Be careful not to use variables that are set before the label, for instance by referring to *data_to_request. This includes i, which is set in the for statement and is not initialized when you jump to the label.
Personally, I think in this case I would rather duplicate the code for ...process data... than use goto. And if you pay close attention, you'll notice the return statement inside your for loop, meaning that the code after the label will never get executed unless there's a goto in the code to jump to it.
function foo(int not_a_first_call, int *data_to_request, ...other parameters... )
{
int i = 0;
if( not_a_first_call )
{
...process data...
*data_to_request = i;
return;
}
for (i=0; i<n; i++)
{
*data_to_request = i;
return;
}
}
No, you can't do this. I don't know what this will do exactly, but I do know that as soon as you return, your call stack is unwound and the variable i doesn't exist anymore.
I suggest refactoring. It looks like you're pretty much trying to build an iterator function similar to yield return in C#. Perhaps you could actually write a C++ iterator to do this?
It seems to me that you didn't declare i. Whether what you are doing is legal depends entirely on where it is declared; see below for the initialization.
In C you may declare it before the loop or as the loop variable. But if it is declared as the loop variable, its value will not be initialized when you use it, so this is undefined behavior. And if you declare it before the for, the assignment of 0 to it will not be performed.
In C++ you can't jump across the constructor of the variable, so you must declare it before the goto.
In both languages you have a more important problem: whether the value of i is well defined and, if it is initialized, whether that value makes sense.
Really, if there is any way to avoid this, don't do it. Or, if this is really, really performance critical, check the generated assembly to see whether it really does what you want.
If I understand correctly, you're trying to do something on the order of:
The first time foo is called, it needs to request some data from somewhere else, so it sets up that request and immediately returns;
On each subsequent call to foo, it processes the data from the previous request and sets up a new request;
This continues until foo has processed all the data.
I don't understand why you need the for loop at all in this case; you're only iterating through the loop once per call (if I understand the use case here). Unless i has been declared static, you lose its value each time through.
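For illustration, here is a minimal sketch of the static variant mentioned above. The function name nextRequest is hypothetical. A static counter makes the function non-reentrant and allows only one sequence at a time, which is one reason a caller-supplied state block is usually the better choice.

```cpp
#include <cassert>

// Stores the next request in *data_to_request and returns true, or returns
// false once n requests have been issued.
bool nextRequest(int n, int *data_to_request)
{
    static int i = 0;   // persists across calls, unlike an automatic variable
    assert(data_to_request);

    if (i >= n) {
        i = 0;          // rewind so a later sequence can start over
        return false;
    }
    *data_to_request = i++;
    return true;
}
```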
Why not define a type to maintain all the state (such as the current value of i) between function calls, and then define an interface around it to set/query whatever parameters you need:
typedef ... FooState;
void foo(FooState *state, ...)
{
if (FirstCall(state))
{
SetRequest(state, 1);
}
else if (!Done(state))
{
// process data;
SetRequest(state, GetRequest(state) + 1);
}
}
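One possible way to fill in the outline above is sketched here. The field layout of FooState, the InitState helper, the reply parameter and the running sum used for "process data" are all assumptions made for the example, not part of the original interface.

```cpp
#include <cassert>

// Hypothetical concrete layout for the state block.
struct FooState {
    int started;  // 0 until the first request has been issued
    int request;  // the most recent request
    int limit;    // the last request to issue
    long total;   // stand-in result of "process data"
};

void InitState(FooState *s, int limit)
{
    s->started = 0;
    s->request = 0;
    s->limit = limit;
    s->total = 0;
}

int FirstCall(const FooState *s) { return !s->started; }
int Done(const FooState *s) { return s->started && s->request > s->limit; }
int GetRequest(const FooState *s) { return s->request; }
void SetRequest(FooState *s, int r) { s->request = r; s->started = 1; }

// 'reply' is the data supplied in answer to the previous request.
void foo(FooState *state, int reply)
{
    assert(state);
    if (FirstCall(state))
    {
        SetRequest(state, 1);
    }
    else if (!Done(state))
    {
        state->total += reply;  // process data
        SetRequest(state, GetRequest(state) + 1);
    }
}
```

The caller keeps calling foo until Done(state) is true. In C the struct would need the typedef shown in the outline.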
The initialisation part of the for loop will not occur, which makes it somewhat redundant. You need to initialise i before the goto.
int i = 0 ;
if( not_a_first_call )
goto request_handler;
for( ; i<n; i++)
{
*data_to_request = i;
return;
request_handler:
...process data...
}
However, this is really not a good idea!
The code is flawed in any case: the return statement circumvents the loop. As it stands it is equivalent to:
int i = 0 ;
if( not_a_first_call )
{
    // ...process data...
    i++ ;
}
if( i < n )
{
    *data_to_request = i;
}
In the end, if you think you need to do this then your design is flawed and, judging from the fragment posted, so is your logic.