Would this method of looping be a bad idea?

I'm building an HTML/XML slicer (something that cuts the text into a list of meaningful blocks like elementStart, plainText, etc., which I can then use to construct the elements in an OO manner).
I found it convenient to create a logic sub for processing each of the various modes. The main loop info is held in class variables while the subs do some work and then pass the torch to another one depending on what data they encounter. It occurred to me, though, that this might cause some problems (like overflowing the stack? I don't know), since technically it doesn't exit a single one of the subs until the entire process is complete. Here's what I mean (in pseudocode):
void plainLogic() {
    // look for '<' with something other than space after it
    // add the gathered data to the slice list
    switch (data[it]) // <- the something other
    {
    case '!':
        commentLogic();
        break;
    case '/':
        elEndLogic();
        break;
    default:
        elStartLogic();
        break;
    }
}
and the other *Logic subs are pretty much the same: find the extent of data, bag the goods, go somewhere else. I realized just a little while ago though, that it doesn't actually go anywhere, it just goes deeper and actually has to come back.
Is this bad? Would it be OK if none of the subs had any local variables? Are there preferred alternatives? Maybe there is something like a special return keyword that executes another function after exiting the sub?
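One possible alternative, sketched below under stated assumptions, is to have each logic sub return the mode that should run next and let a single loop do the dispatching, so every sub really exits before the next one starts. The Mode, Slice and Slicer names, the run() driver and the rule that every tag-like block ends at the first '>' are all hypothetical simplifications, not part of the original code:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical modes; they double as the kind of each slice.
enum class Mode { Plain, Comment, ElementStart, ElementEnd, Done };

struct Slice {
    Mode kind;
    std::string text;
};

class Slicer {
public:
    explicit Slicer(std::string input) : data(std::move(input)) {}

    // Driver loop: a handler runs, returns, and names the next mode,
    // so the call depth stays at run() -> one handler.
    std::vector<Slice> run() {
        Mode mode = Mode::Plain;
        while (mode != Mode::Done) {
            mode = (mode == Mode::Plain) ? plainLogic() : tagLogic(mode);
        }
        return slices;
    }

private:
    // Gathers plain text up to the next '<' that is followed by a non-space.
    Mode plainLogic() {
        const std::size_t start = it;
        while (it < data.size() && !startsTag(it)) ++it;
        if (it > start) slices.push_back({Mode::Plain, data.substr(start, it - start)});
        if (it >= data.size()) return Mode::Done;
        switch (data[it + 1]) {  // the "something other" after '<'
        case '!': return Mode::Comment;
        case '/': return Mode::ElementEnd;
        default:  return Mode::ElementStart;
        }
    }

    // Simplifying assumption: every tag-like block ends at the first '>'.
    Mode tagLogic(Mode kind) {
        const std::size_t start = it;
        while (it < data.size() && data[it] != '>') ++it;
        if (it < data.size()) ++it;  // consume the '>'
        slices.push_back({kind, data.substr(start, it - start)});
        return Mode::Plain;
    }

    bool startsTag(std::size_t i) const {
        return data[i] == '<' && i + 1 < data.size() && data[i + 1] != ' ';
    }

    std::string data;
    std::size_t it = 0;
    std::vector<Slice> slices;
};
```

With this shape, stack usage no longer depends on the length of the input, because no handler ever calls another handler.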

Related

Remove/minimize Git merge conflicts

I am trying to think of a way to completely remove or minimize Git merge conflicts for the following scenario:
switch(value)
{
    case OLD_CASE_1:
    case OLD_CASE_2:
    case NEW_CASE_1:
    case NEW_CASE_2:
    case NEW_CASE_3:
    case NEW_CASE_4:
        return true;
    default:
        return false;
}
For each of the new cases, I create a branch where I add just the case specific to that branch (e.g., feature/new-case-x contains just case NEW_CASE_X:).
At the end of the day, I submit 4 pull requests. As soon as any one of them is merged, the others will enter a conflict state. Since the case order doesn't matter to me, is there a way I can minimize or completely get rid of conflicts? Thank you.

One way would be to do an immediate check-in/promotion with a marker label after creating the named test branch.
The creation of the marker label may be as trivial as a comment, or a pair of comments, so would avoid needing a test cycle before promotion. Perhaps use the branch name to "decorate" the comment. By immediately, I do mean it! So long as 2 new test branches are not both created simultaneously in this small window, they now each have a space to actually write their code that should be conflict free.
You could do the same with the test body, if you have issues with people simultaneously appending tests in existing files: Create a comment block denoting the start and end of their test. So long as they write between the lines a conflict is avoided. This does mean they have to choose which test module they want to append to straight away, of course.

The short answer is no. As R Sahu noted in a comment, Git has no idea about the semantics of the text it is merging. It merely treats it as a set of lines. If, while merging two different change-sets, the lines overlap or abut, you get a merge conflict. For instance, given:
case OLD_CASE_1:
case OLD_CASE_2:
+ case NEW_CASE_1:
return true;
vs
case OLD_CASE_1:
case OLD_CASE_2:
+ case NEW_CASE_2:
return true;
Git would in general declare a conflict here. Were this Go code, rather than C++, Git would be "more right" about this being a true conflict, since Go has no automatic fall-through (or equivalently, has the C++ equivalent of a default break; in front of each case), i.e., in Go the semantics would be to return true only for the one new case, and no longer for OLD_CASE_2.
Now, if you're in a group that prefers to rebase pull requests, your automatic rebasing will tend to go smoothly if the NEW_CASE_1 PR is accepted and merged first and you build your second PR on the first one and you rebase each of your remaining PRs on the accepted PRs in order. This is more complicated, and prone to a lot more re-work if your PRs must be accepted in some other order for some reason, but sometimes it's pretty nice. Whether you and/or your group want to do this is a much bigger question, though.

standard alternative for non-standard gnu case ranges

I have a question for a quick workaround to enjoy the benefits of the non-standard gnu case ranges. For example, the non-standard:
case 1 ... 5:
Could be replaced by:
case 1:
case 2:
case 3:
case 4:
case 5:
Probably some macro solution might be in order. From my memory macro loops cannot loop for large numbers of iterations. For this reason, what if the range is "large", say in the thousands?

If you're talking about preprocessor loops, I guess that you're thinking of the preprocessor metaprogramming from Boost. While it's probably quite portable, the loops seem to be limited to 255 "iterations". In fact the implementation is not a real loop; it's more like a hard-coded loop unroll (hence the limitation). You could of course expand this to more iterations.
While the preprocessor trick could be tempting, I think you should consider using an if-else if-else construct. What actually (often) happens in a modern compiler regarding conditionals is that it boils down to the same construct, which should generate the same code (unless you trick the compiler into evaluating the variable multiple times).
You could even combine the constructs, using a switch-case construct for all singular alternatives and then after the default label add an if-else if-else to handle all ranges.
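A minimal sketch of that combination, in standard C++ without the GNU extension. The function name classify, the labels and the concrete ranges are made up for illustration:

```cpp
#include <string>

// Hypothetical classifier: singular values get ordinary case labels,
// ranges are handled by if/else if after the default label.
std::string classify(int code) {
    switch (code) {
    case 0:
        return "zero";
    case 7:
        return "lucky";
    default:
        // Standard replacement for GNU-only labels such as `case 1 ... 5:`.
        if (code >= 1 && code <= 5) {
            return "low";
        } else if (code >= 1000 && code <= 5000) {
            return "high";
        }
        return "other";
    }
}
```

The range size does not matter here: a range of thousands of values costs the same two comparisons as a range of five.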
A third solution would be to write a script that finds the case ranges and replaces them with a standard construct. This should be fairly straightforward in most cases, as case can't appear in many places without being a keyword, and it should then be followed by an expression which can't contain ... in that way. The only problematic situation (that I can think of) would be when the case range is the result of preprocessor expansion.

The best alternative would be to re-factor the code to use if/else. If there truly are thousands of cases it may or may not be very efficient to have a giant case statement in the first place.
However, because cases could "fall through" or use other odd flow control like Duff's device (I hope not), it may not be a completely straightforward conversion.
It is not likely to be a very good implementation to abuse the preprocessor to "loop". See Writing a while loop in the C preprocessor for a sample of what this might look like.
It may be best to write a simple Python or awk script. However, this approach may also be flawed if the keyword case appears somewhere like a string, or if the preprocessor changes any of the labels. This may work very well for a narrow one-off conversion, but without seeing the code in question it is hard to say.
There is a serious problem with either preprocessing approach if the case labels use enumerations. Since enums are still just text strings at the time the preprocessor runs (or an external script), how can it iterate from STATE_10 to STATE_20 without knowing what integers they represent? It can't - the GNU extension really requires compiler support.
If a one-time wholesale replacement of the case statement is too invasive or irregular to manage, you could probably utilize a hybrid approach:
Assuming you have a (notional) example like:
switch(state)
{
case STATE_1:
    xxx; break;
case STATE_2 ... STATE_10:
    yyy; break;
}
Allocate a previously unused range of indexes. Add one new special index for each existing range label. Use if/else logic to detect the ranges first and then replace the range case with a new standard one. This allows the control flow structure to remain essentially unmodified:
#if !defined(__GNUC__)
#define STATE_RANGE_2_10 101
    if(state >= 2 && state <= 10)
        state2 = STATE_RANGE_2_10;
    else if(...)
        state2 = STATE_RANGE_x_y;
    else
        state2 = state;
#else /* GNU */
#define STATE_RANGE_2_10 STATE_2 ... STATE_10
    state2 = state;
#endif
    switch(state2)
    {
    case STATE_1:
        xxx; break;
    case STATE_RANGE_2_10:
        yyy; break;
    }
With some suitable macros this could even be made portable between GNUC and real C if you really wanted GNUC to still use the extension for some reason. Note: I introduced the state2 variable in case it is stored or used outside the local scope. If not, you can skip that.

How is the switch case statement implemented, and how does it work internally?

I read somewhere that the switch statement uses binary search or some sorting technique to choose the correct case, and that this improves its performance compared to an else-if ladder.
Also, does the switch work faster if the cases are given in order?
We discussed this here and decided to post it as a question.

It's actually up to the compiler how a switch statement is realized in code.
However, my understanding is that when it's suitable (that is, relatively dense cases), a jump table is used.
That would mean that something like:
switch(i) {
    case 0: doZero(); break;
    case 1: doOne();
    case 2: doTwo(); break;
    default: doDefault();
}
Would end up getting compiled to something like (horrible pseudo-assembler, but it should be clear, I hope).
         load i into REG
         compare REG to 2
         if greater, jmp to DEFAULT
         compare REG to 0
         if less, jmp to DEFAULT
         jmp to table[REG]
data table
         ZERO
         ONE
         TWO
end data
ZERO:    call doZero
         jmp END
ONE:     call doOne
TWO:     call doTwo
         jmp END
DEFAULT: call doDefault
END:
If a jump table is not suitable, there are other possible implementations that are still, to some extent, "better than a sequence of conditionals".
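The jump-table idea can be imitated by hand in C++ with an array of function pointers. This is only an illustration of the technique, not what any particular compiler emits. The trace string, the entry* wrappers and dispatch are hypothetical names, and the handlers merely record that they ran:

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical handlers standing in for doZero() etc.; they log to `trace`.
std::string trace;
void doZero()    { trace += "0"; }
void doOne()     { trace += "1"; }
void doTwo()     { trace += "2"; }
void doDefault() { trace += "D"; }

// One entry point per case label. Case 1 has no break in the original
// switch, so its entry point also runs the code for case 2.
void entryZero() { doZero(); }
void entryOne()  { doOne(); doTwo(); }
void entryTwo()  { doTwo(); }

void dispatch(int i) {
    static const std::array<void (*)(), 3> table = {entryZero, entryOne, entryTwo};
    if (i < 0 || i > 2) {  // range check first, as in the pseudo-assembler
        doDefault();
        return;
    }
    table[static_cast<std::size_t>(i)]();  // one indexed call, no per-case comparison
}
```

Calling dispatch(1) appends "12" to trace, mirroring the fall-through from ONE into TWO in the pseudo-assembler.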

How switch is implemented depends on what values you have. For values that are close in range, the compiler will generally generate a jump table. If the values are far apart, it will generate a chain of linked branches, using something like a binary search to find the right value.
The order of the switch statements as such doesn't matter, it will do the same thing whether you have the order in ascending, descending or random order - do what makes most sense with regard to what you want to do.
If nothing else, switch is usually a lot easier to read than an if-else sequence.

After some googling I found an interesting link and decided to post it as an answer to my own question.
http://www.codeproject.com/Articles/100473/Something-You-May-Not-Know-About-the-Switch-Statem
Comments are welcome..

It can be implemented in several ways; it depends on how the language designer wants to implement it.
One possible efficient way is to use hash maps:
map every condition (usually an integer) to the corresponding expression to be evaluated, followed by a jump statement.
Other solutions might also work, as a switch often has a finite set of conditions, but a hash map would be an efficient solution.
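As a library-level sketch of the hash-map idea (not a claim about how any compiler implements switch), sparse integer codes can be mapped to handlers with std::unordered_map. The function runCommand, the codes and the returned strings are hypothetical:

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical dispatcher: each key plays the role of a case label,
// each mapped callable plays the role of the case body.
std::string runCommand(int code) {
    static const std::unordered_map<int, std::function<std::string()>> handlers = {
        {1,    [] { return std::string("start"); }},
        {50,   [] { return std::string("pause"); }},
        {9000, [] { return std::string("stop"); }},
    };
    const auto found = handlers.find(code);
    if (found == handlers.end()) {
        return "unknown";  // plays the role of `default:`
    }
    return found->second();
}
```

Unlike a real switch, this form cannot express fall-through between cases, and each dispatch pays for a hash computation and an indirect call.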

Another way to use continue keyword in C++

Recently we found a "good way" to comment out lines of code by using continue:
for(int i=0; i<MAX_NUM; i++){
    ....
    .... //--> about 30 lines of code
    continue;
    ....//--> there is about 30 lines of code after continue
    ....
}
I scratched my head wondering why the previous developer put the continue keyword inside this intensive loop. Most probably he/she felt it was easier to insert a continue than to remove all the unwanted code...
That triggered another question. Consider the scenarios below:
Scenario A:
for(int i=0; i<MAX_NUM; i++){
    ....
    if(bFlag)
        continue;
    ....//--> there is about 100 lines of code after continue
    ....
}
Scenario B:
for(int i=0; i<MAX_NUM; i++){
    ....
    if(!bFlag){
        ....//--> there is about 100 lines of code after continue
        ....
    }
}
Which do you think is the best? Why?
How about break keyword?

Using continue in this case reduces nesting greatly and often makes code more readable.
For example:
for(...) {
    if( condition1 ) {
        Object* pointer = getObject();
        if( pointer != 0 ) {
            ObjectProperty* property = pointer->GetProperty();
            if( property != 0 ) {
                ///blahblahblah...
            }
        }
    }
}
becomes just
for(...) {
    if( !condition1 ) {
        continue;
    }
    Object* pointer = getObject();
    if( pointer == 0 ) {
        continue;
    }
    ObjectProperty* property = pointer->GetProperty();
    if( property == 0 ) {
        continue;
    }
    ///blahblahblah...
}
You see - code becomes linear instead of nested.
You might also find answers to this closely related question helpful.

For your first question, it may be a way of skipping the code without commenting it out or deleting it. I wouldn't recommend doing this. If you don't want your code to be executed, don't precede it with a continue/break/return, as this will raise confusion when you/others are reviewing the code and may be seen as a bug.
As for your second question, they are basically identical performance-wise (depending on the assembly output), so the choice depends largely on design. It depends on the way you want the readers of the code to "translate" it into English, as most do when reading back code.
So, the first example may read "Do blah, blah, blah. If (expression), continue on to the next iteration."
While the second may read "Do blah, blah, blah. If (expression), do blah, blah, blah"
So, using continue instead of an if statement may undermine the importance of the code that follows it.
In my opinion, I would prefer the continue if I could, because it would reduce nesting.

I hate commenting out unused code. What I do instead is
remove it completely and then check in to version control.
Who still needs to comment out unused code after the invention of source code control?

That "comment" use of continue is about as abusive as a goto :-). It's so easy to put an #if 0/#endif or /*...*/, and many editors will then colour-code the commented code so it's immediately obvious that it's not in use. (I sometimes like e.g. #ifdef USE_OLD_VERSION_WITH_LINEAR_SEARCH so I know what's left there, given it's immediately obvious to me that I'd never have such a stupid macro name if I actually expected someone to define it during the compile... guess I'd have to explain that to the team if I shared the code in that state though.) Other answers point out source control systems allow you to simply remove the commented code, and while that's my practice before commit - there's often a "working" stage where you want it around for maximally convenient cross-reference, copy-paste etc..
For scenarios: practically, it doesn't matter which one you use unless your project has a consistent approach that you need to fit in with, so I suggest using whichever seems more readable/expressive in the circumstances. In longer code blocks, a single continue may be less visible and hence less intuitive, while a group of them - or many scattered throughout the loop - are harder to miss. Overly nested code can get ugly too. So choose either if unsure then change it if the alternative starts to look appealing.
They communicate subtly different information to the reader too: continue means "hey, rule out all these circumstances and then look at the code below", whereas the if block means you have to "push" a context but still have them all in your mind as you try to understand the rest of the loop internals (here, only to find the if immediately followed by the loop termination, so all that mental effort was wasted). Countering this, continue statements tend to trigger a mental check to ensure all necessary steps have been completed before the next loop iteration - that it's all just as valid as whatever follows might be, and if someone, say, adds an extra increment or debug statement at the bottom of the loop then they have to know there are continue statements they may also want to handle.
You may even decide which to use based on how trivial the test is, much as some programmers will use early return statements for exceptional error conditions but will use a "result" variable and structured programming for anticipated flows. It can all get messy - programming has to be at least as complex as the problems - your job is to make it minimally messier / more-complex than that.
To be productive, it's important to remember "Don't sweat the small stuff", but in IT it can be a right pain learning what's small :-).
Aside: you may find it useful to do some background reading on the pros/cons of structured programming, which involves single entry/exit points, gotos etc..

I agree with other answerers that the first use of continue is BAD. Unused code should be removed (should you still need it later, you can always find it from your SCM - you do use an SCM, right? :-)
For the second, some answers have emphasized readability, but I miss one important thing: IMO the first move should be to extract that 100 lines of code into one or more separate methods. After that, the loop becomes much shorter and simpler, and the flow of execution becomes obvious. If I can extract the code into a single method, I personally prefer an if:
for(int i=0; i<MAX_NUM; i++){
    ....
    if(!bFlag){
        doIntricateCalculation(...);
    }
}
But a continue would be almost equally fine to me. In fact, if there are multiple continues / returns / breaks within that 100 lines of code, it is impossible to extract it into a single method, so then the refactoring might end up with a series of continues and method calls:
for(int i=0; i<MAX_NUM; i++){
    ....
    if(bFlag){
        continue;
    }
    SomeClass* someObject = doIntricateCalculation(...);
    if(!someObject){
        continue;
    }
    SomeOtherClass* otherObject = doAnotherIntricateCalculation(someObject);
    if(!otherObject){
        continue;
    }
    // blah blah
}

continue is useful in a high-complexity for loop. It's bad practice to use it to comment out the remaining code of a loop, even for temporary debugging, since people tend to forget...

Think on readability first, which is what is going to make your code more maintainable. Using a continue statement is clear to the user: under this condition there is nothing else I can/want to do with this element, forget about it and try the next one. On the other hand, the if is only telling that the next block of code does not apply to those for which the condition is not met, but if the block is big enough, you might not know whether there is actually any further code that will apply to this particular element.
I tend to prefer the continue over the if for this particular reason. It more explicitly states the intent.

Why Switch/Case and not If/Else If?

This question is mainly aimed at C/C++, but I guess other languages are relevant as well.
I can't understand why switch/case is still being used instead of if/else if. It seems to me much like using gotos, and results in the same sort of messy code, while the same results could be achieved with if/else ifs in a much more organized manner.
Still, I see these blocks around quite often. A common place to find them is near a message loop (WndProc...), and these are among the places where they wreak the heaviest havoc: variables are shared along the entire block, even when not appropriate (and can't be initialized inside it). Extra attention has to be paid to not dropping breaks, and so on...
Personally, I avoid using them, and I wonder whether I'm missing something?
Are they more efficient than if/else's?
Are they carried on by tradition?

Summarising my initial post and comments - there are several advantages of switch statement over if/else statement:
Cleaner code. Code with multiple chained if/else if ... looks messy and is difficult to maintain - switch gives cleaner structure.
Performance. For dense case values compiler generates jump table, for sparse - binary search or series of if/else, so in worst case switch is as fast as if/else, but typically faster. Although some compilers can similarly optimise if/else.
Test order doesn't matter. To speed up series of if/else tests one needs to put more likely cases first. With switch/case programmer doesn't need to think about this.
Default can be anywhere. With if/else default case must be at the very end - after last else. In switch - default can be anywhere, wherever programmer finds it more appropriate.
Common code. If you need to execute common code for several cases, you may omit break and the execution will "fall through" - something you cannot achieve with if/else. (It is good practice to place a special comment /* FALLTHROUGH */ in such cases - lint recognises it and doesn't complain; without this comment it does complain, as it is a common error to forget break.)
Thanks to all commenters.
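In C++17 and later, the same intent as the /* FALLTHROUGH */ comment can be stated with the standard [[fallthrough]] attribute, which compilers that warn about implicit fall-through recognise. A minimal sketch, with a hypothetical Level enum and permission bits chosen only for illustration:

```cpp
// Hypothetical permission levels; each level includes the rights below it.
enum class Level { Guest, User, Admin };

int permissions(Level level) {
    int bits = 0;
    switch (level) {
    case Level::Admin:
        bits |= 4;        // admin-only bit
        [[fallthrough]];  // C++17: the missing break is deliberate
    case Level::User:
        bits |= 2;        // shared by Admin and User
        [[fallthrough]];
    case Level::Guest:
        bits |= 1;        // common code for every level
        break;
    }
    return bits;
}
```

Here permissions(Level::Admin) accumulates all three bits, while permissions(Level::Guest) sets only the last one.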

Well, one reason is clarity....
if you have a switch/case, then the expression can't change....
i.e.
switch (foo[bar][baz]) {
    case 'a':
        ...
        break;
    case 'b':
        ...
        break;
}
whereas with if/else, if you write by mistake (or intent):
if (foo[bar][baz] == 'a') {
    ....
}
else if (foo[bar][baz+1] == 'b') {
    ....
}
people reading your code will wonder "were the foo expressions supposed to be the same", or "why are they different"?

Please remember that case/select provides additional flexibility:
the condition is evaluated once
it is flexible enough to build things like Duff's device
fallthrough (aka case without break)
and, historically, it executed much faster (via a jump/lookup table)

Also remember that switch statements allow the flow of control to continue from one case into the next, which lets you nicely combine conditions while still adding extra code for certain conditions, such as in the following piece of code:
switch (dayOfWeek)
{
    case MONDAY:
        garfieldUnhappy = true;
    case TUESDAY:
    case WEDNESDAY:
    case THURSDAY:
    case FRIDAY:
        weekDay = true;
        break;
    case SATURDAY:
        weekendJustStarted = true;
    case SUNDAY:
        weekendDay = true;
        break;
}
Using if/else statements here instead would not be anywhere as nice.
if (dayOfWeek == MONDAY)
{
    garfieldUnhappy = true;
}
if (dayOfWeek == SATURDAY)
{
    weekendJustStarted = true;
}
if (dayOfWeek == MONDAY || dayOfWeek == TUESDAY || dayOfWeek == WEDNESDAY
    || dayOfWeek == THURSDAY || dayOfWeek == FRIDAY)
{
    weekDay = true;
}
else if (dayOfWeek == SATURDAY || dayOfWeek == SUNDAY)
{
    weekendDay = true;
}

If there are lots of cases, the switch statement seems cleaner.
It's also nice when you have multiple values for which you want the same behavior - just using multiple "case" statements that fall through to a single implementation is much easier to read than an if( this || that || someotherthing || ... )

It might also depend on your language. For example, in some languages switch only works with numeric types, so it saves you some typing when you're working with an enumerated value, numeric constants, etc.
if (day == DAYOFWEEK_MONDAY) {
    //...
}
else if (day == DAYOFWEEK_TUESDAY) {
    //...
}
//etc...
Or slightly easier to read...
switch (day) {
    case DAYOFWEEK_MONDAY :
        //...
    case DAYOFWEEK_TUESDAY :
        //...
    //etc...
}

Switch/case is usually optimized more efficiently than if/else if/else, but is occasionally (depending on language and compiler) translated to simple if/else if/else statements.
I personally think switch statements make code more readable than a bunch of if statements, provided that you follow a few simple rules. These are rules you should probably follow even for your if/else if/else situations, but that's again my opinion.
Those rules:
Never, ever, have more than one line on your switch block. Call a method or function and do your work there.
Always check for break/ case fallthrough.
Bubble up exceptions.

Clarity. As I said here, a clue that else if is problematic is the frequency with which ELSE IF is used in a far more constrained way than is allowed by the syntax. It is a sledgehammer of flexibility, permitting entirely unrelated conditions to be tested. But it is routinely used to swat the flies of CASE, comparing the same expression with alternate values...
This reduces the readability of the code. Since the structure permits a universe of conditional complexity, the reader needs to keep more possibilities in mind when parsing ELSE IF than when parsing CASE.

Actually a switch statement implies that you are working off of something that is more or less an enum which gives you an instant clue what's going on.
That said, a switch on an enum in any OO language could probably be coded better--and a series of if/else's on the same "enum" style value would be at least as bad and even worse at conveying meaning.

Addressing the concern that everything inside the switch has the same scope: you can always put your case logic into another { } block, like so:
switch( thing ) {
    case ONETHING: {
        int x; // local to the case!
        ...
    }
    break;
    case ANOTHERTHING: {
        int x; // a different x than the other one
    }
    break;
}
Now, I'm not saying that's pretty. I'm just putting it out there as something that's possible if you absolutely have to isolate something in one case from another.
One other thought on the scope issue: it seems like good practice to put only one switch inside a function, and not a lot else. Under those circumstances, variable scope isn't as much of a concern, since you're generally only dealing with one case of execution on any given invocation of the function.
OK, one last thought on switches: if a function contains more than a couple of switches, it's probably time to refactor your code. If a function contains nested switches, it's probably a clue to rethink your design a bit =)

Switch/case is mainly used when the program has to choose among a set of alternatives. That is not the same as a general conditional statement:
if your program only needs to make such a choice, why use an if/else block, which increases the programming effort and also reduces the execution speed of the program?

Switch statements can be optimized for speed, but can take up more memory if the case values are spread out over large numbers of values.
if/else chains are generally slower, as each value may need to be checked in turn.

A Smalltalker might reject both switch and if-then-else's and might write something like:-
shortToLongDaysMap := Dictionary new.
shortToLongDaysMap
    at: 'Mon' put: 'Monday';
    at: 'Tue' put: 'Tuesday';
    at: 'Wed' put: 'Wednesday'
    etc etc.
longForm := shortToLongDaysMap at: shortForm ifAbsent: [shortForm]
This is a trivial example but I hope you can see how this technique scales for large numbers of cases.
Note that the second argument to at:ifAbsent: is similar to the default clause of a case statement.
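For readers coming from the C/C++ side of the question, roughly the same lookup-table technique can be sketched with std::unordered_map. The function name longDayName is hypothetical, and the table holds only the three entries shown in the Smalltalk example:

```cpp
#include <string>
#include <unordered_map>

// Looks up the long form of a short day name. An unknown short form is
// returned unchanged, mirroring the ifAbsent: block (the "default" case).
std::string longDayName(const std::string& shortForm) {
    static const std::unordered_map<std::string, std::string> shortToLongDays = {
        {"Mon", "Monday"},
        {"Tue", "Tuesday"},
        {"Wed", "Wednesday"},
    };
    const auto found = shortToLongDays.find(shortForm);
    return found != shortToLongDays.end() ? found->second : shortForm;
}
```

Adding a case means adding one table entry, and the lookup code itself does not change.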

The main reason behind this is maintainability and readability. It's easier to make code readable and maintainable with a switch/case statement than with if/else, because when you have many if/else branches the code becomes messy and nested, and it's very hard to maintain.
And to some extent, execution time is another reason.

Pretty sure they compile to the same things as if/else if, but I find the switch/case easier to read when there are more than 2 or 3 elses.
