C++ Writing an Interpreter - determining the loop's target for a break statement

I am writing a simple program interpreter in C++. When I am building the internal representation of the program and I encounter a break statement, how do I determine the enclosing loop's target location?
void Imp::whilestmt()
{
    Expr *pExpr;
    accept(Token::WHILE);
    expr(pExpr);
    WhileStmt *pwhilestmt = new WhileStmt(pExpr, vm.getLocationCounter());
    vm.add(pwhilestmt);
    accept(Token::LOOP);
    stmtlist();
    // The target can only be set after the loop body has been parsed.
    pwhilestmt->setTarget(vm.getLocationCounter());
    accept(Token::END);
    accept(Token::LOOP);
    vm.add(new EndLoopStmt);
}
My break statement object is going to take the while statement's target as a parameter. How can I determine this?

I'd consider building a kind of execution tree/pipeline. Every LOOP/WHILE would be a new branch (similarly to every function), so when you encounter an END/BREAK instruction you just revert to the branch's origin point and continue down the line.

I think the solution is to add a forward reference that is resolved (by looking up the location of the end of the loop) when all the code for that level of loop has been produced.
In other words, when generating the code for the loop, you need to form a "jump" instruction whose target is somewhere you don't know yet. The solution is to emit a jump with an unknown destination: set the "destination" to instruction 0 or -1 or 0xdeaddead or something else that can be easily identified for debugging purposes later. The best way to avoid "I didn't fix it up properly" bugs is to make those places easy to identify (bugs only occur in things that are hard to identify, just like it never rains when you carry an umbrella). Keep a fixup list of such jumps until you have generated the entire loop, then work your way through that fixup list and fill in the relevant address, which you now know is "here" (the next instruction after the loop). I suspect you also need something similar for the condition of the loop itself: if that's false, then you need to continue "after" the loop.
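The fixup list described above could be sketched roughly as follows. This is only an illustration under assumptions: Op, Instr, CodeGen, kUnresolved and buildExample are hypothetical names, not the asker's Imp/vm classes, and the instruction set is reduced to the bare minimum. One fixup list is kept per open loop, so nested loops patch only their own breaks.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, minimal instruction set for illustration only.
enum class Op { Nop, JumpIfFalse, Jump };

constexpr int kUnresolved = -1;  // placeholder target, easy to spot when debugging

struct Instr {
    Op op;
    int target;  // jump destination; kUnresolved until patched
};

class CodeGen {
public:
    // Index of the next instruction to be emitted ("here").
    int here() const { return static_cast<int>(code_.size()); }

    // Appends an instruction and returns its index.
    int emit(Op op, int target = kUnresolved) {
        code_.push_back({op, target});
        return here() - 1;
    }

    // Start of a while loop: emit the conditional exit with an unknown
    // target and open a fixup list for this loop. Returns the loop start.
    int beginLoop() {
        int start = here();
        fixups_.push_back(std::vector<int>{emit(Op::JumpIfFalse)});
        return start;
    }

    // A break is a jump whose target is not known yet.
    void emitBreak() {
        assert(!fixups_.empty() && "break outside of a loop");
        fixups_.back().push_back(emit(Op::Jump));
    }

    // End of the loop: jump back to the start, then patch every pending
    // exit of this loop to point at the instruction after the loop.
    void endLoop(int start) {
        assert(!fixups_.empty());
        emit(Op::Jump, start);
        for (int index : fixups_.back()) code_[index].target = here();
        fixups_.pop_back();
    }

    const std::vector<Instr>& code() const { return code_; }

private:
    std::vector<Instr> code_;
    std::vector<std::vector<int>> fixups_;  // one list per open loop
};

// Generates code for: while (c) { stmt; break; stmt; } stmt;
inline CodeGen buildExample() {
    CodeGen gen;
    int start = gen.beginLoop();  // 0: JumpIfFalse, target unknown
    gen.emit(Op::Nop);            // 1: body statement
    gen.emitBreak();              // 2: Jump, target unknown
    gen.emit(Op::Nop);            // 3: body statement
    gen.endLoop(start);           // 4: Jump 0; patches 0 and 2 to 5
    gen.emit(Op::Nop);            // 5: first statement after the loop
    return gen;
}
```

In a parser like the one in the question, beginLoop() would be called from the while handler, emitBreak() from the break handler, and endLoop() once END LOOP has been accepted.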

I added setTarget as a virtual function of Stmt.
I stored the start location in the part that handles the if statements and then checked if I had any break stmts from the start location to the current location, and if I did I set the target to the current location.
It's a really messy way to do it, but it works for now.

Related

How to use [[(un)likely]] at do while loop in C++20?

do [[unlikely]]
{...}
while(a == 0);
This code compiles.
But is this the correct way to tell the compiler that a is usually non-zero?
Structurally, this is a correct way to say what you're trying to say. The attribute is placed in a location that tags the path of execution that is likely/unlikely to be executed. Applying it to the block statement of the do/while loop works adequately. It would also work within the block.
That having been said, it's unclear what good this would do practically. It might prevent some unrolling of the loop or inhibit prefetching. But it can't really change the structure of the compiled code, since the block has to be executed at least once and the conditional branch has to come after the block.
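For comparison, here is a small sketch of the two placements side by side: the attribute on the body of a do/while, as in the question, and the more common placement on an if branch. The function names are made up for illustration, and whether either hint changes the generated code depends on the compiler.

```cpp
#include <cassert>

// Repeats the body until `a` becomes non-zero and returns the iteration count.
// The attribute marks the loop body as an unlikely path, as in the question.
inline int spinUntilNonZero(int a, int step) {
    int iterations = 0;
    do [[unlikely]] {
        a += step;
        ++iterations;
    } while (a == 0);
    return iterations;
}

// The more common placement: hint that the zero-divisor branch is rare.
inline int divideOrZero(int num, int den) {
    if (den == 0) [[unlikely]] {
        return 0;
    }
    return num / den;
}
```

Both forms only annotate a path; they do not change what the code computes.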

How do I know when a variable is accessed within my code?

I'm using VS2008 to write a program. There's one specific line in my code that causes a numerical error. It is:
Qp[j] = (Cp - Cm)/(Bp + Bm);
Qp is a std::vector. When I comment this line out, the numerical error disappears. I am going through my code line by line to find all the places that access Qp[j]. I was wondering if there was a feature in VS2008 or a linux program that wraps around the executable that can identify every line of code that reads from that section of memory (the specific element in the vector)?
I tried searching online but the keywords I used brought up results relating to global variables.
--- EDIT
Hi all. To those who have responded, thank you. Just to clarify my question:
Imagine I have a vector with 5 elements. I'd like to know all the places in my code that use the value stored in element 3 at any point in time during execution. Is there an easy way to do this?
I am not sure if I understand you correctly, but if you comment out that line and the code works, then maybe the problem is that line itself, and you don't need to check other lines.
Maybe in your case you end up in a situation where Bp + Bm = 0 (division by zero).
Qp may have fewer than j + 1 elements; check the size of Qp.
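The two checks suggested above (a zero denominator and an out-of-range index) could be made explicit with a small wrapper around the assignment. This is a sketch only: assignQp is a hypothetical helper, and the variables are assumed to be doubles, which the question does not state.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper wrapping the assignment from the question.
inline void assignQp(std::vector<double>& Qp, std::size_t j,
                     double Cp, double Cm, double Bp, double Bm) {
    const double denom = Bp + Bm;
    if (denom == 0.0) throw std::domain_error("Bp + Bm is zero");

    const double value = (Cp - Cm) / denom;
    if (!std::isfinite(value)) throw std::range_error("Qp[j] is not finite");

    Qp.at(j) = value;  // at() throws std::out_of_range if j >= Qp.size()
}
```

Running the program with the assignment routed through such a helper turns a silent numerical error into an exception at the point where it first occurs.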

How can I avoid using the stack with continuation-passing style?

For my diploma thesis I chose to implement the task of the ICFP 2004 contest.
The task--as I translated it to myself--is to write a compiler which translates a high-level ant-language into a low-level ant-assembly. In my case this means using a DSL written in Clojure (a Lisp dialect) as the high-level ant-language to produce ant-assembly.
UPDATE:
The ant-assembly has several restrictions: there are no assembly-instructions for calling functions (that is, I can't write CALL function1, param1), nor returning from functions, nor pushing return addresses onto a stack. Also, there is no stack at all (for passing parameters), nor any heap, or any kind of memory. The only thing I have is a GOTO/JUMP instruction.
Actually, the ant-assembly is used to describe the transitions of a state machine (=the ants' "brain"). For "function calls" (=state transitions) all I have is a JUMP/GOTO.
While not having anything like a stack, heap or a proper CALL instruction, I still would like to be able to call functions in the ant-assembly (by JUMPing to certain labels).
In several places I read that by transforming my Clojure DSL function calls into continuation-passing style (CPS) I can avoid using the stack[1], and I can translate my ant-assembly function calls into plain JUMPs (or GOTOs). This is exactly what I need, because in the ant-assembly I have no stack at all, only a GOTO instruction.
My problem is that after an ant-assembly function has finished, I have no way to tell the interpreter (which interprets the ant-assembly instructions) where to continue. Maybe an example helps:
The high-level Clojure DSL:
(defn search-for-food [cont]
(sense-food-here? ; a conditional w/ 2 branches
(pickup-food ; true branch, food was found
(go-home ; ***
(drop-food
(search-for-food cont))))
(move ; false branch, continue searching
(search-for-food cont))))
(defn run-away-from-enemy [cont]
(sense-enemy-here? ; a conditional w/ 2 branches
(go-home ; ***
(call-help-from-others cont))
(search-for-food cont)))
(defn go-home [cont]
(turn-backwards
; don't bother that this "while" is not in CPS now
(while (not (sense-home-here?))
(move)))
(cont))
The ant-assembly I'd like to produce from the go-home function is:
FUNCTION-GO-HOME:
turn left nextline
turn left nextline
turn left nextline ; now we turned backwards
SENSE-HOME:
sense here home WE-ARE-AT-HOME CONTINUE-MOVING
CONTINUE-MOVING:
move SENSE-HOME
WE-ARE-AT-HOME:
JUMP ???
FUNCTION-DROP-FOOD:
...
FUNCTION-CALL-HELP-FROM-OTHERS:
...
The syntax for the ant-asm instructions above:
turn direction which-line-to-jump
sense direction what jump-if-true jump-if-false
move which-line-to-jump
My problem is that I fail to find out what to write to the last line in the assembly (JUMP ???). Because--as you can see in the example--go-home can be invoked with two different continuations:
(go-home
(drop-food))
and
(go-home
(call-help-from-others))
After go-home has finished I'd like to call either drop-food or call-help-from-others. In assembly: after I arrived at home (=the WE-ARE-AT-HOME label) I'd like to jump either to the label FUNCTION-DROP-FOOD or to the FUNCTION-CALL-HELP-FROM-OTHERS.
How could I do that without a stack, without PUSHing the address of the next instruction (=FUNCTION-DROP-FOOD / FUNCTION-CALL-HELP-FROM-OTHERS) to the stack? My problem is that I don't understand how continuation-passing style (=no stack, only a GOTO/JUMP) could help me solve this problem.
(I can try to explain this again if the things above are incomprehensible.)
And huge thanks in advance for your help!
--
[1] "interpreting it requires no control stack or other unbounded temporary storage". Steele: Rabbit: a compiler for Scheme.
Yes, you've provided the precise motivation for continuation-passing style.
It looks like you've partially translated your code into continuation-passing-style, but not completely.
I would advise you to take a look at PLAI, but I can show you a bit of how your function would be transformed, assuming I can guess at Clojure syntax and mix in Scheme's lambda.
(defn search-for-food [cont]
(sense-food-here? ; a conditional w/ 2 branches
(search-for-food
(lambda (r)
(drop-food r
(lambda (s)
(go-home s cont)))))
(search-for-food
(lambda (r)
(move r cont)))))
I'm a bit confused by the fact that you're searching for food whether or not you sense food here, and I find myself suspicious that either this is weird half-translated code, or just doesn't mean exactly what you think it means.
Hope this helps!
And really: go take a look at PLAI. The CPS transform is covered in good detail there, though there's a bunch of stuff for you to read first.
Your ant assembly language is not even Turing-complete. You said it has no memory, so how are you supposed to allocate the environments for your function calls? You can at most get it to accept regular languages and simulate finite automata: anything more complex requires memory. To be Turing-complete you'll need what amounts to a garbage-collected heap. To do everything you need to do to evaluate CPS terms you'll also need an indirect GOTO primitive. Function calls in CPS are basically (possibly indirect) GOTOs that provide parameter passing, and the parameters you pass require memory.
Clearly, your two basic options are to inline everything, with no "external" procedures (for extra credit look up the original meaning of "internal" and "external" here), or somehow "remember" where you need to go on "return" from a procedure "call" (where the return point does not necessarily need to fall in the physical locations immediately following the "calling" point). Basically, the return point identifier can be a code address, an index into a branch table, or even a character symbol -- it just needs to identify the return target relative to the called procedure.
The most obvious here would be to track, in your compiler, all of the return targets for a given call target, then, at the end of the called procedure, build a branch table (or branch ladder) to select from one of the several possible return targets. (In most cases there are only a handful of possible return targets, though for commonly used procedures there could be hundreds or thousands.) Then, at the call point, the caller needs to load a parameter with the index of its return point relative to the called procedure.
Obviously, if the callee in turn calls another procedure, the first return point identifier must be preserved somehow.
Continuation passing is, after all, just a more generalized form of a return address.
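The branch-table idea from this answer might look like the following when simulated in C++. It is a sketch under assumptions: ReturnTo and goHomeThen are hypothetical names, the ant actions are reduced to strings in a trace, and the caller is assumed to have somewhere to store the return-point index, which the ant assembly in the question may not offer. Without such storage, the other option named above (inlining one copy of the callee per return target) remains.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical return-point identifiers, one per known caller of go-home.
enum class ReturnTo { DropFood, CallHelp };

// Simulates "go-home" followed by a branch ladder that selects the
// continuation the caller asked for. Returns the sequence of actions taken.
inline std::vector<std::string> goHomeThen(ReturnTo returnPoint) {
    std::vector<std::string> trace;
    trace.push_back("turn-backwards");
    trace.push_back("move-until-home");

    // Branch ladder at the end of the callee: one rung per return target.
    switch (returnPoint) {
        case ReturnTo::DropFood:
            trace.push_back("drop-food");
            break;
        case ReturnTo::CallHelp:
            trace.push_back("call-help-from-others");
            break;
    }
    return trace;
}
```

The compiler would generate the ladder by collecting every call site of go-home, which corresponds to the "track all of the return targets for a given call target" step.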
You might be interested in Andrew Appel's book Compiling with Continuations.

Another way to use continue keyword in C++

Recently we found a "good way" to comment out lines of code by using continue:
for(int i=0; i<MAX_NUM; i++){
....
.... //--> about 30 lines of code
continue;
....//--> there is about 30 lines of code after continue
....
}
I scratched my head wondering why the previous developer put the continue keyword inside this intensive loop. Most probably he or she felt it was easier to insert a "continue" than to remove all the unwanted code...
This raised another question for me. Consider the scenarios below:
Scenario A:
for(int i=0; i<MAX_NUM; i++){
....
if(bFlag)
continue;
....//--> there is about 100 lines of code after continue
....
}
Scenario B:
for(int i=0; i<MAX_NUM; i++){
....
if(!bFlag){
....//--> there is about 100 lines of code after continue
....
}
}
Which do you think is the best? Why?
How about break keyword?
Using continue in this case reduces nesting greatly and often makes code more readable.
For example:
for(...) {
    if( condition1 ) {
        Object* pointer = getObject();
        if( pointer != 0 ) {
            ObjectProperty* property = pointer->GetProperty();
            if( property != 0 ) {
                ///blahblahblah...
            }
        }
    }
}
becomes just
for(...) {
    if( !condition1 ) {
        continue;
    }
    Object* pointer = getObject();
    if( pointer == 0 ) {
        continue;
    }
    ObjectProperty* property = pointer->GetProperty();
    if( property == 0 ) {
        continue;
    }
    ///blahblahblah...
}
You see - code becomes linear instead of nested.
You might also find answers to this closely related question helpful.
For your first question, it may be a way of skipping the code without commenting it out or deleting it. I wouldn't recommend doing this. If you don't want your code to be executed, don't precede it with a continue/break/return, as this will raise confusion when you/others are reviewing the code and may be seen as a bug.
As for your second question, they are basically identical performance-wise (depending on the assembly output), so the choice comes down to design. It depends on the way you want the readers of the code to "translate" it into English, as most do when reading back code.
So, the first example may read "Do blah, blah, blah. If (expression), continue on to the next iteration."
While the second may read "Do blah, blah, blah. If (expression), do blah, blah, blah"
So, using continue instead of an if statement may understate the importance of the code that follows it.
In my opinion, I would prefer the continue if I could, because it would reduce nesting.
I hate commenting out unused code. What I do instead is
remove it completely and then check the change into version control.
Who still needs to comment out unused code after the invention of source code control?
That "comment" use of continue is about as abusive as a goto :-). It's so easy to put an #if 0/#endif or /*...*/, and many editors will then colour-code the commented code so it's immediately obvious that it's not in use. (I sometimes like e.g. #ifdef USE_OLD_VERSION_WITH_LINEAR_SEARCH so I know what's left there, given it's immediately obvious to me that I'd never have such a stupid macro name if I actually expected someone to define it during the compile... guess I'd have to explain that to the team if I shared the code in that state though.) Other answers point out source control systems allow you to simply remove the commented code, and while that's my practice before commit - there's often a "working" stage where you want it around for maximally convenient cross-reference, copy-paste etc..
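As a small illustration of the #if 0/#endif approach mentioned above, the disabled block stays in the file but is never compiled. The function and the disabled clamping logic are invented for the example.

```cpp
#include <cassert>
#include <vector>

// Sums the positive values; the old clamping step is disabled, not deleted.
inline int sumPositive(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        if (v <= 0) continue;  // a genuine use of continue: skip this element
        total += v;
#if 0   // disabled on purpose: old clamping logic kept for cross-reference
        if (total > 100) total = 100;
#endif
    }
    return total;
}
```

Unlike a stray continue, the preprocessor block cannot be mistaken for live control flow.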
For scenarios: practically, it doesn't matter which one you use unless your project has a consistent approach that you need to fit in with, so I suggest using whichever seems more readable/expressive in the circumstances. In longer code blocks, a single continue may be less visible and hence less intuitive, while a group of them - or many scattered throughout the loop - are harder to miss. Overly nested code can get ugly too. So choose either if unsure then change it if the alternative starts to look appealing.
They communicate subtly different information to the reader too: continue means "hey, rule out all these circumstances and then look at the code below", whereas the if block means you have to "push" a context but still keep it all in mind as you try to understand the rest of the loop internals (here, only to find the if immediately followed by the loop termination, so all that mental effort was wasted). Countering this, continue statements tend to trigger a mental check to ensure all necessary steps have been completed before the next loop iteration - that it's all just as valid as whatever follows might be - and if someone, say, adds an extra increment or debug statement at the bottom of the loop, then they have to know there are continue statements they may also want to handle.
You may even decide which to use based on how trivial the test is, much as some programmers will use early return statements for exceptional error conditions but will use a "result" variable and structured programming for anticipated flows. It can all get messy - programming has to be at least as complex as the problems - your job is to make it minimally messier / more-complex than that.
To be productive, it's important to remember "Don't sweat the small stuff", but in IT it can be a right pain learning what's small :-).
Aside: you may find it useful to do some background reading on the pros/cons of structured programming, which involves single entry/exit points, gotos etc..
I agree with other answerers that the first use of continue is BAD. Unused code should be removed (should you still need it later, you can always find it from your SCM - you do use an SCM, right? :-)
For the second, some answers have emphasized readability, but I miss one important thing: IMO the first move should be to extract that 100 lines of code into one or more separate methods. After that, the loop becomes much shorter and simpler, and the flow of execution becomes obvious. If I can extract the code into a single method, I personally prefer an if:
for(int i=0; i<MAX_NUM; i++){
....
if(!bFlag){
doIntricateCalculation(...);
}
}
But a continue would be almost equally fine to me. In fact, if there are multiple continues / returns / breaks within that 100 lines of code, it is impossible to extract it into a single method, so then the refactoring might end up with a series of continues and method calls:
for(int i=0; i<MAX_NUM; i++){
....
if(bFlag){
continue;
}
SomeClass* someObject = doIntricateCalculation(...);
if(!someObject){
continue;
}
SomeOtherClass* otherObject = doAnotherIntricateCalculation(someObject);
if(!otherObject){
continue;
}
// blah blah
}
continue is useful in a high-complexity for loop. It's bad practice to use it to comment out the remaining code of a loop, even for temporary debugging, since people tend to forget...
Think on readability first, which is what is going to make your code more maintainable. Using a continue statement is clear to the user: under this condition there is nothing else I can/want to do with this element, forget about it and try the next one. On the other hand, the if is only telling that the next block of code does not apply to those for which the condition is not met, but if the block is big enough, you might not know whether there is actually any further code that will apply to this particular element.
I tend to prefer the continue over the if for this particular reason. It more explicitly states the intent.

When should I use do-while instead of while loops? [duplicate]

When I was taking CS in college (mid 80's), one of the ideas that was constantly repeated was to always write loops which test at the top (while...) rather than at the bottom (do ... while) of the loop. These notions were often backed up with references to studies which showed that loops which tested at the top were statistically much more likely to be correct than their bottom-testing counterparts.
As a result, I almost always write loops which test at the top. I don't do it if it introduces extra complexity in the code, but that case seems rare. I notice that some programmers tend to almost exclusively write loops that test at the bottom. When I see constructs like:
if (condition)
{
do
{
...
} while (same condition);
}
or the inverse (if inside the while), it makes me wonder if they actually wrote it that way or if they added the if statement when they realized the loop didn't handle the null case.
I've done some googling, but haven't been able to find any literature on this subject. How do you guys (and gals) write your loops?
I always follow the rule that if it should run zero or more times, test at the beginning, if it must run once or more, test at the end. I do not see any logical reason to use the code you listed in your example. It only adds complexity.
Use while loops when you want to test a condition before the first iteration of the loop.
Use do-while loops when you want to test a condition after running the first iteration of the loop.
For example, if you find yourself doing something like either of these snippets:
func();
while (condition) {
func();
}
//or:
while (true){
func();
if (!condition) break;
}
You should rewrite it as:
do{
func();
} while(condition);
The difference is that the do loop executes "do something" once and then checks the condition to see if it should repeat the "do something", whereas the while loop checks the condition before doing anything.
Does avoiding do/while really help make my code more readable?
No.
If it makes more sense to use a do/while loop, then do so. If you need to execute the body of a loop once before testing the condition, then a do/while loop is probably the most straightforward implementation.
The first one may not execute at all if the condition is false. The other one will execute at least once, then check the condition.
For the sake of readability it seems sensible to test at the top. The fact it is a loop is important; the person reading the code should be aware of the loop conditions before trying to comprehend the body of the loop.
Here's a good real-world example I came across recently. Suppose you have a number of processing tasks (like processing elements in an array) and you wish to split the work between one thread per CPU core present. There must be at least one core to be running the current code! So you can use a do... while something like:
do {
get_tasks_for_core();
launch_thread();
} while (cores_remaining());
It's almost negligible, but it might be worth considering the performance benefit: it could equally be written as a standard while loop, but that would always make an unnecessary initial comparison that would always evaluate true - and on single-core, the do-while condition branches more predictably (always false, versus alternating true/false for a standard while).
Yes, it's true: a do-while will run at least once.
That's the only difference. There is nothing else to debate here.
The first tests the condition before performing so it's possible your code won't ever enter the code underneath. The second will perform the code within before testing the condition.
The while loop will check "condition" first; if it's false, it will never "do something." But the do...while loop will "do something" first, then check "condition".
Yes, just like using for instead of while, or foreach instead of for improves readability. That said some circumstances need do while and I agree you would be silly to force those situations into a while loop.
It's more helpful to think in terms of common usage. The vast majority of loops work quite naturally with while, even if they could be made to work with do...while, so basically you should use while when the difference doesn't matter. I would thus use do...while for the rare scenarios where it provides a noticeable improvement in readability.
The use cases are different for the two. This isn't a "best practices" question.
If you want a loop to execute based on the condition exclusively, then use
for or while
If you want to do something once regardless of the condition and then continue doing it based on the condition evaluation, then use
do..while
For anyone who can't think of a reason to have a one-or-more times loop:
try {
someOperation();
} catch (Exception e) {
do {
if (e instanceof ExceptionIHandleInAWierdWay) {
HandleWierdException((ExceptionIHandleInAWierdWay)e);
}
} while ((e = e.getInnerException())!= null);
}
The same could be used for any sort of hierarchical structure.
in class Node:
public Node findSelfOrParentWithText(string text) {
Node node = this;
do {
if(node.containsText(text)) {
break;
}
} while((node = node.getParent()) != null);
return node;
}
A while() checks the condition before each execution of the loop body and a do...while() checks the condition after each execution of the loop body.
Thus, do...while() loops will always execute the loop body at least once.
Functionally, a while() is equivalent to
startOfLoop:
if (!condition)
goto endOfLoop;
//loop body goes here
goto startOfLoop;
endOfLoop:
and a do...while() is equivalent to
startOfLoop:
//loop body
//goes here
if (condition)
goto startOfLoop;
Note that the actual implementation is probably more efficient than this. However, a do...while() does perform one fewer comparison than a while() running the same number of iterations, so it may be marginally faster. Use a do...while() if:
you know that the condition will always be true the first time around, or
you want the loop to execute once even if the condition is false to begin with.
Here is the translation:
do { y; } while(x);
Same as
{ y; } while(x) { y; }
Note the extra set of braces are for the case you have variable definitions in y. The scope of those must be kept local like in the do-loop case. So, a do-while loop just executes its body at least once. Apart from that, the two loops are identical. So if we apply this rule to your code
do {
// do something
} while (condition is true);
The corresponding while loop for your do-loop looks like
{
// do something
}
while (condition is true) {
// do something
}
Yes, you see the corresponding while for your do loop differs from your while :)
As noted by Piemasons, the difference is whether the loop executes once before doing the test, or if the test is done first so that the body of the loop might never execute.
The key question is which makes sense for your application.
To take two simple examples:
Say you're looping through the elements of an array. If the array has no elements, you don't want to process number one of zero. So you should use WHILE.
You want to display a message, accept a response, and if the response is invalid, ask again until you get a valid response. So you always want to ask once. You can't test if the response is valid until you get a response, so you have to go through the body of the loop once before you can test the condition. You should use DO/WHILE.
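The second example (ask, read, and repeat until the response is valid) might be written like this. It is a sketch: askYesNo is a hypothetical helper, the accepted replies "y" and "n" are an assumption, and the streams are passed in so the loop can be exercised without a console.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Keeps asking until the reply is "y" or "n".
// Returns false if the input ends before a valid reply arrives.
inline bool askYesNo(std::istream& in, std::ostream& out, bool& answer) {
    std::string reply;
    do {
        out << "Continue? (y/n) ";                   // always ask at least once
        if (!std::getline(in, reply)) return false;  // end of input
    } while (reply != "y" && reply != "n");          // test only after reading
    answer = (reply == "y");
    return true;
}
```

With std::cin and std::cout as arguments this becomes the interactive prompt described above; the condition cannot be evaluated before the first pass because reply has no meaningful value yet.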
I tend to prefer do-while loops, myself. If the condition will always be true at the start of the loop, I prefer to test it at the end. To my eye, the whole point of testing conditions (other than assertions) is that one doesn't know the result of the test. If I see a while loop with the condition test at the top, my inclination is to consider the case that the loop executes zero times. If that can never happen, why not code in a way that clearly shows that?
They are actually meant for different things. In C, you can use the do - while construct to achieve both scenarios (runs at least once and runs while true). But Pascal has repeat - until and while for each scenario, and if I remember correctly, Ada has another construct that lets you quit in the middle, but of course that's not what you're asking.
My answer to your question : I like my loop with testing on top.
Both conventions are correct if you know how to write the code correctly :)
Usually the second convention ( do {} while() ) is used to avoid having a duplicated statement outside the loop. Consider the following (oversimplified) example:
a++;
while (a < n) {
a++;
}
can be written more concisely using
do {
a++;
} while (a < n);
Of course, this particular example can be written in an even more concise way as (assuming C syntax)
while (++a < n) {}
But I think you can see the point here.
while( someConditionMayBeFalse ){
// this will never run if the condition is false
}
// then the alternative
do{
// this will run once even if the condition is false
} while( someConditionMayBeFalse );
The difference is obvious and allows you to have code run and then evaluate the result to see if you have to "Do it again" and the other method of while allows you to have a block of script ignored if the conditional is not met.
I write mine pretty much exclusively testing at the top. It's less code, so for me at least, it's less potential to screw something up (e.g., copy-pasting the condition makes two places you always have to update it)
It really depends: there are situations when you want to test at the top, others when you want to test at the bottom, and still others when you want to test in the middle.
However the example given seems absurd. If you are going to test at the top, don't use an if statement and test at the bottom, just use a while statement, that's what it is made for.
You should first think of the test as part of the loop code. If the test logically belongs at the start of the loop processing, then it's a top-of-the-loop test. If the test logically belongs at the end of the loop (i.e. it decides if the loop should continue to run), then it's probably a bottom-of-the-loop test.
You will have to do something fancy if the test logically belongs in the middle. :-)
I guess some people test at the bottom because you could save one or a few machine cycles by doing that 30 years ago.
To write code that is correct, one basically needs to perform a mental, perhaps informal proof of correctness.
To prove a loop correct, the standard way is to choose a loop invariant and give an induction proof. But skip the complicated words: what you do, informally, is figure out something that is true of each iteration of the loop, and that when the loop is done, what you wanted accomplished is now true. The loop condition, not the invariant, is false at the end, which is what makes the loop terminate.
If the loop conditions map fairly easily to the invariant, and the invariant is at the top of the loop, and one infers that the invariant is true at the next iteration of the loop by working through the code of the loop, then it is easy to figure out that the loop is correct.
However, if the invariant is at the bottom of the loop, then unless you have an assertion just prior to the loop (a good practice), it becomes more difficult, because you essentially have to infer what that invariant should be, and that any code that ran before the loop makes the loop invariant true (since there is no loop precondition, code will execute in the loop). It just becomes that much more difficult to prove correct, even if it is an informal in-your-head proof.
This isn't really an answer but a reiteration of something one of my lecturers said and it interested me at the time.
The two types of loop while..do and do..while are actually instances of a third more generic loop, which has the test somewhere in the middle.
begin loop
<Code block A>
loop condition
<Code block B>
end loop
Code block A is executed at least once and B is executed zero or more times, but B isn't run on the very last (failing) iteration. A while loop is the case where code block A is empty, and a do..while is the case where code block B is empty. But if you're writing a compiler, you might be interested in generalizing both cases to a loop like this.
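C++ has no dedicated syntax for a test in the middle, so the general form is usually spelled with while (true) and a break. A minimal sketch, assuming a hypothetical sentinel word "end" and an input stream as the data source:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects words until the sentinel "end" or the end of the input.
inline std::vector<std::string> readUntilEnd(std::istream& in) {
    std::vector<std::string> words;
    std::string word;
    while (true) {
        // Code block A: runs on every pass, so at least once.
        const bool ok = static_cast<bool>(in >> word);

        // Loop condition, tested in the middle.
        if (!ok || word == "end") break;

        // Code block B: skipped on the final, failing pass.
        words.push_back(word);
    }
    return words;
}
```

Removing block A gives an ordinary while loop, and removing block B gives a do..while, which matches the description above.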
In a typical Discrete Structures class in computer science, it's an easy proof that there is an equivalence mapping between the two.
Stylistically, I prefer while (easy-expr) { } when easy-expr is known up front and ready to go, and the loop doesn't have a lot of repeated overhead/initialization. I prefer do { } while (somewhat-less-easy-expr); when there is more repeated overhead and the condition may not be quite so simple to set up ahead of time. If I write an infinite loop, I always use while (true) { }. I can't explain why, but I just don't like writing for (;;) { }.
I would say it is bad practice to write if..do..while loops, for the simple reason that this increases the size of the code and causes code duplication. Duplicated code is error-prone and should be avoided, as any change to one part must be repeated in the duplicate as well, which doesn't always happen. Also, bigger code means a harder time for the CPU cache. Finally, a plain while loop handles the null case by itself and saves headaches.
Only when the first iteration is fundamentally different should one use do..while, say, if the code that makes you pass the loop condition (like initialization) is performed in the loop. Otherwise, if it is certain that the loop condition will never fail on the first iteration, then yes, a do..while is appropriate.
From my limited knowledge of code generation I think it may be a good idea to write bottom test loops since they enable the compiler to perform loop optimizations better. For bottom test loops it is guaranteed that the loop executes at least once. This means loop invariant code "dominates" the exit node. And thus can be safely moved just before the loop starts.
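To make the point about hoisting concrete, here is a hand-written sketch of the transformation. The function names and the computation are invented, and a real compiler may or may not perform these steps for a given loop. The top-tested loop is rewritten as a guard plus a bottom-tested loop, after which the loop-invariant computation can sit in front of the loop, because the body is known to run at least once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Top-tested form, as a programmer would normally write it.
inline long scaleSumWhile(const std::vector<int>& v, int a, int b) {
    long sum = 0;
    std::size_t i = 0;
    while (i < v.size()) {
        const long k = static_cast<long>(a) * b;  // loop-invariant
        sum += v[i] * k;
        ++i;
    }
    return sum;
}

// Hand-written equivalent: guard + bottom-tested loop + hoisted invariant.
inline long scaleSumInverted(const std::vector<int>& v, int a, int b) {
    long sum = 0;
    std::size_t i = 0;
    if (i < v.size()) {                           // guard for the zero-iteration case
        const long k = static_cast<long>(a) * b;  // hoisted: body runs at least once
        do {
            sum += v[i] * k;
            ++i;
        } while (i < v.size());
    }
    return sum;
}
```

Both functions return the same result for every input; the second simply makes explicit the shape that a loop written bottom-tested already has.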