Now that I know some of the basics of C++, I must admit that I still find it very hard to deal with code that others have written in C++. This may be inherent, as C++ allows for complex object hierarchies that are, at least to me, very hard to grasp if one is just handed a C++ project without any further comments or instructions.
So my question is more a question to the more experienced C++ programmers among you: how can someone understand a large C++ project written by others?
I easily lose my way and can be lost for weeks if I try to understand how a large project of, for example, 10,000 lines of code is written. Functions of classes are pointers to functions of different classes that may or may not be overloaded and may or may not be inherited by other classes, et cetera, without end.
Are there any practical tips that may speed up my ability to read and understand large C++ projects? Is there perhaps a tutorial with such tips? Please, elaborate! :)
I've been programming professionally for some time now, and as such I have repeatedly been handed down codebases written by others before me. Understanding is never easy, especially when the code is inconsistent.
The first thing to realize, though, is that learning your way around a new codebase is not so different from rediscovering a codebase you have not touched for a while. Thus, whether it was written by your old self or by others does not matter much; and since you probably manage to cope with rediscovering codebases you worked on before, you should be able to discover new codebases as well. Don't lose hope.
The second thing to realize is that understanding is a vague term, and there are certainly different degrees. Often, nobody asks you to understand the ins and outs completely; more likely you will be asked to understand a portion of the codebase in which either there is a bug or some new functionality should be developed. Therefore, as time passes, you will gradually gain an understanding of various portions, and you will inevitably have a deeper knowledge of the portions you worked on the most, whilst others remain relatively abstract or even completely obscure. It's okay; it's been a long time since human beings stopped trying to learn everything there was to learn.
With that said, there are several axes of understanding you can try:
you should look for architecture: a good start is to trace the library dependencies (the Makefile/Project should help here); this will give you the coarse technical blocks out of which the application is built. Executables are normally leaves of the dependency trees.
you should look for data-flow: what triggers the application (is it called directly or as a callback)? What steps does the data follow (roughly, just a sketch)? Do not hesitate to focus on a specific narrow use case and use the debugger to trace things, and do not try to dig too deep at first; just get a feel for things.
There are also other axes that may help you gain some understanding of the domain the application was written for. An understanding of the domain is useful because it gives you key insight into what should happen, and it also helps you decipher the comments and function names.
user documentation: what is this used for? If you can arrange for a demo it is generally very helpful; otherwise maybe you can try playing with it yourself (in a test environment).
tests: what is tested? what is exposed to the user?
persistent data: what is serialized? what is saved in a database? Persistent data is accessed at some point, so it helps if you understand when it is read and written.
If it is a working product (that runs) and you can "debug" it, start by looking at just one particular feature.
Learn how it is working from the user's point of view (UI, behaviour, inputs, outputs, ...).
Once you know the feature from the outside, just look for the code for that feature (only that feature); the starting point might be a handler for a menu, or from a dialog or a mouse/pointer event.
From there, manually trace the code for one action or sub-feature; skip deep internal libraries (treat them as black boxes for now) and learn how it works.
Once you know that section of code, dig deeper into the library APIs that were called from the upper-level code.
Take your time.
Do not try to understand everything at once.
Draw up a schematic (pen and paper) of the dependencies (stay high level, no class dependencies at the beginning).
Good luck.
The problem that you are mentioning does not have a clear and simple answer. Nevertheless, here are some tips:
At the beginning, try to remember everything you come across, indiscriminately: names of directories, classes, template parameters, etc. As much as you can. This sounds pointless but still makes sense.
While working with the code, always ask yourself, "Have I looked at this function/param/etc. before?" If the answer is yes, spend more time on this piece of code. If not, just get a basic grasp and move on.
As time goes on, you will find that more and more of the code becomes clear and easier to grasp.
It is impossible to give any exact values because size and complexity of projects vary greatly. Do not expect simple and immediate results.
Other points:
You definitely need a source code browser. Spend time learning how to use it. A good example is http://sourceinsight.com/. This is not my site! I do have my own site, but I will not mention it here.
If you see a function that is called 500 times, it is 500 times more likely that knowledge about this function will be useful, compared with a function that is called only once.
The best thing is to grasp the architecture of the project. While trying to do this, remember that the project may have no architecture at all.
While studying the code, keep your task in mind. The typical situation is that you need to modify something or fix a bug. If so, look for the right part of the code and focus your effort on it.
I started a rather large 2D game engine project a few months ago, and I began noticing:
The code from the first one or two months is quite different from the more recent code:
The naming of variables feels a bit different
Some code style aspects are different
I'm sometimes wondering why I named some function that way, and can easily think of a better name
The code feels rather messy
There are parts where almost instantly a better way of doing it comes to my mind
The code seems as if its quality was significantly lower
However, at the time I wrote it, I was trying just as hard to do everything right as I do now.
Now, for my questions:
Is this a common situation, even in large commercial-style projects?
Should I consider investing (a lot of) time in refactoring and maybe even rewriting the affected code?
Is it normal that as a project grows and changes, large parts of code have to be refactored or rewritten from ground up? Is this bad?
Is this a common situation, even in large commercial-style projects?
Yes.
Should I consider investing (a lot of) time in refactoring and maybe even rewriting the affected code?
You going to do that again tomorrow too?
No. Not unless you're actually working on the code you want to refactor.
Is it normal that as a project grows and changes, large parts of code have to be refactored or rewritten from ground up?
Yes.
Is this bad?
It would certainly be a lot easier if we were all perfect, yes.
Yes, this is a common pattern with my projects as well. ABR: Always Be Refactoring. When I feel a new pattern emerge, I try to update older code to match it as well. As a project grows, your experience working in the problem domain influences your style and it's a good idea to be updating older code to match it as well.
As a corollary, if your first project commit is still in your project unchanged a few months later, something is going wrong. I view development as an exploratory practice, and a big part of that is updating old code and ironing out your style. No one knows their final design/API before they start coding. Find any large open source project and walk up its commit history; it happens everywhere.
If you've been working on a drawing or a painting for a while, your style develops sophistication the longer you do it. Also, your first layer or first few sketches are rarely the inked lines that appear in the final result.
A big takeaway lesson from this experience: you're getting better. Or, at least, you're changing. Certainly, from today's perspective, the code you're writing today looks better to you. If the code you wrote back then looks bad today - make it look better. Your responsibility today is not just the code you write today; it is the entire code base. So make it right - and be glad you're getting better.
Yes, this happens. I would even say that it's expected and typical as you delve further into your solution.
Only update your code when you go back and touch it. Don't forget to write unit tests before adjusting it.
It's very tempting to rewrite bad code for no reason, particularly when you don't have a deadline looming. You can easily get stuck in a loop that way.
Remember, shipping is a feature.
Is this a common situation, even in large commercial-style projects?
I must confess here that my belief is that if you design first and code later you can avoid many issues. So I would say here it depends. If one starts with a good design and has some company standards in place to ensure that the code based on the design follows the same important rules no matter who wrote it, then at least you have a chance to avoid such situations. However, I am not sure if this is always the case :-).
Should I consider investing (a lot of) time in re-factoring and maybe even rewriting the affected code?
Making things better can never hurt :-).
Is it normal that as a project grows and changes, large parts of code have to be re-factored or rewritten from ground up? Is this bad?
I would say yes and re-factoring should be normally considered to be a good thing when the resulting code is better than the old one. The world never stays the same and even if something was appropriate at some point in time it just may be that it doesn't stand up to the needs of today. So I would say it would be bad if the company you work for would say to you: "you cannot re-factor this code. It's holy". Change (if it is for the better) is always good.
Fred Brooks wrote, "Plan to throw one away; you will, anyhow." While it's not as true as it used to be, it is far from uncommon not to really understand the problem until you start working on it.
Our code sucks. Actually, let me clarify that. Our old code sucks. It's difficult to debug and is full of abstractions that few people understand or even remember. Just yesterday I spent an hour debugging in an area that I've worked in for over a year and found myself thinking, "Wow, this is really painful." It's not anyone's fault - I'm sure it all made perfect sense initially. The worst part is that usually It Just Works... provided you don't ask it to do anything outside of its comfort zone.
Our new code is pretty good. I think we're doing a lot of good things there. It's clear, consistent, and (hopefully) maintainable. We've got a Hudson server running for continuous integration and we have the beginnings of a unit test suite in place. The problem is our management is laser-focused on writing New Code. There's no time to give Old Code (or even old New Code) the TLC it so desperately needs. At any given moment our scrum backlog (for six developers) has about 140 items and around a dozen defects. And those numbers aren't changing much. We're adding things as fast as we can burn them down.
So what can I do to avoid the headaches of marathon debugging sessions mired in the depths of Old Code? Every sprint is filled to the brim with new development and showstopper defects. Specifically...
What can I do to help maintenance and refactoring tasks get high enough priority to be worked?
Are there any C++-specific strategies you employ to help prevent New Code from rotting so quickly?
Your management may be focused on getting working features into the product, and keeping them working. In this case, you will need to make a business case for refactoring the old stuff, in that by X investment of time and effort you can reduce necessary maintenance time by Y over period Z. Or your management may be fundamentally clueless (this happens, but less often than most developers seem to think), in which case you'll never get permission.
You need to see the business point of view. It doesn't matter to the end user whether the code is ugly or elegant, only what the software does. The cost of bad code is potential unreliability and additional difficulty in changing it; the emotional distress it causes to the programmer is rarely considered.
If you can't get permission to go in and refactor, you can always try it on your own, a little bit at a time. Whenever you fix a bug, do a little rewriting to make things clearer. This may turn out to be faster than the minimum possible fix, particularly in verifying that the code now works. Even if it isn't, it's usually possible to take a little more time on a bug fix without getting into trouble. Just don't get carried away.
If you can leave the code just a little better each time you go in, you'll feel a lot better about it.
Stand Up Meetings
I might go to my mechanic, and we have a little stand-up meeting in the morning:
I tell him I want my wheels aligned, my tires rotated, and my oil changed. I mention that "Oh by the way my brakes felt a little soft on the way in. Could [he] take a look at them? How soon can I get my car back because I need to get back to work?"
He pops his head under my car, pops back up and says my brakes are leaking oil and starting to fail. He will need a part that will arrive at 10:30am. His man won't finish before lunch, but I should get my car back by 1:30pm or so. He's booked solid so he won't be able to do any of the other stuff today, and I will have to book another appointment.
I ask if he can do the other stuff and I come back for the brake. He tells me he really can't let me drive out of there without fixing the brakes because they might cause an accident, but if I want to go to another mechanic, he can call for a tow.
Since the car will be done so shortly after lunch, I ask if his man can take a late lunch so I can get my car back an hour earlier.
He tells me his men come in at 8am and often work into the evening. They earn every break they get, and his man deserves to take his lunch with everyone else.
None of that is what I wanted to hear. I wanted to hear that I would drive out of there in a half hour with my wheels, tires and oil done.
My mechanic was just straight up and honest with me. Are you straight up and honest with your management? Or do you avoid telling them things they don't want to hear?
Unit Testing
I wouldn't touch a line of code I didn't understand, and I wouldn't check in a new line of code I didn't test thoroughly. (At least, not intentionally.)
Your question seems to imply that somehow a large corpus of poorly documented code made it past review without any unit tests. Maybe you participated in that, and maybe you didn't. Everyone involved needs to accept responsibility for that--including management. Regardless, what's done is done. You cannot go back and change it.
However, right now, in the present time, it is everybody's responsibility to stop the behavior that led to the problem in the first place. You say you spent a year working in code that you find difficult to understand and that has no unit tests. During that year, as you worked hard to improve your understanding, how many unit tests did you write to document and to verify that understanding?
As you struggled through the code slowly gaining understanding, how many comments did you add so you wouldn't have to struggle next time?
Scrum Backlog
Personally, I think the term "Scrum backlog" is a misnomer. A list of things to do is just a list--a shopping list if you will. I had a list when I went to the mechanic. My stand up meeting with the mechanic was really more of a sprint planning meeting.
A sprint planning meeting is a negotiation. If your management is time boxing without that negotiation, they aren't managing anything. They are simply trying to cram 10 lbs of shit into a 5 lb sack, and it's your responsibility to tell them so.
When you show up to a sprint planning meeting, you are expected to commit to a body of work, and it's your responsibility to prepare for that. Preparation means having some idea of what you will have to do to complete each item on the list--including the time it takes to understand obscure code and the time it takes to write unit tests.
If someone invites you to a planning meeting where you won't have time to prepare, decline the meeting and suggest when to reschedule so you will have time.
If you have an existing body of code with no unit tests and a feature might conceivably affect the operation of that code, you need to write unit tests for as much of the old code as might be affected. When you commit to writing the feature, you are committing to doing that work. If that leaves you too little time to commit to some other feature, just say so. Don't commit to the other feature.
When you commit to fix a defect, you commit to testing your work. Obviously, that means writing a unit test for the defect. But if it involves old code with no unit tests, it also means writing unit tests for things that aren't broken yet, but might break due to your change. How else will you test the fix?
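To make that concrete, here is a minimal sketch of what "a test for the defect plus tests for things that aren't broken yet" could look like. Everything in it is made up for illustration: `parse_percent` stands in for a piece of old code, the "empty string" defect is invented, and plain `assert` is used in place of whatever test framework your team actually has.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical legacy function: turns "42%" into 42 and returns -1 on bad input.
// Invented defect for this sketch: an empty string used to reach s.back()
// unchecked. The fix is the s.empty() check below.
int parse_percent(const std::string& s) {
    if (s.empty() || s.back() != '%') return -1;
    int value = 0;
    for (std::size_t i = 0; i + 1 < s.size(); ++i) {
        if (s[i] < '0' || s[i] > '9') return -1;  // reject non-digits
        value = value * 10 + (s[i] - '0');
    }
    return s.size() > 1 ? value : -1;  // a lone "%" has no digits
}

// The test for the defect itself.
void test_empty_input_is_rejected() {
    assert(parse_percent("") == -1);
}

// Characterization tests: pin down behaviour that was NOT broken,
// so the fix cannot silently change it.
void test_existing_behaviour_is_preserved() {
    assert(parse_percent("42%") == 42);
    assert(parse_percent("0%") == 0);
    assert(parse_percent("42") == -1);   // missing '%'
    assert(parse_percent("4x%") == -1);  // non-digit
    assert(parse_percent("%") == -1);    // no digits at all
}
```

Calling both test functions from a test runner (or a throwaway `main`) is enough to exercise this. Note that `assert` compiles to nothing when `NDEBUG` is defined, so a real test suite would normally use a proper framework instead.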
If your defect list remains a constant size, your team regresses as much as it fixes. Politely explain to whoever needs to understand that unit tests prevent the regressions that currently keep your defect list from shrinking.
If you fail to write those unit tests because you commit to too many features, whose responsibility is that?
Refactoring
When you refactor code, you have to test all of it, and that means writing unit tests for all of it. If you have a large body of code with no unit tests, you will have to write all of those unit tests before you refactor.
I suggest you hold off on refactoring until those unit tests are in place. In the meantime, if you insist on including unit tests in your estimates for the work you commit to, eventually all those unit tests will be there. And then you can refactor.
The one exception to that is refactoring for testability. You may find that some of the code was not designed for test and that you have to refactor for things like dependency injection before you can create your unit tests. When you commit to writing the feature that requires the unit test, you commit to making the code testable. Include that in your estimate when you commit to the feature.
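As an illustration of refactoring for testability, here is a small sketch of dependency injection in C++. It assumes a made-up `Session` class that used to call the system clock directly; `Clock`, `SystemClock`, `FakeClock` and `Session` are all hypothetical names, not anything from the question.

```cpp
#include <cassert>
#include <ctime>

// Before (hard to test): the code asked the system clock directly, e.g.
//   bool is_expired(std::time_t deadline) { return std::time(nullptr) > deadline; }

// After: the clock is an interface that callers pass in.
class Clock {
public:
    virtual ~Clock() = default;
    virtual std::time_t now() const = 0;
};

// Production implementation: same behaviour as the old code.
class SystemClock : public Clock {
public:
    std::time_t now() const override { return std::time(nullptr); }
};

// Test double: returns whatever time the test sets.
class FakeClock : public Clock {
public:
    explicit FakeClock(std::time_t t) : t_(t) {}
    std::time_t now() const override { return t_; }
    void set(std::time_t t) { t_ = t; }
private:
    std::time_t t_;
};

class Session {
public:
    Session(const Clock& clock, std::time_t deadline)
        : clock_(clock), deadline_(deadline) {}
    bool is_expired() const { return clock_.now() > deadline_; }
private:
    const Clock& clock_;  // injected dependency; must outlive the Session
    std::time_t deadline_;
};

// With the clock injected, expiry can be tested without waiting for real time.
void test_session_expiry() {
    FakeClock clock(1000);
    Session session(clock, 1500);
    assert(!session.is_expired());
    clock.set(1501);
    assert(session.is_expired());
}
```

Production code would construct `Session` with a `SystemClock`, so its behaviour stays the same; only the tests substitute the fake. The refactoring is small enough to fit inside the estimate for the feature that needs the test.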
Commitment + Responsibility = Power
You say you are powerless. When you accept responsibility and commit to doing what needs doing, I think you will find you have all the power you need.
P.S. If anyone complains about anybody "wasting time" writing multiple unit tests when fixing a single defect, show them this video on the 80:20 rule and pound "defect clusters" into their brains.
It is hard to tell much from the information you give. Some questions I would have: a logical reason to be writing new code is to replace the old code. Is that what you are doing? If so, abandon the old code.
Is it also old code that has showstopper defects? If so, where are they coming from? Old code does not have "showstopper" defects; it usually just grinds closer and closer to a halt. It is old code after all - it should have the same old defects and the same old limitations, not stuff that has to be looked at right away. Showstopper defects are new code defects. It sounds like there is active development going on in the old code.
If you are writing all this new code on top of old code that sucks, with no plans to fix it once and for all, sorry, there is only so much you can do when you are too busy burying yourself to dig yourself out.
If the latter is the case, you should recognize where you are headed, and try to detach a little. It is all going to collapse eventually; if you plan on being around, save your strength for a worthwhile battle.
In the meantime try to pick up some design patterns. There are several that can at least help shield your new code from the old stuff, but still, ultimately it is just hard to write good code against bad code.
And your sprints sound maybe confused. Is there not an overall direction? That should determine how much backlog you have, although things can change month to month, is there not a clear sense of moving towards some final goal?
And new code rotting? The way you prevent that is you have a meaningful design, a meaningful direction, and a quality team that is committed to both the quality of their work and the vision of the design. If you have that, discipline is what maintains quality. If you don't have that, sorry, you basically were writing code with no purpose already. It was basically rotten on the vine.
Not being critical, just trying to be honest. Take a deep breath. Slow down. You seem like you need it. Look at what you have written here. It tells nothing. You talk of refactor, scrums, showstoppers, defects, old code, new code. What does any of that mean? It is all jumbled up.
What about "new initiatives versus legacy systems"? "Need to refactor early sprint cycle code in terms of latest understanding etc." Are showstoppers in fact "Early components of the current enterprise initiatives have been released but are experiencing problems and no time is budgeted because of new development".
These would be meaningful concepts. You've given us nothing. I understand it is intense. My sprints are crazy too; we add a lot of backlog items because we could not get many requirements up front (a lot of my new requirements result from having to also contend with external regulatory bodies; the normal business process is not always available).
But at the same time I am ground down by the sheer magnitude of what has to be done and the time to do it. Everything that is added to my backlog needs to be there. It is crazy, but at the same time I have a very clear idea of where I have been, where I need to go, and why the road is getting harder.
Step back, clear your thoughts, figure out the same - where you have been and where you are going. Because if you know that, it sure is not obvious. If you cannot communicate anything your peers can understand, how far are you going to get with a business manager?
Old code always sucks. There are probably some rare exceptions written by people with names like Kernighan or Thompson but, for the typical "code written in an office" stuff, over time it's gonna stink. Developers get more experienced. Newer practices, such as continuous integration, change the game. Stuff gets forgotten. New maintainers fail to grasp designs and wish for re-writes. So best accept this as normal.
Some random things that might help...
Talk about it with your team. Share your experiences and your concerns, while avoiding "man your old code sucks" (for obvious reasons) and see what the consensus is. You're probably not alone.
Forget about your managers. Don't expose them to this level of detail - they don't need to think about new vs. old code and probably won't understand if they do. This is a problem for your team to tackle and, if necessary, to make your PO aware of.
Be open to the possibility that you may be able to throw stuff out. Some of that old code probably relates to features that are no longer being used or failed to be adopted by users in the first place. To make this work for you, you really need to go a level higher and think in terms of where the code really delivers user or business value vs. where it's just a ball of mud that no one is brave enough to take a decision on. Who dares, wins.
Relax your view of architectural consistency. There's always a way to tap into a working system with new code somewhere, and that may allow you to slowly migrate to a newer, smarter approach, while preserving the old long enough not to break existing things.
Overall, winning in this kind of situation is less about coding skills and much more about smart choices and handling the human aspects.
Hope that helps.
I recommend keeping track of how many bugs and code changes involve your "old code" and present this to either your manager or to your fellow developers at your next team meeting. With this in hand it should be simple enough to convince them that more needs to be done to refactor your "old code" and bring it up to par with your "new code".
It would also be prudent to document the parts of your "old code" that are most difficult to understand. These would also be the parts of your "old code" that you should be refactoring first once you get the approval.
Something to try: group your classes into - say - worst 10%, best 10%, and the rest, based on length, cyclomatic complexity, test coverage - whatever tools are handy and comfortable to you. Deliver the lists to your management, saying, "I predict the majority of bugs over the next quarter will be found in the first set." Then sit back and watch - and be right. Now you've got some credibility, some leverage when you say, "I'd like to invest some resources in making our bad code better, to reduce bugs and maintenance costs - and I know where to invest that energy, see?"
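If you already have per-class numbers from your tools, the grouping itself is only a few lines. The sketch below is one possible way to do it; the `ClassMetrics` fields, the weights in `risk`, and the "at least one class per bucket" rule are all assumptions made up for this example, so tune them to whatever your metrics actually look like.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-class metrics, exported from whatever tools you use.
struct ClassMetrics {
    std::string name;
    int lines;
    int complexity;   // e.g. cyclomatic complexity
    double coverage;  // test coverage, 0.0 .. 1.0
};

// Made-up risk score: bigger, more complex, less tested => riskier.
double risk(const ClassMetrics& m) {
    assert(m.coverage >= 0.0 && m.coverage <= 1.0);
    return m.lines * 0.01 + m.complexity * 1.0 + (1.0 - m.coverage) * 50.0;
}

struct Buckets {
    std::vector<std::string> worst, best, rest;
};

// Sorts by descending risk, then splits into worst 10%, best 10%, and the rest.
Buckets bucket_by_risk(std::vector<ClassMetrics> classes) {
    std::stable_sort(classes.begin(), classes.end(),
                     [](const ClassMetrics& a, const ClassMetrics& b) {
                         return risk(a) > risk(b);
                     });
    const std::size_t n = classes.size();
    // 10% of the classes, but at least one when the list is non-empty.
    const std::size_t tenth = n == 0 ? 0 : std::max<std::size_t>(1, n / 10);
    Buckets out;
    for (std::size_t i = 0; i < n; ++i) {
        if (i < tenth) out.worst.push_back(classes[i].name);
        else if (i >= n - tenth) out.best.push_back(classes[i].name);
        else out.rest.push_back(classes[i].name);
    }
    return out;
}
```

The exact formula matters less than writing the prediction down before the quarter starts, so that the bug tracker can confirm or refute it afterwards.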
You could create diagrams and sketches of how the new code works and how the classes and functions are related to one another. You could use FreeMind or maybe Dia. And I definitely agree with Documenting and commenting your code.
I once had a problem with this too. I wrote a font class for J2ME for my own language. It was awful for these reasons that maybe you might also see in your code.
No Comments or documentation
Less object oriented
bad variable / function names
...
But after a few months I was forced to write the whole thing again. Now I've learned to use meaningful variable names that are sometimes VERY long, to write more comments than code, and to use diagrams for the project's classes and their relationships.
I don't know if it was a real answer, but it definitely worked for me. And for old code you might actually have to reread the whole thing and add comments as you remember what it does.
Hope it helped.
Talk to your Product Owner! Explain that time invested in refactoring the old code will bring him the benefit of higher team velocity on new features once this obstacle is removed.
Other than the approaches mentioned above which are good, you can also try these:
For keeping future code clean
Try pair programming, at least for parts where it makes sense. It's an effective way of making reviewed, refactored code a habit.
Try to get refactoring onto the definition of "done". Then it will be part of the estimation process and allotted accordingly. So the definition of done might include: coded, unit tested, functionally tested, performance tested, code reviewed, refactored, and integrated (or something like this).
For Cleaning up the old code:
Unit tests are great for helping you refactor and figure out how things work.
I agree with the comments that a business case needs to be made for large-scale refactoring. But, small-scale refactoring could be easily included in the estimate and will provide immediate return. i.e.: I spend 2 hours rewriting a piece but I would have spent that time looking for bugs anyway.
You may also want to consider getting the product owner and scrummaster to capture a separate velocity for the old code vs the new code, and use that accordingly.
If there's a desired new feature and you can delineate a non-overwhelming hunk of code that is in the way, then you might be able to get management's blessing to replace the old code with new code that has the desired new feature. When I did this, I had to write a somewhat ugly shim layer to meet the old interfaces of the part of the software I wasn't going to touch. I also wrote a test harness that could exercise the existing code and exercise the new code to make sure the new code, as seen through the shim layer, could fool the rest of the application into thinking nothing had changed. By reworking the portion we reworked, we were able to show huge performance benefits, compatibility with desired new hardware, reduction in each of our field sites' need for expertise in administering space for the application - and the new code was much more maintainable. That last point mattered not a whit to the users, but the other advantages from the rework were attractive enough to "sell" the users on the merits of a somewhat painful database conversion.
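For readers who have not built one, here is a rough sketch of the shim-plus-harness idea in C++. It is not the code from the project described above; `LegacyStore`, `NewStore`, `StoreShim` and `exercise` are hypothetical names, and the C-style `Put`/`Get` signatures are just an assumed example of an old interface that the rest of the application still calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <utility>

// Old interface that the untouched code still calls (hypothetical).
// Both calls return 0 on success and -1 on failure.
class LegacyStore {
public:
    virtual ~LegacyStore() = default;
    virtual int Put(const char* key, const char* value) = 0;
    virtual int Get(const char* key, char* out, int out_size) = 0;
};

// New, cleaner implementation (also hypothetical).
class NewStore {
public:
    void put(const std::string& key, std::string value) {
        data_[key] = std::move(value);
    }
    const std::string* find(const std::string& key) const {
        auto it = data_.find(key);
        return it == data_.end() ? nullptr : &it->second;
    }
private:
    std::map<std::string, std::string> data_;
};

// The shim: old interface on the outside, new code on the inside.
class StoreShim : public LegacyStore {
public:
    explicit StoreShim(NewStore& impl) : impl_(impl) {}
    int Put(const char* key, const char* value) override {
        if (!key || !value) return -1;
        impl_.put(key, value);
        return 0;
    }
    int Get(const char* key, char* out, int out_size) override {
        if (!key || !out || out_size <= 0) return -1;
        const std::string* v = impl_.find(key);
        // Fail if the key is missing or the value (plus '\0') does not fit.
        if (!v || v->size() >= static_cast<std::size_t>(out_size)) return -1;
        std::memcpy(out, v->c_str(), v->size() + 1);
        return 0;
    }
private:
    NewStore& impl_;
};

// Harness: the same checks can be run against the old implementation
// and against the shim, to show that callers cannot tell them apart.
void exercise(LegacyStore& store) {
    const int kBufSize = 16;
    char buf[kBufSize];
    assert(store.Get("missing", buf, kBufSize) == -1);
    assert(store.Put("k", "v") == 0);
    assert(store.Get("k", buf, kBufSize) == 0);
    assert(std::string(buf) == "v");
}
```

The point of running `exercise` against both implementations is that the harness encodes the old behaviour, quirks included, so any difference shows up before the rest of the application ever sees the new code.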
Another more modest success story: we had a decent trouble tracking system that had literally years of history. There was a subsystem of our application that was famed for the speed with which it would burn out maintenance programmers. Clearly (well, clearly in my mind) it was in need of a major re-write, but management wasn't enthused about that. We were able to dig through the history in the trouble tracking data to show the staffing level that had gone into maintaining this module, and for all that effort, the trouble tickets per month against that module continued to arrive at a constant rate. When faced with actual data like that, even the reluctant managers who had long been tight-fisted about staffing re-work of that subsystem could see the merit of assigning staff to rework that module.
The approach as before was to leave the input and output of that module alone. The good news was that throwing virtual memory at the new code with its fancy new data structures did give a noticeable performance improvement to the module. The bad news is that we were nearly done with the re-implementation before we really understood what was wrong in the original implementation such that it did work most of the time, but managed to fail on some of the transactions on some days. The first cut faithfully reproduced those bugs, but the bugs were easier to understand in the reworked code so we now had a shot at really fixing the real problem. In retrospect, maybe we'd have been smarter to have captured data that produced the problems and have taken better care to make sure the reworked version didn't reproduce that problem. But, the truth is, nobody understood the problem until we were quite far along on the re-write. So, the re-write gave improved performance to the users and improved understanding to the current programmers, such that the real problem could really be resolved at last.
A fail example: There was yet another incredibly ugly module that persistently was a sore spot. Alas, I wasn't clever enough to be able to understand the de facto interfaces to this particular wretched hive of scum and villainy, at least not in the time frame of the nominal release schedule. I'd like to believe that given more time we could have figured out a suitable plan for re-working that piece of the system too, and maybe once we understood it, we could even identify user-desired improvements that we could fit into the re-write. But I can't promise that you'll find a prize in every box. If the box is entirely obscure to you, slicing away a chunk of it and replacing that piece with clean code is hard to do. The guy who had charge of that module is probably the one who was best positioned to figure out a plan of attack, but he saw the frequent crashes and calls from the field for assistance as "job security". I don't think management ever really recognized that he needed to be eased aside for someone with a hunger for change, but that's what probably was needed.
Drew
Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
I am new to work but the company I work in hires a lot of non-comp-science people who are smart enough to get the work done (complex) but lack the style and practices that should help other people read their code.
For example, they adopt C++ but still write C-like 3-page functions, which drives new folks nuts when they try to read them. Changing that code also feels very risky, as it's never easy to be sure we are not breaking something.
Now, I am involved in the project with these guys, and I can't change the entire code base or design by myself so that the code looks good. What can I do in this situation?
PS> we actually have 3 page functions & because we do not have a concept of design, all we can do is assume what they might have thought as there is no way to know why is it designed the way it is.
I am not complaining. I am asking for suggestions, and I am already reading some books to address the issues: Pragmatic Programmer; Design portion from B. Stroustrup; Programming and principles by B. Stroustrup.
The best and most important thing you can do is lead by example. Do things the right way and try to improve things slowly. You aren't going to fix anything overnight.
Make sure every piece of code that you are responsible for is better after you are done with it. Over time, the system will tangibly be better because of your efforts.
After you build a strong reputation with your co-workers, try to start some code reviews or lunch-training sessions to get everyone up to speed on better ways to do things.
In a nutshell: it will be difficult and frustrating, but it's possible. Good luck.
Your best bet is NOT to handle it at all. Here are potential problems if you try:
You will be criticized for doing something you were not told to (makes performance reviews go real bad.)
You will have less time to do your own work.
You will not advance your career by cleaning working code- if it is not broke then do not touch it.
Never make enemies with people who control your career- unintentionally implying they are obsolete morons does not help your cause (especially in a bad economy).
Focus on making your own code great. Battling poorly written code is part of the ill of being a Software Engineer. You are in the wrong profession if you will not stand for it.
A little off point but important- You may need to switch jobs or teams if possible once the economy picks up. Mixing with a truckload of bad coders who do not bother to update their knowledge and practices dulls your own programming skills and weakens your marketability.
If you're a junior dev, then the only thing you can really do is write code that is as elegant and readable as possible.
If your style is indeed better, other people might notice and say "hey we should adopt this formula"
Actions speak louder than complaints, which is something I noticed.
The present is embodied in Hexagram 47 - K'un (Oppression): Despite exhaustion, there may yet be progress and success. For the firm and correct, the really great man, there will be good fortune. He will fall into no error. If he make speeches, his words cannot be made good.
The future is embodied in Hexagram 6 - Sung (Conflict): Though there is sincerity in one's contention, he will yet meet with opposition and obstruction. If he cherish an apprehensive caution, there will be good fortune. If he prosecute the contention to the bitter end, there will be evil. It will be advantageous to see the great man. It will not be advantageous to cross the great stream.
I was in the exact same shoes before. I got hired as a C++ programmer to 'lead the team' on how to use C++ by an enthusiastic manager. That was about a decade ago. Some of the newer engineers loved me, the seniors despised me. Our system was basically a pseudo-C++ system. It was like C with classes, but it appeared that people didn't even understand the usefulness of things like constructors since they hardly appeared.
You complain about functions that are 3 pages long; we had functions that were 8000 lines long full of long jumps, function pointer casts, etc. One of the seniors even formatted the code with 2-space indentations so that super deep nested blocks can be written without using much horizontal space since the seniors seemed to be allergic to writing functions and procedural programming in general. Someone even inlined a 2000 line function thinking it would make things faster. You might be dealing with some bad code, but I was dealing with the most horrid copy-and-paste code imaginable.
Unfortunately I was very young and cocky. I didn't get along with the seniors, I fought against them in territorial battles over code. They responded by creating coding standards which no sane C++ programmer could follow (ex: it's okay to use operator new, but don't use exception handling, don't use constructors or destructors, etc). As a result I wrote the most bizarre and stupid standards-workaround C++ code just to kind of rebel against those standards since I refused to write C-style code given the reason I was hired (I didn't hate C so much, but writing C code was not part of the job description: I was hired essentially as a C++ consultant), even though the standards made C-style coding the only practical way to do things. I only kept my job because I put in so much overtime to make sure that my code works very well in spite of these ridiculous coding standards.
It wasn't until years later when others started to see things my way that we lifted the silly standards and started writing more natural, easy-to-read C++ code complete with STL and boost goodies, RAII, exception-handling, etc. That isolated the seniors who refused to write code in a more sane fashion and they were finally forced to adapt.
In retrospect, I could have done things much better. The seniors were intent not to allow me to be put in a teaching position, but I think I would have gotten my point across much faster with my head down. The biggest regret I have is trying to work around the impossible coding standards rebelliously rather than getting them rectified through clear and rational discussions. As a result, I have really stupid and obfuscated C++ code in the system which people attribute to me even though that's not how I normally write C++ code. The regular developers I work with understand this, but the seniors still point to it as an example of why C++ is bad.
In short, I recommend that you focus on making more friends rather than enemies. Your friends will support you, and if your way is better and you can clearly demonstrate it, you can isolate the few who will never agree.
Being enthusiastic to code the right way is a good trait to possess and in the software industry we will always encounter other developers who write code that is not quite in line with our "perfect way" of coding. This should never be interpreted as rubbish code, or inept coders, because we all start out like that in some shape or form.
Always respect your peers around you as you want them to respect you. It's certainly not easy to do in an environment that highly regards ego, but attempting to approach a topic like this is never easy.
It's how you communicate
Try different approach angles, remember you are there to learn as much as to render a service.
So commenting on the "poor" code style in an "in-your-face" kind of approach might not be the result you were looking for. So then back up a bit and try approaching the topic with "I was considering the style of code used and have a few suggestions..." and see the difference that gives.
Where I work now, the one thing I've learnt is that it's fine to comment on something that might not be good quality, but then I had better have a better solution to present.
In other words, be prepared to back up your words with useful solutions and not because you feel so.
This is the most basic reason why establishing coding standards and code review processes are a good idea.
I will not write three page functions no matter what coding standards and processes are established, but some people will. They will create 20 local variables at the beginning, without initializing any of them. You'll have pointers and integers with unspecified values. You won't know the exact meaning and scope of each variable. Etc etc.
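To make the point about uninitialized locals concrete, here is a minimal sketch of the alternative: declare each variable where it is first needed and give it a value at once. The function name `sum_of_positives` is a hypothetical example, not code from any project mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: adds up the positive entries of a vector.
// Every local is initialized at its declaration and lives in the
// narrowest scope that still works.
int sum_of_positives(const std::vector<int>& values) {
    int total = 0;  // has a defined value from the first line
    for (std::size_t i = 0; i < values.size(); ++i) {
        const int v = values[i];  // visible only inside the loop body
        if (v > 0) {
            total += v;
        }
    }
    return total;
}

// Small self-check showing the intended behaviour.
void check_sum_of_positives() {
    assert(sum_of_positives({1, -2, 3}) == 4);
}
```

With this layout a reader never has to scroll back to the top of the function to find out what a variable is or whether it holds a value yet.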
Try to convince your manager and, later, your team, with solid arguments. Maybe you could start with a shared reading of Effective C++ or C++ Coding Standards. Try to stress the point that, when working this way and creating better code, everybody wins. If they see this as a win-win situation, they will be more open to the change.
I can empathize.
I'm below two senior programmers who have very unique styles that I find frustrating. We have code that consists of one main function that's 1000 lines long. (That is not a typo.) Our coding standards discourage globals, so we make every program one App object. Now our globals are member variables! When we need an iteration interface for C++ classes, why should we use the begin, end, and operator++ conventions, when First, AtLast, and Next can be used instead? We've wrapped up third party libraries in custom interfaces for no good reason. (We've wrapped log4cxx and lost functionality for no reason I can tell, and one of our date classes consists of a shared pointer to a boost::date object with a fraction of the functionality.)
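If you are stuck with a First/AtLast/Next style interface, one option is a thin adapter that exposes the usual begin/end convention without touching the legacy class. The sketch below is only an illustration: `LegacyList`, its `Current()` member, and the exact meaning of `First`, `AtLast`, and `Next` are assumptions, since the actual classes are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical legacy container. Assumed semantics: First() rewinds,
// AtLast() is true once the cursor has moved past the final element,
// Next() advances the cursor, Current() reads the element under it.
class LegacyList {
public:
    explicit LegacyList(std::vector<int> items) : items_(std::move(items)) {}
    void First() { pos_ = 0; }
    bool AtLast() const { return pos_ >= items_.size(); }
    void Next() { ++pos_; }
    int Current() const { return items_[pos_]; }
private:
    std::vector<int> items_;
    std::size_t pos_ = 0;
};

// Thin adapter that makes range-based for work on a LegacyList.
class LegacyRange {
public:
    explicit LegacyRange(LegacyList& list) : list_(list) {}

    class Iterator {
    public:
        explicit Iterator(LegacyList* list) : list_(list) {}
        int operator*() const { return list_->Current(); }
        Iterator& operator++() { list_->Next(); return *this; }
        // Only comparison against end() is meaningful: the loop keeps
        // going until the legacy cursor reports that it is finished.
        bool operator!=(const Iterator&) const { return !list_->AtLast(); }
    private:
        LegacyList* list_;
    };

    Iterator begin() { list_.First(); return Iterator(&list_); }
    Iterator end() { return Iterator(&list_); }
private:
    LegacyList& list_;
};

// Usage: the loop reads like any other C++ container loop.
int sum_all(LegacyList& list) {
    int total = 0;
    for (int value : LegacyRange(list)) {
        total += value;
    }
    return total;
}
```

The adapter is single-pass and shares the one cursor inside the legacy object, so it is good enough for simple loops but not for algorithms that need two independent iterators.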
Here's how I've kept my sanity. I've focused on the new languages and tools. This is our company's first project involving Python, and I've spent my time writing utilities and programs there. While the senior programmers code in what they're familiar with, I've got near free rein of all the Python code. I couldn't stand the C++ APIs we use, so I duplicated most of the same functionality in Python in a much more friendly way, and the other developers prefer it too.
Likewise, we have little familiarity with log4cxx and less with boost-build, both of which I've taken time to study in depth and know well enough that people come to me with questions. I've written a handful of resources on our wiki giving usage tips for log4cxx and numpy and other tools.
This is how it is, get used to it or quit and find a place where it isn't like that. You will be marginalized if you criticize their efforts and they may feel threatened if you do indeed write your own, better code or improve theirs. At the end of the day they deliver code and management sees a black box that works and that's all that matters. Plus, you'll be just another kid from college who thinks he knows something about development at a business and laughed at and ostracized when not around. Honestly, a lot of the times these systems are built like this because of shaky requirements, lots of functionality bolted on with time and management's lack of respect for a stable software development process.
Not all companies are like this. I'd start looking for a new job to be honest.
I fervently hope this is the best opportunity to grow by facing the challenge. As Robert said, try to lead by example. If possible, let them adopt your pattern.
You are not alone. I too faced this recently; luckily we have support from Senior Management for Code Reviews.
1. Even a single line change for a bug fix should be reviewed online.
2. Comments on code can be classified as CodingStandard/Suggestion/Clarification/Major/Minor etc.
3. While giving comments to a senior, you can start with Clarification/CodingStandard rather than Major.
You could address the issue of "we are afraid to touch it for fear of breaking it" by writing extensive unit tests. Once you get the unit tests in place, you will be freed to refactor at will. If your unit tests still pass after the refactoring, you can be confident you haven't broken anything.
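A minimal sketch of what such a safety net can look like, assuming no test framework is available and plain `assert` has to do. `legacy_format_price` is a hypothetical stand-in for a routine nobody dares to touch. The test records what the code does today, so any refactoring that changes the output is caught immediately.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a legacy routine: formats a non-negative
// amount of cents as "units.cc".
std::string legacy_format_price(int cents) {
    std::string out = std::to_string(cents / 100) + ".";
    const int rem = cents % 100;
    if (rem < 10) {
        out += "0";  // pad single-digit cents
    }
    out += std::to_string(rem);
    return out;
}

// Characterization test: pins down the current behaviour before any
// refactoring starts. Run it after every small change.
void test_legacy_format_price() {
    assert(legacy_format_price(0) == "0.00");
    assert(legacy_format_price(5) == "0.05");
    assert(legacy_format_price(100) == "1.00");
    assert(legacy_format_price(1234) == "12.34");
}
```

The confidence you get is only as good as the cases you cover, so it pays to add inputs taken from real usage before restructuring the function.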
Just place a copy of "Clean Code" (Martin), "Refactoring: Improving the Design of Existing Code" (Fowler) or "Effective C++" somewhere in the office where people may start browsing the books. From now on, word will spread. Seriously, it's never too late to learn! ;)
Our management has recently been talking to some people selling C++ static analysis tools. Of course the sales people say they will find tons of bugs, but I'm skeptical.
How do such tools work in the real world? Do they find real bugs? Do they help more junior programmers learn?
Are they worth the trouble?
Static code analysis is almost always worth it. The issue with an existing code base is that it will probably report far too many errors to make it useful out of the box.
I once worked on a project that had 100,000+ warnings from the compiler... no point in running Lint tools on that code base.
Using Lint tools "right" means buying into a better process (which is a good thing). One of the best jobs I had was working at a research lab where we were not allowed to check in code with warnings.
So, yes the tools are worth it... in the long term. In the short term turn your compiler warnings up to the max and see what it reports. If the code is "clean" then the time to look at lint tools is now. If the code has many warnings... prioritize and fix them. Once the code has none (or at least very few) warnings then look at Lint tools.
So, Lint tools are not going to help a poor code base, but once you have a good codebase it can help you keep it good.
Edit:
In the case of the 100,000+ warning product, it was broken down into about 60 Visual Studio projects. As each project had all of the warnings removed it was changed so that the warnings were errors, that prevented new warnings from being added to projects that had been cleaned up (or rather it let my co-worker righteously yell at any developer that checked in code without compiling it first :-)
In my experience with a couple of employers, Coverity Prevent for C/C++ was decidedly worth it, finding some bugs even in good developers’ code, and a lot of bugs in the worst developers’ code. Others have already covered technical aspects, so I’ll focus on the political difficulties.
First, the developers whose code needs static analysis the most are the least likely to use it voluntarily. So I’m afraid you’ll need strong management backing, in practice as well as in theory; otherwise it might end up as just a checklist item, to produce impressive metrics without actually getting bugs fixed. Any static analysis tool is going to produce false positives; you’re probably going to need to dedicate somebody to minimizing the annoyance from them, e.g., by triaging defects, prioritizing the checkers, and tweaking the settings. (A commercial tool should be extremely good at never showing a false positive more than once; that alone may be worth the price.) Even the genuine defects are likely to generate annoyance; my advice on this is not to worry about, e.g., check-in comments grumbling that obviously destructive bugs are “minor.”
My biggest piece of advice is a corollary to my first law, above: Take the cheap shots first, and look at the painfully obvious bugs from your worst developers. Some of these might even have been found by compiler warnings, but a lot of bugs can slip through those cracks, e.g., when they’re suppressed by command-line options. Really blatant bugs can be politically useful, e.g., with a Top Ten List of the funniest defects, which can concentrate minds wonderfully, if used carefully.
As a couple of people remarked, if you run a static analysis tool full bore on most applications, you will get a lot of warnings, some of which may be false positives or may not lead to an exploitable defect. It is that experience that leads to a perception that these types of tools are noisy and perhaps a waste of time. However, there are warnings that will highlight real and potentially dangerous defects that can lead to security, reliability, or correctness issues, and for many teams those issues are important to fix and may be nearly impossible to discover via testing.
That said, static analysis tools can be profoundly helpful, but applying them to an existing codebase requires a little strategy. Here are a couple of tips that might help you:
1) Don't turn everything on at once, decide on an initial set of defects, turn those analyses on and fix them across your code base.
2) When you are addressing a class of defects, help your entire development team to understand what the defect is, why it's important and how to code to defend against that defect.
3) Work to clear the codebase completely of that class of defects.
4) Once this class of issues has been fixed, introduce a mechanism to stay in that zero-issue state. Luckily, it is much easier to make sure you are not re-introducing an error if you are at a baseline that has no errors.
It does help. I'd suggest taking a trial version and running it through a part of your codebase which you think is neglected. These tools generate a lot of false positives. Once you've waded through these, you're likely to find a buffer overrun or two that can save a lot of grief in near future. Also, try at least two/three varieties (and also some of the OpenSource stuff).
I've used them - PC-Lint, for example, and they did find some things. Typically they are configurable and you can tell them 'stop bothering me about xyz', if you determine that xyz really isn't an issue.
I don't know that they help junior programmers learn a lot, but they can be used as a mechanism to help tighten up the code.
I've found that a second set of (skeptical, probing for bugs) eyes and unit testing is typically where I've seen more bug catching take place.
Those tools do help. lint has been a great tool for C developers.
But one objection that I have is that they're batch processes that run after you've written a fair amount of code and potentially generate a lot of messages.
I think a better approach is to build such a thing into your IDE and have it point out the problem while you're writing it so you can correct it right away. Don't let those problems get into the code base in the first place.
That's the difference between the FindBugs static analysis tool for Java and IntelliJ's Inspector. I greatly prefer the latter.
You are probably going to have to deal with a good amount of false positives, particularly if your code base is large.
Most static analysis tools work using "intra-procedural analysis", which means that they consider each procedure in isolation, as opposed to "whole-program analysis" which considers the entire program.
They typically use "intra-procedural" analysis because "whole-program analysis" has to consider many paths through a program that won't actually ever happen in practice, and thus can often generate false positive results.
Intra-procedural analysis eliminates those problems by just focusing on a single procedure. In order to work, however, they usually need to introduce an "annotation language" that you use to describe meta-data for procedure arguments, return types, and object fields. For C++ those things are usually implemented via macros that you decorate things with. The annotations then describe things like "this field is never null", "this string buffer is guarded by this integer value", "this field can only be accessed by the thread labeled 'background'", etc.
The analysis tool will then take the annotations you supply and verify that the code you wrote actually conforms to the annotations. For example, if you could potentially pass a null off to something that is marked as not null, it will flag an error.
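As a rough illustration of the idea, the sketch below uses two made-up macros, `ANNOT_NOT_NULL` and `ANNOT_MAY_BE_NULL`. They are assumptions for this example only. Real tools define their own annotation spellings, and here the macros expand to nothing, so the compiler ignores them and only a checker that understands them would act on them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical annotation macros (not from any real tool). They expand
// to nothing, so they document the contract without changing the code.
#define ANNOT_NOT_NULL
#define ANNOT_MAY_BE_NULL

// Contract: `text` must never be null.
std::size_t text_length(ANNOT_NOT_NULL const char* text) {
    return std::strlen(text);
}

// Contract: the result is null when the key is unknown.
ANNOT_MAY_BE_NULL const char* lookup(const char* key) {
    return std::strcmp(key, "greeting") == 0 ? "hello" : nullptr;
}

std::size_t greeting_length(const char* key) {
    const char* value = lookup(key);
    // Without this check, an annotation-aware checker could report that
    // a maybe-null value flows into a parameter marked as not null.
    if (value == nullptr) {
        return 0;
    }
    return text_length(value);
}
```

The value of the annotations is that the checker can verify each function against its stated contract without having to follow every call path through the program.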
In the absence of annotations, the tool needs to assume the worst, and so will report a lot of errors that aren't really errors.
Since it appears you are not using such a tool already, you should assume you are going to have to spend a considerable amount of time annotating your code to get rid of all the false positives that will initially be reported. I would run the tool initially, and count the number of errors. That should give you an estimate of how much time you will need to adopt it in your code base.
Whether or not the tool is worth it depends on your organization. What are the kinds of bugs you are bitten by the most? Are they buffer overrun bugs? Are they null-dereference or memory-leak bugs? Are they threading issues? Are they "oops we didn't consider that scenario", or "we didn't test a Chinese version of our product running on a Lithuanian version of Windows 98?".
Once you figure out what the issues are, then you should know if it's worth the effort.
The tool will probably help with buffer overflow, null dereference, and memory leak bugs. There's a chance that it may help with threading bugs if it has support for "thread coloring", "effects", or "permissions" analysis. However, those types of analysis are pretty cutting-edge, and have HUGE notational burdens, so they do come with some expense. The tool probably won't help with any other type of bugs.
So, it really depends on what kind of software you write, and what kind of bugs you run into most frequently.
I think static code analysis is well worth it, if you are using the right tool. Recently, we tried the Coverity tool (a bit expensive). It's awesome; it brought out many critical defects which were not detected by lint or purify.
Also, we found that we could have avoided 35% of the customer field defects if we had used Coverity earlier.
Now Coverity is rolled out in my company, and whenever we get a customer TR in an old software version, we run Coverity against it to bring out the possible candidates for the fault before we start the analysis in a subsystem.
Paying for most static analysis tools is probably unnecessary when there are some very good-quality free ones (unless you need some very special or specific feature provided by a commercial version). For example, see this answer I gave on another question about cppcheck.
I guess it depends quite a bit on your programming style. If you are mostly writing C code (with the occasional C++ feature) then these tools will likely be able to help (e.g. memory management, buffer overruns, ...). But if you are using more sophisticated C++ features, then the tools might get confused when trying to parse your source code (or just won't find many issues because C++ facilities are usually safer to use).
As with everything, the answer depends ... if you are the sole developer working on a knitting-pattern-pretty-printer for your grandma, you probably do not want to buy any static analysis tools. If you have a medium-sized project for software that will go into something important, and maybe on top of that you have a tight schedule, you might want to invest a little bit now that saves you much more later on.
I recently wrote a general rant on this: http://www.redlizards.com/blog/?p=29
I should write part 2 as soon as time permits, but in general do some rough calculations whether it is worth it for you:
how much time spent on debugging?
how many resources bound?
what percentage could have been found by static analysis?
costs for tool setup?
purchase price?
peace of mind? :-)
My personal take is also:
get static analysis in early
early in the project
early in the development cycle
early as in really early (before nightly build and subsequent testing)
provide the developer with the ability to use static analysis himself
nobody likes to be told by test engineers or some anonymous tool what they did wrong yesterday
less debugging makes a developer happy :-)
provides a good way of learning about (subtle) pitfalls without embarrassment
This rather amazing result was accomplished using Elsa and Oink.
http://www.cs.berkeley.edu/~daw/papers/fmtstr-plas07.pdf
"Large-Scale Analysis of Format String Vulnerabilities in Debian Linux"
by Karl Chen, David Wagner,
UC Berkeley,
{quarl, daw}#cs.berkeley.edu
Abstract:
Format-string bugs are a relatively common security vulnerability, and can lead to arbitrary code execution. In collaboration with others, we designed and implemented a system to eliminate format string vulnerabilities from an entire Linux distribution, using type-qualifier inference, a static analysis technique that can find taint violations. We successfully analyze 66% of C/C++ source packages in the Debian 3.1 Linux distribution. Our system finds 1,533 format string taint warnings. We estimate that 85% of these are true positives, i.e., real bugs; ignoring duplicates from libraries, about 75% are real bugs. We suggest that the technology exists to render format string vulnerabilities extinct in the near future.
Categories and Subject Descriptors D.4.6 [Operating Systems]: Security and Protection—Invasive Software;
General Terms: Security, Languages;
Keywords: Format string vulnerability, Large-scale analysis, Type-qualifier inference
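For readers who have not met this class of bug, here is a minimal sketch of the pattern the abstract describes. It is an illustration only, not code from the paper, and `render_message` is a hypothetical name. The defect is passing untrusted text as the format argument; the fix is a constant format string with the text passed as data.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Unsafe pattern, left commented out on purpose: the untrusted text is
// used as the format, so directives such as %n or %s inside it would be
// interpreted by the formatting function.
//   std::snprintf(buf, sizeof buf, user_input.c_str());

// Safer pattern: the format is a constant and the untrusted text is
// only ever treated as data.
std::string render_message(const std::string& user_input) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%s", user_input.c_str());
    return std::string(buf);  // truncated to 63 characters if too long
}
```

A taint analysis of the kind described marks `user_input` as untrusted and warns whenever such a value can reach the format parameter.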
Static analysis that finds real bugs is worth it regardless of whether it's C++ or not. Some tend to be quite noisy, but if they can catch subtle bugs like signed/unsigned comparisons causing optimizations that break your code or out of bounds array accesses, they are definitely worth the effort.
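A minimal sketch of the signed/unsigned comparison problem, with hypothetical function names. To keep the example free of compiler warnings, the first function writes out explicitly the conversion that the compiler applies implicitly when a `std::size_t` is compared with an `int`. GCC and Clang can report the implicit form via `-Wsign-compare`.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors what `count < limit` does when count is std::size_t and limit
// is int: limit is converted to the unsigned type first, so a negative
// limit turns into a very large value.
bool below_limit_implicit(std::size_t count, int limit) {
    return count < static_cast<std::size_t>(limit);
}

// Fixed version: handle the negative case before converting.
bool below_limit_checked(std::size_t count, int limit) {
    if (limit < 0) {
        return false;  // nothing can be below a negative limit
    }
    return count < static_cast<std::size_t>(limit);
}

// Small self-check: the two versions disagree only for negative limits.
void check_below_limit() {
    assert(below_limit_implicit(5, -1));   // surprising result
    assert(!below_limit_checked(5, -1));   // intended result
    assert(below_limit_checked(5, 10));
}
```

When such a comparison guards an array index, the surprising result is exactly the kind of out-of-bounds access mentioned above.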
At a former employer we had Insure++.
It helped to pinpoint random behaviour (use of uninitialized stuff) which Valgrind could not find. But most important: it helped to remove mistakes which were not known as errors yet.
Insure++ is good, but pricey, that's why we bought one user license only.