Why aren't Clojure namespaces in MVCC? - clojure

There's a lot of discussion on how best to handle namespaces. Stuart Sierra enlightens us all about this in his Lifecycle Composition post and his work on clojure.tools.namespace.
Most of the complexity comes from the mutability of namespaces; so why don't we put the namespaces in Clojure's own MVCC? There must be a reason why, but I cannot figure it out myself.

One practical reason is that namespaces are used to build the MVCC itself, which makes it harder, though not impossible, to use MVCC in building namespaces. The other reason lies in the way programmers modify them. While the contents of refs are typically modified by manipulating data as the program runs, the contents of vars in a namespace are almost always modified during program development, where the vast majority of the time the programmer wants the change to be seen system-wide and immediately.
It's worth noting that there are alternatives to namespaces for storing your functions in cases where you need coordinated updates of multiple functions. It is reasonable, if you need atomic upgrades, to store your functions in a ref and upgrade your program using dosync. This way you can have MVCC semantics for the functions that need them, and keep the update-in-place semantics of namespaces everywhere that does not need coordinated upgrades.

Code loading, by its very nature, is an operation that can be completely arbitrary - loading a namespace can trigger all sorts of side effects, so it isn't suitable to be put into transactions.
Anyway, I think some improvements could be made to the current loading mechanism: for example, within a call to ns, require, etc, all current vars could be listed, as well as all the vars that are being added because of the call; if the call to ns/require fails, all the vars in the latter list would be removed.
Note that such an approach would require serializing calls to ns/require, as opposed to the current mechanism, which is concurrent. I don't think there is a strong case for concurrent loading anyway.

Related

What is C++ global variables usage risk vs. C? [duplicate]

In C/C++, are global variables as bad as my professor thinks they are?
The problem with global variables is that since every function has access to these, it becomes increasingly hard to figure out which functions actually read and write these variables.
To understand how the application works, you pretty much have to take into account every function which modifies the global state. That can be done, but as the application grows it will get harder to the point of being virtually impossible (or at least a complete waste of time).
If you don't rely on global variables, you can pass state around between different functions as needed. That way you stand a much better chance of understanding what each function does, as you don't need to take the global state into account.
The important thing is to remember the overall goal: clarity
The "no global variables" rule is there because most of the time, global variables make the meaning of code less clear.
However, like many rules, people remember the rule, and not what the rule was intended to do.
I've seen programs that seem to double the size of the code by passing an enormous number of parameters around simply to avoid the evil of global variables. In the end, using globals would have made the program clearer to those reading it. By mindlessly adhering to the word of the rule, the original programmer had failed the intent of the rule.
So, yes, globals are often bad. But if you feel that in the end, the intent of the programmer is made clearer by the use of global variables, then go ahead. However, remember the drop in clarity that automatically ensues when you force someone to access a second piece of code (the globals) to understand how the first piece works.
My professor used to say something like: using global variables is okay if you use them correctly. I don't think I ever got good at using them correctly, so I rarely used them at all.
The problem that global variables create for the programmer is that they expand the inter-component coupling surface between the various components that use them. What this means is that as the number of components using a global variable increases, the complexity of the interactions can also increase. This increased coupling usually makes it easier to inject defects into the system when making changes, and also makes defects harder to diagnose and correct. It can also reduce the number of available options when making changes, and it can increase the effort required for changes, since one must often trace through the various modules that also use the global variable in order to determine the consequences of a change.
The purpose of encapsulation, which is basically the opposite of using global variables, is to decrease coupling in order to make understanding and changing the source easier, safer, and more easily tested. It is much easier to use unit testing when global variables are not used.
For example, if you have a simple global integer variable that is used as an enumerated indicator which various components treat as a state machine, and you then make a change by adding a new state for a new component, you must trace through all the other components to ensure that the change will not affect them. An example of a possible problem: suppose a switch statement testing the value of the enumerated global, with case statements for each of the current values, is used in various places, and some of those switch statements do not have a default case to handle an unexpected value for the global. Suddenly you have undefined behavior as far as the application is concerned.
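A minimal sketch of that failure mode (the state names and function are hypothetical, and a fallback return is added so the fragment itself stays well defined):
// Hypothetical illustration of the problem described above.
enum SystemState { STATE_IDLE, STATE_RUNNING, STATE_STOPPED };

SystemState g_state = STATE_IDLE;    // global used by several components

const char* describe_state()
{
    switch (g_state)
    {
    case STATE_IDLE:    return "idle";
    case STATE_RUNNING: return "running";
    case STATE_STOPPED: return "stopped";
    }
    // Only reached after someone adds, say, STATE_PAUSED for a new component.
    // In the situation described above there was no such fallback, so the
    // behaviour for the unexpected value was effectively unhandled.
    return "unhandled";
}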
On the other hand, a shared data area might be used to contain a set of global parameters that are referenced throughout the application. This approach is often used in embedded applications with small memory footprints.
When using global variables in these sorts of applications, the responsibility for writing to the data area is typically allocated to a single component, and all other components see the area as const: they read from it and never write to it. Taking this approach limits the problems that can develop.
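As a rough sketch of that single-writer arrangement (the file split is indicated with comments; all names are invented):
// shared_config.h -- every component includes this; it exposes only a read-only view
struct SharedConfig {
    int sample_rate_hz;
    int log_level;
};
const SharedConfig* shared_config();           // readers get a const view

// shared_config.cpp -- the single component responsible for writing
static SharedConfig g_config = { 48000, 1 };   // file scope: hidden from other files
const SharedConfig* shared_config() { return &g_config; }

void shared_config_set_log_level(int level)    // the one writer entry point
{
    g_config.log_level = level;
}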
A few problems with global variables that need to be worked around:
When the source for a global variable such as a struct is modified, everything using it must be recompiled so that everything using the variable knows its true size and memory layout.
If more than one component can modify the global variable, you can run into problems with inconsistent data in the global variable. With a multi-threading application, you will probably need to add some kind of locking or critical region so that only one thread at a time can modify the global variable, and so that when a thread is modifying the variable, all changes are complete and committed before other threads can query or modify it.
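In C++, that locking might look something like this sketch (the data and names are invented):
#include <mutex>

struct FlightData {
    double altitude_m;
    double speed_mps;
};

static FlightData g_flight;          // shared global state
static std::mutex g_flight_mutex;    // guards every read and write of g_flight

void update_flight(double altitude_m, double speed_mps)
{
    std::lock_guard<std::mutex> lock(g_flight_mutex);
    // Both fields are written under one lock, so readers never observe
    // a half-updated (inconsistent) pair of values.
    g_flight.altitude_m = altitude_m;
    g_flight.speed_mps  = speed_mps;
}

FlightData read_flight()
{
    std::lock_guard<std::mutex> lock(g_flight_mutex);
    return g_flight;                 // copy out while holding the lock
}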
Debugging a multi-threaded application that uses a global variable can be more difficult. You can run into race conditions that can create defects that are difficult to replicate. With several components communicating through a global variable, especially in a multi-threaded application, being able to know what component is changing the variable when and how it is changing the variable can be very difficult to understand.
Name clashes can be a problem when using global variables. A local variable that has the same name as a global variable can hide the global variable. You also run into the naming-convention issue when using the C programming language. A workaround is to divide the system into sub-systems, with the global variables for a particular sub-system all beginning with the same first three letters (see this on resolving name space collisions in Objective-C). C++ provides namespaces, and in C you can work around this by creating a globally visible struct whose members are pointers to the various data items and functions, which are defined in a file as static, hence with file visibility only, so that they can only be referenced through the globally visible struct.
In some cases the original application intent changes so that global variables which provided the state for a single thread must be modified to allow several duplicate threads to run. An example would be a simple application designed for a single user, using global variables for state, when a request comes down from management to add a REST interface to allow remote applications to act as virtual users. Now you have to duplicate the global variables and their state information so that the single user, as well as each of the virtual users from remote applications, has its own unique set of global variables.
Using C++ namespace and the struct Technique for C
For the C++ programming language the namespace keyword is a huge help in reducing the chances of a name clash. namespace, along with class and the various access keywords (private, protected, and public), provides most of the tools you need to encapsulate variables. However, the C programming language doesn't provide this. This Stack Overflow posting, Namespaces in C, provides some techniques for C.
A useful technique is to have a single memory resident data area that is defined as a struct which has global visibility and within this struct are pointers to the various global variables and functions that are being exposed. The actual definitions of the global variables are given file scope using the static keyword. If you then use the const keyword to indicate which are read only, the compiler can help you to enforce read only access.
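A bare-bones sketch of that technique (identifiers are illustrative only; the code is C-compatible):
/* counter.c -- the only file that can touch the real data */
static int counter_value = 0;              /* file scope: hidden from other files */

static void counter_increment(void) { ++counter_value; }
static int  counter_get(void)       { return counter_value; }

/* The single globally visible symbol: a struct of pointers. */
struct CounterApi {
    void (*increment)(void);
    int  (*get)(void);
    const int *read_only_value;            /* const view; writes won't compile */
};

const struct CounterApi Counter = { counter_increment, counter_get, &counter_value };

/* counter.h would declare:  extern const struct CounterApi Counter;
   and callers would write:  Counter.increment();  int n = Counter.get();  */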
Using the struct technique can also encapsulate the global so that it becomes a kind of package or component that happens to be a global. By having a component of this kind it becomes easier to manage changes that affect the global and the functionality using the global.
However, while namespaces or the struct technique can help manage name clashes, the underlying problems of inter-component coupling introduced by the use of globals, especially in a modern multi-threaded application, still exist.
Global variables should only be used when you have no alternative. And yes, that includes Singletons. 90% of the time, global variables are introduced to save the cost of passing around a parameter. And then multithreading/unit testing/maintenance coding happens, and you have a problem.
So yes, in 90% of situations global variables are bad. The exceptions are not likely to be seen by you in your college years. One exception I can think of off the top of my head is dealing with inherently global objects such as interrupt tables. Things like DB connections seem to be global, but aren't.
Global variables are as bad as you make them, no less.
If you are creating a fully encapsulated program, you can use globals. It's a "sin" to use globals, but programming sins are largely philosophical.
If you check out L.in.oleum, you will see a language whose variables are solely global. It's unscalable because libraries all have no choice but to use globals.
That said, if you have choices, and can ignore programmer philosophy, globals aren't all that bad.
Neither are Gotos, if you use them right.
The big "bad" problem is that, if you use them wrong, people scream, the mars lander crashes, and the world blows up....or something like that.
Yes, but you don't incur the cost of global variables until you stop working in the code that uses global variables and start writing something else that uses the code that uses global variables. But the cost is still there.
In other words, it's a long term indirect cost and as such most people think it's not bad.
If it's possible your code will end up under intensive review during a Supreme Court trial, then you want to make sure to avoid global variables.
See this article:
Buggy breathalyzer code reflects importance of source review
There were some problems with the style of the code that were identified by both studies. One of the stylistic issues that concerned the reviewers was the extensive use of unprotected global variables. This is considered poor form because it increases the risk that the program state will become inconsistent or that values will be inadvertently modified or overwritten. The researchers also expressed some concern about the fact that decimal precision is not maintained consistently throughout the code.
Man, I bet those developers are wishing they hadn't used global variables!
The issue is less that they're bad, and more that they're dangerous. They have their own set of pros and cons, and there are situations where they're either the most efficient or only way to achieve a particular task. However, they're very easy to misuse, even if you take steps to always use them properly.
A few pros:
Can be accessed from any function.
Can be accessed from multiple threads.
Will never go out of scope until the program ends.
A few cons:
Can be accessed from any function, without needing to be explicitly dragged in as a parameter and/or documented.
Not thread-safe.
Pollutes the global namespace and potentially causes name collisions, unless measures are taken to prevent this.
Note, if you will, that the first two pros and the first two cons I listed are the exact same thing, just with different wording. This is because the features of a global variable can indeed be useful, but the very features that make them useful are the source of all their problems.
A few potential solutions to some of the problems:
Consider whether they're actually the best or most efficient solution for the problem. If there are any better solutions, use those instead.
Put them in a namespace [C++] or singleton struct [C, C++] with a unique name (a good example would be Globals or GlobalVars), or use a standardised naming convention for global variables (such as global_[name] or g_module_varNameStyle (as mentioned by underscore_d in the comments)); a short sketch of the namespace approach is given after this list. This will both document their use (you can find code that uses global variables by searching for the namespace/struct name) and minimise the impact on the global namespace.
For any function that accesses global variables, explicitly document which variables it reads and which it writes. This will make troubleshooting easier.
Put them in their own source file and declare them extern in the associated header, so their use can be limited to compilation units that need to access them. If your code relies on a lot of global variables, but each compilation unit only needs access to a handful of them, you could consider sorting them into multiple source files, so it's easier to limit each file's access to global variables.
Set up a mechanism to lock and unlock them, and/or design your code so that as few functions as possible need to actually modify global variables. Reading them is a lot safer than writing them, although thread races may still cause problems in multithreaded programs.
Basically, minimise access to them, and maximise name uniqueness. You want to avoid name collisions and have as few functions as possible that can potentially modify any given variable.
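As a concrete illustration of the namespace and header/source suggestions above, here is a minimal sketch (the Globals name and the variables in it are invented, and the file split is shown with comments):
// globals.h -- declares the globals; include it only where they are needed
#pragma once

namespace Globals {
    extern int  g_frame_count;       // declared here, defined in exactly one .cpp
    extern bool g_verbose_logging;
}

// globals.cpp -- the single definition point
// #include "globals.h"
namespace Globals {
    int  g_frame_count     = 0;
    bool g_verbose_logging = false;
}

// Any other translation unit that includes globals.h can then write:
//     ++Globals::g_frame_count;
// and a project-wide search for "Globals::" finds every use site.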
Whether they're good or bad depends on how you use them. The majority tend to use them badly, hence the general wariness towards them. If used properly, they can be a major boon; if used poorly, however, they can and will come back to bite you when and how you least expect it.
A good way to look at it is that they themselves aren't bad, but they enable bad design, and can multiply the effects of bad design exponentially.
Even if you don't intend to use them, it is better to know how to use them safely and choose not to, than not to use them because you don't know how to use them safely. If you ever find yourself in a situation where you need to maintain pre-existing code that relies on global variables, you may be in for difficulty if you don't know how to use them properly.
I'd answer this question with another question: do you use singletons / are singletons bad?
Because (almost all) singleton usage is a glorified global variable.
As someone said (I'm paraphrasing) in another thread "Rules like this should not be broken, until you fully understand the consequences of doing so."
There are times when global variables are necessary, or at least very helpful (Working with system defined call-backs for example). On the other hand, they're also very dangerous for all of the reasons you've been told.
There are many aspects of programming that should probably be left to the experts. Sometimes you NEED a very sharp knife. But you don't get to use one until you're ready...
Global variables are generally bad, especially if other people are working on the same code and don't want to spend 20mins searching for all the places the variable is referenced. And adding threads that modify the variables brings in a whole new level of headaches.
Global constants in an anonymous namespace used in a single translation unit are fine and ubiquitous in professional apps and libraries. But if the data is mutable, and/or it has to be shared between multiple TUs, you may want to encapsulate it--if not for design's sake, then for the sake of anybody debugging or working with your code.
Using global variables is kind of like sweeping dirt under a rug. It's a quick fix, and a lot easier in the short term than getting a dust-pan or vacuum to clean it up. However, if you ever end up moving the rug later, you're gonna have a big surprise mess underneath.
I think your professor is trying to stop a bad habit before it even starts.
Global variables have their place, and like many people said, knowing where and when to use them can be complicated. So I think rather than get into the nitty-gritty of the why, how, when, and where of global variables, your professor decided to just ban them. Who knows, he might un-ban them in the future.
Global variables are bad, if they allow you to manipulate aspects of a program that should be only modified locally. In OOP globals often conflict with the encapsulation-idea.
Absolutely not. Misusing them though... that is bad.
Mindlessly removing them for the sake of it is just that... mindless. Unless you know the advantages and disadvantages, it is best to steer clear and do as you have been taught/learned, but there is nothing inherently wrong with global variables. When you understand the pros and cons better, make your own decision.
No they are not bad at all. You need to look at the (machine) code produced by the compiler to make this determination, sometimes it is far far worse to use a local than a global. Also note that putting "static" on a local variable is basically making it a global (and creates other ugly problems that a real global would solve). "local globals" are particularly bad.
Globals give you clean control over your memory usage as well, something far more difficult to do with locals. These days that only matters in embedded environments where memory is quite limited. Something to know before you assume that embedded is the same as other environments and assume the programming rules are the same across the board.
It is good that you question the rules being taught; most of them are not there for the reasons you are being told. The most important lesson, though, is not that this is a rule to carry with you forever, but that it is a rule you are required to honor in order to pass this class and move forward. In life you will find that at company XYZ you will have other programming rules that you will, in the end, have to honor in order to keep getting a paycheck. In both situations you can argue the rule, but I think you will have far better luck at a job than at school. You are just another of many students and your seat will soon be filled by someone else; the professors won't be replaced. At a job you are one of a small team of players that has to see the product through to the end, and in that environment the rules are developed for the benefit of the team members as well as the product and the company. So if everyone is like-minded, or if for the particular product there is a good engineering reason to violate something you learned in college or in some book on generic programming, then sell your idea to the team and write it down as a valid, if not the preferred, method. Everything is fair game in the real world.
If you follow all of the programming rules taught to you in school or in books, your programming career will be extremely limited. You can likely survive and have a fruitful career, but the breadth and width of the environments available to you will be extremely limited. If you know how and why the rule is there and can defend it, that's good; if your only reason is "because my teacher said so", well, that's not so good.
Note that topics like this are often argued in the workplace and will continue to be; as compilers and processors (and languages) evolve, so do these kinds of rules, and without defending your position, and occasionally being taught a lesson by someone with another opinion, you won't move forward.
In the meantime, just do whatever the one who speaks the loudest or carries the biggest stick says (until such time as you are the one who yells the loudest and carries the biggest stick).
I would like to argue against the point being made throughout this thread that it makes multi-threading harder or impossible per se. Global variables are shared state, but the alternatives to globals (e. g. passing pointers around) might also share state. The problem with multi-threading is how to properly use shared state, not whether that state happens to be shared through a global variable or something else.
Most of the time when you do multi-threading you need to share something. In a producer-consumer pattern for example, you might share some thread-safe queue that contains the work units. And you are allowed to share it because that data structure is thread-safe. Whether that queue is global or not is completely irrelevant when it comes to thread-safety.
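For illustration, a minimal thread-safe queue along those lines might look like the following sketch; whether the one instance is a global or is handed to each thread makes no difference to its safety:
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class ThreadSafeQueue {
public:
    void push(T value)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }                       // release the lock before waking a consumer
        not_empty_.notify_one();
    }

    T pop()                     // blocks until an item is available
    {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        T value = std::move(items_.front());
        items_.pop();
        return value;
    }

private:
    std::queue<T>           items_;
    std::mutex              mutex_;
    std::condition_variable not_empty_;
};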
The implied hope expressed throughout this thread that transforming a program from single-threaded to multi-threaded will be easier when not using globals is naive. Yes, globals make it easier to shoot yourself in the foot, but there's a lot of ways to shoot yourself.
I'm not advocating globals, as the other points still stand, my point is merely that the number of threads in a program has nothing to do with variable scope.
Yes, because if you let incompetent programmers use them (read 90%, especially scientists) you end up with 600+ global variables spread over 20+ files and a project of 12,000 lines where 80% of the functions take void, return void, and operate entirely on global state.
It quickly becomes impossible to understand what is going on at any one point unless you know the entire project.
Global variables are fine in small programs, but horrible if used the same way in large ones.
This means that you can easily get in the habit of using them while learning. This is what your professor is trying to protect you from.
When you are more experienced it will be easier to learn when they are okay.
Globals are good when it comes to configuration, when we want our configuration/changes to have a global impact on the entire project.
So we can change one configuration and the changes are applied to the entire project. But I must warn you, you have to be very smart to use globals.
Use of global variables actually depends on the requirements. Their advantage is that they reduce the overhead of passing the same values around repeatedly.
But your professor is right, because it raises security issues, so the use of global variables should be avoided as much as possible. Global variables also create problems which are sometimes difficult to debug.
For example:
Situations where a variable's value is modified at runtime. At that moment it is difficult to identify which part of the code is modifying it and under what conditions.
At the end of the day, your program or app can still work, but it's a matter of being tidy and having a complete understanding of what's going on. If you share a variable value among all functions, it may become difficult to trace which function is changing the value (if the function does so), and it makes debugging a million times harder.
Sooner or later you will need to change how that variable is set or what happens when it is accessed, or you just need to hunt down where it is changed.
It is practically always better not to have global variables. Just write the damn get and set methods, and be glad you have them a day, week or month later when you need them.
I usually use globals for values that are rarely changed, like singletons or function pointers to functions in a dynamically loaded library. Using mutable globals in multithreaded applications tends to lead to hard-to-track bugs, so I try to avoid this as a general rule.
Using a global instead of passing an argument is often faster but if you're writing a multithreaded application, which you often do nowadays, it generally doesn't work very well (you can use thread-statics but then the performance gain is questionable).
In web applications within an enterprise, globals can be used for holding session/window/thread/user-specific data on the server, for reasons of optimization and to protect against loss of work where connections are unstable. As mentioned, race conditions need to be handled. We use a single instance of a class for this information, and it is carefully managed.
Less security means anyone can manipulate variables that are declared global. To explain, take this example: if you have balance as a global variable in your bank program, the user function can manipulate it just as the bank officer can, and that is a problem. The user should only be given read-only access and a withdraw function, while the bank clerk can add the amount when the user personally hands over cash at the desk. That is the way it works.
In a multi-threaded application, use local variables in place of global variables to avoid a race condition.
A race condition occurs when multiple threads access a shared resource, with at least one thread having write access to the data. Then the result of the program is not predictable, and depends on the order of accesses to the data by the different threads.
More on this here, https://software.intel.com/en-us/articles/use-intel-parallel-inspector-to-find-race-conditions-in-openmp-based-multithreaded-code
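A small sketch of the point (the counter, loop counts and names are arbitrary): two worker threads accumulate into their own local counters, which are combined after joining, instead of both incrementing a shared global:
#include <functional>
#include <thread>

int g_hits = 0;   // a shared global: two threads doing ++g_hits directly would race

void count_into(long& hits)            // race-free alternative: a per-thread counter
{
    for (int i = 0; i < 100000; ++i)
        ++hits;
}

int main()
{
    long hits_a = 0;
    long hits_b = 0;                   // one local accumulator per thread
    std::thread a(count_into, std::ref(hits_a));
    std::thread b(count_into, std::ref(hits_b));
    a.join();
    b.join();
    long total = hits_a + hits_b;      // reliably 200000; no lock needed
    (void)total;
    (void)g_hits;                      // left unused: the global is the "don't" case
    return 0;
}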

Removing dependencies from statechart framework

I've got lots of problems with the project I am currently working on. The project is more than 10 years old, and it was based on one of those commercial C++ frameworks which were very popular in the 90's. The problem is with statecharts. The framework provides a fairly common implementation of the state pattern. Each state is a separate class, with action on entry, action in state, etc. There is a switch which sets the current state according to received events.
The devil is hidden in the details. The project is enormous - about 2000 KLOC. There are definitely too many statecharts (I've seen "for" loops implemented using statecharts). What's more, the framework allows embedding a statechart in another statechart, so there are many statecharts with seven or even more levels of nesting. Because statecharts run in different threads, and it's possible to send events between statecharts, we have lots of synchronization problems (and a big mess in the interfaces).
I must admit that the scale of this problem is overwhelming and I don't know how to approach it. My first idea was to remove as much code as I can from the statecharts and put it into separate classes, then have the statecharts delegate to those classes to do the work. But as a result we would have many separate functions which logically don't have any specific functionality, and any change in the statechart architecture would also require changing those classes and functions.
So I am asking for help:
Do you know any books/articles/magic artefacts which can help me fix this? I would like to at least separate as much code as I can from the statecharts without introducing any hidden dependencies, and keep the separated code maintainable, testable and reusable.
If you have any suggestion how to handle this, please let me know.
The statechart pattern is intended to be used specifically to remove switch statements, so this sounds like a horrid abuse. Additionally, states should only change on asynchronous events. If you are processing an event and you change through multiple states (or for loop, etc.), then this is also a horrid abuse of the pattern.
I would start from these two points, as fixing them up will solve many of your concurrency issues. What you need to determine is:
What are your external, asynchronous events to the system? These are the only things that should be determining state transitions, not things that happen during event processing. An event may cause 0 or 1 state transitions. Once you have a list of these state transitions, you can reconstruct the actual states of your system. If you are aware of UML State diagrams, this would be a perfect time to sketch one up in a charting program, not just for yourself (though it will help you immensely), but also for everyone in the future that has to return to the project. As you have learned, this happens.
Now that you know what are really states, list what are states in the code that shouldn't be. This usually indicates that something can be "functionally decomposed". Instead of a state object for each of these, likely all that is needed is a separate function. This will cut down on a lot of the overhead of state objects and should clean up the code immensely.
Now it's time to tackle those horrendous switch statements you mentioned. If they were truly based on state, you shouldn't need one at all. Instead, you should be able to call the state machine directly.
Something like:
myStateMachine->myEvent();
and it should work without any switch. But notice, this may be the case even for some of those objects that don't work across asynchronous events. This is also an indication of where you may just use inheritance to get the same effect. If you have:
switch (someTypeIdentifier)
{
case type1:
    doSomething();
    break;
case type2:
    doSomethingElse();
    break;
}
usually the correct OOP approach is to create two actual types, Type1 and Type2, both derived from an abstract base TypeBase, with a virtual method doSomething() that does what you need. The reason this is useful is that it means you can "close" the handling (in the sense of the Open/Closed Principle), and still extend the functionality by adding new derived types as needed (leaving it open to extension). This saves bugs like crazy because it gets developers' hands out of those switch statements, which can get quite ugly and convoluted, and instead encapsulates each separate behavior in a separate class.
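Sketched out, using the hypothetical TypeBase, Type1 and Type2 names from the paragraph above:
class TypeBase {
public:
    virtual ~TypeBase() = default;
    virtual void doSomething() = 0;      // each type supplies its own behaviour
};

class Type1 : public TypeBase {
public:
    void doSomething() override { /* what the old "case type1:" branch did */ }
};

class Type2 : public TypeBase {
public:
    void doSomething() override { /* what the old "case type2:" branch did */ }
};

void handle(TypeBase& object)
{
    object.doSomething();                // no switch: dispatch is via the vtable
}

// Adding a Type3 later means adding one new class, not editing every switch.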
Now look to fix up your thread issues. Identify all objects used from multiple threads. Make a list. Now, how are these used? Are some of them always used together? Start making groups. The goal here is to find the level of encapsulation that best works for these objects, separate the objects into individual classes that control their own synchronisation, figure out the atomic level of actual "transactions" for the objects, and make methods of the classes that expose those meaningful transactions, wrapped behind the scenes with the appropriate mutexes, condition variables, etc.
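As a sketch of that last step, assuming a hypothetical inventory object shared by several threads:
#include <mutex>
#include <string>
#include <unordered_map>

// One shared object owns its own lock and exposes whole "transactions",
// so callers never juggle mutexes themselves.
class Inventory {
public:
    void add(const std::string& item, int count)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        stock_[item] += count;
    }

    // A meaningful transaction: check and decrement atomically.
    bool tryTake(const std::string& item, int count)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = stock_.find(item);
        if (it == stock_.end() || it->second < count)
            return false;
        it->second -= count;
        return true;
    }

private:
    std::unordered_map<std::string, int> stock_;
    std::mutex mutex_;
};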
You might be saying "that sounds like a lot of work! Why do all that instead of just writing it all over myself?" Good question! :) The reason is actually straightforward: if you are going to do it all by yourself, those are the steps you should be doing anyway. You should be identifying your states, your dynamic polymorphism, and getting a handle on the multithreaded transactions. But, if you start with the existing code, you also have all of those unspoken business rules that were never documented and may cause all sorts of unexpected bugs down the line. You don't have to bring everything over - if you suspect it's a bug, discuss the logic with the people who have worked with the system in the past (if available), QA, or whoever might identify bugs, and see if it really should be carried over. But you need to actually evaluate what the bugs are either way, or you may not code something that actually needed coding.
In the end, this is a manual process that is a part of software engineering. There are CASE tools that can help draw up the state diagrams and even publish them to code, there are refactoring tools, like those found in many IDEs, that can help move code between functions and classes, and similar tools which can help identify threading needs. However, those things shouldn't be picked up for a single project. They need to be learned throughout your career, picking them up and learning them more deeply over years of work, as they are a part of being a software engineer. They don't do it for you. You still need to know the whys and hows, and they just help get it done more efficiently.
Statecharts (including nested Statecharts) are a powerful way to specify, understand and even simulate/validate complex control flow. But to gain the benefit, you need the statechart model in a suitable tool (I used Statemate way back in the day, not sure if it's still available), plus a reliable mapping from the chart to the code (Statemate used to generate the code) - then you can forget about the state management code (mostly)! In your situation, if you don't have the model, I would try to reverse one from the code - as Ira says, chances are high that the original developers had a model in some form, and you may find the code making a lot of sense as the model emerges. If this works out, you will have a really good spec/model of the code which should make future code edits much easier (even if you don't want to go to automatic code generation, and maintain the code/model mapping manually (but you'll need to be meticulous!!))
Sounds to me like your best bet is (gulp!) likely to start from scratch if it's as horrifically broken as you make out. Is there any documentation? Could you begin to build some saner software based on the docs?
If a complete re-write isn't an option (and they never are in my experience) I'd try some of the following:
If you don't already have it, draw an architectural picture of the whole system. Sketch out how all the bits are supposed to work together and that will help you break the system down into potentially manageable / testable parts.
Do you have any kind of requirements or testing plan in place? If not, can you write one and start to put unit tests in place for the various chunks of code / functionality which exist already? If you can do that, you can start to refactor things without breaking as much of whatever does currently work.
Once you've broken things down a bit, start building your unit tests into integration tests which pull together more of the functionality.
I've not read them myself, but I've heard good things about these books which may have some advice you can use:
Refactoring: Improving the Design of Existing Code (Object Technology Series).
Working Effectively with Legacy Code (Michael Feathers)
Good luck! :-)

Do global variables mean faster code?

I read recently, in an article on game programming written in 1996, that using global variables is faster than passing parameters.
Was this ever true, and if so, is this still true today?
Short answer - No, good programmers make code go faster by knowing and using the appropriate tools for the job, and then optimizing in a methodical way where their code does not meet their requirements.
Longer answer - This article, which in my opinion is not especially well-written, is not in any case general advice on program speedup but '15 ways to do faster blits'. Extrapolating this to the general case is missing the writer's point, whatever you think of the merits of the article.
If I was looking for performance advice, I would place zero credence in an article that does not identify or show a single concrete code change to support the assertions in the sample code, and without suggesting that measuring the code might be a good idea. If you are not going to show how to make the code better, why include it?
Some of the advice is years out of date - FAR pointers stopped being an issue on the PC a long time ago.
A serious game developer (or any other professional programmer, for that matter) would have a good laugh about advice like this:
You can either take out the assert's completely, or you can just add a #define NDEBUG when you compile the final version.
My advice to you, if you really wish to evaluate the merit of any of these 15 tips, and since the article is 14 years old, would be to compile the code in a modern compiler (Visual C++ 10 say) and try to identify any area where using a global variable (or any of the other tips) would make it faster.
[Just joking - my real advice would be to ignore this article completely and ask specific performance questions on Stack Overflow as you hit issues in your work that you cannot resolve. That way the answers you get will be peer reviewed, supported by example code or good external evidence, and current.]
When you switch from parameters to global variables, one of three things can happen:
it runs faster
it runs the same
it runs slower
You will have to measure performance to see what's faster in a non-trivial concrete case. This was true in 1996, is true today and is true tomorrow.
Leaving the performance aside for a moment, global variables in a large project introduce dependencies which almost always make maintenance and testing much harder.
When trying to find legitimate uses of global variables for performance reasons today, I very much agree with the examples in Preet's answer: very frequently needed variables in microcontroller programs or device drivers. The extreme case is a processor register which is exclusively dedicated to the global variable.
When reasoning about the performance of global variables versus parameter passing, the way the compiler implements them is relevant. Global variables typically are stored at fixed locations. Sometimes the compiler generates direct addressing to access the globals. Sometimes however, the compiler uses one more indirection and uses a kind of symbol table for globals. IIRC gcc for AIX did this 15 years ago. In this environment, globals of small types were always slower than locals and parameter passing.
On the other hand, a compiler can pass parameters by pushing them on the stack, by passing them in registers or a mixture of both.
Everyone has already given the appropriate caveat answers about this being platform and program specific, needing to actually measure timings, etc. So, with that all said already, let me answer your question directly for the specific case of game programming on x86 and PowerPC.
In 1996, there were certain cases where pushing parameters onto the stack took extra instructions and could cause a brief stall inside the Intel CPU pipeline. In those cases there could be a very small speedup from avoiding parameter passing altogether and reading data from literal addresses.
This isn't true any more on the x86 or on the PowerPC used in most game consoles. Using globals is usually slower than passing parameters for two reasons:
Parameter passing is implemented better now. Modern CPUs pass their parameters in registers, so reading a value from a function's parameter list is faster than a memory load operation. The x86 uses register shadowing and store forwarding, so what looks like shuffling data onto the stack and back can actually be a simple register move.
Data cache latency far outweighs CPU clock speed in most performance considerations. The stack, being heavily used, is almost always in cache. Loading from an arbitrary global address can cause a cache miss, which is a huge penalty as the memory controller has to go and fetch the data from main RAM. ("Huge" here is 600 cycles or more.)
What do you mean, "faster"?
I know for a fact, that understanding a program with global variables takes me a whole lot more time than one without.
If the extra time it takes the programmer(s) is less than the time gained by the users when they run the program with globals, then I'd say using global is faster.
But consider that the program is going to be run by 10 people once a day for 2 years, and that it takes 2.84632 secs without globals and 2.84217 secs with globals (a 0.00415 sec saving). Over those roughly 7,300 runs that's only about 30 seconds less of TOTAL runtime. Gaining half a minute of run time is not worth the introduction of a global as regards programmer time.
To a degree any code that avoids processor instructions (ie shorter code) will be faster. However how much faster? Not very! Also note that compiler optimisation strategies may result in the smaller code anyway.
These days this is only an optimisation on very specific applications usually in ultra time critical drivers or micro-control code.
Putting aside the issues of maintainability and correctness, there are basically two factors that will govern performance with regard to globals vs. parameters.
When you make a global you avoid a copy. That's slightly faster. When you pass a parameter by value, it has to be copied so that a function can work on a local copy of it and not damage the caller's copy of the data. At least in theory. Some modern optimizers do pretty tricky things if they identify that your code is well behaved. A function may get automatically inlined, and the compiler may notice that the function doesn't do anything to the parameters, and just optimise away any copying.
When you make a global, you are lying to the cache. When you have all of your variables neatly contained in your function, and a few parameters, the data will tend to all be in one place. Some of the variables will be in registers, and some will probably be in cache right away because they are right 'next to' each other. Using a lot of global variables is basically pathological behavior for the cache. There is no guarantee that various globals will be used by the same functions. Location has no obvious correlation with usage. Perhaps you have a small enough working set that it makes no difference where anything is, and it all winds up in cache.
All of this just adds up to the point made by a poster above me:
When you switch from parameters to global variables, one of three things can happen:
* it runs faster
* it runs the same
* it runs slower
You will have to measure performance to see what's faster in a non-trivial concrete case. This was true in 1996, is true today and is true tomorrow.
Depending on the specific behavior of your exact compiler, and precise details of the hardware that you use to run your code, it's possible that global variables could be a very slight performance win in some cases. That possibility may be worth trying it on some code that runs too slow as an experiment. It's probably not worth dedicating yourself to, as the answer of your experiment could change tomorrow. So, the right answer is almost always to go with "correct" design patterns and avoid the uglier design. Look for better algorithms, more efficient data structures, etc., before intentionally trying to spaghettify your project. Much better payoff in the long run.
And, aside from the dev time vs. user time argument, I'll add the dev time vs. Moore's time argument. If you assume Moore's law will make computers something like half again as fast every year, then for the sake of a simple round number we can assume that progress happens at a steady 1% per week. If you are looking at a micro-optimisation that may improve things by about 1%, and it will add a week to the project by complicating things, then just taking the week off will have the same effect on average run times for your users.
Perhaps a micro optimisation, and would probably be wiped out by optimisations your compiler could generate without resort to such practices. In fact the use of globals may even inhibit some compiler optimisations. Reliable and maintainable code would generally be of greater value, and globals are not conducive to that.
Using globals to replace function parameters renders all such functions non-reentrant, which may be a problem if multi-threading is used - not a common practice in game development in 1996, but more common with the advent of multi-core processors. It also precludes recursion, although that is probably less of an issue since recursion has its own issues.
In any significant body of code, there is likely to be more mileage in higher-level optimisation of algorithms and data structures. Moreover there are options open to you other than global variables that avoid parameter passing, most especially C++ class-member variables.
If the habitual use of global variables in your code makes a measurable or useful difference to its performance, I would question the design first.
For a discussion of the problems inherent in global variables and some ways to avoid them, see A Pox on Globals by Jack Ganssle. The article relates to embedded systems development, but is generally applicable; it's just that some embedded systems developers think they have good reason to use globals, probably for all the same misguided reasons used to justify it in game development.
Well, if you are considering using global variables instead of parameter passing, that could mean that you have a long chain of methods/functions down which you have to pass that parameter. If that is the case, you really WILL save CPU cycles by switching from a parameter to a global variable.
So, guys that say that it depends, I guess that they are plain wrong. Even with REGISTER parameter passing, there will still be MORE cpu cycles and MORE overhead for pushing the parameters down to the callee.
HOWEVER - I never do that. CPUs are far superior now, and back when there were 12 MHz 8086s it could have been an issue. Nowadays, unless you write embedded or super-turbo-charged performance code, stick to what looks good in the code, doesn't break the code logic, and strives to be modular.
And lastly, leave machine-language code generation to the compiler - the people who designed it know best how their baby performs and will make your code run at its best.
In general (though it may depend greatly on the compiler and platform implementation), passing parameters means writing them onto the stack, which you would not need with a global variable.
That said, a global variable may mean a page refresh in the MMU or memory controller, whereas the stack may be located in much faster memory available to the processor...
Sorry, no good answer for a general question like this, just measure it (and try different scenarios too)
It was faster when we had <100 MHz processors. Now that processors are 100x faster, this 'problem' is 100x less significant. It wasn't a big deal then; it was a big deal when you did it in assembly and had no (good) optimizer.
Says the guy who programmed on a 3 MHz processor. Yes, you read that right, and 64k was NOT enough.
I see a lot of theoretical answers, but no practical advice for your scenario. What I'm guessing is that you have a large number of parameters to pass down through a number of function calls, and you're worried about accumulated overhead from many levels of call frames and many parameters at each level. Otherwise your concern is completely unfounded.
If this is your scenario, you should probably put all of the parameters in a "context" structure and pass a pointer to that structure. This will ensure data locality, and makes it so you don't have to pass more than one argument (the pointer) at each function call.
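A rough sketch of that "context" structure approach (all the names here are invented):
struct RenderContext {            // everything the call chain needs, in one place
    float frame_time;
    int   screen_width;
    int   screen_height;
    void* user_data;
};

static void draw_sprites(const RenderContext* ctx);   // forward declaration

void render_frame(const RenderContext* ctx)           // only one pointer is passed
{
    draw_sprites(ctx);            // deeper calls just forward the same pointer
}

static void draw_sprites(const RenderContext* ctx)
{
    // use ctx->screen_width, ctx->frame_time, ... here
    (void)ctx;
}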
Parameters accessed this way are slightly more expensive to access than true function arguments (you need an extra register to hold the pointer to the base of the structure, as opposed to the frame pointer which would serve this purpose with function arguments), and individually (but probably not with cache effects factored in) more expensive to access than global variables in normal, non-PIC code. However, if your code is in a shared library/DLL using position independent code, the cost of accessing parameters passed by pointer to struct is cheaper than accessing a global variable and identical to accessing static variables, due to GOT and GOT-relative addressing. This is another reason never to use global variables for parameter passing: if you may eventually put your code in a shared library/DLL, any possible performance benefits will suddenly backfire!
Like everything else: yes and no. There is no one answer because it depends on context.
Counterpoints:
Imagine programming on Itanium where you have hundreds of registers. You can put quite a few globals into those, which will be faster than the typical way globals are implemented in C (some static address (although they might just hardcode the globals into instructions if they are word length)). Even if the globals are in cache the whole time, registers may still be faster.
In Java, overuse of globals (statics) can decrease performance because of the initialization locks that have to be taken. If 10 classes want to access some static class, they all have to wait for that class to finish initializing its static fields, which can take anywhere from no time up to forever.
In any case, global state is just bad practice; it raises code complexity. Well-designed code is naturally fast enough 99.9% of the time. It seems like newer languages are removing global state altogether. E removes global state because it violates its security model. Haskell removes mutable state altogether. The fact that Haskell exists and has implementations that outperform most other languages is proof enough for me that I will never use globals again.
Also, in the near future, when we all have hundreds of cores, global state isn't really going to help much.
It might still be true, under some circumstances.
A global variable might be as fast as a pointer to a variable, where the pointer is stored in/passed through registers only. So it is a question of the number of registers you can use.
To speed-optimize a function call, you could do several other things that might perform better than global-variable hacks:
Minimize the count of local variables in the function to a few (explicit) register variables.
Minimize the count of parameters of the function, i.e. by using pointers to structures instead of using the same parameter-constellations in functions that call each other.
Make the function "naked", that means that it does not use the stack at all.
Use "proper-tail-calls" (does neither work with java/-bytecode nor java-/ecma-script)
If there is no better way, hack yourself sth like TABLES_NEXT_TO_CODE, which locates your global variables next to the function code. In functional languages this is a backend-optimization that uses the function-pointer as data-pointer, too; but as long as you do not program in a functional language, you only need to locate those variables beside those used by the function. Then again, you only want this to remove the stack-handling from your function. If your compiler generates assembler code that handles the stack, then there is no point in doing this, you could use pointers instead.
I've found this "gcc attribute overview":
http://www.ohse.de/uwe/articles/gcc-attributes.html
and I can give you these tags for googling:
- Proper Tail Call (it is mostly relevant to imperative backends of functional languages)
- TABLES_NEXT_TO_CODE (it is mostly relevant to Haskell and LLVM)
But you do get 'spaghetti code' when you use global variables a lot.

In game programming are global variables bad?

I know my gut reaction to global variables is "bad!", but in the two game development courses I've taken at my college globals were used extensively, and now in the DirectX 9 game programming tutorial I am using (www.directxtutorial.com) I'm being told globals are okay in game programming ...? The site also recommends using only structs if you can when doing game programming to help keep things simple.
I'm really confused on this issue, and all the research I've been trying to do is very confusing. I realize there are issues when using global variables (threading issues, they make code harder to maintain, their state is hard to track, etc.), but there is also a cost associated with not using globals: I'd have to pass a loooot of information around very often, which can be confusing and, I imagine, time-consuming, although I guess pointers would speed the process up (this is my first time writing a game in C++). Anyway, I realize there is probably no "right" or "wrong" answer here since both ways work, but I want my code to be as proper as I can, so any input would be good. Thank you very much!
The trouble with games and globals is that games (nowadays) are threaded at the engine level. Game developers using an engine use the engine's abstractions rather than directly programming concurrency (IIRC). In many of the high-level languages such as C++, sharing state between threads is complex. When many concurrent processes share a common resource they have to make sure they don't tread on each other's toes.
To get around this, you use concurrency control such as mutex and various locks. This in effect makes asynchronous critical sections of code access shared state in a synchronous manner for writing. The topic of concurrency control is too much to explain fully here.
Suffice to say, if threads run with global variables, it makes debugging very hard, as concurrency bugs are a nightmare (think, "which thread wrote that? Who holds that lock?").
There are exceptions in games programming API such as OpenGL and DX. If your shared data/globals are pointers to DX or OpenGL graphics contexts then typically this maps down to GPU operations which don't suffer so much from the same trouble.
Just be careful. Keeping objects representing 'player' or 'zombie' or whatever, and sharing them between threads can be tricky. Spawn 'player' threads and 'zombie group' threads instead and have a robust concurrency abstraction between them based on message passing rather than accessing those object's state across the thread/critical section boundary.
Saying all that, I do agree with the "Say no to globals" point made below.
For more on the complexities of threads and shared state see:
1 POSIX Threads API - I know it is POSIX, but provides a good idea that translates to other API
2 Wikipedia's excellent range of articles on concurrency control mechanisms
3 The Dining Philosopher's problem (and many others)
4 ThreadMentor tutorials and articles on threading
5 Another Intel article, but more of a marketing thing.
6 An ACM article on building multi-threaded game engines
Having worked on AAA game titles, I can tell you that globals should be eradicated immediately before they spread like a cancer. I've seen them corrupt an I/O subsystem so completely that it had to be thrown out entirely and rewritten.
Say no to globals. Always.
In this respect, there's no difference between games and other programs. While arguably OK in small examples given in elementary courses, global variables are strongly discouraged in real programs.
So if you want to write correct, readable and maintainable code, stay away from global variables as much as possible.
All answers until now deal with the globals/threads issue, but I will add my 2 cents to the struct/class (understood as all public attributes/private attributes + methods) discussion. Preferring structs over classes on the grounds of that being simpler is in the same line of thought of preferring assembler over C++ on the grounds of that being a simpler language.
The fact that you have to think on how your entities are going to be used and provide methods for it makes the concrete entity a little more complex, but greatly simplifies the rest of the code and maintainability. The whole point of encapsulation is that it simplifies the program by providing clear ways in that your data can be modified while maintaining your objects invariants. You control the entry points and what can happen there. Having all attributes public imply that any part of the code can have a small innocent error (forgot to check condition X) and break your invariants completely (health below 0, but no 'death' processing being triggered)
The other common discussion is performance: if I just need to update a datum, having to call a method will hurt my performance. Not really. If methods are simple and you provide them in the header as inlines (inside the class body, or outside it with the inline keyword), the compiler can copy those instructions to each call site. You keep the guarantee that no check is accidentally left out, with no impact on performance.
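As a rough sketch of both points (Character, TakeDamage and OnDeath are invented names, just for illustration):

class Character {
public:
    int Health() const { return health_; }  // defined in the class body, so implicitly inline

    void TakeDamage(int amount) {            // the single entry point that can lower health
        health_ -= amount;
        if (health_ <= 0) {                  // callers can never forget this invariant check
            health_ = 0;
            OnDeath();
        }
    }

private:
    void OnDeath() { /* trigger death processing */ }
    int health_ = 100;
};

Because TakeDamage is visible to the compiler at the call site, it can be inlined just like a raw field update, but the "health below 0 without death processing" bug can no longer happen.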
Having read a bit more of what you posted, though:
"I'm being told globals are okay in game programming ...? The site also recommends using only structs if you can when doing game programming to help keep things simple."
Game code is no different from other code, really. Gratuitous use of globals is bad regardless. And as for 'only use structs', that is just a load of crap. Approach game development on the same principles as any other software - you may find places where you need to bend this, but they should be the exception, typically when dealing with low-level hardware issues.
I would say that the advice about globals and 'keeping things simple' is probably a mixture of being easier to learn and old-fashioned thinking about performance. When I was being taught game programming, I remember being told that C++ wasn't advised for games because it would be too slow, but I've worked on multiple games that used every facet of C++, which proves that isn't true.
I would add to everyone's answers here that while globals are to be avoided where possible, you shouldn't be afraid to use whatever you need from C++ to make your code understandable and easy to use. If you come up against a performance problem, profile that specific issue, and I'll bet that most of the time you won't need to remove the use of a C++ feature but will just need to think about your problem differently. There may still be some platforms around that require pure C, but I don't really have experience of them; even the Game Boy Advance seemed to handle C++ quite nicely.
The meta-issue here is state. How much state does any given function depend on, and how much does it change? Then consider how much of that state is implicit versus explicit, and cross that with how much is inspected versus changed.
If you have a function/method/whatever that is called DoStuff(), you have no idea from the outside what it depends on, what it needs, and what's going to happen to the shared state. If this is a class member, you also have no idea how that object's state is going to mutate. This is bad.
Contrast this with something like cosf(2): this function is understood not to change any global state, and any state that it requires (lookup tables, for example) is hidden from view and has no effect on your program at large. It computes a value based on what you give it and returns that value. It changes no state.
Class member functions can then stir up problems of their own. There's a huge difference between
myObject.hitpoints -= 4;
myObject.UpdateHealth();
and
myObject.TakeDamage(4);
In the first example, an external operation is changing some state that one of its member functions implicitly depends upon. The fact that these two lines can be separated by many other lines of code begins to make it non-obvious what's going to happen in the UpdateHealth call, even if, aside from the subtraction, it is the same as the TakeDamage call. Encapsulating the state changes (in the second example) implies that the specifics of the state changes aren't important to the outside world, and hopefully they're not. In the first example, the state changes are explicitly important to the outside world, and this is really no different from setting some globals and calling a function that uses those globals. E.g. hopefully you'd never see
extern float value_to_sqrt;
value_to_sqrt = 2.0f;
sqrt(); // reads the global value_to_sqrt
extern float sqrt_value; // has the results of the sqrt.
And yet, how many people do exactly this sort of thing in other contexts? (Considering especially that class instance state is "global" in regards to that particular instance.)
So: prefer passing explicit arguments to your function calls, and prefer that they return their results directly, rather than having to set some state before calling a function and then check other state after it returns.
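In other words, something closer to this (a sketch; ApplyDamage is a made-up name):

// Input passed explicitly, result returned directly; no hidden state is read or written.
int ApplyDamage(int currentHealth, int amount)
{
    int newHealth = currentHealth - amount;
    return newHealth < 0 ? 0 : newHealth;
}

// at the call site:
// hero.health = ApplyDamage(hero.health, 4);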
The more state dependencies a bit of code has, the harder it'll be to make it multithread safe, but that has already been covered above. The point I want to make is that the problem isn't so much globals but more the visibility of the collection of state that is required for a bit of code to operate (and subsequently how much other code also depends on that state).
Most games aren't multi-threaded, although newer titles are going down that route, and so they've managed to get away with it so far.
Saying that globals are okay in games is like not bothering to fix the brakes on your car because you only drive at 10mph!
It's bad practice which ever way you look at it.
You only have to look at the number of bugs in games to see examples of this.
If class A needs access to data D, instead of making D global, you are better off putting a reference to D into A.
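Roughly like this, as a sketch (Data and A::Increment are placeholder names):

struct Data { int value = 0; };

class A {
public:
    explicit A(Data& d) : data_(d) {}    // the dependency is visible in the interface
    void Increment() { ++data_.value; }  // A can still use D, but only what it was given
private:
    Data& data_;                         // a reference to D instead of a global
};

Now anyone reading A's constructor knows exactly which state it touches, and two A objects can work on two different Data instances without any change to the class.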
Globals are NOT intrinsically bad. In C for instance they are part of the language's normal use... and since C++ builds on C they still have a place.
On an aesthetic level, it's better to avoid them where you can sensibly make them part of a class, but if all you do is wrap a bunch of globals into a singleton, you've made things worse, because at least with globals it's obvious what the point is.
Be careful, but for some things it makes less sense to force OO concepts on what is actually a global value.
Two specific issues that I've encountered in the past:
First: If you're attempting to separate e.g. render phase (const access to most game state) from logic phase (non-const access to most game state) for whatever reason (decoupling render rate from logic rate, synchronizing game state across a network, recording and playback of gameplay at a fixed point in the frame, etc), globals make it very hard to enforce that.
Problems tend to creep in and become hard to debug and eradicate. This also has implications for threaded renderers separate from game logic, or the like (the other answers cover this topic thoroughly).
Second: The presence of many globals tends to bloat the literal pool, which the compiler typically places after each function.
If you get to your state through either a single "struct GlobalTable" which holds globals or a collection of methods on an object or the like, your literal pool tends to be a lot smaller, decreasing the size of the .text section in your executable.
This is mostly a concern for instruction set architectures that can't embed load targets directly into instructions (see e.g. fixed-width ARM or Thumb version 1 instruction encoding on ARM processors). Even on modern processors I'd wager you'll get slightly smaller codegen.
It also hurts doubly when your instruction and data caches are separate (again, as on some ARM processors); you'll tend to get two cachelines loaded where you might only need one with a unified cache. Since the literal pool may count as data, and won't always start on a cacheline boundary, the same cacheline might have to be loaded both into the i-cache and d-cache.
This may count as a "micro-optimization", but it's a transform that's relatively easy to apply to an entire codebase (e.g. extern struct GlobalTable { /* ... */ } the; and then replace all g_foo with the.foo)... And we've seen code size decrease between 5 and 10 percent in some cases, with a corresponding boost to performance due to keeping the caches cleaner.
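A sketch of what that transform might look like (the field names are invented; the answer only specifies the GlobalTable/the naming):

// Before: separate globals, each needing its own entry in the literal pool.
//   extern int   g_score;
//   extern float g_timeStep;
//   extern bool  g_paused;

// After: one table, so functions only need the address of 'the'.
extern struct GlobalTable {
    int   score;
    float timeStep;
    bool  paused;
} the;                       // defined once as 'GlobalTable the;' in some .cpp file

// Call sites change from  g_score += 10;  to:
//   the.score += 10;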
I'm going to take a risk and say that, for me, it depends on the scope/scale of your project. If you are trying to program an epic game with a boatload of code, then globals could easily (and, in hindsight, in the most painful way) cost you far more time than they save.
But if you are trying to code something as simple as Super Mario or Metroid or Contra, or something even simpler like Pac-Man, consider that those games were originally coded in 6502 assembly and used globals for almost everything. Just about all the game data and state was stored in data segments, and that didn't stop the devs from shipping a very competent and popular product, in spite of working with absolutely inferior tools and engineering standards that would probably horrify people today.
So if you are just writing this kind of small and simple game which has a very limited scope and isn't designed to grow and expand far beyond its original design, isn't designed to be maintained for years and years, with a few thousand lines of simple C++ code, then I don't see the big deal of using a global here or a singleton there. Someone obsessed with trying to engineer Super Mario with the soundest engineering techniques with SOLID and a DI framework could end up taking far, far longer to ship than even the devs who wrote it in 6502 asm.
And I'm getting old, and when I look at these old, simple games and how they were coded, it almost seems like the devs were doing something right in spite of the hard-coded magic numbers and globals all over the place, while I spend my career fumbling around trying to figure out the best way to engineer things. That said, this is probably a very unpopular opinion, and not one I would have liked a decade or two ago, but there's something to it. I don't look at the 6502 asm of Metroid and think, "these devs under-engineered their product and their lives would have been so much easier if they had done this or that." It seems like they did things just about right.
But again, this is for small-scale stuff, maybe in the indie category of games by today's standards, and among the smaller of those indie games, and far from doing anything ground-breaking in terms of how much data it can process or using cutting-edge hardware techniques. If in doubt, I'd definitely suggest erring on the side of avoiding globals. It's also a little trickier in C++ than in, say, C, since you can have objects with constructors and destructors, and initialization and destruction order isn't well-defined or easily predictable for global objects. There I'd lean even further towards avoiding globals, since they can trip you up in whole new ways when you aren't explicitly initializing and destroying them yourself in a predictable order. And naturally, if you want to multithread much beyond a critical loop here and there, your ability to reason about the thread-safety of any particular code will be severely diminished if you cannot minimize the scope/visibility of your game state.

What are the best resources for learning how to avoid side effects and state in OOP?

I've been playing with functional programming lately, and there are pretty good treatments on the topic of side effects, why they should be contained, etc. In projects where OOP is used, I'm looking for some resources which lay out strategies for minimizing side effects and/or state.
A good example of this is the book RESTful Web Services which gives you strategies for minimizing state in a web application. What others exist?
Remember, I'm not looking for another OOP analysis/design patterns book (though good encapsulation and loose coupling help avoid side effects), but rather a resource where the topic itself is state/side effects.
Some compiled answers
OOP programmers who mostly care about state do so because of concurrency, so read Java Concurrency in Practice. [exactly what I was looking for]
Use TDD to make side effects more visible [I like it, example: the bigger your setUps are, the more state you need in place to run your tests = good warning]
Command-query separation [Good stuff, prevents the side effect of changing a function argument which is generally confusing]
Methods do only one thing, perhaps use descriptive names if they change the state of their object so it's simple and clear.
Make objects immutable [I really like this]
Pass values as parameters, instead of storing them in member variables. [I don't like this; it clutters up the function prototype and is actively discouraged by Clean Code and other books, though I do admit it helps the state issue]
Recompute values instead of storing and updating them [I also really like this; in the apps I work on performance is a minor concern]
Similarly, don't copy state around if you can avoid it. Make one object responsible for keeping it and let others access it there. [Basic OOP principle, good advice]
I don't think you'll find a lot of current material in the OO world on this topic, simply because OOP (and most imperative programming, for that matter) relies on state and side effects. Consider logging, for instance. It's pure side effect, yet in any self-respecting J2EE app, it's everywhere. Hoare's original QuickSort relies on mutable state, since you have to swap values around a pivot, and yet it too is everywhere.
This is why many OO programmers have trouble wrapping their heads around functional programming paradigms. They try to reassign the value of "x," discover that it can't be done (at least not in the way it can in every other language they've worked in), and they throw up their hands and shout "This is impossible!" Eventually, if they're patient, they learn recursion and currying and how the map function replaces the need for loops, and they calm down. But the learning curve can be very steep for some.
The OO programmers these days who care most about avoiding state are those working on concurrency. The reasons for this are obvious -- mutable state and side effects cause huge headaches when you're trying to manage concurrency between threads. As a result, the best discussion I've seen in the OO world about avoiding state is Java Concurrency in Practice.
I think the rules are quite simple: methods should only ever do one thing, and the intent should be communicated clearly in the method name.
Methods should either query or change data, but never both.
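As a small sketch of that rule (in C++, to match the earlier answers; Account, Balance and Deposit are invented names):

class Account {
public:
    // query: answers a question, changes nothing
    int Balance() const { return balance_; }

    // command: changes state, returns nothing
    void Deposit(int amount) { balance_ += amount; }

    // avoid the mixed form, e.g. 'int DepositAndReturnBalance(int amount)'
private:
    int balance_ = 0;
};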
Some small things I do:
Prefer immutable state; it is relatively benign. E.g. in Java I make member variables final and set them in the constructor wherever possible (see the sketch after this list).
Pass around values as parameters, instead of storing them in member variables.
Recompute values instead of storing and updating them, if that can be done cheaply enough. This helps avoid data becoming inconsistent because someone forgot to update it.
Similarly, don't copy state around if you can avoid it. Make one object responsible for keeping it and let others access it there.
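A C++ analogue of the first and third points, as a sketch (the Java advice about final maps roughly onto const members; Rectangle is an invented example):

class Rectangle {
public:
    Rectangle(double width, double height)
        : width_(width), height_(height) {}        // set once, in the constructor

    double Width()  const { return width_; }
    double Height() const { return height_; }

    // recomputed on demand rather than stored, so it can never become stale
    double Area() const { return width_ * height_; }

private:
    const double width_;   // immutable after construction
    const double height_;
};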
One way to isolate side effects in OO is to have operations return only a description object of the side effects to be caused.
Command-query separation is a pattern that is close to this idea.
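A small sketch of the "return a description of the effect" idea (Effect, PlanReport and Execute are invented names):

#include <string>
#include <vector>

struct Effect { std::string fileName; std::string contents; };  // a write we intend to do

// pure: decides what to write, but touches no files itself
std::vector<Effect> PlanReport(int score)
{
    return { Effect{ "report.txt", "score=" + std::to_string(score) } };
}

// impure boundary: the only place that actually performs the writes
// void Execute(const std::vector<Effect>& effects) { ... }

The planning code is trivially testable (compare the returned Effect objects), while the actual side effects live in one small, well-known place.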
By practising TDD (or at least writing unit tests), one typically becomes much more aware of side effects and uses them more sparingly, and also separates them from side-effect-free expressions, which are easy to write data-driven (expected, actual) unit tests for.