Is using Clojure's STM as a global state considered a good practice?

In most of my Clojure programs, and a lot of other Clojure programs I see, there is some sort of global variable holding state in an atom:
(def *program-state*
  (atom {:name "Program"
         :var1 1
         :var2 "Another value"}))
And this state would be referred to occasionally in the code.
(defn program-name []
  (:name @*program-state*))
Reading this article http://misko.hevery.com/2008/07/24/how-to-write-3v1l-untestable-code/ made me rethink global state, but even though I completely agree with the article, I think it's okay to use hash-maps in atoms because it provides a common interface for manipulating global state data (analogous to using different databases to store your data).
I would like some other thoughts on this matter.

This kind of thing can be OK, but it is also often a design smell, so I would approach it with caution.
Things to think about:
Consistency - can one part of the code change the program name? If so, the program-name function will behave inconsistently from the perspective of other threads. Not good!
Testability - is this easy to test? Can one part of the test suite that changes the program name safely run concurrently with another test that reads the name?
Multiple instances - will you ever have two different parts of the application expecting to use a different program-name at the same time? If so, this is a strong hint that your mutable state should not be global.
Alternatives to consider:
Using a ref instead of an atom, you can at least ensure consistency of mutable state within transactions
Using binding you can limit mutability to a per-thread basis. This solves most of the concurrency issues and can be helpful when your global variables are being used like a set of thread-local configuration parameters (see the sketch after this list).
Using immutable global state wherever you can. Does it really need to be mutable?
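To illustrate the binding option above, here is a minimal sketch (the var name and values are made up); a dynamic var gives you per-thread configuration without exposing cross-thread mutation:
(def ^:dynamic *config* {:name "Program" :log-level :info})

(defn program-name []
  (:name *config*))

;; Each thread (or call site) can rebind the configuration locally;
;; the rebinding is visible only on the current thread.
(binding [*config* {:name "TestProgram" :log-level :debug}]
  (program-name)) ;=> "TestProgram"

(program-name) ;=> "Program" outside the binding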

I think having a single global state that is occasionally updated in commutative ways is fine. When you start having two global states that need to be updated and threads start using them for communication, then I start to worry.
Maintaining a count of current global users is fine (see the sketch after this list):
Any thread can inc or dec this at any time without hurting another
If it changes out from under your thread nothing explodes.
Maintaining the log directory is questionable:
When it changes, will all threads stop writing to the old one?
If two threads change it, will they converge?
Using this as a message queue is even more dubious.
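As a minimal sketch of the "fine" case, a global user count held in an atom (the names are illustrative):
;; Any thread can bump this safely: swap! retries on contention,
;; and inc/dec commute, so concurrent updates never corrupt the count.
(def user-count (atom 0))

(defn user-connected []    (swap! user-count inc))
(defn user-disconnected [] (swap! user-count dec))

(defn current-users [] @user-count)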

I think it is fine to have such a global state (and in many cases it is required), but I would be careful that the core logic of my application has functions that take the state as a parameter and return the updated state, rather than directly accessing the global state. Basically, I would prefer to have controlled access to the global state through a small set of functions, and everything else in my program should use those functions to access the state. That would allow me to abstract away the state implementation, i.e. initially I could start with an in-memory atom, then maybe move to some persistent storage.
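A rough sketch of that idea, with hypothetical names: the domain logic is pure (state in, updated state out), and only a couple of functions know that the state currently lives in an atom:
;; Pure functions: easy to test, no globals involved.
(defn add-user [state user]
  (update state :users conj user))

(defn rename-program [state new-name]
  (assoc state :name new-name))

;; The only place that knows about the atom.
(defonce ^:private app-state (atom {:name "Program" :users []}))

(defn update-state! [f & args]
  (apply swap! app-state f args))

(defn read-state [] @app-state)

;; The rest of the program goes through the narrow interface:
(update-state! add-user {:id 1 :name "Alice"})
(update-state! rename-program "NewName")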


C++ multiple Lua states

I want to use the same Lua file for multiple game objects.
The Lua file:
function onUpdate()
    if Input.isKeyDown(Keys.d) then
        actor.x = actor.x + 0.1
    end
    if Input.isKeyDown(Keys.a) then
        actor.x = actor.x - 0.1
    end
    if Input.isKeyDown(Keys.w) then
        actor.y = actor.y + 0.1
    end
    if Input.isKeyDown(Keys.s) then
        actor.y = actor.y - 0.1
    end
end
Question
Is it a good practice to have a Lua state for each object, or should I use the same state for the same file and update the "actor" global variable before the game object calls the script?
(I want to avoid using tables because I would have to use the table name before the variables and function calls.)
(I don't know if there is any other solution... I am new to Lua.)
While it is nice that you can have many Lua states within a single program, keep in mind that each of them does take up some memory. If there's a good reason why you should keep two environments completely separate, like security concerns, or totally unrelated subsystems that may or may not be needed at the same time, then it's certainly worth it.
Otherwise, it's usually better and more manageable to have a single Lua state.
If you need strong separation between different blocks of logic, Lua has you covered:
Coroutines
If you need to have several blocks of logic that you want to pause and resume later on, you can simply wrap them up in coroutines. This can easily be done from C as well, and allows you to do most of what you could do with different Lua states.
Environments
While these work somewhat differently before and after Lua 5.2, the basic idea is the same: you can change what "global" variables are visible to a section of your code, even using metatables to access other data or generate it on the fly.
With those two, you should really not have much need for separate Lua states in a game. One exception to this rule would obviously be multi-threading; you shouldn't have more than one thread with access to the same Lua state unless you use some sort of locking mechanism, so it would make sense to have one state per thread and set up a way for them to communicate from within C.

Are global variables constantly updated

I know global variables are bad; however, I have a checksettings function which is run every tick. http://pastebin.com/54yp4vuW The pastebin contains some of the checksettings function. Before I added the GetPrivateProfileIntA calls, everything worked fine. Now when I run it, it lags like hell. I can only assume this is because it is constantly loading the files. So my question is: are global variables constantly updated? (i.e. if I put this in a global variable, will it stop the lag?)
Thanks :)
Assuming I'm interpreting your question correctly, then no, global variables are not constantly updated unless you explicitly do so in code. So yes, storing the results of those calls in global variables will get rid of the lag.
You haven't provided any details about the design but globals are visible across the entire application and get updated when they are written into.
Multiple processes/threads reading that global variable would then read the same updated value.
But synchronizing reads/writes requires the use of synchronization mechanisms such as mutexes, condition variables etc etc.
In your case you need to decide when to call GetPrivateProfileIntA() for all those settings.
Are all those settings constantly updated or only a fraction of those? Identify the ones which need to be monitored periodically and only load those.
And if a setting is stateful, meaning all objects of a class refer to a single copy of the setting, then I would use static class variables instead of plain global variables.
Alternatively, you could make a just-in-time call to GetPrivateProfileIntA() where needed and not bother with storing the setting in a global variable.

What is Clojure volatile?

There has been an addition in the recent Clojure 1.7 release: volatile!
volatile is already used in many languages, including Java, but what are the semantics in Clojure?
What does it do? When is it useful?
The new volatile is as close to a real "variable" (as known from many other programming languages) as it gets in Clojure.
From the announcement:
there are a new set of functions (volatile!, vswap!, vreset!, volatile?) to create and use volatile "boxes" to hold state in stateful transducers. Volatiles are faster than atoms but give up atomicity guarantees so should only be used with thread isolation.
For instance, you can set/get and update them just like you would do with a variable in C.
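A minimal sketch of that usage, showing the core volatile functions:
(def v (volatile! 0)) ; create a volatile "box" holding 0

@v              ;=> 0, read it with deref
(vreset! v 41)  ;=> 41, set a new value
(vswap! v inc)  ;=> 42, update with a function (not atomic!)
(volatile? v)   ;=> true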
The only addition (and hence the name) is the volatile keyword on the underlying Java object.
This prevents certain JVM optimizations and makes sure the memory location is read every time it is accessed.
From the JIRA ticket:
Clojure needs a faster variant of Atom for managing state inside transducers. That is, Atoms do the job, but they provide a little too much capability for the purposes of transducers. Specifically the compare and swap semantics of Atoms add too much overhead. Therefore, it was determined that a simple volatile ref type would work to ensure basic propagation of its value to other threads and reads of the latest write from any other thread. While updates are subject to race conditions, access is controlled by JVM guarantees.
Solution overview: Create a concrete type in Java, akin to clojure.lang.Box, but volatile inside supports IDeref, but not watches etc.
This means a volatile! can still be accessed by multiple threads (which is necessary for transducers), but it must not be changed by these threads at the same time, since it gives you no atomic updates.
The semantics of what volatile does are very well explained in a Java answer:
there are two aspects to thread safety: (1) execution control, and (2) memory visibility. The first has to do with controlling when code executes (including the order in which instructions are executed) and whether it can execute concurrently, and the second to do with when the effects in memory of what has been done are visible to other threads. Because each CPU has several levels of cache between it and main memory, threads running on different CPUs or cores can see "memory" differently at any given moment in time because threads are permitted to obtain and work on private copies of main memory.
Now let's see why not use var-set or transients:
Volatile vs var-set
Rich Hickey didn't want to give truly mutable variables:
Without mutable locals, people are forced to use recur, a functional
looping construct. While this may seem strange at first, it is just as
succinct as loops with mutation, and the resulting patterns can be
reused elsewhere in Clojure, i.e. recur, reduce, alter, commute etc
are all (logically) very similar.
[...]
In any case, Vars
are available for use when appropriate.
And thus with-local-vars, var-set, etc. were created.
The problem with these is that they're true vars and the doc string of var-set tells you:
The var must be thread-locally bound.
This is, of course, not an option for core.async which potentially executes on different threads. They're also much slower because they do all those checks.
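For reference, a minimal sketch of the var-based approach; it only works on the thread that established the binding:
;; with-local-vars creates thread-locally bound vars; var-set mutates them.
(defn sum-to [n]
  (with-local-vars [acc 0]
    (doseq [i (range (inc n))]
      (var-set acc (+ @acc i)))
    @acc))

(sum-to 10) ;=> 55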
Why not use transients
Transients are similar in that they don't allow concurrent access and optimize mutating a data structure.
The problem is that transients only work with collections that implement IEditableCollection. That is, they simply exist to avoid expensive intermediate representations of the collection data structures. Also remember that transients are not bashed in place, and you still need some memory location to store the actual transient.
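A quick sketch of what transients are for, to contrast with volatiles:
;; Transients speed up building a collection on a single thread, but the
;; "box" you thread through the loop is still a value you have to keep
;; somewhere (here, the loop binding).
(defn build-vector [n]
  (loop [i 0, v (transient [])]
    (if (< i n)
      (recur (inc i) (conj! v i))
      (persistent! v))))

(build-vector 5) ;=> [0 1 2 3 4]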
Volatiles are often used simply to hold a flag or the value of the last element seen (see partition-by for instance).
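For example, here is a small stateful transducer in that style (a sketch, not the actual partition-by source): it drops consecutive duplicates, using a volatile! to remember the previous element:
(defn dedupe-consecutive []
  (fn [rf]
    (let [prev (volatile! ::none)] ; per-transduction state, used from one thread at a time
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input]
         (let [p @prev]
           (vreset! prev input)
           (if (= p input)
             result
             (rf result input))))))))

(into [] (dedupe-consecutive) [1 1 2 2 2 3 1 1]) ;=> [1 2 3 1]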
Summary:
Volatiles are nothing but a wrapper around Java's volatile and thus have exactly the same semantics.
Don't ever share them. Use them only very carefully.
Volatiles are a "faster atom" with no atomicity guarantees. They were introduced as atoms were considered too slow to hold state in transducers.
there are a new set of functions (volatile!, vswap!, vreset!, volatile?) to create and use volatile "boxes" to hold state in stateful transducers. Volatiles are faster than atoms but give up atomicity guarantees so should only be used with thread isolation

How many threads can be used (by compiler) to initialise global objects (before function main)

The question may be worded badly, but the idea is simple. The order of initialisation of global objects in different translation units is not guaranteed. If an application consists of two translation units, can the compiler potentially generate initialisation code that will start two threads to create the global objects in those translation units? This question is similar to this or this, but I think the answers don't match the questions.
I will rephrase the question - if a program has several global objects - will their constructors always be called from the same thread (considering that user code doesn't start any threads before main)?
To make this question more practical, consider a case where several global objects are scattered over several translation units and each of them increments a single global counter in its constructor. If there is no guarantee that they are constructed on a single thread, then access to the counter must be synchronised. To me that's definitely overkill.
UPDATE:
It looks like concurrent initialisation of globals is possible - see n2660 paper (second case)
UPDATE:
I've implemented a singleton using singleton_registry. The idea is that the declaration using SomeSingleton = singleton<SomeClass>; registers SomeClass for later initialisation in singleton_registry, and the real singleton initialisation (with all inter-dependencies) happens at the beginning of the main function (before I've started any other threads) by singleton_registry (which is created on the stack). In this case I don't need to use DCLP. It also allows me to prepare the application configuration and disseminate it over all singletons uniformly. Another important use case is the use of singletons with TDD. Normally it's a pain to use singletons in unit tests, but with singleton_registry I can recreate the application's global objects for each test case.
In fact this is just a development of the idea that all singletons must be initialised at the beginning of main. Normally it means that the singleton user has to write a custom initialisation function which handles all dependencies and prepares proper initialisation parameters (this is what I want to avoid).
Implementation looks good except the fact that I may have potential race conditions during singletons registration.
Several projects ago, using VxWorks and C++, a teammate used a non-thread-safe pattern from the book (are any of those patterns thread safe?):
10% of system starts failed.
Lesson 1: if you don't explicitly control the when of a CTOR of a global object, it can change from build to build. (and we found no way to control it.)
Lesson 2: Controlling the when of CTOR is probably the easiest solution to lesson 1 (should it become a problem).
In my experience, if you gotta have Global objects (and it seems many would discourage it), consider limiting these globals to pointers initialized to 0:
GlobalObject* globalobject = nullptr;
And only the thread assigned to initialize it will do so. The other threads/tasks can spin wait for access.

Queueing Method Calls So That They Are Performed By A Single Thread In Clojure

I'm building a wrapper around OrientDB in Clojure. One of the biggest limitations (IMHO) of OrientDB is that the ODatabaseDocumentTx is not thread-safe, and yet the lifetime of this thing from .open() to .close() is supposed to represent a single transaction, effectively forcing transactions to occur in a single thread. Indeed, thread-local refs to these hybrid database/transaction objects are provided by default. But what if I want to log in the same thread as I want to persist "real" state? If I hit an error, the log entries get rolled back too! That use case alone puts me off of virtually all DBMSs, since most do not allow named transaction scope management. /soapbox
Anyways, OrientDB is the way it is, and it's not going to change for me. I'm using Clojure and I want an elegant way to construct a with-tx macro such that all imperative database calls within the with-tx body are serialized.
Obviously, I can brute-force it by creating a sentinel at the top level of the with-tx generated body and deconstructing every form to the lowest level and wrapping them in a synchronized block. That's terrible, and I'm not sure how that would interact with something like pmap.
I can search the macro body for calls to the ODatabaseDocumentTx object and wrap those in synchronized blocks.
I can create some sort of dispatching system with an agent, I guess.
Or I can subclass ODatabaseDocumentTx with synchronized method calls.
I'm scratching my head trying to come up with other approaches. Thoughts? In general the agent approach seems more appealing simply because if a block of code has database method calls interspersed, I would rather do all the computation up front, queue the calls, and just fire a whole bunch of stuff to the DB at the end. That assumes, however, that the computation doesn't need to ensure consistency of reads. IDK.
Sounds like a job for Lamina.
One option would be to use an Executor with a single thread in its thread pool, something like what is shown below. You can create a nice macro around this concept.
(import 'java.util.concurrent.Executors)
(import 'java.util.concurrent.Callable)

(defmacro sync [executor & body]
  `(.get (.submit ~executor (proxy [Callable] []
                              (call []
                                (do ~@body))))))

(let [exe  (Executors/newFixedThreadPool (int 1))
      dbtx (sync exe (DatabaseTx.))]
  (do
    (sync exe (readfrom dbtx))
    (sync exe (writeto dbtx))))
The sync macro makes sure that the body expressions are executed on the executor (which has only one thread), and it waits for the operation to complete so that all operations execute one by one.
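Building on that, a hedged sketch of what the with-tx macro from the question could look like on top of a single-threaded executor; open-tx!, save-doc! and commit! are placeholder names, not OrientDB's actual API:
(import 'java.util.concurrent.Executors)

;; One executor with a single thread: every body passed to with-tx runs on
;; that thread, so the non-thread-safe ODatabaseDocumentTx is never touched
;; concurrently.
(defonce db-executor (Executors/newSingleThreadExecutor))

(defmacro with-tx [& body]
  `(sync db-executor ~@body)) ; reuses the sync macro defined above

;; Usage:
;; (with-tx
;;   (let [tx (open-tx!)]
;;     (save-doc! tx {:name "example"})
;;     (commit! tx)))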