Bluebird has a nice function called Promise.map that lets you pass in an extra argument for the number of concurrent operations.
e.g.
yield Promise.map arrayOfThings, coroutine (thing) ->
    newThing = yield thing.operate()
    database.set newThing
, concurrency: 500
However, Promise.map will keep an array of whatever database.set newThing returns in memory for all of arrayOfThings. I'd rather not store all of that in memory, as it bogs down my server. Ideally I would replace Promise.map with Promise.each so it doesn't store the returned values in memory; unfortunately that is super slow because Promise.each is not concurrent.
Is there any way I can change my code to make it work like that?
First of all, at the moment Promise.each doesn't actually avoid allocating the array. There is an open issue for this - I'm assigned to it and I'd like to apologize: I'm not in front of a dev box and have been abroad. I'll try to fix this soon.
Second of all, no, there is no such functionality at the moment. Promise.each was created precisely in order to run things sequentially. A pull request might be entertained, and it shouldn't be too hard to implement on top of PromiseArray; we just haven't really seen the use case before.
Meanwhile you can use Promise.map.
I have a situation where I have a legacy multi-threaded application that I'm trying to move to a Linux platform and convert into C++.
I have a fixed size array of integers:
int R[5000];
And I perform a lot of operations like:
R[5] = (R[10] + R[20]) / 50;
R[5]++;
I have one foreground task that mostly reads the values, but on occasion can update one. And then I have a background worker that is updating the values constantly.
I need to make this structure thread safe.
I would rather only update the value if it has actually changed. The worker is constantly collecting data, doing calculations, and storing the data whether it changes or not.
So should I create a custom class MyInt that holds the structure, include an array of mutexes to lock when updating/reading each value, and then overload [], =, ++, +=, -=, etc.? Or should I try to implement an atomic integer array?
Any suggestions as to what that would look like? I'd like to try and keep the above notation for doing the updates...but I get that it might not be possible.
Thanks,
WB
The first thing to do is make the program work reliably, and the easiest way to do that is to have a Mutex that is used to control access to the entire array. That is, whenever either thread needs to read or write to anything in the array, it should do:
the_mutex.lock();
// do all the array-reads, calculations, and array-writes it needs to do
the_mutex.unlock();
... then test your program and see if it still runs fast enough for your needs. If so, you're done; that's all you need to do.
If you find that the program isn't fast enough due to contention on the mutex, you can start trying optimizations to make things faster. For example, if you know that your threads' operations will only need to work on local segments of the array at one time, you could create multiple mutexes, and assign different subsets of the array to each mutex (e.g. mutex #1 is used to serialize access to the first 100 array items, mutex #2 for the second 100 array items, etc). That will greatly decrease the chances of one thread having to wait for the other thread to release a mutex before it can continue.
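A rough sketch of that kind of lock striping (the stripe size and helper names are only illustrative; note that a statement like R[5] = (R[10] + R[20]) / 50 must hold every stripe it touches, acquired in a fixed order, to avoid deadlock):

#include <mutex>

constexpr int kSize = 5000;
constexpr int kStripe = 100;                      // elements guarded per mutex

int R[kSize];
std::mutex locks[(kSize + kStripe - 1) / kStripe];

void set_element(int index, int value) {
    std::lock_guard<std::mutex> guard(locks[index / kStripe]);
    R[index] = value;
}

int get_element(int index) {
    std::lock_guard<std::mutex> guard(locks[index / kStripe]);
    return R[index];
}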
If things still aren't fast enough for you, you could then look in to having two different arrays, one for each thread, and occasionally copying from one array to the other. That way each thread could safely access its own private array without any serialization needed. The copying operation would need to be handled carefully, probably using some sort of inter-thread message-passing protocol.
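One simple variant of that two-array idea, sketched below with a short-lived mutex guarding only the copy step rather than a full message-passing protocol (names and sizes are illustrative):

#include <array>
#include <mutex>

struct Snapshot {
    std::array<int, 5000> values{};
    std::mutex m;                       // held only while copying
};

Snapshot shared;

// Background worker: compute into a private array, then publish it.
void publish(const std::array<int, 5000>& local) {
    std::lock_guard<std::mutex> guard(shared.m);
    shared.values = local;              // one bulk copy per publish
}

// Foreground task: take a private copy and read it with no further locking.
std::array<int, 5000> take_snapshot() {
    std::lock_guard<std::mutex> guard(shared.m);
    return shared.values;
}

The trade-off is that the foreground works on data that may be slightly stale between publishes, which is usually acceptable in this kind of design.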
As simplified case: I need to transfer a VARIANT to another process over the existing COM interface. I currently use the MIDL-generated marshaller.
The actual transfer is for many values, is part of a time-critical process, and may involve large strings or SAFEARRAYs (a few MB), so the number of copies made seems relevant.
Since the receiver needs to "keep" the data beyond the function call, at least one copy needs to be made by the marshaller. All signatures I can think of involve two copies, however:
SetValue([in] VARIANT)
GetValue([out] VARIANT *) // called by receiver
In both cases, in my understanding, the marshaller makes a cross-process copy that it destroys again after the call. Since I need to keep the data in the receiver, I need to make a second copy.
I considered "detaching" the data at the receiver:
SetValue([in, out] VARIANT *)
// receiver detaches value and sets to VT_EMPTY for return
But this would also destroy the source.
Q1: Is it possible to get the MIDL-generated marshaling code to do only one copy?
Q2: Would this be possible with a custom marshaller, and at what cost? (My first look into that was extremely discouraging.)
I am pretty much bound to using SAFEARRAY and/or other VARIANT/PROPVARIANT types, and to transferring the whole array.
[edit]
Both sides use C++, the interfaces are IUnknown-based, and it needs to work cross-process on a single machine, in the same context.
You don't say so explicitly, but it seems the problem you are seeking to solve is speed. In any case, consider using a profiler to identify the bottleneck if you haven't already done so.
I very much doubt in this case that it is the copying which is taking the time. Rather, it is likely to be the context-switching between processes involved, as you are getting the values one at a time. This means that for each value you retrieve, you have to switch processes to the target of the call, then switch back again.
You could speed this up enormously by making your design less "chatty" when setting or getting multiple values.
Something like this:
SetMultipleValues(
    [in] SAFEARRAY(BSTR)* asNames,
    [in] SAFEARRAY(VARIANT)* avValues
)

GetMultipleValues(
    [in] SAFEARRAY(BSTR)* asNames,
    [out,retval] SAFEARRAY(VARIANT)* pavValues
)
I.e. when calling GetMultipleValues, pass in an array of 10 names and receive an array of 10 values in the same order as the names passed in (or VT_EMPTY if a value does not exist).
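For illustration, a rough client-side sketch of calling GetMultipleValues as declared above. The interface IValueStore, the pointer pStore, and the value names are all hypothetical; the real signatures would come from the MIDL-generated header, and error handling is omitted:

#include <windows.h>
#include <oleauto.h>

// Hypothetical interface roughly matching the IDL above.
struct IValueStore : public IUnknown
{
    virtual HRESULT STDMETHODCALLTYPE SetMultipleValues(SAFEARRAY** asNames, SAFEARRAY** avValues) = 0;
    virtual HRESULT STDMETHODCALLTYPE GetMultipleValues(SAFEARRAY** asNames, SAFEARRAY** pavValues) = 0;
};

void fetch_two_values(IValueStore* pStore)
{
    SAFEARRAY* names = SafeArrayCreateVector(VT_BSTR, 0, 2);
    LONG i = 0;
    BSTR s = SysAllocString(L"Temperature");
    SafeArrayPutElement(names, &i, s);            // SafeArrayPutElement copies the BSTR
    SysFreeString(s);
    i = 1;
    s = SysAllocString(L"Pressure");
    SafeArrayPutElement(names, &i, s);
    SysFreeString(s);

    SAFEARRAY* values = nullptr;
    if (SUCCEEDED(pStore->GetMultipleValues(&names, &values)))
    {
        VARIANT v;
        VariantInit(&v);
        LONG idx = 0;
        SafeArrayGetElement(values, &idx, &v);    // copy of the first returned value
        // ... use v ...
        VariantClear(&v);
        SafeArrayDestroy(values);
    }
    SafeArrayDestroy(names);
}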
I'm working on some code that has a global array that can be accessed by two threads for reading and writing.
There will be no batch processing where a range of indexes are read or written, so I'm trying to figure out if I should lock the entire array or only the array index I am currently using.
The easiest solution would be to treat the whole array as a critical section and put a big fat lock around it, but can I avoid this and just lock an index?
Cheers.
Locking one index implies that you can keep track of which thread is accessing what part of the array. Keeping track of this information, which is shared between the reading and the writing thread, implies that you have one lock around this information. So, you still end up with a global lock.
In this situation, I think that the most efficient approaches are:
- using a reader/writer lock (sketched below)
- or dividing the big array into a few subsets, each subset using a distinct lock.
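For the reader/writer option, a minimal C++17 sketch using std::shared_mutex (the array size and function names are only illustrative):

#include <mutex>
#include <shared_mutex>

int data[1024];
std::shared_mutex rw;

int read_element(int index) {
    std::shared_lock<std::shared_mutex> guard(rw);   // many readers may hold this at once
    return data[index];
}

void write_element(int index, int value) {
    std::unique_lock<std::shared_mutex> guard(rw);   // writers get exclusive access
    data[index] = value;
}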
If this is C++, I suggest you use STL containers, such as std::vector or whatever else suits your job. They are fast, easy to use, and avoid memory leaks.
If you want to do it all by yourself, then of course one method is to use a single mutex (which is bad for concurrency).
Or you can use a reader/writer lock for the whole array.
I don't think it's feasible to make each element of the array thread-safe with its own lock; that would eat your memory. Check the link: there are three solutions with different outcomes. Test them and use the best one for your case. (Don't just decide "OK, I think my program needs the readers-preference algorithm"; try it in your system and decide, because we really can't assume such things.)
There is no way of knowing what will be optimal unless you profile under realistic running conditions. I would suggest implementing an array-like class, where you can lock a varying number of elements in groups. Then you fine-tune the size of these groups.
Another option would be to enqueue all read/write operations using an active object. This would make all access sequential, and means you could use a non-concurrent array type to store the data. It would require some sort of concurrent queue data structure under the hood.
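A rough sketch of that active-object idea, assuming a plain array of ints (all names here are made up, not from the original post): every read and write is posted as a task to a single worker thread, so the array itself is only ever touched from that thread.

#include <array>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>

class ActiveArray {
public:
    ActiveArray() : worker_([this] { run(); }) {}

    ~ActiveArray() {
        post([this] { done_ = true; });   // shut the worker down after pending tasks
        worker_.join();
    }

    // Queue a write; returns immediately.
    void set(std::size_t i, int value) {
        post([this, i, value] { data_[i] = value; });
    }

    // Queue a read; blocks until the worker has executed it.
    int get(std::size_t i) {
        std::promise<int> p;
        auto f = p.get_future();
        post([this, i, &p] { p.set_value(data_[i]); });
        return f.get();
    }

private:
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

    void run() {
        while (!done_) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return !tasks_.empty(); });
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();   // only the worker thread ever touches data_
        }
    }

    std::array<int, 1024> data_{};
    std::queue<std::function<void()>> tasks_;
    std::mutex m_;
    std::condition_variable cv_;
    bool done_ = false;
    std::thread worker_;   // declared last so it starts after the other members
};

The trade-off is that every read waits for the queue to drain up to its task, so this shape tends to favour write-heavy workloads.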
I have code that is running 24/7, and I am wondering if there is any methodology I could use that would allow me to make changes to the variables in real time without invoking any error. I had been using raw_input(), but this 'stops' the program since it runs sequentially.
My idea is to use a while true loop:
while True:
...
...
and for the first few loops it will use the default catch-all values that I have pre-programmed into the system. While it is running, I would like to make changes to some constant terms (which act as controls) in 'real time', so that in the next loop and beyond it uses the new values rather than the pre-programmed ones.
Some of your code or details of what you are trying to do would help.
But one way to do it is to have two processes: one that reads from standard input with raw_input(), which we can call p1, and one that handles the data structure (in this case the list), which we call p2.
The two processes could communicate with message passing, using sockets or whatever you want.
Then you need to avoid the race condition where new data has been read in p1 but not yet updated in p2, in which case p2 will carry on using out-of-date data. One way to handle this is by using locks.
Project: typical chat program. Server must receive text from multiple clients and fan each input out to all clients.
In the server I want each client to have a struct containing the socket fd and a std::queue. Each struct will be on a std::list.
As input is received from a client socket I want to iterate over the list of structs and put the new input into each client struct's queue. Each string is new'ed because I don't want copies of the string multiplied over all the clients. But I also want to avoid the headache of having multiple pointers to the string spread around and deciding when it is time to finally delete it.
Is this an appropriate occasion for a shared pointer? If so, is the shared_ptr's use count incremented each time I push one into a queue and decremented when I pop it from the queue?
Thanks for any help.
This is a case where a pseudo-garbage collector system will work much better than reference counting.
You need only one list of strings, because you "fan every input out to all clients". Because you will add to one end and remove from the other, a deque is an appropriate data structure.
Now, each connection needs only to keep track of the index of the last string it sent. Periodically (every 1000th message received, or every 4MB received, or something like that), you find the minimum of this index across all clients, and delete strings up to that point. This periodic check is also an opportunity to detect clients which have fallen far behind (possible broken connection) and recover. Without this check, a single stuck client will cause your program to leak memory (even under the reference counting scheme).
This scheme uses several times less data than reference counting, and it also removes one of the major points of cache contention (reference counts must be written from multiple threads, so they ruin performance). If you aren't using threads, it will still be faster.
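A rough single-threaded sketch of that scheme (the names are illustrative): one shared deque of messages, a per-client cursor, and a periodic trim of the prefix that every client has already sent.

#include <algorithm>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

struct Client {
    int fd = -1;
    std::size_t next_index = 0;            // absolute index of the next message to send
};

std::deque<std::string> messages;          // shared log of received messages
std::size_t base_index = 0;                // absolute index of messages.front()
std::vector<Client> clients;

// The next message this client still has to be sent.
const std::string& next_for(const Client& c) {
    return messages[c.next_index - base_index];
}

// Called periodically: drop every message that all clients have already sent.
void trim_delivered() {
    if (clients.empty()) return;
    std::size_t min_next = clients.front().next_index;
    for (const Client& c : clients)
        min_next = std::min(min_next, c.next_index);
    while (base_index < min_next && !messages.empty()) {
        messages.pop_front();
        ++base_index;
    }
}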
That is an appropriate use of a shared_ptr. And yes, the use count will be incremented, because a new shared_ptr is created for each push.
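A minimal sketch of that shared_ptr usage (single-threaded view; a real server would still need to synchronise access to the queues, and the names are made up):

#include <memory>
#include <queue>
#include <string>
#include <vector>

struct Connection {
    int fd = -1;
    std::queue<std::shared_ptr<const std::string>> outbox;
};

// One allocation per message, shared by every client's queue.
void fan_out(std::vector<Connection>& clients, std::string text) {
    auto msg = std::make_shared<const std::string>(std::move(text));
    for (Connection& c : clients)
        c.outbox.push(msg);                // use count +1 per queue it is pushed into
}

void on_sent(Connection& c) {
    c.outbox.pop();                        // use count -1; the last pop frees the string
}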