I'd like to make a system that pulls GitHub repositories automatically using
System.cmd("git", ["pull", link])
Is this command blocking? If I start it concurrently in many actors, will I always be able to get as many pulls as actors (or at least up to the socket limit for the system)?
If not, is there any way to achieve it?
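Erlang (and thus Elixir) IO is non-blocking, so the IO of one process does not generally affect other processes in any way. System.cmd blocks only the process that calls it until the command exits, so calling it from many processes at once runs the pulls concurrently. Joe Armstrong describes this in a blog post: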
So our code “looks like” we’re doing a synchronous blocking read.
Looks like was in quotes, because it’s not actually a blocking read,
it’s really an asynchronous read which does not block any other Erlang
processes.
I am working on an app that starts multiple streams in listener and caller modes after creating sockets. Right now, if I start one stream, the process hangs because the stream is waiting for data. So it is clear to me that I need to start the stream asynchronously, so that the rest of the app keeps working.
Do I start the stream in:
separate threads?
separate processes using fork?
I also read about select; will that work?
Do blocking/non-blocking sockets solve this problem?
This app is being written in C++.
You can either use a library like Boost.Asio or the C function poll() (or select() which does basically the same thing) to wait on multiple sockets at once. Either way, you want to "multiplex" the sockets, meaning you block until any of them has data available, then you read from that one. This is how many network applications are built, and is usually more efficient, more scalable, and less error-prone than having a thread or process for each connection.
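As a rough illustration of the multiplexing idea, here is a minimal sketch using poll(). The helper name read_ready and the 4096-byte buffer are assumptions made for this example, not anything from the question. It blocks until at least one descriptor has data and then reads once from each ready one.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: waits until at least one descriptor in `fds` is
// readable (or the timeout expires), then reads once from each ready one.
// The result is indexed like `fds`; descriptors that were not ready yield "".
std::vector<std::string> read_ready(const std::vector<int>& fds, int timeout_ms) {
    assert(timeout_ms >= -1);  // -1 means "wait forever" for poll()

    std::vector<pollfd> watched;
    for (int fd : fds) {
        pollfd p{};
        p.fd = fd;
        p.events = POLLIN;  // we only care about readability here
        watched.push_back(p);
    }

    std::vector<std::string> out(fds.size());
    int ready = poll(watched.data(), watched.size(), timeout_ms);
    if (ready <= 0) {
        return out;  // 0 = timeout, -1 = error: nothing to read either way
    }

    for (std::size_t i = 0; i < watched.size(); ++i) {
        if (watched[i].revents & (POLLIN | POLLHUP)) {
            char buf[4096];
            ssize_t n = read(watched[i].fd, buf, sizeof buf);
            if (n > 0) {
                out[i].assign(buf, static_cast<std::size_t>(n));
            }
        }
    }
    return out;
}
```

A real application would call something like this in a loop and dispatch each non-empty result to the stream that owns the descriptor.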
The program is a client-server socket application being developed in C on Linux. There is a remote server to which each client connects and logs itself as being online. There will most likely be several clients online at any given point in time, all trying to connect to the server to log themselves as being online/busy/idle etc. So how can the server handle these concurrent requests? What's a good design approach (forking/multithreading for each connection request, maybe)?
Personally I would use the event-driven approach for servers. There you register a callback that is called as soon as a connection arrives, and event callbacks that are called whenever the socket is ready to read or write.
With a huge number of connections you get a great performance and resource benefit compared to threads. But I would also prefer this for a smaller number of connections.
I would only use threads if you really need to use multiple cores, or if you have some requests that could take longer to process and it is too complicated to handle them without threads.
I use libev as the base library for event-driven networking.
Generally speaking, you want a thread pool to service requests.
A typical structure will start with a single thread that does nothing but queue up incoming requests. Since it doesn't do very much, it's typically pretty easy for one thread to keep up with the maximum speed of the network.
That thread puts the items into some sort of concurrent queue. Then you have a pool of other threads reading items from the queue, doing what's needed, then depositing the result in another queue (and repeating until the server shuts down).
Finally, you have another single thread that just takes items from the result queue, and sends replies out to the clients.
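A minimal sketch of that structure, with some loud assumptions: BlockingQueue and process_all are names invented for this example, the "request" is just an int that workers square, and the calling thread plays both the queuing role and the replying role instead of two dedicated threads.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical concurrent queue: pop() sleeps until an item arrives or the
// queue is closed and drained.
template <typename T>
class BlockingQueue {
public:
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();
    }

    // After close(), consumers drain what is left and then pop() returns false.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;  // closed and drained
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Queues every request, lets `workers` threads process them, and collects
// the results. Result order is not guaranteed.
std::vector<int> process_all(const std::vector<int>& requests, unsigned workers) {
    assert(workers > 0);
    BlockingQueue<int> pending;
    BlockingQueue<int> results;

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < workers; ++i) {
        pool.emplace_back([&pending, &results] {
            int job = 0;
            while (pending.pop(job)) {
                results.push(job * job);  // stand-in for the real work
            }
        });
    }

    for (int r : requests) pending.push(r);  // role of the queuing thread
    pending.close();
    for (std::thread& t : pool) t.join();
    results.close();

    std::vector<int> out;  // role of the replying thread
    int value = 0;
    while (results.pop(value)) out.push_back(value);
    return out;
}
```

In a server the queued item would carry the client's descriptor along with the request, so the replying side knows where to send each result.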
The best approach is a combination of the event-driven model with the multithreaded model.
You create a bunch of non-blocking sockets, but the thread count should be much lower, e.g. 10 sockets per thread.
Then you just listen for an event (incoming request) on every thread in non-blocking mode and process it as it happens.
This technique usually performs better than either non-blocking sockets or the multithreaded model on its own.
Take a look at Comer's "Internetworking with TCP/IP" volume 3 (BSD sockets version); it has detailed examples of different ways of writing servers and clients. The full code (sans explanations, unfortunately) is on the web. Or rummage around in http://tldp.org, where you'll find a collection of tutorials.
select or poll or epoll
These are facilities on *nix systems to aggregate multiple event sources (connections) into a single waiting point. The server adds the connections to a data structure, and then waits by calling select etc. It gets woken up when stuff happens on any of these connections, figures out which one, handles it, and then goes back to sleep. See manual for details.
There are several higher level libraries built on top of these mechanisms, that make programming them somewhat easier e.g. libevent, libev etc.
I'm writing a daemon that needs to both run in the background handling tasks and receive input directly from a frontend. I've been attempting to use sockets for this; however, I can't get it to work properly, since sockets pause the program while waiting for a connection. Is there any way to get around this?
I'm using the socket wrappers provided at http://linuxgazette.net/issue74/tougher.html
Thank you for any and all help
You will need to use threads to make the socket operations asynchronous, or use a library that has already implemented this; one of the top ones is Boost.Asio.
There are a few ways to handle this problem. The most common is using an event loop and something like libevent. Then you use non-blocking sockets.
Doing this in an event driven fashion can require a big shift in your program logic. But doing it with threads has its own complexities and isn't clearly a better choice.
Daemons usually use event loops to avoid the problem of waiting for events.
It's the smartest solution to the problem that you present (not blocking while waiting for an asynchronous event).
However, the entire daemon is usually built on top of the event loop and its callback architecture, which can force a partial rewrite. So the usual quick and dirty solution is to create a separate thread to handle those events, which usually creates more bugs than it solves. So, use an event loop:
libevent.
glib event loop.
libev.
boost::asio
...
From your description, you have already divided your application into a frontend (receiving input) and a backend (socket handling and tasks). If the input from the frontend is sent over the socket (via the backend) rather than received from the socket, then it seems like you are describing a client and not a server. Client programs are typically not implemented as daemons.
You have created a blocking socket and need to either monitor it in a separate thread of execution (a thread or even a separate process) or make it a non-blocking socket and poll frequently for updates.
The link to the LinuxGazette is a basic intro to network programming. If you would like a little more depth, take a look at Beej's Guide to Network Programming, where the various API calls available to you are explained in a little detail. It will, perhaps, make you appreciate wrapper libraries such as Boost.Asio more.
It can be worth retaining control of the event loop yourself; it's not complicated and provides flexibility down the track.
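For the non-blocking option, a minimal sketch of what that looks like with the POSIX calls fcntl() and read(). The helper names set_nonblocking and try_read are made up for this example, and it works on any descriptor, not only sockets.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

// Switches `fd` to non-blocking mode. Returns true on success.
bool set_nonblocking(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Tries to read without waiting. Returns the byte count, 0 at end of stream,
// or -1. When -1 only means "no data yet", `would_block` is set to true, so
// the caller can go do other work and try again later.
ssize_t try_read(int fd, char* buf, std::size_t len, bool& would_block) {
    ssize_t n = read(fd, buf, len);
    would_block = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));
    return n;
}
```

The daemon can then call try_read between its background tasks, although waiting in poll() or select() is usually kinder to the CPU than retrying in a tight loop.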
"C++ pseudo-code" for an event loop.
while (!done)
{
    bool workDone = false;
    // Loop over each event source or internal worker.
    for (auto& module : modules)
    {
        // If it has work to do, do some.
        if (module.hasWorkToDo())
        {
            // Generally, do as little work as possible; e.g. process a single event for this module.
            // But tinker with this to manage priorities if need be.
            // E.g. maybe allow the GUI to flush its queue.
            module.doSomeWork();
            workDone = true;
        }
    }
    if (!workDone)
    {
        // System idle. Sleep for a bit so we have benign idle behaviour.
        nanosleep(...);
    }
}
I'm designing an event loop for asynchronous socket IO using epoll/devpoll/kqueue/poll/select (including Windows select).
I have two options for performing an IO operation:
Non-blocking mode, poll on EAGAIN
Set socket to non-blocking mode.
Read/Write to socket.
If operation succeeds, post completion notification to event loop.
If I get EAGAIN, add socket to "select list" and poll socket.
Polling mode: poll and then execute
Add socket to select list and poll it.
Wait for notification that it is readable/writable.
Read/write.
Post completion notification to event loop if it succeeds.
To me it looks like the first would require fewer system calls in normal operation,
especially when writing to a socket (the buffers are quite big).
It also looks like it would reduce the number of "select" calls,
which is especially nice when you do not have something that scales as well
as epoll/devpoll/kqueue.
Questions:
Are there any advantages of the second approach?
Are there any portability issues with non-blocking operations on sockets/file descriptors across operating systems: Linux, FreeBSD, Solaris, Mac OS X, Windows?
Note: please do not suggest using existing event-loop/socket-API implementations.
I'm not sure there's any cross-platform problem; at the most you would have to use Windows Sockets API, but with the same results.
Otherwise, you seem to be polling in either case (avoiding blocking waits), so both approaches are fine. As long as you don't put yourself in a position to block (ex. read when there's no data, write when buffer's full), it makes no difference at all.
Maybe the first approach is easier to code/understand; so, go with that.
It might be of interest to you to check out the documentation of libev and the c10k problem for interesting ideas/approaches on this topic.
The first design is the Proactor Pattern, the second is the Reactor Pattern
One advantage of the reactor pattern is that you can design your API such that you don't have to allocate read buffers until the data is actually there to be read. This reduces memory usage while you're waiting for I/O.
From my experience with low-latency socket apps:
For writes: try to write directly into the socket from the writing thread (you need to obtain the event loop mutex for that). If the write is incomplete, subscribe to write readiness with the event loop (select/WaitForMultipleObjects) and write from the event loop thread when the socket becomes writable.
For reads: always be "subscribed" to read readiness for all sockets, so you always read from within the event loop thread when a socket becomes readable.
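The write path described above (which is also the question's first option) can be sketched roughly as follows. This is a simplification under stated assumptions: write_all is a name invented here, the descriptor is assumed to already be in non-blocking mode, and instead of handing the remainder to an event loop thread the function simply waits in poll() itself.

```cpp
#include <poll.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstddef>

// Writes all of `data` to `fd`. It tries write() first and only waits for
// write readiness when the kernel buffer is full (EAGAIN/EWOULDBLOCK).
// Returns how many times it had to wait in poll(), or -1 on error.
int write_all(int fd, const char* data, std::size_t len) {
    assert(data != nullptr);
    int waits = 0;
    std::size_t sent = 0;
    while (sent < len) {
        ssize_t n = write(fd, data + sent, len - sent);
        if (n > 0) {
            sent += static_cast<std::size_t>(n);  // common case: no poll needed
            continue;
        }
        if (n == -1 && errno == EINTR) {
            continue;  // interrupted by a signal: just retry
        }
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            pollfd p{};
            p.fd = fd;
            p.events = POLLOUT;
            // Buffer full: sleep until the descriptor is writable again.
            if (poll(&p, 1, -1) == -1 && errno != EINTR) return -1;
            ++waits;
            continue;
        }
        return -1;  // any other failure
    }
    return waits;
}
```

When the socket buffer has room, which is the normal case for small messages, the whole write costs a single system call and the return value is 0.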
I have a simple C++ application that generates reports on the back end of my web app (simple LAMP setup). The problem is that the back end loads a data file that takes about 1.5GB in memory. This won't scale very well if multiple users are running it simultaneously, so my thought is to split it into several programs:
Program A is the main executable that is always running on the server, and always has the data loaded, and can actually run reports.
Program B is spawned from PHP, makes a simple request to program A to get the info it needs, and returns the data.
So my questions are these:
What is a good mechanism for B to ask A to do something?
How should it work when A has nothing to do? I don't really want to be polling for tasks or otherwise spinning my tires.
Use a named mutex/event. This allows one thread (process A in your case) to sit waiting without spinning. When process B comes along needing something done, it signals the mutex/event, which wakes up process A, and you proceed.
If you are on Windows:
Mutex, Event
IPC on Linux works differently, but has the same capability:
Linux Stuff
Alternatively, for the C++ portion you can use one of the Boost IPC libraries, which are multi-platform. I'm not sure what PHP has available, but it will no doubt have something equivalent.
Use TCP sockets running on localhost.
Make the C++ application a daemon.
The PHP front-end creates a persistent connection to the daemon (see pfsockopen).
When a request is made, PHP sends it to the daemon, which processes it and sends the result back (see PHP Sockets, C++ Sockets).
EDIT
Added some links for reference. I might have some really bad C code that uses sockets for interprocess communication somewhere, but nothing handy.
IPC is easy in C++: just call the POSIX C API.
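For what it's worth, the daemon side of the localhost approach might look roughly like this. It is only a sketch under assumptions: the names listen_on_localhost and serve_one, the OS-chosen port, and the "report for ..." reply are all invented for this example. The point is that the daemon sleeps inside accept() while it has nothing to do, so there is no polling.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Creates a TCP listener bound to 127.0.0.1 on a port chosen by the OS.
// Returns the listening descriptor (or -1) and stores the port in `port_out`.
int listen_on_localhost(std::uint16_t& port_out) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  // local connections only
    addr.sin_port = 0;                              // 0 = let the OS pick
    socklen_t len = sizeof addr;

    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) == -1 ||
        listen(fd, 16) == -1 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) == -1) {
        close(fd);
        return -1;
    }
    port_out = ntohs(addr.sin_port);
    return fd;
}

// Handles exactly one request. The process sleeps inside accept() until a
// client connects, so an idle daemon uses no CPU.
bool serve_one(int listener) {
    assert(listener >= 0);
    int client = accept(listener, nullptr, nullptr);
    if (client == -1) return false;

    char buf[256];
    ssize_t n = read(client, buf, sizeof buf);
    bool ok = false;
    if (n > 0) {
        // Hypothetical reply; the real daemon would run the report here.
        std::string reply = "report for " + std::string(buf, static_cast<std::size_t>(n));
        ok = write(client, reply.data(), reply.size()) ==
             static_cast<ssize_t>(reply.size());
    }
    close(client);
    return ok;
}
```

A real daemon would use a fixed, configured port and call serve_one in a loop; the PHP side connects to that port, writes the request, and reads the reply.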
But what you're asking would be much better served by a queue manager. Make the background daemon wait for a message on the queue, and the frontend PHP just add there the specifications of the task it wants processed. Some queue managers allow the result of the task to be added to the same object, or you can define a new queue for the finish messages.
One of the best-known high-performance queue managers is RabbitMQ. Another one that is very easy to use is MemcacheQ.
Or you could just add a table to MySQL for tasks and have the background process query periodically for unfinished ones. This works and can be very reliable (such setups are sometimes called Ghetto queues), but it breaks down at high tasks/second.