Suppose we have systems that are speed-critical (for example, statistics/analytics, socket programming, etc.). How do we design their traces and logs?
To be more specific, logs and traces generally reduce performance (even with a switch-off mechanism or a verbosity-extension mechanism). In such scenarios, is there any reference guideline on where to put logs/traces so that when an issue occurs (especially at a production site), the developer/post-production team is able to pinpoint the actual cause?
PS: I come from a background where such applications are developed in C/C++ and run on Linux.
You can accumulate logs in an in-memory buffer whose records you describe and serialize using Google Protocol Buffers. A separate thread can periodically (say, every 5 minutes) flush this buffer to disk, or send it through a UNIX domain socket (or another Linux IPC mechanism) to a daemon that listens and writes the records to a persistent DB, or simply writes them to disk.
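A minimal sketch of that idea in C++ (plain strings instead of protobuf-encoded records, and names like BufferedLog and app.log invented for illustration): the hot-path threads only append to an in-memory buffer under a short lock, while a background thread does the slow write in batches.

```cpp
#include <chrono>
#include <fstream>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class BufferedLog {
public:
    BufferedLog() : flusher_(&BufferedLog::flushLoop, this) {}

    // Hot path: just append under a short lock, no I/O here.
    void log(std::string line) {
        std::lock_guard<std::mutex> lk(mtx_);
        buf_.push_back(std::move(line));
    }

private:
    void flushLoop() {
        for (;;) {  // shutdown handling omitted for brevity
            std::this_thread::sleep_for(std::chrono::minutes(5));
            std::vector<std::string> batch;
            {
                std::lock_guard<std::mutex> lk(mtx_);
                batch.swap(buf_);  // steal the batch, release the lock fast
            }
            std::ofstream f("app.log", std::ios::app);  // or a domain socket
            for (const std::string& s : batch) f << s << '\n';
        }
    }

    std::mutex mtx_;
    std::vector<std::string> buf_;
    std::thread flusher_;
};
```

The key property is that the producing threads never touch the disk; only the batch swap is synchronized, and it is just a pointer exchange.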
If you don't want to hit the disk on the machine that produces logs, you can send them to a different machine through a regular socket and write them to disk on that machine.
If you are aggregating logs from multiple machines, consider using 0MQ or Crossroads I/O as a message queue to pass your logs across the network to a machine where they are stored persistently. You can find some information about using 0MQ in conjunction with Google Protocol Buffers here.
We have two systems between which we want to exchange messages. I am currently designing the application and have been given two choices:
1. System 1 pushes messages to an intermediate location (FTP or SQS), and System 2 (running BizTalk) reads the messages from that location and processes them.
2. System 2 exposes a schema/orchestration as a web service, which System 1 consumes.
Any suggestions as to which method would be better in terms of error handling and scalability?
If you can, always go for an asynchronous approach through a queuing system. That way, your application can run independently of your back end. I would advise Service Bus for Windows Server (a heavier installation), Windows Azure Service Bus (as a service, in the cloud; internet connection needed), or MSMQ (store and forward included!). These provide transactional behavior and can be considered very reliable. Other, more lightweight options are file exchange or FTP.
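For illustration, the MSMQ send side is just a handful of native calls; here is a rough C++ sketch (the queue path and message label are hypothetical, and the queue is assumed to be transactional):

```cpp
#include <windows.h>
#include <mq.h>  // MSMQ C API; link with mqrt.lib

// Sends one message to a (hypothetical) private queue; the receiving
// system reads it independently, whenever it happens to be up.
HRESULT sendToQueue(const BYTE* payload, ULONG len) {
    QUEUEHANDLE hQueue = NULL;
    HRESULT hr = MQOpenQueue(L"DIRECT=OS:system2\\private$\\orders",
                             MQ_SEND_ACCESS, MQ_DENY_NONE, &hQueue);
    if (FAILED(hr)) return hr;

    MSGPROPID ids[2];
    MQPROPVARIANT vals[2];
    ids[0] = PROPID_M_LABEL;                 // human-readable label
    vals[0].vt = VT_LPWSTR;
    vals[0].pwszVal = const_cast<LPWSTR>(L"OrderMessage");
    ids[1] = PROPID_M_BODY;                  // raw message body
    vals[1].vt = VT_VECTOR | VT_UI1;
    vals[1].caub.pElems = const_cast<BYTE*>(payload);
    vals[1].caub.cElems = len;

    MQMSGPROPS props = { 2, ids, vals, NULL };
    // MQ_SINGLE_MESSAGE assumes a transactional queue; use
    // MQ_NO_TRANSACTION for a non-transactional one.
    hr = MQSendMessage(hQueue, &props, MQ_SINGLE_MESSAGE);
    MQCloseQueue(hQueue);
    return hr;
}
```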
Web service or REST connectivity is also very easy to set up, but then you have synchronous behavior, which has its benefits:
you can get a 'real-time' ack back when your message is delivered by BizTalk
it's easy to set up and to monitor
So, as usual, the answer is 'it depends'.
There's only a 'best way' for your particular app, and there are a number of conditions to consider.
The easiest way is a shared location on the file system (OS file system vs. FTP doesn't matter much), especially if order is not important.
If order has to be maintained or there's a guaranteed-delivery requirement, then a message queue is a good choice: MSMQ/WMQ.
Of course, HTTP/SOAP is always an option.
Realistically, any of these methods will get the message there, so you have to weigh the benefits of each protocol.
If I have a server running on my machine and several clients running on other networks, what are some approaches to testing for synchronicity between them? How would I know when a client goes out of sync?
I'm particularly interested in how network programmers in game design (or any application with continuous network exchange) do this, where real-time synchronicity is commonly a vital aspect of success.
I can see how this might easily be achieved on a LAN via side-by-side comparisons on separate machines... but once you broaden the scenario to include clients on foreign networks, I'm just not sure how it can be done without clogging up your messaging system with debug information, thereby changing the very synchronicity you would have had without that debug info being passed over the network.
So what are some ways that people get around this issue?
For example, do they simply induce/simulate latency on the local network before launching to foreign networks, and then hope for the best? I'm hoping there are some more concrete solutions, but this is what I'm doing in the meantime...
When you say synchronized, I believe you are talking about network latency, meaning that a client on a local network may get its gaming information sooner than a client on the other side of the country. Correct?
If so, then I'm sure you can look for books or papers that cover this kind of topic, but I can give you at least one way to detect this latency and provide a way to manage it.
To detect latency, your server can use a traceroute-style approach to determine how long it takes for data to reach each client. A common Linux example is traceroute: http://linux.about.com/library/cmd/blcmdl8_traceroute.htm. While the server is handling client data, it can also continuously collect latency statistics and feed them back to the clients. For example, the server can update each client on its own network latency and on the longest latency among the group of clients playing each other in a game.
The clients can then use the latency differences to decide when to process the data they receive from the server. For example, a client is told by the server that its network latency is 50 milliseconds and the maximum latency for its group is 300 milliseconds. The client then knows to wait 250 milliseconds before processing game data from the server. That way, each client processes game data from the server at approximately the same time.
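As a toy illustration of that rule (all names here are invented):

```cpp
#include <chrono>

// Latency figures the server pushes to this client (illustrative).
struct LatencyInfo {
    int ownMs;  // e.g. 50 ms for this client
    int maxMs;  // e.g. 300 ms, the slowest client in the group
};

// How long to hold an incoming update before applying it, so that
// every client in the group applies it at roughly the same moment.
std::chrono::milliseconds holdTime(const LatencyInfo& info) {
    return std::chrono::milliseconds(info.maxMs - info.ownMs);  // 300 - 50 = 250 ms
}

// On receipt of a game-state update (scheduleApply is hypothetical):
//   scheduleApply(update, std::chrono::steady_clock::now() + holdTime(info));
```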
There are many other (and probably better) ways to handle this situation, but that should get you started in the right direction.
Right now I have:
a multithreaded Windows service written in C++ that uses common static libraries as well as dynamic DLLs;
each thread performs different tasks and produces different errors (DB errors, function invocation errors, etc.). Each thread will also act as a logger client (and will send all its messages to a logger server);
a separate thread, which has no body yet, but which will act as the logger server handling all log messages from the logger clients.
I need good advice on how to turn the following idea into a working solution. The idea is to add a client-server logging architecture to my multithreaded service with the following requirements (though some parts I need to implement by myself, please consider just the basic idea of logger clients and a logger server):
there should be many log clients (as I already mentioned, a log client is just an existing worker thread); each should register itself with a unique name and/or ID and have the following behavior:
if the logger server is up and running, the log client starts sending log messages;
otherwise (the logger server is down), the log client keeps trying to register itself with the log server, with a small timeout between attempts.
there should be a logger server, with the following behavior:
the log server registers all log clients by their unique name and/or ID and continuously checks whether a new log client has appeared that needs to be registered;
the log server handles all messages from the different log clients and writes them to a DB, a file, etc.
there should be a way to establish a connection to the log server from an external application (for example, MySuperThreadViewerProgram, to monitor all thread activity/errors/etc.). On connection, the log server should treat the external application as one more log client. This is the most important requirement.
Summing up, there are three architecture parts to be implemented:
Server-client logger architecture;
A message-queue facility between the log clients and the log server, with the log server periodically checking whether there are any new log clients to register;
Inter-process communication between the log server and the external application, where the latter acts as a new log client.
Please note that I consider the logger server a kind of log-message router.
So, the main question is:
Is there any single solution (software framework) that has all the features described above (which would be much preferable), or should I use different libraries for the different parts?
If the answer is "there is no such solution", could you review the choices I made:
For #1: using the Pantheios logger framework;
For #2: using any kind of register-subscribe library with a client-server architecture and message-queue support (update: the ipc library);
For #3: using Boost.Interprocess (shared memory).
UPDATE:
A good example of #2 is this ipc library. Maybe I was a bit imprecise in describing the logger-client/logger-server relationship, but what I really mean is similar to the approach fully described and implemented in the ipc library: one entity (thread) subscribes to another to receive its messages (the "publish-subscribe" model).
I want to use this kind of technique to implement my logging architecture. But how?
UPDATE2:
The OS is Windows. Yeah, I know, under Linux there is a bunch of useful tools and frameworks (D-Bus, syslog). Maybe some of you could provide a helpful link to a cross-platform library that would fit? Maybe there is even a logger framework over D-Bus for Windows?
Any comments are highly appreciated.
Thanks a lot!
ØMQ (ZeroMQ) might be a viable alternative to the ipc library you mentioned, as it has a lot of features along the lines of your requirements.
It fully supports the PUB/SUB model and lets you communicate between threads, between processes, and even between machines. It gives you a client-server architecture, acts as a message queue, and works for IPC, too.
Of course, you need a specific way of encoding and decoding messages; Protocol Buffers are indeed a great fit here.
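A minimal PUB/SUB sketch with the plain 0MQ C API (the endpoint and message format are made up, and both sides are collapsed into one program just to show the calls):

```cpp
#include <zmq.h>
#include <chrono>
#include <cstdio>
#include <cstring>
#include <thread>

int main() {
    void* ctx = zmq_ctx_new();

    // Logger-server side: bind and subscribe to all topics.
    void* sub = zmq_socket(ctx, ZMQ_SUB);
    zmq_bind(sub, "tcp://*:5555");
    zmq_setsockopt(sub, ZMQ_SUBSCRIBE, "", 0);  // empty prefix = everything

    // Log-client side (normally another thread or process): connect and
    // publish. 0MQ reconnects transparently if the server goes down,
    // which matches the "retry until the server is up" requirement.
    void* pub = zmq_socket(ctx, ZMQ_PUB);
    zmq_connect(pub, "tcp://localhost:5555");

    // PUB/SUB "slow joiner": give the subscription a moment to propagate,
    // or the very first message is silently dropped.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));

    const char* msg = "thread-7|DB error: connection timeout";
    zmq_send(pub, msg, strlen(msg), 0);

    char buf[256];
    int n = zmq_recv(sub, buf, sizeof(buf) - 1, 0);
    if (n >= 0) { buf[n] = '\0'; printf("server got: %s\n", buf); }

    zmq_close(pub);
    zmq_close(sub);
    zmq_ctx_destroy(ctx);
    return 0;
}
```

The external viewer would just be one more SUB socket connecting to the same endpoint, which is exactly your "treat the external application as one more log client" requirement.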
As far as I know, the logging backend Pantheios uses (i.e., the log sink: DB, file, or whatever) is specified at link time. The severity of logs going to the backend can be specified at launch time and, with some simple tweaks, also at runtime.
If I understood you correctly, you have one process (let's forget about the external application for a minute) with multiple worker threads running. Some of these threads should log to one common backend (e.g., a DB) and some to another. Because Pantheios cannot do this out of the box, you'll need to write a custom backend that routes the logs to the correct sink.
If memory consumption is not an issue and you don't need the fastest logging performance, then you might want to look into log4cxx, because it is highly configurable and could spare you from implementing a client-server architecture with all the synchronization problems it brings.
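For instance, a rough sketch of what that could look like with log4cxx (the file and logger names are hypothetical); which appender each logger hierarchy goes to is decided in the configuration file, not in code:

```cpp
#include <log4cxx/logger.h>
#include <log4cxx/propertyconfigurator.h>

int main() {
    // Appenders, levels, and routing live in the config file, so no
    // custom client-server plumbing is needed (file name is invented).
    log4cxx::PropertyConfigurator::configure("logging.properties");

    // Hierarchical logger names let the config send, say, everything
    // under "worker.db" to one sink and "worker.net" to another.
    log4cxx::LoggerPtr dbLog  = log4cxx::Logger::getLogger("worker.db");
    log4cxx::LoggerPtr netLog = log4cxx::Logger::getLogger("worker.net");

    LOG4CXX_ERROR(dbLog, "connection to DB lost, retrying");
    LOG4CXX_INFO(netLog, "listener thread started");
    return 0;
}
```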
About the external application: if you can guarantee that there is only ever one external client, then you could use a pipe mechanism to communicate with the service. The service process would then have a separate thread (corresponding to your server thread) that opens a named pipe; that pipe can also be specified as a log sink, so your worker threads can log to it as well as to the other log sinks (DB, file, etc.).
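A bare-bones sketch of such a pipe thread with the Win32 API (the pipe name and log line are invented); real code would loop, handle errors, and reconnect:

```cpp
#include <windows.h>
#include <string>

// Service-side thread: opens a named pipe and relays log lines to the
// single external viewer once it connects.
void pipeLogServerThread() {
    HANDLE hPipe = CreateNamedPipeA(
        "\\\\.\\pipe\\MyServiceLog",      // hypothetical pipe name
        PIPE_ACCESS_OUTBOUND,              // service writes, viewer reads
        PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1,                                 // exactly one client instance
        64 * 1024, 64 * 1024, 0, NULL);
    if (hPipe == INVALID_HANDLE_VALUE) return;

    // Blocks until the viewer (e.g. MySuperThreadViewerProgram) connects.
    if (ConnectNamedPipe(hPipe, NULL) || GetLastError() == ERROR_PIPE_CONNECTED) {
        std::string line = "worker-3|DB error: connection lost\n";
        DWORD written = 0;
        WriteFile(hPipe, line.data(), (DWORD)line.size(), &written, NULL);
    }
    DisconnectNamedPipe(hPipe);
    CloseHandle(hPipe);
}
```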
There are some syslog servers for Windows as well. WinSyslog, for example, comes from the producers of the famous rsyslog. Once you have a syslog daemon running on Windows, there are plenty of OS-independent syslog clients, such as Syslog4j if you're using Java, or the syslog handler in standard Python logging.
I have a program that performs compute-heavy local processing. The compute packages have 16MiB input buffers and generate between 10KB and 16MiB of output. This all happens in a multi-threaded environment.
I have a number of users who are compute-bound. I'm interested in adding support for peer-to-peer multiprocessing. That is, clients would be run on multiple computers in the organization. The master would find available clients, send them 16MiB buffers, and then get the results.
The master will have a list of available clients. It will then send a message to each client that it is free. The clients will start a thread for each core that they have. Each thread will ask the master for a work buffer, will compute the results, and will then send the results to the master. The master will incorporate the results from each client into the final output.
A typical run involves crunching between 100 and 50,000 16MiB buffers. The output is typically in the 100MB to 10GB range. There are no data dependencies between the buffers.
I'm looking for the best way to arrange the communications between the client and the server. My plan is to use some kind of RPC. I do not want to embed a web server in either the client or the server. Instead, my plan is to simply receive connections on a TCP socket and have some kind of basic RPC.
Here are the options I'm considering:
I can roll my own client/server system and use protocol buffers to put the whole thing together.
I could use one of the existing RPC systems for protocol buffers.
I could embed a web server and use XMLRPC.
The implementation language is C++. The program compiles for Linux, macOS, and (with MinGW) for Windows.
Thanks.
We found that 0MQ met our criteria.
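For the record, a rough sketch of what the worker side of such a scheme can look like with 0MQ REQ sockets (the endpoint, the empty-message framing, and the crunch() kernel are illustrative, not anything 0MQ mandates):

```cpp
#include <zmq.h>
#include <string>

// Hypothetical compute kernel: 16MiB input buffer -> 10KB..16MiB output.
std::string crunch(const char* data, size_t len);

// Each worker thread asks the master for work, computes, and submits the
// result with its next request (REQ enforces this strict send/receive
// alternation).
void workerThread(void* ctx) {
    void* sock = zmq_socket(ctx, ZMQ_REQ);
    zmq_connect(sock, "tcp://master.example.org:6000");  // hypothetical master

    std::string result;  // empty on the first round = "I'm ready"
    for (;;) {
        zmq_send(sock, result.data(), result.size(), 0);

        zmq_msg_t in;
        zmq_msg_init(&in);
        zmq_msg_recv(&in, sock, 0);
        size_t len = zmq_msg_size(&in);
        if (len == 0) { zmq_msg_close(&in); break; }  // empty reply = no more work

        result = crunch(static_cast<const char*>(zmq_msg_data(&in)), len);
        zmq_msg_close(&in);
    }
    zmq_close(sock);
}
```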
I'm planning to develop a program for our university research that has to send lots of POST requests to different URLs. It must work as quickly as possible (we need to process about 100kk, i.e. 100 million, URLs). What language should I use? (Currently I write in C++, Delphi, and a bit of Perl.)
Also, I've heard that it's possible to write a multithreaded app in Perl using prefork that can process about 20-30k requests per minute. Is that true?
// Sorry for my bad English, but this seems to be the only place where I can get the right answer
Andrew
The 20-30k-per-minute figure is completely arbitrary. If you run this on an 8-core machine with a beefy network connection, you could probably surpass that.
However, I don't think your choice of programming language or library is going to matter much here. Instead, you're going to run into the number of concurrent TCP connections allowed by the machine, and into the bandwidth of the link itself.
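To illustrate, a rough sketch with libcurl's multi interface (the URLs and POST body are placeholders): one thread drives many concurrent POSTs, so the bottleneck becomes the link and the OS limits rather than the language.

```cpp
#include <curl/curl.h>
#include <string>
#include <vector>

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURLM* multi = curl_multi_init();

    // Placeholder URL list; a real run would feed batches from a file.
    std::vector<std::string> urls = { "http://example.org/a", "http://example.org/b" };
    for (const auto& url : urls) {
        CURL* easy = curl_easy_init();
        curl_easy_setopt(easy, CURLOPT_URL, url.c_str());
        curl_easy_setopt(easy, CURLOPT_POSTFIELDS, "q=test");  // placeholder body
        curl_multi_add_handle(multi, easy);
    }

    int running = 0;
    do {
        curl_multi_perform(multi, &running);
        curl_multi_wait(multi, NULL, 0, 1000, NULL);  // wait for socket activity
    } while (running > 0);

    // Cleanup of completed easy handles (curl_multi_info_read) omitted
    // for brevity.
    curl_multi_cleanup(multi);
    curl_global_cleanup();
    return 0;
}
```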
Webserver Stress Tool claims to be capable of simulating the HTTP requests generated by up to 10,000 simultaneous users, and it has an entry on Torry's site; presumably it's written in Delphi or C++ Builder.
My suggestion:
You can write your own custom stress tool (an HTTP(S) client) in Delphi (it happens to be my favorite language, so I advocate it) using a light HTTP(S) library such as the RTC SDK, with OmniThreadLibrary for multithreading.
See this page for a clue/hint.
Edit:
Excerpt from Demos\Readme_Demos.txt in RealThinClient_SDK331.zip
App Client, Server and ISAPI demos can be used to stress-test RTC
component using Remote Functions with strong encryption by opening
hundreds of connections from each client and flooding the
Server/ISAPI with requests.
App Client Demo is ideal for stress-testing RTC remote functions using
multiple connections in multi-threaded mode, visually showing activity
and stage for each connection in a live graph. Client can choose
between "Proxy" and standard connection components, to see the
difference in bandwidth usage and distribution.
I have heard Erlang is pretty good for such applications, as it is very efficient at spawning many processes quickly. But I think Python would be fine too; just use popen to spawn multiple processes.
After all, you are limited in how many you can run at the same time by how many processors your machine has. The choice of language may not matter that much, depending on what you are doing with the data downloaded from these URLs, as that may be more processing-intensive than the cost of spawning.