Logging Etiquette - c++

I have a server program that I am writing. In this program, I log allot. Is it customary in logging (for a server) to overwrite the log of previous runs, append to the file with some sort of new run header, or to create a new log file (it won't be restarted too often).
Which of these solutions is the way of doing things under Linux/Unix/MacOS?
Also, can anyone suggest a logging library for C++/C? I need one, regardless of the answer to the above question.

Take a look in /var/log/...you'll see that files are structured like
serverlog
serverlog.1
serverlog.2
This is done by logrotate which is called in a cronjob. But everything is simply in chronological order within the files. So you should just append to the same log file each time, and let logrotate split it up if needed.
You can also add a configuration file to /etc/logrotate.d/ to control how a particular log is rotated. Depending on how big your logfiles are, it might be a good idea to add here information about your logging. You can take a look at other files in this directory to see the syntax.

This is a rather complex issue. I don't think that there is a silver bullet that will kill all your concerns in one go.
The first step in deciding what policy to follow would be to set your requirements. Why is each entry logged? What is its purpose? In most cases this will result in some rather concrete facts, such as:
You need to be able to compare the current log with past logs. Even when an error message is self-evident, the process that led to it can be determined much faster by playing spot-the-difference, rather than puzzling through the server execution flow diagram - or, worse, its source code. This means that you need at least one log from a past run - overwriting blindly is a definite No.
You need to be able to find and parse the logs without going out of your way. That means using whatever facilities and policies are already established. On Linux it would mean using the syslog facility for important messages, to allow them to appear in the usual places.
There is also some good advice to heed:
Time is important. No only because there's never enough of it, but also because log files without proper timestamps for each entry are practically useless. Make sure that each entry has a timestamp - most system-wide logging facilities will do that for you. Make also sure that the clocks on all your computers are as accurate as possible - using NTP is a good way to do that.
Log entries should be as self-contained as possible, with minimal cruft. You don't need to have a special header with colors, bells and whistles to announce that your server is starting - a simple MyServer (PID=XXX) starting at port YYYYY would be enough for grep (or the search function of any decent log viewer) to find.
You need to determine the granularity of each logging channel. Sending several GB of debugging log data to the system logging daemon is not a good idea. A good approach might be to use separate log files for each logging level and facility, so that e.g. user activity is not mixed up with low-level data that in only useful when debugging the code.
Make sure your log files are in one place, preferably separated from other applications. A directory with the name of your application is a good start.
Stay within the norm. Sure you may have devised a new nifty logfile naming scheme, but if it breaks the conventions in your system it could easily confuse even the most experienced operators. Most people will have to look through your more detailed logs in a critical situation - don't make it harder for them.
Use the system log handling facilities. E.g. on Linux that would mean appending to the same file and letting an external daemon like logrotate to handle the log files. Not only would it be less work for you, it would also automatically maintain any general logging policies as a whole.
Finally: Always copy log important data to the system log as well. Operators watch the system logs. Please, please, please don't make them have to look at other places, just to find out that your application is about to launch the ICBMs...

https://stackoverflow.com/questions/696321/best-logging-framework-for-native-c
For the logging, I would suggest creating a new log file and clean it using a certain frequency to avoid it growing too fat. Overwrite logs of previous login is usually a bad idea.

Related

Process queue as folder with files. Possible problems?

I have an executable that needs to process records in the database when the command arrives to do so. Right now, I am issuing commands via TCP exchange but I don't really like it cause
a) queue is not persistent between sessions
b) TCP port might get locked
The idea I have is to create a folder and place files in it whose names match the commands I want to issue
Like:
1.23045.-1.1
2.999.-1.1
Then, after the command has been processed, the file will be deleted or moved to Errors folder.
Is this viable or are there some unavoidable problems with this approach?
P.S. The process will be used on Linux system, so Antivirus problems are out of the question.
Yes, a few.
First, there are all the problems associated with using a filesystem. Antivirus programs are one (though I cannot see why it doesn't apply to Linux - no delete locks?). Disk space, file and directory count maximums are others. Then, open file limits and permissions...
Second, race conditions. If there are multiple consumers, more than one of them might see and start processing the command before the first one has [re]moved it.
There are also the issues of converting commands to filenames and vice versa, and coming up with different names for a single command that needs to be issued multiple times. (Though these are programming issues, not design ones; they'll merely annoy.)
None of these may apply or be of great concern to you, in which case I say: Go ahead and do it. See what we've missed that Real Life will come up with.
I probably would use an MQ server for anything approaching "serious", though.

the best approaches for logging localization using c++

I am working on a multinational project where target audience for logs might be from two nationalities. Therefore it is becoming important to log in more than one language , I am thinking about writing to 2 different log folders based on language every time I am logging something, but I am also wondering if there's some out of the box functionality that is coming along with logging frameworks like log4cpp?
As other commenters have mentioned, it sounds like you are going down the wrong track by looking to do multilingual logging.
My recommendation would be to use English (which is the standard for technical information, and which I guess is the language you know best) and to make sure that the language you use is clear, grammatically correct and unambiguous. Then if one of the technicians cannot understand it, they can very easily and efficiently run it through a machine translation engine such as Google Translate. Or indeed they could process the logs and run everything through Google Translate to append translated text, particularly if you annotate the logs to mark the language content.
Assuming that the input language is well-written, machine transation usually gives a good result which the end user can understand. If the message isn't clear, has typos or abbreviations, then that's where machine translation fails spectacularly.
Writing log naturally brings down the speed of execution due to file open, seek and write operations involved as part of it.
This is one primary reason why many developer and architects suggest to write log at different levels.Increasing the depth of log entries as level increases to trace down the problems better. At higher level, you will notice that your process speed drops due to more log entries getting generated.
Rather suggest you to use services that can translate from one language to other.
I'm sure there are libraries free or paid which does this translation. You can create a small utility program that runs in the background and does this conversion during process idle time.
Well one suggestion is you can use a different process/thread which listens for your log messages, which you can log it from there ..
This reduces I/O logging time in your main process/thread and you can make all changes related to Logging language over there..
For multi - Lingual support I think you can try writing with widechar string .. though I am not sure..
the best approaches for logging localization using c++
Install Qt 4 and use QObject::tr/ tr() macro for strings. Write strings in whatever language you want. Hire/Get a translator to localize strings using QT Linguist.
Please note that perfect translation is impossible, so there will be many "amusing" misunderstandings, even if your translator is a genius. So it might be a better idea to select main language for programming team.
--EDIT--
Didn't notice this part before:
in more than one language
One way to approach it is to implement log reader. Instead of writing plaintext messages, you could dump message ids (generated by some kind of macros) and string arguments if strings are formatted. "Log reader" will allow user to select desired language while viewing log file, and translate messages based on their ids/arguments using mechanism similar to QTranslator. The good thing about this approach is that you'll be able to add more languages later - so it'll be possible to retranslate old logs. The bad thing is that this format will be harder to read for "normal human", although you can add plaintext messages in addition to message ids and arguments and you'll need to write log viewer.
Qt 4 has most of this framework implemented (there are routines for dumping variants into text/data streams, and so on) along with translation tool. See QTranslator documentation and Linguist manual for more info.

Methods for encrypting an archive in C++

I'm writing a game that will have a lot of information (configuration, some content, etc) inside of some xml documents, as well as resource files. This will make it easier for myself and others to edit the program without having to edit the actual C++ files, and without having to recompile.
However, as the program is starting to grow there is an increase of files in the same directory as the program. So I thought of putting them inside a file archive (since they are mostly text, it goes great with compression).
My question is this: Will it be easier to compress all the files and:
Set a password to it (like a password-protected ZIP), then provide the password when the program needs it
Encrypt the archive with Crypto++ or similar
Modify the file header slightly as a "makeshift" encryption, and fix the file's headers while the file is loaded
I think numbers 1 and 2 are similar, but I couldn't find any information on whether zlib could handle password-protected archives.
Also note that I don't want the files inside the archive to be "extracted" into the folder while the program is using it. It should only be in the system's memory.
I think you misunderstands the possibilities brought up by encryption.
As long as the program is executed on an untrusted host, it's impossible to guarantee anything.
At most, you can make it difficult (encryption, code obfuscation), or extremely difficult (self-modifying code, debug/hooks detection), for someone to reverse engineer the code, but you cannot prevent cracking. And with Internet, it'll be available for all as soon as it's cracked by a single individual.
The same goes, truly, for preventing an individual to tamper with the configuration. Whatever the method (CRC, Hash --> by the way encryption is not meant to prevent tampering) it is still possible to reverse engineer it given sufficient time and means (and motivation).
The only way to guarantee an untampered with configuration would be to store it somewhere YOU control (a server), sign it (Asymmetric) and have the program checks the signature. But it would not, even then, prevent someone from coming with a patch that let's your program run with a user-supplied (unsigned) configuration file...
And you know the worst of it ? People will probably prefer the cracked version because freed from the burden of all those "security" measures it'll run faster...
Note: yes it is illegal, but let's be pragmatic...
Note: regarding motivation, the more clever you are with protecting the program, the more attractive it is to hackers --> it's like a brain teaser to them!
So how do you provide a secured service ?
You need to trust the person who executes the program
You need to trust the person who stores the configuration
It can only be done if you offer a thin client and executes everything on a server you trust... and even then you'll have trouble making sure that no-one finds doors in your server that you didn't thought about.
In your shoes, I'd simply make sure to detect light tampering with the configuration (treat it as hostile and make sure to validate the data before running anything). After all file corruption is equally likely, and if a corrupted configuration file meant a ruined client's machine, there would be hell to pay :)
If I had to choose among your three options, I'd go for Crypto++, as it fits in nicely with C++ iostreams.
But: you are
serializing your data to XML
compressing it
encrypting it
all in memory, and back again. I'd really reconsider this choice. Why not use eg. SQLite to store all your data in a file-based database (SQLite doesn't require any external database process)?
Encryption can be added through various extensions (SEE or SQLCipher). It's safe, quick, and completely transparent.
You don't get compression, but then again, by using SQLite instead of XML, this won't be an issue anyway (or so I think).
Set a password to it (like a password-protected ZIP), then provide the password when the program needs it
Firstly, you can't do this unless you are going to ask a user for the password. If that encryption key is stored in the code, don't bet on a determined reverse engineer from finding it and decrypting the archive.
The one big rule is: you cannot store encryption keys in your software, because if you do, what is the point of using encryption? I can find your key.
Now, onto other points. zlib does not support encryption and as they point out, PKZip is rather broken anyway. I suspect if you were so inclined to find one, you'd probably find a zip/compression library capable of handling encryption. (ZipArchive I believe handles Zip+AES but you need to pay for that).
But I second Daniel's answer that's just displayed on my screen. Why? Encryption/compression isn't going to give you any benefit unless the user presents some form of token (password, smartcard etc) not present in your compiled binary or related files. Similarly, if you're not using up masses of disk space, why compress?

Easiest way to sign/certify text file in C++?

I want to verify if the text log files created by my program being run at my customer's site have been tampered with. How do you suggest I go about doing this? I searched a bunch here and google but couldn't find my answer. Thanks!
Edit: After reading all the suggestions so far here are my thoughts. I want to keep it simple, and since the customer isn't that computer savy, I think it is safe to embed the salt in the binary. I'll continue to search for a simple solution using the keywords "salt checksum hash" etc and post back here once I find one.
Obligatory preamble: How much is at stake here? You must assume that tampering will be possible, but that you can make it very difficult if you spend enough time and money. So: how much is it worth to you?
That said:
Since it's your code writing the file, you can write it out encrypted. If you need it to be human readable, you can keep a second encrypted copy, or a second file containing only a hash, or write a hash value for every entry. (The hash must contain a "secret" key, of course.) If this is too risky, consider transmitting hashes or checksums or the log itself to other servers. And so forth.
This is a quite difficult thing to do, unless you can somehow protect the keypair used to sign the data. Signing the data requires a private key, and if that key is on a machine, a person can simply alter the data or create new data, and use that private key to sign the data. You can keep the private key on a "secure" machine, but then how do you guarantee that the data hadn't been tampered with before it left the original machine?
Of course, if you are protecting only data in motion, things get a lot easier.
Signing data is easy, if you can protect the private key.
Once you've worked out the higher-level theory that ensures security, take a look at GPGME to do the signing.
You may put a checksum as a prefix to each of your file lines, using an algorithm like adler-32 or something.
If you do not want to put binary code in your log files, use an encode64 method to convert the checksum to non binary data. So, you may discard only the lines that have been tampered.
It really depends on what you are trying to achieve, what is at stakes and what are the constraints.
Fundamentally: what you are asking for is just plain impossible (in isolation).
Now, it's a matter of complicating the life of the persons trying to modify the file so that it'll cost them more to modify it than what they could earn by doing the modification. Of course it means that hackers motivated by the sole goal of cracking in your measures of protection will not be deterred that much...
Assuming it should work on a standalone computer (no network), it is, as I said, impossible. Whatever the process you use, whatever the key / algorithm, this is ultimately embedded in the binary, which is exposed to the scrutiny of the would-be hacker. It's possible to deassemble it, it's possible to examine it with hex-readers, it's possible to probe it with different inputs, plug in a debugger etc... Your only option is thus to make debugging / examination a pain by breaking down the logic, using debug detection to change the paths, and if you are very good using self-modifying code. It does not mean it'll become impossible to tamper with the process, it barely means it should become difficult enough that any attacker will abandon.
If you have a network at your disposal, you can store a hash on a distant (under your control) drive, and then compare the hash. 2 difficulties here:
Storing (how to ensure it is your binary ?)
Retrieving (how to ensure you are talking to the right server ?)
And of course, in both cases, beware of the man in the middle syndroms...
One last bit of advice: if you need security, you'll need to consult a real expert, don't rely on some strange guys (like myself) talking on a forum. We're amateurs.
It's your file and your program which is allowed to modify it. When this being the case, there is one simple solution. (If you can afford to put your log file into a seperate folder)
Note:
You can have all your log files placed into a seperate folder. For eg, in my appplication, we have lot of DLLs, each having it's own log files and ofcourse application has its own.
So have a seperate process running in the background and monitors the folder for any changes notifications like
change in file size
attempt to rename the file or folder
delete the file
etc...
Based on this notification, you can certify whether the file is changed or not!
(As you and others may be guessing, even your process & dlls will change these files that can also lead to a notification. You need to synchronize this action smartly. That's it)
Window API to monitor folder in given below:
HANDLE FindFirstChangeNotification(
LPCTSTR lpPathName,
BOOL bWatchSubtree,
DWORD dwNotifyFilter
);
lpPathName:
Path to the log directory.
bWatchSubtree:
Watch subfolder or not (0 or 1)
dwNotifyFilter:
Filter conditions that satisfy a change notification wait. This parameter can be one or more of the following values.
FILE_NOTIFY_CHANGE_FILE_NAME
FILE_NOTIFY_CHANGE_DIR_NAME
FILE_NOTIFY_CHANGE_SIZE
FILE_NOTIFY_CHANGE_SECURITY
etc...
(Check MSDN)
How to make it work?
Suspect A: Our process
Suspect X: Other process or user
Inspector: The process that we created to monitor the folder.
Inpector sees a change in the folder. Queries with Suspect A whether he did any change to it.
if so,
change is taken as VALID.
if not
clear indication that change is done by *Suspect X*. So NOT VALID!
File is certified to be TAMPERED.
Other than that, below are some of the techniques that may (or may not :)) help you!
Store the time stamp whenever an application close the file along with file-size.
The next time you open the file, check for the last modified time of the time and its size. If both are same, then it means file remains not tampered.
Change the file privilege to read-only after you write logs into it. In some program or someone want to tamper it, they attempt to change the read-only property. This action changes the date/time modified for a file.
Write to your log file only encrypted data. If someone tampers it, when we decrypt the data, we may find some text not decrypted properly.
Using compress and un-compress mechanism (compress may help you to protect the file using a password)
Each way may have its own pros and cons. Strength the logic based on your need. You can even try the combination of the techniques proposed.

How do I extract the network protocol from the source code of the server?

I'm trying to write a chat client for a popular network. The original client is proprietary, and is about 15 GB larger than I would like. (To be fair, others call it a game.)
There is absolutely no documentation available for the protocol on the internet, and most search results only come back with the client's scripting interface. I can understand that, since used in the wrong way, it could lead to ruining other people's experience.
I've downloaded the source code of a couple of alternative servers, including the one I want to connect to, but those
contain no documentation other than install instructions
are poorly commented (I did a superficial browsing)
are HUGE (the src folder of the target server contains 12 MB worth of .cpp and .h files), and grep didn't find anything related
I've also tried searching their forums and contacting the maintainers of the server, but so far, no luck.
Packet sniffing isn't likely to help, as the protocol relies heavily on encryption.
At this point, all my hope is my ability to chew through an ungodly amount of code. How do I start?
Edit: A related question.
If your original code is encrypted with some well known library like OpenSSL or Ctypto++ it might be useful to write your wrapper for the main entry points of these libraries, then delagating the call to the actual library. If you make such substitution and build the project successfully, you will be able to trace everything which goes out in the plain text way.
If your project is not using third party encryption libs, hopefully it is still possible to substitute the encryption routines with some wrappers which trace their input and then delegate encryption to the actual code.
Your bet is that usually enctyption is implemented in separate, relatively small number of source files so that should be easier for you to track input/output in these files.
Good luck!
I'd say
find the command that is used to send data through the socket (the call depends on the network library)
find references of this command and unroll from there. If you can modify-recompile the server code, it might help.
On the way, you will be able to log decrypted (or, more likely, not yet encrypted) network activity.
IMO, the best answer is to read the source code of the alternative server. Try using a good C++ IDE to help you. It will make a lot of difference.
It is likely that the protocol related material you need to understand will be limited to a subset of the files. These will contain references to network sockets and things. Start from there and work outwards as far as you need to.
A viable approach is to tackle this as a crypto challenge. That makes it easy, because you control so much.
For instance, you can use a current client to send a known message to the server, and then check server memory for that string. Once you've found out in which object the string ends, it also becomes possible to trace its ancestry through the code. Set a breakpoint on any non-const method of the object, and find the stacktraces. This gives you a live view of how messages arrive at the server, and a list of core functions essential to message processing. You can next find related functions (caller/callee of the functions on your list).