python2.7: reading journald's binary log - python-2.7

I'm currently working on a parser for plaso. For this I need to read journald's binary log files and convert them to plaso timeline objects.
My question now is: how do I read a binary file in Python, keeping in mind that the file may contain strings and integers? Is a byte array sufficient for this? If so, how can I find the correct delimiters for the message fields?
Since I'm new to Python, I can't provide useful code just yet; I'm still trying to wrap my head around this.

You can deal with binary data using the struct module.
If I were you, I would look up the structure of journald's files (in the journald docs or its source code) and parse the binary data into those fields.
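As a hedged sketch of that approach: a journal file begins with the fixed signature "LPKSHHRH", followed by a few flag and state fields. The offsets below follow systemd's journal-file-format documentation, but verify them against your systemd version before relying on them:

```python
import struct

# First fields of a journald file header (layout assumed from the
# systemd journal-file-format docs -- check against your version):
#   char signature[8];       -- always "LPKSHHRH"
#   le32 compatible_flags;
#   le32 incompatible_flags;
#   u8   state;
#   u8   reserved[7];
HEADER_FMT = "<8sIIB7x"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes

def parse_journal_header(data):
    """Unpack the leading fields of a journald file header from raw bytes."""
    signature, compat, incompat, state = struct.unpack_from(HEADER_FMT, data)
    if signature != b"LPKSHHRH":
        raise ValueError("not a journald file: bad signature %r" % signature)
    return {"compatible_flags": compat,
            "incompatible_flags": incompat,
            "state": state}
```

Usage against a real file would then be reading the first `HEADER_SIZE` bytes in binary mode (`open("system.journal", "rb")`) and passing them to `parse_journal_header`; the remaining objects in the file follow the same unpack-by-known-layout pattern.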

Related

Writing and reading ADTF3 Files

I am using the ADTF libraries to write structured data. I need to verify whether the data is being written properly. How can I do this?
I am assuming you are writing structured data to a .dat/.adtfdat file. In that case, you can always convert a .dat/.adtfdat file into a CSV to verify. See the examples on how to do so.
If you have access to MATLAB, then the easiest way would be using a simple MATLAB function: adtffilereader.
Alternatively, there are these tools that help in extracting data out of a dat file.

How to convert .trc file type to text file using C++?

I have got a trace file that is binary in nature. I want to convert it to a text file and convert the data inside it to decimal form. I mean I am not sure, how to do this. This .trc file contains data in the form of telegrams and I want to extract particular kind of telegram and save them in text file which is readable in nature. I have to do all of this using C++.
Do you suggest any other language for it, or does anyone have any idea about doing this in C++?
Binary trace files are usually encoded in proprietary formats. And there are applications or profilers specifically built to parse them.
Unless you know the file format, the only way to decode it is through reverse engineering. And in most cases it's not worth the effort.
Try to find documentation about it. Or maybe an application or utility that loads the file and exports data that is easier to read.
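If you do end up reverse engineering it, a common first step is to dump the raw bytes next to candidate decimal decodings and look for repeating record sizes, magic numbers, and length fields. A minimal sketch (in Python for brevity; the 8-byte fixed-size "telegram" and the little-endian 16-bit layout are pure guesses to be adjusted once the real format is known):

```python
import struct

def format_telegrams(data, record_size=8):
    """Format fixed-size records ("telegrams") as hex plus decimal.

    record_size and the little-endian 16-bit field layout are
    assumptions for exploration, not the actual .trc format.
    """
    lines = []
    for offset in range(0, len(data) - record_size + 1, record_size):
        rec = data[offset:offset + record_size]
        # hex view of the raw bytes
        hexed = " ".join("%02x" % b for b in bytearray(rec))
        # candidate decoding: little-endian unsigned 16-bit words
        words = struct.unpack("<%dH" % (record_size // 2), rec)
        lines.append("%08x  %s  %s" % (offset, hexed, words))
    return lines
```

Reading the whole file with `open(path, "rb")` and printing these lines side by side often reveals where headers end and repeating telegram structures begin.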
In case you are speaking about .trc binary files from Teledyne LeCroy oscilloscopes, I would suggest using any of the following libraries for that:
https://pypi.org/project/lecroyparser/
https://github.com/jneer/lecroy-reader
https://github.com/yetifrisstlama/readTrc
https://igit.ific.uv.es/ferhue/lecroyparser

Append to compressed file using zlib

Looking around, I have found this question asked before, but no great answers, so apologies if this is a Stack Overflow duplicate!
My goal is to have a zlib compressed file that I append to using C/C++ at different intervals (such as a log file). Due to buffer size constraints I was hoping to avoid having to keep the entire file in memory for appending new items.
Mark Adler's answer was very close to what I needed, but since I was already entrenched in the zlib library and on an embedded device with limited resources, I was/am stuck.
I ended up simply appending a delimiter to each section of data (e.g. ##delimiter##); once the finished file is ready to be read, a different application seeks out these sections and builds an array of the compressed sections, which are then decompressed individually.
I am still marking Adler's answer as correct, as it contains useful info that will be of more help to other programmers.
It sounds like you are trying to keep something like a compressed log, appending small amounts of data each time. For that you can look at gzlog.h and gzlog.c for an example of how to do this.
You can also look at gzappend, which appends data to a gzip file.
These are all easily adaptable to a zlib stream.
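A note on the delimiter scheme from the question: it can be avoided entirely, because a zlib stream is self-terminating. Each append can be an independent stream, and the reader walks them back to back. A sketch in Python's zlib (the same idea carries over to the C API, where inflate reports leftover input after the end of a stream):

```python
import zlib

def append_section(path, payload):
    """Append one independently compressed zlib stream to the file."""
    with open(path, "ab") as f:
        f.write(zlib.compress(payload))

def read_sections(path):
    """Decompress back-to-back zlib streams.

    Each stream terminates itself, so no explicit delimiter is needed;
    unused_data holds whatever bytes follow the end of the stream.
    """
    with open(path, "rb") as f:
        data = f.read()
    sections = []
    while data:
        d = zlib.decompressobj()
        sections.append(d.decompress(data))
        data = d.unused_data
    return sections
```

The trade-off versus a single stream is somewhat worse compression (each section starts with an empty dictionary), which is exactly what gzlog.h works around.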

Real time parsing

I am quite new to parsing text files. While googling a bit, I found out that a parser usually builds a tree structure out of a text file. Most of the examples consist of parsing files, which in my view is quite static: you load the file into the parser and get the output.
My problem is somewhat different from parsing files. I have a stream of JSON data coming from a server socket on TCP port 6000, and I need to parse the incoming data. I have some questions in mind:
1) Do I need to save the incoming JSON data at the client side in some sort of buffer? Answer: I think yes, I need to save it, but are there any parsers which can do this directly, e.g. by passing the JSON object as an argument to the parse function?
2) What would the structure of a real-time parser look like? Answer: On Google only static parse-tree structures are available. In my view, each object is parsed into some sort of parse tree and then deleted from memory; otherwise it would cause a memory overflow, because the data is continuous.
There are some parser libraries available, like JSON-C and JSON lib. One more thing that comes to mind: can we save a JSON object in a C/C++ array? I thought of that but couldn't figure out how to do it.
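The buffer-and-discard idea in questions 1) and 2) can be sketched with an incremental decoder; here in Python for brevity (JSON-C offers a comparable incremental tokener in C). Complete objects are parsed out of the buffer as soon as they have fully arrived, and the consumed text is dropped, so memory stays bounded regardless of how long the stream runs:

```python
import json

class StreamParser(object):
    """Accumulate socket chunks and emit complete JSON objects.

    Partial objects stay in the buffer until more data arrives;
    parsed text is discarded immediately.
    """
    def __init__(self):
        self._decoder = json.JSONDecoder()
        self._buf = ""

    def feed(self, chunk):
        """Add a received chunk; return all newly completed objects."""
        self._buf += chunk
        objects = []
        while True:
            stripped = self._buf.lstrip()
            if not stripped:
                self._buf = stripped
                break
            try:
                obj, end = self._decoder.raw_decode(stripped)
            except ValueError:
                # incomplete object -- keep buffering until more data arrives
                break
            objects.append(obj)
            self._buf = stripped[end:]
        return objects
```

In a real client you would call `feed()` with whatever `socket.recv()` returns, in whatever fragment sizes TCP happens to deliver.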

Best way to parse a complex log file?

I need to parse a log file that consists of many screenshots of a real-time OS's stdout.
In particular, every section of my log_file.txt is a text version of what appears on screen. This machine has no monitor, so the stdout is written to a downloadable log_file.txt.
The aim is to create a .csv from this file for data-mining purposes, but I'm still wondering what the best method to process this file would be.
I would like the first CSV line to hold the descriptions (strings) of the values, and from the second line onward the respective values (ints).
I was thinking about a parser generator (JavaCC, ANTLR, etc.), but before starting with them I would like to get some opinions.
Thank you.
P.S.
I put a short version of my log at the following link: pastebin.com/r9t3PEgb
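Before reaching for a parser generator, a hand-rolled line parser may be enough if each screen dump is a block of "Name: value" lines. A hypothetical sketch (the blank-line block separator and the regex are guesses, since the real layout is only in the pastebin; adapt both to your actual log):

```python
import csv
import re

# Assumed line shape: "Name: 123" (integer values only) -- a guess.
FIELD_RE = re.compile(r"^\s*([^:]+?)\s*:\s*(-?\d+)\s*$")

def log_to_rows(text):
    """Split the log into blocks (one per screen dump) of name->int."""
    rows, current = [], {}
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m:
            current[m.group(1)] = int(m.group(2))
        elif not line.strip() and current:
            # blank line assumed to end one screen dump
            rows.append(current)
            current = {}
    if current:
        rows.append(current)
    return rows

def write_csv(rows, path):
    """First CSV line: field descriptions; following lines: values."""
    headers = sorted(set().union(*[r.keys() for r in rows]))
    with open(path, "w") as f:
        writer = csv.DictWriter(f, fieldnames=headers)
        writer.writeheader()
        writer.writerows(rows)
```

If the sections are genuinely free-form rather than line-oriented, that is the point where a grammar-based tool like ANTLR starts to pay off.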