Adjusted question:
SIEM is a management system that takes syslog and other types of log messages and allows an admin to search, combine, and report on logs in ways that help them better understand what is going on. I am working with Splunk and sending syslog (CEF) formatted messages to it. When I send two messages to Splunk, they appear in the same event, as seen here:
<1286>Sep 16, 2014 2:07:38 PM dbrLnxRv CEF:0|MyCompany|MyApp|2.0|Malicious|6|FileName eicar.cab dname=www.csm-testcenter.org dst=10.204.64.137 dpt=8080 prot=HTTP src=10.204.82.168 spt=49809 suser="" xAuthenticatedUser="" requestMethod=GET requestClientApplication="" reason=0-1492-EICARFile.Detection_Test.Web.RTSS request=http://www.csm-testcenter.org/download/archives/cab/eicar.cab AnalysisType="" ThreatName=EICARFile ThreatReason=0-1492-EICARFile.Detection_Test.Web.RTSS Category=128 Direction=inbound Manual=1 TicketNumber=0 FileType=unknown FileHash=654ec5ae29c1718501af794822663da40aec51fc FileSize=168 Status=completed SessionId=79421 TransactionId=5
<1286>Sep 16, 2014 2:07:39 PM dbrLnxRv CEF:0|MyCompany|MyApp|2.0|Malicious|6|FileName eicar.cab dname=www.csm-testcenter.org dst=85.214.28.69 dpt=80 prot=HTTP src=10.204.64.137 spt=40378 suser="" xAuthenticatedUser="" requestMethod=GET requestClientApplication="" reason=0-1492-EICARFile.Detection_Test.Web.RTSS request=http://www.csm-testcenter.org/download/archives/cab/eicar.cab AnalysisType="" ThreatName=EICARFile ThreatReason=0-1492-EICARFile.Detection_Test.Web.RTSS Category=128 Direction=inbound Manual=1 TicketNumber=0 FileType=unknown FileHash=654ec5ae29c1718501af794822663da40aec51fc FileSize=168 Status=completed SessionId=79432 TransactionId=3
My question is: how can I make them appear as separate events?
I currently have a CR/LF between each message (verified by inspecting the TCP stream with Wireshark). I tried adding a NULL as well; it did not make a difference.
I know the date/time field is not down to the millisecond; is that an issue?
Is there a message ID I am missing that will force Splunk to separate the messages?
Other ideas?
(When sending via UDP, each event appears in its own message.)
I also tried disabling the Nagle algorithm; still the same issue.
I created a custom C++ app to send SIEM messages from my data source to Splunk. If I send six SIEM messages over a socket at one time, with each message separated by a CR/LF (I also tried adding a NULL between the messages), Splunk puts them into one single event. What should I send to make the messages end up in separate events? I've looked everywhere for a spec on the SIEM protocol and have not found any documents describing the actual wire format.
TCP is a stream protocol, not a message-oriented one. It does not maintain message boundaries: what one side sends is not guaranteed to be read in the same chunks on the other side. It is up to the applications above TCP to interpret the bytes and form messages.
UDP, on the other hand, maintains message boundaries. One sendto of X bytes translates to one recvfrom of X bytes, though UDP does not guarantee that the message will reach the receiver.
This is exactly what you are witnessing: multiple sends translating to a single recv over TCP, and the opposite over UDP.
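As a small illustration of that last point, here is a minimal sketch (plain C++, not from the question's app; handle_message is a hypothetical callback) of how a receiver turns the byte stream back into CR/LF-delimited messages:

// Accumulate whatever the socket hands back and split on "\r\n" boundaries.
#include <cstddef>
#include <string>

void handle_message(const std::string& msg);    // hypothetical application callback

void on_bytes_received(std::string& pending, const char* data, std::size_t len)
{
    pending.append(data, len);                  // bytes from this recv(); boundaries are unknown
    std::size_t pos;
    while ((pos = pending.find("\r\n")) != std::string::npos) {
        handle_message(pending.substr(0, pos)); // one complete message
        pending.erase(0, pos + 2);              // drop the message and its delimiter
    }
    // whatever is left in 'pending' is the start of the next message
}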
Got it working
The protocol uses a basic \r\n to terminate each message, which I had tried in the past. The real trick lies in the Splunk configuration. One needs to create a config file called props.conf and include the following line:
SHOULD_LINEMERGE=false
Then everything works fine.
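For reference, a minimal props.conf stanza along those lines might look like the following (the sourcetype name is only an illustration; use whatever sourcetype your TCP input actually assigns):

# Hypothetical sourcetype for the CEF feed coming in over TCP.
[my_cef_syslog]
# Treat every \r\n-terminated line as its own event instead of merging lines.
SHOULD_LINEMERGE = false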
Related
I have a Qt GUI application that uses QTcpSocket to send and receive TCP packets to and from a server. So far I've had success making the TCP socket connections (there are two separate socket connections because there are two different message sets; same IP address for both, but two different port numbers) and sending and receiving packets. Most of the messages that my application sends are kicked off via a push-button on the GUI's main window (one message is sent periodically using a QTimer that expires every 1667 ms).
The server has a FIFO (128 messages deep) and sends a specific message to my application that communicates when the FIFO is 1/2 full, 3/4 full, and full. It's tedious to test this functionality by just mashing the send button on the GUI so I had the idea of loading a .csv file that could be pre-filled (the message has several different configurable parameters) with what I want to send. Each line gets read and turned into a message and sent on the TCP socket.
From my main window I open up a QFileDialog when a push-button on the GUI is clicked. Then when a .csv file is navigated to and selected the function reads the .csv file one line at a time, pulls out all the individual parameters, fills the message with the parameters, and then sends it out to the socket. Each message is 28 bytes. It repeats this until there are no lines left in the .csv file.
What I am noticing on Wireshark is that instead of sending a bunch of individual TCP packets they are all being put together and sent as one large TCP packet.
When I first tested this out I did not know about the LowDelayOption, so when I found the information about it in the documentation for QAbstractSocket I thought, "Aha! That must be it! The solution to my problem!" But when I added it to my code it did not seem to have any effect at all; everything is still being sent as one large TCP packet. For each socket, I call setSocketOption to set the LowDelayOption to 1 in the slot function that receives the connected() signal from the socket. I thought maybe the setSocketOption call wasn't working, so I checked by calling socketOption to get the value of LowDelayOption, and it is 1.
Is there something else I need to be doing? Am I doing something wrong?
Thanks for your time and your help. If it matters, I am developing this on Windows and I am using Qt 5.9.1.
... send and receive TCP packets to and from a server.
From this I am getting the vibe that your application relies on a certain amount of data - a 'packet' - being received in a single receive call.
You can't really rely on that. Data you send over TCP can also be fragmented on the way. Also, on your receiving end, the TCP implementation may put multiple packets received from the network into the receiving socket's buffer before you have read the first one, and you have no way of telling which kind of fragments they were originally sent in.
So you should just treat TCP as a pipe through which bytes of data flow with some unknown and potentially variable delay. That variable delay causes data to be received in bigger or smaller chunks at random.
If you want to have a packet structure, you should add a packet header containing at least the packet length to the data you transmit.
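For example, a minimal sketch of such framing on the sending side (names are illustrative, not taken from the question's code): write a 32-bit length, then the payload.

// Prepend a quint32 length header to each message before writing it out.
#include <QByteArray>
#include <QDataStream>
#include <QIODevice>
#include <QTcpSocket>

void sendFramed(QTcpSocket* socket, const QByteArray& message)
{
    QByteArray frame;
    QDataStream out(&frame, QIODevice::WriteOnly);
    out.setVersion(QDataStream::Qt_5_9);
    out << quint32(message.size());   // 4-byte length header (big-endian by default)
    frame.append(message);            // the payload itself
    socket->write(frame);             // frames may still be coalesced on the wire - that's fine
}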
I hope this helps.
From QTcpSocket documentation:
TCP (Transmission Control Protocol) is a reliable, stream-oriented, connection-oriented transport protocol. It is especially well suited for continuous transmission of data.
Stream-oriented means that there is nothing like the datagrams of UDP sockets.
There is only a stream of data, and you never know in what chunks it will arrive.
The TCP protocol gives you only reliability; you have to provide message extraction on your own, e.g. send the message length before each message, or use QDataStream (check the Fortune server and Fortune client examples).
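A minimal receive-side sketch in that style (this is roughly what the Fortune client does; it assumes the sender prefixes each message with a quint32 length, as in the sketch above - m_socket and processMessage are illustrative names):

// Slot connected to QTcpSocket::readyRead(): extract complete length-prefixed
// messages and leave partial ones buffered until more bytes arrive.
void MyReceiver::onReadyRead()
{
    QDataStream in(m_socket);
    in.setVersion(QDataStream::Qt_5_9);

    for (;;) {
        in.startTransaction();        // remember where we started reading
        QByteArray payload;
        in >> payload;                // reads a quint32 length, then that many bytes
        if (!in.commitTransaction())
            return;                   // message not fully arrived yet; retry on the next readyRead()
        processMessage(payload);      // one complete application message
    }
}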
LowDelayOption from QAbstractSocket::SocketOption
Try to optimize the socket for low latency. For a QTcpSocket this would set the TCP_NODELAY option and disable Nagle's algorithm. Set this to 1 to enable.
It is equivalent to calling setsockopt with the TCP_NODELAY option.
First thing is:
The TCP_NODELAY option is specific to TCP/IP service providers.
And it doesn't work for me either :)
MSDN says that they do not recommend disabling Nagle's algorithm:
It is highly recommended that TCP/IP service providers enable the Nagle Algorithm by default, and for the vast majority of application protocols the Nagle Algorithm can deliver significant performance enhancements. However, for some applications this algorithm can impede performance, and TCP_NODELAY can be used to turn it off. These are applications where many small messages are sent, and the time delays between the messages are maintained. Application writers should not set TCP_NODELAY unless the impact of doing so is well-understood and desired because setting TCP_NODELAY can have a significant negative impact on network and application performance.
The question is: Do you really need to send your messages as fast as possible?
If yes, consider using QUdpSocket. Maybe tell us more about the messages that you are sending.
I am developing a viewer application in which the server captures an image, performs some image processing operations, and the result needs to be shown at the client end on an HTML5 canvas. The server I've written is in VC++ and is based on http://www.codeproject.com/Articles/371188/A-Cplusplus-Websocket-server-for-realtime-interact.
So far I've implemented the needed functionality; now all I need to do is optimization. The reference was a chat application meant to send strings, so I was encoding data into 7-bit format, which causes overhead. I need binary data transfer capability, so I modified the encoding and framing (the opcode is now 130, for binary messages, instead of 129), and I can say that the server part is alright: I've observed the outgoing frame and it follows the protocol. I'm facing a problem on the client side.
Whenever the client receives the incoming message, if all the bytes are within limits (0 to 127) it calls onMessage() and I can successfully decode the incoming message. However, even a single character greater than 127 causes the client to call onClose(). The connection gets closed and I am unable to find the cause. Please help me out.
PS: I'm using chrome 22.0 and Firefox 17.0
It looks like your problem is related to how you assemble your frames. Since you have an established connection that terminates when the onmessage event is about to fire, I assume it is frame related.
What if you study Network -> WebSocket -> Frames for your connection in Google Chrome? What does it say?
It may be out of scope for you, but I'm one of the developers of the XSockets.NET (C#) framework, and we have binary support there. If you are interested, there is an example that I happened to publish just recently; it can be found at https://github.com/MagnusThor/XSockets.Binary.Controller.Example
How did you observe the outgoing frame and what were the header bytes that you observed? It sounds like you may not actually be setting the binary opcode successfully, and this is triggering UTF-8 validation in the browser which fails.
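For comparison, a minimal sketch of the header bytes of an unmasked server-to-client binary frame per RFC 6455 (generic framing code, not the CodeProject server's actual implementation):

// Build the header for an unmasked server->client frame: 0x82 = FIN + binary opcode
// (0x81 would be a text frame, which triggers UTF-8 validation in the browser).
#include <cstdint>
#include <vector>

std::vector<uint8_t> make_binary_frame_header(uint64_t payload_len)
{
    std::vector<uint8_t> h;
    h.push_back(0x82);                         // FIN = 1, opcode = 0x2 (binary)
    if (payload_len <= 125) {
        h.push_back(uint8_t(payload_len));     // 7-bit length, mask bit = 0
    } else if (payload_len <= 0xFFFF) {
        h.push_back(126);                      // 16-bit extended length follows
        h.push_back(uint8_t(payload_len >> 8));
        h.push_back(uint8_t(payload_len & 0xFF));
    } else {
        h.push_back(127);                      // 64-bit extended length follows
        for (int shift = 56; shift >= 0; shift -= 8)
            h.push_back(uint8_t((payload_len >> shift) & 0xFF));
    }
    return h;                                  // append the raw, unmasked payload after this
}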
I am reimplementing an old network layer library, but using Boost.Asio this time. Our software talks TCP/IP with a third-party application. Several messages behave very well on both sides, but there is one case I misunderstand:
The third party sends two messages (message A and message B) one right after the other (with very short timing between them), but I receive only part of message A in TCP packet 1, and the end of message A plus the whole of message B in TCP packet 2 (I sniff with Wireshark).
I had not thought of this case. I am wondering if it is common with TCP, and whether my layer should adapt to it - or should I ask the third party to check what they do on their side so that I receive the two messages in different packets?
Packets can be fragmented and arrive out-of-sequence. The TCP stack which receives them should buffer and reorder them, before presenting the data as an incoming stream to the application layer.
My problem is with message B, which I don't see because it arrives after the end of message A in the same packet.
You can't rely on "messages" having a one-to-one mapping to "packets": to the application, TCP (not UDP) looks like a "streaming" protocol.
An application which sends via TCP needs another way to separate messages. Sometimes that's done by marking the end of each message. For example SMTP marks the end-of-message as follows:
The transmission of the body of the mail message is initiated with a DATA command after which it is transmitted verbatim line by line and is terminated with an end-of-data sequence. This sequence consists of a new-line (<CR><LF>), a single full stop (period), followed by another new-line. Since a message body can contain a line with just a period as part of the text, the client sends two periods every time a line starts with a period; correspondingly, the server replaces every sequence of two periods at the beginning of a line with a single one. Such an escaping method is called dot-stuffing.
Alternatively, the protocol might specify a prefix at the start of each message, which will indicate the message-length in bytes.
If you're coding the TCP stack itself, you'll have access to the TCP segment headers - though note that the "Data offset" field gives the length of the TCP header, not of your message, so segment boundaries still aren't message boundaries.
Yes, this is common. TCP/IP is a streaming protocol, and your "logical" packet may be split across many "physical" packets, so the receiving application is responsible for reassembling the higher-level packets. Additionally, TCP/IP guarantees proper ordering, so you don't have to worry about reassembling out-of-order packets.
Your problem has got nothing to do with TCP at all. Your problem is that you expected Asio to do the message parsing for you. It does not; you have to implement it yourself.
If your messages are all the same size, do an async read for that size.
If they are of different lengths, do an async read for your header size, analyze the header, and then do an async read for the rest of the message according to the header (see the sketch below).
If your messages are of variable length and the size is unknown, but there is a defined end character or sequence, then you have to save the remaining bytes beyond that end sequence and append the next read to that remainder.
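A minimal sketch of the header-then-body case (synchronous calls for brevity; the 4-byte big-endian length header is an assumption about the protocol - adjust it to whatever the third party actually sends):

// Read a fixed-size length header, then exactly that many body bytes.
#include <boost/asio.hpp>
#include <cstdint>
#include <string>
#include <vector>

std::string read_one_message(boost::asio::ip::tcp::socket& socket)
{
    std::uint8_t header[4];
    boost::asio::read(socket, boost::asio::buffer(header));   // waits for all 4 header bytes

    std::uint32_t body_len = (std::uint32_t(header[0]) << 24) |
                             (std::uint32_t(header[1]) << 16) |
                             (std::uint32_t(header[2]) << 8)  |
                              std::uint32_t(header[3]);

    std::vector<char> body(body_len);
    boost::asio::read(socket, boost::asio::buffer(body));     // also loops until body_len bytes arrive
    return std::string(body.begin(), body.end());
}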
I'm currently trying to implement a (T)LV protocol to be used on top of TCP. A very early version of this protocol was built by just sending one message per send-recv pair (i.e. send("message to transmit") -- recv(...)). This is really bad bandwidth-wise, I guess because I'm sending really small packets.
So now I am trying to switch to an LV protocol, sending several messages at once, separated only by their respective lengths (I am now using Protocol Buffers to serialize my data).
I now have two questions:
In Python I send by doing:
sock.send(struct.pack("<H", len(gtMessage.SerializeToString())))
sock.send(gtMessage.SerializeToString())
If I now put this into a loop and sent several of those messages, I'd end up with my old problem, as far as I understand. Can I somehow string the messages to be sent together?
In C++ I first receive the length of the message and then read the number of bytes indicated by the length field.
Is it better performance-wise to first read everything from TCP and then parse it, or can I read one message, then parse it and only then read the next bit from the wire?
Edit: So after doing some more research I'd rephrase the first question as:
Is
sock.send("somestring")
sock.send("somestring")
the same as
sock.send("somestring"+"somestring")
?
Doing two sends in a row may result in two actual packets going out, which is not so great. To fix this you can concatenate the two pieces yourself, or use writev (aka "gather write"), or TCP_CORK on the first send to prevent it from turning into a packet all by itself.
As for the receive side, you should receive a big block (as much as you can up to some reasonable limit, say a couple megabytes or something), and then parse it. Do not try to receive just one or two bytes for the size then do another receive after that--this is inefficient and you may still end up with "short reads" if the sent message was fragmented.
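A minimal C++ sketch of that receive-side approach (it assumes the Python sender's struct.pack("<H", ...) little-endian 16-bit length prefix and a little-endian host; feed() and handle_message() are illustrative names):

// Append freshly received bytes to a buffer, then peel off every complete
// <uint16 length><body> record; an incomplete tail stays buffered.
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

void handle_message(const std::string& msg);   // hypothetical application callback

void feed(std::vector<char>& buffer, const char* data, std::size_t len)
{
    buffer.insert(buffer.end(), data, data + len);

    std::size_t offset = 0;
    while (buffer.size() - offset >= 2) {
        std::uint16_t body_len;
        std::memcpy(&body_len, buffer.data() + offset, 2);  // matches struct.pack("<H", ...) on a little-endian host
        if (buffer.size() - offset - 2 < body_len)
            break;                                          // message not complete yet; wait for more data
        handle_message(std::string(buffer.data() + offset + 2, body_len));
        offset += 2 + body_len;
    }
    buffer.erase(buffer.begin(), buffer.begin() + offset);  // keep only the unconsumed tail
}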
Using Winsock and C++, I send and receive data with send()/recv() over a TCP connection. I want to be sure that the data has been delivered to the other party, and I wonder whether it is recommended to send back some acknowledgment message after (if) receiving the data with recv.
Here are two possibilities; please advise which way to go:
If send returns the size of the passed buffer, assume that the data has been delivered at least to the recv function on the other side of the wire. When I say "at least", I mean that even if recv fails there (e.g. due to an insufficient buffer, etc.), I don't care; I just want to be sure I've done my server's part of the work properly - I've sent the data completely (i.e. the data reached the other machine).
Use an additional acknowledgment: after receiving the data with recv, send back some ID of the received packet (part of the header of each packet sent) signaling that the receive of that packet succeeded. If I don't receive such an "acknowledgment message" within some interval, return a failure code from the sender function.
The second approach looks safer, but I don't want to complicate the transfer protocol if it is redundant. Also, please note that I'm talking about a TCP connection (which is by itself more reliable than UDP).
Are there any other mechanisms (maybe some other APIs? maybe WSARecv()/WSASend() work differently?) for ensuring that the data was delivered to the recv function on the other side?
If you recommend the second way, could you please give me a code snippet that lets me use recv with a timeout to receive the acknowledgment? recv is a blocking operation, so it will hang forever if the previous send attempt failed (i.e. the other party was never notified). Is there any simple way of using recv with a timeout (without creating a separate thread every time, which would probably be overkill for each and every send operation)?
Also, the amount of data I pass to send might be quite big (several megabytes), so how do I choose the timeout for the "acknowledgment message"? Maybe I should split large buffers and use several send calls? I think it will get quite complicated; please advise!
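(For what it's worth, a bounded recv does not need an extra thread: you can either set SO_RCVTIMEO on the socket or wait with select() first. A minimal Winsock sketch of the select() approach, with illustrative names:)

// Wait up to timeout_ms for data before calling recv; no extra thread needed.
// Returns the recv result, 0 on timeout (note recv itself also returns 0 on a
// graceful close), or SOCKET_ERROR if select/recv fails.
#include <winsock2.h>

int recv_with_timeout(SOCKET s, char* buf, int len, long timeout_ms)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(s, &readfds);

    timeval tv;
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(0, &readfds, NULL, NULL, &tv);   // first argument is ignored by Winsock
    if (ready == 0)
        return 0;                                       // timed out - no acknowledgment arrived
    if (ready == SOCKET_ERROR)
        return SOCKET_ERROR;
    return recv(s, buf, len, 0);                        // data is ready, so this returns promptly
}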
EDIT: OK, you people are suggesting that the TCP/IP stack will handle it (i.e. no manual acknowledgment is required), but this is what I found on the MSDN page: "The successful completion of a send function does not indicate that the data was successfully delivered and received to the recipient. This function only indicates the data was successfully sent." So even if the TCP mechanism has the ability to ensure data delivery, I can't get that status (success or not) via the send() function, or any other Winsock function I know. Do you know any way of getting the status from the TCP layer? Again - the return value of the send() function seems not to be enough!
========================================================
EDIT 2: OK, I think we agree that even though the TCP protocol handles errors when something goes wrong, the send() function of Winsock is not capable of reporting them (simply because it returns before the network driver actually starts transmitting the data). So here is the million-dollar question: does the send() function of Winsock at least ensure that no later packets will be delivered to the other party until the current packet is? In other words, if a send fails because of some network failure (not reported by the send() call), and the network failure is fixed before the next call to send() with the next chunk of data, is it guaranteed that the previous packet (which failed but was not reported by send()) will be delivered before the next packet? Put differently, is there a chance that one particular send() will fail "silently", so that subsequent send() calls succeed but the first packet is lost? AGAIN - I'm not talking about the TCP level, I'm talking about the Winsock API level!
Why don't you trust your TCP/IP stack to guarantee delivery? After all, that is the whole point of using TCP instead of UDP.
The existing answers here are mostly correct: if you use TCP you really don't need to worry about reliable delivery of your packets to your peer.
But this is a dangerous view for some systems where data integrity must be taken to the next level: the common criteria auditing requirement FAU_STG.4.1 requires the ability to prevent auditable events if the audit log might suffer a loss of audit entries. (For example, the Linux auditd(8) audit logging daemon can be configured to place the computer in single-user-mode or halt the system completely when there is no more space left for audit logs.) Audit logs from remote systems should probably be maintained until it is known that they have been successfully written to centralized log servers.
Financial transactions would probably be best handled with a more reliable protocol than simple TCP as well -- crediting or debiting accounts would be best handled with a multi-staged protocol to ensure availability of funds, perform the transaction, then report the result of the transaction to the origination point.
TCP allows nearly a gigabyte of in-flight data between two peers (under extreme conditions); depending upon the requirements of your application, you might need to maintain that data at the sending side until you receive positive confirmation from your peer that the data has been properly handled.
Thankfully, most applications aren't this critical; losing a megabyte of data here or there down a socket that reports a closed connection at some point "in the future" really isn't horrible -- we just re-try our HTTP request, or re-attempt the SFTP connection.
Update
A socket will only accept enough data to fill its available window. The window size is negotiated between the two peers during the session handshake. So your calls to send() will begin blocking when the socket's window fills. (The OS might keep letting you add data to its internal buffers too, but at some point the writes will block.) If the peer breaks the connection with a RST or ICMP Unreachable message, a future call to send() will return an error value for Connection Reset or Broken Pipe.
Update 2
I'm not talking at the TCP level, I'm talking at the Winsock API level
This might be the source of confusion. send() has no choice but to adhere to the TCP behavior when used with TCP.
TCP guarantees in-order reliable delivery of a stream of bytes, to the extent that packets can be delivered. (See @Hans's comment about a pony and careless people kicking power cords.) The peer program will see bytes in the correct order they were sent. (Well, okay, TCP also has out-of-band urgent packet delivery, but I haven't actually seen any applications that use it. Using OOB packets, you can get some data out-of-line. Forget I mentioned it.)
If the remote program receives a byte sent on a TCP stream, it reliably received all preceding bytes as well. (Well, there are entire classes of replay attacks that splice together legitimate and fake packets for the remote peer, but those are increasingly difficult on systems with randomized initial sequence numbers. If this is within your threat model, you should be using TLS on top of TCP to provide cryptographically strong tamper evident information. But TLS can't provide better per-packet delivery notification.)
If you use UDP and you care about the data actually being received by the other side, you NEED to use ACKs; but if you don't need the speed of UDP, you should use TCP, as it does the ACKing for you.
I think you are overcomplicating this; trust your TCP/IP software stack and the reliable delivery it offers. TCP sockets operate on streams of data, not packets. Also, one call to send does not guarantee one call to recv.