decoding internet packets payload in python - python-2.7

I have used scapy to sniff internet packets from my computer. Knowing that they are not encrypted, how can I decode the data being sent so that it comes out as clear text, the way wireshark shows it? I would like a code example for it.
I do not want to use wireshark; I want to code this myself for learning.
I used the following simple script to capture the packets:
from scapy.all import *
def callback(pkt):
    print pkt.summary()
    pkt.show()  # show() prints the packet itself and returns None
sniff(store=0, prn=callback)

It depends on the application that sends the traffic. If it sends the data unencrypted and in plain text (ASCII), you can access and display it through the packet's load attribute. For example:
def callback(pkt):
    if Raw in pkt:  # only packets that carry a payload have a load attribute
        print pkt[Raw].load
If the data is not plain text, you need to know how the application encodes the data and decode it accordingly. If you want output closer to wireshark's, you can try hexdump(pkt).
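To see what "decoding to clear text" amounts to without relying on scapy's own helpers, here is a minimal sketch that renders raw payload bytes the way the ASCII pane of a packet viewer does. The function names payload_to_text and hex_ascii_dump are made up for this illustration; in a scapy callback the input would be something like pkt[Raw].load.

```python
import string

# Characters shown as-is; everything else (control bytes, bytes >= 0x80) becomes '.'
PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")

def payload_to_text(payload):
    """Render raw payload bytes as text, replacing non-printable bytes with '.'."""
    chars = []
    for byte in bytearray(payload):  # bytearray yields ints on Python 2 and 3
        ch = chr(byte)
        chars.append(ch if ch in PRINTABLE else ".")
    return "".join(chars)

def hex_ascii_dump(payload, width=16):
    """Return lines of 'offset  hex bytes  ascii', similar to a hex dump pane."""
    data = bytearray(payload)
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        # Pad the hex column so the ASCII column lines up on short last rows.
        lines.append("%04x  %-*s  %s" % (offset, width * 3 - 1, hex_part,
                                         payload_to_text(chunk)))
    return lines
```

This only handles ASCII. A payload in another encoding (UTF-8 text, compressed HTTP bodies, binary protocols) still needs the matching decoder, as the answer says.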

get the binary data transferred from grpc client

I am new to the gRPC framework, and I have created a sample client-server on my PC (referring to this).
In my client-server application I have implemented a simple RPC:
service NameStudent {
    rpc GetRoll(RollNo) returns (Details) {}
}
The client sends a RollNo and receives his/her details which are name, age, gender, parent name, and roll no.
message RollNo {
    int32 roll = 1;
}
message Details {
    string name = 1;
    string gender = 2;
    int32 age = 3;
    string parent = 4;
    RollNo rollid = 5;
}
The actual server and client code is an adaptation of the sample code explained here.
Now my server is able to listen on "0.0.0.0:50051" (address:port), and the client is able to send the roll no to "localhost:50051" and receive the details.
I want to see the actual binary data that is transferred between client and server. I have tried using Wireshark, but I don't understand what I am seeing here.
Here is the screenshot of wireshark capture
And here are the details of highlighted entry from above screenshot.
I need help understanding the Wireshark output here, or any other way to see the binary data.
Wireshark uses the port to determine how to decode the communication, and it doesn't know any protocol associated with 50051. So you need to configure it to treat this as HTTP.
Right click on a row and select "Decode As..." in the context menu.
Then set "Current" to "HTTP" or "HTTP2" (HTTP will generally auto-detect HTTP2) and hit "OK".
Then the HTTP/2 frames should be decoded. And if using a recent version of Wireshark, you may also see the gRPC frames decoded.
The whole idea of gRPC is to HIDE that. Let's say we ignore that and you know what you're doing.
Look at https://en.wikipedia.org/wiki/Protocol_Buffers. gRPC uses Protocol Buffers for its data representation. You might get a hint at the data you're seeing.
Two good starting points for a reverse-engineering exercise are:
Start simple: compile a program that sends an integer. Understand it. Sniff it. Then compile a program that sends a string. Try several values. Once you understand that, move on to tackling the problem of understanding how Google is sending your structure.
Use known data and make small variations: knowing what 505249... means is easier if you start out knowing the data you're sending (as an example, send the string "Hello world"; then change it to "Hella world"; see what changes in the encoded capture; also check that sending the same data several times produces the same captured output). Apply the prior point: start simple, first an empty string, then " ", then "a", then "b", etc., and then move on to larger and more complex strings. Don't be afraid to start simple.
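To know what bytes to look for in the capture, it can help to encode a known message by hand. The sketch below is an illustration, not the gRPC library's code: it builds the Protocol Buffers wire encoding for non-negative integer fields and for string/nested-message fields, and adds the 5-byte gRPC message prefix (a one-byte compression flag followed by a 4-byte big-endian length). All function names are made up for this example; negative integers, other wire types and compression are left out, and proto3 normally omits fields that hold their default value (0 or an empty string).

```python
import struct

def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)

def encode_int_field(field_number, value):
    """Field with wire type 0 (varint), e.g. a non-negative int32."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)

def encode_bytes_field(field_number, data):
    """Field with wire type 2 (length-delimited): strings and nested messages."""
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data

def grpc_frame(message):
    """Prefix a serialized message with the gRPC compressed flag (0) and length."""
    return struct.pack(">BI", 0, len(message)) + message

# RollNo { roll = 150 } from the question's .proto, then the framed request body.
roll_no = encode_int_field(1, 150)
framed_request = grpc_frame(roll_no)
```

Here roll_no is the three bytes 08 96 01, and framed_request is 00 00 00 00 03 08 96 01. That byte sequence should appear inside an HTTP/2 DATA frame of the capture when the client sends roll number 150.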

QTcpSocket sends more data than wanted - Qt/C++

First of all, a little background on my situation:
- Qt/C++ UI desktop application
- embedded device (Stm32l4xx family) + ATWINC1500 wifi module
I'm developing the GUI application in order to send commands and files to the embedded device via sockets.
For simple commands everything works, but for sending files (text files in GCODE format) I am stuck with some issues.
The embedded device already has its socket management in place (written by a third-party company, so I cannot modify how the sockets are managed). File reception works like this: the API waits for every single line of the file being sent, and then writes it into a reserved portion of the flash.
My problem is that when I send a file from the Qt application (by reading each line and calling write() on it), my socket actually sends an entire chunk of the file, around 50 lines, and as a result my device cannot handle the file reception.
My sending code is this:
void sendGCODE(const QString fileName)
{
    QFile file(fileName);  // stack object: closed and destroyed automatically
    if (!file.open(QIODevice::ReadOnly))
        return;
    while (!file.atEnd())
    {
        const QByteArray bytes = file.readLine();
        // write() returns the number of bytes queued, or -1 on error
        const qint64 written = communicationSocket->write(bytes);
        communicationSocket->flush();
        if (written == bytes.size())
        {
            console->append("-> GCODE line sent:" + QString(bytes));
        }
        else
        {
            console->append("-> Error sending GCODE line!");
        }
    }
}
Does anyone have any hints on what I am doing wrong?
I've already searched, and someone suggested in another topic that UDP would be better than TCP sockets for this purpose, but unfortunately I cannot touch the embedded-device-side code.
Thank you all!
EDIT
After suggestions from the comments, I sniffed the TCP packets, and they are sent correctly (i.e. each packet contains a single line). BUT... at the receiver (device), I found that something in the memory handling is not managed well. An example:
The sender sends the line "G1 X470.492 Y599.623 F1000"; the receiver correctly receives the string "G1 X470.492 Y599.623 F1000".
Next, if the line is shorter than the previous one, e.g. "G1 Z5", the receiver receives "G1 Z5\n\n.492 Y599.623 F1000". So it is clear that the buffer used to store the packet data is not re-initialized between packets: the new data overwrites the start of the buffer, and the rest is left over from the previous packet.
I'm trying to figure out how I could reset that part of memory.
This is all wrong. TCP is not a message-oriented protocol. There is no way to ensure that the TCP packets contain any particular amount of data. The receiver code on the device mustn't expect that either - you perhaps misunderstood the receiver's code, or are otherwise doing something wrong (or the vendor is). What the receiver must do is wait for a packet, add the packet's data to a buffer, then extract and process as many complete lines as it can, then move the remaining data to the beginning of the buffer. And repeat that on every packet.
Thus you're looking for the wrong problem at the wrong place, unless your device never ever had a chance of working. If that device works OK with other software, then your "packetized" TCP assumption doesn't hold any water.
Here's how to proceed:
If the device is commercially available and has been tested to work, then you're looking in the wrong place.
If the device is a new product and still in development, then someone somewhere did something particularly stupid and you either need to fix that stupidity, or have the vendor fix it, or hire a consultant to fix it. But just to be completely clear: that's not how TCP works, and you cannot just accept that "it's how it is".
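The receive loop this answer describes (append each chunk to a buffer, extract the complete lines, keep the remainder) can be sketched as follows. It is written in Python purely for illustration; the class name LineBuffer and its feed method are made up here, and real device firmware would do the same thing in C with a fixed-size buffer.

```python
class LineBuffer(object):
    """Accumulates arbitrary TCP chunks and hands back only complete lines."""

    def __init__(self):
        self._pending = b""  # bytes received so far that do not yet end in '\n'

    def feed(self, chunk):
        """Add one received chunk; return the list of lines completed by it."""
        self._pending += chunk
        parts = self._pending.split(b"\n")
        # The last element is whatever follows the final newline: an incomplete
        # line (or b"" if the chunk ended exactly on a line boundary). Keep it.
        self._pending = parts.pop()
        return [line.rstrip(b"\r") for line in parts]
```

The chunk boundaries stop mattering: feeding b"G1 X470.492 Y599" and then b".623 F1000\nG1 Z5\n" yields nothing for the first call and both complete lines for the second, regardless of how the sender called write().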

Python2.7 --Reconstruct packets to print html

Using wireshark, I could see the html page I was requesting (segment reconstruction). I was not able to do this with pyshark, so I turned to scapy. Using scapy and sniffing wlan0, I am able to print request headers with this code:
from scapy.all import *
def http_header(packet):
    http_packet = str(packet)
    # str.find() returns -1 (truthy) when the text is absent, so test with "in"
    if 'GET' in http_packet:
        return GET_print(packet)
def GET_print(packet1):
    ret = packet1.sprintf("{Raw:%Raw.load%}\n")
    return ret
sniff(iface='wlan0', prn=http_header, filter="tcp port 80")
Now, I wish to be able to reconstruct the full request to find images and print the html page requested.
What you are searching for is
IP Packet defragmentation
TCP Stream reassembly
see here
scapy
provides best-effort IP defragmentation via defragment([list_of_packets,]) but does not provide generic TCP stream reassembly. Anyway, here's a very basic TCPStreamReassembler that may work for your use case, but it operates on the invalid assumption that a consecutive stream is split into segments of the maximum segment size (MSS). It concatenates segments of size == MSS until it finds a segment < MSS, and then emits a reassembled TCP packet with the full payload.
Note: TCP stream reassembly is not trivial, as you have to take care of retransmissions, ordering, ACKs, ...
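As a rough illustration of the ordering and retransmission part of that note, here is a minimal sketch that rebuilds one direction of one connection from (sequence number, payload) pairs. It is not the TCPStreamReassembler mentioned above and not part of scapy: the function name reassemble and its arguments are made up for this example, initial_seq is assumed to be the sequence number of the first data byte, and 32-bit sequence wrap-around, SACK and connection teardown are ignored.

```python
def reassemble(segments, initial_seq):
    """Rebuild a byte stream from (seq, payload) pairs of one TCP direction.

    Segments may arrive out of order, duplicated or overlapping. Reassembly
    stops at the first gap, because the missing bytes cannot be guessed.
    """
    stream = bytearray()
    next_seq = initial_seq  # sequence number of the next byte we still need
    for seq, payload in sorted(segments, key=lambda item: item[0]):
        if seq > next_seq:
            break  # gap: a segment in between was not captured
        already_have = next_seq - seq
        if already_have >= len(payload):
            continue  # pure retransmission, nothing new in it
        stream.extend(payload[already_have:])  # keep only the unseen tail
        next_seq = seq + len(payload)
    return bytes(stream)
```

With scapy, the pairs would come from something like (pkt[TCP].seq, bytes(pkt[TCP].payload)) for packets filtered to a single source/destination address and port combination.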
tshark
According to this answer, tshark has a command-line option equivalent to wireshark's "Follow TCP Stream" that takes a pcap and creates multiple output files for all the TCP sessions/"conversations".
Since it looks like pyshark is only an interface to the tshark binary, it should be pretty straightforward to implement that functionality if it is not already implemented.
With Scapy 2.4.3+, you can use
sniff([...], session=TCPSession)
to reconstruct the HTTP packets

Sending an image in base64 via Telnet

I am currently working on a project for school and have run into an issue with a large amount of data being sent via Telnet. If I send a message smaller than 10KB, it is fine. However, if I send a message larger than 10KB, I receive the error "501 Syntax error - line too long" after it has been running for a few minutes.
Does anyone know of a better way to implement what I am trying to accomplish, that will preferably work with the send()? The data being sent is 5 pages (in Word) of an image in base64.
Thank you, any help is greatly appreciated.
Here are the portions of code I am currently using, which work with small amounts of data.
// Needs <fstream>, <iterator> and <string>.
// Stream the whole .txt file into MailData.
std::ifstream in("C:\\test.txt");
std::string MailData((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
MailData += "\r\n"; // Terminate the last line of the data.
// Send the data to the Telnet data section. std::string tracks its own length,
// so no separate buffer or strlen() on unterminated memory is needed.
send(Connection, MailData.c_str(), static_cast<int>(MailData.length()), 0);
send(Connection, ".\r\n", 3, 0); // This line terminates the data entry and sends it.

Code to analyze pcap file

I am trying to analyse a file containing packets captured using tcpdump. I first want to categorize the packets into flows using the 5-tuple. Then I need to get the size and inter-arrival time of each packet in each flow. I tried the Conversation list in wireshark, but it gives only the number of packets in the flow, not information about each packet in the flow. Can anyone suggest code (C++ or shell script) that can do the job? Thank you
UmNyobe,
If you haven't heard of Scapy yet, I believe it would be a near-perfect fit for what you are trying to do. For example, I wrote this little snippet to parse a pcap file and give me something like what you are talking about using Scapy.
#!/usr/bin/python -tt
from scapy.all import *
import sys
from datetime import datetime
'''Parse PCAP files into easy to read NETFLOW like output\n
Usage:\n
python cap2netflow.py <[ pcap filename or -l ]>\n
-l is live capture switch\n
ICMP packets print as source ip, type --> dest ip, code'''
def parse_netflow(pkt):
    # grabs 'netflow-esque' fields from packets in a PCAP file
    if not pkt.haslayer(IP):
        return  # non-IP packets (e.g. ARP) have no protocol number to report
    # map the IP protocol number to a layer name; anything else becomes None
    proto = {6: 'TCP', 17: 'UDP', 1: 'ICMP'}.get(pkt.getlayer(IP).proto)
    snifftime = datetime.fromtimestamp(float(pkt.time)).strftime('%H:%M:%S')
    if proto == 'TCP' or proto == 'UDP':
        print(' '.join([snifftime, proto.rjust(4, ' '), str(pkt.getlayer(IP).src).rjust(15, ' '), str(pkt.getlayer(proto).sport).rjust(5, ' '), '-->', str(pkt.getlayer(IP).dst).rjust(15, ' '), str(pkt.getlayer(proto).dport).rjust(5, ' ')]))
    elif proto == 'ICMP':
        print(' '.join([snifftime, 'ICMP'.rjust(4, ' '), str(pkt.getlayer(IP).src).rjust(15, ' '), ('t: ' + str(pkt.getlayer(ICMP).type)).rjust(5, ' '), '-->', str(pkt.getlayer(IP).dst).rjust(15, ' '), ('c: ' + str(pkt.getlayer(ICMP).code)).rjust(5, ' ')]))
if '-l' in sys.argv:
    sniff(prn=parse_netflow)
else:
    pkts = rdpcap(sys.argv[1])
    print(' '.join(['Date: ', datetime.fromtimestamp(float(pkts[0].time)).strftime('%Y-%m-%d')]))
    for pkt in pkts:
        parse_netflow(pkt)
Install Python and Scapy, then use this to get you started. Let me know if you need any assistance figuring it all out; if you know C++, chances are this will already make a lot of sense to you.
Get Scapy here
http://www.secdev.org/projects/scapy/
There are tons of links on this page to helpful tutorials. Keep in mind that Scapy does a lot more, but home in on the areas that talk about pcap parsing.
I hope this helps!
dc
I worked on a library to analyze tcpdump captures, but it was for a business, so I cannot just give it to you. If you don't find what you are looking for, then my answer can help. A tcpdump capture is just nested network data, like Matryoshka dolls, where the pcap layer is added by tcpdump.
If you only want to work on the captures, the format of a dump is specified in Libpcap File Format. To get the size and time of arrival of each packet you need to process the dump using this specification.
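As a minimal sketch of processing a dump straight from that specification, the function below walks the classic libpcap layout: a 24-byte global header, then a 16-byte record header (timestamp seconds, timestamp microseconds, captured length, original length) in front of every packet. The name read_pcap_records is made up for this example; it assumes the classic microsecond-resolution format and does not handle pcapng or the nanosecond variant.

```python
import struct

def read_pcap_records(data):
    """Yield (timestamp_seconds, captured_len, original_len) per packet record."""
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written with little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written with big-endian byte order
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        yield ts_sec + ts_usec / 1e6, incl_len, orig_len
        offset += 16 + incl_len  # record header plus the captured bytes
```

Calling it on open(path, "rb").read() gives the arrival time and size of every packet. Inter-arrival times are the differences between consecutive timestamps, and grouping into flows still requires parsing the link, IP and transport headers of each record.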
If you have to go deeper in the analysis, these are the layers that follow, in order:
The link layer
The internet layer
The transport layer
The application layer
Each layer has a header definition. So you need to find out which protocol stack your pcap data contains and parse the headers to get the information.
What are the members of the 5-tuple? If the flows are TCP or UDP, the source and destination IP addresses and port numbers, plus, perhaps, a number to distinguish multiple flows over time between the two endpoints would work; for SCTP, it would be similar, although if a flow is a stream, you might need more.
If the members of the 5-tuple are all "named fields" in Wireshark, you could use TShark with the -T fields option and use the -e option to specify which fields to print. Select a field with the time stamp (frame.time_epoch gives you the time as seconds and fractions of a second since the UN*X epoch), a field with the appropriate size (frame.len gives you the raw number of bytes in the link-layer packet PLUS any meta-data such as a radiotap header for 802.11 radio information), and the other fields, and then feed the output of TShark to a script or program that does the processing you want to do. That lets TShark do the processing of the protocol layers, so that your program only needs to process the resulting data.
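A sketch of the "script that does the processing" half might look like this. It assumes TShark was run with seven tab-separated fields in this order (the file name capture.pcap is a placeholder): tshark -r capture.pcap -T fields -e frame.time_epoch -e frame.len -e ip.src -e ip.dst -e ip.proto -e tcp.srcport -e tcp.dstport. The function name flows_from_tshark is made up for this example, flows are treated as unidirectional, and only TCP is covered; for UDP the udp.srcport and udp.dstport fields would be used instead.

```python
from collections import defaultdict

def flows_from_tshark(lines):
    """Group TShark field output by 5-tuple.

    Returns {(src, dst, proto, sport, dport): [(size, inter_arrival), ...]},
    where inter_arrival is None for the first packet of a flow.
    """
    flows = defaultdict(list)
    last_seen = {}  # time stamp of the previous packet in each flow
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 7 or not all(fields):
            continue  # packets without IP/TCP leave some columns empty
        time_epoch, length, src, dst, proto, sport, dport = fields
        key = (src, dst, proto, sport, dport)
        now = float(time_epoch)
        gap = now - last_seen[key] if key in last_seen else None
        last_seen[key] = now
        flows[key].append((int(length), gap))
    return dict(flows)
```

The output of the command can be piped into a script that passes sys.stdin to this function, so TShark handles the protocol dissection and the script only does the per-flow bookkeeping.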