Using Wireshark, I could see the HTML page I was requesting (segment reconstruction). I was not able to do this with pyshark, so I turned to Scapy. Using Scapy and sniffing wlan0, I am able to print request headers with this code:
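from scapy.all import *

def http_header(packet):
    # str.find() returns -1 (truthy) when 'GET' is absent and 0 (falsy) when
    # it is at the start, so test membership on the raw payload instead.
    if packet.haslayer(Raw) and b'GET' in bytes(packet[Raw].load):
        return GET_print(packet)

def GET_print(packet1):
    # Format the raw payload if the packet has one
    ret = packet1.sprintf("{Raw:%Raw.load%}\n")
    return ret

sniff(iface='wlan0', prn=http_header, filter="tcp port 80")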
Now, I wish to be able to reconstruct the full request to find images and print the html page requested.
What you are searching for is
IP Packet defragmentation
TCP Stream reassembly
see here
scapy
provides best-effort IP defragmentation via defragment([list_of_packets,]) but does not provide generic TCP stream reassembly. Anyway, here's a very basic TCPStreamReassembler that may work for your use case, but it operates on the invalid assumption that a consecutive stream will be split into segments of the maximum segment size (MSS). It concatenates segments of size == MSS until a segment < MSS is found, then spits out a reassembled TCP packet with the full payload.
Note: TCP stream reassembly is not trivial, as you have to take care of retransmissions, ordering, ACKs, ...
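To give an idea of what reassembly involves beyond the MSS heuristic, here is a minimal sketch that orders the segments of one direction of a single connection by sequence number and skips retransmitted bytes. The function name and the (seq, payload) tuple input are assumptions made for this example, not part of Scapy, and it ignores sequence-number wraparound and SYN/FIN accounting.

```python
def reassemble(segments, initial_seq=None):
    """Reassemble one direction of a TCP stream.

    segments: iterable of (seq, payload_bytes) tuples, in any order.
    initial_seq: sequence number of the first payload byte; defaults to
    the lowest sequence number seen.
    Returns the contiguous payload up to the first gap.
    """
    # Drop empty segments (pure ACKs) and sort by sequence number
    ordered = sorted((s for s in segments if s[1]), key=lambda s: s[0])
    if not ordered:
        return b""
    next_seq = ordered[0][0] if initial_seq is None else initial_seq
    stream = bytearray()
    for seq, payload in ordered:
        if seq > next_seq:
            break  # gap: a segment is missing, stop here
        overlap = next_seq - seq
        if overlap >= len(payload):
            continue  # pure retransmission, nothing new
        stream += payload[overlap:]
        next_seq = seq + len(payload)
    return bytes(stream)


# Out-of-order delivery plus one retransmitted segment
segments = [(1005, b"world"), (1000, b"hello"), (1000, b"hello"), (1010, b"!")]
print(reassemble(segments))
```

With Scapy, the tuples could be built from pkt[TCP].seq and bytes(pkt[TCP].payload) for the packets of one direction of a connection.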
tshark
According to this answer, tshark has a command-line option equivalent to Wireshark's "Follow TCP Stream" that takes a pcap and creates multiple output files for all the TCP sessions ("conversations").
Since it looks like pyshark is only an interface to the tshark binary, it should be pretty straightforward to implement that functionality if it is not already implemented.
With Scapy 2.4.3+, you can use
sniff([...], session=TCPSession)
to reconstruct the HTTP packets
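A minimal sketch of how that might look for an offline capture. The file name passed in and both helper names are placeholders for this example; it assumes Scapy 2.4.3 or later, where scapy.layers.http provides the HTTPRequest layer.

```python
def format_request(method, host, path):
    """Render the request fields (bytes or None) as one readable line."""
    def to_text(value):
        return (value or b"").decode("ascii", errors="replace")
    return "{} http://{}{}".format(to_text(method), to_text(host), to_text(path))


def print_http_requests(pcap_path):
    # Imported lazily so the helper above works without Scapy installed.
    from scapy.all import sniff, TCPSession
    from scapy.layers.http import HTTPRequest

    def show(pkt):
        if pkt.haslayer(HTTPRequest):
            req = pkt[HTTPRequest]
            print(format_request(req.Method, req.Host, req.Path))

    # session=TCPSession lets Scapy reassemble TCP segments before dissection
    sniff(offline=pcap_path, session=TCPSession, prn=show, store=False)


print(format_request(b"GET", b"example.com", b"/index.html"))
```

For a live capture, offline=pcap_path could be swapped for iface='wlan0' and filter="tcp port 80", as in the question.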
I am new to the gRPC framework, and I have created a sample client-server on my PC (referring to this).
In my client-server application I have implemented a simple RPC:
service NameStudent {
    rpc GetRoll(RollNo) returns (Details) {}
}
The client sends a RollNo and receives his/her details which are name, age, gender, parent name, and roll no.
message RollNo {
    int32 roll = 1;
}
message Details {
    string name = 1;
    string gender = 2;
    int32 age = 3;
    string parent = 4;
    RollNo rollid = 5;
}
The actual server and client code is an adaptation of the sample code explained here.
Now my server listens on "0.0.0.0:50051" (address:port), and the client is able to send the roll no. to "localhost:50051" and receive the details.
I want to see the actual binary data that is transferred between client and server. I have tried using Wireshark, but I don't understand what I am seeing here.
Here is the screenshot of wireshark capture
And here are the details of highlighted entry from above screenshot.
I need help understanding the Wireshark output here, or any other way to see the binary data.
Wireshark uses the port to determine how to decode the communication, and it doesn't know any protocol associated with 50051. So you need to configure it to treat this as HTTP.
Right click on a row and select "Decode As..." in the context menu.
Then set "Current" to "HTTP" or "HTTP2" (HTTP will generally auto-detect HTTP2) and hit "OK".
Then the HTTP/2 frames should be decoded. And if using a recent version of Wireshark, you may also see the gRPC frames decoded.
The whole idea of gRPC is to HIDE that. Let's say we ignore that and you know what you're doing.
Look at https://en.wikipedia.org/wiki/Protocol_Buffers. gRPC uses Protocol Buffers for its data representation. That might give you a hint about the data you're seeing.
Two good starting points for a reverse-engineering exercise are:
Start simple: compile a program that sends an integer. Understand it. Sniff it. Then compile a program that sends a string. Try several values. Once you understand that, move on to tackling the problem of understanding how Google is sending your structure.
Use known data and make small variations: knowing what 505249... means is easier if you start out knowing the data you're sending (as an example, send the string "Hello world"; then change it to "Hella world"; see what changes in the sniffed encoding; also check that sending the same data several times produces the same sniffed output). Apply the prior point: start simple, first the empty string, then " ", then "a", then "b", etc., and then move on to more complex and larger strings. Don't be afraid to start simple.
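To make the "start simple" exercise concrete, here is a rough sketch of decoding one gRPC message by hand. It relies on two facts about the wire format: gRPC prefixes each message with a 1-byte compressed flag and a 4-byte big-endian length, and Protocol Buffers encodes each field as a varint tag (field number << 3 | wire type) followed by the value. The function names are made up for this example, and only wire types 0 (varint) and 2 (length-delimited) are handled.

```python
import struct


def read_varint(data, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def parse_fields(message):
    """Return a list of (field_number, wire_type, value) tuples."""
    fields, pos = [], 0
    while pos < len(message):
        tag, pos = read_varint(message, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint (int32, bool, enum, ...)
            value, pos = read_varint(message, pos)
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(message, pos)
            value = message[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        fields.append((field_number, wire_type, value))
    return fields


def parse_grpc_frame(frame):
    """Split a gRPC length-prefixed message into (compressed_flag, fields)."""
    compressed, length = struct.unpack(">BI", frame[:5])
    return compressed, parse_fields(frame[5:5 + length])


# RollNo { roll = 5 } is encoded as 08 05; with the gRPC prefix: 00 00000002 08 05
print(parse_grpc_frame(bytes.fromhex("00000000020805")))
```

A nested message such as the rollid field of Details arrives as a length-delimited value, so its bytes can be passed to parse_fields() again.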
I have a python script which assembles and sends AVB (IEEE) packets into a network.
The packets will be captured by wireshark.
With another Python script I iterate through the capture file.
But I can't access a few parameters in a few layers because scapy doesn't know them.
So I have to add those layers to scapy.
Here's the packet in wireshark:
I added the following code to the file "python2.7/dist-packages/scapy/layers/l2.py"
# The class name must match the name passed to bind_layers() below
class ieee1722(Packet):
    name = "IEEE 1722 Packet"
    fields_desc = [XByteField("subtype", 0x00),
                   XByteField("svfield", 0x81),
                   XByteField("verfield", 0x81)]

bind_layers(Dot1Q, ieee1722, type=0x22f0)
When I execute the python script which should grab the parameters in the new layer (IEEE 1722 Protocol), the following error occurs:
"IndexError: Layer [ieee1722] not found"
What's wrong?
Ok, found the solution by editing the type value:
bind_layers(Dot1Q, ieee1722, type=0x88f7) ---> works
Dot1Q is the layer above the created ieee1722 layer (see wireshark).
You can see the type value by clicking on the layer of a packet in Wireshark.
This is old, maybe they didn't have the doc page but they have it now:
"Adding new protocols"
https://scapy.readthedocs.io/en/latest/build_dissect.html
I have an issue trying to decompress an IMAP message compressed using the deflate method. What I've tried so far is isolating one direction of an IMAP conversation (using Wireshark's Follow TCP Stream function) and saving the message data in a raw format that I hope contains only the deflated message part. I then found some programs like tinf (1st and 3rd example) and miniz (tgunzip example) and tried to inflate that file, but with no success.
Am I missing something? Thank you in advance.
tinf - http://www.ibsensoftware.com/download.html
Miniz - https://code.google.com/archive/p/miniz/source/default/source
Try piping that raw data to:
perl -MCompress::Zlib -pe 'BEGIN{$i = inflateInit(-WindowBits => -15)}
$_=$i->inflate($_)'
The important part is the -WindowBits => -15, which changes the expected format to a raw deflate stream without the zlib header and Adler-32 checksum.
(that's derived from the dovecot source, works for me on Thunderbird to gmail network capture).
From RFC4978 that specifies IMAP compression (emphasis mine):
When using the zlib library (see RFC1951), the functions
deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to
implement this extension. The windowBits value must be in the range
-8 to -15, or else deflateInit2() uses the wrong format.
deflateParams() can be used to improve compression rate and resource
use. The Z_FULL_FLUSH argument to deflate() can be used to clear the
dictionary (the receiving peer does not need to do anything).
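The same idea can be sketched in Python with the standard zlib module, where a negative wbits value selects the raw format just as the negative WindowBits does above. The inflate_raw() name and the simulated IMAP commands are made up for this example; in practice the chunks would be the saved bytes of one direction of the conversation, starting right at the first compressed byte, because later data refers back to the earlier window.

```python
import zlib


def inflate_raw(chunks):
    """Inflate a raw DEFLATE stream delivered as an iterable of byte chunks."""
    # wbits=-15: raw deflate, no zlib header or Adler-32 trailer
    inflater = zlib.decompressobj(wbits=-15)
    return b"".join(inflater.decompress(chunk) for chunk in chunks)


# Simulate one direction of a compressed session: a raw deflate stream
# that is flushed after each command but never finished.
deflater = zlib.compressobj(wbits=-15)
wire = [
    deflater.compress(b"a1 NOOP\r\n") + deflater.flush(zlib.Z_SYNC_FLUSH),
    deflater.compress(b"a2 LOGOUT\r\n") + deflater.flush(zlib.Z_SYNC_FLUSH),
]
print(inflate_raw(wire))
```

Tools that expect a zlib or gzip wrapper will reject such a stream, which may be why the inflate attempts in the question failed.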
I have used Scapy to sniff internet packets from my computer. Knowing that they are not encrypted, how can I decode the data being sent so that it comes out as clear text, something like Wireshark does? I would like a code example for it.
I do not want to use Wireshark; I want to code this myself for learning.
I used the following simple script to capture the packets:
from scapy.all import *

def callback(pkt):
    print(pkt.summary())
    pkt.show()  # show() prints the dissection itself and returns None

sniff(store=0, prn=callback)
It depends on the application that sends the traffic. If it sends the data unencrypted and in plain text (ASCII), you can access and display it using the load attribute of the packet. For example:
def callback(pkt):
    # load only exists when the packet carries a raw payload
    if pkt.haslayer(Raw):
        print(pkt.load)
If the data is not plain text, you need to know how the application encodes the data and decode it accordingly. If you're looking for output more similar to Wireshark's, you can try hexdump(pkt).
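Since the goal is to code this yourself, here is a small sketch of both views in plain Python 3: a text view that masks non-printable characters and a hex view similar to Wireshark's bytes pane. Both function names are invented for this example; the input would be something like bytes(pkt[Raw].load).

```python
def payload_to_text(data, encoding="utf-8"):
    """Decode payload bytes as text, masking control characters with '.'."""
    text = data.decode(encoding, errors="replace")
    return "".join(ch if ch.isprintable() or ch in "\r\n\t" else "." for ch in text)


def simple_hexdump(data, width=16):
    """Return a dump with offset, hex bytes, and an ASCII column per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Pad the hex column so the ASCII column lines up on short rows
        lines.append("%04x  %-*s  %s" % (offset, width * 3 - 1, hex_part, ascii_part))
    return "\n".join(lines)


payload = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
print(payload_to_text(payload))
print(simple_hexdump(payload))
```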
I am trying to analyse a file containing packets captured using tcpdump. I first want to categorize the packets into flows using 5-tuple. Then I need to get the size and inter-arrival time of each packet in each flow. I tried Conversation list in wireshark but it gives only the number of packets in the flow not information about each packet in the flow. A suggestion for any code (c++ or shell script) that can do the job? Thank you
UmNyobe,
If you haven't heard of Scapy yet, I believe what you are trying to do would be a near-perfect fit. For example, I wrote this little snippet using Scapy to parse a pcap file and give me something like what you are talking about.
#!/usr/bin/python -tt
from scapy.all import *
import sys
from datetime import datetime
'''Parse PCAP files into easy to read NETFLOW like output\n
Usage:\n
python cap2netflow.py <[ pcap filename or -l ]>\n
-l is live capture switch\n
ICMP packets print as source ip, type --> dest ip, code'''

# IP protocol numbers handled by this script
PROTO_NAMES = {6: 'TCP', 17: 'UDP', 1: 'ICMP'}

def parse_netflow(pkt):
    # grabs 'netflow-esque' fields from packets in a PCAP file
    ip = pkt.getlayer(IP)
    if ip is None:
        return  # skip non-IP packets (e.g. ARP)
    proto = PROTO_NAMES.get(ip.proto)
    snifftime = datetime.fromtimestamp(float(pkt.time)).strftime('%H:%M:%S')
    if proto in ('TCP', 'UDP'):
        l4 = pkt.getlayer(proto)
        if l4 is None:
            return  # e.g. a non-initial IP fragment without a transport header
        print(' '.join([snifftime, proto.rjust(4, ' '), str(ip.src).rjust(15, ' '), str(l4.sport).rjust(5, ' '), '-->', str(ip.dst).rjust(15, ' '), str(l4.dport).rjust(5, ' ')]))
    elif proto == 'ICMP':
        icmp = pkt.getlayer(ICMP)
        if icmp is None:
            return
        print(' '.join([snifftime, 'ICMP'.rjust(4, ' '), str(ip.src).rjust(15, ' '), ('t: ' + str(icmp.type)).rjust(5, ' '), '-->', str(ip.dst).rjust(15, ' '), ('c: ' + str(icmp.code)).rjust(5, ' ')]))

if '-l' in sys.argv:
    sniff(prn=parse_netflow)
else:
    pkts = rdpcap(sys.argv[1])
    print(' '.join(['Date: ', datetime.fromtimestamp(float(pkts[0].time)).strftime('%Y-%m-%d')]))
    for pkt in pkts:
        parse_netflow(pkt)
Install Python and Scapy, then use this to get you started. Let me know if you need any assistance figuring it all out; if you know C++, chances are this will already make a lot of sense to you.
Get Scapy here
http://www.secdev.org/projects/scapy/
There are tons of links on this page to helpful tutorials. Keep in mind that Scapy does a lot more, but focus on the areas that talk about pcap parsing.
I hope this helps!
dc
I worked on a library to analyze tcpdump captures, but it was for a business, so I cannot just give it to you. If you don't find what you are looking for, this answer may still help. A tcpdump capture is just nested network data, like Matryoshka dolls, where the pcap layer is added by tcpdump.
If you only want to work on the captures, the format of a dump is specified in Libpcap File Format. To get the size and time of arrival of each packet you need to process the dump using this specification.
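As a rough sketch of that processing: the classic libpcap format is a 24-byte global header followed by records, each with a 16-byte header (seconds, microseconds, captured length, original length) and then the packet bytes. The function name is made up for this example, and it assumes the microsecond-resolution classic format, not pcapng or the nanosecond variant.

```python
import io
import struct


def read_pcap_records(stream):
    """Yield (timestamp_seconds, captured_len, original_len) per record."""
    header = stream.read(24)
    magic = header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written in big-endian byte order
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    while True:
        record = stream.read(16)
        if len(record) < 16:
            break
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack(endian + "IIII", record)
        stream.read(incl_len)  # skip the packet bytes themselves
        yield ts_sec + ts_usec / 1e6, incl_len, orig_len


# Build a tiny two-packet capture in memory to exercise the parser
global_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
rec1 = struct.pack("<IIII", 100, 0, 4, 4) + b"\x00" * 4
rec2 = struct.pack("<IIII", 100, 250000, 2, 60) + b"\x00" * 2
records = list(read_pcap_records(io.BytesIO(global_header + rec1 + rec2)))
print(records)
```

For a real capture, the stream would be open(filename, "rb"), and inter-arrival times follow from the differences between consecutive timestamps.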
If you have to go deeper in the analysis, these are the layers, in order:
the link layer
the internet layer
the transport layer
the application layer
Each layer has a header definition. So you need to find out which protocol stack your pcap data contains and parse the headers to get the information.
What are the members of the 5-tuple? If the flows are TCP or UDP, the source and destination IP addresses and port numbers, plus, perhaps, a number to distinguish multiple flows over time between the two endpoints would work; for SCTP, it would be similar, although if a flow is a stream, you might need more.
If the members of the 5-tuple are all "named fields" in Wireshark, you could use TShark with the -T fields option and use the -e option to specify which fields to print. Select a field with the time stamp (frame.time_epoch would give you the time as seconds and fractions of a second since the UN*X epoch), a field with the appropriate size (frame.len gives you the raw number of bytes in the link-layer packet PLUS any meta-data such as a radiotap header for 802.11 radio information), and the other fields, and then feed the output of TShark to a script or program that does the processing you want. That lets TShark do the processing of the protocol layers, so that your program only needs to process the resulting data.
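A sketch of the script side, assuming TShark was run along the lines of tshark -r capture.pcap -T fields -e frame.time_epoch -e frame.len -e ip.src -e ip.dst -e ip.proto -e tcp.srcport -e tcp.dstport -e udp.srcport -e udp.dstport, which prints one tab-separated row per packet. The file name, the function name, and the column order are assumptions of this example, and each direction of a conversation is treated as its own flow.

```python
from collections import defaultdict


def flows_from_tshark(lines):
    """Group tshark field rows into flows keyed by the 5-tuple.

    Each row: time_epoch, frame_len, ip_src, ip_dst, ip_proto,
    tcp_sport, tcp_dport, udp_sport, udp_dport (tab-separated).
    Returns {five_tuple: [(size, inter_arrival_seconds), ...]}.
    """
    flows = defaultdict(list)
    last_seen = {}
    for line in lines:
        cols = line.rstrip("\r\n").split("\t")
        if len(cols) < 9 or not cols[2]:
            continue  # no IPv4 addresses in this row
        ts, size = float(cols[0]), int(cols[1])
        # Only one of the TCP/UDP port pairs is filled in for a given packet
        sport, dport = cols[5] or cols[7], cols[6] or cols[8]
        key = (cols[2], cols[3], cols[4], sport, dport)
        gap = ts - last_seen[key] if key in last_seen else 0.0
        last_seen[key] = ts
        flows[key].append((size, gap))
    return dict(flows)


sample = [
    "1.000\t60\t10.0.0.1\t10.0.0.2\t6\t1234\t80\t\t",
    "1.250\t1500\t10.0.0.1\t10.0.0.2\t6\t1234\t80\t\t",
    "1.500\t90\t10.0.0.3\t10.0.0.2\t17\t\t\t5353\t53",
]
flows = flows_from_tshark(sample)
for key, packets in flows.items():
    print(key, packets)
```

Reading from a pipe instead of the sample list only needs flows_from_tshark(sys.stdin) after import sys.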