Reproducing legacy binary file with Python - python-2.7

I'm trying to write a legacy binary file format in Python 2.7 (the file will be read by a C program).
Is there a way to output the hex representation of integers to a file? I suspect I'll have to roll my own (not least because I don't think Python has the concept of short int, int and long int), but just in case I thought I'd ask. If I have a list:
[0x20, 0x3AB, 0xFFFF]
Is there an easy way to write that to a file so a hex editor would show the file contents as:
20 00 AB 03 FF FF
(note the endianness)?

Since you have some specific formatting needs, I think that using hex is out: it adds a 0x prefix and produces text, whereas a hex editor will only show 20 00 AB 03 FF FF if the raw bytes themselves are written. Split each value into its low and high bytes and write those instead.
data = [0x20, 0x3AB, 0xFFFF]
def split_digit(n):
    """ Bitmasks out the first and second bytes of a <=16 bit number.
    Consider checking if isinstance(n, long) and throwing an error.
    """
    return (0x00ff & n, (0xff00 & n) >> 8)
[hex(x) + ' ' + hex(y) for x, y in [split_digit(d) for d in data]]
# ['0x20 0x0', '0xab 0x3', '0xff 0xff']
with open('myFile.bin', 'wb') as fh:
    for datum in data:
        little, big = split_digit(datum)
        # bytearray writes the raw byte values; format(little, '02x') would write ASCII text
        fh.write(bytearray([little, big]))
...or something like that? You'll need to change the formatting a bit, I bet.
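As an alternative to masking the bytes by hand, the standard struct module can do the packing. This is a minimal sketch, assuming every value fits in an unsigned 16-bit field; pack_words and write_words are hypothetical helper names, not part of the original answer.

```python
import struct

data = [0x20, 0x3AB, 0xFFFF]

def pack_words(values):
    """Pack each value as a little-endian unsigned 16-bit integer."""
    # '<' selects little-endian byte order, 'H' is an unsigned short (2 bytes).
    # struct.error is raised if a value does not fit in 16 bits.
    return struct.pack('<%dH' % len(values), *values)

def write_words(path, values):
    """Write the packed values to a binary file (hypothetical helper)."""
    with open(path, 'wb') as fh:
        fh.write(pack_words(values))

payload = pack_words(data)
```

Calling write_words('myFile.bin', data) should give a file whose bytes are 20 00 AB 03 FF FF. Other field widths use other format characters, for example 'I' for 4-byte and 'Q' for 8-byte unsigned integers.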

How do I access binary data via python registry?

The data in the registry key looks like:
Name Type Value
Data REG_BINARY 60 D0 DB 9E 2D 47 Cf 01
The data represents an 8-byte (little-endian QWORD) filetime value. Why they chose REG_BINARY rather than REG_QWORD is anyone's guess.
In the Python 2.7 code I can see that the value has been located and that a value object contains the key information, such as:
print "***", value64.name(), value64.value_type(), value64.value
*** Data 3 <bound method RegistryValue.value of <Registry.Registry.RegistryValue object at 0x7f2d500b3990>>
The name 'Data' is correct and the value_type of 3 means REG_BINARY so that is correct.
The documentation for python-registry (assuming I have the right doc) is
https://github.com/williballenthin/python-registry/blob/master/documentation/registry.html
However, I can't figure out what methods/functions have been provided to process binary data.
Because I know this binary data will always be 8 bytes I'm tempted to cast the object pointer to a QWORD (double) pointer and get the value directly but I'm not sure the object points to the data or how I would do this in python anyway.
Any pointers appreciated.
I figured out that the type of value64.value() was 'str', so I used simple character indexing to reference each of the 8 bytes and converted the value to an integer.
def bin_to_longlong(binval):
    return ord(binval[7])*(2**56) + ord(binval[6])*(2**48) + ord(binval[5])*(2**40) + ord(binval[4])*(2**32) + \
           ord(binval[3])*(2**24) + ord(binval[2])*(2**16) + ord(binval[1])*(2**8) + ord(binval[0])
Code by me.
which can be tidied up by using struct.unpack (after import struct) like so:
    return struct.unpack('<Q', binval)[0] # '<Q' is a little-endian unsigned long long
I then converted the integer (the filetime value) to a date.
from datetime import datetime
EPOCH_AS_FILETIME = 116444736000000000 # January 1, 1970 as MS file time
HUNDREDS_OF_NANOSECONDS = 10000000 # number of 100 ns intervals per second
def filetime_to_dt(ft):
    return datetime.fromtimestamp((ft - EPOCH_AS_FILETIME) / HUNDREDS_OF_NANOSECONDS)
Code from : https://gist.github.com/Mostafa-Hamdy-Elgiar/9714475f1b3bc224ea063af81566d873
Like so :
value64date = filetime_to_dt(bin_to_longlong(value64.value()))
Now hopefully someone can show me how to do that elegantly in python!
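One possibly tidier route is to unpack the 8 bytes with struct and add the tick count to the FILETIME epoch directly. This is only a sketch: filetime_bytes_to_dt and FILETIME_EPOCH are hypothetical names, the result is a naive datetime in UTC (the code above uses local time via fromtimestamp), and precision below one microsecond is dropped.

```python
import struct
from datetime import datetime, timedelta

# FILETIME counts 100-nanosecond intervals since January 1, 1601 (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1)

def filetime_bytes_to_dt(raw):
    """Convert 8 little-endian FILETIME bytes to a naive UTC datetime."""
    (ticks,) = struct.unpack('<Q', raw)  # unsigned 64-bit integer
    # 10 ticks of 100 ns make one microsecond.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

# The bytes shown in the registry dump: 60 D0 DB 9E 2D 47 CF 01
raw = bytes(bytearray([0x60, 0xD0, 0xDB, 0x9E, 0x2D, 0x47, 0xCF, 0x01]))
ticks = struct.unpack('<Q', raw)[0]
when = filetime_bytes_to_dt(raw)
```

With the registry object from the question this would be called as filetime_bytes_to_dt(value64.value()).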

Python Read then Write project

I am trying to write a program that reads a text file and converts what it reads into another text file using the given variables, kind of like a homemade encryption. I want the program to read 2 bytes at a time until it has read the entire file. I am new to Python but enjoy the application. Any help would be greatly appreciated.
a = 12
b = 34
c = 56
etc... up to 20 different types of variables
file2= open("textfile2.text","w")
file = open("testfile.txt","r")
file.read(2):
if file.read(2) = 12 then;
file2.write("a")
else if file.read(2) = 34
file2.write("b")
else if file.read(2) = 56
file2.write("c")
file.close()
file2.close()
Text file would look like:
1234567890182555
so the program would read 12 and write "a" in the other text file and then read 34 and put "b" in the other text file. Just having some logic issues.
I like your idea; here is how I would do it. Note that I convert everything to lowercase using lower(), but if you understand what I am doing it would be quite simple to extend this to work on both lowercase and uppercase:
import string
d = dict.fromkeys(string.ascii_lowercase, 0) # Create a dictionary of all the letters in the alphabet
updates = 0
while updates < 20: # Can only encode 20 characters
    letter = input("Enter a letter you want to encode or type encode to start encoding the file: ")
    if letter.lower() == "encode": # Check if the user entered encode
        break
    if len(letter) == 1 and letter.isalpha(): # Check the user's input was only 1 character long and in the alphabet
        encode = input("What do want to encode %s to: " % letter.lower()) # Ask the user what they want to encode that letter to
        d[letter.lower()] = encode
        updates += 1
    else:
        print("Please enter a letter...")
with open("data.txt") as f:
    content = list(f.read().lower())
for idx, val in enumerate(content):
    if val.isalpha():
        content[idx] = d[val]
with open("data.txt", 'w') as f:
    f.write(''.join(map(str, content)))
print("The file has been encoded!")
Example Usage:
Original data.txt:
The quick brown fox jumps over the lazy dog
Running the script:
Enter a letter you want to encode or type encode to start encoding the file: T
What do want to encode t to: 6
Enter a letter you want to encode or type encode to start encoding the file: H
What do want to encode h to: 8
Enter a letter you want to encode or type encode to start encoding the file: u
What do want to encode u to: 92
Enter a letter you want to encode or type encode to start encoding the file: 34
Please enter a letter...
Enter a letter you want to encode or type encode to start encoding the file: rt
Please enter a letter...
Enter a letter you want to encode or type encode to start encoding the file: q
What do want to encode q to: 9
Enter a letter you want to encode or type encode to start encoding the file: encode
The file has been encoded!
Encoded data.txt:
680 992000 00000 000 092000 0000 680 0000 000
I would read the source file and convert the items as you go into a string. Then write the entire result string separately to the second file. This would also allow you to use the better with open construct for file reading. This allows python to handle file closing for you.
The code below is incomplete on purpose: it only reads the first two characters. You need to work out how to iterate over the file yourself, but here is the idea (without just handing you a solution):
results = ""
with open("testfile.txt","r") as f:
    # you need to create a way to iterate over these two byte/char increments
    code = f.read(2)
    decoded = <figure out what code translates to>
    results += decoded
# now you have a decoded string inside `results`
with open("textfile2.text","w") as f:
    f.write(results)
The decoded = <figure out what code translates to> part can be done much better than with a chain of if/elif statements.
Perhaps define a dictionary of the encodings?
codings = {
    "12": "a",
    "34": "b",
    # etc...
}
then you could just:
results += codings[code]
instead of the if statements (and it would be faster).
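To make the dictionary idea concrete, here is one possible way to walk a string two characters at a time. It is a sketch, not the only approach: translate_pairs and the "?" placeholder for unknown codes are assumptions made for illustration.

```python
def translate_pairs(text, codings, unknown="?"):
    """Translate text two characters at a time using a lookup table."""
    pieces = []
    # Step through the text in strides of 2: indices 0, 2, 4, ...
    for start in range(0, len(text), 2):
        code = text[start:start + 2]
        # Codes missing from the table become the `unknown` marker.
        pieces.append(codings.get(code, unknown))
    return "".join(pieces)

codings = {"12": "a", "34": "b", "56": "c"}
result = translate_pairs("123456", codings)
```

For a file, the same function could be applied to the stripped contents of f.read(), with the returned string written to the second file.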

Python 2 str.decode('hex') in Python 3?

I want to send hex encoded data to another client via sockets in python. I managed to do everything some time ago in python 2. Now I want to port it to python 3.
Data looks like this:
""" 16 03 02 """
Then I used this function to get it into a string:
x.replace(' ', '').replace('\n', '').decode('hex')
It then looks like this (which is a type str by the way):
'\x16\x03\x02'
Now I managed to find this in python 3:
codecs.decode('160302', 'hex')
but it returns another type:
b'\x16\x03\x02'
And since what I encode is not text in any proper language, I cannot use UTF-8 or other decoders, as there are invalid bytes in it (e.g. \x00, \xFF). Any ideas on how I can get the escaped string again, just like in Python 2?
Thanks
'str' objects in python 3 are not sequences of bytes but sequences of unicode code points.
If by "send data" you mean calling send then bytes is the right type to use.
If you really want the string (not 3 bytes but 12 unicode code points):
>>> import codecs
>>> s = str(codecs.decode('16ff00', 'hex'))[2:-1]
>>> s
'\\x16\\xff\\x00'
>>> print(s)
\x16\xff\x00
Note that you need to double backslashes in order to represent them in code.
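If what is actually wanted is a str with one character per byte, as Python 2 produced, another option is to decode with latin-1, which maps each byte value 0-255 to the code point with the same number. This is a sketch for Python 3 only; for sending over a socket, the bytes object is still the type to use.

```python
raw = bytes.fromhex('16ff00')        # b'\x16\xff\x00'

# latin-1 never fails: every byte maps to exactly one code point.
text = raw.decode('latin-1')

# Encoding with the same codec restores the original bytes.
round_trip = text.encode('latin-1')
code_points = [ord(ch) for ch in text]
```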
There is a standard solution that works in both Python 2 and Python 3. No imports needed:
hex_string = """ 16 03 02 """
some_bytes = bytearray.fromhex(hex_string)
In Python 3 you can handle it much like a str (slicing, iterating, etc.), and you can append byte strings such as b'\x00', b'text' or bytes('text','utf8').
You also mentioned UTF-8. A bytearray has decode rather than encode, so if the bytes happen to be valid UTF-8 you can turn them into text with:
some_bytes.decode('utf-8')
As you can see, you don't need to strip the spaces first. If you want to get back to a hexadecimal string in Python 3.5 or later, some_bytes.hex() will do it for you.
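A short round trip in Python 3 might look like the following. It is a sketch under two assumptions: the whitespace is removed explicitly because Python versions before 3.7 only skip spaces (not newlines) in fromhex, and the socket call in the comment uses a hypothetical sock object.

```python
hex_string = """ 16 03 02 """

# Remove all whitespace, including newlines, before parsing.
compact = "".join(hex_string.split())

some_bytes = bytearray.fromhex(compact)
payload = bytes(some_bytes)   # immutable copy, suitable for sock.sendall(payload)
as_hex = some_bytes.hex()     # back to a hex string (Python 3.5+)
```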
a = """ 16 03 02 """.encode("utf-8")
#Send things over socket
print(a.decode("utf-8"))
Why not encode with UTF-8, send over the socket, and decode with UTF-8 again?

Python rounding error for ints in a list (list contains ints and strings)

I have a python (2.7) script that reads an input file that contains text setup like this:
steve 83 67 77
The script averages the numbers corresponding to each name and returns a list for each name, that contains the persons name along with the average, for example the return output looks like this:
steve 75
However, the actual average value for "steve" is "75.66666667". Because of this, I would like the return value to be 76, not 75 (aka I would like it to round up to the nearest whole integer). I'm not sure how to get this done... Here is my code:
filename = raw_input('Enter a filename: ')
file=open(filename,"r")
line = file.readline()
students=[]
while line != "":
    splitedline=line.split(" ")
    average=0
    for i in range(len(splitedline)-1) :
        average+=int(splitedline[i+1])
    average=average/(len(splitedline)-1)
    students.append([splitedline[0],average])
    line = file.readline()
for v in students:
    print " ".join(map(str, v))
file.close()
While your code is very messy and should be improved overall, the solution to your problem should be simple:
average=average/(len(splitedline)-1)
should be:
average /= float(len(splitedline) - 1)
average = int(round(average))
By default in Python 2.x, / with two integers performs floor division. You must explicitly make one of the operands a floating-point number to get true division. Then you must round the result and turn it back into an integer.
In Python 3, floor division is // and true division is /. You can get this behavior in Python 2 with from __future__ import division.
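Putting the fix together for a single input line, a minimal sketch could look like this. rounded_average is a hypothetical helper name; the float() call keeps the division exact on both Python 2 and Python 3.

```python
def rounded_average(line):
    """Return (name, average rounded to the nearest integer) for one line."""
    fields = line.split()
    name = fields[0]
    scores = [int(s) for s in fields[1:]]
    # float() forces true division even on Python 2.
    average = sum(scores) / float(len(scores))
    return name, int(round(average))

name, avg = rounded_average("steve 83 67 77")
```

One caveat: for values exactly halfway between two integers, round() rounds away from zero in Python 2 but to the nearest even integer in Python 3, so results for such averages can differ between versions.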

I need to write a Python stub to print names of image files and whether they are blurry or not

New user here, and just started Python a few days ago!
My question is:
I need to write a Python stub to print names of image files and whether they are blurry or not. They are considered blurry if the value is > 0.3. There are 5 bits of information in each line, the second bit (index 1) is the number in question. In total there are 1868 lines.
Here is a sample of the data:
['out04-32-44-03.tif,0.295554,536047.6051,5281850.4252,19.8091\n',
'out04-32-44-15.tif,0.337232,536047.2831,5281850.5974,19.8256\n',
'out04-32-44-27.tif,0.2984,536046.9611,5281850.7696,19.8420\n',
'out04-32-44-39.tif,0.311989,536046.6392,5281850.9418,19.8584\n',
'out04-32-44-51.tif,0.346901,536046.3172,5281851.1140,19.8749\n',
'out04-32-44-63.tif,0.358519,536045.9953,5281851.2862,19.8913\n',
'out04-32-44-75.tif,0.342837,536045.6733,5281851.4584,19.9078\n',
'out04-32-44-87.tif,0.32909,536045.3513,5281851.6306,19.9242\n',
'out04-32-44-99.tif,0.294824,536045.0294,5281851.8028,19.9406\n']
Any suggestions greatly appreciated :-)
Based on the code you posted in the comments. This is for Python 2.7:
# raw string so the backslashes in the Windows path are taken literally
with open(r'E:\KGG 375 - GIS Advanced\Assignment 2 - Python\TIR043109gpxpos.txt') as fin:
    for line in fin: # no need to read these into a list first
        info = line.split(',')
        blurry = float(info[1])
        print info[0],
        if blurry > 0.3:
            print ' is blurry'
        else:
            print ' is not blurry'
Explanation:
There is no need to read the lines of a file into a list; you can iterate over the file object directly and it will be read line by line.
To be able to compare against a float, you need to convert the 2nd element (info[1]) into a float.
print info[0], prints the filename, and the trailing comma prevents a line break, so " is blurry" is printed on the same line. Note: this is Python 2.7 syntax, so it will not work with Python 3.x.
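For reference, the same logic can be written so that it runs on both Python 2.7 and Python 3. This is a sketch: classify is a hypothetical helper, and the two sample lines are copied from the question.

```python
def classify(lines, threshold=0.3):
    """Return a list of (filename, is_blurry) pairs from comma-separated lines."""
    results = []
    for line in lines:
        fields = line.strip().split(',')
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # The second field (index 1) holds the blur value.
        results.append((fields[0], float(fields[1]) > threshold))
    return results

sample = [
    'out04-32-44-03.tif,0.295554,536047.6051,5281850.4252,19.8091\n',
    'out04-32-44-15.tif,0.337232,536047.2831,5281850.5974,19.8256\n',
]

for name, blurry in classify(sample):
    # A single formatted argument prints the same way on Python 2 and 3.
    print('%s is %s' % (name, 'blurry' if blurry else 'not blurry'))
```

Because a file object yields lines when iterated, an open file can be passed to classify in place of the sample list.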