Python Read then Write project - python-2.7

I am trying to write a program that reads a text file and writes a converted version of it to another text file, using a given set of variables, kind of like a homemade encryption. I want the program to read 2 bytes at a time until it has read the entire file. I am new to Python but enjoy working with it. Any help would be greatly appreciated.
a = 12
b = 34
c = 56
etc... up to 20 different types of variables
file2= open("textfile2.text","w")
file = open("testfile.txt","r")
file.read(2):
if file.read(2) = 12 then;
file2.write("a")
else if file.read(2) = 34
file2.write("b")
else if file.read(2) = 56
file2.write("c")
file.close()
file2.close()
Text file would look like:
1234567890182555
So the program would read 12 and write "a" to the other text file, then read 34 and write "b" to the other text file. I am just having some logic issues.

I like your idea. Here is how I would do it. Note that I convert everything to lowercase using lower(); however, if you understand what I am doing, it would be quite simple to extend this to handle both lowercase and uppercase:
import string

d = dict.fromkeys(string.ascii_lowercase, 0)  # Create a dictionary of all the letters in the alphabet
updates = 0
while updates < 20:  # Can only encode 20 characters
    letter = input("Enter a letter you want to encode or type encode to start encoding the file: ")
    if letter.lower() == "encode":  # Check if the user entered encode
        break
    if len(letter) == 1 and letter.isalpha():  # Check the user's input was only 1 character long and in the alphabet
        encode = input("What do want to encode %s to: " % letter.lower())  # Ask the user what they want to encode that letter to
        d[letter.lower()] = encode
        updates += 1
    else:
        print("Please enter a letter...")
with open("data.txt") as f:
    content = list(f.read().lower())
for idx, val in enumerate(content):
    if val.isalpha():
        content[idx] = d[val]  # Letters that were never mapped become 0
with open("data.txt", 'w') as f:
    f.write(''.join(map(str, content)))
print("The file has been encoded!")
Example Usage:
Original data.txt:
The quick brown fox jumps over the lazy dog
Running the script:
Enter a letter you want to encode or type encode to start encoding the file: T
What do want to encode t to: 6
Enter a letter you want to encode or type encode to start encoding the file: H
What do want to encode h to: 8
Enter a letter you want to encode or type encode to start encoding the file: u
What do want to encode u to: 92
Enter a letter you want to encode or type encode to start encoding the file: 34
Please enter a letter...
Enter a letter you want to encode or type encode to start encoding the file: rt
Please enter a letter...
Enter a letter you want to encode or type encode to start encoding the file: q
What do want to encode q to: 9
Enter a letter you want to encode or type encode to start encoding the file: encode
The file has been encoded!
Encoded data.txt:
680 992000 00000 000 092000 0000 680 0000 000

I would read the source file and convert the items into a string as you go, then write the entire result string to the second file in a separate step. This also lets you use the better with open construct for file handling, which lets Python take care of closing the files for you.
The code below will not work as is, because it only reads the first two characters. You need to come up with your own way to iterate, but here is an idea (without just handing you a solution):
results = ""
with open("testfile.txt", "r") as f:
    # you need to create a way to iterate over these two byte/char increments
    code = f.read(2)
    decoded = <figure out what code translates to>
    results += decoded
# now you have a decoded string inside `results`
with open("textfile2.text", "w") as f:
    f.write(results)
The decoded = <figure out what code translates to> part can be done much better than with a long chain of if/elif statements.
Perhaps define a dictionary of the encodings?
codings = {
    "12": "a",
    "34": "b",
    # etc...
}
Then you could just write:
results += codings[code]
instead of the if statements (and it would be faster).
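Putting the pieces above together, one possible way to do the iteration is sketched below. It is only a sketch under a few assumptions: it is written for Python 3, the function name decode_pairs and the "?" placeholder for unknown codes are made up for illustration, the mapping only covers the three codes given in the question, and the text is passed in as a string rather than read from a file.

```python
# Hypothetical mapping: only the three codes from the question are filled in.
codings = {"12": "a", "34": "b", "56": "c"}


def decode_pairs(text, table, unknown="?"):
    """Translate text two characters at a time using the lookup table."""
    pieces = []
    for i in range(0, len(text), 2):
        code = text[i:i + 2]  # the last slice may be a single character
        pieces.append(table.get(code, unknown))  # unknown codes become "?"
    return "".join(pieces)


# Sample line taken from the question.
print(decode_pairs("1234567890182555", codings))
```

To work with files, the same function could be fed the result of f.read().strip() from the input file, with its return value written to the output file inside a second with open block.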

Related

python file reading and splitting the words

I am reading a file in Python and splitting its contents on '\n'. When I print the resulting list, it shows 'Magni\xef\xac\x81cent Mary' instead of 'Magnificent Mary'.
Here is my code...
with open('/home/naveen/Desktop/answer.txt') as ans:
    content = ans.read()
    content = content.split('\n')
    print content
Note: answer.txt contains the following lines:
Magnificent Mary
Flying Sikh
Payyoli Express
Here is my output of the program
The problem is in your text file: there are some non-ASCII Unicode characters in "Magnificent Mary". If you fix that, your program should work. If you want to read text containing Unicode characters, you have to decode it properly from UTF-8.
Have a look at this one (assuming you want to use Python 2): Backporting Python 3 open(encoding="utf-8") to Python 2
Python 2:
import codecs
with codecs.open(filename='/Users/emily/Desktop/answers.txt', mode='rb', encoding='UTF-8') as ans:
    content = ans.read().splitlines()
    for i in content:
        print i
If you can use Python 3, you can actually do this:
with open('/home/naveen/Desktop/answer.txt', encoding='UTF-8') as ans:
    content = ans.read().splitlines()
    print(content)
There is a problem with the 'fi' in Magnificent Mary. It is not a normal 'f' followed by an 'i', but the single character LATIN SMALL LIGATURE FI. You can simply delete it and retype 'fi' in gedit.
To verify the difference, simply include
print [(ord(a), a) for a in content[0]]
at the end of your code and compare the output for both versions of the 'f'.
If there is no way to edit the file, you could first convert the string to unicode and then use Python's unicodedata module.
import unicodedata
file = open("answer.txt")
file = (file.read()).decode('utf-8')
print unicodedata.normalize('NFKD', file).encode('ascii', 'ignore').split("\n")
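For readers on Python 3, the same normalization idea can be sketched as follows. This is an illustration, not the answerer's code: the sample text is hard-coded instead of being read from answer.txt, and the ligature is written as the escape \ufb01 so that the example does not depend on how the source file is encoded.

```python
import unicodedata

# Sample content from the question, with the ligature written as an escape.
raw = "Magni\ufb01cent Mary\nFlying Sikh\nPayyoli Express\n"

# Show the official name of the single ligature character.
print(unicodedata.name("\ufb01"))

# NFKD applies compatibility decomposition, which splits the ligature
# into the two plain letters "f" and "i".
normalized = unicodedata.normalize("NFKD", raw)
lines = normalized.splitlines()
print(lines)
```

Unlike the encode('ascii', 'ignore') step in the Python 2 version, this sketch keeps any other non-ASCII characters instead of dropping them.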

In python insert one space after every 5th Character in each line of a text file

I am reading a text file in Python (500 rows) and it looks like this:
File Input:
0082335401
0094446049
01008544409
01037792084
01040763890
I wanted to ask whether it is possible to insert one space after the 5th character in each line:
Desired Output:
00823 35401
00944 46049
01008 544409
01037 792084
01040 763890
I have tried the code below:
st = " ".join(st[i:i + 5] for i in range(0, len(st), 5))
but the below output was returned on executing it:
00823 35401
0094 44604 9
010 08544 409
0 10377 92084
0104 07638 90
I am a novice in Python. Any help would make a difference.
The main issue seems to be that, judging by the output of your code, you are reading the file into one single string, so the newline characters are counted as part of each group of five. It would be much preferable (in your case) to read the file in as a list of strings, like the following (assuming your input file is input_data.txt):
# Initialize a list for the data to be stored
data = []
# Iterate through your file to read the data
with open("input_data.txt") as f:
    for line in f.readlines():
        # Use .rstrip() to get rid of the newline character at the end
        data.append(line.rstrip("\r\n"))
Then, to operate on the data you obtained in a list, you could use a list comprehension similar to the one you have tried to use.
# Assumes that data is the result from the above code
data = [i[:5] + " " + i[5:] if len(i) > 5 else i for i in data]
Hope this helped!
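The effect of working line by line can be checked with a small self-contained sketch. The helper name split_after_fifth is made up for illustration, and the sample is three lines copied from the question rather than the real 500-row file.

```python
def split_after_fifth(line):
    """Insert one space after the 5th character if the line is longer than that."""
    return line[:5] + " " + line[5:] if len(line) > 5 else line


# Sample taken from the question. splitlines() drops the newline characters,
# so they are never counted as part of a line.
raw = "0082335401\n0094446049\n01008544409\n"
result = [split_after_fifth(line) for line in raw.splitlines()]
print("\n".join(result))
```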
If your only requirement is to insert a space after the fifth character, then you could use the following simple version:
#!/usr/bin/env python
with open("input_data") as data:
    for line in data.readlines():
        line = line.rstrip()
        if len(line) > 5:
            print(line[0:5]+" "+line[5:])
        else:
            print(line)
If you don't mind lines with five or fewer characters getting a space at the end, you could even omit the if-else statement and just use the print call from the if clause:
#!/usr/bin/env python
with open("input_data") as data:
    for line in data.readlines():
        line = line.rstrip()
        print(line[0:5]+" "+line[5:])

Python:How can you recursively search a .txt file, find matches and print results

I have been searching for an answer to this, but cannot seem to find what I need. I would like a Python script that reads my text file from the top, works its way through each line, and then writes all the matches to another text file. The content of the text file is just 4-digit numbers like 1234.
example
1234
3214
4567
8963
1532
1234
...and so on.
I would like the output to be something like:
1234 : matches found = 2
I know that there are matches in the file, since it has almost 10000 lines. I appreciate any help. If someone could just point me in the right direction, that would be great. Thank you.
import re
file = open("filename", 'r')
fileContent=file.read()
pattern="1234"
print len(re.findall(pattern,fileContent))
If I were you, I would open the file, use the split method to create a list of all the numbers, and use the Counter class from collections to count how many times each number in the list occurs.
from collections import Counter
filepath = 'original_file'
new_filepath = 'new_file'
with open(filepath, 'r') as file:
    text = file.read()
# split() with no argument also skips blank lines
numbers_list = text.split()
# Keep only the numbers that occur more than once
dupes = ['%s : matches found = %d' % (item, count) for item, count in Counter(numbers_list).items() if count > 1]
with open(new_filepath, 'w') as new_file:
    for i in dupes:
        new_file.write(i + '\n')
Thanks to everyone who helped me on this. Thank you to @csabinho for the code he provided and to @IanAuld for asking me "Why do you think you need recursion here?". It got me thinking that the solution was a simple one. I just wanted to know which 4-digit numbers had duplicates and how many, and also which 4-digit combos were unique. So this is what I came up with, and it worked beautifully!
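import re
# Read the file once instead of reopening it for every number
with open("4digits.txt", 'r') as file:
    fileContent = file.read()
a = 999
while a < 9999:
    a = a + 1
    pattern = str(a)
    result = len(re.findall(pattern, fileContent))
    if result > 1:
        print(a, "matches", result)
    elif result == 1:
        print(a, "This number is unique!")
    # Numbers that never occur in the file are skipped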
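As an alternative to scanning the file once per candidate number, the duplicates and the unique numbers can be separated in a single pass with collections.Counter. The sketch below assumes Python 3 and uses the sample numbers from the question as a hard-coded string; the variable names duplicates and uniques are only illustrative.

```python
from collections import Counter

# Sample taken from the question; in practice this would come from file.read().
text = "1234\n3214\n4567\n8963\n1532\n1234\n"

# Count every number in one pass over the data.
counts = Counter(text.split())

# Numbers seen more than once, with their counts, and numbers seen exactly once.
duplicates = {number: n for number, n in counts.items() if n > 1}
uniques = [number for number, n in counts.items() if n == 1]

for number, n in duplicates.items():
    print("%s : matches found = %d" % (number, n))
print("unique:", uniques)
```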

readlines() function and unicodes

I have this file, testpi.txt, which I'd like to convert into a list of sentences.
>>>cat testpi.txt
This is math π.
That is moth pie.
Here's what I've done:
r = open('testpi.txt', 'r')
sentence_List = r.readlines()
print sentence_List
And when the output is sent to another text file, output.txt, this is what it looks like in output.txt:
['This is math \xcf\x80. That is moth pie.\n']
I tried codecs too, r = codecs.open('testpi.txt', 'r', encoding='utf-8'),
but then every entry in the output has a leading 'u'.
How can I display this byte string, \xcf\x80, as π in output.txt?
Please guide me, thanks.
The problem is you're printing the entire list which gives you an output format you don't want. Instead, print each string individually and it will work:
r = open('t.txt', 'r')
sentence_List = r.readlines()
for line in sentence_List:
    print line,
Or:
print "['{}']".format("', '".join(map(str.rstrip, sentence_List)))
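The difference between printing the list and writing each item can be reproduced in Python 3 with the sketch below. It is an illustration under assumptions: the two lines are hard-coded as UTF-8 bytes instead of being read from testpi.txt, and an in-memory io.StringIO object stands in for output.txt.

```python
import io

# Bytes as they would be read from the file in binary mode (UTF-8 encoded).
raw_lines = [b"This is math \xcf\x80.\n", b"That is moth pie.\n"]

# Printing the list shows its repr, where non-ASCII bytes appear as escapes.
print(raw_lines)

# Decoding each item and writing it on its own keeps the real characters.
decoded = [line.decode("utf-8") for line in raw_lines]
out = io.StringIO()  # stand-in for output.txt
for line in decoded:
    out.write(line)
```

After this runs, out.getvalue() holds the two sentences with the actual π character rather than the \xcf\x80 escape.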

I need to write a Python stub to print names of image files and whether they are blurry or not

New user here, and just started Python a few days ago!
My question is:
I need to write a Python stub to print the names of image files and whether they are blurry or not. They are considered blurry if the value is > 0.3. There are 5 pieces of information in each line; the second one (index 1) is the number in question. In total there are 1868 lines.
Here is a sample of the data:
['out04-32-44-03.tif,0.295554,536047.6051,5281850.4252,19.8091\n',
'out04-32-44-15.tif,0.337232,536047.2831,5281850.5974,19.8256\n',
'out04-32-44-27.tif,0.2984,536046.9611,5281850.7696,19.8420\n',
'out04-32-44-39.tif,0.311989,536046.6392,5281850.9418,19.8584\n',
'out04-32-44-51.tif,0.346901,536046.3172,5281851.1140,19.8749\n',
'out04-32-44-63.tif,0.358519,536045.9953,5281851.2862,19.8913\n',
'out04-32-44-75.tif,0.342837,536045.6733,5281851.4584,19.9078\n',
'out04-32-44-87.tif,0.32909,536045.3513,5281851.6306,19.9242\n',
'out04-32-44-99.tif,0.294824,536045.0294,5281851.8028,19.9406\n']
Any suggestions greatly appreciated :-)
Based on the code you have written in the comments, here is a version for Python 2.7:
fin = open(r'E:\KGG 375 - GIS Advanced\Assignment 2 - Python\TIR043109gpxpos.txt')  # raw string keeps the backslashes literal
for line in fin:  # no need to read these into a list first
    info = line.split(',')
    blurry = float(info[1])
    print info[0],
    if blurry > 0.3:
        print ' is blurry'
    else:
        print ' is not blurry'
Explanation:
There is no need to read the lines of a file into a list; you can iterate over the file directly and it will be read line by line.
To be able to compare against a float, you need to convert the 2nd element (info[1]) into a float.
print info[0], prints the filename, and the trailing comma prevents a line break, so " is blurry" is printed on the same line. Note: this is Python 2.7 syntax, so it will not work with Python 3.x.
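For Python 3, the same logic might look like the sketch below. It is not the answerer's code: the helper name describe and the constant BLUR_THRESHOLD are made up for illustration, and two sample lines from the question are hard-coded in place of the real file.

```python
# Two sample lines copied from the question.
lines = [
    "out04-32-44-03.tif,0.295554,536047.6051,5281850.4252,19.8091\n",
    "out04-32-44-15.tif,0.337232,536047.2831,5281850.5974,19.8256\n",
]

BLUR_THRESHOLD = 0.3  # values above this count as blurry, per the question


def describe(line, threshold=BLUR_THRESHOLD):
    """Return '<name> is blurry' or '<name> is not blurry' for one CSV line."""
    fields = line.rstrip("\n").split(",")
    name, value = fields[0], float(fields[1])  # index 1 holds the blur value
    verdict = "is blurry" if value > threshold else "is not blurry"
    return "%s %s" % (name, verdict)


for line in lines:
    print(describe(line))
```

Building the whole message as one string avoids relying on the Python 2 trailing-comma behaviour of print.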