how does perl take my text file containing different users - regex

The superadmin.txt file is in the form:
def
ghi
pqr
...etc
How would Perl read different users?
Will it be able to identify different text from superadmin.txt?
Here is the code:
@prelist = ();
@prelist = `cat ../cgi-bin/superadmin.txt`;
foreach $prename (@prelist) {
$prename =~s/\n//g ;
$superadminstaff{$prename} = "Y";

Well, let's rewrite the code so that it a) compiles, b) uses current best practice, and c) doesn't fork out to cat:
my @prelist = ();
my %superadminstaff = ();
open my $admin, '<', '../cgi-bin/superadmin.txt' or die "Can't open ../cgi-bin/superadmin.txt: $!\n";
chomp(@prelist = <$admin>);
@superadminstaff{@prelist} = ("Y") x @prelist;
So we have two variables, @prelist and %superadminstaff. @prelist will hold each line of the file, and %superadminstaff will end up keyed on each entry in the file.
Line 3 attempts to open the file and, if it can't, stops the script and prints a message explaining what went wrong.
Line 4 reads the file into @prelist and uses chomp to strip off the line endings. Note that chomp is used in preference to chop because the last line in the file may not have a line ending.
Line 5 uses some Perl magic called a hash slice to create entries in %superadminstaff for each element in @prelist, and then uses the x operator to create a list containing as many Y elements as there are elements in @prelist. This list of Ys is then assigned to the newly created elements in %superadminstaff.
So at the end of this code you have %superadminstaff containing each entry of the file. However, the order of the elements in the file will not be preserved, and any duplicate entries in the file will be reduced to one entry in %superadminstaff.

How to iterate through a list and add the contents to a file

Good day, fellow developers. I would like to append every item in a list to a text file, one item per line of the file, so that it looks like this:
list = ['something','foo','foooo','bar','bur','baar']
#the list
THE NORMAL FILE
this
is
the
text
file
:D
AND WHAT I WOULD LIKE TO DO
this something
is foo
the foooo
text bar
file bur
:D baar
This can be accomplished by reading the original file's contents and appending the added words to each line.
Example:
# changed the name to list_obj to prevent overriding builtin 'list'
list_obj = ['something','foo','foooo','bar','bur','baar']
path_to_file = "a path name.txt"
# r+ to read and write to/from the file
with open(path_to_file, "r+") as fileobj:
    # read all lines and only include lines that have something written
    lines = [x for x in fileobj.readlines() if x.strip()]
    # after reading reset the file position
    fileobj.seek(0)
    # iterate over the lines and words to add
    for line, word in zip(lines, list_obj):
        # create each new line with the added words
        new_line = "%s %s\n\n" % (line.rstrip(), word)
        # write the lines to the file
        fileobj.write(new_line)
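A variant of the same idea, as a minimal sketch for Python 3: it reads everything first and then reopens the file in "w" mode, so nothing from the old contents can be left behind after the rewritten lines. The function name append_words is hypothetical, and keeping lines that have no matching word unchanged is an assumption about what you want.

```python
from itertools import zip_longest


def append_words(path, words):
    """Append words[i] to the i-th non-empty line of the file at path.

    Hypothetical helper: lines without a matching word are kept as-is,
    and surplus words are ignored.
    """
    with open(path) as fileobj:
        # keep only lines that have something written on them
        lines = [line.rstrip() for line in fileobj if line.strip()]

    merged = []
    for line, word in zip_longest(lines, words):
        if line is None:
            break  # more words than lines: ignore the extra words
        merged.append(line if word is None else "%s %s" % (line, word))

    # "w" truncates the file, so the old contents are replaced entirely
    with open(path, "w") as fileobj:
        fileobj.writelines(item + "\n" for item in merged)
    return merged
```

Calling append_words("a path name.txt", list_obj) would rewrite the file in place and return the new lines.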

Hello I have a code that prints what I need in python but i'd like it to write that result to a new file

The file look like a series of lines with IDs:
aaaa
aass
asdd
adfg
aaaa
I'd like to get in a new file the ID and its occurrence in the old file as the form:
aaaa 2
asdd 1
aass 1
adfg 1
With the two elements separated by a tab.
The code I have prints what I want but doesn't write it to a new file:
with open("Only1ID.txt", "r") as file:
    file = [item.lower().replace("\n", "") for item in file.readlines()]
    for item in sorted(set(file)):
        print item.title(), file.count(item)
As you use Python 2, the simplest approach to convert your console output to file output is by using the print chevron (>>) syntax which redirects the output to any file-like object:
with open("filename", "w") as f: # open a file in write mode
    print >> f, "some data" # print 'into the file'
Your code could look like this after simply adding another open to open the output file and adding the chevron to your print statement:
with open("Only1ID.txt", "r") as file, open("output.txt", "w") as out_file:
    file = [item.lower().replace("\n", "") for item in file.readlines()]
    for item in sorted(set(file)):
        print >> out_file, item.title(), file.count(item)
However, your code has a few other things that one should avoid or could improve:
Do not use the same variable name file for both the file object returned by open and your processed list of strings. This is confusing; just use two different names.
You can iterate directly over the file object, which works like a generator that returns the file's lines as strings. Generators produce the next element just in time: instead of first loading the whole file into memory like file.readlines() and processing it afterwards, only one line at a time is read and stored, whenever the next line is needed. That way you improve the code's performance and resource efficiency.
If you write a list comprehension but don't necessarily need its result as a list, because you simply want to iterate over it using a for loop, it's more efficient to use a generator expression (same effect as the file object's line generator described above). The only syntactical difference between a list comprehension and a generator expression is the brackets. Replace [...] with (...) and you have a generator. The only downside of a generator is that you can neither find out its length nor access items directly using an index. As you don't need either of these features, the generator is fine here.
There is a simpler way to remove trailing newline characters from a line: line.rstrip() removes all trailing whitespace. If you want to keep e.g. spaces and only want the newline to be removed, pass that character as an argument: line.rstrip("\n").
However, it could possibly be even easier and faster to not add another implicit line break during the print call, instead of removing the newline first only to have it re-added later. You would suppress the line break of print in Python 2 by simply adding a comma at the end of the statement. Note that this only helps when the line's own text is the last thing printed; here the count is printed after it, so the newline still has to be stripped first:
print >> out_file, item.title(), file.count(item),
There is a type Counter to count occurrences of elements in a collection, which is faster and easier than writing it yourself, because you don't need the additional count() call for every element. The Counter behaves mostly like a dictionary with your items as keys and their counts as values. Simply import it from the collections module and use it like this:
from collections import Counter
c = Counter(lines)
for item in c:
    print item, c[item]
With all those suggestions (except the one not to remove the line breaks) applied and the variables renamed to something more clear, the optimized code looks like this:
from collections import Counter
with open("Only1ID.txt") as in_file, open("output.txt", "w") as out_file:
    counter = Counter(line.lower().rstrip("\n") for line in in_file)
    for item in sorted(counter):
        print >> out_file, item.title(), counter[item]
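For readers on Python 3, where the print chevron no longer exists, the same logic might look like the sketch below. The function name write_id_counts is hypothetical, and sep="\t" is used on the assumption that the tab separator asked for in the question is wanted (the Python 2 version above separates the two values with a space).

```python
from collections import Counter


def write_id_counts(in_path, out_path):
    """Write each distinct ID and its count, tab-separated, one per line.

    Hypothetical helper; returns the Counter so the caller can inspect it.
    """
    with open(in_path) as in_file, open(out_path, "w") as out_file:
        # generator expression: lines are normalised one at a time
        counter = Counter(line.lower().rstrip("\n") for line in in_file)
        for item in sorted(counter):
            # print(..., file=...) is the Python 3 counterpart of 'print >> f'
            print(item.title(), counter[item], sep="\t", file=out_file)
    return counter
```

It would be called as write_id_counts("Only1ID.txt", "output.txt").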

Asking user for raw_input to open a file, when attempting to run program comes back with mode 'r'

I am trying to run the following code:
fname = raw_input ('Enter file name:')
fh = open (fname)
count = 0
for line in fh:
    if not line.startswith ('X-DSPAM-Confidence:') : continue
    else:
        count = count + 1
new = fh #this new = fh is supposed to be fh stripped of the non- x-dspam lines
for line in new: # this separates the lines in new and allows finding the floats on each line
    numpos = new.find ('0')
    endpos = new.find ('5', numpos)
    num = new[numpos:endpos + 1]
    float (num)
# should now have a list of floats
print num
The intention of this code is to prompt the user for a file name, open the file, read through the file, compile all the lines that start with X-DSPAM, and extract the float number on these lines. I am fairly new to coding so I realise I may have committed a number of errors, but currently when I try to run it, after putting in the file name I get the return:
I looked around and I have seen that mode 'r' refers to different file modes in python in relation to how the end of the line is handled. However the code I am trying to run is similar to other code I have formulated and it does not have any non-text files inside, the file being opened is a .txt file. Is it something to do with converting a list of strings line by line to a list of float numbers?
Any ideas on what I am doing wrong would be appreciated.
The default mode of handling a file is 'r' - which means 'read', which is what you want. It means the program is going to read the file (as opposed to 'w' - write, or 'a' - append, for example - which would allow you to overwrite the file or append to it, which you don't want in this case).
There are some bugs in your code, which I've tried to indicate in the edited code below.
You don't need to assign new = fh - you're not grabbing lines and passing them to a new file. Rather, you're checking each line against the 'XDSPAM' criteria and if it's a match, you can proceed to parse out the desired numbers. If not, you ignore it and go to the next line.
With that in mind, you can move all of the code from the for line in new to be part of the original if not ... else block.
How you find the end of the number is also a bit off. You set endpos by searching for an occurrence of the character '5', but what I think you want is a position 5 characters from the start position (numpos + 5).
(There are other ways to parse the line and pull the number, but I'm going to stick with your logic as indicated by your code, so nothing fancy here.)
You can convert to float in the same statement where you slice the number from the line (as below). It's acceptable to do:
num = line[numpos:endpos+1]
float_num = float(num)
but not necessary. In any event, you want to assign the conversion (float(num)) to a variable - just having float(num) doesn't allow you to pass the converted value to another statement (including print).
You say that you should have 'a list of floats'. The code as corrected below will display all the floats, but if you want an actual Python list, there are other steps involved. I don't think you wanted a Python list, but just in case:
numlist = [] # at the beginning, declare a new, empty list
...
# after converting to float, append number to list
numlist.append(num)
print numlist # at end of program, to print full list
In any event, this edited code works for me with an appropriate file of test data, and outputs the desired float numbers:
fname = raw_input ('Enter file name:')
fh = open (fname)
count = 0
for line in fh:
    if not line.startswith ('X-DSPAM-Confidence:') : continue
    else:
        # there's no need to create the 'new' variable
        # any lines that meet the criteria can be processed for numbers
        count = count + 1
        numpos = line.find ('0')
        # i think what you want here is to set an endpoint 5 positions to the right
        # but your code was looking for the position of a '5' in the line
        endpos = numpos + 5
        # you can convert to float and slice in the same statement
        num = float(line[numpos:endpos+1])
        print num
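As one of the "other ways to parse the line" mentioned above, here is a minimal sketch that does not depend on the number starting with '0' or having a fixed width: it takes whatever follows the prefix and converts it. The function name extract_confidences is hypothetical, and the sketch assumes that each matching line contains nothing but the number after the colon.

```python
def extract_confidences(lines, prefix="X-DSPAM-Confidence:"):
    """Return the floats found on lines that start with prefix.

    Hypothetical helper: assumes only the number follows the prefix.
    """
    values = []
    for line in lines:
        if not line.startswith(prefix):
            continue  # not a confidence line, skip it
        # everything after the prefix, minus spaces and the newline
        values.append(float(line[len(prefix):].strip()))
    return values
```

Because a file object yields its lines when iterated, extract_confidences(open(fname)) would return the actual list of floats.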

how to write simultaneously in a file while the program is still running

In simple words, I have a file which contains duplicate numbers. I want to write the unique numbers from the 1st file into a 2nd file. I have opened the 1st file in 'r' mode and the 2nd file in 'a+' mode. But it looks like nothing is appended to the 2nd file while the program is running, which gives wrong output. Can anyone help me fix this problem?
Thank you in advance.
This is my code
#!/usr/bin/env python
fp1 = open('tweet_mention_id.txt','r')
for ids in fp1:
    ids = ids.rstrip()
    ids = int(ids)
    print 'ids= ',ids
    print ids + 1
    fp2 = open('unique_mention_ids.txt','a+')
    for user in fp2:
        user = user.rstrip()
        user = int(user)
        print user + 1
        print 'user= ',user
        if ids != user:
            print 'is unique',ids
            fp2.write(str(ids) + '\n')
            break
        else:
            print 'is already present',ids
    fp2.close()
fp1.close()
If unique_mention_ids.txt is initially empty, then you will never enter your inner loop, and nothing will get written. You should use the inner loop to determine whether or not the id needs to be added, but then do the addition (if warranted) outside the inner loop.
Similar logic applies for a non-empty file, but for a different reason: when you open it for appending, the file pointer is at the end of the file, and trying to read behaves as if the file were empty. You can start at the beginning of the file by issuing a fp2.seek(0) statement before the inner loop.
Either way: as written, you will write a given id from the first file for every entry in the second that it doesn't match, as opposed to writing it only when it matches none of them (which, given the file name, sounds like what you want). Worse, in the second case above, you will be overwriting whatever came after the id that didn't match.
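The advice above (decide first whether the id is new, then write it outside the search) can be sketched as follows for Python 3. This swaps the inner loop for a set of already-seen ids, which is a different technique from the nested loops in the question; the function name append_unique_ids is hypothetical, and the sketch assumes one integer per line and that the existing file, if any, ends with a newline.

```python
def append_unique_ids(source_path, unique_path):
    """Append ids from source_path that are not yet in unique_path.

    Hypothetical helper; returns the list of ids that were added.
    """
    added = []
    # "a+" creates the file if it is missing and positions at the end,
    # so seek(0) is needed before the existing ids can be read
    with open(unique_path, "a+") as unique_file:
        unique_file.seek(0)
        seen = set(int(line) for line in unique_file if line.strip())
        with open(source_path) as source_file:
            for line in source_file:
                if not line.strip():
                    continue  # skip blank lines
                number = int(line)
                if number not in seen:
                    # remember it so later duplicates are skipped too
                    seen.add(number)
                    added.append(number)
                    unique_file.write("%d\n" % number)
    return added
```

Keeping the ids in a set means each id from the first file is checked once against everything seen so far, instead of being compared entry by entry while the second file is being modified.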

Python: read file - modify line - write output

I have a list of terms in a file that I want to read, modify each term, and output the new terms to a new file. The new terms should look like this: take the first two characters of the original term and put them in quotes, add a '=>', then the original term in quotes and a comma.
This is the code I'm using:
def newFile(newItem):
    original = line
    first = line[0:2]
    newItem = first+'=>'+original+','
    return newItem
input = open('/Users/george/Desktop/input.txt', 'r')
output = open('/Users/george/Desktop/output.txt', 'w')
collector = ''
for line in input:
    if len(line) != 0:
        collector = newFile(input)
        output.write(''.join(collector))
    if len(line) == 0:
        input.close()
        output.close()
For example:
If the terms in the input.txt file are these:
term 1
term 2
term 3
term 4
The output is this:
te=>term 1
,te=>term 2
,te=>term 3
,te=>term 4
,
How can I add '' around the first two letters and around the term? And why do the second, third and fourth terms start with ,te and not te like they should?
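The stray commas appear because each line read from the file still ends with its newline, so the ',' is written after the line break and shows up at the start of the next line; line.strip() removes that newline.
Instead of using collector and newFile(), you can use a new variable inside the loop:
    modified_line = "'%s'=>'%s'," % (line[:2], line.strip())
and then, still in your loop, try this:
    ...
    if len(line) > 2:
        output.write('%s\n' % (modified_line))
Also:
if possible, do not hard-code file names in your program; use sys.argv, standard input/output or a config file. Of course, if you are sure of the input/output names, then use them
in line[0:2] you can omit the 0 and use line[:2]
you should use try/finally: open and read the file in the try block, and close it in the finally block
you don't need to check if len(line) == 0. Inside the for loop an empty line still arrives with its line ending (CRLF or LF), so its length is never 0, and the end of the input file is simply where the for loop ends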
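Putting these suggestions together, a minimal sketch might look like this. The names format_term and convert_terms are hypothetical, a with statement is used in place of try/finally (it closes the files in the same way), and skipping blank lines is an assumption about the desired behaviour.

```python
def format_term(line):
    """Return "'<first two chars>'=>'<term>'," for one input line."""
    term = line.strip()  # drop the newline so the comma stays on this line
    return "'%s'=>'%s'," % (term[:2], term)


def convert_terms(in_path, out_path):
    """Write one formatted term per line of in_path to out_path."""
    # both files are closed automatically, even if an error occurs
    with open(in_path) as infile, open(out_path, "w") as outfile:
        for line in infile:
            if line.strip():  # skip blank lines
                outfile.write(format_term(line) + "\n")
```

To avoid hard-coded paths, the file names could be taken from the command line, for example convert_terms(sys.argv[1], sys.argv[2]) after import sys.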