PYTHON - I would like to read the lines of a CSV file and use the first elements of each line. For example, if 1,3,4 is the first line of the CSV file, I would like to make a dictionary whose key is a tuple of the first two numbers (1,3) and whose value is the third number (4).
So the output looks like this,
{('1','3') : 4}
What difficulty are you facing? Please post the code you tried. The following should do the trick:
import csv
d = {}
with open("csv", "r") as fp:
    for row in csv.reader(fp):
        # key: first two fields, value: third field
        d[(row[0], row[1])] = row[2]
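To get exactly the output shown in the question (string keys, integer value), the third field can be converted with int. This is only a minimal Python 3 sketch: the in-memory sample, the second data row and the function name pairs_to_value are assumptions made for illustration, and a real file opened with open(...) could be passed in the same way.

```python
import csv
import io

# Stand-in for the real file; the second row is made up for illustration.
sample = io.StringIO("1,3,4\n10,25,7\n")

def pairs_to_value(lines):
    """Map (first, second) -> int(third) for every row with at least three fields."""
    result = {}
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or short lines
        result[(row[0], row[1])] = int(row[2])
    return result

d = pairs_to_value(sample)
print(d)
```

Because csv.reader splits on the delimiter rather than on character positions, this also works when the numbers have more than one digit.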
I have gone through similar questions but am having trouble fitting them to my needs. I am reading a CSV, creating a list and appending the list to a separate CSV.
with open('in_table.csv', 'rb') as vo:
    next(vo) # skip header row
    reader = csv.reader(vo)
    vo_list = list(reader)
    print vo_list
with open('out_table.csv', 'ab') as f:
    cf = csv.writer(f)
    for row in vo_list:
        cf.writerow(row)
I need to write the list starting at the second column and not the first, as the first column will contain separate information. What is the simplest way to do this?
Realistically I have another input CSV exactly like the first one and I need to put them both into the output file into a total of 4 columns. Like so:
Column1, join_count1, grid_id1, join_count2, grid_id2
Blah, 0, U24, 3, U24
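I would go with the built-in csv package. Also, you are opening the CSV files in binary mode; was that intentional? In Python 2 the csv module does expect binary mode, but in Python 3 CSV files should be opened as text (ideally with newline=""). The code below uses text mode, so please correct the flags if you need binary mode: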
import csv
with open("out_table.csv", "a+") as out_file:
    writer = csv.writer(out_file)
    with open("in_table.csv") as in_file:
        reader = csv.reader(in_file)
        next(reader) # skip the header
        for oid, join_count, grid_id in reader:
            writer.writerow([join_count, grid_id])
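For the second part of the question (two input tables written side by side, starting at the second output column), one possible approach is to walk both readers in parallel with zip. The sketch below is Python 3 and rests on assumptions not stated in the question: both inputs have the same number of rows in the same order, each row is oid, join_count, grid_id, and the first output column is simply left empty. The in-memory sample data is invented for illustration.

```python
import csv
import io

# Hypothetical stand-ins for the two input tables (header + rows: oid, join_count, grid_id).
first = io.StringIO("oid,join_count,grid_id\n1,0,U24\n2,5,U25\n")
second = io.StringIO("oid,join_count,grid_id\n1,3,U24\n2,1,U25\n")
out = io.StringIO()

reader1 = csv.reader(first)
reader2 = csv.reader(second)
next(reader1)  # skip the header of the first table
next(reader2)  # skip the header of the second table

writer = csv.writer(out, lineterminator="\n")
for row1, row2 in zip(reader1, reader2):
    # Leave the first output column empty, then append columns 2..n of both inputs.
    writer.writerow([""] + row1[1:] + row2[1:])

merged = out.getvalue()
print(merged)
```

If the rows of the two files are not guaranteed to line up, they would have to be matched on a key such as oid instead of by position.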
Hi, I am new to Python. I want to change the value of a column in a semicolon-separated CSV file. I have the following CSV file format:
"S. No.";"name";"number";"status";
"1";"Mac";"54";"ABC";
"2";"Jack";"34";"xyz";
I am using the following Python code (posted as an image; not reproduced here).
However, I am getting the error "list index out of range".
I have searched for similar examples, but most of them use comma-separated CSV files. This code, without a delimiter specified, works fine for a comma-separated CSV file. I am getting a row value like this:
Row [" 1"; "Mac";" 54"; "ABC";] so I am not able to access the elements of the row list. Please help me sort out the issue.
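The symptom described above (the whole line arriving as a single element) is what csv.reader produces when it splits on commas and the file uses semicolons. A minimal Python 3 sketch of the usual fix is to pass delimiter=";" to both the reader and the writer. The replacement value "DONE" and the choice of the status column are hypothetical, and an in-memory string stands in for the real file.

```python
import csv
import io

# In-memory copy of the sample from the question.
source = io.StringIO(
    '"S. No.";"name";"number";"status";\n'
    '"1";"Mac";"54";"ABC";\n'
    '"2";"Jack";"34";"xyz";\n'
)
out = io.StringIO()

reader = csv.reader(source, delimiter=";")
writer = csv.writer(out, delimiter=";", quoting=csv.QUOTE_ALL, lineterminator="\n")

header = next(reader)
writer.writerow(header)
status_index = header.index("status")

updated = []
for row in reader:
    row[status_index] = "DONE"  # hypothetical new value for the column
    updated.append(row)
    writer.writerow(row)

result = out.getvalue()
print(updated)
```

Note that the trailing semicolon on each line yields an extra empty field at the end of every row, so each row has five elements rather than four.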
I need to create a list of tuples from a .csv file. On another post a member suggested using this code:
import csv
with open('movieCatalogue.csv') as f:
    data=[tuple(line) for line in csv.reader(f)]
data.pop(0)
print(data)
This is almost perfect, except that the first column in the .csv file contains the product id, which I do not want in the tuples. Is there a way to prevent certain columns in each line from being copied?
First, I suppose you're dropping the title line with data.pop(0). You could save a list deallocation/move by skipping it while reading.
Then, when you compose your tuple, just drop the first element using slice syntax, line[start:stop:step]: line[1:] starts at index 1 and so skips the element at index 0.
import csv
with open('movieCatalogue.csv') as f:
    cr = csv.reader(f)
    # drop the first line: next(cr) is better than next(f)
    # since it works even if the title line is multi-line!
    next(cr)
    data = [tuple(line[1:]) for line in cr]  # drop first column of each line
print(data)
The file looks like a series of lines with IDs:
aaaa
aass
asdd
adfg
aaaa
I'd like to write each ID and its number of occurrences in the old file to a new file, in this form:
aaaa 2
asdd 1
aass 1
adfg 1
The two elements should be separated by a tab.
The code I have prints what I want but doesn't write it to a new file:
with open("Only1ID.txt", "r") as file:
    file = [item.lower().replace("\n", "") for item in file.readlines()]
    for item in sorted(set(file)):
        print item.title(), file.count(item)
As you use Python 2, the simplest approach to convert your console output to file output is by using the print chevron (>>) syntax which redirects the output to any file-like object:
with open("filename", "w") as f: # open a file in write mode
    print >> f, "some data" # print 'into the file'
Your code could look like this after simply adding another open to open the output file and adding the chevron to your print statement:
with open("Only1ID.txt", "r") as file, open("output.txt", "w") as out_file:
    file = [item.lower().replace("\n", "") for item in file.readlines()]
    for item in sorted(set(file)):
        print >> out_file, item.title(), file.count(item)
However, your code has a few other issues that you should avoid or could improve:
Do not use the same variable name file for both the file object returned by open and your processed list of strings. This is confusing; just use two different names.
You can iterate directly over the file object, which works like a generator that returns the file's lines as strings. Generators produce the next element just in time: instead of first loading the whole file into memory as file.readlines() does and processing it afterwards, the file object reads and stores only one line at a time, whenever the next line is needed. That way you reduce the code's memory use.
If you write a list comprehension but do not need its result as a list, because you simply want to iterate over it with a for loop, it is more efficient to use a generator expression (same effect as the file object's line generator described above). The only syntactical difference between a list comprehension and a generator expression is the brackets: replace [...] with (...) and you have a generator. The downside of a generator is that you can neither find out its length nor access items directly by index. As you don't need either of these features, the generator is fine here.
There is a simpler way to remove trailing newline characters from a line: line.rstrip() removes all trailing whitespace. If you want to keep e.g. spaces and only remove the newline, pass that character as an argument: line.rstrip("\n").
However, it could be even easier and faster to not add another implicit line break during the print call, instead of removing the line break first and having it re-added later. In Python 2 you suppress the line break of print by adding a comma at the end of the statement. Note that this only helps when the line's own newline ends up at the end of the output, which is not the case here because the count is printed after the item:
print >> out_file, item.title(), file.count(item),
There is a type Counter to count occurrences of elements in a collection, which is faster and easier than writing it yourself, because you don't need the additional count() call for every element. The Counter behaves mostly like a dictionary with your items as keys and their count as values. Simply import it from the collections module and use it like this:
from collections import Counter
c = Counter(lines)
for item in c:
    print item, c[item]
With all those suggestions (except the one not to remove the line breaks) applied and the variables renamed to something more clear, the optimized code looks like this:
from collections import Counter
with open("Only1ID.txt") as in_file, open("output.txt", "w") as out_file:
    counter = Counter(line.lower().rstrip("\n") for line in in_file)
    for item in sorted(counter):
        print >> out_file, item.title(), counter[item]
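The code above is Python 2. As a hedged sketch only, the same idea in Python 3 might look like the following, where print is a function and the tab separator from the question is written explicitly. The in-memory streams stand in for Only1ID.txt and output.txt, and ordering by count via most_common() is a choice made here to match the sample output, not something the answer above prescribes.

```python
from collections import Counter
import io

# Stand-ins for the input and output files from the question.
in_file = io.StringIO("aaaa\naass\nasdd\nadfg\naaaa\n")
out_file = io.StringIO()

# Count each ID, ignoring case, surrounding whitespace and empty lines.
counter = Counter(line.strip().lower() for line in in_file if line.strip())

# most_common() yields (item, count) pairs, highest count first.
for item, count in counter.most_common():
    out_file.write(f"{item}\t{count}\n")

report = out_file.getvalue()
print(report)
```

With real files, the two StringIO objects would be replaced by open("Only1ID.txt") and open("output.txt", "w") in a with statement.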
Okay so if my file looks like this:
"1111-11-11";1;99.9;11;11.1;11.1
"2222-22-22";2;88.8;22;22.2;22.2
"3333-33-33";3;77.7;3.3;33.3;33.3
How can I read only the parts "99.9", "88.8" and "77.7" from that file and make a list [99.9, 88.8, 77.7]? Basically I want to find the parts after n semicolons.
You can open the file with open and parse it with the csv module, passing the semicolon as the delimiter. Your code might look like:
import csv
# 'rb' is for Python 2; in Python 3 use open('filename.csv', newline='')
with open('filename.csv', 'rb') as f:
    reader = csv.reader(f, delimiter=';')
    listOfRows = list(reader)
You will now have a list of rows, each of which is already split into a list of strings.
If your lines always have the same structure, take the third element of each row (index 2) and convert it to a float:
values = [float(row[2]) for row in listOfRows]
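To generalise this to "the part after n semicolons", the column index can be made a parameter. The following is a minimal Python 3 sketch under the assumption that every non-empty line has at least n + 1 fields and that the chosen field is numeric. The function name column_as_floats is hypothetical, and the in-memory string stands in for the real file.

```python
import csv
import io

# In-memory copy of the sample from the question.
data = io.StringIO(
    '"1111-11-11";1;99.9;11;11.1;11.1\n'
    '"2222-22-22";2;88.8;22;22.2;22.2\n'
    '"3333-33-33";3;77.7;3.3;33.3;33.3\n'
)

def column_as_floats(lines, n):
    """Return the field that follows n semicolons on each non-empty line, as floats."""
    return [float(row[n]) for row in csv.reader(lines, delimiter=";") if row]

values = column_as_floats(data, 2)
print(values)  # [99.9, 88.8, 77.7]
```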
Please show us some of your work, or better explain the structure of your data