TypeError: 'int' object is not subscriptable when trying to remove items from a list - python-2.7

I'm reading a file, outputfromextract, and I want to split its contents on the delimiter ',', which I have done.
When reading the contents into a list there are two 'faff' entries at the beginning that I'm trying to remove, but I can't find the right way to remove them by index.
import json

class device:
    ipaddress = None
    macaddress = None

    def __init__(self, ipaddress, macaddress):
        self.ipaddress = ipaddress
        self.macaddress = macaddress

listofItems = []
listofdevices = []

def format_the_data():
    file = open("outputfromextract")
    contentsofFile = file.read()
    individualItem = contentsofFile.split(',')
    listofItems.append(individualItem)
    print(listofItems[0][0:2])  # this here displays the entries I want to remove
    listofItems.remove[0[0:2]]  # fails here and raises a TypeError (int object not subscriptable)
For reference, the first three lines of the file I have created are enclosed below:
[u' #created by system\n', u'time at 12:05\n', u'192.168.1.1\n',...
I simply want to remove those two items from the list; the rest will be passed to a constructor.

The TypeError comes from 0[0:2]: Python evaluates that expression first and tries to slice the integer 0, which is not subscriptable. Note also that list.remove is a method (it needs parentheses, not square brackets) and it removes by value, not by index; to drop a range of indices you would write del listofItems[0][0:2].
But slicing while reading will be much easier, for example:
with open("outputfromextract") as f:
    contentsofFile = f.read()
    individualItem = contentsofFile.split(',')[2:]
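For completeness, both approaches can be sketched against the sample entries quoted in the question:

```python
items = [u' #created by system\n', u'time at 12:05\n', u'192.168.1.1\n']

# Option 1: slice the unwanted entries away (leaves items unchanged)
devices = items[2:]

# Option 2: delete them in place; del accepts a slice, unlike list.remove
del items[0:2]

print(devices)  # ['192.168.1.1\n']
print(items)    # ['192.168.1.1\n']
```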

Related

Return the content of the matching section of a text file in JSON format

There is a text file containing data in the form:
[sec1]
"ab": "s"
"sd" : "d"
[sec2]
"rt" : "ty"
"gh" : "rr"
"kk":"op"
We are supposed to return the data of the matching section in JSON format; e.g., if the user asks for sec1, we send back the contents of sec1.
The format you specified is very similar to the TOML format; TOML, however, uses equals signs for the assignment of key-value pairs.
If your format actually uses colons for the assignment, the following example may help you.
It uses regular expressions in conjunction with a defaultdict to read the data from the file. The section to be queried is extracted from the URL using a variable rule.
If there is no hit within the loaded data, the server responds with a 404 error (NOT FOUND).
import re
from collections import defaultdict
from flask import (
    Flask,
    abort,
    jsonify
)

def parse(f):
    data = defaultdict(dict)
    section = None
    for raw in f:
        line = raw.strip()
        if re.match(r'^\[[^\]]+\]$', line):
            section = line[1:-1]
            data[section] = dict()
            continue
        m = re.match(r'^"(?P<key>[^"]+)"\s*:\s*"(?P<val>[^"]+)"$', line)
        if m:
            key, val = m.groups()
            if not section:
                raise OSError('illegal format')
            data[section][key] = val
            continue
    return dict(data)

app = Flask(__name__)

@app.route('/<string:section>')
def data(section):
    path = 'path/to/file'
    with open(path) as f:
        data = parse(f)
    if section in data:
        return jsonify(data[section])
    abort(404)
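The parsing step can also be exercised on its own, without Flask. The sketch below re-implements the same logic in a self-contained form (io.StringIO stands in for the file on disk):

```python
import io
import re
from collections import defaultdict

def parse(f):
    # same idea as the answer: [section] headers, "key": "value" pairs
    data = defaultdict(dict)
    section = None
    for raw in f:
        line = raw.strip()
        m = re.match(r'^\[([^\]]+)\]$', line)
        if m:
            section = m.group(1)
            data[section] = {}
            continue
        m = re.match(r'^"([^"]+)"\s*:\s*"([^"]+)"$', line)
        if m:
            if section is None:
                raise ValueError('illegal format')
            key, val = m.groups()
            data[section][key] = val
    return dict(data)

sample = u'[sec1]\n"ab": "s"\n"sd" : "d"\n[sec2]\n"rt" : "ty"\n"gh" : "rr"\n"kk":"op"\n'
result = parse(io.StringIO(sample))
print(result)
# {'sec1': {'ab': 's', 'sd': 'd'}, 'sec2': {'rt': 'ty', 'gh': 'rr', 'kk': 'op'}}
```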

How to use a concatenated string for the get method of requests?

I'm trying to write a small crawler to crawl multiple Wikipedia pages.
I want to make the crawl somewhat dynamic by building the hyperlink for the exact wiki page from a file which contains a list of names.
For example, the first line of "deutsche_Schauspieler.txt" says "Alfred Abel", and the concatenated string would be "https://de.wikipedia.org/wiki/Alfred Abel". Using the txt file results in heading being None, yet when I hard-code the link as a string inside the script, it works.
This is for Python 2.x.
I already tried to switch from " to ',
tried + instead of %s,
tried to put the whole string into the txt file (so that the first line reads "http://..." instead of "Alfred Abel"),
tried to switch from "Alfred Abel" to "Alfred_Abel".
from bs4 import BeautifulSoup
import requests

file = open("test.txt","w")
f = open("deutsche_Schauspieler.txt","r")
content = f.readlines()
for line in content:
    link = "https://de.wikipedia.org/wiki/%s" % (str(line))
    response = requests.get(link)
    html = response.content
    soup = BeautifulSoup(html)
    heading = soup.find(id='Vorlage_Personendaten')
    uls = heading.find_all('td')
    for item in uls:
        file.write(item.text.encode('utf-8') + "\n")
f.close()
file.close()
I expect to get the content of the table "Vorlage_Personendaten", which actually works if I change the link assignment to
link = "https://de.wikipedia.org/wiki/Alfred Abel"
# link = "https://de.wikipedia.org/wiki/Alfred_Abel" also works
But I want it to work using the text file.
It looks like the problem is in your text file, where you have used "Alfred Abel" with quotes; that is why you are getting the following exception:
uls = heading.find_all('td')
AttributeError: 'NoneType' object has no attribute 'find_all'
Please remove the string quotes from "Alfred Abel" and use Alfred Abel inside the text file deutsche_Schauspieler.txt. It will then work as expected.
I found the solution myself.
Although there are no additional lines in the file, the content array displays like ['Alfred Abel\n'], while printing the first index of the array results in 'Alfred Abel'. The trailing newline is still part of the string, thus forming a false link.
So you want to remove the last(!) character from the current line.
A solution could look like this:
from bs4 import BeautifulSoup
import requests

file = open("test.txt","w")
f = open("deutsche_Schauspieler.txt","r")
content = f.readlines()
print(content)
for line in content:
    line = line.rstrip('\n')  # strip the trailing newline so the link is valid
    link = "https://de.wikipedia.org/wiki/%s" % str(line)
    response = requests.get(link)
    html = response.content
    soup = BeautifulSoup(html, "html.parser")
    try:
        heading = soup.find(id='Vorlage_Personendaten')
        uls = heading.find_all('td')
        for item in uls:
            file.write(item.text.encode('utf-8') + "\n")
    except AttributeError:
        print("That did not work")
f.close()
file.close()
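The core of the fix can be seen in isolation, without any network access (the names below are just sample data):

```python
content = ['Alfred Abel\n', 'Mario Adorf\n']  # as returned by f.readlines()

links = []
for line in content:
    # without rstrip, the '\n' would end up inside the URL and break the request
    links.append("https://de.wikipedia.org/wiki/%s" % line.rstrip('\n'))

print(links)
# ['https://de.wikipedia.org/wiki/Alfred Abel', 'https://de.wikipedia.org/wiki/Mario Adorf']
```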

Writing search results from only the last row to CSV

Here is my code:
import urllib
import json
import csv

apiKey = "MY_KEY" # Google API credentials

## perform a text search based on input, place results in text-search-results.json
print "Starting"
myfile = open("results.csv","wb")
headers = []
headers.append(['Search','Name','Address','Phone','Website','Type','Google ID','Rating','Permanently Closed'])
wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
wr.writerows(headers)

with open('input_file.csv', 'rb') as csvfile:
    filereader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in filereader:
        search = ', '.join(row)
        search.replace(' ', '+')
        url1 = "https://maps.googleapis.com/maps/api/place/textsearch/json?query=%s&key=%s" % (search,apiKey)
        urllib.urlretrieve(url1,"text-search-results.json")
        print "SEARCH", search
        print "Google Place URL", url1

## load text-search-results.json and get the list of place IDs
textSearchResults = json.load(open("text-search-results.json"))
listOfPlaceIds = []
for item in textSearchResults["results"]:
    listOfPlaceIds.append(str(item["place_id"]))

## open a nested list for the results
output = []

## iterate through and download a JSON for each place ID
for ids in listOfPlaceIds:
    url = "https://maps.googleapis.com/maps/api/place/details/json?placeid=%s&key=%s" % (ids,apiKey)
    fn = ids + "-details.json"
    urllib.urlretrieve(url,fn)
    data = json.load(open(fn))
    lineToAppend = []
    lineToAppend.append(search)
    try:
        lineToAppend.append(str(data["result"]["name"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["formatted_address"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["formatted_phone_number"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["website"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["types"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["place_id"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["rating"]))
    except KeyError:
        lineToAppend.append('')
    try:
        lineToAppend.append(str(data["result"]["permanently_closed"]))
    except KeyError:
        lineToAppend.append('')
output.append(lineToAppend)
wr.writerows(output)
myfile.close()
What this does is take the search terms from one column in the input_file and run each search through the Google Places API. However, when I have multiple search terms, only the last search's results appear in the results.csv file. I'm not sure why this is happening, since it reads all of the search terms and runs them through, but only the last result is written. Any suggestions?
Currently you are only writing out the last line because, while you reset and rebuild the variable lineToAppend within your for loop, you only append it to output after the loop has finished. By the time output.append runs, only the last iteration's row is left.
So currently it looks like this (shortened for brevity):
for ids in listOfPlaceIds:
    url = "https://maps.googleapis.com/maps/api/place/details/json?placeid=%s&key=%s" % (ids,apiKey)
    fn = ids + "-details.json"
    urllib.urlretrieve(url,fn)
    data = json.load(open(fn))
    lineToAppend = []
    lineToAppend.append(search)
    ...
output.append(lineToAppend)
wr.writerows(output)
Whereas it should be:
for ids in listOfPlaceIds:
    url = "https://maps.googleapis.com/maps/api/place/details/json?placeid=%s&key=%s" % (ids,apiKey)
    fn = ids + "-details.json"
    urllib.urlretrieve(url,fn)
    data = json.load(open(fn))
    lineToAppend = []
    lineToAppend.append(search)
    ...
    output.append(lineToAppend)
wr.writerows(output)
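The difference is easy to see with a toy loop (hypothetical values standing in for the Places API rows):

```python
searches = ['cafe', 'bar', 'hotel']

# Broken: the row is built inside the loop but appended after it,
# so only the last iteration's row survives.
broken = []
for s in searches:
    lineToAppend = [s, s.upper()]
broken.append(lineToAppend)
print(broken)  # [['hotel', 'HOTEL']]

# Fixed: append inside the loop -- one row per search term.
fixed = []
for s in searches:
    lineToAppend = [s, s.upper()]
    fixed.append(lineToAppend)
print(fixed)  # [['cafe', 'CAFE'], ['bar', 'BAR'], ['hotel', 'HOTEL']]
```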

Adding for-loop elements to a list

I'm writing code that fetches some text from a site and then, with a for-loop, takes the part of the text I'm interested in. I can print this text, but I would like to know how I can store it in a list for later use. So far the code I've written is this:
import urllib2

keyword = raw_input('keyword: ')
URL = "http://www.uniprot.org/uniprot/?sort=score&desc=&compress=no&query=%s&fil=&limit=10&force=no&preview=true&format=fasta" % keyword
filehandle = urllib2.urlopen(URL)
url_text = filehandle.readlines()
for line in url_text:
    if line.startswith('>'):
        print line[line.index(' ') : line.index('OS')]
Just use append:
lines = []
for line in url_text:
    if line.startswith('>'):
        lines.append(line)  # or whatever else you wanted to add to the list
        print line[line.index(' ') : line.index('OS')]
Edit: on a side note, Python can loop directly over a file, as in:
url_text = filehandle.readlines()
for line in url_text:
    pass

# can be shortened to:
for line in filehandle:
    pass
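Putting both together, a minimal sketch with inline sample lines (made up here, standing in for the live UniProt response):

```python
url_text = [
    '>sp|P12345 Some protein OS=Homo sapiens\n',
    'MKTAYIAKQR\n',
    '>sp|P67890 Other protein OS=Mus musculus\n',
]

lines = []
for line in url_text:
    if line.startswith('>'):
        # same slice as in the question: text between the first space and 'OS'
        lines.append(line[line.index(' '):line.index('OS')])

print(lines)  # [' Some protein ', ' Other protein ']
```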

Django smart_str on queryset

I need to use smart_str on the results of a query in my view to take care of Latin characters. How can I convert each item in my queryset?
I have tried:
...
mylist = []
myquery_set = Locality.objects.all()
for item in myquery_set:
    mylist.append(smart_str(item))
...
But I get the error:
coercing to Unicode: need string or buffer, <object> found
What is the best way to do this? Or can I take care of it in the template as I iterate the results?
EDIT: if I output the values to a template then all is good. However, I want to output the response as an .xls file using the code:
...
filename = "locality.xls"
response['Content-Disposition'] = 'attachment; filename='+filename
response['Content-Type'] = 'application/vnd.ms-excel; charset=utf-8'
return response
The view works fine (gives me the file etc.) but the Latin characters are not rendered properly.
In your code you're calling smart_str on a Model object instead of a string (so basically you're trying to convert the whole object to a string). The solution is to call smart_str on a field:
mylist.append(smart_str(item.fieldname))
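To illustrate the object-versus-field distinction without a Django project, here is a runnable sketch with a plain class standing in for the model; smart_str itself requires Django, so plain UTF-8 encoding plays its role here:

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for the Django model, just to show that the
# conversion should target a text field, not the object itself.
class Locality(object):
    def __init__(self, name):
        self.name = name  # assume 'name' is the text field of interest

queryset = [Locality(u'S\xe3o Paulo'), Locality(u'M\xfcnchen')]

mylist = []
for item in queryset:
    # like smart_str(item.fieldname): encode the field's text, not the object
    mylist.append(item.name.encode('utf-8'))

print(mylist)
```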