Recombee batch does not send all the data - Django

I am using Recombee's API for recommendations, and there is a batch method to send all the user data to the API.
Here is the code:
requests = []
for i in range(0, len(list_of_ratings)):
    name = str(list_of_ratings[i].user)
    series = str(list_of_ratings[i].series)
    rate = list_of_ratings[i].rating
    print(name + ' ' + series + ' ' + str(rate))
    request = AddRating(name, series, rate, cascade_create=True)
    requests.append(request)
try:
    client.send(Batch(requests))
except APIException as e:
    print(e)
except ResponseException as e:
    print(e)
except ApiTimeoutException as e:
    print(e)
except Exception as e:
    print(e)
But the problem is that it does not send all the data. There are 946 objects in my Django model, but the first time I ran this only 20 were sent, and the second time only 6.
I don't know what's causing the issue.
Any help is appreciated.

Perhaps there is some error in your batch. I would suggest printing the batch result to see any error messages:
res = client.send(Batch(requests))
print(res)
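To go one step further, you can pair each result with the request that produced it and print only the failures. This is a minimal sketch, assuming each entry in the batch response is a dict carrying an HTTP-style status code under a 'code' key (check the response format of your client version):

res = client.send(Batch(requests))
for req, result in zip(requests, res):
    # anything outside 2xx means that individual request failed
    code = result.get('code') if isinstance(result, dict) else None
    if code is not None and not 200 <= code < 300:
        print('Failed:', req, result)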

Related

Python: Try uploading three times if a timeout occurs

I am uploading an image file to the server, but when a timeout occurs the upload is abandoned and the script moves on to the next file. I want to retry the upload when a timeout occurs, with at most three attempts; if it still times out after three attempts, throw an exception and move on.
Here's my code:
def upload(filename2, sampleFile, curr_time, curr_day, username, password, user_id):
    register_openers()
    datagen, headers = multipart_encode({"sampleFile": open(sampleFile), "name": filename2,
                                         "userID": user_id, 'date': curr_day, 'time': curr_time,
                                         'username': username, 'password': password})
    request = urllib2.Request("http://videoupload.hopto.org:5001/api/Synclog", datagen, headers)
    try:
        response = urllib2.urlopen(request, timeout=20)
        html = response.read()
    except URLError, e:
        if hasattr(e, 'reason'):
            print('Reason: ', e.reason)
        elif hasattr(e, 'code'):
            print('Error code: ', e.code)
    except Exception:
        print('generic exception: ' + traceback.format_exc())
    return
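No answer is recorded here, but a common way to do this is to move the urlopen call into a small retry loop. A minimal sketch, assuming the timeout surfaces as socket.timeout (possibly wrapped in a URLError); the names MAX_RETRIES and upload_with_retries are mine, not from the original code:

import socket
import urllib2

MAX_RETRIES = 3  # give up after three timed-out attempts

def upload_with_retries(request):
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return urllib2.urlopen(request, timeout=20).read()
        except socket.timeout:
            print('Attempt %d timed out' % attempt)
        except urllib2.URLError, e:
            # urllib2 often wraps the socket timeout in a URLError
            if isinstance(getattr(e, 'reason', None), socket.timeout):
                print('Attempt %d timed out' % attempt)
            else:
                raise
    raise Exception('Upload failed after %d timeouts' % MAX_RETRIES)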

Django: upload, read, and write a large Excel file

I am new to Django and I need my app to allow users to upload Excel files. On the server side I read the Excel file cell by cell, append some values, translate them, write the results back to a new Excel file, and return it as a download. This works for small files, but for large files I get a timeout error. Please see the code below.
def translatedoc(request):
    data = ""
    convrowstr = ""
    if request.method == 'POST':
        response = StreamingHttpResponse(content_type='application/vnd.ms-excel')
        try:
            form = fileUpload(request.POST, request.FILES)
            if form.is_valid():
                input_file = request.FILES.get('file')
                sl = request.POST.get('fsl')
                if sl == "Detect Language":
                    sl = "auto"
                else:
                    # get sl code from database
                    sl = languagecode.objects.filter(Language=sl).values_list('code')
                    sl = str(sl[0][0])
                # get tl code from database
                tl = languagecode.objects.filter(Language=request.POST.get('ftl')).values_list('code')
                wb = xlrd.open_workbook(file_contents=input_file.read())
                wb_sheet = wb.sheet_by_index(0)
                for rownum in range(0, wb_sheet.nrows):
                    convstr = ""
                    for colnum in range(0, wb_sheet.ncols):
                        try:
                            rw = wb_sheet.cell_value(rownum, colnum)
                            if type(rw) == float or type(rw) == int:
                                convstr = convstr + '<td>' + str(rw)
                            else:
                                convstr = convstr + '<td>' + rw
                        except Exception as e:
                            pass
                    if len(convstr) + len(convrowstr) > 20000:
                        # translate if the length of the doc exceeds the limit
                        # call Google API module
                        data = data + translate(convrowstr, sl, str(tl[0][0]))
                        convrowstr = ""
                    if rownum == wb_sheet.nrows - 1:
                        convrowstr = convrowstr + "<tr>" + convstr
                        # translate for first or last
                        # call Google API module
                        data = data + translate(convrowstr, sl, str(tl[0][0]))
                        convrowstr = ""
                    convrowstr = convrowstr + "<tr>" + convstr
                    log.error(rownum)
                if len(data) > 1:
                    sio = StringIO.StringIO()
                    try:
                        workbook = xlwt.Workbook()
                        sheet = workbook.add_sheet("output")
                        row = 0
                        for rw in data.split("<tr>")[1:]:
                            col = 0
                            for cl in rw.split("<td>")[1:]:
                                try:
                                    sheet.write(row, col, cl.split("<b>")[1].split("</b>")[0])
                                except Exception as e:
                                    pass
                                col += 1
                            row += 1
                        workbook.save(sio)
                        sio.seek(0)
                        sv = sio.getvalue()
                        response['Content-Disposition'] = 'attachment; filename=Output.xls'
                        return response
                    except Exception as e:
                        log.error(e)
        except Exception as e:
            log.error(e)
You can do this through Celery for large file uploads: read and process the file inside a Celery task, so the web request does not time out.
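A minimal sketch of that idea, assuming a standard Celery setup; the task name translate_excel and its arguments are placeholders, not from the original code:

# tasks.py
from celery import shared_task

@shared_task
def translate_excel(file_path, sl, tl):
    # do the slow xlrd read / translate / xlwt write here, outside the
    # request-response cycle, and save the output somewhere the user can
    # download it later (e.g. MEDIA_ROOT or a storage backend)
    pass

# views.py: save the upload to disk, then enqueue instead of processing inline:
# translate_excel.delay(saved_path, sl, tl)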

Failed attempts to bypass "Rate limits" and other Twitter API errors

My code tries to collect tweets about "cars" on 2014-10-01. In an attempt to handle the rate limit and any other Twitter-related errors (e.g. over capacity), I added code at the end telling the program to stop and wait for 20 minutes whenever a TweepError occurs.
Unfortunately, it doesn't work: the script crashes and I can still see the rate limit error message. Please advise, thanks.
import tweepy
import time
import csv

ckey = "xxx"
csecret = "xxx"
atoken = "xxx-xxx"
asecret = "xxx"

OAUTH_KEYS = {'consumer_key': ckey, 'consumer_secret': csecret,
              'access_token_key': atoken, 'access_token_secret': asecret}
auth = tweepy.OAuthHandler(OAUTH_KEYS['consumer_key'], OAUTH_KEYS['consumer_secret'])
api = tweepy.API(auth)

startSince = '2014-10-01'
endUntil = '2014-10-02'
searchTerms = 'cars'

for tweet in tweepy.Cursor(api.search, q=searchTerms,
                           since=startSince, until=endUntil).items(999999999):
    try:
        print "Name:", tweet.author.name.encode('utf8')
        print "Screen-name:", tweet.author.screen_name.encode('utf8')
        print "Tweet created:", tweet.created_at
    except tweepy.error.TweepError:
        time.sleep(60*20)
        continue
    except tweepy.TweepError:
        time.sleep(60*20)
        continue
    except TweepError:
        time.sleep(60*20)
        continue
    except IOError:
        time.sleep(60*5)
        continue
    except StopIteration:
        break
Your issue is that your try-except block is independent of your call to the Twitter API: it is the tweepy.Cursor iteration itself that raises the rate-limit error, and that happens outside the try. Try including this line:
for tweet in tweepy.Cursor(api.search, q=searchTerms,
                           since=startSince, until=endUntil).items(999999999):
within your try and see if the TweepError is caught (it should be). You may need a small modification to get the cursor to continue from the proper location, but that should be trivial.
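One way to express that restructuring is to drive the cursor manually so the API call lands inside the try. A sketch assuming the same tweepy version as the question; the 20-minute wait is kept from the original code:

cursor = tweepy.Cursor(api.search, q=searchTerms,
                       since=startSince, until=endUntil).items(999999999)
while True:
    try:
        tweet = next(cursor)   # the API request happens here, inside the try
    except tweepy.TweepError:
        time.sleep(60 * 20)    # rate limited: back off, then resume
        continue
    except StopIteration:
        break
    print "Name:", tweet.author.name.encode('utf8')
    print "Screen-name:", tweet.author.screen_name.encode('utf8')
    print "Tweet created:", tweet.created_at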

How to get the contents of the error page in Python's mechanize?

So assume I get an error back when using mechanize.Browser.retrieve and I catch it like this:
try:
    br.retrieve(url, fname)
except mechanize.HTTPError as e:
    if e.code in [403, 404]:
        # how can I get to the contents of the server-sent error page?
        pass
    else:
        raise
How can I access the error page which was sent by the server at this point?
I've tried using br.response().get_data(), but that doesn't seem to get populated when using retrieve().
Since HTTP errors are wrapped by mechanize and contain additional info about the response, you can use e.read():
try:
    br.retrieve(url, fname)
except mechanize.HTTPError as e:
    if e.code in [403, 404]:
        print e.read()
    else:
        raise

errno 10054 on status check request

OK, I've tried to resolve this with a couple of different libraries. I'm working on a script that looks at thousands of sites and extracts specific items from the pages. I need to be able to reset the connection so that the script continues without losing any data. I've tried catching the error and waiting, but that doesn't fix it: the script eventually errors out completely. I get the error on the snippet of code below, in my status-check module.
def status(url):  # checks the response code
    try:
        req = urllib2.urlopen(url)
        response = req.getcode()
        return response
    except urllib2.HTTPError, e:
        print e.code  # print before returning, otherwise this line is unreachable
        return e.code
    except urllib2.URLError, e:
        print e.args
        return e.args
But before trying this, I used the code below instead of urllib2:
parsedurl = urlparse(url)
conn = httplib.HTTPConnection(parsedurl.netloc)
conn.request('HEAD', parsedurl.path)
response = conn.getresponse()
return response.status
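No answer is recorded here, but errno 10054 is "connection reset by peer" (WSAECONNRESET on Windows), which is usually transient when checking thousands of sites in a row. A minimal sketch of a retrying status check, assuming urllib2 as in the question; the retry count and backoff are mine:

import socket
import time
import urllib2

def status(url, retries=3):
    # return the HTTP status for url, retrying on connection resets
    for attempt in range(retries):
        try:
            return urllib2.urlopen(url, timeout=30).getcode()
        except urllib2.HTTPError, e:
            return e.code  # the server answered; its status code is the result
        except (urllib2.URLError, socket.error):
            # errno 10054 surfaces as a socket.error; wait, then retry
            time.sleep(2 ** attempt)
    return None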