Convert MySQLdb return from tuple to string in Python - django

I am new to MySQLdb. I need to read values from a pre-defined database stored in MySQL. My problem is that when values are collected, they come back as a tuple, not a string. So my question: is there a way to convert the tuple to a string?
Below are the details of my code:
import MySQLdb

# get value from database
conn = MySQLdb.connect("localhost", "root", "123", "book")
cursor = conn.cursor()
cursor.execute("SELECT koc FROM entries")
Koc_pre = str(cursor.fetchone())

# create an input form with Django and assign the pre-defined value
class Inp(forms.Form):
    Koc = forms.FloatField(required=True, label=mark_safe('K<sub>OC</sub> (mL/g OC)'), initial=Koc_pre)

# write out this input form
class InputPage(webapp.RequestHandler):
    def get(self):
        html = str(Inp())
        self.response.out.write(html)
The output is in tuple format, "Koc=('5',)", but I want "koc=5". So can anyone give me some suggestions or a reference book I should check?
Thanks in advance!

If you're only going to be retrieving one value at a time (i.e. fetching one row with a single column via cursor.fetchone()), then you can just change your code so that you get the first element of the tuple.
Koc_pre = str(cursor.fetchone()[0])
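If the query might return no rows, fetchone() returns None, so a small guard avoids a TypeError. A minimal sketch (the fallback value is just illustrative):
row = cursor.fetchone()               # a one-element tuple like ('5',), or None if no rows
Koc_pre = str(row[0]) if row else ""  # unpack the first column before converting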

Related

Changing the data type of values in the Django model

I have data which is loaded into a dataframe. This dataframe then needs to be saved to a Django model. The major problem is that some data which should go into an IntegerField or FloatField consists of empty strings "". On the other hand, some data which should be saved into a CharField is represented as np.nan. This leads to the following errors:
ValueError: Field 'position_lat' expected a number but got nan.
If I replace the np.nan with an empty string, using data[database]["df"].replace(np.nan, "", regex = True, inplace = True), I end up with the following error:
ValueError: Field 'position_lat' expected a number but got ''.
So what I would like to do, is to check in the model whether a FloatField or IntegerField gets either np.nan or an empty string and replace it with an empty value. The same for CharField, which should convert integers (if applicable) to strings or np.nan to an empty string.
How could this be implemented? Using ModelManager or customized fields? Or any better approaches? Sorting the CSV files out is not an option.
import pandas as pd
import numpy as np
from .models import Record

my_dataframe = pd.read_csv("data.csv")
record = Record
entries = []
for e in my_dataframe.T.to_dict().values():
    entries.append(record(**e))
record.objects.bulk_create(entries)
Maybe the problem was not clear; nevertheless, I would like to post my solution. I create a new dict which only contains keys that have corresponding (non-empty) values.
entries = []
for e in my_dataframe.T.to_dict().values():
    e = {k: v for k, v in e.items() if v}
    entries.append(record(**e))
record.objects.bulk_create(entries)
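Note that filtering with "if v" also drops legitimate falsy values such as 0 or False. An alternative sketch, assuming the affected model fields allow null=True, is to map NaN and empty strings to None explicitly:
entries = []
for e in my_dataframe.T.to_dict().values():
    # keep 0 and False, but turn NaN and empty strings into None
    e = {k: (None if pd.isna(v) or v == "" else v) for k, v in e.items()}
    entries.append(record(**e))
record.objects.bulk_create(entries)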

Django Postgres JSONField query

I have a class with a json field in it
class A(models.Model):
    brand = JSONField()
If I post an array of JSON like [{'brand_id:1', 'name':'b1'}, {'brand_id:2', 'name':'b2'}] it is stored as an array of JSON. This works fine.
How should I query so as to check if '1' is present in the brand_id of any dictionary in that array?
Well first of all your JSON here is malformed. I presume it is meant to be:
[{'brand_id': 1, 'name': 'b1'}, {'brand_id': 2, 'name': 'b2'}]
If that's the case, to test for 1 in such a blob, something like this will tell you if 1 is to be found anywhere in the JSON as a value:
def check_for_one(json_data):
    return any([1 in data.values() for data in json_data])
But you want to know specifically whether 1 is a value owned by the key brand_id anywhere in the JSON, so you can also use a loop to add some extra conditions:
def check_for_one(json_data):
    match = []
    for data in json_data:
        for key, value in data.items():
            if key == 'brand_id' and value == 1:
                match.append(True)
    return any(match)
You can incorporate such logic as methods on your model class like this:
class A(models.Model):
    brand = JSONField()

    def check_for_one_comprehension(self):
        return any([1 in data.values() for data in self.brand])

    def check_for_one_loop(self):
        match = []
        for data in self.brand:
            for key, value in data.items():
                if key == 'brand_id' and value == 1:
                    match.append(True)
        return any(match)
But, if you actually want to filter instances from the database where the JSON data is an array at the top level and brand_id == 1, that needs a different approach, and this should do it:
A.objects.filter(brand__contains=[{'brand_id': 1}])
Note the additional [{}] braces! If you just call contains=['brand_id': 1] it will throw a syntax error and if you call contains={'brand_id': 1} it will not match.
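For context, a small usage sketch of that queryset filter next to the per-instance helpers defined above (data and field names are just the examples from this question):
# queryset-level: instances whose JSON array contains an object with brand_id == 1
matching = A.objects.filter(brand__contains=[{'brand_id': 1}])

# per-instance: evaluate the helper methods in Python
for a in A.objects.all():
    print(a.pk, a.check_for_one_loop())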
This worked:
A.objects.filter(brands__contains=[{'brand_id':1}])
I didn't check it first by using the array.
The link pointed out by @Bear Brown gives sufficient info.
It's easy in Django, but finding it out took time :P.

How to update the value of a specific column of every row in a DB with the output of a function that was passed the original value

For example: replacing, in each row, the value of a column with that value after it has been run through a hash function.
In my case I have saved many URLs in a column; they have 302 redirects, and I want to save where they redirect to after passing their current value into this function:
import requests

def finalURL(original):
    output = requests.get(original)  # follows the 302 redirect
    return output.url                # final URL after redirects
I am relatively new to Python and the closest examples I can find are not written in Python (cannot translate).
Additionally I've seen several posts on how to iterate through a db and print every value in a column, but no explanation on how to change that value.
To make your hash function available in SQLite, you have to create a user-defined function:
def finalURL(x):
    return ...

db = sqlite3.connect(...)
db.create_function("finalURL", 1, finalURL)
Then you can simply use it in queries:
db.execute("UPDATE MyTable SET url = finalURL(url) WHERE ...")

Unable to retrieve data in django

I am writing a weblog application in Django. As part of this, I have a view function that fetches an object from the database corresponding to a single blog post. The field that I am using to query the database is the published date (pub_date), which is of type DateTime in Python. I have a MySQL database and the type of the column for this field is datetime. But I am not able to fetch the object from the database even though I am passing the correct date attributes; I am getting a 404 error. The following is my view function:
def entry_detail(request, year, month, day, slug):
    import datetime, time
    date_stamp = time.strptime(year + month + day, "%Y%b%d")
    pub_date = datetime.date(*date_stamp[:3])
    entry = get_object_or_404(Entry, pub_date__year=pub_date.year, pub_date__month=pub_date.month,
                              pub_date__day=pub_date.day, slug=slug)
    return render_to_response('coltrane/entry_detail.html', {'entry': entry})
The following is the URL of the individual post that I want to fetch:
http://127.0.0.1:8000/weblog/2014/oct/28/third-post/
And this is what the pub_date column value for third-post looks like in the database:
2014-10-28 13:26:39
The following is the URL pattern:
url(r'^weblog/(?P<year>\d{4})/(?P<month>\w{3})/(?P<day>\d{2})/(?P<slug>[-\w]+)/$','coltrane.views.entry_detail'),
You're doing some odd things here: you're converting to a time, then converting that to a datetime.date, then extracting the year, month and day as integers and passing them to the query. You could bypass almost the whole process: the only thing you need to convert is the month; the other parameters can be passed directly:
month_no = datetime.datetime.strptime(month, '%b').month
entry = get_object_or_404(Entry, pub_date__year=year, pub_date__month=month_no, pub_date__day=day, slug=slug)
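A minimal sketch of the revised view with the imports moved to module level (the model import path is an assumption, and render_to_response is used only because the question's older Django version provides it):
import datetime

from django.shortcuts import get_object_or_404, render_to_response
from coltrane.models import Entry  # assumed location of the Entry model

def entry_detail(request, year, month, day, slug):
    # convert the three-letter month abbreviation (e.g. 'oct') to its number
    month_no = datetime.datetime.strptime(month, '%b').month
    entry = get_object_or_404(Entry, pub_date__year=year, pub_date__month=month_no,
                              pub_date__day=day, slug=slug)
    return render_to_response('coltrane/entry_detail.html', {'entry': entry})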

Importing salesforce report data using python

I am new to SFDC. I have a report already created by a user. I would like to use Python to dump the data of the report into a CSV/Excel file.
I see there are a couple of Python packages for that, but my code gives an error:
from simple_salesforce import Salesforce
sf = Salesforce(instance_url='https://cs1.salesforce.com', session_id='')
sf = Salesforce(password='xxxxxx', username='xxxxx', organizationId='xxxxx')
Can I have the basic steps for setting up the API and some example code?
This worked for me:
import requests
import csv
from simple_salesforce import Salesforce
import pandas as pd

sf = Salesforce(username=your_username, password=your_password, security_token=your_token)
login_data = {'username': your_username, 'password': your_password_plus_your_token}
with requests.session() as s:
    d = s.get("https://your_instance.salesforce.com/{}?export=1&enc=UTF-8&xf=csv".format(reportid),
              headers=sf.headers, cookies={'sid': sf.session_id})
d.content will contain a string of comma-separated values which you can read with the csv module.
I take the data into pandas from there, hence the function name and the pandas import. I removed the rest of the function where it puts the data into a DataFrame, but if you're interested in how that's done let me know.
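For reference, one way that last step could look (a sketch, not the removed code; it assumes the response body is CSV text):
from io import StringIO

report_df = pd.read_csv(StringIO(d.content.decode('utf-8')))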
In case it is helpful, I wanted to write out the steps I used to answer this question now (Aug-2018), based on Obol's comment. For reference, I followed the README instructions at https://github.com/cghall/force-retrieve/blob/master/README.md for the salesforce_reporting package.
To connect to Salesforce:
from salesforce_reporting import Connection, ReportParser
sf = Connection(username='your_username',password='your_password',security_token='your_token')
Then, to get the report I wanted into a Pandas DataFrame:
import pandas as pd

report = sf.get_report(your_reports_id)
parser = ReportParser(report)
report = parser.records_dict()
report = pd.DataFrame(report)
If you were so inclined, you could also simplify the four lines above into one, like so:
report = pd.DataFrame(ReportParser(sf.get_report(your_reports_id)).records_dict())
One difference I ran into from the README is that sf.get_report('report_id', includeDetails=True) threw an error stating get_report() got an unexpected keyword argument 'includeDetails'. Simply removing it seemed to result in the code working fine.
report can now be exported via report.to_csv('report.csv',index=False), or manipulated directly.
EDIT: parser.records() changed to parser.records_dict(), as this allows the DataFrame to have the columns already listed, rather than indexing them numerically.
The code below is rather long and might be just for our use case, but the basic idea is the following:
Find out the date interval length and any additional filtering needed so that you never run into the "more than 2,000 rows" limit. In my case I could use a weekly date range filter but would need to apply some additional filters.
Then run it like this:
report_id = '00O4…'
sf = SalesforceReport(user, password, token, report_id)
it = sf.iterate_over_dates_and_filters(datetime.date(2020, 2, 1),
                                       'Invoice__c.InvoiceDate__c', 'Opportunity.CustomField__c',
                                       [('a', 'startswith'), ('b', 'startswith'), …])
for row in it:
    # do something with the dict
The iterator goes through every week since 2020-02-01 (if you need daily or monthly iterators you'd need to change the code, but the change should be minimal) and applies the filter CustomField__c.startswith('a'), then CustomField__c.startswith('b'), …; it acts as a generator, so you don't need to mess with the filter cycling yourself.
The iterator throws an Exception if there's a query which returns more than 2000 rows, just to be sure that the data is not incomplete.
One warning here: SF has a limit of max 500 queries per hour. Say you have one year with 52 weeks and 10 additional filters; you'd already run into that limit.
Here's the class (it relies on simple_salesforce):
import simple_salesforce
import json
import datetime

"""
helper class to iterate over salesforce report data
and maneuvering around the 2000 max limit
"""

class SalesforceReport(simple_salesforce.Salesforce):
    def __init__(self, username, password, security_token, report_id):
        super(SalesforceReport, self).__init__(username=username, password=password, security_token=security_token)
        self.report_id = report_id
        self._fetch_describe()

    def _fetch_describe(self):
        url = f'{self.base_url}analytics/reports/{self.report_id}/describe'
        result = self._call_salesforce('GET', url)
        self.filters = dict(result.json()['reportMetadata'])

    def apply_report_filter(self, column, operator, value, replace=True):
        """
        adds/replaces filter, example:
        apply_report_filter('Opportunity.InsertionId__c', 'startsWith', 'hbob').
        For date filters use apply_standard_date_filter.

        column: needs to correspond to a column in your report, AND the report
            needs to have this filter configured (so in the UI the filter
            can be applied)
        operator: equals, notEqual, lessThan, greaterThan, lessOrEqual,
            greaterOrEqual, contains, notContain, startsWith, includes
            see https://sforce.co/2Tb5SrS for up to date list
        value: value as a string
        replace: if set to True, then if there's already a restriction on column
            this restriction will be replaced, otherwise it's added additionally
        """
        filters = self.filters['reportFilters']
        if replace:
            filters = [f for f in filters if not f['column'] == column]
        filters.append(dict(
            column=column,
            isRunPageEditable=True,
            operator=operator,
            value=value))
        self.filters['reportFilters'] = filters

    def apply_standard_date_filter(self, column, startDate, endDate):
        """
        replace date filter. The date filter needs to be available as a filter in the
        UI already

        Example: apply_standard_date_filter('Invoice__c.InvoiceDate__c', d_from, d_to)

        column: needs to correspond to a column in your report
        startDate, endDate: instance of datetime.date
        """
        self.filters['standardDateFilter'] = dict(
            column=column,
            durationValue='CUSTOM',
            startDate=startDate.strftime('%Y-%m-%d'),
            endDate=endDate.strftime('%Y-%m-%d')
        )

    def query_report(self):
        """
        return generator which yields one report row as dict at a time
        """
        url = self.base_url + "analytics/reports/query"
        result = self._call_salesforce('POST', url, data=json.dumps(dict(reportMetadata=self.filters)))
        r = result.json()
        columns = r['reportMetadata']['detailColumns']
        if not r['allData']:
            raise Exception('got more than 2000 rows! Quitting as data would be incomplete')
        for row in r['factMap']['T!T']['rows']:
            values = []
            for c in row['dataCells']:
                t = type(c['value'])
                if t == str or t == type(None) or t == int:
                    values.append(c['value'])
                elif t == dict and 'amount' in c['value']:
                    values.append(c['value']['amount'])
                else:
                    print(f"don't know how to handle {c}")
                    values.append(c['value'])
            yield dict(zip(columns, values))

    def iterate_over_dates_and_filters(self, startDate, date_column, filter_column, filter_tuples):
        """
        return generator which iterates over every week and applies the filters
        each for column
        """
        date_runner = startDate
        while True:
            print(date_runner)
            self.apply_standard_date_filter(date_column, date_runner, date_runner + datetime.timedelta(days=6))
            for val, op in filter_tuples:
                print(val)
                self.apply_report_filter(filter_column, op, val)
                for row in self.query_report():
                    yield row
            date_runner += datetime.timedelta(days=7)
            if date_runner > datetime.date.today():
                break
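If the end goal is a DataFrame rather than row-by-row processing, the iterator's dicts can also be collected directly (a small sketch, assuming pandas and the it generator from the usage snippet above):
import pandas as pd

rows = list(it)          # each row is a dict mapping column name -> value
df = pd.DataFrame(rows)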
For anyone just trying to download a report into a DataFrame, this is how you do it (I added some notes and links for clarification):
import pandas as pd
import csv
import requests
from io import StringIO
from simple_salesforce import Salesforce
# Input Salesforce credentials:
sf = Salesforce(
    username='johndoe@mail.com',
    password='<password>',
    security_token='<security_token>')  # See below for help with finding your token
# Basic report URL structure:
orgParams = 'https://<INSERT_YOUR_COMPANY_NAME_HERE>.my.salesforce.com/' # you can see this in your Salesforce URL
exportParams = '?isdtp=p1&export=1&enc=UTF-8&xf=csv'
# Downloading the report:
reportId = 'reportId' # You find this in the URL of the report in question between "Report/" and "/view"
reportUrl = orgParams + reportId + exportParams
reportReq = requests.get(reportUrl, headers=sf.headers, cookies={'sid': sf.session_id})
reportData = reportReq.content.decode('utf-8')
reportDf = pd.read_csv(StringIO(reportData))
You can get your token by following the instructions at the bottom of this page.