'NoneType' object is not iterable when geocoding with Nominatim

I need to plot maps using Folium. I only have a CSV file in which one column holds a location. I am trying to get the latitude and longitude for that column, which contains thousands of place names within a single district. The code below fails with the error "'NoneType' object is not iterable":
import time
from geopandas.tools import geocode

for i in df['Location']:
    # if NaN:
    time.sleep(1)  # pause between requests to Nominatim
    info = geocode(i, provider="nominatim")
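One plausible cause, though this is an assumption because the traceback is not shown, is that a lookup found nothing and the None result was then unpacked as if it were a pair. geopy's geocoders return None when an address is not found, and empty cells in the column arrive as NaN. The sketch below guards against both cases. `geocode_locations` and `geocode_one` are hypothetical names; `geocode_one` stands in for whatever single-address lookup you use and is not part of the geopandas API.

```python
import math
import time


def geocode_locations(locations, geocode_one, delay=0.0):
    """Geocode each usable location, skipping missing values and failed lookups.

    geocode_one: any callable that takes one address string and returns a
    (latitude, longitude) pair, or None when the address is not found.
    """
    results = {}
    for place in locations:
        # Skip None, NaN (how pandas marks empty cells) and blank strings.
        if place is None:
            continue
        if isinstance(place, float) and math.isnan(place):
            continue
        if not str(place).strip():
            continue
        hit = geocode_one(str(place))
        if hit is not None:  # unpacking None would raise the TypeError
            lat, lon = hit
            results[str(place)] = (lat, lon)
        time.sleep(delay)  # pause between requests
    return results


# A dictionary lookup stands in for the real geocoder in this example.
known = {"Pune": (18.52, 73.85)}
found = geocode_locations(["Pune", None, float("nan"), "  ", "Nowhere"], known.get)
print(found)
```

With a real Nominatim lookup, pass `delay=1` or more, since the public service asks for at most one request per second.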

Related

Can I get error index with bulk_create/bulk_update?

I'm using the bulk_create method to insert data from an Excel sheet into the database. The data can be incorrect (e.g. a cell contains "Hello World" where an integer is expected), and I'd like to return detailed error information to the user.
Currently, bulk_create raises a ValueError, e.g. Field 'some_integer_field' expected a number but got 'Hello World'.
Is it possible to get more details about this error, something like: **Line 4** - Field 'some_integer_field' expected a number but got 'Hello World', without saving each object individually? Does that even make sense, given that bulk_create produces a single transaction?
I'm running Django 4.1.3 with PostgreSQL 12.
My code looks like self.MyModel.bulk_create(self.objects_to_create)
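One possible approach, sketched under assumptions rather than taken from the thread, is to check each row before calling bulk_create, while the row number is still known. `find_row_errors`, `converters` and `first_line` are hypothetical names, and the example uses plain dictionaries instead of model instances so that it runs without a Django project.

```python
def find_row_errors(rows, converters, first_line=2):
    """Return (line_number, field, message) for every value that fails to convert.

    rows: list of dicts, one per spreadsheet row.
    converters: maps a field name to a callable such as int or float that
        raises ValueError or TypeError on bad input.
    first_line: spreadsheet line of the first data row (2 if line 1 is a header).
    """
    errors = []
    for offset, row in enumerate(rows):
        for field, convert in converters.items():
            value = row.get(field)
            try:
                convert(value)
            except (ValueError, TypeError):
                errors.append(
                    (first_line + offset, field,
                     f"expected {convert.__name__} but got {value!r}")
                )
    return errors


rows = [
    {"some_integer_field": "12"},
    {"some_integer_field": "7"},
    {"some_integer_field": "Hello World"},
]
errors = find_row_errors(rows, {"some_integer_field": int})
for line, field, message in errors:
    print(f"Line {line} - Field {field!r} {message}")
```

Inside Django, calling full_clean() on each unsaved instance and catching ValidationError serves a similar purpose, though its uniqueness checks query the database.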

Changing the data type of values in the Django model

I have data which is loaded into a dataframe. This dataframe then needs to be saved to a django model. The major problem is that some data which should go into IntegerField or FloatField are empty strings "". On the other side, some data which should be saved into a CharField is represented as np.nan. This leads to the following errors:
ValueError: Field 'position_lat' expected a number but got nan.
If I replace the np.nan with an empty string, using data[database]["df"].replace(np.nan, "", regex = True, inplace = True), I end up with the following error:
ValueError: Field 'position_lat' expected a number but got ''.
What I would like to do is check, in the model, whether a FloatField or IntegerField receives either np.nan or an empty string, and replace it with an empty value. Likewise, a CharField should convert integers (if applicable) to strings and np.nan to an empty string.
How could this be implemented? Using ModelManager or customized fields? Or any better approaches? Sorting the CSV files out is not an option.
import pandas as pd
import numpy as np
from .models import Record

my_dataframe = pd.read_csv("data.csv")
record = Record
entries = []
for e in my_dataframe.T.to_dict().values():
    entries.append(record(**e))
record.objects.bulk_create(entries)
Maybe the problem was not clear; nevertheless, I would like to post my solution. I create a new dict that keeps only the keys whose values are truthy.
entries = []
for e in my_dataframe.T.to_dict().values():
    e = {k: v for k, v in e.items() if v}
    entries.append(record(**e))
record.objects.bulk_create(entries)
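A truthiness test is not quite the same as "has a value": NaN is truthy, so it is kept, while 0 and 0.0 are falsy and are dropped. A minimal sketch of a stricter filter follows. `drop_missing` is a hypothetical helper, and it assumes the model fields can fall back to a default or NULL when a key is left out.

```python
import math


def drop_missing(row):
    """Return a copy of row without None, NaN, or empty-string values."""
    def is_missing(value):
        if value is None:
            return True
        if isinstance(value, float) and math.isnan(value):
            return True
        return isinstance(value, str) and value.strip() == ""

    return {key: value for key, value in row.items() if not is_missing(value)}


row = {"position_lat": float("nan"), "name": "", "speed": 0, "id": 7}
cleaned = drop_missing(row)
print(cleaned)  # {'speed': 0, 'id': 7}
```

The legitimate zero survives, while the NaN and the empty string are removed before the dict reaches the model constructor.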

How do I calculate the average difference between two dates in Django?

Using Django and Python 3.7. I'm trying to write a query that gives me the average difference between two dates. My model has two fields, both DateTimeFields, and I try to calculate the average difference like so:
everything_avg = Article.objects.aggregate(
    avg_score=Avg(F('removed_date') - F('created_on'), output_field=models.DateTimeField())
).filter(removed_date__isnull=False)
return everything_avg
but I end up getting this error when running the above
AttributeError: 'dict' object has no attribute 'filter'
What's the right way to get my average?
As the documentation says:
aggregate() is a terminal clause for a QuerySet that, when invoked, returns a dictionary of name-value pairs.
Because aggregate() returns a dictionary, you need to apply the filter before it. If you alter your code as follows, you get your result:
everything_avg = Article.objects.filter(removed_date__isnull=False).aggregate(
    avg_score=Avg(
        F('removed_date') - F('created_on'),
        # the difference of two datetimes is a duration, not a datetime
        output_field=models.DurationField(),
    )
)
return everything_avg
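To make the intent of the query concrete, here is a plain-Python sketch of the same calculation: keep only the rows that have a removed_date, subtract, and average. `average_lifetime` and the sample tuples are hypothetical and only illustrate the arithmetic, not how the database executes the query.

```python
from datetime import datetime, timedelta


def average_lifetime(articles):
    """Average removed_date - created_on over articles that were removed.

    articles: iterable of (created_on, removed_date) pairs.
    Returns a timedelta, or None if no article has a removed_date.
    """
    deltas = [
        removed - created
        for created, removed in articles
        if removed is not None  # mirrors filter(removed_date__isnull=False)
    ]
    if not deltas:
        return None
    return sum(deltas, timedelta()) / len(deltas)


articles = [
    (datetime(2020, 1, 1), datetime(2020, 1, 3)),  # 2 days
    (datetime(2020, 1, 1), datetime(2020, 1, 5)),  # 4 days
    (datetime(2020, 1, 1), None),                  # never removed: skipped
]
average = average_lifetime(articles)
print(average)  # 3 days, 0:00:00
```

The result is a timedelta, which is why a duration type is the natural output for the aggregate.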

ValueError Scikit learn. Number of features of model don't match input

I am pretty new to machine learning in general and scikit-learn in particular.
I am trying to use the example given on the site http://scikit-learn.org/stable/tutorial/basic/tutorial.html
For practicing on my own, I am using my own data-set. My data set is divided into two different CSV files:
Train_data.csv (Contains 32 columns, the last column is the output value).
Test_data.csv (contains 31 columns; the output column is missing, which should be the case, no?)
So the test data has one column fewer than the training data.
I am using the following code to learn (using training data) and then predict (using test data).
The issue I am facing is the error:
*ValueError: X.shape[1] = 31 should be equal to 29, the number of features at training time*
Here is my code (sorry if it looks completely wrong :( )
import pandas as pd #import the library
from sklearn import svm
mydata = pd.read_csv("Train - Copy.csv") #I read my training data set
target = mydata["Desired"] #my csv has header row, and the output label column is named "Desired"
data = mydata.ix[:,:-3] #select all but the last column as data
clf = svm.SVC(gamma=0.001, C=100.) #Code from the URL above
clf.fit(data,target) #Code from the URL above
test_data = pd.read_csv("test.csv") #I read my test data set. Without the output column
clf.predict(test_data[-1:]) #Code from the URL above
The training data CSV labels look something like this:
Value1,Value2,Value3,Value4,Output
The test data CSV labels look something like this:
Value1,Value2,Value3,Value4
Thanks :)
Yours is a supervised learning problem: you have data in the form of (input, output) pairs.
The inputs are the features describing each example, and the output is the prediction your model should produce for that input.
Your training data has one more attribute in its CSV file because, in order to train your model, you need to give it the output.
The general workflow in sklearn for a supervised problem should look like this:
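X, Y = read_data(data)  # read_data is a placeholder for your own loading code
n = len(X)
split = int(n * 0.8)  # slice indices must be integers
X_train, X_test = X[:split], X[split:]
Y_train, Y_test = Y[:split], Y[split:]
model.fit(X_train, Y_train)
model.score(X_test, Y_test)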
To split your data, you can use train_test_split and you can use several metrics in order to judge your model's performance.
You should check the shape of your data
data.shape
It looks like you are dropping the last 3 columns instead of only the last one. Try this instead (.iloc is the positional replacement for .ix, which was removed in pandas 1.0):
data = mydata.iloc[:, :-1]
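Selecting features by column name rather than by a negative slice makes this kind of off-by-N mistake easier to catch. The sketch below works on plain header lists so it runs without pandas. `feature_columns` and `check_test_header` are hypothetical helpers; with a DataFrame, the analogous step would be `mydata.drop(columns=["Desired"])`.

```python
def feature_columns(train_header, target):
    """Every training column except the target, in file order."""
    return [name for name in train_header if name != target]


def check_test_header(train_header, test_header, target):
    """Raise ValueError if the test columns differ from the training features."""
    expected = feature_columns(train_header, target)
    if list(test_header) != expected:
        raise ValueError(
            f"test data has {len(test_header)} columns, "
            f"but the model is trained on {len(expected)}"
        )
    return expected


train_header = ["Value1", "Value2", "Value3", "Value4", "Output"]
test_header = ["Value1", "Value2", "Value3", "Value4"]
features = check_test_header(train_header, test_header, target="Output")
print(features)
```

Running the check before `clf.predict` reports a column mismatch in terms of your own files instead of as a shape error from the model.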

Convert MySQLdb return from tuple to string in Python

I am new to MySQLdb. I need to read values from a pre-defined database stored in MySQL. My problem is that the values come back as tuples, not strings. So my question is: is there a way to convert a tuple to a string?
Below are the details of my code
import MySQLdb

# get value from database
conn = MySQLdb.connect("localhost", "root", "123", "book")
cursor = conn.cursor()
cursor.execute("SELECT koc FROM entries")
Koc_pre = str(cursor.fetchone())

# create an input form with Django and assign the pre-defined value
class Inp(forms.Form):
    Koc = forms.FloatField(required=True, label=mark_safe('K<sub>OC</sub> (mL/g OC)'), initial=Koc_pre)

# write out this input form
class InputPage(webapp.RequestHandler):
    def get(self):
        html = str(Inp())
        self.response.out.write(html)
The output is in tuple format "Koc=('5',)", but I want "koc=5". Can anyone give me some suggestions, or point me to a reference I should check?
Thanks in advance!
If you're only going to be retrieving one value at a time (i.e. getting one column using cursor.fetchone()), then you can just change your code so that you get the first element in the tuple.
Koc_pre = str(cursor.fetchone()[0])
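The behaviour is easy to try without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module, which follows the same DB-API cursor interface; the table and value are made up to match the question. It also checks for None, which fetchone() returns when the query yields no rows.

```python
import sqlite3

# In-memory stand-in for the "book" database from the question.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE entries (koc TEXT)")
cursor.execute("INSERT INTO entries (koc) VALUES ('5')")

cursor.execute("SELECT koc FROM entries")
row = cursor.fetchone()  # a one-element tuple: ('5',)

# fetchone() returns None when there are no rows, so check before indexing.
koc_pre = row[0] if row is not None else None

print(repr(row))      # ('5',)
print(repr(koc_pre))  # '5'
conn.close()
```

Indexing the row gives the bare column value, so wrapping the whole tuple in str() is no longer needed.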