how to create and populate a geometry column in sqlite3 - python-2.7

I am creating a test database on my local machine so I can test some calculations before moving to a bigger run saved against the actual Postgres database. But I can't get sqlite3 to create and save data to my geom column (the geometry itself is being correctly created by Python). It seems like the geom column doesn't get created either. Any help would be appreciated.
import sqlite3 as lite
from osgeo import ogr

conn = lite.connect('geom_test.db')
c = conn.cursor()
c.execute("""CREATE TABLE test
             (lat double precision,
              long double precision,
              speed double precision)""")
c.execute("""SELECT AddGeometryColumn('test', 'geom',
                                      4326, 'POINT', 'XY');""")

line = ogr.Geometry(ogr.wkbLineString)
line.AddPoint_2D(lon, lat)
line = line.ExportToWkt()

qry = "INSERT INTO sequence VALUES (?,?,?,?);"
conn = lite.connect('geom_test.db')
c = conn.cursor()
c.executemany(qry, data)  # data is a 4-column dict
c.close()
conn.commit()
conn.close()

It looks like you're using SpatiaLite, which adds functionality to SQLite. See Recipe #6: Creating a new Geometry column. The next slide shows how to load some simple geometries into the table.
See the main web resource for SpatiaLite, with cookbooks, references, etc. And if you need to view the geometries, try QGIS.
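For illustration, here is a minimal sketch of the create/insert cycle in Python, assuming the SpatiaLite extension loads as described in the next answer and reusing the table and column names from the question (the coordinate values are just samples):
import sqlite3 as lite

conn = lite.connect('geom_test.db')
conn.enable_load_extension(True)
conn.load_extension('mod_spatialite')        # library path/name varies per platform, see below

c = conn.cursor()
c.execute("SELECT InitSpatialMetaData();")   # create the spatial metadata tables (once per db)
c.execute("""CREATE TABLE test
             (lat double precision,
              long double precision,
              speed double precision);""")
c.execute("SELECT AddGeometryColumn('test', 'geom', 4326, 'POINT', 'XY');")

# Insert one row, building the POINT geometry from WKT
lon, lat, speed = 2.35, 48.85, 12.3          # sample values
c.execute("INSERT INTO test (lat, long, speed, geom) "
          "VALUES (?, ?, ?, GeomFromText(?, 4326));",
          (lat, lon, speed, 'POINT({0} {1})'.format(lon, lat)))
conn.commit()
conn.close()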

You need to load the SpatiaLite shared library:
import sqlite3 as lite
from os.path import realpath
conn = lite.connect('geom_test.db')
library_path = "/usr/local/lib/mod_spatialite.so" # on my ubuntu linux
conn.enable_load_extension(True)
conn.load_extension(realpath(library_path))
Or another way to load the extension:
conn.execute("SELECT load_extension(?, 'sqlite3_modspatialite_init');",
(realpath(library_path),))
On Windows the name will be 'mod_spatialite.dll'. (Also be careful with the path provided to load_extension(), i.e. properly escape problematic characters and resolve symbolic links before calling it.)
Anyway, this page on the SpatiaLite website shows some details about loading the shared library.
Edit:
Loading will raise an OperationalError if it fails. If it succeeds, you can check that it uses the versions you expected (and do the operations you were trying to do, of course):
c = conn.cursor()
vs = str([i for i in c.execute("""SELECT spatialite_version()""")])
vgeos = str([i for i in c.execute("""SELECT geos_version()""")])
print('Spatialite {} (GEOS {})'.format(
    vs.strip("()[]',"), vgeos.strip("()',[]")))

It looks like you're using PostGIS, which adds functionality to PostgreSQL. Included in this is the AddGeometryColumn function. This won't be available to SQLite, as it isn't a standard function.

Related

Flask SQLAlchemy pymysql Warning: (1366 Incorrect string value)

I'm using flask_sqlalchemy in my Flask application with a local MySQL (8.0.19) database. I've never had this issue before (I started developing this app months ago). I'm not sure what changed or which component of the app got updated, but I'm getting this error out of nowhere at the moment. I've searched and found that it might be some character-encoding issue, but following the instructions I still get the warning when I open my app:
C:\Users\MyUserName\AppData\Local\Programs\Python\Python37\lib\site-packages\pymysql\cursors.py:170: Warning:
(1366, "Incorrect string value: '\\xF6z\\xE9p-e...' for column 'VARIABLE_VALUE' at row 1")
result = self._query(query)
This is my url env variable:
MYSQL_URL = mysql+pymysql://user:password@localhost:3306/testdb?charset=utf8mb4
And this is how I create my db session:
db_url = os.getenv('MYSQL_URL')
engine = create_engine(db_url, echo=True)
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
This is the most simple usage of the session:
def row_count():
return (
session.query(Value.ValueID).count()
)
When I inspect this local database with HeidiSQL it says its collation is utf8mb4_0900_ai_ci. I don't know what those suffixes mean, and there are a ton of utf8mb4 variants available. This is the default value.
Does anyone have any idea how to resolve this warning? What does it mean exactly? As I'm using an ORM, I'm not creating any database or running any query by hand, so how should I handle this?
ai : accent insensitive
ci : case insensitive
Did you try the following URL:
MYSQL_URL = mysql+pymysql://user:password@localhost:3306/testdb?charset=utf8mb4_ai_ci
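If adjusting the URL alone doesn't help, another option is to hand the charset straight to the pymysql driver through SQLAlchemy's connect_args. A minimal sketch, mirroring the question's placeholders for the URL and session setup:
import os
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

# e.g. MYSQL_URL = "mysql+pymysql://user:password@localhost:3306/testdb"
db_url = os.getenv('MYSQL_URL')

# The charset can be passed to the DBAPI driver directly instead of in the URL
engine = create_engine(db_url, echo=True,
                       connect_args={'charset': 'utf8mb4'})

Session = sessionmaker(bind=engine)
session = Session()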

running a function to load csv data into the DB via Django

This should be a no-brainer question, but it intrigues me.
I want to load a roster of countries into the database. Yes, I know I could easily do it directly by importing the CSV in MySQL or Postgres, and I could also type the code snippet into the console (very inconvenient) and it would work fine, but I wonder how you do it in Django. Because, if I have this:
import csv

def loadcountries():
    # assumes the `countries` model is imported from your app's models
    with open('uploads/countries.csv') as f:
        reader = csv.reader(f)
        for row in reader:
            _, created = countries.objects.get_or_create(
                name=row[0]
            )
Then where do I place that code? In the views? If so, how do I make that function alone run? I can't see myself creating artificial URLs in urls.py and an HTML page just so I can call the view from a URL, etc. Don't vote me down, I am struggling not to go below a score of 23.
You could write a custom management command which calls your loadcountries method.
This would allow you to run:
manage.py loadcountries
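For illustration, a minimal sketch of such a command, assuming the file lives at <yourapp>/management/commands/loadcountries.py (the app name and model import are placeholders to adjust):
# <yourapp>/management/commands/loadcountries.py  (hypothetical path)
import csv

from django.core.management.base import BaseCommand

from myapp.models import countries  # placeholder: import from your actual app


class Command(BaseCommand):
    help = "Load the roster of countries from uploads/countries.csv"

    def handle(self, *args, **options):
        with open('uploads/countries.csv') as f:
            reader = csv.reader(f)
            for row in reader:
                _, created = countries.objects.get_or_create(name=row[0])
                if created:
                    self.stdout.write("Added %s" % row[0])
Note that the management and management/commands directories may also need empty __init__.py files (depending on your Django version) so the command is discovered.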

'idf vector is not fitted' error when using a saved classifier/model

Pardon me if I use the wrong terminology, but what I want is to train on a set of data (using GaussianNB Naive Bayes from scikit-learn), save the model/classifier, and then load it whenever I need it to predict a category.
from sklearn.externals import joblib
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_extraction.text import TfidfVectorizer

self.vectorizer = TfidfVectorizer(decode_error='ignore')
self.X_train_tfidf = self.vectorizer.fit_transform(train_data)
# Fit the model to my training data
self.gnb = GaussianNB()
self.clf = self.gnb.fit(self.X_train_tfidf.toarray(), category)
# Save the classifier to file
joblib.dump(self.clf, 'trained/NB_Model.pkl')
# Save the vocabulary to file
joblib.dump(self.vectorizer.vocabulary_, 'trained/vectorizer_vocab.pkl')

# Next time, I read the saved classifier
self.clf = joblib.load('trained/NB_Model.pkl')
# Read the saved vocabulary
self.vocab = joblib.load('trained/vectorizer_vocab.pkl')
# Initialize the vectorizer
self.vectorizer = TfidfVectorizer(vocabulary=self.vocab, decode_error='ignore')
# Try to predict a category for new data
X_new_tfidf = self.vectorizer.transform(new_data)
print self.clf.predict(X_new_tfidf.toarray())

# After running the predict command above, I get the error:
'idf vector is not fitted'
Can anyone tell me what I'm missing?
Note: The saving of the model, the reading of the saved model, and the prediction of a new category are all different methods of a class. I have collapsed them into a single snippet here for easier reading.
Thanks
You need to pickle the self.vectorizer and load it again. Currently you are only saving the vocabulary learnt by the vectorizer.
Change the following line in your program:
joblib.dump(self.vectorizer.vocabulary_, 'trained/vectorizer_vocab.pkl')
to:
joblib.dump(self.vectorizer, 'trained/vectorizer.pkl')
And the following line:
self.vocab = joblib.load('trained/vectorizer_vocab.pkl')
to:
self.vectorizer = joblib.load('trained/vectorizer.pkl')
Delete this line:
self.vectorizer = TfidfVectorizer(vocabulary=self.vocab, decode_error='ignore')
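Putting those changes together, a minimal sketch of the corrected save/load cycle, reusing the names from the question (train_data, category, new_data, and the trained/ directory are assumed to exist):
from sklearn.externals import joblib
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_extraction.text import TfidfVectorizer

# --- training run ---
self.vectorizer = TfidfVectorizer(decode_error='ignore')
X_train_tfidf = self.vectorizer.fit_transform(train_data)
self.clf = GaussianNB().fit(X_train_tfidf.toarray(), category)
joblib.dump(self.clf, 'trained/NB_Model.pkl')
joblib.dump(self.vectorizer, 'trained/vectorizer.pkl')  # the whole fitted vectorizer, not just the vocabulary

# --- later run, predicting a category ---
self.clf = joblib.load('trained/NB_Model.pkl')
self.vectorizer = joblib.load('trained/vectorizer.pkl')
X_new_tfidf = self.vectorizer.transform(new_data)
print self.clf.predict(X_new_tfidf.toarray())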
Problem explanation:
You are correct in your thinking to just save the learnt vocabulary and reuse it. But the scikit-learn TfidfVectorizer also has an idf_ attribute, which contains the IDF values for that vocabulary, so you would need to save that as well. And even if you save both and load them into a new TfidfVectorizer instance, you will still get the "not fitted" error, because that's just the way most scikit-learn transformers and estimators are defined. So without doing anything "hacky", saving the whole vectorizer is your best bet. If you still want to go down the save-the-vocabulary path, take a look here at how to do it properly:
http://thiagomarzagao.com/2015/12/08/saving-TfidfVectorizer-without-pickles/
The above page saves the vocabulary as JSON and idf_ as a simple array. You could use pickles there instead, but you will get the idea of how TfidfVectorizer works.
Hope it helps.

Django: How to insert sql statements into sqlite3 database and configure with models

I'm using Django for a project. I have a .sql file containing insert statements. How can I get them into my sqlite3 database and have them work with my models?
The project "fortune" has a models.py file that looks like the following:
class fortune(models.Model):
    id = models.IntegerField(primary_key=True)
    category = models.CharField(max_length=50)
    length = models.IntegerField()
    aphorism = models.CharField(max_length=5000)
I have a .sql file with a list of INSERT statements like the following:
INSERT INTO "fortune_fortune" VALUES(1,'fortunes',127,'Arbitrary Text');
When I run .schema on my db.sqlite3 file which is configured with my project I see:
CREATE TABLE fortune_fortune(id integer, category varchar(50), length integer, aphorism varchar(5000));
I've tried using .read in my sqlite shell with no luck. I've tried typing "sqlite3 file.sqlite3 < file.sql" in bash as well. There is something I'm missing here, but I can't seem to ID the problem.
Thanks for your help.
OK, wait... normally you don't use SQL statements to insert data into the DB if you work with Django.
To insert data into the DB, you work with the Django ORM, which is much more fun than those ugly SQL statements:
new_fortune = fortune(category='new cat', length=19, aphorism='i love life')
new_fortune.save()
Then, as a result, you will have one new row in the fortune table in your DB. Just read the Django docs and you'll feel happy!
And one more thing: by convention, class names always start with a capital letter (so Fortune rather than fortune).
To your issue:
Django provides a hook for passing the database arbitrary SQL that's executed just after the CREATE TABLE statements when you run migrate. You can use this hook to populate default records, or you could also create SQL functions, views, triggers, etc.
The hook is simple: Django just looks for a file called sql/<modelname>.sql in your app directory, where <modelname> is the model's name in lowercase.
More in the docs.
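If you still want to load the existing .sql file directly rather than through that hook, one alternative is a short Python snippet using sqlite3's executescript (file names follow the question; run it from the directory containing db.sqlite3):
import sqlite3

# Read the INSERT statements and run them against the project's database.
with open('file.sql') as f:
    script = f.read()

conn = sqlite3.connect('db.sqlite3')
conn.executescript(script)   # executes every statement in the file
conn.commit()
conn.close()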

Django fixture fails, stating "DatabaseError: value too long for type character varying(50)"

I have a fixture (JSON) which loads in the development environment but fails to do so in the server environment. The error says: "DatabaseError: value too long for type character varying(50)"
My development environment is Windows & Postgres 8.4. The server runs Debian and Postgres 8.3. Database encoding is UTF8 in both systems.
It is as if Unicode markers in the fixture count as chars on the server and cause some strings to exceed their field's max length. However, that does not happen in the dev environment.
Update: the 50 char limit is now 255 in Django 1.8
--
Original answer:
I just encountered this this afternoon, too, and I have a fix (of sorts).
This post here implied it's a Django bug to do with length of the value allowed for auth_permission. Further digging backs up that idea, as does this Django ticket (even though it's initially MySQL-related).
It's basically that a permission name is created based on the verbose_name of a model plus a descriptive permission string, and that can overflow to more than the 50 chars allowed in auth.models.Permission.name.
To quote a comment on the Django ticket:
The longest prefixes for the string value in the column auth_permission.name are "Can change " and "Can delete ", both with 11 characters. The column maximum length is 50 so the maximum length of Meta.verbose_name is 39.
One solution would be to hack that column to support more than 50 characters (ideally via a South migration, I say, so that it's easily repeatable), but the quickest, most reliable fix I could think of was simply to make my extra-long verbose_name definition a lot shorter (from 47 characters down to around 20). All works fine now.
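For illustration, a minimal sketch of the verbose_name shortening (the model here is made up; the point is keeping verbose_name under roughly 39 characters):
from django.db import models


class ProjectParticipantQualificationRecord(models.Model):  # hypothetical model
    name = models.CharField(max_length=100)

    class Meta:
        # "Can change " / "Can delete " + verbose_name must fit in the
        # 50 characters of auth_permission.name, so keep this short.
        verbose_name = "qualification record"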
Well, what makes the difference is the encoding of the template databases. On the production server they had ASCII encoding, while on the dev box it is UTF-8.
By default Postgres creates a database using template1. My understanding is that if its encoding is not UTF-8, then the database you create will have this issue, even if you create it with UTF-8 encoding.
Therefore I dropped it and recreated it with its encoding set to UTF8. The snippet below does it (taken from here):
psql -U postgres
UPDATE pg_database SET datallowconn = TRUE where datname = 'template0';
\c template0
UPDATE pg_database SET datistemplate = FALSE where datname = 'template1';
drop database template1;
create database template1 with template = template0 encoding = 'UNICODE';
UPDATE pg_database SET datistemplate = TRUE where datname = 'template1';
\c template1
UPDATE pg_database SET datallowconn = FALSE where datname = 'template0';
Now the fixture loads smoothly.
Get the real SQL query on both systems and see what is different.
Just for information: I also had this error
DatabaseError: value too long for type character varying(10)
It seems that I was writing data over the 10-character limit for a field. I fixed it by increasing the size of a CharField from 10 to 20.
I hope it helps.
As @stevejalim says, it's quite possible that the auth_permission.name column, with its length of 50, is the problem; you can verify this with \d+ auth_permission in Postgres's shell. In my case this was the problem: when I loaded my Django models' fixtures I got "DatabaseError: value too long for type character varying(50)". Changing django.contrib.auth's Permission model is complicated, so the simple solution was to migrate the Permission model by hand. I did this by running the ALTER TABLE auth_permission ALTER COLUMN name TYPE VARCHAR(100); command in Postgres's shell, and this worked for me.
Credit goes to this comment.
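If you'd rather keep that change in the project than run the SQL by hand, a sketch of the same ALTER wrapped in a migration (this assumes a recent Django with built-in migrations; the app name and dependency are placeholders):
# hypothetical file: myapp/migrations/0002_widen_permission_name.py
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ('myapp', '0001_initial'),  # adjust to your app's latest migration
    ]

    operations = [
        migrations.RunSQL(
            "ALTER TABLE auth_permission ALTER COLUMN name TYPE VARCHAR(100);",
            reverse_sql="ALTER TABLE auth_permission ALTER COLUMN name TYPE VARCHAR(50);",
        ),
    ]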
You can make Django use longer fields for this model by monkey-patching the model prior to using it to create the database tables. In "manage.py", change:
if __name__ == "__main__":
    execute_manager(settings)
to:
from django.contrib.auth.models import Permission

if __name__ == "__main__":
    # Patch the field width to allow for our long model names
    Permission._meta.get_field('name').max_length = 200
    Permission._meta.get_field('codename').max_length = 200
    execute_manager(settings)
This modifies the options on the field before (say) manage.py syncdb is run, so the database table has nice wide varchar() fields. You don't need to do this when invoking your app, as you never attempt to modify the Permissions table while running.