Check if document exists using cloudant-2.0.0b2 in IBM Bluemix and Python - python-2.7

I am using:
A Python application in Bluemix
Bluemix cloudant v2.0.0b2 database linked to the Python app
According to https://pypi.python.org/pypi/cloudant/2.0.0b2, everything broke from 0.5 to 2.0, and they are still working on the documentation as everything is in beta. On top of this, I am also new to Python and databases. The documentation can be found here:
http://python-cloudant.readthedocs.io/en/latest/getting_started.html
What I am trying to do is check if a document already exists.
Things that I have tried:
from cloudant.account import Cloudant
import time
import json
# Connect to the database
client = Cloudant(*hidden*)
client.connect()
# The database we work in
db = client['myDatabase']
# The document we work on
doc = db['myDocument']
print doc.exists()
But the code fails before retrieving the document. I checked the source code, and it looks like this is how it is supposed to work:
def __getitem__(self, key):
    if key in list(self.keys()):
        return super(CouchDatabase, self).__getitem__(key)
    if key.startswith('_design/'):
        doc = DesignDocument(self, key)
    else:
        doc = Document(self, key)
    if doc.exists():
        doc.fetch()
        super(CouchDatabase, self).__setitem__(key, doc)
        return doc
    else:
        raise KeyError(key)
Source: https://pypi.python.org/pypi/cloudant/2.0.0b2
Is there a way I can check if the document exists before I retrieve it? Or should I retrieve it and catch the error? Or is there a different approach?

The behavior you are describing is the desired behavior for the python-cloudant library database object. If you intend to use the database object to retrieve your documents and populate your local database cache, you should expect a KeyError in the event of a non-existent document and handle it accordingly. However, if you are interested in checking whether a document exists before bringing it into your local database cache, then changing your code to something like:
from cloudant.account import Cloudant
from cloudant.document import Document
# Connect to the database
client = Cloudant(*hidden*)
client.connect()
# The database we work in
db = client['myDatabase']
# The document we work on
if Document(db, 'myDocument').exists():
    doc = db['myDocument']
would do the trick.
Similarly you could just do:
from cloudant.account import Cloudant
from cloudant.document import Document
# Connect to the database
client = Cloudant(*hidden*)
client.connect()
# The database we work in
db = client['myDatabase']
# The document we work on
doc = Document(db, 'myDocument')
if doc.exists():
    doc.fetch()
But this would not populate your local database cache, the db dictionary.
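For completeness, here is a minimal sketch of the try/except approach mentioned at the start of this answer, which does populate the local database cache and handles the missing-document case explicitly (connection details hidden, as in the snippets above):
from cloudant.account import Cloudant
# Connect to the database
client = Cloudant(*hidden*)
client.connect()
# The database we work in
db = client['myDatabase']
# Retrieving via __getitem__ fetches and caches the document, or raises KeyError
try:
    doc = db['myDocument']
except KeyError:
    doc = None  # document does not exist; handle accordingly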

Related

Two databases in Flask with SQLAlchemy, one for reading and one for writing data: is it possible? - python

I have a Flask web app where I need to use two existing databases. The first database (DB-A) is used only for reading data (no write permissions). The second database (DB-B) I am allowed to read and write.
I can read DB-A using the automap extension:
from sqlalchemy.ext.automap import automap_base
Base = automap_base()
Base.prepare(db.engine, reflect=True)
DB_A = Base.classes.visitors
Reading and writing in DB-B I do with the regular SQLAlchemy ORM:
SQLALCHEMY_DATABASE_URI = ""
db = SQLAlchemy(app)
According to the documentation, SQLALCHEMY_BINDS should do the job, but I cannot find anything about how to tell my automap engine to read a specific database.
My question is: how can I use automap for reading DB-A alongside the regular (non-automap) ORM for writing to DB-B?
Thanks
Have you seen the part of the Flask documentation explaining how to bind multiple databases?
Quote from documentation:
SQLALCHEMY_DATABASE_URI = 'postgres://localhost/main'
SQLALCHEMY_BINDS = {
    'users': 'mysqldb://localhost/users',
    'appmeta': 'sqlite:////path/to/appmeta.db'
}
And when this is set up, you can use the following (also quoted from the documentation):
>>> db.create_all()
>>> db.create_all(bind=['users'])
>>> db.create_all(bind='appmeta')
>>> db.drop_all(bind=None)
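To tie this back to automap: a possible approach is to point automap at the engine of a specific bind, while the regular ORM uses the default database. This is only a sketch; the bind name 'db_a', the URIs, and the visitors table are assumptions based on the question, and db.get_engine is Flask-SQLAlchemy's helper for fetching a bind's engine:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy.ext.automap import automap_base

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgres://localhost/db_b'     # DB-B, read/write
app.config['SQLALCHEMY_BINDS'] = {'db_a': 'postgres://localhost/db_a'}  # DB-A, read-only
db = SQLAlchemy(app)

# Reflect DB-A through automap, using the engine of the 'db_a' bind
Base = automap_base()
Base.prepare(db.get_engine(app, bind='db_a'), reflect=True)
Visitors = Base.classes.visitors

# Regular declarative model for DB-B (the default bind)
class Entry(db.Model):
    id = db.Column(db.Integer, primary_key=True)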

Flask SQLAlchemy pymysql Warning: (1366 Incorrect string value)

I'm using flask_sqlalchemy in my Flask application with a local MySQL (8.0.19) database. I never got this issue before (I started developing this app months ago). I'm not sure what changed or which component of the app got updated, but I'm now getting this error out of nowhere. I've searched and found that it might be a character-encoding issue, but after following the instructions I still get the warning when I open my app:
C:\Users\MyUserName\AppData\Local\Programs\Python\Python37\lib\site-packages\pymysql\cursors.py:170: Warning:
(1366, "Incorrect string value: '\\xF6z\\xE9p-e...' for column 'VARIABLE_VALUE' at row 1")
result = self._query(query)
This is my url env variable:
MYSQL_URL = mysql+pymysql://user:password@localhost:3306/testdb?charset=utf8mb4
And this is how I create my db session:
db_url = os.getenv('MYSQL_URL')
engine = create_engine(db_url, echo=True)
Session = sessionmaker()
Session.configure(bind=engine)
session = Session()
This is the most simple usage of the session:
def row_count():
return (
session.query(Value.ValueID).count()
)
When I inspect this local database with HeidiSQL it says its collation is utf8mb4_0900_ai_ci. I don't know what those suffixes mean, and there are a ton of utf8mb4 variants available. This is the default value.
Does anyone have any idea how to resolve this warning? What does it mean exactly? As I'm using an ORM, I'm not creating any database or running any queries by hand, so how should I handle this?
ai : accent insensitive
ci : case insensitive
Did you try the following URL:
MYSQL_URL = mysql+pymysql://user:password@localhost:3306/testdb?charset=utf8mb4_ai_ci
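If that does not help, it can be useful to check which character set and collation your session actually negotiates. A minimal diagnostic sketch, reusing the engine setup from the question (connection details are placeholders):
from sqlalchemy import create_engine, text

engine = create_engine('mysql+pymysql://user:password@localhost:3306/testdb?charset=utf8mb4')
with engine.connect() as conn:
    # Print the character set and collation variables the server reports
    for name, value in conn.execute(text("SHOW VARIABLES LIKE 'character_set%'")):
        print(name, value)
    for name, value in conn.execute(text("SHOW VARIABLES LIKE 'collation%'")):
        print(name, value)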

How to use an Airflow connection set as an environment variable in Python code

Does anyone know how to access an Airflow connection exposed via an AIRFLOW_CONN_ environment variable and use it in Python code? I know we can use a hook to get the password, but I have been trying to use AIRFLOW_CONN_ in my Python code to connect to the database. I saved the connection in the Airflow UI, and the docs say to prefix the conn_id with AIRFLOW_CONN_ to use it. I tried os.environ['AIRFLOW_CONN_REDSHIFT'] in my Python code, but it does not find the environment variable. Please help.
Saving the connection to the database and setting an AIRFLOW_CONN_ environment variable are two different ways to add a connection. You should only choose one of them, unless you want them stored under different connection ids.
Assuming you are running your Python code through an operator like the PythonOperator, you should be able to fetch your connection just like the BaseHook does.
Stored in database:
@classmethod
def _get_connections_from_db(cls, conn_id):
    session = settings.Session()
    db = (
        session.query(Connection)
        .filter(Connection.conn_id == conn_id)
        .all()
    )
    session.expunge_all()
    session.close()
    if not db:
        raise AirflowException(
            "The conn_id `{0}` isn't defined".format(conn_id))
    return db
Stored in environment variable:
@classmethod
def _get_connection_from_env(cls, conn_id):
    environment_uri = os.environ.get(CONN_ENV_PREFIX + conn_id.upper())
    conn = None
    if environment_uri:
        conn = Connection(conn_id=conn_id, uri=environment_uri)
    return conn
Although I would recommend fetching it via a hook to avoid duplicating this code!
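For illustration, a minimal sketch of the hook approach, assuming the connection was saved under the conn_id 'redshift' (the import path matches Airflow 1.x, which this question appears to target):
from airflow.hooks.base_hook import BaseHook

# Resolves the connection whether it lives in the metadata database
# or in an AIRFLOW_CONN_REDSHIFT environment variable
conn = BaseHook.get_connection('redshift')
print(conn.host, conn.port, conn.schema)
print(conn.login, conn.password)
Note that an AIRFLOW_CONN_ variable is only visible if it is set in the environment of the process that runs your task, which is a common reason os.environ lookups come up empty.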

Django Sessions via Memcache: Cannot find session key manually

I recently migrated from database backed sessions to sessions stored via memcached using pylibmc.
Here are my CACHES, SESSION_CACHE_ALIAS and SESSION_ENGINE settings in my settings.py:
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.PyLibMCCache',
        'LOCATION': ['127.0.0.1:11211'],
    }
}
SESSION_CACHE_ALIAS = 'default'
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
Everything is working fine behind the scenes and I can see that it is using the new caching system. Running the get_stats() method from pylibmc shows me the number of current items in the cache and I can see that it has gone up by 1.
The issue is I'm unable to grab the session manually using pylibmc.
Upon inspecting the request session data in views.py:
def my_view(request):
    if request.user.is_authenticated():
        print request.session.session_key
        # the above prints something like this: "1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
        print request.session.cache_key
        # the above prints something like this: "django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
        return HttpResponse(status=200)
    else:
        return HttpResponse(status=401)
I noticed that when printing cache_key, it prints with the default KEY_PREFIX, whereas session_key does not. Take a look at the comments in the code to see what I mean.
So I figured, "Ok great, one of these key names should work. Let me try grabbing the session data manually just for educational purposes":
import pylibmc
mc = pylibmc.Client(['127.0.0.1:11211'])
# Let's try key "1ay2kcv7axb3nu5fwnwoyf85wkwsttz9"
mc.get("1ay2kcv7axb3nu5fwnwoyf85wkwsttz9")
Hmm, nothing happens; no key exists by that name. Ok, no worries, let's try the cache_key then; that should definitely work, right?
mc.get("django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9")
What? How am I still getting nothing back? As a test, I decided to set and get a random key-value pair, and that worked. I ran get_stats() again just to make sure the key does exist. I also tested the web app to confirm my session is working, and it is. So this leads me to conclude that there is a different naming scheme that I'm unaware of.
If so, what is the correct naming scheme?
Yes, the cache key used internally by Django is, in general, different to the key sent to the cache backend (in this case pylibmc / memcached). Let us call these two keys the django cache key and the final cache key respectively.
The django cache key given by request.session.cache_key is for use with Django's low-level cache API, e.g.:
>>> from django.core.cache import cache
>>> cache.get(request.session.cache_key)
{'_auth_user_hash': '1ay2kcv7axb3nu5fwnwoyf85wkwsttz9', '_auth_user_id': u'1', '_auth_user_backend': u'django.contrib.auth.backends.ModelBackend'}
The final cache key on the other hand, is a composition of the key prefix, the django cache key, and the cache version number. The make_key function (from Django docs) below demonstrates how these three values are composed to generate this key:
def make_key(key, key_prefix, version):
    return ':'.join([key_prefix, str(version), key])
By default, key_prefix is the empty string and version is 1.
Finally, by inspecting make_key we find that the correct final cache key to pass to mc.get is
:1:django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9
which has the form <KEY_PREFIX>:<VERSION>:<KEY>.
Note: the final cache key can be changed by defining KEY_FUNCTION in the cache settings.
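Putting this together with the pylibmc client from the question (assuming the default KEY_PREFIX and VERSION, and using the example session key from above as a placeholder):
import pylibmc

mc = pylibmc.Client(['127.0.0.1:11211'])
# <KEY_PREFIX>:<VERSION>:<KEY> with the defaults KEY_PREFIX='' and VERSION=1
final_key = ':1:django.contrib.sessions.cache1ay2kcv7axb3nu5fwnwoyf85wkwsttz9'
print(mc.get(final_key))  # the session dict, e.g. {'_auth_user_id': u'1', ...}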

Django Redis cache values

I have set a value on the Redis server externally using a Python script:
r = redis.StrictRedis(host='localhost', port=6379, db=1)
r.set('foo', 'bar')
And tried to get the value from a web request using the Django cache inside views.py:
from django.core.cache import cache
val = cache.get("foo")
It returns None. But when I try to get it via
from django_redis import get_redis_connection
con = get_redis_connection("default")
val = con.get("foo")
It returns the correct value 'bar'. How do the cache and the direct connection work?
Libraries usually use internal prefixes when storing keys in Redis, so that they are not mistaken for user-defined keys.
For example, django-redis-cache prepends ':1:' to every key you save through it.
So when you do cache.set('foo', 'bar'), it actually sets the key ':1:foo'. Since a direct client does not know the prefix prepended to your key, you can't fetch it with a plain get; you have to go through the cache's own API:
cache.set('foo', 'bar')
r.get('foo')     # None
r.get(':1:foo')  # 'bar'
So in the end, it comes down to the library you use; go read its code to see exactly how it saves keys. redis-cli can be a valuable friend here: set a key with cache.set('foo', 'bar'), then go into redis-cli and run 'keys *' to see what key was actually set for foo.
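A quick way to see this from Python, reusing get_redis_connection from the question (':1:foo' is just the result under the default KEY_PREFIX and VERSION settings):
from django.core.cache import cache
from django_redis import get_redis_connection

cache.set('foo', 'bar')
con = get_redis_connection('default')
print(con.keys('*foo*'))  # e.g. [':1:foo'], the prefixed key actually stored
print(cache.get('foo'))   # 'bar': the cache API applies the prefix for you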