Cassandra python driver: Client request timeout - python-2.7

I set up a simple script to insert a new record into a Cassandra database. It works fine on my local machine, but I get timeout errors from the client now that I've moved the database to a remote machine. How do I properly set the timeout for this driver? I have tried many things. I hacked the timeout in my IDE and got it to work without timing out, so I know for sure it's just a timeout problem.
How I set up my Cluster:
profile = ExecutionProfile(request_timeout=100000)
self.cluster = Cluster([os.getenv('CASSANDRA_NODES', None)], auth_provider=auth_provider,
                       execution_profiles={EXEC_PROFILE_DEFAULT: profile})
connection.setup(hosts=[os.getenv('CASSANDRA_SEED', None)],
                 default_keyspace=os.getenv('KEYSPACE', None),
                 consistency=int(os.getenv('CASSANDRA_SESSION_CONSISTENCY', 1)),
                 auth_provider=auth_provider,
                 connect_timeout=200)
session = self.cluster.connect()
The query I am trying to perform:
model = Model.create(buffer=_buffer, lock=False, version=self.version)
The error I get back:
13..': 'Client request timeout. See Session.execute_async'}, last_host=54.213..
The record I'm inserting is 11 MB, so I can understand there being some delay; just increasing the timeout should do it, but I can't seem to figure out how.

The default request timeout is an attribute of the Session object (version 2.0.0 of the driver and later).
session = cluster.connect(keyspace)
session.default_timeout = 60
This is the simplest answer (no need to mess about with an execution profile), and I have confirmed that it works.
https://datastax.github.io/python-driver/api/cassandra/cluster.html#cassandra.cluster.Session
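If you go through cqlengine's connection.setup (as in the question), Model.create uses cqlengine's own session, not the Cluster you built yourself, so the timeout has to be set on that session. A minimal sketch of the same idea, assuming connection.get_session() is available in your driver version (the value of 120 seconds is just illustrative):
import os
from cassandra.cqlengine import connection

connection.setup(hosts=[os.getenv('CASSANDRA_SEED', None)],
                 default_keyspace=os.getenv('KEYSPACE', None))
# raise the client-side request timeout (in seconds) on the session cqlengine uses
connection.get_session().default_timeout = 120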

You can set request_timeout in the Cluster constructor:
self.cluster = Cluster([os.getenv('CASSANDRA_NODES', None)],
                       auth_provider=auth_provider,
                       execution_profiles={EXEC_PROFILE_DEFAULT: profile},
                       request_timeout=10)
Reference: https://datastax.github.io/python-driver/api/cassandra/cluster.html

Based on the documentation, request_timeout is an attribute of the ExecutionProfile class, and you can pass an execution profile to the Cluster constructor (the following is an example).
So, you can do:
import os
from cassandra.cluster import Cluster
from cassandra.cluster import ExecutionProfile

execution_profile = ExecutionProfile(request_timeout=600)
profiles = {'node1': execution_profile}
cluster = Cluster([os.getenv('CASSANDRA_NODES', None)], execution_profiles=profiles)
session = cluster.connect()
session.execute('SELECT * FROM test', execution_profile='node1')
Important: when you use execute or execute_async, you have to specify the execution_profile name.
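Alternatively, if you register the profile under EXEC_PROFILE_DEFAULT it applies to every request, so you don't have to pass a profile name to each call. A minimal sketch along the lines of the question's setup:
import os
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT

# the default profile applies to every execute/execute_async call
profile = ExecutionProfile(request_timeout=600)
cluster = Cluster([os.getenv('CASSANDRA_NODES', None)],
                  execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect()
session.execute('SELECT * FROM test')  # no execution_profile argument needed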

Related

Cloud SQL: Instance update will not end

I have a problem with my Cloud SQL Second Generation instance.
I used this instance for months without problems.
Every evening I stop the instance and every morning I restart it via Java, using the "patch" API:
DatabaseInstance requestBody = new DatabaseInstance();
Settings settings = new Settings();
settings.setActivationPolicy(ALWAYS_POLICY);
requestBody.setSettings(settings);

SQLAdmin sqlAdminService = createSqlAdminService();
SQLAdmin.Instances.Patch request =
    sqlAdminService.instances().patch(projectId, sql_instanceId, requestBody);
Operation response = request.execute();
Today, after the usual restart operation, the instance is stuck in an "Instance update" status that never ends.
All functions on the Cloud Console are disabled for the instance and I can't perform any action.
These are the settings of my instance:
Database version is MySQL 5.7
Auto storage increase is enabled
Automated backups are enabled
Binary logging is disabled
Located in europe-west1-c
How can I solve this situation?

How to set timeouts of db calls using flask and SQLAlchemy?

I need to set a timeout for DB calls, and I looked into the Flask-SQLAlchemy documentation: http://flask-sqlalchemy.pocoo.org/2.1/config/
It lists many configuration parameters but never shows an example of how to use them. Could anyone show me how to use SQLALCHEMY_POOL_TIMEOUT to set a timeout for DB calls? I have it in my .py file, but I don't know whether I'm using the parameter correctly.
app = Flask(__name__)
app.config["LOGGER_NAME"] = ' '.join([app.logger_name,
                                      socket.gethostname(), instance_id])
app.config["SQLALCHEMY_DATABASE_URI"] = config.sqlalchemy_database_uri
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
app.config["SQLALCHEMY_POOL_TIMEOUT"] = 30
The documentation only states "Specifies the connection timeout for the pool. Defaults to 10." and I don't even know the unit of this 10: is it seconds or milliseconds?
The unit is seconds, as can be seen in the later documentation: Configuration — Flask-SQLAlchemy Documentation (2.3).
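Note that SQLALCHEMY_POOL_TIMEOUT feeds into SQLAlchemy's pool_timeout, i.e. how many seconds the pool waits for a free connection before giving up; it is not a per-statement timeout. A minimal plain-SQLAlchemy sketch of the same setting (the connection URI is hypothetical):
from sqlalchemy import create_engine

# pool_timeout is the engine-level equivalent of SQLALCHEMY_POOL_TIMEOUT (in seconds)
engine = create_engine("mysql://user:password@db-host/mydb",  # hypothetical URI
                       pool_timeout=30)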

Cassandra Python driver OperationTimedOut issue

I have a Python script which is used to interact with Cassandra via the DataStax Python driver.
It has been running since March 14th, 2016 and had no problems until today.
2016-06-02 13:53:38,362 ERROR ('Unable to connect to any servers', {'172.16.47.155': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})
2016-06-02 13:54:18,362 ERROR ('Unable to connect to any servers', {'172.16.47.155': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})
Below is the function used for creating a session; I shut down the session (session.shutdown()) every time a query is done. (Every day we have fewer than 100 queries from the subscriber side, so I chose to build the connection, do the query, and close it rather than keep the connection alive.)
The session is not shared between threads or processes. If I invoke the function below in a Python console, it connects to the DB properly, but the running script cannot connect to the DB anymore.
Can anyone help or shed some light on this issue? Thanks.
def get_cassandra_session(stat=None):
    """Creates the cluster and gets the session based on the keyspace."""
    # be aware that the session cannot be shared between threads/processes
    # or it will raise an OperationTimedOut exception
    if config.CLUSTER_HOST2:
        cluster = cassandra.cluster.Cluster([config.CLUSTER_HOST1, config.CLUSTER_HOST2])
    else:
        # if only one address is available, we have to use an older protocol version
        cluster = cassandra.cluster.Cluster([config.CLUSTER_HOST1], protocol_version=2)
    if stat and type(stat) == BatchStatement:
        retry_policy = cassandra.cluster.RetryPolicy()
        retry_policy.on_write_timeout(BatchStatement, ConsistencyLevel, WriteType.BATCH_LOG,
                                      ConsistencyLevel.ONE, ConsistencyLevel.ONE, retry_num=0)
        cluster.default_retry_policy = retry_policy
    session = cluster.connect(config.KEY_SPACE)
    session.default_timeout = 30.0
    return session
Specs:
python 2.7
Cassandra 2.1.11
Quote from the DataStax docs:
The operation took longer than the specified (client-side) timeout to complete. This is not an error generated by Cassandra, only the driver.
The problem is that I didn't touch the driver. I set the default timeout to 30.0 seconds, so why did it time out after 5 seconds (as the log says)?
The default connect timeout is five seconds. In this case you would need to set Cluster.connect_timeout. The Session default_timeout applies to execution requests.
It's still a bit surprising when any TCP connection takes more than five seconds to establish. One other thing to check would be monkey patching. Did something in the application change patching for Gevent or Eventlet? That could cause a change in default behavior for the driver.
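For example, a minimal sketch of raising the connection timeout on the question's cluster (assuming a driver version whose Cluster constructor accepts connect_timeout; config is the question's own settings module, and 30 seconds is just an illustrative value):
import cassandra.cluster

# connect_timeout (seconds) covers establishing connections to nodes;
# default_timeout covers executing requests on an open session
cluster = cassandra.cluster.Cluster([config.CLUSTER_HOST1], connect_timeout=30)
session = cluster.connect(config.KEY_SPACE)
session.default_timeout = 30.0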
I've learned that the gevent module interferes with the cassandra-driver
cassandra-driver (3.10)
gevent (1.1.1)
Uninstalling gevent solved the problem for me
pip uninstall gevent

boto.sqs connect to non-aws endpoint

I'm currently in need of connecting to a fake_sqs server for dev purposes, but I can't find an easy way to specify an endpoint for the boto.sqs connection. In Java and Node.js there are ways to specify the queue endpoint, and by passing something like 'localhost:someport' I can connect to my own SQS-like instance. I've tried the following with boto:
fake_region = regioninfo.SQSRegionInfo(name=name, endpoint=endpoint)
conn = fake_region.connect(aws_access_key_id="TEST", aws_secret_access_key="TEST",
                           port=9324, is_secure=False)
and then:
queue = connAmazon.get_queue('some_queue')
but it fails to retrieve the queue object; it returns None. Has anyone managed to connect to their own SQS-like instance?
Here's how to create an SQS connection that connects to fake_sqs:
region = boto.sqs.regioninfo.SQSRegionInfo(
    connection=None,
    name='fake_sqs',
    endpoint='localhost',  # or wherever fake_sqs is running
    connection_cls=boto.sqs.connection.SQSConnection,
)
conn = boto.sqs.connection.SQSConnection(
    aws_access_key_id='fake_key',
    aws_secret_access_key='fake_secret',
    is_secure=False,
    port=4568,  # or wherever fake_sqs is running
    region=region,
)
region.connection = conn
# you can now work with conn
# conn.create_queue('test_queue')
Be aware that, at the time of this writing, the fake_sqs library does not respond correctly to GET requests, which is how boto makes many of its requests. You can install a fork that has patched this functionality here: https://github.com/adammck/fake_sqs
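For completeness, a short usage sketch with the conn object from the answer above (the queue name and message body are arbitrary, and this assumes your fake_sqs fork supports these operations):
from boto.sqs.message import Message

# create a queue on the fake_sqs endpoint and round-trip one message
queue = conn.create_queue('test_queue')
message = Message()
message.set_body('hello from boto')
queue.write(message)

received = queue.read()
if received is not None:
    print(received.get_body())
    queue.delete_message(received)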

App Engine local datastore content does not persist

I'm running some basic test code, with web.py and GAE (Windows 7, Python27). The form enables messages to be posted to the datastore. When I stop the app and run it again, any data posted previously has disappeared. Adding entities manually using the admin (http://localhost:8080/_ah/admin/datastore) has the same problem.
I tried setting the path in the Application Settings using Extra flags:
--datastore_path=D:/path/to/app/
(I wasn't sure about the syntax there.) It had no effect. I searched my computer for *.datastore and couldn't find any files either, which seems suspect, although the data is obviously being stored somewhere while the app is running.
from google.appengine.ext import db
import web

urls = (
    '/', 'index',
    '/note', 'note',
    '/crash', 'crash'
)

render = web.template.render('templates/')

class Note(db.Model):
    content = db.StringProperty(multiline=True)
    date = db.DateTimeProperty(auto_now_add=True)

class index:
    def GET(self):
        notes = db.GqlQuery("SELECT * FROM Note ORDER BY date DESC LIMIT 10")
        return render.index(notes)

class note:
    def POST(self):
        i = web.input('content')
        note = Note()
        note.content = i.content
        note.put()
        return web.seeother('/')

class crash:
    def GET(self):
        import logging
        logging.error('test')
        crash

app = web.application(urls, globals())

def main():
    app.cgirun()

if __name__ == '__main__':
    main()
UPDATE:
When I run it via command line, I get the following:
WARNING 2012-04-06 19:07:31,266 rdbms_mysqldb.py:74] The rdbms API is not available because the MySQLdb library could not be loaded.
INFO 2012-04-06 19:07:31,778 appengine_rpc.py:160] Server: appengine.google.com
WARNING 2012-04-06 19:07:31,783 datastore_file_stub.py:513] Could not read datastore data from c:\users\amy\appdata\local\temp\dev_appserver.datastore
WARNING 2012-04-06 19:07:31,851 dev_appserver.py:3394] Could not initialize images API; you are likely missing the Python "PIL" module. ImportError: No module named _imaging
INFO 2012-04-06 19:07:32,052 dev_appserver_multiprocess.py:647] Running application dev~palimpsest01 on port 8080: http://localhost:8080
INFO 2012-04-06 19:07:32,052 dev_appserver_multiprocess.py:649] Admin console is available at: http://localhost:8080/_ah/admin
Suggesting that the datastore... didn't install properly?
As of 1.6.4, we stopped saving the datastore after every write. This method did not work when simulating the transactional model found in the High Replication Datastore (you would lose the last couple of writes). It is also horribly inefficient. We changed it so the datastore dev stub flushes all writes and saves its state on shutdown. It sounds like the dev_appserver is not shutting down correctly. You should see:
Applying all pending transactions and saving the datastore
in the logs when shutting down the server (see the source code). If you don't, it means that the dev_appserver is not being shut down cleanly (with a TERM signal or KeyboardInterrupt).
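On the --datastore_path attempt from the question: for the old Python dev_appserver the flag is expected to point at a file rather than a directory, so as a hedged sketch (the file name is arbitrary) the extra flag would look something like:
--datastore_path=D:/path/to/app/app.datastore
Even with an explicit path, the data only persists if the server is shut down cleanly as described above.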