NotSupportedError when trying to build primary index in N1QL in Couchbase Python SDK - python-2.7

I'm trying to get into the new N1QL Queries for Couchbase in Python.
I got my database set up in Couchbase 4.0.0.
My initial try was to retreive all documents like this:
from couchbase.bucket import Bucket
bucket = Bucket('couchbase://localhost/dafault')
rv = bucket.n1ql_query('CREATE PRIMARY INDEX ON default').execute()
for row in bucket.n1ql_query('SELECT * FROM default'):
print row
But this produces a OperationNotSupportedError:
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 2357, in <module>
globals = debugger.run(setup['file'], None, None, is_module)
File "/Applications/PyCharm.app/Contents/helpers/pydev/pydevd.py", line 1777, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/Users/my_user/python_tests/test_n1ql.py", line 9, in <module>
rv = bucket.n1ql_query('CREATE PRIMARY INDEX ON default').execute()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/couchbase/n1ql.py", line 215, in execute
for _ in self:
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/couchbase/n1ql.py", line 235, in __iter__
self._start()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/couchbase/n1ql.py", line 180, in _start
self._mres = self._parent._n1ql_query(self._params.encoded)
couchbase.exceptions.NotSupportedError: <RC=0x13[Operation not supported], Couldn't schedule n1ql query, C Source=(src/n1ql.c,82)>
Here the version numbers of everything I use:
Couchbase Server: 4.0.0
couchbase python library: 2.0.2
cbc: 2.5.1
python: 2.7.8
gcc: 4.2.1
Anyone an idea what might have went wrong here? I could not find any solution to this problem up to now.
There was another ticket for node.js where the same issue happened. There was a proposal to enable n1ql for the specific bucket first. Is this also needed in python?

It would seem you didn't configure any cluster nodes with the Query or Index services. As such, the error returned is one that indicates no nodes are available.

I also got similar error while trying to create primary index.
Create a primary index...
Traceback (most recent call last):
File "post-upgrade-test.py", line 45, in <module>
mgr.n1ql_index_create_primary(ignore_exists=True)
File "/usr/local/lib/python2.7/dist-packages/couchbase/bucketmanager.py", line 428, in n1ql_index_create_primary
'', defer=defer, primary=True, ignore_exists=ignore_exists)
File "/usr/local/lib/python2.7/dist-packages/couchbase/bucketmanager.py", line 412, in n1ql_index_create
return IxmgmtRequest(self._cb, 'create', info, **options).execute()
File "/usr/local/lib/python2.7/dist-packages/couchbase/_ixmgmt.py", line 160, in execute
return [x for x in self]
File "/usr/local/lib/python2.7/dist-packages/couchbase/_ixmgmt.py", line 144, in __iter__
self._start()
File "/usr/local/lib/python2.7/dist-packages/couchbase/_ixmgmt.py", line 132, in _start
self._cmd, index_to_rawjson(self._index), **self._options)
couchbase.exceptions.NotSupportedError: <RC=0x13[Operation not supported], Couldn't schedule ixmgmt operation, C Source=(src/ixmgmt.c,98)>
Adding query and index node to the cluster solved the issue.

Related

How to run SHOW PARTITIONS on hive table using pyspark?

I am trying to run SHOW PARTITIONS on hive table using pyspark, but it is failing with the below error. I am using dataproc cluster on GCP to run pyspark job.
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
Traceback (most recent call last):
File "/tmp/test/pyspark_test.py", line 64, in <module>
sqlContext.sql(
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/context.py", line 433, in sql
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 723, in sql
File "/usr/lib/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 117, in deco
pyspark.sql.utils.AnalysisException: Table not found for 'SHOW PARTITIONS': dbname.tbname; line 1 pos 0;
'ShowPartitions
+- 'UnresolvedTable [dbname, tbname], SHOW PARTITIONS
I tried below two approaches, but none of the approach works and gives UnresolvedTable error. Please help
spark = SparkSession.builder.enableHiveSupport().appName("test").getOrCreate()
spark.sql("""SHOW PARTITIONS dbname.tbname""")
and
spark = SparkSession.builder.appName("test").getOrCreate()
sc = spark.sparkContext
sqlContext = HiveContext(sc)
sqlContext.sql("""SHOW PARTITIONS dbname.tbname""")
After searching a bit, it seems job is unable to find hive-site.xml (not sure!). If thats the case, how to pass this file to the job?

PyMongo 3 and ServerSelectionTimeoutError while getting data from Mongodb

This seems like an old solved problem here and here and here but Still I am getting this error.I create my db on Docker.And It worked only one time.Before this, I re-created db, did "connect =false",added wait, downgraded pymongo, did previous solutions etc. I stuck.
Python 3.8.0, Pymongo 3.9.0
from pymongo import MongoClient
import pprint
client = MongoClient('mongodb://192.168.1.100:27017/',
username='admin',
password='psw',
authSource='myappdb',
authMechanism='SCRAM-SHA-1',
connect=False)
db = client['myappdb']
serverStatusResult=db.command("serverStatus")
pprint(serverStatusResult)
and I am getting ServerSelectionTimeoutError
Traceback (most recent call last):
File "C:\Users\ME\eclipse2019-workspace\exdjango\exdjango__init__.py",
line 12, in
serverStatusResult=db.command("serverStatus")
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\database.py",
line 610, in command
with self.client._socket_for_reads(
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\contextlib.py",
line 113, in __enter
return next(self.gen)
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\mongo_client.py",
line 1099, in _socket_for_reads
server = topology.select_server(read_preference)
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\topology.py",
line 222, in select_server
return random.choice(self.select_servers(selector,
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\topology.py",
line 182, in select_servers
server_descriptions = self._select_servers_loop(
File "C:\Users\ME\AppData\Local\Programs\Python\Python38-32\lib\site-packages\pymongo\topology.py",
line 198, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: 192.168.1.100:27017: timed out
Your connection looks a little misconfigured. Firstly you have half connection string, half parameter format. I'd suggest you stick with one or the other.
Your auth database is usually seperate to your actual databases (and it's usually called admin). Check this is correct.
There's no particular need to specify the authMechanism assuming you are using MongoDB 3.0 or later.
The connect=False is likely a red herring.
So I would try one of either:
client = MongoClient('mongodb://admin:psw#192.168.1.100:27017/myappdb?authSource=admin')
or
client = MongoClient(host='192.168.1.100',
port=27017,
username='admin',
password='psw',
authSource='admin')

ERAlchemy cannot connect to database

I don't understand the syntax to call the render feature of ERAlchemy (https://pypi.org/project/ERAlchemy see "Usage for Python"). I am using Python 2.7, sqlite3, and PyCharm. I have ERAlchemy, GraphViz, and PyGraphViz installed.
I am trying the following, but it cannot connect to the database:
from eralchemy import render_er
render_er("sqlite:///C:\\Users\\myname\\Documents\\Work\\pythonsqlite.db", 'erd_from_sqlite.png')
And this is the error:
Traceback (most recent call last):
File "C:/Users/myname/Documents/Work/_sql_functions_rev0.py", line 81, in <module>
render_er("sqlite:///C:\\Users\\myname\\Documents\\Work\\pythonsqlite.db", 'erd_from_sqlite.png')
File "C:\Python27\ArcGISx6410.6\lib\site-packages\eralchemy\main.py", line 236, in render_er
intermediary_to_output(tables, relationships, output)
File "C:\Python27\ArcGISx6410.6\lib\site-packages\eralchemy\main.py", line 75, in intermediary_to_schema
graph.draw(path=output, prog='dot', format=extension)
File "C:\Python27\ArcGISx6410.6\lib\site-packages\pygraphviz\agraph.py", line 1474, in draw
data = self._run_prog(prog, args)
File "C:\Python27\ArcGISx6410.6\lib\site-packages\pygraphviz\agraph.py", line 1308, in _run_prog
runprog = r'"%s"' % self._get_prog(prog)
File "C:\Python27\ArcGISx6410.6\lib\site-packages\pygraphviz\agraph.py", line 1295, in _get_prog
raise ValueError("Program %s not found in path." % prog)
ValueError: Program dot not found in path.
Ah! Found the answer here
had to find the folder with "dot.exe" and add it to the environment variables -> system variables -> path

"xml.sax._exceptions.SAXReaderNotAvailable: No parsers found" when run in jenkins

So I'm working towards having automated staging deployments via Jenkins and Ansible. Part of this is using a script called ec2.py from ansible in order to dynamically retrieve a list of matching servers to deploy to.
SSH-ing into the Jenkins server and running the script from the jenkins user, the script runs as expected. However, running the script from within jenkins leads to the following error:
ERROR: Inventory script (ec2/ec2.py) had an execution error: Traceback (most recent call last):
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/ec2/ec2.py", line 1262, in <module>
Ec2Inventory()
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/ec2/ec2.py", line 159, in __init__
self.do_api_calls_update_cache()
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/ec2/ec2.py", line 386, in do_api_calls_update_cache
self.get_instances_by_region(region)
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/ec2/ec2.py", line 417, in get_instances_by_region
reservations.extend(conn.get_all_instances(filters = { filter_key : filter_values }))
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/.local/lib/python2.7/site-packages/boto/ec2/connection.py", line 585, in get_all_instances
max_results=max_results)
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/.local/lib/python2.7/site-packages/boto/ec2/connection.py", line 681, in get_all_reservations
[('item', Reservation)], verb='POST')
File "/opt/bitnami/apps/jenkins/jenkins_home/jobs/Deploy API/workspace/deploy/.local/lib/python2.7/site-packages/boto/connection.py", line 1181, in get_list
xml.sax.parseString(body, h)
File "/usr/lib/python2.7/xml/sax/__init__.py", line 43, in parseString
parser = make_parser()
File "/usr/lib/python2.7/xml/sax/__init__.py", line 93, in make_parser
raise SAXReaderNotAvailable("No parsers found", None)
xml.sax._exceptions.SAXReaderNotAvailable: No parsers found
I don't know too much about python, so I'm not sure how to debug this issue further.
So it turns out the issue was to do with Jenkins overwriting the default LD_LIBRARY_PATH variable. By unsetting that variable before running python, I was able to make the python app work!

Solr issues with searching

I was using Apache Solr for quite some time and only recently started running into some severe issues with it. I'm using it with haystack and a django project. When I do it from manage.py shell i'm getting the below:
>>> from haystack.query import SearchQuerySet
>>> emps = SearchQuerySet().filter(django_ct='web.employer').filter(name__icontains='Mi')[:10]
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/haystack/query.py", line 241, in __getitem__
self._fill_cache(start, bound)
File "/usr/local/lib/python2.7/dist-packages/haystack/query.py", line 140, in _fill_cache
results = self.query.get_results(**kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/__init__.py", line 469, in get_results
self.run(**kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/solr_backend.py", line 501, in run
results = self.backend.search(final_query, **search_kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/__init__.py", line 47, in wrapper
return func(obj, query_string, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/solr_backend.py", line 202, in search
raw_results = self.conn.search(query_string, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 578, in search
response = self._select(params)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 308, in _select
return self._send_request('get', path)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 293, in _send_request
error_message = self._extract_error(resp)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 372, in _extract_error
reason, full_html = self._scrape_response(resp.headers, resp.content)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 404, in _scrape_response
p_nodes = body_node.cssselect('p')
AttributeError: 'NoneType' object has no attribute 'cssselect'
I tried reinstalling haystack, lxml, cssselect, pysolr and still i'm getting these errors. Is there anything else I can try for this? Thanks for any help!
I also tried reading few other SO questions including this:
XML error object has no attribute 'cssselect'
Seems like the issue is with pysolr. You might find some help here.
I had the same issue persist even after bringing up pysolr and lxml to latest version.
Turned out it was because I was not using haystack generated schema which has a few additional fields compared to the default solr one.
You can confirm if this is the case by looking at your solr logs.
It is an issue with pysolr. It hasn't been fixed till 3.3.0.
The only alternative would be to override the pysolr code and make adjustments for when Solr returns a reponse status!=200.
You can check if the response has body attribute or not and make adjustments according to that.