Solr Admin not found: "missing core name in path" - django

I'm running solr on dotcloud for my django app (using haystack) and am running into some trouble. I receive a 404 "missing core name in path" message when trying to access the admin, despite the fact that--as far as I can tell--I only have a single core.
Here is my schema.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="false">
<!--
adminPath: RequestHandler path to manage cores.
If 'null' (or absent), cores will not be manageable via request handler
-->
<cores adminPath="/admin/cores" defaultCoreName="collection1">
<core name="collection1" instanceDir="." shard="shard1"/>
</cores>
</solr>
When I point my browser at .../solr/collection1/admin, still nothing. But since I've only got a single core shouldn't I just be able to go to .../solr/admin?
I've followed the steps on the haystack "getting started tutorial" as well as the dotcloud solr service docs.
The relevant code in my settings.py:
HAYSTACK_SITECONF = 'gigmash.search_sites'
HAYSTACK_SEARCH_ENGINE = 'solr'
HAYSTACK_SOLR_URL = 'http://35543365.dotcloud.com/solr' #provided by dotcloud
HAYSTACK_INCLUDE_SPELLING = True
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 10
And here's the error I get when I try and test in the interpreter:
>>> from haystack.query import SearchQuerySet
>>> sqs = SearchQuerySet().all()
>>> sqs.count()
Failed to query Solr using '*:*': [Reason: None]
java.lang.NullPointerException at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:91) at javax.servlet.http.HttpServlet.service(HttpServlet.java:617) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:297) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
I find it especially amusing that my error output says "reason: none" :-P
Perhaps also relevant: despite having run ./manage.py build_solr_schema and ./manage.py rebuild_index (and having rebuild index (accurately) report the number of models indexed), a data/ directory has not been created in my solr/ directory.
Any help would be greatly appreciated. Total newb with solr/haystack/dotcloud/everything!

Solved.
It ended up being an issue with my dotcloud.yml file.

Related

Error creating new content types when Uwsgi kills a worker

I am Running a django application with uwsgi, I am observing a case Sometimes, when uwsgi kills it's worker and and respawns a worker.
Thu Aug 8 11:02:33 2019 - worker 1 killed successfully (pid: 25499)
Thu Aug 8 11:02:33 2019 - Respawned uWSGI worker 1 (new pid: 4192)
And then the next request received by the django application returns 500 error with Exception:
RuntimeError: Error creating new content types. Please make sure contenttypes is migrated before trying to migrate apps individually.
at line
content_type = ContentType.objects.get_for_model(Coupon)
EDIT:
Django version: 1.8.6
Uwsgi Version: 2.0.12
ISSUE
I went through multiple sources and Content Types code, I found the reason for this case.
The Error occurred on this line
content_type = ContentType.objects.get_for_model(Coupon)
here, get_for_model method first fetches the object from cache. If it fails, it gets from DB, else creates the record in DB.
In my case, the Connection to DB was disrupted leading to OperationalError and for this error it raises RuntimeError with the Error description mentioned in question.
The same Error can also be returned in case of IntegrityError or ProgrammingError.
OperationalError
Exception raised for errors that are related to the database's operation and not necessarily under the control of the programmer, e.g. an unexpected disconnect occurs, the data source name is not found, a transaction could not be processed, a memory allocation error occurred during processing, etc. It must be a subclass of DatabaseError.
IntegrityError
Exception raised when the relational integrity of the database is affected, e.g. a foreign key check fails. It must be a subclass of DatabaseError.
ProgrammingError
Exception raised for programming errors, e.g. table not found or already exists, syntax error in the SQL statement, wrong number of parameters specified, etc. It must be a subclass of DatabaseError.
Snippet from Django 1.8.6 Source-code:
def get_for_model(self, model, for_concrete_model=True):
"""
Returns the ContentType object for a given model, creating the
ContentType if necessary. Lookups are cached so that subsequent lookups
for the same model don't hit the database.
"""
opts = self._get_opts(model, for_concrete_model)
try:
return self._get_from_cache(opts)
except KeyError:
pass
# The ContentType entry was not found in the cache, therefore we
# proceed to load or create it.
try:
try:
# We start with get() and not get_or_create() in order to use
# the db_for_read (see #20401).
ct = self.get(app_label=opts.app_label, model=opts.model_name)
except self.model.DoesNotExist:
# Not found in the database; we proceed to create it. This time we
# use get_or_create to take care of any race conditions.
ct, created = self.get_or_create(
app_label=opts.app_label,
model=opts.model_name,
)
except (OperationalError, ProgrammingError, IntegrityError):
# It's possible to migrate a single app before contenttypes,
# as it's not a required initial dependency (it's contrib!)
# Have a nice error for this.
raise RuntimeError(
"Error creating new content types. Please make sure contenttypes "
"is migrated before trying to migrate apps individually."
)
self._add_to_cache(self.db, ct)
return ct
SOLUTION
I need to upgrade my Uwsgi to or above 2.0.14 as before this version, Uwsgi kills any old worker to only give resources to Cheap Workers.

Django, Apache2 on Google Kubernetes Engine writing Opencensus Traces to Stackdriver Trace

I have a Django web app served from Apache2 with mod_wsgi in docker containers running on a Kubernetes cluster in Google Cloud Platform, protected by Identity-Aware Proxy. Everything is working great, but I want to send GCP Stackdriver traces for all requests without writing one for each view in my project. I found middleware to handle this, using Opencensus. I went through this documentation, and was able to manually generate traces that exported to Stackdriver Trace in my project by specifying the StackdriverExporter and passing the project_id parameter as the Google Cloud Platform Project Number for my project.
Now to make this automatic for ALL requests, I followed the instructions to set up the middleware. In settings.py, I added the module to INSTALLED_APPS, MIDDLEWARE, and set up the OPENCENSUS_TRACE dictionary of options. I also added the OPENCENSUS_TRACE_PARAMS. This works great with the default exporter 'opencensus.trace.exporters.print_exporter.PrintExporter', as I can see the Trace and Span information, including Trace ID and all details in my Apache2 web server logs. However, I want to send these to my Stackdriver Trace processor for analysis.
I tried setting the EXPORTER parameter to opencensus.trace.exporters.stackdriver_exporter.StackdriverExporter, which works when run manually from the shell, as long as you supply the project number.
When it is set up to use StackdriverExporter, the web page will not respond load, the health check starts to fail, and ultimately the web page comes back with a 502 error, stating I should try again in 30 seconds (I believe the Identity-Aware Proxy is generating this error, once it detects the failed health check), but the server generates no errors, and there are no logs in access or errors for Apache2.
There is another dictionary in settings.py named OPENCENSUS_TRACE_PARAMS, which I presume is needed to determine which project number the exporter should be using. The example has GCP_EXPORTER_PROJECT set as None, and SERVICE_NAME set as 'my_service'.
What options do I need to set to get the exporter to send back to Stackdriver instead of printing to logs? Do you have any idea about how I can set this up?
settings.py
MIDDLEWARE = (
...
'opencensus.trace.ext.django.middleware.OpencensusMiddleware',
)
INSTALLED_APPS = (
...
'opencensus.trace.ext.django',
)
OPENCENSUS_TRACE = {
'SAMPLER': 'opencensus.trace.samplers.probability.ProbabilitySampler',
'EXPORTER': 'opencensus.trace.exporters.stackdriver_exporter.StackdriverExporter', # This one just makes the server hang with no response or error and kills the health check.
'PROPAGATOR': 'opencensus.trace.propagation.google_cloud_format.GoogleCloudFormatPropagator',
# 'EXPORTER': 'opencensus.trace.exporters.print_exporter.PrintExporter', # This one works to print the Trace and Span with IDs and details in the logs.
}
OPENCENSUS_TRACE_PARAMS = {
'BLACKLIST_PATHS': ['/health'],
'GCP_EXPORTER_PROJECT': 'my_project_number', # Should this be None like the example, or Project ID, or Project Number?
'SAMPLING_RATE': 0.5,
'SERVICE_NAME': 'my_service', # Not sure if this is my app name or some other service name.
'ZIPKIN_EXPORTER_HOST_NAME': 'localhost', # Are the following even necessary, or are they causing a failure that is not detected by Apache2?
'ZIPKIN_EXPORTER_PORT': 9411,
'ZIPKIN_EXPORTER_PROTOCOL': 'http',
'JAEGER_EXPORTER_HOST_NAME': None,
'JAEGER_EXPORTER_PORT': None,
'JAEGER_EXPORTER_AGENT_HOST_NAME': 'localhost',
'JAEGER_EXPORTER_AGENT_PORT': 6831
}
Here's an example (I prettified the format for readability) of the Apache2 log when it is set to use the PrintExporter:
[Fri Feb 08 09:00:32.427575 2019]
[wsgi:error]
[pid 1097:tid 139801302882048]
[client 10.48.0.1:43988]
[SpanData(
name='services.views.my_view',
context=SpanContext(
trace_id=e882f23e49e34fc09df621867d753532,
span_id=None,
trace_options=TraceOptions(enabled=True),
tracestate=None
),
span_id='bcbe7b96906a482a',
parent_span_id=None,
attributes={
'http.status_code': '200',
'http.method': 'GET',
'http.url': '/',
'django.user.name': ''
},
start_time='2019-02-08T17:00:29.845733Z',
end_time='2019-02-08T17:00:32.427455Z',
child_span_count=0,
stack_trace=None,
time_events=[],
links=[],
status=None,
same_process_as_parent_span=None,
span_kind=1
)]
Thanks in advance for any tips, assistance, or troubleshooting advice!
Edit 2019-02-08 6:56 PM UTC:
I found this in the middleware:
# Initialize the exporter
transport = convert_to_import(settings.params.get(TRANSPORT))
if self._exporter.__name__ == 'GoogleCloudExporter':
_project_id = settings.params.get(GCP_EXPORTER_PROJECT, None)
self.exporter = self._exporter(
project_id=_project_id,
transport=transport)
elif self._exporter.__name__ == 'ZipkinExporter':
_service_name = self._get_service_name(settings.params)
_zipkin_host_name = settings.params.get(
ZIPKIN_EXPORTER_HOST_NAME, 'localhost')
_zipkin_port = settings.params.get(
ZIPKIN_EXPORTER_PORT, 9411)
_zipkin_protocol = settings.params.get(
ZIPKIN_EXPORTER_PROTOCOL, 'http')
self.exporter = self._exporter(
service_name=_service_name,
host_name=_zipkin_host_name,
port=_zipkin_port,
protocol=_zipkin_protocol,
transport=transport)
elif self._exporter.__name__ == 'TraceExporter':
_service_name = self._get_service_name(settings.params)
_endpoint = settings.params.get(
OCAGENT_TRACE_EXPORTER_ENDPOINT, None)
self.exporter = self._exporter(
service_name=_service_name,
endpoint=_endpoint,
transport=transport)
elif self._exporter.__name__ == 'JaegerExporter':
_service_name = self._get_service_name(settings.params)
self.exporter = self._exporter(
service_name=_service_name,
transport=transport)
else:
self.exporter = self._exporter(transport=transport)
The exporter is now named StackdriverExporter, instead of GoogleCloudExporter. I set up a class in my app named GoogleCloudExporter that inherits StackdriverExporter, and updated my settings.py to use GoogleCloudExporter, but it didn't seem to work, I wonder if there is other code referencing these old naming schemes, possibly for the transport. I'm searching the source code for clues... This at least tells me I can get rid of the ZIPKIN and JAEGER param options, as this is determined on the EXPORTER param.
Edit 2019-02-08 11:58 PM UTC:
I scrapped Apache2 to isolate the problem and just set my docker image to use Django's built in webserver CMD ["python", "/path/to/manage.py", "runserver", "0.0.0.0:80"] and it works! When I go to the site, it writes traces to Stackdriver Trace for each request, the Span name is the module and method being executed.
Somehow Apache2 is not being allowed to send these, but I can do so from the shell when running as root. I'm adding Apache2 and mod-wsgi tags to the question, because I have a funny feeling this has to do with forking child processes in Apache2 and mod-WSGI. Would it be the child process being unable to be created as apache2's child process is sandboxed, or could this be a permissions thing? It seems strange, because it is just calling python modules, no external system OS binaries, that I am aware of. Any other ideas would be greatly appreciated!
I had this problem while using gunicorn with gevent as the worker class. To resolve and get cloud traces working the solution was to monkey patch grpc like so
from gevent import monkey
monkey.patch_all()
import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()
See https://github.com/grpc/grpc/issues/4629#issuecomment-376962677

haystack whoosh no results - rebuild_index shows Indexing [number] <django.utils.functional.__proxy__ object at [memory location] >

when I run ./manage.py rebuild_index I get the readout for example:
Indexing 4574 <django.utils.functional.__proxy__ object at at 0x1aab690> .
Having seen other users' readouts, this should show the name of the search index/model instead and I am wondering if this could be part of the explanation as to why I have been experiencing no search results on the website and no objects appear to be indexed when performing:
>>> from haystack.query import SearchQuerySet
>>> sqs = SearchQuerySet().all()
>>> sqs.count()
I did not initially have a
def _unicode_self():
return self.name
on the models I am indexing but then I added it and nothing seemed to change even after doing rebuild_index
This was GitHub pull request #746 for Django Haystack, which has now been merged.
I was seeing this same issue on my local (dev) setup. Updating solved the "functional proxy" placeholder issue for me.
I ran the following command:
pip install -e git+git://github.com/toastdriven/django-haystack.git#master#egg=django-haystack
You may need to tweak the command to suit your own needs and/or environment.

Why am I getting a 404 for django_rq admin screen?

I have django-rq set up and running (jobs are queued, I can run manage.py rqworker) but I cannot seem to get the admin screen working - I consistently get a 404 for the /admin/django_rq url.
I have django-extensions installed, and the show_urls commands shows that the URL is registered:
$ python manage.py show_urls
...
/admin/django_rq/ django_rq.views.stats rq_home
/admin/django_rq/queues/<queue_index>/ django_rq.views.jobs rq_jobs
/admin/django_rq/queues/<queue_index>/<job_id>/ django_rq.views.job_detail rq_job_detail
/admin/django_rq/queues/<queue_index>/<job_id>/delete/ django_rq.views.delete_job rq_delete_job
...
I am logged in as someone who is a staff member, so the #staff_member_required decorator on the stats view should be working.
(r'^admin/django_rq/', include('django_rq.urls'))
must be before
url(r'^admin/', include(admin.site.urls))
I've updated the installation instructions in django-rq's README.rst to drop the "admin" prefix from django-rq's URL pattern so Hugo's issue shouldn't happen again.

django generator function not working with real server

I have some code written in django/python. The principal is that the HTTP Response is a generator function. It spits the output of a subprocess on the browser window line by line. This works really well when I am using the django test server. When I use the real server it fails / basically it just beachballs when you press submit on the page before.
#condition(etag_func=None)
def pushviablah(request):
if 'hostname' in request.POST and request.POST['hostname']:
hostname = request.POST['hostname']
command = "blah.pl --host " + host + " --noturn"
return HttpResponse( stream_response_generator( hostname, command ), mimetype='text/html')
def stream_response_generator( hostname, command ):
proc = subprocess.Popen(command.split(), 0, None, subprocess.PIPE, subprocess.PIPE, subprocess.PIPE )
yield "<pre>"
var = 1
while (var == 1):
for line in proc.stdout.readline():
yield line
Anyone have any suggestions on how to get this working with on the real server? Or even how to debug why it is not working?
I discovered that the generator function is actually running but it has to complete before the httpresponse throws up a page onscreen. I don't want to have to wait for it to complete before the user sees output. I would like the user to see output as the subprocess progresses.
I'm wondering if this issue could be related to something in apache2 rather than django.
#evolution did you use gunicorn to deploy your app. If yes then you have created a service. I am having a similar kind of issue but with libreoffice. As much as I have researched I have found that PATH is overriding the command path present on your subprocess. I did not have a solution till now. If you bind your app with gunicorn in terminal then your code will also work.