How to troubleshoot 500 errors in Python on OpenShift using tail?

Problem
I want to troubleshoot 500 errors showing on the front end of a Python app on OpenShift using tail.
What I've Tried
rhc tail [appname]
This returns a lot of output (some of it more than 6 months old) that is hard to search through, along with 'too many files open' error messages.

Solution
It took me a while to figure this out, so I'm posting my current solution for posterity and my own reference.
I ran the following, which I think cleans up old logs:
rhc app tidy [appname]
I then ran a more specific tail request, which I think returns just the Python log file:
rhc tail -f app-root/logs/python.log [appname]
I can now see somewhat more relevant detail.
With this open in the shell, take the action on the front end of the website that causes the error, and the results will show live in the shell.
I am sure there are even more specific logs available to view, but this was the best I could find at the moment.
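If the live output is still too noisy, one option is to pipe it through a small filter. The following is only a sketch: the script name filter_errors.py and both patterns are my own assumptions (a Python traceback header and an access-log style line with a 5xx status), so adjust them to whatever your python.log actually contains.

```python
import re
import sys

# Assumed patterns, not taken from any OpenShift documentation:
# a traceback header, or a quoted request followed by a 5xx status code.
ERROR_PATTERNS = [
    re.compile(r"Traceback \(most recent call last\)"),
    re.compile(r'"\s5\d\d\s'),  # e.g. '... HTTP/1.1" 500 1234'
]


def error_lines(lines):
    """Yield only the lines that match one of ERROR_PATTERNS."""
    for line in lines:
        if any(p.search(line) for p in ERROR_PATTERNS):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # readline-based iteration so lines are handled as they arrive from the pipe
    for hit in error_lines(iter(sys.stdin.readline, "")):
        print(hit)
```

It could be used as: rhc tail -f app-root/logs/python.log [appname] | python filter_errors.py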

Related

Django Celery with Redis Issues on Digital Ocean App Platform

After quite a bit of trial and error and a step-by-step search for solutions, I thought I'd share the problems here and answer them myself based on what I found. There is not much documentation on this anywhere beyond small bits and pieces, so this will hopefully help others in the future.
Please note that this is specific to Django, Celery, Redis and the Digital Ocean App Platform.
This is mostly about the below errors and further resulting implications:
OSError: [Errno 38] Function not implemented
and
Cannot connect to redis://......
The first error happens when you try to run the celery command celery -A your_app worker --beat -l info
or similar on the App Platform. It appears that this is currently not supported on Digital Ocean. The second error occurs when you make any of a number of potential mistakes.

PART 1:
While Digital Ocean might remedy this in the future, here is a workaround. The problem is the unsupported execution pool. Google "celery execution pools" if you want to know more about them and how they work. The default one is prefork, but what you need is either gevent or eventlet. I went with the former for my purposes.
Whichever you pick, you will have to install it, as it doesn't come with celery by default. In my case it was: pip install gevent (and don't forget to add it to your requirements as well).
Once you have that, you can re-run the celery command, but note that gevent and beat are not supported within a single command (it will result in an error). Instead do the following:
celery -A your_app worker --pool=gevent -l info
and then separately (if you want to run beat that is) in another terminal/console
celery -A your_app beat -l info
In the first line you can also specify the concurrency like so: --concurrency=100. This is not required but useful. Read up on what it does, as that goes beyond the solution here.
PART 2:
In my specific case I tested the above locally (development) first to make sure it worked. The next issue was getting this into production. I use Redis as the db/broker.
In my specific setup I have most of my celery configuration in the the_main_app/celery/__init__.py file, but sometimes people put it directly into the_main_app/celery.py. Whichever you use, make sure that the REDIS_URL is set correctly. For development it usually looks something like this, where YOUR_VAR_NAME is then set as the broker:
import os
from celery import Celery
# Fall back to a local Redis instance when REDIS_URL is not set (development)
YOUR_VAR_NAME = os.environ.get('REDIS_URL', 'redis://localhost:6379')
app = Celery('the_main_app')
app.conf.broker_url = YOUR_VAR_NAME
The remaining settings are all documented on the "celery first steps with django" help page but are not relevant for what I am showing here.
PART 3:
When you set up your Redis database on the App Platform (which is very simple) you will see the connection details listed as 'public network' and 'VPC network'.
The celery documentation says to use the following URL format for production: redis://:password@hostname:port/db_number. This didn't work. If you are not using a yaml file, you can simply copy the entire connection string (select it from the dropdown!) from the Redis DB connection details, then set up an App-Level environment variable in your Digital Ocean project named REDIS_URL and paste in that entire string (and also encrypt it!).
The string should look something like this (rediss with two s's!):
rediss://USER:PASS@URL.db.ondigitalocean.com:PORT
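Since most of the "Cannot connect to redis://..." cases I hit came down to a malformed REDIS_URL, a quick sanity check before starting the worker can help. This is a minimal sketch using only the standard library; the function name check_redis_url and the rule that any non-local host should use rediss:// are my own assumptions, not something from the Celery or Digital Ocean docs.

```python
import os
from urllib.parse import urlparse

LOCAL_HOSTS = ("localhost", "127.0.0.1")


def check_redis_url(url):
    """Return a list of problems found in a Redis connection URL (empty if none)."""
    problems = []
    parts = urlparse(url)
    if parts.scheme not in ("redis", "rediss"):
        problems.append("scheme should be redis:// or rediss://, got %r" % parts.scheme)
    if not parts.hostname:
        problems.append("no hostname found")
    try:
        port = parts.port  # raises ValueError if the port is not a number
    except ValueError:
        port = None
    if port is None:
        problems.append("no valid port found")
    # Assumption: a managed, non-local database expects TLS (rediss://)
    if parts.scheme == "redis" and parts.hostname and parts.hostname not in LOCAL_HOSTS:
        problems.append("remote host with plain redis://; it may require rediss://")
    return problems


if __name__ == "__main__":
    for problem in check_redis_url(os.environ.get("REDIS_URL", "redis://localhost:6379")):
        print("REDIS_URL:", problem)
```

An empty result does not prove the connection will work; it only rules out the obvious formatting mistakes.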
You are almost done. The last step is to set up the workers. It was fine for me to run the PART 1 commands as console commands on the App Platform to test them, but eventually I set up a small worker (+ Add Component) for each line and pasted them into the Run Command.
That is basically the process step by step. Good luck!

appcfg.py request_logs certificate verify failed (_ssl.c:661)

We've been using appcfg.py request_logs to download GAE logs. Every once in a while it throws the error:
httplib2.SSLHandshakeError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:661)
But after a few tries it works; sometimes it also works after updating gcloud with gcloud components update. We thought it might be a network throttling issue of some kind and didn't give it much thought. Lately, though, we've been trying to figure out what is causing this.
The full command we use is:
appcfg.py request_logs -A testapp --version=20180321t073239 --severity=0 all_logs.log --append --no_cookies
It seems the error is related to the httplib2 library, but since it is used inside the appcfg.py calls we're not sure we should tamper with it.
Versions:
Python 2.7.13
Google Cloud SDK 196.0.0
app-engine-python 1.9.67

This has become more persistent, and I haven't been able to download logs for a few days now no matter how many times I try.
Looking at the download logs command I tried the same command again but without the --no_cookies flag to see what would happen.
appcfg.py request_logs -A testapp --version=20180321t073239 --severity=0 all_logs.log --append
I got the error:
Error 403: --- begin server output ---
You do not have permission to modify this app (app_id=u'e~testapp').
--- end server output ---
Which led me to the answer provided here https://stackoverflow.com/a/34694577/1394228 by @ninjahoahong. This worked for me and the logs were downloaded on the first try, in case someone faces the same issue.
There's also this Google Group post which I didn't try but seems like it does the same thing.
Not sure if removing the file ~/.appcfg_oauth2_tokens would have other effects, yet to find out.
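Because I'm not sure about side effects, a more cautious variant is to move the token file aside instead of deleting it, so it can be restored if something else breaks. A minimal sketch; the helper name backup_token_file and the .bak suffix are my own choices and not part of appcfg.py.

```python
import os


def backup_token_file(path="~/.appcfg_oauth2_tokens"):
    """Rename the cached token file to '<path>.bak'.

    Returns the backup path, or None if there was no token file to move.
    """
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        return None
    backup = path + ".bak"
    os.replace(path, backup)  # overwrites an older backup if one exists
    return backup


if __name__ == "__main__":
    print(backup_token_file())
```

To undo it, rename the .bak file back to its original name.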
Update:
I also found out that my httplib2, located at /Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/httplib2, was at version = "0.7.5". I upgraded it to version = '0.11.3' by running pip's upgrade command with that directory as the target:
sudo pip2 install --upgrade httplib2 -t /Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/httplib2/

Readthedocs local server stuck on Triggered state during build

I have installed a local instance of the Readthedocs server, but any time I try to build a GitHub repository the build gets stuck in the Triggered state.
There are no errors or exceptions, just regular info messages:
[25/Apr/2017 14:21:11] INFO [readthedocs.projects.utils:81] Running: 'ln -nsf /var/www/my-project/user_builds/test1/rtd-builds/latest /var/www/my-project/public_web_root/test1/en/latest' [/var/www/my-project]
[25/Apr/2017 14:21:11] INFO [readthedocs.projects.tasks:844] (Build) [test1:] Updating static metadata
Any idea what could be causing this issue?

I had this problem too, and there seem to be a lot of different things that can cause it: I've seen various postings about it on different forums, but none of the solutions posted helped me. The only posting I have bookmarked is this GitHub issue.
For me, the documentation would build if I ran the command python manage.py runserver 0.0.0.0:8000, but would get stuck in the Triggered state if I used my computer's IP address. The solution was to use the above command and add the following to readthedocs/settings/local_settings.py:
import os
# Set this to the root domain where this RTD installation will be running
PRODUCTION_DOMAIN = os.getenv('RTD_PRODUCTION_DOMAIN', '10.x.x.x:8000')
# Enable private Git doc repositories
ALLOW_PRIVATE_REPOS = True
Best of luck.

Error deploying solution using Octopus deploy

I have a web project written in Sitecore 8/uCommerce. I am using TeamCity to compile and package the project and Octopus Deploy to push it out. When I commit to SVN, TeamCity picks up the changes, compiles and packages them, and Octopus deploys the result to the Dev environment. All works well. However, when I try to promote to Test I get an error...
Error running conventions; running failure conventions... Fatal
10:24:19 Deployment on the Tentacle failed.
In the project I have a post deploy script (PostDeploy.ps1) to remove unwanted config files. There is only one line...
.\DeleteConfig.exe $OctopusEnvironmentName
I changed it to this from:
.\DeleteConfig.exe $OctopusParameters['Octopus.Environment.Name']
because of an article I read, but that didn't change the error. I have also tried:
.\DeleteConfig.exe $OctopusParameters['OctopusEnvironmentName']
Again no effect. If I comment out the line of code I no longer get the error.
I have been trying to fix this for some time now, and have read and followed the articles and posts I could find on the problem, but cannot find the fix.
A slight curveball is that this is the second project we deploy in this way. The first is also Sitecore/uCommerce and in the PostDeploy.ps1 the line
.\DeleteConfig.exe $OctopusParameters['Octopus.Environment.Name']
works perfectly.
Any help or pointers would be appreciated.

You don't need a post-deploy script, as there is a community task that cleans up any extra configuration files. It's at https://library.octopusdeploy.com/step-templates/9a2b84db-2940-4d9a-b61f-c82df35cee6c/actiontemplate-file-system-clean-configuration-transforms.
If you want to do it your way, I would simply use PowerShell like this:
Get-ChildItem -Filter Web.*.config | Remove-Item

Solr/Jetty confusion - how to get persistent service?

I'm on Ubuntu 12.04, using jetty (9_M4), solr (4.0.0) through django-haystack (2.0beta) installed in a django 1.4.2 site.
I've had to make a number of jumps through hoops to get this up and running, as there is very little documentation for getting solr 4.0 up and running in Ubuntu with django-haystack. But how hard could it be?
My main confusion is between what Jetty is doing, and what Solr is doing.
So, I installed Jetty via this tutorial making a small adjustment to the init file as I note in the comment on that tutorial. Jetty is now running, I can see it in browser, even after a reboot.
Great.
Move on to installing Solr via this tutorial, again with adjustments. Instead of:
cp -R apache-solr-4.0.0/example/solr /opt
I use:
cp -R apache-solr-4.0.0/example/* /opt/solr/
and therefore add the following to /etc/default/jetty:
JAVA_OPTIONS="-Dsolr.solr.home=/opt/solr/solr $JAVA_OPTIONS"
I can't exactly remember why I did that, but there was a reason at the time. I stop using that tutorial at that point, as I don't understand the solr concept of core very well, and I'm already flustered at how annoyingly difficult this is.
(For context, when I set up django-haystack 2.0 with solr 3.5 about 6 months ago it was terrifyingly easy and didn't require a separate jetty installation - all up took me about two hours)
Anyway, I go back to my Django installation, create the schema.xml, make the stopwords-en.txt changes, copy it across to /opt/solr/solr/collection1/conf.
I edit /opt/solr/solr/collection1/conf/solrconfig.xml to remove the reference to updateLog since any attempt I made to add version field to schema.xml failed dismally with some sort of character error. See here (lucene -solr-user mailing list) and here (django-haystack github) for more info on this.
Finally, I cd into /opt/solr and run it:
sudo java -jar start.jar
Ba-da-boom! I get some results (when I go to my django site and use the search I've set up). Fantastic. This is really great. Now I just need to make the starting of solr persistent.
I create an /etc/init/solr that looks like this:
description "Solr Search Server"
# Make sure the file system and network devices have started before
# we begin the daemon
start on (filesystem and net-device-up IFACE!=lo)
# Stop the event daemon on system shutdown
stop on shutdown
# Respawn the process on unexpected termination
respawn
# The meat and potatoes
exec /usr/bin/java -jar /opt/solr/start.jar >> /var/log/solr.log 2>&1
I restart the server and nothing - I can see solr running, but I'm not getting any results in my django search.
I remove the init file and try running from the cli again - yep, sweet.
So, my questions are:
What the hell have I done wrong?
How do I get solr to start at boot and respawn if it dies accidentally AND produce results through my Django/haystack interface
Why do I need jetty and solr running simultaneously, and what is the relationship of /opt/jetty/webapps/solr.war to my /opt/solr? Am I causing conflicts?
Why was this so easy with solr 3.5 and so difficult now? I ask this honestly - I don't want a list of excuses or explanations from solr developers - I want to know how my understanding can be so limited in the first instance (solr 3.5) and get it running in two hours and why I now need to have a comprehensively deeper understanding of jetty/solr architecture and cli/shell script hacking to get it to run?

I am not promising to get all your things, but (numbers do not match questions):
1) Jetty is a web-server. Solr runs as a (web) application inside that web server, however:
2) Jetty can also run as an embedded web server, which is how the Solr download works. When you do java -jar start.jar, that runs Jetty with everything preconfigured, in which case you do not need a standalone Jetty. I suggest starting with the embedded Jetty, then switching to an external one. However, if only your local app talks to the local Solr server, you may be able to get quite far without needing a full Jetty.
3) You don't need all the stuff you find in the example directory - it has multiple configurations and support files and is somewhat nested (which is confusing).
4) To start you need two things: Running solr; your configuration directory
5) The easiest way to get Solr running is to put the whole distribution directory (I know - large) somewhere (e.g. /opt/solr).
6) Your configuration directory is very simple. All you need is two files to start, three if you are picky about names:
- (wherever, but make sure Solr can read/write there)
-- solr.xml (if you are picky about the collection name, otherwise you can skip it)
-- collection1/ (that's the default name, you can change that in solr.xml)
-- collection1/conf/ (this is the configuration directory, Solr will add a data directory on the same level once you start right)
-- collection1/conf/schema.xml
-- collection1/conf/solrconfig.xml
7) Then, you need to be in the example directory and run java -Dsolr.solr.home= -jar start.jar . This will get all the pieces up and running on port 8983. Solr 4 has a pretty new admin interface, so visit it with your browser, maybe do the tutorial, etc.
If you need help with minimal functioning schema/solrconfig files, ask separately, but you cannot just use ones from the example directory as it has all the other file references in the fieldType analysers (though you could just comment those lines out).
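To double-check the layout from step 6 before starting Solr, something like the following can list what is missing. It is only a sketch: the function name missing_solr_files is made up for illustration, and the two required paths assume the default collection1 name from the list above (solr.xml is left out because it is optional).

```python
from pathlib import Path

# Files from step 6, relative to the Solr home directory.
# Assumes the default collection name "collection1".
REQUIRED = (
    "collection1/conf/schema.xml",
    "collection1/conf/solrconfig.xml",
)


def missing_solr_files(solr_home, required=REQUIRED):
    """Return the required files (as relative paths) not found under solr_home."""
    home = Path(solr_home)
    return [rel for rel in required if not (home / rel).is_file()]


if __name__ == "__main__":
    import sys
    # Pass the directory you use for -Dsolr.solr.home as the first argument.
    for rel in missing_solr_files(sys.argv[1]):
        print("missing:", rel)
```

For the setup in the question, the directory to pass would be the one given in JAVA_OPTIONS, i.e. /opt/solr/solr.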
