JupyterHub breaks connection on Google Cloud

I'm hosting JupyterHub on a separate VM instance in Google Cloud, and for some reason the connection fails every time I stay inactive for about 15 minutes. After that I have to relaunch the server and rerun everything.
Is there some kind of timeout that I could change, or some kind of optimised usage mode that I could turn off? I tried increasing the CPU and memory, but the same thing still happens every time.
I pinged the external IP:
PING <EXTERNAL_IP> (<EXTERNAL_IP>) 56(84) bytes of data.
64 bytes from <EXTERNAL_IP>: icmp_seq=1 ttl=61 time=0.857 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=2 ttl=61 time=0.390 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=3 ttl=61 time=0.418 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=4 ttl=61 time=0.363 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=5 ttl=61 time=0.385 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=6 ttl=61 time=0.429 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=7 ttl=61 time=0.440 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=8 ttl=61 time=0.352 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=9 ttl=61 time=0.357 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=10 ttl=61 time=0.396 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=11 ttl=61 time=0.356 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=12 ttl=61 time=0.594 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=13 ttl=61 time=0.408 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=14 ttl=61 time=0.424 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=15 ttl=61 time=0.414 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=16 ttl=61 time=0.390 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=17 ttl=61 time=0.378 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=18 ttl=61 time=0.350 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=19 ttl=61 time=0.437 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=20 ttl=61 time=0.384 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=21 ttl=61 time=0.361 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=22 ttl=61 time=0.340 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=23 ttl=61 time=0.496 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=24 ttl=61 time=0.361 ms
64 bytes from <EXTERNAL_IP>: icmp_seq=25 ttl=61 time=0.333 ms
Logs from the Serial port 1:
Jun 21 17:11:09 jupyterhub bash[28473]: [I 2021-06-21 17:11:09.542 SingleUserNotebookApp log:189] 200 GET /user/<MY_USERNAME>/metrics (<MY_USERNAME>@<EXTERNAL_IP>) 9.10ms
Jun 21 17:13:52 jupyterhub systemd[1]: Stopping /bin/bash -c cd /home/jupyter-<MY_USERNAME> && exec jupyterhub-singleuser --port=59331...
Jun 21 17:13:52 jupyterhub bash[28473]: [C 2021-06-21 17:13:52.335 SingleUserNotebookApp notebookapp:1978] received signal 15, stopping
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.336 SingleUserNotebookApp notebookapp:2145] Shutting down 2 kernels
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.438 SingleUserNotebookApp multikernelmanager:226] Kernel shutdown: d179550b-b0df-4889-8605-049d4ec59f70
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.438 SingleUserNotebookApp multikernelmanager:226] Kernel shutdown: 7f2ab807-18b8-465e-b21b-fd9d82a7c3c7
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.438 SingleUserNotebookApp notebookapp:2160] Shutting down 2 terminals
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.439 SingleUserNotebookApp management:199] EOF on FD 12; stopping reading
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.540 SingleUserNotebookApp management:362] Terminal 2 closed
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.541 SingleUserNotebookApp management:199] EOF on FD 16; stopping reading
Jun 21 17:13:52 jupyterhub bash[28473]: [I 2021-06-21 17:13:52.641 SingleUserNotebookApp management:362] Terminal 1 closed
Jun 21 17:13:52 jupyterhub bash[28473]: Websocket closed
Jun 21 17:13:52 jupyterhub bash[28473]: Websocket closed
Jun 21 17:13:52 jupyterhub bash[28473]: Traceback (most recent call last):
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/bin/jupyterhub-singleuser", line 10, in <module>
Jun 21 17:13:52 jupyterhub bash[28473]: sys.exit(main())
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
Jun 21 17:13:52 jupyterhub bash[28473]: return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
Jun 21 17:13:52 jupyterhub bash[28473]: app.start()
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/jupyterhub/singleuser/mixins.py", line 571, in start
Jun 21 17:13:52 jupyterhub bash[28473]: super().start()
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/notebookapp.py", line 2362, in start
Jun 21 17:13:52 jupyterhub bash[28473]: self.cleanup_terminals()
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/notebookapp.py", line 2161, in cleanup_terminals
Jun 21 17:13:52 jupyterhub bash[28473]: run_sync(terminal_manager.terminate_all())
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/utils.py", line 370, in run_sync
Jun 21 17:13:52 jupyterhub bash[28473]: return wrapped()
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/utils.py", line 364, in wrapped
Jun 21 17:13:52 jupyterhub bash[28473]: result = loop.run_until_complete(maybe_async)
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
Jun 21 17:13:52 jupyterhub bash[28473]: return future.result()
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/terminal/terminalmanager.py", line 96, in terminate_all
Jun 21 17:13:52 jupyterhub bash[28473]: await self.terminate(term, force=True)
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/terminal/terminalmanager.py", line 85, in terminate
Jun 21 17:13:52 jupyterhub bash[28473]: self._check_terminal(name)
Jun 21 17:13:52 jupyterhub bash[28473]: File "/opt/tljh/user/lib/python3.7/site-packages/notebook/terminal/terminalmanager.py", line 113, in _check_terminal
Jun 21 17:13:52 jupyterhub bash[28473]: raise web.HTTPError(404, u'Terminal not found: %s' % name)
Jun 21 17:13:52 jupyterhub bash[28473]: tornado.web.HTTPError: HTTP 404: Not Found (Terminal not found: 2)
Jun 21 17:13:52 jupyterhub systemd[1]: jupyter-<MY_USERNAME>.service: Main process exited, code=exited, status=1/FAILURE
Jun 21 17:13:52 jupyterhub systemd[1]: jupyter-<MY_USERNAME>.service: Failed with result 'exit-code'.
Jun 21 17:13:52 jupyterhub systemd[1]: Stopped /bin/bash -c cd /home/jupyter-<MY_USERNAME> && exec jupyterhub-singleuser --port=59331.
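The log above shows systemd stopping the single-user server and the notebook process receiving signal 15, so the server was shut down on request rather than lost to a network fault. One thing worth checking is whether an idle-culling service is responsible; that is an assumption, not something the log confirms. A minimal Python sketch for measuring how long the server was quiet before the stop (the function name and the sample lines are illustrative, shortened from the log above):

```python
import re
from datetime import datetime

# Matches the bracketed application timestamp, e.g. "[I 2021-06-21 17:11:09.542 ".
APP_TS = re.compile(r"\[[A-Z] (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ ")


def seconds_before_stop(lines):
    """Return seconds between the last request line and the SIGTERM line, or None."""
    last_request = None
    for line in lines:
        match = APP_TS.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if "received signal 15" in line:
            if last_request is None:
                return None
            return (stamp - last_request).total_seconds()
        if " GET " in line or " POST " in line:
            last_request = stamp
    return None


# Two lines shortened from the serial-port log (placeholders instead of real names).
sample = [
    "Jun 21 17:11:09 jupyterhub bash[28473]: [I 2021-06-21 17:11:09.542 "
    "SingleUserNotebookApp log:189] 200 GET /user/USER/metrics (USER@IP) 9.10ms",
    "Jun 21 17:13:52 jupyterhub bash[28473]: [C 2021-06-21 17:13:52.335 "
    "SingleUserNotebookApp notebookapp:1978] received signal 15, stopping",
]
print(seconds_before_stop(sample))
```

Running it over a longer stretch of the journal would show whether the gap before each stop is roughly constant, which would point to a configured timeout.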

Open edX error while running Python code in the Codejail plugin using Dockerized container services

I have installed the Open edX platform stack using Tutor, and installed the Open edX "Codejail" plugin from the link below:
pip install git+https://github.com/edunext/tutor-contrib-codejail
https://github.com/eduNEXT/tutor-contrib-codejail
I am facing a problem in Codejail when importing the Python matplotlib library.
Importing the same library inside the Codejail container works fine. The only problem is importing it through an Open edX code block (an advanced blank problem).
I have already installed Codejail and matplotlib in Docker.
I need to run this code, which gives an error:
<problem>
<script type="loncapa/python">
import matplotlib
</script>
</problem>
import os works fine, but import matplotlib raises an error.
Details of the current stack:
Open edX version: openedx-mfe:14.0.1
Codejail version: codejailservice:14.1.0
Please see the error message below:
cannot create LoncapaProblem block-v1:VUP+Math101+2022+type@problem+block@3319c4e42da64a74b0e40f048e3f2599: Error while executing script code: Couldn't execute jailed code: stdout: b'', stderr: b'Traceback (most recent call last):\n File "jailed_code", line 19, in <module>\n exec(code, g_dict)\n File "<string>", line 66, in <module>\n File "/sandbox/venv/lib/python3.8/site-packages/matplotlib/__init__.py", line 921, in <module>\n dict.update(rcParams, rc_params_in_file(matplotlib_fname()))\n File "/sandbox/venv/lib/python3.8/site-packages/matplotlib/__init__.py", line 602, in matplotlib_fname\n for fname in gen_candidates():\n File "/sandbox/venv/lib/python3.8/site-packages/matplotlib/__init__.py", line 599, in gen_candidates\n yield os.path.join(get_configdir(), \'matplotlibrc\')\n File "/sandbox/venv/lib/python3.8/site-packages/matplotlib/__init__.py", line 239, in wrapper\n ret = func(**kwargs)\n File "/sandbox/venv/lib/python3.8/site-packages/matplotlib/__init__.py", line 502, in get_configdir\n return get_config_or_cache_dir(_get_xdg_config_dir())\n File "/sandbox/venv/lib/python3.8/site-packages/matplotlib/__init__.py", line 474, in get_config_or_cache_dir\n tempfile.mkdtemp(prefix="matplotlib-")\n File "/opt/pyenv/versions/3.8.6_sandbox/lib/python3.8/tempfile.py", line 347, in mkdtemp\n prefix, suffix, dir, output_type = sanitize_params(prefix, suffix, dir)\n File "/opt/pyenv/versions/3.8.6_sandbox/lib/python3.8/tempfile.py", line 117, in sanitize_params\n dir = gettempdir()\n File "/opt/pyenv/versions/3.8.6_sandbox/lib/python3.8/tempfile.py", line 286, in gettempdir\n tempdir = get_default_tempdir()\n File "/opt/pyenv/versions/3.8.6_sandbox/lib/python3.8/tempfile.py", line 218, in _get_default_tempdir\n raise FileNotFoundError(_errno.ENOENT,\nFileNotFoundError: [Errno 2] No usable temporary directory found in [\'/tmp\', \'/var/tmp\', \'/usr/tmp\', \'/tmp/codejail-lbfd69da\']\n' with status code: 1. For more information check Codejail Service logs.
Codejail service logs are as follows:
{"log":"[pid: 6|app: 0|req: 20/39] 172.18.0.10 () {36 vars in 483 bytes} [Tue Nov 22 11:24:59 2022] POST /api/v0/code-exec =\u003e generated 1978 bytes in 742 msecs (HTTP/1.1 200) 2 headers in 73 bytes (1 switches on core 0)\n","stream":"stderr","time":"2022-11-22T11:25:00.151315626Z"}
{"log":"2022-11-22 11:26:23,304 INFO 9 [codejailservice.app] code_exec_service.py:52 - Running problem_id:53fbaa04859f41989ab967c15a12c013 jailed code for course_id:course-v1:VUP+Math101+2022 ...\n","stream":"stderr","time":"2022-11-22T11:26:23.30489438Z"}
{"log":"2022-11-22 11:26:23,343 INFO 9 [codejailservice.app] code_exec_service.py:73 - Jailed code was executed in 0.03849988000001758 seconds.\n","stream":"stderr","time":"2022-11-22T11:26:23.343618965Z"}
{"log":"[pid: 9|app: 0|req: 20/40] 172.18.0.10 () {36 vars in 483 bytes} [Tue Nov 22 11:26:23 2022] POST /api/v0/code-exec =\u003e generated 73 bytes in 40 msecs (HTTP/1.1 200) 2 headers in 71 bytes (1 switches on core 0)\n","stream":"stderr","time":"2022-11-22T11:26:23.344178308Z"}
{"log":"2022-11-23 04:15:24,786 INFO 6 [codejailservice.app] code_exec_service.py:52 - Running problem_id:3319c4e42da64a74b0e40f048e3f2599 jailed code for course_id:course-v1:VUP+Math101+2022 ...\n","stream":"stderr","time":"2022-11-23T04:15:24.786287416Z"}
{"log":"2022-11-23 04:15:25,582 ERROR 6 [codejailservice.app] code_exec_service.py:70 - Error found while executing jailed code.\n","stream":"stderr","time":"2022-11-23T04:15:25.582527974Z"}
{"log":"[pid: 6|app: 0|req: 21/41] 172.18.0.10 () {36 vars in 483 bytes} [Wed Nov 23 04:15:24 2022] POST /api/v0/code-exec =\u003e generated 1978 bytes in 798 msecs (HTTP/1.1 200) 2 headers in 73 bytes (1 switches on core 0)\n","stream":"stderr","time":"2022-11-23T04:15:25.583132326Z"}
{"log":"2022-11-23 06:00:15,150 INFO 9 [codejailservice.app] code_exec_service.py:52 - Running problem_id:3319c4e42da64a74b0e40f048e3f2599 jailed code for course_id:course-v1:VUP+Math101+2022 ...\n","stream":"stderr","time":"2022-11-23T06:00:15.15073834Z"}
{"log":"2022-11-23 06:00:15,891 ERROR 9 [codejailservice.app] code_exec_service.py:70 - Error found while executing jailed code.\n","stream":"stderr","time":"2022-11-23T06:00:15.8916806Z"}
{"log":"[pid: 9|app: 0|req: 21/42] 172.18.0.10 () {36 vars in 483 bytes} [Wed Nov 23 06:00:15 2022] POST /api/v0/code-exec =\u003e generated 1978 bytes in 742 msecs (HTTP/1.1 200) 2 headers in 73 bytes (1 switches on core 0)\n","stream":"stderr","time":"2022-11-23T06:00:15.892225441Z"}
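The traceback ends in tempfile reporting that none of the listed directories is usable, which matplotlib hits when it falls back to creating a temporary config directory. A minimal Python sketch for checking those candidates from inside the sandbox follows; the helper name usable_temp_dirs is hypothetical, and the list is copied from the error message above. Since import os works in the jailed code, something like this could in principle run there, though that is untested.

```python
import os


def usable_temp_dirs(candidates):
    """Return the candidates that exist as directories and are writable by this process."""
    usable = []
    for path in candidates:
        # A usable temp dir must exist and allow creating entries inside it.
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            usable.append(path)
    return usable


# Directories named in the FileNotFoundError above.
candidates = ["/tmp", "/var/tmp", "/usr/tmp", "/tmp/codejail-lbfd69da"]
print(usable_temp_dirs(candidates))
```

An empty list would confirm that the sandbox user has nowhere to write. If a writable directory does exist, pointing matplotlib's MPLCONFIGDIR environment variable at it before the import may avoid the mkdtemp fallback; this has not been tested against this stack.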

VPS is inaccessible through SSH and I can't connect to the website

The problem is that the connection to my VPS is lost multiple times a day, for about 20 minutes each time. When the server is down I can't reach the website, and I get the error:
Err connection timed out.
When I try connecting through SSH, it outputs:
Connection refused.
Nothing more, nothing less. I need to solve this because it causes a lot of trouble; the only workaround I have come up with is restarting the server from the provider's site. This does not happen once a month or once a year; it happens about 10 times a day. How should I debug the problem, or how can I find a real solution?
Any help is appreciated. Thanks.
Edit
The output of ssh -vvv root@ip:
OpenSSH_8.2p1 Ubuntu-4ubuntu0.3, OpenSSL 1.1.1f 31 Mar 2020
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug2: resolve_canonicalize: hostname *ip is address
debug2: ssh_connect_direct
debug1: Connecting to *ip [*ip] port 22.
debug1: connect to address *ip port 22: Connection refused
ssh: connect to host *ip port 22: Connection refused
Apache error.log
[Sun Oct 17 13:59:41.104024 2021] [wsgi:error] [pid 2883:tid 139903188997888] [remote some_ip:53604] Bad Request: /iframe2/news/kurallar/
[Sun Oct 17 13:59:41.109194 2021] [wsgi:error] [pid 2883:tid 139903071426304] [remote some_ip:48318] Bad Request: /iframe2/news/kurallar/
[Sun Oct 17 14:10:08.136701 2021] [wsgi:error] [pid 2883:tid 139903071426304] [remote my_ip:24816] Not Found: /favicon.ico
[Sun Oct 17 14:19:34.339115 2021] [mpm_event:notice] [pid 2882:tid 139903302818752] AH00491: caught SIGTERM, shutting down
Exception ignored in: <bound method BaseEventLoop.__del__ of <_UnixSelectorEventLoop running=False closed=False debug=False>>
Traceback (most recent call last):
File "/usr/lib/python3.6/asyncio/base_events.py", line 526, in __del__
NameError: name 'ResourceWarning' is not defined
[Sun Oct 17 14:19:34.517419 2021] [mpm_event:notice] [pid 3319:tid 140614344002496] AH00489: Apache/2.4.29 (Ubuntu) mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Sun Oct 17 14:19:34.517583 2021] [core:notice] [pid 3319:tid 140614344002496] AH00094: Command line: '/usr/sbin/apache2'
/var/log/auth.log
Oct 17 21:15:01 my_name sshd[1365]: Invalid user pi from 94.3.213.149 port 42290
Oct 17 21:15:01 my_name sshd[1364]: Invalid user pi from 94.3.213.149 port 42286
Oct 17 21:15:01 my_name sshd[1365]: Connection closed by invalid user pi 94.3.213.149 port 42290 [preauth]
Oct 17 21:15:01 my_name sshd[1364]: Connection closed by invalid user pi 94.3.213.149 port 42286 [preauth]
Oct 17 22:00:38 my_name sshd[1628]: Invalid user user from 212.193.30.32 port 39410
Oct 17 22:00:38 my_name sshd[1628]: Received disconnect from 212.193.30.32 port 39410:11: Normal Shutdown, Thank you for playing [preauth]
Oct 17 22:00:38 my_name sshd[1628]: Disconnected from invalid user user 212.193.30.32 port 39410 [preauth]
I get lots of these login attempts and disconnects in the log; the name 'pi' and the IP are not mine. Do these connections affect the website or leak any user information?
There are gaps in the logs during the times when I could not connect to the server.
Around those times, /var/log/syslog prints these:
Oct 18 08:27:08 my_name kernel: [53593.658210] [UFW BLOCK] IN=eth0 OUT= MAC={mac_addr} SRC={some_ip} DST={my_ip} LEN=40 TOS=0x08 PREC=0x20 TTL=241 ID=31782 PROTO=TCP SPT=47415 DPT=24634 WINDOW=1024 RES=0x00 SYN URGP=0
Oct 18 08:35:01 my_name CRON[2854]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Oct 18 08:45:01 my_name CRON[2860]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Oct 18 08:49:48 my_name kernel: [ 0.000000] Linux version 4.15.0-158-generic (buildd@lgw01-amd64-051) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #166-Ubuntu SMP Fri Sep 17 19:37:52 UTC 2021 (Ubuntu 4.15.0-158.166-generic 4.15.18)
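The last syslog line above is a kernel message with an uptime stamp of 0.000000, which is printed at the very start of boot, so the machine booted between 08:45:01 and 08:49:48 (possibly just the manual restart from the provider's site). A minimal Python sketch for listing every such boot marker together with the last line logged before it; the function name is hypothetical and the sample lines are shortened from the excerpt above:

```python
import re

# A kernel line stamped "[ 0.000000] Linux version" marks the start of a boot.
BOOT_LINE = re.compile(r"kernel: \[\s*0\.000000\] Linux version")


def find_reboots(lines):
    """Return (previous_line, boot_line) pairs for every boot marker found."""
    reboots = []
    previous = None
    for line in lines:
        if BOOT_LINE.search(line):
            reboots.append((previous, line))
        previous = line
    return reboots


sample = [
    "Oct 18 08:45:01 my_name CRON[2860]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
    "Oct 18 08:49:48 my_name kernel: [ 0.000000] Linux version 4.15.0-158-generic",
]
for before, boot in find_reboots(sample):
    print("last line before boot:", before[:15] if before else None)
    print("boot line:            ", boot[:15])
```

Comparing the timestamps of the lines just before each boot with the times of the outages would show whether the server logged anything (for example, memory pressure or a shutdown request) before it became unreachable.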

Where to start tracing down an exception?

I'm getting an exception in production which isn't providing any stack trace information. How do I start debugging where this might be coming from?
Oct 25 16:26:17 socket-proxy app/web.1: Exception: RedisError: Disconnected (Redis::DisconnectedError)
Oct 25 16:26:17 socket-proxy app/web.1: 0x4af6ac: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x4ce900: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x4b553e: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x529d1c: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x518cb2: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x518064: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x521d82: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x51ed3b: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x5240e9: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x50b995: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x416209: ??? at ??
Oct 25 16:26:17 socket-proxy app/web.1: 0x0: ??? at ??
Ruby dev here, so I'm not sure why the stack trace is printed so cryptically. However, if you are looking for clues as to where to look, I would start with this class:
redis/error.cr
# Exception for errors that Redis returns.
class Redis::Error < Exception
  def initialize(s)
    super("RedisError: #{s}")
  end
end

class Redis::DisconnectedError < Redis::Error
  def initialize
    super("Disconnected")
  end
end
Now, the only place that exception seems to be raised in the crystal-redis repository is in this method:
redis/connection.cr (line: )
def receive_line
  line = @socket.gets(chomp: false)
  unless line
    raise Redis::DisconnectedError.new
  end
  line.byte_slice(0, line.bytesize - 2)
end
Looking at the methods that call receive_line, the error is raised in Redis::Connection, either while connecting or while receiving a response.
So it's either an error during connection or a dropped connection.
Given the uninformative stack trace, that would be a good place to start, unless you can share some more code to look at.
Hope that helps.
This ended up being because the production server was timing out the Redis connection after a period of time. I've switched to redis-reconnect to auto-reconnect.
https://github.com/danielwestendorf/redis-reconnect
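The auto-reconnect idea can be sketched in a few lines. The example below is in Python for illustration only and does not reproduce the API of redis-reconnect or crystal-redis; DisconnectedError, FlakyConnection and ReconnectingClient are all hypothetical stand-ins.

```python
class DisconnectedError(Exception):
    """Stand-in for a client's 'connection dropped' exception (hypothetical)."""


class FlakyConnection:
    """Hypothetical connection that drops after a fixed number of commands."""

    def __init__(self, commands_before_drop=2):
        self.remaining = commands_before_drop

    def command(self, text):
        if self.remaining == 0:
            raise DisconnectedError("Disconnected")
        self.remaining -= 1
        return f"OK {text}"


class ReconnectingClient:
    """Wraps a connection factory and reconnects once when a command finds the connection dropped."""

    def __init__(self, connect):
        self._connect = connect
        self._conn = connect()
        self.reconnects = 0

    def command(self, text):
        try:
            return self._conn.command(text)
        except DisconnectedError:
            # The connection was dropped: open a fresh one and retry once.
            self._conn = self._connect()
            self.reconnects += 1
            return self._conn.command(text)


client = ReconnectingClient(FlakyConnection)
replies = [client.command(f"PING {i}") for i in range(5)]
print(replies, client.reconnects)
```

A single blind retry is only safe for commands that can be repeated without side effects; anything else needs more care than this sketch shows.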

Vora 1.4 Catalog fails to start

So I upgraded Vora from 1.3 to 1.4 on a recently upgraded HDP 2.5.6.
All services seem to be starting fine, except Catalog. In the log I see a lot of messages like this:
2017-08-16 11:43:34.591183|+1000|ERROR|Was not able to create new dlog via XXXXX:37999, Status was ERROR_OP_TIMED_OUT, Details: |v2catalog_server|Distributed Log|140607339825056|CreateDLog|log_administration.cpp(211)^^
2017-08-16 11:43:34.611044|+1000|ERROR|Operation (CREATE_LOG) timed out, last status was: ERROR_INTERNAL|v2catalog_server|Distributed Log|140607279314688|Retry|callback_base.cpp(222)^^
2017-08-16 11:43:34.611204|+1000|ERROR|Was not able to create new dlog via XXXXX:20439, Status was ERROR_OP_TIMED_OUT, Details: |v2catalog_server|Distributed Log|140607339825056|CreateDLog|log_administration.cpp(211)^^
2017-08-16 11:43:34.611235|+1000|ERROR|Create DLog ended with status ERROR_OP_TIMED_OUT, retrying in 1000ms|v2catalog_server|Distributed Log|140607339825056|CreateDLog|log_administration.cpp(163)^^
2017-08-16 11:43:35.611757|+1000|ERROR|can't create dlog client[ ERROR_OP_TIMED_OUT ]|v2catalog_server|Catalog|140607339825056|Init|dlog_accessor.cpp(174)^^
terminate called after throwing an instance of 'std::system_error'
what(): Invalid argument
Any ideas what I left misconfigured?
[UPDATE] DLog's log below:
[Wed Aug 16 10:31:23 2017] DLOG Server Version: 1.2.330.20859
[Wed Aug 16 10:31:23 2017] Listening on XXXXXX:46026
[Wed Aug 16 10:31:23 2017] Loading data store
2017-08-16 10:31:23.475454|+1000|WARN |Server file descriptor limit too large vs system limit; reducing to 896|v2dlog|Distributed Log|140349419014080|Load|store.cpp(2187)^^
[Wed Aug 16 10:31:23 2017] Server file descriptor limit too large vs system limit; reducing to 896
[Wed Aug 16 10:31:23 2017] Recovering log in store
[Wed Aug 16 10:31:23 2017] Starting server in managed mode
[Wed Aug 16 10:31:23 2017] Initializing management interface
2017-08-16 10:31:39.365780|+1000|WARN |f(1)h(1):Host 1 has timed out, disabling|v2dlog|Distributed Log|140349343360768|newcluster.(*FragmentRef).ProcessRule|dlog.go(607)^^
2017-08-16 10:32:10.333444|+1000|ERROR|Log with ID 1 is not registered on unit.|v2dlog|Distributed Log|140349238322944|Seal|tenant_registry.cpp(63)^^
2017-08-16 10:32:10.333754|+1000|ERROR|f(1)h(1):Sealing local unit failed for log 1: disabling|v2dlog|Distributed Log|140349238322944|newcluster.(*replicaStateRef).disable|dlog.go(991)^^
[Wed Aug 16 11:22:24 2017] Received signal: 15. Shutting down
[Wed Aug 16 11:22:24 2017] Flushing store...
[Wed Aug 16 11:22:24 2017] Store flush complete
[Wed Aug 16 11:30:17 2017] DLOG Server Version: 1.2.330.20859
[Wed Aug 16 11:30:17 2017] Listening on XXXXXX:37999
[Wed Aug 16 11:30:17 2017] Loading data store
2017-08-16 11:30:17.371415|+1000|WARN |Server file descriptor limit too large vs system limit; reducing to 896|v2dlog|Distributed Log|140388824664000|Load|store.cpp(2187)^^
[Wed Aug 16 11:30:17 2017] Server file descriptor limit too large vs system limit; reducing to 896
[Wed Aug 16 11:30:17 2017] Recovering log in store
[Wed Aug 16 11:30:17 2017] Starting server in managed mode
[Wed Aug 16 11:30:17 2017] Initializing management interface
2017-08-16 11:30:19.421458|+1000|WARN |missed heartbeat for log 1, host 2; poking with state 2|v2dlog|Distributed Log|140388740617984|newcluster.(*FragmentRef).ProcessRule|dlog.go(619)^^
Further on this, I've configured Vora DLog to run on all three nodes of the cluster, but I see it's not running on one of them. The (likely) related part of Vora Manager's log is:
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : stdout from check: [Thu Aug 17 09:32:36 2017] Checking for store #012[Thu Aug 17 09:32:36 2017] No valid store found
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : stderr from check: 2017-08-17 09:32:36.590974|+1000|INFO |Command Line: /opt/vora/lib/vora-dlog/bin/v2dlog check --trace-level DEBUG --trace-to-stderr /var/local/vora/vora-dlog|v2dlog|Distributed Log|139919669938112|server_main|main.cpp(1323) #0122017-08-17 09:32:36.592784|+1000|INFO |Checking for store|v2dlog|Distributed Log|139919669938112|Run|main.cpp(1146) #0122017-08-17 09:32:36.593074|+1000|ERROR|Exception during recovery: Encountered a generic I/O error|v2dlog|Distributed Log|139919669938112|Load|store.cpp(2201) #0122017-08-17 09:32:36.593157|+1000|FATAL|Error during recovery|v2dlog|Distributed Log|139919669938112|handle_recovery_error|main.cpp(767) #012[Thu Aug 17 09:32:36 2017] Error during recovery #0122017-08-17 09:32:36.593214|+1000|FATAL| Encountered a generic I/O error|v2dlog|Distributed Log|139919669938112|handle_recovery_error|main.cpp(767) #012[Thu Aug 17 09:32:36 2017] Encountered a generic I/O error #0122017-08-17 09:32:36.593277|+1000|FATAL| boost::filesystem::status: Permission den
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : ... ied: "/var/local/vora/vora-dlog"|v2dlog|Distributed Log|139919669938112|handle_recovery_error|main.cpp(767) #012[Thu Aug 17 09:32:36 2017] boost::filesystem::status: Permission denied: "/var/local/vora/vora-dlog" #0122017-08-17 09:32:36.593330|+1000|INFO |No valid store found|v2dlog|Distributed Log|139919669938112|Run|main.cpp(1151)
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : Creating SAP Hana Vora Distributed Log store ...
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : stdout from format: [Thu Aug 17 09:32:36 2017] Formatting store
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : stderr from format: 2017-08-17 09:32:36.615558|+1000|INFO |Command Line: /opt/vora/lib/vora-dlog/bin/v2dlog format --trace-level DEBUG --trace-to-stderr /var/local/vora/vora-dlog|v2dlog|Distributed Log|140176991168448|server_main|main.cpp(1323) #0122017-08-17 09:32:36.617444|+1000|INFO |Formatting store|v2dlog|Distributed Log|140176991168448|Run|main.cpp(1093) #0122017-08-17 09:32:36.617655|+1000|ERROR|boost::filesystem::status: Permission denied: "/var/local/vora/vora-dlog"|v2dlog|Distributed Log|140176991168448|Format|store.cpp(2107) #0122017-08-17 09:32:36.617693|+1000|FATAL|Could not format store.|v2dlog|Distributed Log|140176991168448|Run|main.cpp(1095) #012[Thu Aug 17 09:32:36 2017] Could not format store.
Aug 17 09:32:36 XXXXXX vora.vora-dlog: [c.63f700da] : Error while creating dlog store.
Aug 17 09:32:36 XXXXXX nomad[628]: client: task "vora-dlog-server" for alloc "058fd477-4e80-59ca-7703-e97f2ca1c8c2" failed: Wait returned exit code 1, signal 0, and error <nil>
[UPDATE2] I see quite a few lines like this in the Vora Manager log:
Aug 17 14:38:27 XXXXXX vora.vora-dlog: [c.2235f785] : Running['sudo', '-i', '-u', 'root', 'chown', 'vora:vora', '/var/log/vora/vora-dlog/']
I would guess this succeeds, since on that node the vora-dlog directory belongs to the vora user:
-rw-r--r-- 1 vora vora 0 Jun 29 19:04 .keep
drwxrwx--- 2 vora vora 4096 Aug 16 10:31 dbdir
drwxrwx--- 6 root vora 4096 Aug 15 16:24 vora-discovery
drwxrwx--- 2 vora vora 4096 Aug 16 10:31 vora-dlog
drwxr-xr-x 4 root root 4096 Aug 15 16:23 vora-scheduler
The vora-dlog directory is empty.
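Note that the failing path in the DLog output is /var/local/vora/vora-dlog, while the chown shown in the Vora Manager log targets /var/log/vora/vora-dlog/. A status call on a path can also fail with "Permission denied" when the calling user lacks search permission on one of the parent directories, even if the final directory is owned correctly. A minimal Python sketch that prints mode and ownership for every directory along the path; the function name is hypothetical, and it would need to be run as the vora user to reproduce what the service sees:

```python
import os
import stat


def describe_ancestors(path):
    """Yield (directory, mode string, uid, gid) for each directory from / down to path."""
    current = os.path.abspath(path)
    parts = []
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent
    for directory in reversed(parts):
        try:
            info = os.stat(directory)
        except OSError as exc:
            # Permission denied or a missing directory surfaces here.
            yield directory, f"<{exc.strerror}>", None, None
            continue
        yield directory, stat.filemode(info.st_mode), info.st_uid, info.st_gid


for row in describe_ancestors("/var/local/vora/vora-dlog"):
    print(row)
```

The first row that reports an error, or a directory without the execute bit for the relevant user or group, is the place to look.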

SQLite error attempt to write to a read-only database

I've been struggling to deploy my Django project with Apache and mod_wsgi. I've had many problems that I managed to handle, but this one just doesn't seem to be solvable.
I get the following error in my Apache log when I open the address setakshop.ir:8080:
[Wed May 27 05:54:24 2015] [error] Internal Server Error: /en-gb/
[Wed May 27 05:54:24 2015] [error] Traceback (most recent call last):
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 87, in get_response
[Wed May 27 05:54:24 2015] [error] response = middleware_method(request)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/oscar/apps/basket/middleware.py", line 26, in process_request
[Wed May 27 05:54:24 2015] [error] strategy = selector.strategy(request=request, user=request.user)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/oscar/apps/partner/strategy.py", line 39, in strategy
[Wed May 27 05:54:24 2015] [error] return Default(request)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/oscar/apps/partner/strategy.py", line 57, in __init__
[Wed May 27 05:54:24 2015] [error] if request and request.user.is_authenticated():
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/utils/functional.py", line 224, in inner
[Wed May 27 05:54:24 2015] [error] self._setup()
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/utils/functional.py", line 357, in _setup
[Wed May 27 05:54:24 2015] [error] self._wrapped = self._setupfunc()
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/auth/middleware.py", line 22, in <lambda>
[Wed May 27 05:54:24 2015] [error] request.user = SimpleLazyObject(lambda: get_user(request))
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/auth/middleware.py", line 10, in get_user
[Wed May 27 05:54:24 2015] [error] request._cached_user = auth.get_user(request)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/auth/__init__.py", line 152, in get_user
[Wed May 27 05:54:24 2015] [error] user_id = request.session[SESSION_KEY]
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/sessions/backends/base.py", line 49, in __getitem__
[Wed May 27 05:54:24 2015] [error] return self._session[key]
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/sessions/backends/base.py", line 175, in _get_session
[Wed May 27 05:54:24 2015] [error] self._session_cache = self.load()
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/sessions/backends/db.py", line 29, in load
[Wed May 27 05:54:24 2015] [error] self.create()
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/sessions/backends/db.py", line 41, in create
[Wed May 27 05:54:24 2015] [error] self.save(must_create=True)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/contrib/sessions/backends/db.py", line 64, in save
[Wed May 27 05:54:24 2015] [error] obj.save(force_insert=must_create, using=using)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 589, in save
[Wed May 27 05:54:24 2015] [error] force_update=force_update, update_fields=update_fields)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 617, in save_base
[Wed May 27 05:54:24 2015] [error] updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 698, in _save_table
[Wed May 27 05:54:24 2015] [error] result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 731, in _do_insert
[Wed May 27 05:54:24 2015] [error] using=using, raw=raw)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 92, in manager_method
[Wed May 27 05:54:24 2015] [error] return getattr(self.get_queryset(), name)(*args, **kwargs)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 921, in _insert
[Wed May 27 05:54:24 2015] [error] return query.get_compiler(using=using).execute_sql(return_id)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 921, in execute_sql
[Wed May 27 05:54:24 2015] [error] cursor.execute(sql, params)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py", line 65, in execute
[Wed May 27 05:54:24 2015] [error] return self.cursor.execute(sql, params)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/utils.py", line 94, in __exit__
[Wed May 27 05:54:24 2015] [error] six.reraise(dj_exc_type, dj_exc_value, traceback)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py", line 65, in execute
[Wed May 27 05:54:24 2015] [error] return self.cursor.execute(sql, params)
[Wed May 27 05:54:24 2015] [error] File "/usr/local/lib/python2.7/dist-packages/django/db/backends/sqlite3/base.py", line 485, in execute
[Wed May 27 05:54:24 2015] [error] return Database.Cursor.execute(self, query, params)
[Wed May 27 05:54:24 2015] [error] OperationalError: attempt to write a readonly database
I have googled this error and searched a lot. I know that the db.sqlite file must be writable and owned by the Apache user (www-data), and that the directory containing it must also be writable and owned by www-data; I have done both. I don't have SELinux installed, I've run ./manage.py migrate, and I don't know what else I can do to make this work. I even chmoded both the directory and the db.sqlite file to 777 just to see if it would work, but it didn't, and I know that's not a safe thing to do.
For your information, this is the folder containing the db.sqlite file:
-rw-rw-r-- 1 ashkan ashkan 382 Jan 30 11:54 README.rst
-rw-rw-r-- 1 ashkan ashkan 0 Jan 30 11:54 __init__.py
drwxrwxr-x 4 ashkan ashkan 4096 Feb 6 15:15 apps
-rwxrwxrwx 1 www-data www-data 741376 May 27 09:11 db.sqlite
drwxrwxr-x 7 ashkan ashkan 4096 Jan 30 11:54 deploy
drwxrwxr-x 2 ashkan ashkan 4096 Jan 30 11:54 fixtures
drwxrwxr-x 2 ashkan ashkan 4096 Feb 22 00:14 i18n
drwxrwxr-x 3 ashkan ashkan 4096 Feb 23 21:45 locale
drwxr-xr-x 2 ashkan ashkan 4096 May 26 08:33 logs
-rwxrwxr-x 1 ashkan ashkan 242 May 26 00:22 manage.py
lrwxrwxrwx 1 ashkan ashkan 10 May 25 23:15 oscar -> i18n/oscar
drwxrwxr-x 4 ashkan ashkan 4096 Jan 30 21:53 public
-rw-rw-r-- 1 ashkan ashkan 14306 May 26 00:07 settings.py
-rw-r--r-- 1 ashkan ashkan 10230 May 26 00:23 settings.pyc
-rw-rw-r-- 1 ashkan ashkan 14876 Feb 6 14:26 settings.py~
-rw-rw-r-- 1 ashkan ashkan 293 Jan 30 11:54 settings_mysql.py
-rw-rw-r-- 1 ashkan ashkan 266 May 26 00:16 settings_postgres.py
-rw-rw-r-- 1 ashkan ashkan 162 Jan 30 11:54 settings_sphinx.py
drwxrwxr-x 2 ashkan ashkan 4096 Feb 23 21:45 static
drwxrwxr-x 3 ashkan ashkan 4096 Jan 30 11:54 templates
-rwxrwxr-x 1 ashkan ashkan 1114 Jan 30 11:54 test_migrations.sh
-rwxrwxr-x 1 ashkan ashkan 1138 Jan 30 11:54 update_latest.sh
-rw-rw-r-- 1 ashkan ashkan 1573 Jan 30 11:54 urls.py
-rw-rw-r-- 1 ashkan ashkan 1427 Jan 30 21:53 urls.pyc
drwxr-xr-x 2 ashkan ashkan 4096 Jan 30 20:58 whoosh_index
-rw-rw-r-- 1 ashkan ashkan 778 May 26 00:23 wsgi.py
And this is the parent directory, which contains that folder (sandbox):
-rw-rw-r-- 1 ashkan ashkan 866 Jan 30 11:54 README.rst
drwxrwxr-x 2 ashkan ashkan 4096 Jan 30 11:54 _fixtures
-rw-rw-r-- 1 ashkan ashkan 897149 Feb 6 15:34 alaki
-rw-rw-r-- 1 ashkan ashkan 1818857 Feb 6 15:37 alaki.txt
drwxrwxr-x 9 ashkan ashkan 4096 Jan 30 11:54 demo
-rw-rw-r-- 1 ashkan ashkan 443627 Feb 6 15:34 out
drwxrwxrwx 12 www-data www-data 4096 May 27 09:11 sandbox
drwxrwxr-x 5 ashkan ashkan 4096 Jan 30 11:54 us
Any idea what's causing this problem? Thank you.
Update 1:
Here is my Apache config:
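To make the permission checks described above concrete, here is a minimal sketch (Python 3, standard library only) of how they could be verified from inside a process. It assumes SQLite needs both the database file and its containing directory to be writable, because a journal file is created next to the database during a write. The function name check_sqlite_writable and the _write_probe table are illustrative names of my own and are not part of Django or Oscar.

```python
import os
import sqlite3


def check_sqlite_writable(db_path):
    """Report whether the current user can write to db_path the way SQLite needs."""
    db_path = os.path.abspath(db_path)
    directory = os.path.dirname(db_path)
    report = {
        "path": db_path,
        "file_exists": os.path.isfile(db_path),
        "file_writable": os.access(db_path, os.W_OK),
        # SQLite creates a journal file next to the database, so the
        # directory itself must be writable and searchable.
        "dir_writable": os.access(directory, os.W_OK | os.X_OK),
        "write_ok": False,
        "error": None,
    }
    if not report["file_exists"]:
        # Do not call sqlite3.connect() here: it would create an empty file.
        report["error"] = "database file does not exist"
        return report
    try:
        conn = sqlite3.connect(db_path)
        try:
            # Hypothetical probe table: created and dropped again so that a
            # real write is attempted without leaving anything behind.
            conn.execute("CREATE TABLE IF NOT EXISTS _write_probe (id INTEGER)")
            conn.execute("DROP TABLE _write_probe")
            conn.commit()
            report["write_ok"] = True
        finally:
            conn.close()
    except sqlite3.OperationalError as exc:
        # e.g. "attempt to write a readonly database"
        report["error"] = str(exc)
    return report
```

The result only describes the user that runs it, so calling check_sqlite_writable("db.sqlite") from my own shell says nothing about www-data. It would have to be run as the Apache user, and against the path the application actually opens, to be meaningful.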
Listen 8080
<VirtualHost *:8080>
WSGIDaemonProcess setak python-path=/home/ashkan/freshcopy/django-oscar/sites/sandbox:/usr/local/lib/python2.7/site-packages
WSGIProcessGroup setak
WSGIScriptAlias / /home/ashkan/freshcopy/django-oscar/sites/sandbox/wsgi.py
ServerAdmin admin#setakshop.ir
ServerName setakshop.ir:8000
DocumentRoot /var/www/
Alias /media/ /home/ashkan/freshcopy/django-oscar/sites/sandbox/public/media/
Alias /static/ /home/ashkan/freshcopy/django-oscar/sites/sandbox/public/static/
<Directory /home/ashkan/freshcopy/django-oscar/sites/sandbox>
<Files wsgi.py>
Order allow,deny
allow from all
</Files>
</Directory>
<Directory /home/ashkan/freshcopy/django-oscar/sites/sandbox/public/static>
Order allow,deny
allow from all
</Directory>
<Directory /home/ashkan/freshcopy/django-oscar/sites/sandbox/public/media>
Order allow,deny
allow from all
</Directory>
DocumentRoot /var/www
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
<Directory /var/www/>
Options Indexes FollowSymLinks MultiViews
AllowOverride None
Order allow,deny
allow from all
</Directory>
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
<Directory "/usr/lib/cgi-bin">
AllowOverride None
Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
Order allow,deny
Allow from all
</Directory>
ErrorLog ${APACHE_LOG_DIR}/error.log
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
LogLevel info
CustomLog ${APACHE_LOG_DIR}/access.log combined
Alias /doc/ "/usr/share/doc/"
<Directory "/usr/share/doc/">
Options Indexes MultiViews FollowSymLinks
AllowOverride None
Order deny,allow
Deny from all
Allow from 127.0.0.0/255.0.0.0 ::1/128
</Directory>
</VirtualHost>
Update 2:
Here is my wsgi.py:
import os
import sys
import site
import urllib
sys.stdout = sys.stderr
# Project root
root = '/home/ashkan/django-oscar/sites/sandbox'
sys.path.insert(0, root)
# Packages from virtualenv
activate_this = '/home/ashkan/django-oscar/oscar/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))
# Set environmental variable for Django and fire WSGI handler
os.environ['DJANGO_SETTINGS_MODULE'] = 'settings'
from django.core.wsgi import get_wsgi_application
_application = get_wsgi_application()
def application(environ, start_response):
    environ['PATH_INFO'] = urllib.unquote(environ['REQUEST_URI'].split('?')[0])
    return _application(environ, start_response)