pygame.init() fails when run with systemd - python-2.7

I'm trying to run a python pygame script with systemd and for some reason the script just exits without an error.
This is on a Raspberry Pi with Raspbian "Jessie Lite".
If I run the script manually with "sudo python myscript.py" it works fine.
sudo systemctl status myscript.service reports:
* myscript.service - Python Script
   Loaded: loaded (/etc/systemd/system/myscript.service; enabled)
   Active: inactive (dead) since Mon 2016-08-29 04:33:19 UTC; 1s ago
  Process: 3275 ExecStart=/usr/bin/python /home/pi/myscript.py (code=killed, signal=HUP)
 Main PID: 3275 (code=killed, signal=HUP)
If I start the service manually with sudo systemctl start myscript.service the same thing happens.
I've stripped down my script to just the pygame.init() call. This is where it exits.
If I try to initialize the modules manually then "cdrom", "joystick", "threads", and "font" initialize normally but a call to display.init() causes the program to exit. There is no exception raised.
The only resource I could find online is this guy. It seems he ran into the exact same thing I'm seeing.
I've tried strace and if I wait long enough (2 minutes), it will work!
Obviously I can't run with strace all the time. I think it slows down the execution of the initialization to somehow allow it to work.
EDIT:
So the issue appears to be systemd sending a SIGHUP. If this is unhandled in Python, the default action is to exit. A quick fix is to catch SIGHUP:
import signal
def handler(signum, frame):
    pass
try:
    signal.signal(signal.SIGHUP, handler)
except AttributeError:
    # Windows compatibility
    pass
So many burning questions. Why does systemd do this? Why does strace fix the issue? Why do some Python scripts get SIGHUP while others don't?

I don't have an answer, but I'm experiencing the same thing, so I'll add some more detail.
Here's the cut-down code:
#!/usr/bin/python2.7
import logging
from pygame import display
import signal
import time
def handler(signum, frame):
    """Why is systemd sending sighups? I DON'T KNOW."""
    logging.warning("Got a {} signal. Doing nothing".format(signum))
signal.signal(signal.SIGHUP, handler)
signal.signal(signal.SIGTERM, handler)
signal.signal(signal.SIGCONT, handler)
logging.warning("About to start display.")
try:
    display.init() # hups
except Exception as ex:
    logging.warning("Got any exception: %s " % ex)
logging.warning("Quitting in 60")
time.sleep(60)
Here's the log that produces:
Jul 8 22:30:27 beardog systemd[1]: Started PyGame Test.
Jul 8 22:30:27 beardog pygame[17406]: WARNING:root:About to start display.
Jul 8 22:30:27 beardog pygame[17406]: WARNING:root:Got a 1 signal. Doing nothing
Jul 8 22:30:27 beardog pygame[17406]: WARNING:root:Got a 18 signal. Doing nothing
Jul 8 22:30:27 beardog pygame[17406]: WARNING:root:Quitting in 60
It's getting a SIGCONT immediately after the SIGHUP, but no SIGTERM. Systemd claims to only ever send a SIGHUP after a SIGTERM, so maybe it's coming from somewhere else? I can't find anything relevant in the pygame code though.
I've turned on systemd debug logging, but it doesn't print anything interesting.
Here's my systemd config.
[Unit]
Description=PyGame Test
After=syslog.target network.target network-online.target graphical.target
[Service]
Type=simple
WorkingDirectory=/path/to/code/pygame/
ExecStart=/path/to/code/pygame/why.py
Restart=always
RestartSec=5
LimitNOFILE=10000
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=pygame
SendSIGHUP=no
[Install]
WantedBy=multi-user.target
I've tried Type=forking, oneshot and dbus (although the script is none of those). I've also tried TimeoutStartSec=20, with no change. Tested on an Ubuntu Xenial laptop and on a Raspberry Pi running Raspbian, with both python2.7 and python3. The code works fine when run manually, and seems to work when run under strace in systemd. /o\
Like the OP, I can work around it by catching the SIGHUP, but after this much debugging I'd love to know what's going on.

For me the solution was to only init specific modules which I needed.
In my case, instead of calling pygame.init() I called only pygame.mixer.init(), and now systemd can start the service.

First, check your logs:
journalctl -u myservice.service
Adding this to your unit file may improve the logging:
StandardOutput=journal+console
In some cases, if some logging happens just before the service exits, the logs won't get tagged with the service, so after you run the service, also look at the output of:
journalctl
for entries just after your service exited.
Also, instead of using network.target, try network-online.target.
I presume the app runs OK from the command line by itself. Assuming it works that way and not from systemd, differing environment variables can be an issue. Add a line to the top of your app to dump all the environment variables and compare that output when run from systemd vs the CLI.
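The environment comparison suggested above can be sketched as follows. This is a minimal example, not part of the original script: the helper name dump_environment is made up for illustration, and it writes to stderr so that the output lands in the journal when the script runs as a service.

```python
import os
import sys


def dump_environment(stream=sys.stderr):
    """Write every environment variable as a sorted KEY=value line."""
    for key in sorted(os.environ):
        stream.write("{}={}\n".format(key, os.environ[key]))


# Call this before anything else in the script (for example before pygame.init()).
dump_environment()
```

Run the script once from the command line and once through systemd, save both outputs, and diff them. Sorting the keys keeps the two dumps line-aligned so the differences stand out.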
Finally, review the docs for Type=. If the default Type= doesn't apply to your case, set this appropriately.
Future questions specific to systemd may be better answered at the Unix & Linux StackExchange as they don't directly relate to programming.
You should also post your systemd unit file since that's what you are having trouble with.

Can systemd version be upgraded to v240 or higher on Centos 7?

Currently I'm facing an issue where I would like to redirect stdout/stderr to specific log files. I have created a service file for a systemd service and added the directives for redirection, but they do not work because the installed systemd version (219) does not support them; v240+ is required. My machine is CentOS 7.7.
Service file:
=============
[Unit]
Description=Process Monitoring and Control Daemon
After=rc-local.service nss-user-lookup.target
[Service]
User=jams
#Type=forking
WorkingDirectory=/opt/workspace/Dashboard/source-code/dashboard/
ExecStart=/opt/workspace/.env/bin/python kafka_consumer.py
StandardOutput=append:/data/dashboard/access.log
StandardError=append:/data/dashboard/error.log
Restart=always
[Install]
WantedBy=multi-user.target
$ systemctl --version
systemd 219
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN
$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
Sorry, I'm not very familiar with this and know very little about it. Is it possible to upgrade systemd to v240 or higher on CentOS 7.7? If yes, could anyone please point me in the right direction to get the source code, along with the steps to build the package?
Or, any alternatives ?
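I see the question is rather old, but if you are still searching for a solution then something like this could work. Note that systemd does not run ExecStart through a shell, so the redirection has to be wrapped in an explicit /bin/sh -c (and the StandardOutput=/StandardError= append: lines dropped):
ExecStart=/bin/sh -c 'exec /opt/workspace/.env/bin/python kafka_consumer.py >> /data/dashboard/access.log 2>> /data/dashboard/error.log'
A similar solution works for me for a similar problem on CentOS 7 with systemd 219.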
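If changing the unit file is not convenient, another alternative is to let the Python process redirect its own output at startup. The following is a minimal sketch under that assumption; the function name redirect_output is hypothetical and is not something the original kafka_consumer.py is known to contain.

```python
import os
import sys


def redirect_output(stdout_path, stderr_path):
    """Point file descriptors 1 and 2 at append-mode log files.

    Working at the descriptor level means child processes inherit
    the redirection as well.
    """
    sys.stdout.flush()
    sys.stderr.flush()
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    out_fd = os.open(stdout_path, flags, 0o644)
    err_fd = os.open(stderr_path, flags, 0o644)
    os.dup2(out_fd, 1)  # stdout now appends to stdout_path
    os.dup2(err_fd, 2)  # stderr now appends to stderr_path
    os.close(out_fd)
    os.close(err_fd)
```

With the paths from the question, the call at the top of the script would be redirect_output("/data/dashboard/access.log", "/data/dashboard/error.log"). The service user (jams in the unit file above) needs write permission on that directory.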

How to fix Blogdown::serve_site() timeout error?

I have tried the solutions I found online for this, but none of them seem to work...
Even after installing blogdown from github and updating hugo, I get a timeout error.
remotes::install_github('rstudio/blogdown')
blogdown::update_hugo()
That is, when I run blogdown::serve_site I get this error:
Launching the server via the command:
C:\Users\Master\AppData\Roaming\Hugo\hugo.exe server --bind 127.0.0.1 -p 4321 --themesDir themes -t hugo-future-imperfect -D -F --navigateToChanged
ERROR: The process "10244" not found.
Error: It took more than 30 seconds to launch the server. There may be something wrong. The process has been killed. If the site needs more time to be built and launched, set options(blogdown.server.timeout) to a larger value.
Is there another way to fix this?
What is causing this error?
Thanks!
I have the same problem on Kubuntu 20.10. I installed the latest blogdown and hugo several days ago (blogdown 0.21.47 and hugo 0.79.0). Eventually I found a way to fix it:
Close Rstudio
Open terminal
Enter directory of your website
Execute hugo server -D
Press Ctrl C to stop
Open Rstudio
Click Addins -> BLOGDOWN -> Serve Site
I don't know why it works or whether it works in your case. Please test it and let us know whether it works.
I am running into the same problem.
blogdown::serve_site()
won't work:
Error: It took more than 30 seconds to launch the server. An error might have occurred with hugo. You may run blogdown::build_site() and see if it gives more info. If the site is very large and needs more time to be built, set options(blogdown.server.timeout) to a larger value.
although blogdown::build_site() works smoothly. The blogdown::serve_site() function works fine for a newly created project/website.
For repository see here.
Not sure if it is a blogdown, hugo or academic theme issue.
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)
Matrix products: default
locale:
[1] LC_COLLATE=Dutch_Netherlands.1252 LC_CTYPE=Dutch_Netherlands.1252
[3] LC_MONETARY=Dutch_Netherlands.1252 LC_NUMERIC=C
[5] LC_TIME=Dutch_Netherlands.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] servr_0.20
loaded via a namespace (and not attached):
[1] Rcpp_1.0.5 rstudioapi_0.13 knitr_1.30 magrittr_2.0.1 mnormt_2.0.2 pbivnorm_0.6.0
[7] R6_2.5.0 rlang_0.4.9 tools_4.0.3 tmvnsim_1.0-2 xfun_0.19 htmltools_0.5.0
[13] yaml_2.2.1 digest_0.6.27 lavaan_0.6-7 bookdown_0.21 processx_3.4.5 later_1.1.0.1
[19] promises_1.1.1 ps_1.4.0 evaluate_0.14 rmarkdown_2.5 blogdown_0.21.50 compiler_4.0.3
[25] stats4_4.0.3 jsonlite_1.7.1 httpuv_1.5.4
I had the same problem. Installing Go (https://golang.org/doc/install) helped; it was missing after my clean install of macOS Big Sur.

Pabot - Unable to run parallel robotframework tests

So, I'm working on a robotframework test project, and the goal is to run several test suites in parallel. For this purpose, pabot was chosen as the solution. I am trying to implement it, but with little success.
My issue is: after installing Pabot (which, I might say, I did by cloning the project and running "setup.py install", instead of using pip, since the corporate proxy I'm behind has proven an obstacle I can't overcome), I created a new directory in the project tree, moved some suites there, and ran:
pabot --processes 2 --outputdir pabot_results Login*.robot
Doing so results in the following error message:
2018-10-10 10:27:30.449000 [PID:9676] [0] EXECUTING Suites.LoginAdmin
2018-10-10 10:27:30.449000 PID:400 EXECUTING Suites.LoginUser
2018-10-10 10:27:30.777000 PID:400 FAILED Suites.LoginUser
2018-10-10 10:27:30.777000 [PID:9676] [0] FAILED Suites.LoginAdmin
WARN: No output files in "pabot_results\pabot_results"
Output:
[ ERROR ] Reading XML source '' failed: invalid mode ('rb') or filename
Try --help for usage information.
Elapsed time: 0 minutes 0.578 seconds
Upon inspecting the stderr file that was generated, I have this message:
Traceback (most recent call last):
  File "C:\Python27\Lib\site-packages\robotframework-3.1a2.dev1-py2.7.egg\robot\running\runner.py", line 22, in <module>
    from .context import EXECUTION_CONTEXTS
ValueError: Attempted relative import in non-package
Apparently, this has to do with something from the runner.py script, which, if I'm not mistaken, came with the installation of robotframework. Since manually modifying that script does not seem to me the optimal solution, my question is, what am I missing here? Did I forget to do anything while setting this up? Or is this an issue of compatibility between versions?
This project is using Maven as the tool to manage dependencies. The version I am running is 3.5.4. I am using a Windows 10, 64bit system; I have Python 2.7.14, and Robot Framework 3.1a2.dev1. The Pabot version is 0.44. Obviously, I added C:\Python27 and C:\Python27\Scripts to the PATH environment variable.
Edit: I am also using robotframework-maven-plugin version 1.4.0.8, if that happens to be relevant.
Edit 2: added the error messages in text format.
I believe I came across a similar issue when setting up parallel execution on my machine. First, I would confirm that pabot is installed, using pip show robotframework-pabot.
Then you should define the directory your results go to, using -d.
I also set the output file name to Output.xml with -o to make it easy to identify.
This is the command I use; it runs well with 8 processes:
pabot --processes 8 -d results -o Output.xml Tests
Seems that you stumbled on a bug in the prerelease version of robot framework (3.1a2.dev1).
Please install a release version of robot framework. For example 3.0.4.
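To tell a pre-release build from a release, the version string can be inspected. The sketch below is an illustration only: the helper is_prerelease is hypothetical, it assumes version strings shaped like the ones mentioned in this thread, and the string itself would have to be obtained separately (for example from the output of robot --version).

```python
import re

# Markers that indicate a pre-release: a/b/rc segments between digits, or ".dev".
_PRERELEASE = re.compile(r"(\d(a|b|rc)\d)|(\.dev\d*)")


def is_prerelease(version):
    """Return True for versions such as '3.1a2.dev1', False for '3.0.4'."""
    return bool(_PRERELEASE.search(version))


print(is_prerelease("3.1a2.dev1"))  # the version from the question
print(is_prerelease("3.0.4"))       # the suggested release version
```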
Just in case anyone happens to stumble upon this issue in the future:
Since I can't use pip, and I tried a good deal of workarounds that eventually made things more unstable, I ended up saving my project and removing everything Python-related from my system, so as to allow me to install everything from scratch. In a Windows 10, 64bit system, I used:
Python 2.7.14
wxPython 2.8.12.1, win64, unicode, for py27
setuptools 40.2.0 (to allow me to use the easy_install command)
Robot Framework 3.0.4
robotremoteserver 1.1
Selenium2Library 3.0.0
and Pabot version 0.45.
I might add that, when installing the Selenium2Library the way I described above, it eventually tries to download some things from the pip repositories - which, if you have a proxy, will cause you trouble. I solved this problem by browsing https://pypi.org/simple/selenium/, manually downloading the 2.53.6 .tar.gz file, then extracting it and running setup.py install on the command line.
PS: Ideally, though, anyone should be able to use proxy settings from the command line (--proxy http://user:password@server:port) to get pip and then use it; however, for some reason, probably related to network security configurations that I didn't want to lose time with, this didn't work in my case.

getting 'cgroup change of group failed' when trying to add process to cgroup

I did the following both on Ubuntu 14 and SUSE Linux Enterprise Server 11 (x86_64) where libcgroup is installed, with root:
cgcreate -t ngam:home -a ngam:home -g cpuset:/nadav2ndCpuSet
cgset -r cpuset.cpus=1 nadav2ndCpuSet
After that, if you cat /sys/fs/cgroup/cpuset/nadav2ndCpuSet/cpuset.cpus,
you will get:
1
which is good! as it is supposed to work.
Then, from user ngam, I ran the following cmd:
cgexec -g cpuset:nadav2ndCpuSet ~/whileLoop
where whileLoop is just a simple program that runs in a loop doing sqrt.
After that, I got the following error msg:
cgroup change of group failed
Why is it happening?
Thanks!
I ran into something similar while playing with cgroups on Ubuntu 16.04 just now.
When using the cpuset controller, cpuset.cpus and cpuset.mems are not initialized, so you have to set them manually. Since you already specified cpuset.cpus, you only need to set cpuset.mems.
Simply running
echo 0 > /sys/fs/cgroup/cpuset/nadav2ndCpuSet/cpuset.mems
or
cgset -r cpuset.mems=0 nadav2ndCpuSet
would solve your problem.
for more info on cpuset see http://man7.org/linux/man-pages/man7/cpuset.7.html
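Before running cgexec, it can help to confirm that both files are populated, since a cpuset with an empty cpus or mems value will not accept tasks. A minimal sketch, assuming the cgroup v1 layout used in this question; the function name unset_cpuset_files is made up for illustration:

```python
import os


def unset_cpuset_files(cgroup_dir, required=("cpuset.cpus", "cpuset.mems")):
    """Return the required cpuset files that are missing or empty."""
    unset = []
    for name in required:
        path = os.path.join(cgroup_dir, name)
        try:
            with open(path) as handle:
                value = handle.read().strip()
        except IOError:
            # A missing or unreadable file is treated as not set.
            value = ""
        if not value:
            unset.append(name)
    return unset
```

For the group in the question, unset_cpuset_files("/sys/fs/cgroup/cpuset/nadav2ndCpuSet") should return an empty list once cpuset.mems has been set.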
What I found is that I had forgotten to make cgconfig start at boot, so a simple systemctl start cgconfig resolved the problem. Do not forget systemctl enable cgconfig afterwards so that it starts on every reboot.
I know this answer might not be directly relevant to the question, but I hope it helps people who search for the error cgroup change of group failed.
BTW: systemctl start cgconfig is for CentOS 7; on CentOS 6 you may use service cgconfig start / chkconfig cgconfig on

serial.Serial.readline() raises SerialException, but the same code worked a week ago

I have a pair of applications that communicate by sending text (in one direction only) over a serial port. They have been working great for a while. Last week the reading side stopped working on my machine, and raises a SerialException whenever I call the readline() method of my serial.Serial object. The same code works fine on another machine! The only thing I can think of that could have caused a problem is that I installed a bunch of system updates the night before this happened (Any idea how to see the history on that?). I'm using Ubuntu and Python 2.7.6 (see below), and as far as I can tell I have the same python packages installed on both machines.
I've written two small sample apps to try to troubleshoot, and am getting the following error on the reading side:
  File "./reader.py", line 16, in <module>
    s = port.readline()
  File "/usr/local/lib/python2.7/dist-packages/serial/serialposix.py", line 475, in read
    raise SerialException('device reports readiness to read but returned no data (device disconnected or multiple access on port?)')
It doesn't seem to matter whether I use a "real" port or a "virtual" port, so this can be reproduced by creating two virtual ports with the following command:
socat -d -d pty,raw,echo=0 pty,raw,echo=0
Here is the sample "writer.py" that I created for troubleshooting:
#!/usr/bin/env python
from __future__ import print_function
import serial
print( 'Opening port' )
port = serial.Serial( port='/dev/pts/5', # Substitute the correct port here!
                      baudrate=115200,
                      bytesize=serial.EIGHTBITS,
                      parity=serial.PARITY_NONE,
                      stopbits=serial.STOPBITS_ONE,
                      timeout=0.25 )
while True:
    s = raw_input()
    if s:
        port.write( s + '\n' )
This works great - I can read the text coming through the port using an app like "Serial port terminal" or such.
Here is the sample "reader.py" that works fine on another machine but fails immediately on mine:
#!/usr/bin/env python
from __future__ import print_function
import serial
print( 'Opening port' )
port = serial.Serial( port='/dev/pts/10',
                      baudrate=115200,
                      bytesize=serial.EIGHTBITS,
                      parity=serial.PARITY_NONE,
                      stopbits=serial.STOPBITS_ONE,
                      timeout=0.25 )
while True:
    s = port.readline()
    if s:
        print( s )
Once I create the virtual port with the socat command and run "reader.py", I always get the exception immediately. Any ideas what might have changed on my machine that would cause this failure?
System info:
~/temp$ uname -a
Linux alonghi-ubu 3.13.0-65-generic #105-Ubuntu SMP Mon Sep 21 18:50:58 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
~/temp$ python --version
Python 2.7.6
The 3.13.0-65 kernel on Ubuntu 14.04 breaks Python serial communication. Try downgrading the kernel to 3.13.0-63 and serial communication should work as before.
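To check whether a machine is running the kernel build named above, the release string can be compared directly. This is a small sketch; the helper is_affected_kernel and the constant are illustrative names, and the only version treated as affected is the one reported in this thread.

```python
import platform

# Kernel build from the question's uname output (3.13.0-65-generic).
AFFECTED_RELEASE = "3.13.0-65"


def is_affected_kernel(release=None):
    """Return True if the given (or running) kernel release is the 3.13.0-65 build."""
    if release is None:
        release = platform.release()
    return release == AFFECTED_RELEASE or release.startswith(AFFECTED_RELEASE + "-")


print(is_affected_kernel())
```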