Confused regarding Python positional arguments, something like args=(i,) - tuples

I tested the following code to see the subprocess pool in action:
# coding=utf-8
import os
import sys
from multiprocessing import Pool
import time
import random

def run_proc(param1):
    print("child process %s pid is %s, parent id is %s" %
          (param1, os.getpid(), os.getppid()))
    starttime = time.time()
    time.sleep(random.random() * 3)
    endtime = time.time()
    print('child process %s runs %0.2f seconds.' %
          (param1, (endtime - starttime)))

if __name__ == '__main__':
    print(sys.version)
    pname = sys.argv[0].split('/')[-1]
    print("process %s is running now..., its pid is %s" % (pname, os.getpid()))
    p = Pool(5)
    for i in range(5):
        p.apply_async(run_proc, args=("test" + str(i),))
    print("waiting for all subprocesses to end...")
    p.close()
    p.join()
    print("all subprocesses are over!")
The output was all that I expected:
3.5.0 (default, Jul 23 2017, 10:55:33)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
process mp_basic_pool.py is running now..., its pid is 19352
waiting for all subprocesses to end...
child process test0 pid is 19367, parent id is 19352
child process test1 pid is 19368, parent id is 19352
child process test2 pid is 19369, parent id is 19352
child process test3 pid is 19370, parent id is 19352
child process test4 pid is 19371, parent id is 19352
child process test2 runs 0.93 seconds.
child process test4 runs 1.33 seconds.
child process test3 runs 1.68 seconds.
child process test0 runs 2.68 seconds.
child process test1 runs 2.90 seconds.
all subprocesses are over!
[Finished in 3.2s]
Note the line p.apply_async(run_proc, args=("test"+str(i),)). When I first wrote this code, I left out the trailing comma and wrote p.apply_async(run_proc, args=("test"+str(i))), and the output was:
3.5.0 (default, Jul 23 2017, 10:55:33)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
process mp_basic_pool.py is running now..., its pid is 19382
waiting for all subprocesses to end...
all subprocesses are over!
[Finished in 0.4s]
I looked at the Python documentation and found that the second parameter should be a tuple, but why is the comma needed?

Single-element tuples (which ("test"+str(i),) is) require a trailing comma to differentiate them from a plain pair of parentheses.
Think about it this way: for (x), without the comma, how is the interpreter supposed to know whether you meant the parentheses for grouping or to make a tuple? It's ambiguous, so Python treats (x) as grouping and reserves (x,) for the tuple.
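You can see the difference directly, and also surface the error that was silently swallowed in your comma-less version. A minimal sketch, reusing p and run_proc from the code above; keeping the AsyncResult and calling get() re-raises any exception from the worker:
t = ("test0",)           # a one-element tuple
s = ("test0")            # just a parenthesized string
print(type(t), type(s))  # <class 'tuple'> <class 'str'>

# Without the trailing comma, args is the string "test0": apply_async
# unpacks it into five positional arguments, and run_proc fails inside
# the worker. Pool swallows the exception unless you ask for the result:
res = p.apply_async(run_proc, args=("test0"))
res.get()  # TypeError: run_proc() takes 1 positional argument but 5 were given
That is why the comma-less version printed nothing from the children: every call failed with a TypeError that was never re-raised in the parent.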

Related

TimeDeltaSensor delaying from schedule interval

I have a job which runs at 13:30. Its first task takes almost an hour to complete, and after that we need to wait 15 minutes. So I am using TimeDeltaSensor like below.
waitfor15min = TimeDeltaSensor(
    task_id='waitfor15min',
    delta=timedelta(minutes=15),
    dag=dag)
However, the logs show it checking against schedule_interval + 15 min, like below:
[2020-11-05 20:36:27,013] {time_delta_sensor.py:45} INFO - Checking if the time (2020-11-05T13:45:00+00:00) has come
[2020-11-05 20:36:27,013] {base_sensor_operator.py:79} INFO - Success criteria met. Exiting.
[2020-11-05 20:36:30,655] {logging_mixin.py:95} INFO - [2020-11-05 20:36:30,655] {jobs.py:2612} INFO - Task exited with return code 0
How can I create a delay between tasks?
You could use a PythonOperator and write a function that simply waits 15 minutes. Here is an example of what a wait task could look like:
def my_sleeping_function(random_base, **kwargs):
    """This is a function that will run within the DAG execution"""
    time.sleep(random_base)

# Generate 5 sleeping tasks, sleeping from 0.0 to 0.4 seconds respectively
for i in range(5):
    task = PythonOperator(
        task_id='sleep_for_' + str(i),
        python_callable=my_sleeping_function,
        op_kwargs={'random_base': float(i) / 10},
        provide_context=True,
        dag=dag,
    )
    run_this >> task
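Adapted to the original question, a minimal sketch of a 15-minute wait task could look like the following; first_task, next_task, and dag are assumed stand-ins for the objects in your DAG, not names taken from it:
import time
from airflow.operators.python_operator import PythonOperator

def wait_15_minutes(**kwargs):
    """Block for 15 minutes between the long first task and the next one."""
    time.sleep(15 * 60)

wait_for_15_min = PythonOperator(
    task_id='waitfor15min',
    python_callable=wait_15_minutes,
    provide_context=True,
    dag=dag,
)
# first_task and next_task are hypothetical names for your existing tasks
first_task >> wait_for_15_min >> next_task
Unlike TimeDeltaSensor, which compares against execution date plus schedule interval (hence the 13:45 check in your log), this simply sleeps relative to when the upstream task finished.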

Flask-sqlalchemy / uwsgi: DB connection problem when more than one process is used

I have a Flask app running on Heroku with a uWSGI server, in which each user connects to their own database. I have implemented the solution reported here for a very similar situation. In particular, I have implemented the connection registry as follows:
class DBSessionRegistry():
    _registry = {}

    def get(self, URI, **kwargs):
        if URI not in self._registry:
            current_app.logger.info('INFO - CREATING A NEW CONNECTION')
            try:
                engine = create_engine(URI,
                                       echo=False,
                                       pool_size=5,
                                       max_overflow=5)
                session_factory = sessionmaker(bind=engine)
                Session = scoped_session(session_factory)
                a_session = Session()
                self._registry[URI] = a_session
            except ArgumentError:
                raise Exception('Error')
        current_app.logger.info(f'SESSION ID: {id(self._registry[URI])}')
        current_app.logger.info(f'REGISTRY ID: {id(self._registry)}')
        current_app.logger.info(f'REGISTRY SIZE: {len(self._registry.keys())}')
        current_app.logger.info(f'APP ID: {id(current_app)}')
        return self._registry[URI]
In my create_app() I assign a registry to the app:
app.DBregistry = DBSessionRegistry()
and whenever I need to talk to the DB I call:
current_app.DBregistry.get(URI)
where the URI depends on the user. This works nicely if I use uwsgi with a single process. With more processes,
[uwsgi]
processes = 4
threads = 1
sometimes it gets stuck on some requests, returning a 503 error code. I have found that the problem appears when the requests are handled by different processes in uwsgi. This is an excerpt of the log, which I commented to illustrate the issue:
# ... EVERYTHING OK UP TO HERE.
# ALL PREVIOUS REQUESTS HANDLED BY PROCESS pid = 12
INFO in utils: SESSION ID: 139860361716304
INFO in utils: REGISTRY ID: 139860484608480
INFO in utils: REGISTRY SIZE: 1
INFO in utils: APP ID: 139860526857584
# NOTE THE pid IN THE NEXT LINE...
[pid: 12|app: 0|req: 1/1] POST /manager/_save_task =>
generated 154 bytes in 3457 msecs (HTTP/1.1 200) 4 headers in 601
bytes (1 switches on core 0)
# PREVIOUS REQUEST WAS MANAGED BY PROCESS pid = 12
# THE NEXT REQUEST IS FROM THE SAME USER AND TO THE SAME URL.
# SO THERE IS NO NEED FOR CREATING A NEW CONNECTION, BUT INSTEAD...
INFO - CREATING A NEW CONNECTION
# TO THIS POINT, I DON'T UNDERSTAND WHY IT CREATED A NEW CONNECTION.
# THE SESSION ID CHANGES, AS IT IS A NEW SESSION
INFO in utils: SESSION ID: 139860363793168 # <<--- CHANGED
INFO in utils: REGISTRY ID: 139860484608480
INFO in utils: REGISTRY SIZE: 1
# THE APP AND THE REGISTRY ARE UNIQUE
INFO in utils: APP ID: 139860526857584
# uwsgi GIVES UP...
*** HARAKIRI ON WORKER 4 (pid: 11, try: 1) ***
# THE FAILED REQUEST WAS MANAGED BY PROCESS pid = 11
# I ASSUME THIS IS WHY IT CREATED A NEW CONNECTION
HARAKIRI: -- syscall> 7 0x7fff4290c6d8 0x1 0xffffffff 0x4000 0x0 0x0
0x7fff4290c6b8 0x7f33d6e3cbc4
HARAKIRI: -- wchan> poll_schedule_timeout
HARAKIRI !!! worker 4 status !!!
HARAKIRI [core 0] - POST /manager/_save_task since 1587660997
HARAKIRI !!! end of worker 4 status !!!
heroku[router]: at=error code=H13 desc="Connection closed without
response" method=POST path="/manager/_save_task"
DAMN ! worker 4 (pid: 11) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 4 (new pid: 14)
# FROM HERE ON, NOTHING WORKS ANYMORE
This behavior is consistent over several attempts: when the pid changes, the request fails. Even with pool_size=1 in the create_engine call, the issue persists. No issue occurs if uwsgi is used with a single process.
I am pretty sure it is my fault; there is something I don't know or don't understand about how uwsgi and/or SQLAlchemy work. Could you please help me?
Thanks
What is happening is that you are trying to share memory between processes. Each uWSGI worker is a separate OS process with its own copy of the application state, so a registry filled in one worker is invisible to the others.
There are some explanations in these posts:
(is it possible to share memory between uwsgi processes running flask app?)
(https://stackoverflow.com/a/45383617/11542053)
You can use an extra layer to store your sessions outside of the app. For that, you can use uWSGI's SharedArea (https://uwsgi-docs.readthedocs.io/en/latest/SharedArea.html), which is very low level, or you can use other approaches like uWSGI's caching framework (https://uwsgi-docs.readthedocs.io/en/latest/Caching.html).
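To see the per-process behavior concretely, here is a minimal, hypothetical Flask view (not from your app) that exposes which worker answered; run it under uwsgi with processes = 4 and the counters will diverge between workers, just like your registry does:
import os
from flask import Flask

app = Flask(__name__)

# Module-level state: every uWSGI worker process holds its own copy of
# this dict, so nothing written here is visible to the other workers.
_seen = {}

@app.route('/whoami')
def whoami():
    pid = os.getpid()
    _seen[pid] = _seen.get(pid, 0) + 1
    return f'pid={pid}, requests seen by this worker={_seen[pid]}'
This is why your second request logged CREATING A NEW CONNECTION: it landed on a worker (pid 11) whose registry was still empty.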
Hope it helps.

PAM Authentication failure for root during pexpect python

The observation below is not always the case, but after accessing the SUT several times over SSH as root with the correct password, the Python code gets into trouble with:
Apr 25 05:51:56 SUT sshd[31570]: pam_tally2(sshd:auth): user root (0) tally 83, deny 10
Apr 25 05:52:16 SUT sshd[31598]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.10.10.13 user=root
Apr 25 05:52:21 SUT sshd[31568]: error: PAM: Authentication failure for root from 10.10.10.13
Apr 25 05:52:21 SUT sshd[31568]: Connection closed by 10.10.10.13 [preauth]
This is the Python code:
COMMAND_PROMPT = '.*:~ #'
SSH_NEWKEY = '(?i)are you sure you want to continue connecting'

def scp(source, dest, password):
    cmd = 'scp ' + source + ' ' + dest
    try:
        child = pexpect.spawn('/bin/bash', ['-c', cmd], timeout=None)
        res = child.expect([pexpect.TIMEOUT, SSH_NEWKEY, COMMAND_PROMPT, '(?i)Password'])
        if res == 0:
            print('TIMEOUT Occurred.')
        if res == 1:
            child.sendline('yes')
            child.expect('(?i)Password')
            child.sendline(password)
            child.expect([pexpect.EOF], timeout=60)
        if res == 2:
            pass
        if res == 3:
            child.sendline(password)
            child.expect([pexpect.EOF], timeout=60)
    except:
        print('File not copied!!!')
        self.logger.error(str(self.child))
When the ssh is unsuccessful, this is the pexpect printout:
version: 2.3 ($Revision: 399 $)
command: /usr/bin/ssh
args: ['/usr/bin/ssh', 'root@100.100.100.100']
searcher: searcher_re:
0: re.compile(".*:~ #")
buffer (last 100 chars): :
Account locked due to 757 failed logins
Password:
before (last 100 chars): :
Account locked due to 757 failed logins
Password:
after: <class 'pexpect.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 2284
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0
delayafterclose: 0.1
delayafterterminate: 0.1
Any clue what it could be? Is anything missing or misconfigured for PAM authentication on my SUT? The problem is that once the SUT starts showing these PAM failures, the Python code will always have the problem, and only a reboot of the SUT seems to help :(
Manually accessing the SUT via ssh root@... always works, even when pexpect can't! The account seems not to be locked, according to:
SUT:~ # passwd -S root
root P 04/24/2017 -1 -1 -1 -1
I have looked into some other questions, but no real solution is mentioned there that works with my Python code.
Thanks in advance.
My workaround is to modify, for testing purposes, the pam_tally configuration files. It seems that the SUT treats the repeated access as a threat and locks even the root account!
I removed the entry even_deny_root root_unlock_time=5 from the pam_tally configuration files:
/etc/pam.d/common-account:account required pam_tally2.so deny=10 onerr=fail unlock_time=600 even_deny_root root_unlock_time=5 file=/home/test/faillog
/etc/pam.d/common-auth:auth required pam_tally2.so deny=10 onerr=fail unlock_time=600 even_deny_root root_unlock_time=5 file=/home/test/faillog
Those changes are activated dynamically; no service restart is needed!
Note: after a reboot those entries will most likely be back!
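As an alternative for recovering a locked account without a reboot, the failure counter can be inspected and reset with the pam_tally2 command-line tool (assuming pam_tally2 is in use, as your log lines show):
SUT:~ # pam_tally2 --user root
SUT:~ # pam_tally2 --user root --reset
The first command shows the current tally for root; the second resets it so logins succeed again.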

create background process with Python's Popen

I'm a newbie to Python. Recently I faced a problem with Python's Popen, and I hope someone can help me. Thanks :D
a.py
#!/usr/bin/env python
import b

b.run()
while True:
    pass
b.py
#!/usr/bin/env python
import subprocess

def run():
    subprocess.Popen(['ping www.google.com > /dev/null &'], shell=True)

run()
When I run b.py and grep the process status:
$ ps aux | grep
test 35806 0.0 0.0 2451284 592 s010 Sl 10:11 0:00.00 ping www.google.com
the ping process runs in the background with state Sl.
Now I run a.py:
$ ps aux | grep
test 36088 0.0 0.0 2444116 604 s010 Sl+ 10:15 0:00.00 ping www.google.com
the ping process state changes to Sl+, and if I stop a.py with Ctrl+C, the ping process is also terminated.
Is there any way to make the ping process run in the background so that it is not affected when I stop a.py? And why does the ping process state change from Sl to Sl+?
After some research, I found that we can add preexec_fn=os.setsid (which also requires import os in b.py), and it solves the problem.
subprocess.Popen(['ping www.google.com > /dev/null &'],
                 shell=True, preexec_fn=os.setsid)
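For context: the + in Sl+ marks the terminal's foreground process group. While a.py keeps running in the foreground, the spawned ping stays in a.py's process group, so the SIGINT from Ctrl+C is delivered to it too; os.setsid moves the child into a new session, outside that group. On Python 3.2+ the same effect is available without a custom preexec_fn, as in this minimal sketch:
import subprocess

# start_new_session=True calls setsid() in the child, so Ctrl+C in the
# parent's terminal no longer reaches the ping process.
subprocess.Popen('ping www.google.com > /dev/null',
                 shell=True, start_new_session=True)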

uwsgi worker processes keep running

I am using emperor mode and noticed that a couple of uwsgi worker processes keep using CPU.
Here is the ini config for the particular website
[uwsgi]
socket = /tmp/%n.sock
master = true
processes = 2
env = DJANGO_SETTINGS_MODULE=abc.settings
module = django.core.handlers.wsgi:WSGIHandler()
pythonpath = /var/www/abc/abc
chdir = /var/www/abc/abc
chmod-socket = 666
uid = www-data
virtualenv = /var/www/abc
vacuum = true
procname-prefix-spaced = %n
plugins = python
enable-threads = true
single-interpreter = true
sharedarea = 4
htop shows:
13658 www-data 20 0 204M 59168 4148 S 3.0 3.5 3h03:50 abc uWSGI worker 1
13659 www-data 20 0 209M 65092 4428 S 1.0 3.8 3h02:02 abc uWSGI worker 2
I have checked the nginx and uwsgi logs, and neither shows the site being accessed.
The question is:
why do the workers keep using around 1-5% of the CPU when the site is not being accessed?
I think I have found the cause of this. In development I am using a timer to monitor code changes and then reload the uwsgi processes. The project uses django-cms and is fairly big, so monitoring for code changes every second is heavy; after changing the timer to 5 seconds the processes went quiet.
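If the timer in question is uWSGI's built-in Python autoreloader (an assumption; the answer does not name the mechanism), the scan interval is set directly in the ini:
[uwsgi]
# scan Python modules for changes every 5 seconds instead of every second
py-autoreload = 5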