How to use Tivix django-cron app - django

I've got the exact same problem described in this post, but the answer doesn't help at all. In short, I am using Tivix django-cron, and the cron job is not running on a regular basis.
To illustrate the problem, the following cron job class is intended to send an email every minute once the runcrons command is run. But in fact, it only sends out one email and no more. That defeats the purpose of cron... What am I missing?
from django_cron import CronJobBase, Schedule
from django.core.mail import send_mail

class TestCron(CronJobBase):
    schedule = Schedule(run_every_mins=1)
    code = 'test_cron_philip'

    def do(self):
        send_mail('cron test', 'body is test body', 'coach_zhong@163.com',
                  ['admin@dessert.webfactional.com'], fail_silently=False)

Yes, you're missing something (runcrons is not a background daemon). From the documentation:
"Now everytime you run the management command python manage.py
runcrons all the crons will run if required. Depending on the
application the management command can be called from the Unix crontab
as often as required. Every 5 minutes usually works for most of my
applications."
That means you have to put the runcrons command in your crontab.
Example:
You have some CronJob that does something every 30 minutes.
To get this running, you must edit your crontab (Linux, macOS) or Task Scheduler (Windows) to run python manage.py runcrons every, let's say, 1 minute.
If you get this running, your CronJob will be pinged every minute and will run only when necessary (every 30 minutes or whatever value you have set).
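For example, a minimal crontab entry, assuming your project lives at /home/user/myproject and its virtualenv at /home/user/venv (both paths are placeholders for your own setup):
* * * * * cd /home/user/myproject && /home/user/venv/bin/python manage.py runcrons >> /var/log/django_cron.log 2>&1
django-cron records each job's last run (keyed by its code attribute), so pinging every minute is cheap; each job still only fires once its run_every_mins interval has elapsed.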
Hope this helps.

Related

AWS Cron is not working for removing files [duplicate]

I have set up cron jobs for the root user in an Ubuntu environment as follows, by typing crontab -e:
34 11 * * * sh /srv/www/live/CronJobs/daily.sh
0 08 * * 2 sh /srv/www/live/CronJobs/weekly.sh
0 08 1 * * sh /srv/www/live/CronJobs/monthly.sh
But the cronjob does not run. I have tried checking if the cronjob is running using pgrep cron and that gives process id 3033. The shell script calls a python file and is used to send an email. Running the python file is ok. There's no error in it but the cron doesn't run. The daily.sh file has the following code in it.
python /srv/www/live/CronJobs/daily.py
python /srv/www/live/CronJobs/notification_email.py
python /srv/www/live/CronJobs/log_kpi.py
WTF?! My cronjob doesn't run?!
Here's a checklist guide to debug not running cronjobs:
Is the Cron daemon running?
Run ps ax | grep cron and look for cron.
Debian: service cron start or service cron restart
Is cron working?
* * * * * /bin/echo "cron works" >> /tmp/file
Syntax correct? See below.
You obviously need to have write access to the file you are redirecting the output to. A unique file name in /tmp which does not currently exist should always be writable.
Probably also add 2>&1 to include standard error as well as standard output, or separately output standard error to another file with 2>>/tmp/errors
Is the command working standalone?
Check if the script has an error, by doing a dry run on the CLI
When testing your command, test as the user whose crontab you are editing, which might not be your login or root
Can cron run your job?
Check /var/log/cron.log or /var/log/messages for errors.
Ubuntu: grep CRON /var/log/syslog
Redhat: /var/log/cron
Check permissions
Set executable flag on the command: chmod +x /var/www/app/cron/do-stuff.php
If you redirect the output of your command to a file, verify you have permission to write to that file/directory
Check paths
Check the shebang / hashbang line
Do not rely on environment variables like PATH, as their value will likely not be the same under cron as under an interactive session. See How to get CRON to call in the correct PATHs
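As a concrete sketch, you can pin the environment at the top of the crontab instead of relying on inherited values (the PATH shown is only an example; list whatever directories your jobs actually need, and my-script.sh is a placeholder):
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
# Jobs below now run with a known shell and search path
* * * * * my-script.sh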
Don't suppress output while debugging
Commonly used is this suppression: 30 1 * * * command > /dev/null 2>&1
Re-enable the standard output or standard error message output by removing >/dev/null 2>&1 altogether; or perhaps redirect to a file in a location where you have write access: >>cron.out 2>&1 will append standard output and standard error to cron.out in the invoking user's home directory.
If you don't redirect output from a cron job, the daemon will try to send you any output or error messages by email. Check your inbox (maybe simply more $MAIL if you don't have a mail client). If mail is not available, maybe check for a file named dead.letter in your home directory, or system log entries saying that the output was discarded. Especially in the latter case, probably edit the job to add redirection to a file, then wait for the job to run, and examine the log file for error messages or other useful feedback.
If you are trying to figure out why something failed, the error messages will be visible in this file. Read it and understand it.
Still not working? Yikes!
Raise the cron debug level
Debian
in /etc/default/cron
set EXTRA_OPTS="-L 2"
service cron restart
tail -f /var/log/syslog to see the scripts executed
Ubuntu
in /etc/rsyslog.d/50-default.conf
add or uncomment the line cron.* /var/log/cron.log
reload logger sudo /etc/init.d/rsyslog restart
re-run cron
open /var/log/cron.log and look for detailed error output
Reminder: deactivate log level, when you are done with debugging
Run cron and check log files again
Cronjob Syntax
# Minute   Hour    Day of Month   Month              Day of Week        User   Command
# (0-59)   (0-23)  (1-31)         (1-12 or Jan-Dec)  (0-6 or Sun-Sat)
0 2 * * * root /usr/bin/find
This syntax is only correct for the system crontab (/etc/crontab and files in /etc/cron.d). Per-user crontab syntax doesn't have the User field (regular users aren't allowed to run code as any other user):
# Minute   Hour    Day of Month   Month              Day of Week        Command
# (0-59)   (0-23)  (1-31)         (1-12 or Jan-Dec)  (0-6 or Sun-Sat)
0 2 * * * /usr/bin/find
Crontab Commands
crontab -l
Lists all the user's cron tasks.
crontab -e, for a specific user: crontab -e -u agentsmith
Starts edit session of your crontab file.
When you exit the editor, the modified crontab is installed automatically.
crontab -r
Removes your crontab file from the cron spool, deleting all of your scheduled jobs. Use with care: there is no confirmation prompt.
Another reason crontab will fail: special handling of the % character.
From the manual page:
The entire command portion of the line, up to a newline or a
"%" character, will be executed by /bin/sh or by the shell specified
in the SHELL variable of the cronfile. A "%" character in the
command, unless escaped with a backslash (\), will be changed into
newline characters, and all data after the first % will be sent to
the command as standard input.
In my particular case, I was using date --date="7 days ago" "+%Y-%m-%d" to produce parameters to my script, and it was failing silently. I finally found out what was going on when I checked syslog and saw my command was truncated at the % symbol. You need to escape it like this:
date --date="7 days ago" "+\%Y-\%m-\%d"
See here for more details:
http://www.ducea.com/2008/11/12/using-the-character-in-crontab-entries/
Finally I found the solution. The solution is:
Never use relative paths in Python scripts to be executed via crontab.
I did something like this instead:
import os
import sys
import time, datetime

CLASS_PATH = '/srv/www/live/mainapp/classes'
SETTINGS_PATH = '/srv/www/live/foodtrade'
sys.path.insert(0, CLASS_PATH)
sys.path.insert(1, SETTINGS_PATH)

import other_py_files
Also, never suppress the crontab output; instead, use the mail server and check the mail for the user. That gives much clearer insight into what is going on.
I want to add 2 points that I learned:
Cron config files placed in /etc/cron.d/ must not contain a dot (.) in their filename; otherwise, they won't be read by cron.
If the user running your command does not have an entry in /etc/shadow, it won't be allowed to schedule cron jobs.
Refs:
http://manpages.ubuntu.com/manpages/xenial/en/man8/cron.8.html
https://help.ubuntu.com/community/CronHowto
To add another point: a file in /etc/cron.d must end with an empty new line. This is likely related to the response by Luciano, which specifies that:
The entire command portion of the line, up to a newline or a "%"
character, will be executed
I found useful debugging information on an Ubuntu 16.04 server by running:
systemctl status cron.service
In my case I was kindly informed I had left a comment '#' off of a remark line:
Aug 18 19:12:01 is-feb19 cron[14307]: Error: bad minute; while reading /etc/crontab
Aug 18 19:12:01 is-feb19 cron[14307]: (*system*) ERROR (Syntax error, this crontab file will be ignored)
It might also be a timezone problem.
Cron uses the local time.
Run the command timedatectl to see the machine time and make sure that your crontab is in this same timezone.
https://askubuntu.com/a/536489/1043751
I had a similar problem.
My Issue
My issue was that cron/crontab wouldn't execute my bash script; that bash script executed a Python script.
original bash file
#!/bin/bash
python /home/frosty/code/test_scripts/test.py
python file (test.py)
from datetime import datetime

def main():
    dt_now = datetime.now()
    string_now = dt_now.strftime('%Y-%m-%d %H:%M:%S.%f')
    with open('./text_file.txt', 'a') as f:
        f.write(f'wrote at {string_now}\n')
    return None

if __name__ == '__main__':
    main()
the error I was getting
File "/home/frosty/code/test_scripts/test.py", line 7
string_to_write = f'wrote at {string_now}\n'
^
SyntaxError: invalid syntax
This error didn't make sense, because the code executed without error when I ran the bash file or the Python file directly.
Note: make sure that in your crontab -e entry you don't suppress the output. I sent the output to a file by adding >>/path/to/cron/output/file.log 2>&1 after the command. Below is my crontab -e entry:
*/5 * * * * /home/frosty/code/test_scripts/echo_message_sh >>/home/frosty/code/test_scripts/cron_out.log 2>&1
the issue
cron was using the wrong Python interpreter, most likely Python 2 judging by the syntax error (f-strings require Python 3.6+).
how I solved the problem
I changed my bash file to the following
#!/bin/bash
conda_shell=/home/frosty/anaconda3/etc/profile.d/conda.sh
conda_env=base
source ${conda_shell}
conda activate ${conda_env}
python /home/frosty/code/test_scripts/test.py
And I changed my python file to the following
from datetime import datetime

def main():
    dt_now = datetime.now()
    string_now = dt_now.strftime('%Y-%m-%d %H:%M:%S.%f')
    string_file = '/home/frosty/code/test_scripts/text_file.txt'
    string_to_write = 'wrote at {}\n'.format(string_now)
    with open(string_file, 'a') as f:
        f.write(string_to_write)
    return None

if __name__ == '__main__':
    main()
No MTA installed, discarding output
I had a similar problem with a PHP file executed as a CRON job.
When I executed the file manually it worked, but not as a CRON job.
I got the output message: "No MTA installed, discarding output"
Postfix is the default Mail Transfer Agent (MTA) in Ubuntu and can be installed using:
sudo apt-get install postfix
But this same message can also appear when you log to a file as below and the cron user does not have write permission to /path/to/logfile.log:
/path/to/php -f /path/to/script.php >> /path/to/logfile.log
The permission issue can occur if you create the cron log file manually with a command like touch while logged in as a different user, and then add CRON jobs for another user (or group) such as www-data using sudo crontab -u www-data -e. The CRON daemon then tries to write to the log file and fails, tries to send the output as an email using Ubuntu's MTA instead, and when no MTA is found, outputs "No MTA installed, discarding output".
To prevent this:
Create the file with proper permission.
Or avoid creating the CRON log file manually: specify the log redirection in the CRON tab and let the file be created automatically, with the right ownership, the first time the cron job runs.
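If you do create the file yourself (the first option), a minimal sketch using the www-data user from above (the log path is an example):
sudo touch /var/log/my-cron.log
sudo chown www-data:www-data /var/log/my-cron.log
sudo chmod 644 /var/log/my-cron.log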
I've found another reason for a user's crontab not running: the hostname is not present in the hosts file:
user@ubuntu:~$ cat /etc/hostname
ubuntu
Now the hosts file:
user@ubuntu:~$ cat /etc/hosts
127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
This is on Ubuntu 14.04.3 LTS. The way to fix it is to add the hostname to the hosts file so it resembles something like this:
user@ubuntu:~$ cat /etc/hosts
127.0.0.1 ubuntu localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
For me, the solution was that the file cron was trying to run was in an encrypted directory, more specifically a user directory under /home/. Although the crontab was configured as root, because the script being run existed in an encrypted user directory in /home/, cron could only read this directory when the user was actually logged in. To see if the directory is encrypted, check if this directory exists:
/home/.ecryptfs/<yourusername>
If so, then you have an encrypted home directory.
The fix for me was to move the script into a non-encrypted directory, and everything worked fine.
As this is becoming a canonical for troubleshooting cron issues, allow me to add one specific but rather complex issue: If you are attempting to run a GUI program from cron, you are probably Doing It Wrong.
A common symptom is receiving error messages about DISPLAY being unset, or the cron job's process being unable to access the display.
In brief, this means that the program you are trying to run is attempting to render something on an X11 (or Wayland etc) display, and failing, because cron is not attached to a graphical environment, or in fact any kind of input/output facility at all, beyond being able to read and write files, and send email if the system is configured to allow that.
For the purposes of "I'm unable to run my graphical cron job", let's just point out in broad strokes three common scenarios for this problem.
Probably identify the case you are trying to implement, and search for related questions about that particular scenario to learn more, and find actual solutions with actual code.
If you are trying to develop an interactive program which communicates with a user, you want to rethink your approach. A common, but nontrivial, arrangement is to split the program in two: A back-end service which can run from cron, but which does not have any user-visible interactive facilities, and a front-end client which the user runs from their GUI when they want to communicate with the back-end service.
Probably your user client should simply be added to the user(s)' GUI startup script if it needs to be, or they want to, run automatically when they log in.
I suppose the back-end service could be started from cron, but if it requires a GUI to be useful, maybe start it from the X11 server's startup scripts instead; and if not, probably run it from a regular startup script (systemd these days, or /etc/rc.local or a similar system startup directory more traditionally).1
If you are trying to run a GUI program without interacting with a real user 2, you may be able to set up a "headless" X11 server 3 and run a cron job which starts up that server, runs your job, and quits.
Probably your job should simply run a suitable X11 server from cron (separate from any interactive X11 server which manages the actual physical display(s) and attached graphics card(s) and keyboard(s) available to the system), and pass it a configuration which runs the client(s) you want to run once it's up and running. (See also the next point for some practical considerations.)
You are running a computer for the sole purpose of displaying a specific application in a GUI, and you want to start that application when the computer is booted.
Probably your startup scripts should simply run the GUI (X11 or whatever) and hook into its startup script to also run the client program once the GUI is up and running. In other words, you don't need cron here; just configure the startup scripts to run the desktop GUI, and configure the desktop GUI to run your application as part of the (presumably automatic, guest?) login sequence.4
There are ways to run X11 programs on the system's primary display (DISPLAY=:0.0) but doing that from a cron job is often problematic, as that display is usually reserved for actual interactive use by the first user who logs in and starts a graphical desktop. On a single-user system, you might be able to live with the side effects if that user is also you, but this tends to have inconvenient consequences and scale very poorly.
An additional complication is deciding which user to run the cron job as. A shared system resource like a back-end service can and probably should be run by root (though ideally have a dedicated system account which it switches into once it has acquired access to any privileged resources it needs) but anything involving a GUI should definitely not be run as root at any point.
A related, but distinct problem is to interact in any meaningful way with the user. If you can identify the user's active session (to the extent that this is even well-defined in the first place), how do you grab their attention without interfering with whatever else they are in the middle of? But more fundamentally, how do you even find them? If they are not logged in at all, what do you do then? If they are, how do you determine that they are active and available? If they are logged in more than once, which terminal are they using, and is it safe to interrupt that session? Similarly, if they are logged in to the GUI, they might miss a window you spring up on the local console, if they are actually logged in remotely via VNC or a remote X11 server.
As a further aside: On dedicated servers (web hosting services, supercomputing clusters, etc) you might even be breaking the terms of service of the hosting company or institution if you install an interactive graphical desktop you can connect to from the outside world, or even at all.
1
The @reboot hook in cron is a convenience for regular users who don't have any other facility for running something when the system comes up, but it's just inconvenient and obscure to hide something there if you are root anyway and have complete control over the system. Use the system facilities to launch system services.
2
A common use case is running a web browser which needs to run a full GUI client, but which is being controlled programmatically and which doesn't really need to display anything anywhere, for example to scrape sites which use Javascript and thus require a full graphical browser to render the information you want to extract.
Another is poorly designed scientific or office software which was not written for batch use, and thus requires a GUI even when you just want to run a batch job and then immediately quit without any actual need to display anything anywhere.
(In the latter case, probably review the documentation to check if there isn't a --batch or --noninteractive or --headless or --script or --eval option or similar to run the tool without the GUI, or perhaps a separate utility for noninteractive use.)
3
Xvfb is the de facto standard solution; it runs a "virtual framebuffer" where the computer can spit out pixels as if to a display, but which isn't actually connected to any display hardware.
4
There are several options here.
The absolutely simplest is to set up the system to automatically log in a specific user at startup without a password prompt, and configure that user's desktop environment (Gnome or KDE or XFCE or what have you) to run your script from its "Startup Items" or "Login Actions" or "Autostart" or whatever the facility might be called. If you need more control over the environment, maybe run bare X11 without a desktop environment or window manager at all, and just run your script instead. Or in some cases, maybe replace the X11 login manager ("greeter") with something custom built.
The X11 stack is quite modular, and there are several hooks in various layers where you could run a script either as part of a standard startup process, or one which completely replaces a standard layer. These things tend to differ somewhat between distros and implementations, and over time, so this answer is necessarily vague and incomplete around these matters. Again, probably try to find an existing question about how to do things for your specific platform (Ubuntu, Raspbian, Gnome, KDE, what?) and scenario. For simple scenarios, perhaps see Ubuntu - run bash script on startup with visible terminal
I experienced the same problem, where crons were not running. We fixed it by changing the owner and permissions:
the cron scripts were made root-owned, since root was the user mentioned in the crontab, and
the cron job files were given 644 permissions.
There are already a lot of answers, but none of them helped me, so I'll add mine here in case it's useful for somebody else.
In my situation, my cron jobs were working fine until there was a power outage that cut the power to my Raspberry Pi, and cron got corrupted; I think it was running a long Python script exactly when the outage happened. Nothing in the main answer above worked for me. The solution, however, was quite simple. I just had to force reinstallation of cron with:
sudo apt-get --reinstall install cron
It worked right away after this.
Copying my answer for a duplicated question here.
cron may not know where to find the Python interpreter because it doesn't share your user account's environment variables.
There are 3 solutions to this:
If Python is at /usr/bin/python, you can change the cron job to use an absolute path: /usr/bin/python /srv/www/live/CronJobs/daily.py
Alternatively you can also add a PATH value to the crontab with PATH=/usr/bin.
Another solution would be to specify an interpreter in the script file, make it executable, and call the script itself in your crontab:
a. Put shebang at the top of your python file: #!/usr/bin/python.
b. Set it to executable: $ chmod +x /srv/www/live/CronJobs/daily.py
c. Put it in crontab: /srv/www/live/CronJobs/daily.py
Adjust the path to the Python interpreter if it's different on your system.
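For instance, the first option applied to this question's daily job would look like this in the crontab (schedule and paths taken from the question above):
34 11 * * * /usr/bin/python /srv/www/live/CronJobs/daily.py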
CRON uses a different TIMEZONE
A very common issue: the cron daemon's time settings may be different from yours. In particular, the timezone may not be the same:
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
You can run:
* * * * * echo $(date) >> /tmp/test.txt
This should generate a file like:
# cat test.txt
Sun 03 Apr 2022 09:02:01 AM UTC
Sun 03 Apr 2022 09:03:01 AM UTC
Sun 03 Apr 2022 09:04:01 AM UTC
Sun 03 Apr 2022 09:05:01 AM UTC
Sun 03 Apr 2022 09:06:01 AM UTC
If you are using a TZ other than UTC, you can try:
timedatectl set-timezone America/Sao_Paulo
replacing America/Sao_Paulo according to your settings.
I'm not sure if it is actually necessary, but you can run:
sudo systemctl restart cron.service
After that, cron works as I expected:
# cat test.txt
Sun 03 Apr 2022 09:02:01 AM UTC
Sun 03 Apr 2022 09:03:01 AM UTC
Sun 03 Apr 2022 09:04:01 AM UTC
Sun 03 Apr 2022 09:05:01 AM UTC
Sun 03 Apr 2022 09:06:01 AM UTC
Sun 03 Apr 2022 09:07:01 AM UTC
Sun 03 Apr 2022 09:08:01 AM UTC
Sun 03 Apr 2022 09:09:01 AM UTC
Sun 03 Apr 2022 09:10:01 AM UTC
Sun 03 Apr 2022 06:11:01 AM -03
Sun 03 Apr 2022 06:12:01 AM -03
Sun 03 Apr 2022 06:13:01 AM -03
Sun 03 Apr 2022 06:14:01 AM -03
Try
service cron start
or
systemctl start cron
In my case I was trying to run cron locally.
I checked status:
service cron status
It showed me:
* cron is not running
Then I simply started the service:
service cron start
Sometimes the command that cron needs to run is in a directory where cron has no access, typically on systems where users' home directories' permissions are 700 and the command is in that directory.
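A quick way to check, and a possible fix (the username is an example; o+x grants traverse-only access without read permission, and you should weigh the security trade-off first):
ls -ld /home/agentsmith       # drwx------ means other users, including the one running the cron job, cannot enter
chmod o+x /home/agentsmith    # allow traversal into the directory without granting read access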
Although an answer has already been accepted for this question, I would like to add what worked for me.
It's a good idea to quote the URL: if it contains a query string, it may not work without everything being quoted.
Don't forget to put a URL which contains "?, =, #, %" in quotes.
Example.
https://paystack.com/indexphp?docs/api/#transaction-charge-authorization&date=today
should be in a quote like so
"https://paystack.com/indexphp?docs/api/#transaction-charge-authorization&date=today"

Kill APscheduler add_job based on id

We have a Flask script get_logs.py that uses APScheduler and contains the following job:
scheduler.add_job(id="create_recommendation_entries", trigger='interval', seconds=60*10, func=create_entries)
Someone ran the script, and the logs now show that this job is still running at a 10-minute interval even after the script was terminated.
The process id is not listed, nor does it show up using grep, and we don't know whether it was executed using nohup or gunicorn.
How do I kill this job based on id="create_recommendation_entries", given that I don't know any of its stats (port, pid, etc.)?
Rerunning the script creates a different thread and stops after Ctrl+C, but the previous one is still running.

Run command from terminal window in AWS Instance at specified time or on start up

I have an AWS Cloud9 instance that starts running at 11:52 PM MST and stops running at 11:59 PM MST. I have a dockerfile within the instance that, when run with the correct mount, will run a set of C++ .cpp files that collect live web data. The ultimate goal of this instance is to be fully automatic, so that every night it collects the live web data for that date, hence why the instance is open at the very end of each day. Is it possible to have my AWS instance run a given command in a terminal window at a certain time, say 11:55 PM, or even upon startup? So that at that time, or at startup, the command "docker run -it...." is run within the instance.
Is automating this process possible? I have looked into CloudWatch events and think that might be the best way to go about automating this process but I am not quite sure how I would create a rule to fulfill the job. If it is not possible to automate a certain command within a terminal window, could I automate the dockerfile to run at a certain time?
Of course you can automate the running of commands, not just docker but any command, using the cron daemon. All you need to do is place your command in a shell script file, say doc.sh, in your desired directory. Then:
ssh into your instance
open a terminal and type crontab -e
enter the schedule and command in this manner: a b c d e /directory/command
where a = minute, b = hour, c = day of month, d = month, e = day of week,
and /directory/command specifies the location of the script you want to run.
For more reference, see these cron examples: https://www.cyberciti.biz/faq/how-do-i-add-jobs-to-cron-under-linux-or-unix-oses/. A sketch for this question's schedule follows below.
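For example, to run the script at 11:55 PM each night as described in the question (the path to doc.sh is hypothetical, and this assumes the instance clock is set to MST):
55 23 * * * /home/ubuntu/doc.sh >> /home/ubuntu/doc.log 2>&1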
If you have a dockerfile that you want to run for a few minutes a day, you should look into Fargate. You can schedule an event with Cloudwatch, run the container and then shut it down when it's done.
It will probably cost around $0.01/day to run this.

Missing log lines when writing to cloudwatch from ECS Docker containers

(Docker container on AWS-ECS exits before all the logs are printed to CloudWatch Logs)
Why are some streams of a CloudWatch Logs group incomplete (i.e., the Fargate Docker container exits successfully but the logs stop being updated abruptly)? I'm seeing this intermittently, in almost all log groups, however not on every log stream/task run. I'm running on version 1.3.0.
Description:
A Dockerfile runs node.js or Python scripts using the CMD command.
These are not servers/long-running processes, and my use case requires the containers to exit when the task completes.
Sample Dockerfile:
FROM node:6
WORKDIR /path/to/app/
COPY package*.json ./
RUN npm install
COPY . .
CMD [ "node", "run-this-script.js" ]
All the logs are printed correctly to my terminal's stdout/stderr when this command is run on the terminal locally with docker run.
To run these as ECS tasks on Fargate, the log driver is set to awslogs from a CloudFormation template.
...
LogConfiguration:
  LogDriver: 'awslogs'
  Options:
    awslogs-group: !Sub '/ecs/ecs-tasks-${TaskName}'
    awslogs-region: !Ref AWS::Region
    awslogs-stream-prefix: ecs
...
Seeing that sometimes the CloudWatch Logs output is incomplete, I have run tests and checked every limit from CW Logs Limits and am certain the problem is not there.
I initially thought this was an issue with Node.js exiting asynchronously before console.log() is flushed, or that the process was exiting too soon, but the same problem occurs when I use a different language as well, which makes me believe this is not an issue with the code, but rather with CloudWatch specifically.
Inducing delays in the code by adding a sleep timer has not worked for me.
It's possible that since the docker container exits immediately after the task is completed, the logs don't get enough time to be written over to CWLogs, but there must be a way to ensure that this doesn't happen?
sample logs:
incomplete stream:
{ "message": "configs to run", "data": {"dailyConfigs":"filename.json"]}}
running for filename
completed log stream:
{ "message": "configs to run", "data": {"dailyConfigs":"filename.json"]}}
running for filename
stdout: entered query_script
... <more log lines>
stderr:
real 0m23.394s
user 0m0.008s
sys 0m0.004s
(node:1) DeprecationWarning: PG.end is deprecated - please see the upgrade guide at https://node-postgres.com/guides/upgrading
UPDATE: This now appears to be fixed, so there is no need to implement the workaround described below
I've seen the same behaviour when using ECS Fargate containers to run Python scripts - and had the same resulting frustration!
I think it's due to CloudWatch Logs Agent publishing log events in batches:
How are log events batched?
A batch becomes full and is published when any of the following conditions are met:
The buffer_duration amount of time has passed since the first log event was added.
Less than batch_size of log events have been accumulated but adding the new log event exceeds the batch_size.
The number of log events has reached batch_count.
Log events from the batch don't span more than 24 hours, but adding the new log event exceeds the 24 hours constraint.
(Reference: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html)
So a possible explanation is that log events are buffered by the agent but not yet published when the ECS task is stopped. (And if so, that seems like an ECS issue - any AWS ECS engineers willing to give their perspective on this...?)
There doesn't seem to be a direct way to ensure the logs are published, but it does suggest one could wait at least buffer_duration seconds (by default, 5 seconds), and any prior logs should be published.
With a bit of testing that I'll describe below, here's a workaround I landed on. A shell script run_then_wait.sh wraps the command to trigger the Python script, to add a sleep after the script completes.
Dockerfile
FROM python:3.7-alpine
ADD run_then_wait.sh .
ADD main.py .
# The original command
# ENTRYPOINT ["python", "main.py"]
# To run the original command and then wait
ENTRYPOINT ["sh", "run_then_wait.sh", "python", "main.py"]
run_then_wait.sh
#!/bin/sh
set -e

# Wait 10 seconds on exit: twice the `buffer_duration` default of 5 seconds
trap 'echo "Waiting for logs to flush to CloudWatch Logs..."; sleep 10' EXIT

# Run the given command, passing all arguments through
"$@"
main.py
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger()

if __name__ == "__main__":
    # After testing some random values, had most luck to induce the
    # issue by sleeping 9 seconds here; would occur ~30% of the time
    time.sleep(9)
    logger.info("Hello world")
Hopefully the approach can be adapted to your situation. You could also implement the sleep inside your script, but it can be trickier to ensure it happens regardless of how it terminates.
It's hard to prove that the proposed explanation is accurate, so I used the above code to test whether the workaround was effective. The test was the original command vs. with run_then_wait.sh, 30 runs each. The results were that the issue was observed 30% of the time, vs 0% of the time, respectively. Hope this is similarly effective for you!
Just contacted AWS support about this issue and here is their response:
...
Based on that case, I can see that this occurs for containers in a
Fargate Task that exit quickly after outputting to stdout/stderr. It
seems to be related to how the awslogs driver works, and how Docker in
Fargate communicates to the CW endpoint.
Looking at our internal tickets for the same, I can see that our
service team are still working to get a permanent resolution for this
reported bug. Unfortunately, there is no ETA shared for when the fix
will be deployed. However, I've taken this opportunity to add this
case to the internal ticket to inform the team of the similar and try
to expedite the process
In the meantime, this can be avoided by extending the lifetime of the
exiting container by adding a delay (~>10 seconds) between the logging
output of the application and the exit of the process (exit of the
container).
...
Update:
Contacted AWS around August 1st, 2019, they say this issue has been fixed.
I observed this as well. It must be an ECS bug?
My workaround (Python 3.7):
import atexit
import logging
from time import sleep

logger = logging.getLogger()

def finalizer():
    logger.info("All tasks have finished. Exiting.")
    # Workaround:
    # Fargate will exit and the final batch of CloudWatch logs will be lost,
    # so linger for 10 seconds to let them flush
    sleep(10)

# Register after the function is defined, so the name can be resolved
atexit.register(finalizer)
I had the same problem with flushing logs to CloudWatch.
Following asavoy's answer, I switched from the exec form to the shell form of the ENTRYPOINT and added a 10-second sleep at the end.
Before:
ENTRYPOINT ["java","-jar","/app.jar"]
After:
ENTRYPOINT java -jar /app.jar; sleep 10

Getting Data From A Specific Website Using Google Cloud

I have a machine learning project for which I have to get data from a website every 15 minutes, and I cannot use my own computer, so I will use Google Cloud. I am trying to use Google Compute Engine, and I have a script for getting the data (here is the link: https://github.com/BurkayKirnik/Automatic-Crypto-Currency-Data-Getter/blob/master/code.py). This script gets data every 15 minutes and writes it to CSV files. I can run this code by opening an SSH terminal and executing it from there, but it stops working when I close the terminal. I tried running it from a startup script, but it doesn't work that way either. How can I run this and save the CSV files? BTW, I have to install an API to run the code, and I am doing that in the startup script; there is no problem with this part.
Instances running in Google Cloud Platform can be configured with the same tools available in the operating system that they are running. If your instance is a Linux instance, the best method would be to use a cronjob to execute your script repeatedly at your chosen interval.
Once you have accessed the instance via SSH, you can open the crontab configuration file by running the following command:
$ crontab -e
The above command will provide access to your personal crontab configuration (for the user you are logged in as). If you want to run the script as root you can use this instead:
$ sudo crontab -e
You can now edit the crontab configuration and add an entry that tells cron to execute your script at your required interval (in your case every 15 minutes).
Therefore, your crontab entry should look something like this:
*/15 * * * * /path/to/your/script.sh
Notice the first entry is for minutes, so by using the */15, you are telling the cron daemon to execute the script once every 15 minutes.
Once you have edited the crontab configuration file, it is a good idea to restart the cron daemon to ensure the change you made will take place. To do this you can run:
$ sudo service cron restart
If you would like to check the status to ensure the cron service is running you can run:
$ sudo service cron status
Your script will now execute every 15 minutes.
In terms of storing the CSV files, you could either program your script to store them on the instance, or alternatively use a Google Cloud Storage bucket. Files can be copied to buckets easily by making use of the gsutil command (part of the Cloud SDK) as described here. It's also possible to mount buckets as a file system as described here.
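For example, copying the generated CSVs to a bucket could look like this (the local path and bucket name here are hypothetical):
gsutil cp /home/user/data/*.csv gs://my-ml-data/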