Scheduling Autorep command to run daily

I am a newbie to AutoSys. I am trying to get job execution information into a CSV file on a daily basis. For this I am trying to write an AutoSys job that I can schedule to run daily. Below is a snippet of the code:
insert_job: job_run_time job_type: CMD
box_name: box_job_run_time
command: autorep -J box_job1 -r -1
But this is giving the below error:
'autorep' is not recognized as an internal or external command,
operable program or batch file.
Please help with the solution.

Shanky, before you can run the autorep command directly in a terminal, the AutoSys client must first be configured so it can read the AutoSys database that stores the run details.
Typically this involves setting a few environment variables and the instance name.
Please check with the scheduling team on how to configure the AutoSys client on the server.
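As a rough sketch only (the profile path, instance name, and host below are hypothetical placeholders; your scheduling team will have the real values), the client setup usually amounts to sourcing an AutoSys environment file before calling autorep:

# Hypothetical profile path and instance name (ACE); substitute your real ones.
source /opt/CA/WorkloadAutomationAE/autouser.ACE/autosys.bash.myhost
echo $AUTOSERV                            # should print the instance name, e.g. ACE
autorep -J box_job1 -r -1 >> Output.csv   # autorep is now on PATH and can reach the DB

JIL jobs also have a profile: attribute that sources such an environment file before the command runs, which is one way to make autorep available inside the scheduled job itself.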

Thanks Piyush and Mansi for your reply!
Mansi, where can this command autorep -J box_job1 -r -1 >> Output.csv be configured so that it can be scheduled to run daily?

Related

Run command from terminal window in AWS Instance at specified time or on start up

I have an AWS Cloud9 instance that starts running at 11:52 PM MST and stops running at 11:59 PM MST. I have a Dockerfile within the instance that, when run with the correct mount, runs a set of C++ .cpp files that collect live web data. The ultimate goal of this instance is to be fully automatic, so that every night it collects the live web data for that date, hence why the instance is open at the very end of the day each night. Is it possible to have my AWS instance run a given command in a terminal window at a certain time, say 11:55 PM, or even upon startup? So at that time, or at startup, the command "docker run -it...." is run within the instance.
Is automating this process possible? I have looked into CloudWatch Events and think that might be the best way to go about automating this process, but I am not quite sure how I would create a rule to fulfill the job. If it is not possible to automate a certain command within a terminal window, could I automate the Dockerfile to run at a certain time?
Of course you can automate running commands, not just Docker but in fact any command, using the cron daemon. All you need to do is place your command in a shell script file, say doc.sh, in your desired directory. Then:
ssh into your instance
open a terminal and type crontab -e
enter the schedule and command in this manner: a b c d e /directory/command
where a = minute, b = hour, c = day of month, d = month, e = day of week,
and /directory/command specifies the location of the script you want to run.
For more cron examples, see https://www.cyberciti.biz/faq/how-do-i-add-jobs-to-cron-under-linux-or-unix-oses/. A concrete sketch follows.
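As a sketch (the script path and image name are placeholders; note that cron provides no TTY, so drop the -i/-t flags from docker run):

#!/bin/bash
# doc.sh - run the data-collection container non-interactively
docker run --rm your-image-name

# crontab entry: run doc.sh every day at 11:55 PM, logging output
55 23 * * * /home/ubuntu/doc.sh >> /home/ubuntu/doc.log 2>&1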
If you have a Dockerfile that you want to run for a few minutes a day, you should look into Fargate. You can schedule an event with CloudWatch, run the container, and then shut it down when it's done.
It will probably cost around $0.01/day to run this.
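As a sketch of the scheduling half (the rule name and schedule are examples; you still need to attach your Fargate task as the rule's target):

# CloudWatch Events/EventBridge rule that fires daily at 11:55 PM UTC
aws events put-rule --name nightly-collector --schedule-expression "cron(55 23 * * ? *)"
# then register the Fargate task with `aws events put-targets`
# (cluster and task-definition ARNs plus a network configuration go there)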

Create Folder in Informatica using Pmrep

I am trying to create a folder in Informatica using the PMREP command.
Syntax below.
Can I call this command (pmrep) directly from a command task, or do I have to put this command in a shell script and call that shell script in the command task?
Also, I do not see any option to pass user credentials in this command; will it automatically take these details from the parent workflow where I add this in the command task?
pmrep createFolder -n folder_name [-d folder_description] [-o owner_name] [-a owner_security_domain] [-s shared_folder] [-p permissions] [-f active | frozendeploy | frozennodeploy]
The pmrep command above creates a folder in the Informatica repository, which is a one-time operation. There is no point in running it through a command task, because when you execute the workflow multiple times, the command task will try to create the folder again each time, which leads to workflow failures. If you are creating a folder in the Informatica repository, create it from a terminal, for example as sketched below.
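A minimal sketch from a terminal (repository, domain, user, and folder names are placeholders). This also addresses the credentials question: pmrep takes credentials in its connect call rather than inheriting them from a workflow.

pmrep connect -r MyRepo -d MyDomain -n repo_user -x repo_password
pmrep createFolder -n MY_NEW_FOLDER -d "created via pmrep"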
If you are creating a folder on the server, use the command below in a command task:
mkdir folder_name
Please mention whether it's an Informatica repository folder or a directory on your machine, so we can help you according to your requirement.

A clear step-by-step process for running a periodic task in a Django application

I have been trying for a long time to create a periodic task in Django, but there are a lot of version constraints and no clear explanation.
I recommend Celery. (See "What is Celery?" in its documentation.)
Celery supports scheduling periodic tasks. Check this doc.
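As a rough sketch (assuming a Django project whose Celery app is named proj and a broker is already configured; both are assumptions here), a periodic setup needs two processes running:

# a worker process to execute the tasks
celery -A proj worker --loglevel=info
# the beat scheduler, which dispatches periodic tasks on their schedules
celery -A proj beat --loglevel=info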
First of all, you want to create a management command following this guide.
https://docs.djangoproject.com/en/2.1/howto/custom-management-commands/
Say we want to run the closepoll command in the example every 5 minutes.
You'll then need to create a script to run this command.
Linux / MacOS:
#!/bin/bash -e
cd /path/to/your/django/project # use an absolute path; cron won't start in your project directory
source venv/bin/activate # if you use venv
python manage.py closepoll # maybe you want to >> /path/to/log so you can log the results
Store the file as run_closepoll.sh and run chmod +x run_closepoll.sh in the command line.
Now we can use crontab to run our command
run crontab -e in your command line
add this line:
*/5 * * * * /path/to/run_closepoll.sh
Now the command will run every 5 minutes.
If you're not familiar with crontab, you can use this website
https://crontab-generator.org/
Windows:
Same content as the above example, but remove the first line and save as run_closepoll.bat
In your start menu, search for Task Scheduler, follow the instructions on the GUI, it should be pretty simple from there.
For more info about the Task Scheduler, see here: https://learn.microsoft.com/en-us/windows/desktop/taskschd/using-the-task-scheduler
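If you prefer the command line to the GUI, the same schedule can be created with schtasks (task name and path are examples):

schtasks /Create /SC MINUTE /MO 5 /TN "run_closepoll" /TR "C:\path\to\run_closepoll.bat"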
This blog explains it clearly:
https://medium.com/@yehandjoe/celery-4-periodic-task-in-django-9f6b5a8c21c7
Thanks!!!
I'm using django-cron and it works as expected. The only caveat is that you have to set a cron job in the Linux system to run the command python manage.py runcrons.
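A sketch of that cron entry (the interpreter and paths are placeholders):

# run django-cron's pending jobs every 5 minutes, logging output
*/5 * * * * /path/to/venv/bin/python /path/to/project/manage.py runcrons >> /var/log/django_cron.log 2>&1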

AWS EMR bootstrap action as sudo

I need to update /etc/hosts for all instances in my EMR cluster (EMR AMI 4.3).
The whole script is nothing more than:
#!/bin/bash
echo -e 'ip1 uri1' >> /etc/hosts
echo -e 'ip2 uri2' >> /etc/hosts
...
This script needs to run as sudo or it fails.
From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-bootstrap.html#bootstrapUses
Bootstrap actions execute as the Hadoop user by default. You can execute a bootstrap action with root privileges by using sudo.
Great news... but I can't figure out how to do this, and I can't find an example.
I've tried a bunch of things... including...
running as Hadoop and adding 'sudo' to each of the 'echo' statements in the script
using a shell script to copy and chmod the above ('echo' statements with no 'sudo') and running local copy using run-if bootstrap that calls 1=1 sudo bash /home/hadoop/myDir/myScript.sh
hard coding the whole script as a one-liner into a run-if bootstrap action
I consistently get:
On the master instance (i-xxx), bootstrap action 2 returned a non-zero return code
If I check the logs for the "Setup hadoop debugging" step, there's nothing there.
From here: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-overview.html#emr-overview-cluster-lifecycle
summary emr setup (in order):
provisions ec2 instances
runs bootstrap actions
installs native applications... like hadoop, spark, etc.
So it seems like there's some risk that since I'm mucking around as user Hadoop before hadoop is installed, I could be messing something up there, but I can't imagine what.
I think it must be that my script isn't running as 'sudo' and it's failing to update /etc/hosts.
My question... how can I use bootstrap actions (or something else) on EMR to run a simple shell script as sudo? ...specifically to update /etc/hosts?
I've not had problems using sudo from within a shell script run as an EMR bootstrap action, so it should work. You can test that it works with a script that simply does "sudo ls /root".
Your script is trying to append to /etc/hosts by redirecting stdout with:
sudo echo -e 'ip1 uri1' >> /etc/hosts
The problem here is that while the echo is run with sudo, the redirection (>>) is not. It's run by the underlying hadoop user, who does not have permission to write to /etc/hosts. The fix is:
sudo sh -c 'echo -e "ip1 uri1" >> /etc/hosts'
This runs the entire command, including the stdout redirection, in a shell with sudo.
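An equivalent alternative that avoids nesting a quoted shell is to pipe through sudo tee:

# tee runs under sudo, so it can append (-a) to /etc/hosts
echo -e 'ip1 uri1' | sudo tee -a /etc/hosts > /dev/null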

Getting Data From A Specific Website Using Google Cloud

I have a machine learning project and I have to get data from a website every 15 minutes. And I cannot use my own computer, so I will use Google Cloud. I am trying to use Google Compute Engine and I have a script for getting data (here is the link: https://github.com/BurkayKirnik/Automatic-Crypto-Currency-Data-Getter/blob/master/code.py). This script gets data every 15 mins and writes it down to CSV files.

I can run this code by opening an SSH terminal and executing it from there, but it stops working when I close the terminal. I tried to run it by executing it in a startup script, but it doesn't work that way either. How can I run this and save the CSV files? BTW, I have to install an API to run the code, and I am doing that in the startup script. There is no problem in this part.
Instances running in Google Cloud Platform can be configured with the same tools available in the operating system that they are running. If your instance is a Linux instance, the best method would be to use a cronjob to execute your script repeatedly at your chosen interval.
Once you have accessed the instance via SSH, you can open the crontab configuration file by running the following command:
$ crontab -e
The above command will provide access to your personal crontab configuration (for the user you are logged in as). If you want to run the script as root you can use this instead:
$ sudo crontab -e
You can now edit the crontab configuration and add an entry that tells cron to execute your script at your required interval (in your case every 15 minutes).
Therefore, your crontab entry should look something like this:
*/15 * * * * /path/to/your/script.sh
Notice the first entry is for minutes, so by using the */15, you are telling the cron daemon to execute the script once every 15 minutes.
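Since the linked script is Python, the entry can also invoke it directly and capture its output for debugging (the interpreter and paths are examples):

*/15 * * * * /usr/bin/python3 /home/username/code.py >> /home/username/collector.log 2>&1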
Once you have edited the crontab configuration file, it is a good idea to restart the cron daemon to ensure the change you made will take place. To do this you can run:
$ sudo service cron restart
If you would like to check the status to ensure the cron service is running you can run:
$ sudo service cron status
Your script will now execute every 15 minutes.
In terms of storing the CSV files, you could either program your script to store them on the instance, or an alternative would be to use a Google Cloud Storage bucket. Files can be copied to buckets easily by making use of the gsutil command (part of the Cloud SDK) as described here. It's also possible to mount buckets as a file system as described here.
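For example, a one-line sketch of the copy (the local path and bucket name are placeholders):

# copy the generated CSV files to a Cloud Storage bucket
gsutil cp /home/username/*.csv gs://my-ml-data-bucket/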