YARN Resource Manager admin commands

I am new to YARN Resource Manager admin commands. I know how to check the status of applications with the -appStates keyword. For example,
yarn application -list -appStates FINISHED
will list all finished jobs. We can do more with states such as RUNNING, NEW, ALL, NEW_SAVING, SUBMITTED, ACCEPTED, FINISHED, FAILED, and KILLED. But how do I fetch the most recently submitted applications? Is there a keyword like RECENT?
Something like this:
yarn application -list -appStates RECENT
Thanks in advance.

If you want more fine-grained control over this, you should use the YARN Resource Manager REST APIs. The Cluster Applications API lets you specify startedTimeBegin and startedTimeEnd, which you can use to define how much time you mean by "recent". Just remember that time is specified in milliseconds, not seconds.
This simple script shows how you'd list jobs started in the last 5 minutes:
# current epoch time in seconds, minus 5 minutes
starttime=$(( $(date +%s) - 300 ))
# append "000" to convert seconds to the milliseconds the API expects
curl "RM_URL/ws/v1/cluster/apps?startedTimeBegin=${starttime}000"
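If you prefer to query from Python instead, here is a minimal sketch of the same idea using the requests library (the Resource Manager address rm_host:8088 is a placeholder, and the 5-minute window is just an example):

import time
import requests

# The API expects timestamps in milliseconds.
started_after_ms = (int(time.time()) - 300) * 1000

# Placeholder Resource Manager address; replace with your RM web address.
resp = requests.get(
    "http://rm_host:8088/ws/v1/cluster/apps",
    params={"startedTimeBegin": started_after_ms},
)
resp.raise_for_status()

# The "apps" field is null when nothing matches, so guard against that.
apps = (resp.json().get("apps") or {}).get("app", [])
for app in apps:
    print(app["id"], app["state"], app["name"])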


Dataflow pipeline got stuck

Workflow failed. Causes: The Dataflow job appears to be stuck because no worker activity has been seen in the last 1h. Please check the worker logs in Stackdriver Logging. You can also get help with Cloud Dataflow at https://cloud.google.com/dataflow/support.
I am using a service account with all the required IAM roles.
Generally, "The Dataflow job appears to be stuck because no worker activity has been seen in the last 1h" can be caused by worker setup taking too long. To solve this, you can try increasing worker resources (via the --machine_type parameter).
For example, installing several dependencies that require building wheels (pystan, fbprophet) can take more than an hour on the minimal machine (n1-standard-1, with 1 vCPU and 3.75 GB RAM). Using a more powerful instance (n1-standard-4, which has four times the resources) solves the problem.
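For a Beam Python pipeline, the machine type is just a pipeline option. A minimal sketch, assuming the Python SDK (the project, region and bucket names below are placeholders):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Larger workers so slow dependency builds (e.g. pystan, fbprophet wheels)
# don't trip the "no worker activity" timeout. Names are placeholders.
options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/temp",
    machine_type="n1-standard-4",
)

with beam.Pipeline(options=options) as p:
    _ = p | beam.Create(["hello", "dataflow"]) | beam.Map(print)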
You can debug this by looking at the worker startup logs in Cloud Logging. You are likely to see pip issues with installing dependencies.
Do you have any error logs showing that Dataflow workers are crashing when trying to start?
If not, the worker VMs may be starting but unable to reach the Dataflow service, which is often related to network connectivity.
Please note that by default Dataflow creates jobs using the network and subnetwork named default (check that they exist in your project); you can switch to a specific one by specifying --subnetwork. Check https://cloud.google.com/dataflow/docs/guides/specifying-networks for more information.
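If your project has no default network, the subnetwork can be passed the same way as the machine type above; a short, hedged sketch (the subnetwork path is a placeholder):

from apache_beam.options.pipeline_options import PipelineOptions

# Point the Dataflow workers at a specific subnetwork instead of "default".
options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",
    region="us-central1",
    temp_location="gs://my-bucket/temp",
    subnetwork="regions/us-central1/subnetworks/my-subnet",
)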

How does Airflow load/update the DagBag from the DAGs home folder on Google Cloud Platform?

Please do not downvote; if needed, I will update and correct my wording. I have done my homework and research. I am a little new, so I am trying to understand this.
I would like to understand how Airflow on Google Cloud Platform picks up changes from the DAGs home folder and shows them in the UI. Please also help me with my DAG setup script. I have read many answers along with books; the book link is here.
I tried figuring out my answer from page 69, which says:
3.11 Scheduling & Triggers: The Airflow scheduler monitors all tasks and all DAGs, and triggers the task instances whose dependencies have been met. Behind the scenes, it monitors and stays in sync with a folder for all DAG objects it may contain, and periodically (every minute or so) inspects active tasks to see whether they can be triggered.
My understanding from this book is that the scheduler regularly picks up changes from the DAGs home folder. (Is that correct?)
I also read multiple answers on Stack Overflow; I found this one useful: Link
But that answer still does not describe the process that creates/updates the DagBag from script.py in the DAG home folder, or how changes are sensed.
Please help me with my dags setup script.
We have created a generic Python script that dynamically creates DAGs by reading/iterating over config files.
Below is the directory structure:
/dags/workflow/
/dags/workflow/config/dag_a.json
/dags/workflow/config/dag_b.json
/dags/workflow/task_a_with_single_operator.py
/dags/workflow/task_b_with_single_operator.py
/dags/dag_creater.py
The execution flow of dag_creater.py is as follows:
1. Iterate over the dags/workflow/config folder, get each config JSON file, and read the variable dag_id.
2. Create Parent_dag = DAG(dag_id=dag_id, start_date=start_date, schedule_interval=schedule_interval, default_args=default_args, catchup=False).
3. Read the tasks and their dependencies for that dag_id from the config JSON file (example: [[a,[]],[b,[a]],[c,[b]]]) and code them as task_a >> task_b >> task_c.
This way the DAGs are created. All works fine. The DAGs are also visible in the UI and run fine.
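For illustration only, a minimal, hypothetical sketch of the pattern described above (the config format, file names and operators are assumptions, since the real code cannot be shared):

# dag_creater.py -- hypothetical sketch of dynamic DAG creation from JSON configs.
# Assumed config format: {"dag_id": "...", "tasks": [["a", []], ["b", ["a"]], ["c", ["b"]]]}
import glob
import json
import os
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

CONFIG_DIR = os.path.join(os.path.dirname(__file__), "workflow", "config")

for config_path in glob.glob(os.path.join(CONFIG_DIR, "*.json")):
    with open(config_path) as f:
        config = json.load(f)

    dag = DAG(
        dag_id=config["dag_id"],
        start_date=datetime(2020, 1, 1),
        schedule_interval=config.get("schedule_interval", "@daily"),
        catchup=False,
    )

    # Create one task per entry; DummyOperator stands in for the real operators.
    tasks = {task_id: DummyOperator(task_id=task_id, dag=dag)
             for task_id, _ in config["tasks"]}

    # Wire up dependencies, e.g. ["b", ["a"]] becomes task_a >> task_b.
    for task_id, upstream_ids in config["tasks"]:
        for upstream_id in upstream_ids:
            tasks[upstream_id] >> tasks[task_id]

    # The DAG must be reachable at module level for the scheduler to register it.
    globals()[config["dag_id"]] = dag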
But the problem is that my DAG creation script runs every time. Even in each task's logs I see logs for all the DAGs. I expect this script to run once, just to fill the entries in the metadata database. I am unable to understand why it runs every time.
Please help me understand the process.
I know airflow initdb is run once, when we first set up the metadata database, so that is not what is doing this update all the time.
Is it the scheduler heartbeat that updates everything?
Is my setup correct?
Please note: I can't share real code, as that is a restriction from my organization. However, if asked, I will provide more information.
The Airflow scheduler runs continuously in the Airflow runtime environment and is the main component that monitors the DAG folder for changes and triggers the relevant DAG tasks residing in that folder. The main settings for the scheduler service can be found in the airflow.cfg file, essentially the heartbeat intervals, which effectively impact general DAG task maintenance.
However, how a particular task is executed is determined by the executor model in the Airflow configuration.
To make DAGs available to the Airflow runtime environment, GCP Composer uses Cloud Storage with a specific folder structure: any object with a *.py extension that arrives in the /dags folder is synchronized and checked for a DAG definition.
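You can see what the scheduler ends up loading on each parse cycle with the DagBag model; a small sketch (the folder path is Composer's default mount and is an assumption):

from airflow.models import DagBag

# Loading a DagBag executes every .py file in the folder, which is exactly
# what the scheduler does on each parse cycle -- and why a top-level
# generator script runs again and again.
dagbag = DagBag(dag_folder="/home/airflow/gcs/dags")
print(list(dagbag.dags))       # DAG ids that were discovered
print(dagbag.import_errors)    # files that failed to parse, with tracebacks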
If you expect to run the DAG-generating script within the Airflow runtime, then for this particular use case I would advise you to look at the PythonOperator, using it in a separate DAG to invoke and execute your custom generic Python code with a schedule that guarantees it runs only once. You can check out this Stack Overflow thread for implementation details.
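A minimal sketch of that idea, assuming your generator logic can be wrapped in a callable (the DAG id, dates and function name here are hypothetical):

from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def create_dag_entries():
    # Call your generic, config-driven setup code here (name is an assumption).
    pass

with DAG(
    dag_id="run_dag_creater_once",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@once",  # scheduled exactly one time
    catchup=False,
) as dag:
    PythonOperator(task_id="create_dags", python_callable=create_dag_entries)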

Debugging broken dags in GCP Composer

I have read the corresponding question for vanilla Airflow.
How can broken DAGs be debugged effectively in Google Cloud Composer?
How can I see the full logs of a broken DAG?
Right now I can only see one line of the trace on the Airflow UI main page.
EDIT:
The answers seem to be misunderstanding my question.
I am looking to fix broken DAGs, i.e. the DAG does not even appear in the DAGs list, so of course there are no tasks running and no task logs to view.
As hexacynide pointed out, you can look at the task logs; there are details about doing that specifically in the Composer docs, found here. You can also use Stackdriver Logging, which is enabled by default in Composer projects. In Stackdriver, you can filter your logs on many variables, including by time, by pod (airflow-worker, airflow-webserver, airflow-scheduler, etc.), and by whatever keywords you suspect might appear in the logs.
EDIT: Adding screenshots and more clarity in response to question update
In Airflow, when there's a broken DAG, there is usually some form of error message at the top. (Yes, I know this particular error message is helpful enough that I don't need to debug further, but I'm going to anyway, just to show how.)
In the message, I can see that my DAG bq_copy_across_locations is broken.
To debug, I go to Stackdriver, and search for the name of my DAG. I limit the results to the logs from this Composer environment. You can also limit the time frame if needed.
I looked through the error logs and found the Traceback error for the broken DAG.
Alternatively, if you know you only want to search for the stack traceback, you can run an advanced filter looking for your DAG name and the word "Traceback". To do so, click the arrow at the right side of the Stackdriver Logging search bar and hit "Convert to advanced filter".
Then enter your advanced filter
resource.type="cloud_composer_environment"
resource.labels.location="YOUR-COMPOSER-REGION"
resource.labels.environment_name="YOUR-ENV-NAME"
("BROKEN-DAG-NAME" AND
"Traceback")
This is what my advanced search looked like
The only logs returned will be the stack traceback logs for that DAG.
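If you prefer to pull these logs programmatically rather than through the console, a hedged sketch with the Cloud Logging Python client (assuming the google-cloud-logging package is installed and credentials are configured) might look like this:

from google.cloud import logging

client = logging.Client()

# Same conditions as the advanced filter above; placeholders must be replaced.
log_filter = (
    'resource.type="cloud_composer_environment" '
    'resource.labels.location="YOUR-COMPOSER-REGION" '
    'resource.labels.environment_name="YOUR-ENV-NAME" '
    '"BROKEN-DAG-NAME" "Traceback"'
)

for entry in client.list_entries(filter_=log_filter, order_by=logging.DESCENDING):
    print(entry.timestamp, entry.payload)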
To determine run-time issues that occur when a DAG is triggered, you can always look at task logs as you would for any typical Airflow installation. These can be found using the web UI, or by looking in the logs folder of the Cloud Storage bucket associated with your Cloud Composer environment.
To identify issues at parse time, you can execute Airflow commands using gcloud composer. For example, to run airflow list_dags, the gcloud CLI equivalent would be:
$ gcloud composer environments run $ENV_NAME --location=$REGION list_dags -- --report
Note that the standalone -- is intentional. It is there so that the command argument parser can differentiate between arguments to gcloud and arguments to be passed to the Airflow subcommand (in this case, the --report flag for list_dags).

Cloud Foundry triggers if application was created

Is there a possibility that Cloud Foundry triggers a function if a new application is pushed to the platform?
I would like to trigger some internal functions, like registration on the API gateway. I know that I can pull the information from the events API, https://apidocs.cloudfoundry.org/224/events/list_all_events.html, but is it also possible via push?
The closest thing I can think of to what you're asking is the profile script.
https://docs.cloudfoundry.org/devguide/deploy-apps/deploy-app.html#profile
The note about the Java buildpack not supporting .profile scripts is incorrect. It's a platform feature, so all buildpacks support them. The difference with Java apps is that you're probably pushing a JAR or WAR file, so it's harder to make sure the file is placed in the correct location. The location of the file is everything.
When your application starts, the platform will first run the .profile script, if it exists, that is packaged with your application. It's a standard shell script and you can do whatever you like in this file.
The only caveat is that your application will not start until this script completes successfully (i.e. exit 0). Thus you have a limited amount of time for that script to run and your application to start. How much time, you ask? That is configured by cf push -t and is in seconds. You can also set it in your manifest.yml with the timeout attribute.
Time (in seconds) allowed to elapse between starting up an app and the first healthy response from the app
This is also something that each application needs to include. I suppose you could also use a custom buildpack to add that file, if you wanted to have it added across multiple applications. There's no easy way to add it for all apps though.
Hope that helps!

Job Scheduling in SAS Data Integration Studio

I want to schedule a job in SAS DIS. I tried the process using SAS Management Console, but an error pops up saying the scheduling server was not found.
Can anyone help me set up a scheduling server? Or is it software that has to be installed?
Thanks
I think a scheduling server is an extra package that has to be purchased. Our BI setup lacks that option, and no matter what, we can't seem to get it approved. Check with your SAS server admin to see whether job scheduling has been enabled; if so, they should be able to tell you the process for getting your job scheduled.
Alternatively, without a scheduling server you can still deploy your jobs and use either:
1. cron and crontab (on Unix or Linux), or
2. the Windows OS scheduler
to schedule jobs manually; this is the best option available if there is no scheduling server. I know this can be very tedious and cumbersome, but you can give it a try if you have a small number of jobs to schedule.