Control-M: Order date (prerequisite), two jobs with different load dates - scheduling

I have two jobs with different load dates (not the same day):
JobA: runs at the end of the month (may be the 28th, 29th, or 30th).
JobB: runs at the beginning of the month (may be the 1st or 2nd).
I want to make JobA a prerequisite of JobB.
How can I proceed in Control-M, since the two jobs do not share the same load date and do not have a specific date to run?

This situation can be easily handled with the use of STAT conditions.
Normally conditions are order-date dependent, but if you change the condition's date from ODAT to STAT, the order date is not taken into consideration.
Ensure that you remove the STAT condition on JobB after completion (on failure or success) to avoid next month's JobB kicking off prematurely.

Re-elaborating on Jasper's answer, my suggestion is: on JobA, in the "On Do Actions", add an action that adds the condition, paying attention to the "Date" parameter, which must be "No Date".
On JobB's prerequisites, add that same condition as an In condition, again with "No Date".
Finally, in JobB's "On Do Actions", add an action that deletes the condition, so it does not remain behind for next month's run.


WARNING: Failed to add policy job since the add condition is not satisfied

I'm trying to schedule automatic recommendation and population by following this doc.
I'm trying to run this query
SELECT google_columnar_engine_add_policy( 'RECOMMEND_AND_POPULATE_COLUMNS', 'EVERY', 10, 'HOURS');
But this query fails. I've tried many other combinations of policy_interval, duration, time_unit, and it fails with the same error every time.
Only one case works, that is when policy_interval is "IMMEDIATE" but this is not what I'm after.
The basic steps to follow for the configuration and usage are as below:
1. Enable the columnar engine.
2. Let the engine's recommendation feature observe your workload and gather query statistics.
3. Size the engine's column store based on the recommendation feature's analysis.
4. Enable automatic population of the column store by the recommendation feature.
5. Let the recommendation feature observe your workload and automatically add columns to the column store.
The query that you are trying to run is for scheduling automatic recommendation and population:
google_columnar_engine_add_policy(
  'RECOMMEND_AND_POPULATE_COLUMNS',
  policy_interval, duration, time_unit
);
policy_interval: The time interval determining when the policy runs. You can specify these values:
'IMMEDIATE': The RECOMMEND_AND_POPULATE_COLUMNS operation runs immediately one time. When you use this value, specify 0 and 'HOURS' for the duration and time_unit parameters.
'AFTER': The RECOMMEND_AND_POPULATE_COLUMNS operation runs once when the duration time_unit amount of time passes.
'EVERY': The RECOMMEND_AND_POPULATE_COLUMNS operation runs repeatedly every duration time_unit amount of time.
duration: The number of time_units. For example, 24.
time_unit: The unit of time for duration. You can specify 'DAYS' or 'HOURS'.
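For reference, a minimal sketch of what each form of the call looks like when issued from Python (the psycopg2 driver and the connection parameters below are placeholders I am assuming, not part of the original question):

import psycopg2

# Placeholder connection details for the AlloyDB instance.
conn = psycopg2.connect(host="your-alloydb-ip", dbname="postgres",
                        user="postgres", password="your-password")
conn.autocommit = True
with conn.cursor() as cur:
    # 'IMMEDIATE': runs once, right away; duration must be 0 and time_unit 'HOURS'.
    cur.execute("SELECT google_columnar_engine_add_policy("
                "'RECOMMEND_AND_POPULATE_COLUMNS', 'IMMEDIATE', 0, 'HOURS')")
    # 'AFTER': runs once after the given delay, e.g.
    #   SELECT google_columnar_engine_add_policy('RECOMMEND_AND_POPULATE_COLUMNS', 'AFTER', 24, 'HOURS');
    # 'EVERY': runs repeatedly at the given interval (the form from the question), e.g.
    #   SELECT google_columnar_engine_add_policy('RECOMMEND_AND_POPULATE_COLUMNS', 'EVERY', 10, 'HOURS');
conn.close()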
Please check that this was followed from setup to configuration and try again. Also, as you mentioned, the specific error details are not available, which makes it hard to pinpoint the exact breakpoint here. I would recommend checking the links below for reference.
https://cloud.google.com/alloydb/docs
https://cloud.google.com/alloydb/docs/faq
Hope that helps.

How to reduce the time taken by SAS macro code to run

I need help with my monthly report SAS code below:
Firstly, the code takes too long to run even though the data is relatively small. When it completes, a message appears that reads: "The contents of log is too large."
Please can you check what the issue is with my code?
Meaning of the macro variables:
&end_date. = the last day of the previous month, for instance 30-Apr-22.
&lastest_refrsh_dt. = the latest date the report was published.
Once the report is published, we update the config table with &end_date.
work.schedule_dt: a table that contains the update flags. If all flags are true, we proceed, but if any update flag is false, we exit. On the sixth day of the month, if the flag is still false, an email that reads "data not available" is sent.
Normally, that message about the log is due to warnings in the log over type issues. From what you describe, it is typically due to an issue with date interpretation.
There is nothing in this post to aid in helping beyond that. You need to open the log and find out what the message is. Otherwise, it is speculation on our part.

AWS IoT Analytics Delta Window

I am having real problems getting the AWS IoT Analytics Delta Window (docs) to work.
I am trying to set it up so that every day a query is run to get the last 1 hour of data only. According to the docs the schedule feature can be used to run the query using a cron expression (in my case every hour) and the delta window should restrict my query to only include records that are in the specified time window (in my case the last hour).
The SQL query I am running is simply SELECT * FROM dev_iot_analytics_datastore and if I don't include any delta window I get the records as expected. Unfortunately when I include a delta expression I get nothing (ever). I left the data accumulating for about 10 days now so there are a couple of million records in the database. Given that I was unsure what the optimal format would be I have included the following temporal fields in the entries:
datetime : 2019-05-15T01:29:26.509
(A string formatted using ISO Local Date Time)
timestamp_sec : 1557883766
(A unix epoch expressed in seconds)
timestamp_milli : 1557883766509
(A unix epoch expressed in milliseconds)
There is also a value automatically added by AWS called __dt which uses the same format as my datetime except it seems to be accurate only to within 1 day, i.e. all values entered within a given day have the same value (e.g. 2019-05-15 00:00:00.00).
I have tried a range of expressions (including the suggested AWS expression) from both standard SQL and Presto as I'm not sure which one is being used for this query. I know they use a subset of Presto for the analytics so it makes sense that they would use it for the delta but the docs simply say '... any valid SQL expression'.
Expressions I have tried so far with no luck:
from_unixtime(timestamp_sec)
from_unixtime(timestamp_milli)
cast(from_unixtime(unixtime_sec) as date)
cast(from_unixtime(unixtime_milli) as date)
date_format(from_unixtime(timestamp_sec), '%Y-%m-%dT%h:%i:%s')
date_format(from_unixtime(timestamp_milli), '%Y-%m-%dT%h:%i:%s')
from_iso8601_timestamp(datetime)
What are the offset and time expression parameters that you are using?
Since delta windows are effectively filters inserted into your SQL, you can troubleshoot them by manually inserting the filter expression into your data set's query.
Namely, applying a delta window filter with -3 minute (negative) offset and 'from_unixtime(my_timestamp)' time expression to a 'SELECT my_field FROM my_datastore' query translates to an equivalent query:
SELECT my_field FROM
(SELECT * FROM "my_datastore" WHERE
(__dt between date_trunc('day', iota_latest_succeeded_schedule_time() - interval '1' day)
and date_trunc('day', iota_current_schedule_time() + interval '1' day)) AND
iota_latest_succeeded_schedule_time() - interval '3' minute < from_unixtime(my_timestamp) AND
from_unixtime(my_timestamp) <= iota_current_schedule_time() - interval '3' minute)
Try using a similar query (with no delta time filter configured) with correct values for offset and time expression and see what you get. The (__dt between ...) clause is just an optimization for limiting the scanned partitions; you can remove it for the purposes of troubleshooting.
Please try the following:
Set query to SELECT * FROM dev_iot_analytics_datastore
Data selection filter:
Data selection window: Delta time
Offset: -1 Hours
Timestamp expression: from_unixtime(timestamp_sec)
Wait for dataset content to run for a bit, say 15 minutes or more.
Check contents
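If you would rather configure the data set programmatically than through the console, a rough boto3 sketch of the same settings might look like the following (the dataset name, action name, and cron expression are placeholders I am assuming; the SQL, offset, and timestamp expression mirror the steps above):

import boto3

client = boto3.client("iotanalytics")
client.create_dataset(
    datasetName="dev_iot_analytics_dataset",  # placeholder name
    actions=[{
        "actionName": "hourly_query",  # placeholder name
        "queryAction": {
            "sqlQuery": "SELECT * FROM dev_iot_analytics_datastore",
            "filters": [{
                "deltaTime": {
                    "offsetSeconds": -3600,  # -1 hour offset, as in the steps above
                    "timeExpression": "from_unixtime(timestamp_sec)",
                },
            }],
        },
    }],
    # Run the query every hour.
    triggers=[{"schedule": {"expression": "cron(0 * * * ? *)"}}],
)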
After several weeks of testing and trying all the suggestions in this post, along with many more, it appears that the extremely technical answer was to 'switch it off and back on'. I deleted the whole analytics stack, rebuilt everything with different names, and it now seems to be working!
It's important to note that even though I have flagged this as the correct answer due to the actual resolution, both the answers provided by #Populus and #Roger would have been correct had my deployment been functioning as expected.
I found by chance that changing SELECT * FROM datastore to SELECT id1, id2, ... FROM datastore solved the problem.

Replace the text of a specific cell given certain circumstances

I am creating a work schedule for the company I work at. There are four different jobs at the company and therefore 4 separate tabs for schedules.
I have a tab specifically for when an employee calls out sick or requests time off. I am looking for a way for the schedule to update automatically: when the user enters the employee's name, the specific date, and sick/request off, it should update the work schedule of whichever job that employee belongs to (Job1, Job2, Job3, or Job4).
Example:
This is John Doe's Work Schedule for Job 1 (and therefore located on the Job1 tab).
John Doe calls out sick on Friday, 01/18/19. The supervisor fills out the following on the Time Off Reqs/Sick tab.
Given that the user inputs the above data in the Time Off Reqs/Sick tab, I would like John Doe's schedule to change automatically in the Job1 tab to the following.
Here is the link to my dummy data
Any help is greatly appreciated!
I was able to get what you were looking for. See the sheet (and make a copy to modify) HERE.
I used the formula below in each cell of the "Job1" sheet (for this example, I only did it for Column F).
=IFERROR(IF(AND(D2<>"SATURDAY",D2<>"SUNDAY",ISNA(QUERY(Requests!A$2:C,"Select C where A='"&$F$1&"' and B = date '"&TEXT(DATEVALUE(C2),"yyyy-mm-dd")&"'",0)))=TRUE,"Work",QUERY(Requests!A$2:C,"Select C where A='"&$F$1&"' and B = date '"&TEXT(DATEVALUE(C2),"yyyy-mm-dd")&"'",0)),"Off")
You could probably also use multiple INDEX/MATCH statements to get what you're looking for. If you restructure the data, you may be able to use ARRAYFORMULA to reduce the number of formulas you need to use.

Airflow: how to get a response from a BigQuery query for data availability and, based on the result, kick off a task/subdag

The requirement is to kick off a DAG based on data availability from upstream/dependent tables.
A while-type condition should check data availability (in the BigQuery tables, for n iterations) to determine whether data is available or not. If data is available, kick off the subdag/task; otherwise, continue looping.
It would be great to see a clear example of how to use BigQueryOperator or BigQueryValueCheckOperator and then execute a BigQuery query like this:
SELECT 1
FROM
WHERE datetime BETWEEN TIMESTAMP(CURRENT_DATE())
  AND TIMESTAMP(DATE_ADD(CURRENT_DATE(), 1, 'day'))
LIMIT 1
If the query output is 1 (meaning data is available for today's load), then kick off the DAG; otherwise, continue in the loop as shown in the attached diagram link.
Has anyone set up such a design in an Airflow DAG?
You may check the BaseSensorOperator and BigQueryTableSensor to implement your own Sensor for it. https://airflow.incubator.apache.org/_modules/airflow/operators/sensors.html
Sensor operators keep executing at a time interval and succeed when a criteria is met and fail if and when they time out.
BigQueryTableSensor just checks whether the table exists or not, but it does not check the data in the table. The wiring might be something like this:
task1>>YourSensor>>task2
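For illustration, here is a rough sketch of such a sensor. The class name is hypothetical, the import paths follow the older contrib layout from the page linked above, and the connection id, table, and SQL are placeholders; adjust them for your Airflow version.

from airflow.contrib.hooks.bigquery_hook import BigQueryHook
from airflow.operators.sensors import BaseSensorOperator
from airflow.utils.decorators import apply_defaults


class BigQueryDataAvailabilitySensor(BaseSensorOperator):
    """Succeeds once the given query returns at least one row."""

    @apply_defaults
    def __init__(self, sql, bigquery_conn_id="bigquery_default", *args, **kwargs):
        super(BigQueryDataAvailabilitySensor, self).__init__(*args, **kwargs)
        self.sql = sql
        self.bigquery_conn_id = bigquery_conn_id

    def poke(self, context):
        # Re-run the availability query on every poke; return True once it yields rows.
        hook = BigQueryHook(bigquery_conn_id=self.bigquery_conn_id)
        df = hook.get_pandas_df(self.sql)
        return len(df) > 0


check_data = BigQueryDataAvailabilitySensor(
    task_id="check_data",
    sql="SELECT 1 FROM your_dataset.your_table "
        "WHERE datetime BETWEEN TIMESTAMP(CURRENT_DATE()) "
        "AND TIMESTAMP(DATE_ADD(CURRENT_DATE(), 1, 'day')) LIMIT 1",  # placeholder table
    poke_interval=600,         # re-check every 10 minutes
    timeout=6 * 60 * 60,       # fail after 6 hours without data
    dag=dag,                   # assumes an existing DAG object, plus task1 and task2 below
)
task1 >> check_data >> task2

The sensor keeps re-running the availability query at poke_interval until it returns a row, then lets task2 run; if the timeout is reached first, the sensor (and the run) fails, which matches the loop-until-data-or-give-up behaviour described in the question.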