Airflow dag is not running as per start date and schedule interval - airflow-scheduler

Today is 6th May.
The dag arguments are:
'start_date': datetime.datetime(2021, 5, 6),
'email_on_failure': False,
'email_on_retry': False,
'retries': 1,
'retry_delay': timedelta(minutes=5),
The schedule interval is:
schedule_interval="0 14 * * *"
It should have started on 2021-05-06 at 14:00 UTC, but it has not started.

This is expected.
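A start_date of 2021-05-06 with a schedule interval of 0 14 * * * means that the first run will start on 2021-05-07 at 14:00 UTC. That run will have an execution_date of 2021-05-06 14:00. Airflow triggers a run at the end of the interval it covers, not at the start.
For more background, please read the previous answer on this subject.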
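The timing described above can be sketched with plain datetime arithmetic. This is a simplified model of a daily schedule, not Airflow's scheduler code; the helper name first_daily_run is hypothetical.

```python
from datetime import datetime, timedelta

def first_daily_run(start_date, hour, minute=0):
    """Return (execution_date, trigger_time) for the first run of a daily schedule."""
    # First schedule point at or after start_date.
    point = start_date.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if point < start_date:
        point += timedelta(days=1)
    # The run is triggered once the one-day interval starting at `point` has ended.
    return point, point + timedelta(days=1)

execution_date, trigger_time = first_daily_run(datetime(2021, 5, 6), hour=14)
print(execution_date)  # 2021-05-06 14:00:00
print(trigger_time)    # 2021-05-07 14:00:00
```

If the DAG should actually run on 2021-05-06 at 14:00, the same model suggests setting start_date one interval earlier, i.e. 2021-05-05.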

Related

Cron Expression in aws weekly

Is it possible to create a cron in AWS CloudWatch that runs every hour from 9:30 a.m. to 4:30 p.m. Monday through Friday?
In the documentation here, the closest example I have is this:
0/5, 8-17, ?, *, MON-FRI, * = Run every 5 minutes Monday through Friday between 8:00 am and 5:55 pm (UTC+0).
From the example above, where is it defined that it will end at 55 minutes past 5 pm? Ignoring that, something like this occurs to me:
0/60, 9-16, ?, *, MON-FRI, *
but I'm not sure what it means or whether it's correct. Also, it starts from 9:00 rather than 9:30.
I hope you can help me. Thanks in advance.
I used this calculator to verify and generate cron expressions.
In the example you provide, 0/5, 8-17, ?, *, MON-FRI, *:
0/5 means it runs every five minutes starting at minute 00 (inclusive), so the last run in each hour is at minute 55.
8-17 means it runs during hours 8 to 17, with both 8 and 17 inclusive.
Combined, the last run of the day is at 17:55.
So for your use case: 0, 10-16, ?, *, MON-FRI, *
(The full hours between 9:30 and 4:30 are 10-16, and it only needs to run at the start of each hour, i.e. at minute 00.)
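To make the field semantics above concrete, here is a minimal sketch that expands a single minutes or hours field into the values it matches. The helper expand_field is hypothetical and covers only the numeric forms used in this thread (a, a-b, a/n, *), not the full AWS syntax.

```python
def expand_field(field, low, high):
    """Expand one cron field ('*', 'a', 'a-b', 'a/n', 'a-b/n') into a sorted list of values."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = int(rng)
            # 'a/n' means "from a to the end of the range, every n".
            end = high if "/" in part else start
        values.update(range(start, end + 1, step))
    return sorted(values)

minutes = expand_field("0/5", 0, 59)
hours = expand_field("8-17", 0, 23)
print(minutes[-1], hours[-1])  # 55 17

# Assumption: if the runs should fall on the half hour (9:30 ... 16:30),
# the fields would be 30 for minutes and 9-16 for hours.
half_hour_runs = [(h, m) for h in expand_field("9-16", 0, 23)
                  for m in expand_field("30", 0, 59)]
print(half_hour_runs[0], half_hour_runs[-1])  # (9, 30) (16, 30)
```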

google finance function not correct in google sheets

The function below should be pulling back the Amazon share price on 04/04/2020. The result states $11.43, which is incorrect. This has been working for the past 6 months but not today for some reason. Is this an issue with Google Finance?
=GOOGLEFINANCE("amzn","price",date(2020,4,4))
Result is:
Date Close
06/04/2020 16:00:00 11.43
Well, 04/04/2020 is a Saturday, so
=GOOGLEFINANCE("amzn", "price", DATE(2020, 4, 4))
will round it up to Monday the 6th.
If you want to get the latest value 11,2 (from Friday 04/03/2020) you can do:
=GOOGLEFINANCE("NASDAQ:AMZN", "price",
IF(TEXT(DATE(2020, 4, 4), "ddd")="Sat", DATE(2020, 4, 4)-1,
IF(TEXT(DATE(2020, 4, 4), "ddd")="Sun", DATE(2020, 4, 4)-2,
DATE(2020, 4, 4))))
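The same weekend adjustment can be sketched outside of Sheets. The helper previous_trading_weekday is a hypothetical name, and the sketch only handles Saturdays and Sundays; market holidays are not considered.

```python
from datetime import date, timedelta

def previous_trading_weekday(d):
    """Move a Saturday or Sunday back to the preceding Friday; leave weekdays unchanged."""
    # date.weekday(): Monday is 0, Friday is 4, Saturday is 5, Sunday is 6.
    days_past_friday = max(0, d.weekday() - 4)
    return d - timedelta(days=days_past_friday)

print(previous_trading_weekday(date(2020, 4, 4)))  # 2020-04-03
```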

How to schedule a dag to run a particular time of day of a month of a year. i.e., once

I'm trying to execute a dag which needs to be run only once, so I set the dag's schedule interval to '@once'. However, I'm getting the error mentioned in this link:
https://issues.apache.org/jira/browse/AIRFLOW-1400
Now I'm trying to pass the exact date of execution as below:
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2017, 11, 13),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(seconds=5)
}
dag = DAG(
    dag_id='dagNameTest', default_args=default_args, schedule_interval='12 09 13 11 2017', concurrency=1)
This throws the following error:
File "/usr/lib/python2.7/site-packages/croniter/croniter.py", line 543, in expand
expr_format))
CroniterBadCronError: [12 09 13 11 2017] is not acceptable, out of range
Can someone help resolve this?
Thanks,
Arjun
You have 2017 in the "day of week" spot. Try 12 09 13 11 *. You are trying to set a "date" in a field "schedule interval". So technically this will schedule it for once a year. You can run it this way, and when it has finished, deactivate the DAG.
Set a yearly interval for the minute, hour, day, month, and weekday number you want, i.e. 12 09 13 11 *. Set your DAG's start_date and end_date before and after that date respectively, and it should run only once at that time.
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2017, 1, 1),
    'end_date': datetime(2017, 12, 31),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(seconds=5)
}
dag = DAG(
    dag_id='dagNameTest', default_args=default_args, schedule_interval='12 09 13 11 *', concurrency=1)
Since datetime also accepts a time of day, it may be possible to set the start_date with the hour and minute and then use the @once schedule. But I haven't tried that myself.
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    # 09:12 on 2017-11-13, matching the cron fields '12 09 13 11'
    'start_date': datetime(2017, 11, 13, 9, 12),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(seconds=5)
}
dag = DAG(
    dag_id='dagNameTest', default_args=default_args, schedule_interval='@once', concurrency=1)
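To see why the original expression was rejected as "out of range", here is a minimal sketch that checks each numeric field against the standard five-field cron ranges. It is not croniter's own validation code; the helper out_of_range_fields is hypothetical, and the day-of-week range 0-7 (with 7 as an alias for Sunday) is an assumption based on common cron implementations.

```python
# Standard five-field cron layout: minute, hour, day of month, month, day of week.
CRON_RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),  # assumption: 7 is accepted as Sunday
]

def out_of_range_fields(expr):
    """Return the names of purely numeric fields whose value falls outside its cron range."""
    bad = []
    for value, (name, low, high) in zip(expr.split(), CRON_RANGES):
        if value.isdigit() and not low <= int(value) <= high:
            bad.append(name)
    return bad

print(out_of_range_fields("12 09 13 11 2017"))  # ['day of week']
print(out_of_range_fields("12 09 13 11 *"))     # []
```

There is no year field in this layout, which is why 2017 lands in the day-of-week position.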

Days of the week Django

Please explain how to do this: I have a week number (52, for example) and a year (2012). How can I get the day-of-month numbers for that week (Monday - 24, Tuesday - 25, etc.)? Yes, I read this, but I can't understand how to do it.
Thanks.
I would do it like this:
from datetime import date, timedelta

def get_weekdays(year, week):
    january_first = date(year, 1, 1)
    monday_date = january_first + timedelta(days=week * 7 - january_first.weekday())
    # monday, tuesday, .. sunday
    return [(monday_date + timedelta(days=d)).day for d in range(7)]
(My weeks start on Monday.)
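If ISO 8601 week numbering is what is wanted (an assumption, since the question does not say which numbering applies), a minimal alternative sketch uses date.fromisocalendar, available in Python 3.8 and later. The function name iso_weekdays is hypothetical.

```python
from datetime import date

def iso_weekdays(year, week):
    """Return the day-of-month numbers, Monday through Sunday, of an ISO week."""
    # ISO weekdays are numbered 1 (Monday) to 7 (Sunday).
    return [date.fromisocalendar(year, week, d).day for d in range(1, 8)]

print(iso_weekdays(2012, 52))  # [24, 25, 26, 27, 28, 29, 30]
```

Note that the two approaches can disagree in years where 1 January falls on a Monday, because they count week 1 differently.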

Converting UTC to local

My timezone is United States Eastern Standard Time which is 5 hours behind UTC. Given that:
struct tm t = { 0, 30, 15, 10, 3, 112, 0, 0, -1 };
time_t utc_in_timet = _mkgmtime(&t);
struct tm tt = { 0 };
localtime_s(&tt, &utc_in_timet);
tt is off by one hour when localtime_s returns. I have 11:30 in there instead of 10:30.
What am I missing?
I think it has something to do with daylight saving time. Are you sure your timezone is currently EST (-5)? It seems your system is using EDT (-4).
I have tried your code on my machine and it works correctly (my time zone is GMT+2). Since you are telling your system to check for daylight saving time itself (the last field of the tm is -1), it is actually using EDT and is thus giving you GMT-4.
You can try replacing the month (3) with 2, so that the date would be March 10th, just before the daylight savings change; I bet you will get the expected 10:30 in that case.
Verify your local timezone. Both England (e.g. London) and the east coast of the U.S. are currently in daylight savings time so this looks to be the issue (as someone already mentioned). For the U.S. east coast EDT would be 4 hours different.
I thought the problem was that the month is March: struct tm t = { 0, 30, 15, 10, 3, 112, 0, 0, -1 };
Then it is a daylight saving time issue. But as pointed out below by Gorpik, "months go from 0 to 11, so April is indeed 3".
So I checked: it shows 18:30 in Haifa, which is correct for UTC+2 with daylight saving time in effect.
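The daylight saving explanation can be checked with a short sketch using Python's zoneinfo module (Python 3.9 and later). It assumes the system has time zone data installed and uses "America/New_York" as a stand-in for the asker's Eastern time zone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")  # assumption: the asker's local zone

# 15:30 UTC on 10 April 2012 (tm_mon = 3) and on 10 March 2012 (tm_mon = 2).
april = datetime(2012, 4, 10, 15, 30, tzinfo=timezone.utc).astimezone(eastern)
march = datetime(2012, 3, 10, 15, 30, tzinfo=timezone.utc).astimezone(eastern)

print(april.strftime("%H:%M %Z"))  # 11:30 EDT
print(march.strftime("%H:%M %Z"))  # 10:30 EST
```

The April date falls after the switch to daylight saving time, so the offset is 4 hours rather than 5.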