Apache Airflow 1.9 : Dataflow exception at the job's end - python-2.7

I am using Airflow 1.9 to launch a Dataflow on Google Cloud Platform (GCP) thanks to a DataflowJavaOperator.
Below, the code used to launch dataflow from an Airflow Dag :
df_dispatch_data = DataFlowJavaOperator(
task_id='df-dispatch-data', # Equivalent to JobName
jar="/path/of/my/dataflow/jar",
gcp_conn_id="my_connection_id",
dataflow_default_options={
'project': my_project_id,
'zone': 'europe-west1-b',
'region': 'europe-west1',
'stagingLocation': 'gs://my-bucket/staging',
'tempLocation': 'gs://my-bucket/temp'
},
options={
'workerMachineType': 'n1-standard-1',
'diskSizeGb': '50',
'numWorkers': '1',
'maxNumWorkers': '50',
'schemaBucket': 'schemas_needed_to_dispatch',
'autoscalingAlgorithm': 'THROUGHPUT_BASED',
'readQuery': 'my_query'
}
)
However, even if all is right on GCP because the job succeed, an exception occured at the end of the dataflow job on my compute Airflow. It is thrown by the gcp_dataflow_hook.py :
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 27, in <module>
args.func(args)
File "/usr/local/lib/python2.7/dist-packages/airflow/bin/cli.py", line 528, in test
ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1584, in run
session=session)
File "/usr/local/lib/python2.7/dist-packages/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1493, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/operators/dataflow_operator.py", line 121, in execute
hook.start_java_dataflow(self.task_id, dataflow_options, self.jar)
File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 152, in start_java_dataflow
task_id, variables, dataflow, name, ["java", "-jar"])
File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 146, in _start_dataflow
self.get_conn(), variables['project'], name).wait_for_done()
File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 31, in __init__
self._job = self._get_job()
File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 48, in _get_job
job = self._get_job_id_from_name()
File "/usr/local/lib/python2.7/dist-packages/airflow/contrib/hooks/gcp_dataflow_hook.py", line 40, in _get_job_id_from_name
for job in jobs['jobs']:
KeyError: 'jobs'
Have you got an idea ?

This issue is caused by options used to launch the dataflow. If --zone or --region are given the google API to get job status does not work, only if default zone and regions, US/us-central1.

Related

MWAA Trigger DAG Issue with POST request

I have a problem when I try to execute multiple Tasks within MWAA using POST Requests. I have been using mw1.small tier of MWAA and I schedule around 3 tasks per minute with EventBridge and Lambda. When I see my results I find that some tasks are missing and when I search for logs, I noticed that the Task was triggered but It was never scheduled or queued, and it does not appear on the tree or graph view.
I have 169 rules created on Event Bridge running a certain time everyday and I only see around 165 to 166 executions of the DAG. It is not a problem from Event Bridge or Lambda. I checked the logs for those services and all 169 DAG invocations are working fine.
The lambda function that I mentioned before triggers the DAG using a POST Request for every rule that I have on Event Bridge.
These are my configuration options that I have set.
celery.pool=1
celery.worker_autoscale=1,1
core.dag_file_processor_timeout=150
core.dagbag_import_timeout=90
core.killed_task_cleanup_time=604800
core.min_serialized_dag_update_interval=60
scheduler.dag_dir_list_interval=300
scheduler.min_file_process_interval=300
scheduler.parsing_processes=1
scheduler.processor_poll_interval=60
scheduler.schedule_after_task_execution=false
NOTE: I know I can use Step Functions but this is not an option in my case.
EDIT:
This problem is caused because I have multiple parallel requests made from the lambda function. Airflow 2.2.2 uses dag_id and execution_date as a primary key for the table dag_run.
The two types of traceback that I found are:
/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py:91 DeprecationWarning: Calling `DAG.create_dagrun()` without an explicit data interval is deprecated
Traceback (most recent call last):
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dag_run_dag_id_execution_date_key"
DETAIL: Key (dag_id, execution_date)=(test_dag, 2023-02-16 20:19:55+00) already exists.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/airflow/.local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 48, in main
args.func(args)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
return func(*args, **kwargs)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
return f(*args, **kwargs)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/dag_command.py", line 138, in dag_trigger
dag_id=args.dag_id, run_id=args.run_id, conf=args.conf, execution_date=args.exec_date
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/client/local_client.py", line 30, in trigger_dag
dag_id=dag_id, run_id=run_id, conf=conf, execution_date=execution_date
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 125, in trigger_dag
replace_microseconds=replace_microseconds,
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 91, in _trigger_dag
dag_hash=dag_bag.dags_hash.get(dag_id),
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/session.py", line 70, in wrapper
return func(*args, session=session, **kwargs)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/models/dag.py", line 2359, in create_dagrun
session.flush()
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2540, in flush
self._flush(objects)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2682, in _flush
transaction.rollback(_capture_exception=True)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
with_traceback=exc_tb,
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2642, in _flush
flush_context.execute()
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/unitofwork.py", line 589, in execute
uow,
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
insert,
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1136, in _emit_insert_statements
statement, params
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
distilled_params,
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
e, statement, parameters, cursor, context
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
cursor, statement, parameters, context
File "/usr/local/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dag_run_dag_id_execution_date_key"
DETAIL: Key (dag_id, execution_date)=(test_dag, 2023-02-16 20:19:55+00) already exists.
[SQL: INSERT INTO dag_run (dag_id, queued_at, execution_date, start_date, end_date, state, run_id, creating_job_id, external_trigger, run_type, conf, data_interval_start, data_interval_end, last_scheduling_decision, dag_hash) VALUES (%(dag_id)s, %(queued_at)s, %(execution_date)s, %(start_date)s, %(end_date)s, %(state)s, %(run_id)s, %(creating_job_id)s, %(external_trigger)s, %(run_type)s, %(conf)s, %(data_interval_start)s, %(data_interval_end)s, %(last_scheduling_decision)s, %(dag_hash)s) RETURNING dag_run.id]
[parameters: {'dag_id': 'test_dag', 'queued_at': datetime.datetime(2023, 2, 16, 20, 19, 56, 168249, tzinfo=Timezone('UTC')), 'execution_date': DateTime(2023, 2, 16, 20, 19, 55, tzinfo=Timezone('UTC')), 'start_date': None, 'end_date': None, 'state': <TaskInstanceState.QUEUED: 'queued'>, 'run_id': 'test22__2023-02-16T20:19:03+602430', 'creating_job_id': None, 'external_trigger': True, 'run_type': <DagRunType.MANUAL: 'manual'>, 'conf': <psycopg2.extensions.Binary object at 0x7fe5917cc900>, 'data_interval_start': DateTime(2023, 2, 16, 20, 19, 55, tzinfo=Timezone('UTC')), 'data_interval_end': DateTime(2023, 2, 16, 20, 19, 55, tzinfo=Timezone('UTC')), 'last_scheduling_decision': None, 'dag_hash': 'a1c4fce80be1afad038a0ccd8a41efcf'}]
(Background on this error at: http://sqlalche.me/e/13/gkpj)
and
Traceback (most recent call last):
File "/usr/local/airflow/.local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 48, in main
args.func(args)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 48, in command
return func(*args, **kwargs)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/utils/cli.py", line 92, in wrapper
return f(*args, **kwargs)
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/dag_command.py", line 138, in dag_trigger
dag_id=args.dag_id, run_id=args.run_id, conf=args.conf, execution_date=args.exec_date
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/client/local_client.py", line 30, in trigger_dag
dag_id=dag_id, run_id=run_id, conf=conf, execution_date=execution_date
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 125, in trigger_dag
replace_microseconds=replace_microseconds,
File "/usr/local/airflow/.local/lib/python3.7/site-packages/airflow/api/common/experimental/trigger_dag.py", line 75, in _trigger_dag
f"A Dag Run already exists for dag id {dag_id} at {execution_date} with run id {run_id}"
airflow.exceptions.DagRunAlreadyExists: A Dag Run already exists for dag id test_dag at 2023-02-16 20:20:23+00:00 with run id test21__2023-02-16T20:20:04+061773

How to pass args and template_fields to dataproc from airflow 1

I am trying to execute python code on a dataproc cluster via airflow orchestration.
I am using airflow 1.10.12, and DataprocWorkflowTemplateInstantiateInlineOperator to instanciate a dataproc cluster & pass some parameters (and templated params aswell). The main objective is to run some prediction code.
Note that I upgraded this code so use airflow.providers.google.cloud.operators.dataproc and not airflow.contrib.operators.dataproc_operator to import DataprocInstantiateInlineWorkflowTemplateOperator, because the former introduced the parameters kwarg that, in theory, permits passing arguments to the cluster. Using the later in my other scripts, I have no errors, but I cannot introduce parameters to the cluster.
...
from airflow.providers.google.cloud.operators.dataproc import (
DataprocInstantiateInlineWorkflowTemplateOperator,
)
...
workflow_seg_members = make_workflow_template(
region=REGION,
dataproc_job_bucket=DATAPROC_JOB_BUCKET,
python_main_executable_path="segmentation_members/seg_members_prediction.py",
)
op_seg_members_prediction = DataprocInstantiateInlineWorkflowTemplateOperator(
task_id="seg_members_prediction",
project_id=PROJECT_ID,
region=REGION,
template=workflow_seg_members,
parameters={
"execution_date_str": "{{ds_nodash}}",
"project": "<REDACTED>",
"dataset": "<REDACTED>",
"features_table_prefix": "global_features",
"output_table_prefix": "seg_members_output",
"path_to_model": "segmentation_members/DecisionTreeClassifier.pkl",
"bucket_name": "<REDACTED>",
"model_designation": "Segmentation Members",
},
)
in seg_members_prediction.py, I use argparse to create the needed arguments.
The error I am getting is :
TypeError: Parameter to MergeFrom() must be instance of same class: expected google.cloud.dataproc.v1beta2.OrderedJob got str.
My questions are :
How do I fix this MergeFrom() exception?
Is this the right approach to pass parameters from airflow to my dataproc cluster?
Here is the complete stack :
File "/usr/local/lib/airflow/airflow/providers/google/common/hooks/base_google.py", line 373, in inner_wrapper
return func(self, *args, **kwargs)
File "/usr/local/lib/airflow/airflow/providers/google/cloud/hooks/dataproc.py", line 712, in instantiate_inline_workflow_template
metadata=metadata,
File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/dataproc_v1beta2/gapic/workflow_template_service_client.py", line 488, in instantiate_inline_workflow_template
request_id=request_id,
TypeError: Parameter to MergeFrom() must be instance of same class: expected google.cloud.dataproc.v1beta2.OrderedJob got str.
[2022-06-13 12:36:34,696] {taskinstance.py:1197} INFO - Marking task as UP_FOR_RETRY. dag_id=ds_seg_members_integration, task_id=seg_members_prediction, execution_date=20220610T162234, start_date=20220613T123629, end_date=20220613T123634
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 7, in <module>
exec(compile(f.read(), __file__, 'exec'))
File "/usr/local/lib/airflow/airflow/bin/airflow", line 37, in <module>
args.func(args)
File "/usr/local/lib/airflow/airflow/utils/cli.py", line 76, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/bin/cli.py", line 561, in run
_run(args, dag, ti)
File "/usr/local/lib/airflow/airflow/bin/cli.py", line 480, in _run
pool=args.pool,
File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/models/taskinstance.py", line 986, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/airflow/airflow/providers/google/cloud/operators/dataproc.py", line 1748, in execute
metadata=self.metadata,
File "/usr/local/lib/airflow/airflow/providers/google/common/hooks/base_google.py", line 373, in inner_wrapper
return func(self, *args, **kwargs)
File "/usr/local/lib/airflow/airflow/providers/google/cloud/hooks/dataproc.py", line 712, in instantiate_inline_workflow_template
metadata=metadata,
File "/opt/python3.6/lib/python3.6/site-packages/google/cloud/dataproc_v1beta2/gapic/workflow_template_service_client.py", line 488, in instantiate_inline_workflow_template
request_id=request_id,
TypeError: Parameter to MergeFrom() must be instance of same class: expected google.cloud.dataproc.v1beta2.OrderedJob got str.
EDIT :
I tried running the following code but still got the same error :
op_seg_members_prediction = DataprocInstantiateInlineWorkflowTemplateOperator(
task_id="seg_members_prediction",
project_id=PROJECT_ID,
region=REGION,
template=workflow_seg_members,
)
op_seg_members_prediction.execute(context="DEBUG")
After browsing the apache-airflow-providers-google doc I discovered it isn't compatible with airflow 1.
You can install this package on top of an existing Airflow 2
installation (see Requirements below for the minimum Airflow version
supported) via pip install apache-airflow-providers-google
The package supports the following python versions: 3.7,3.8,3.9,3.10
So I upgraded to airflow 2 in my local environment, and the MergeFrom() error stopped occurring.
This still begs the question : How do you pass parameters from airflow to dataproc for airflow 1

sam build --use-container failed mounting directory

There was no issue in building the project a little while back, but it started throwing below error.
RuntimeError: Container does not exist. Cannot get logs for this
container
Normally this happens when docker cannot mount the shared directory, but in this case even adding the lambda directory manually in the docker interface didn't help!
Complete debug log of sam build --use-container
Building function 'SAListManagerUrlLambda'
Fetching lambci/lambda:build-python3.7 Docker container image......
Mounting C:\Users\xxxx\xxxx\xxxx\xxxx\functions\xxxx-xxxx\xxxx-xxxx as /tmp/samcli/source:ro,delegated inside runtime container
Container was not created. Skipping deletion
Sending Telemetry: {'metrics': [{'commandRun': {'awsProfileProvided': False, 'debugFlagProvided': True, 'region': '', 'commandName': 'sam build', 'duration': 1292, 'exitReason': 'RuntimeError', 'exitCode': 255, 'requestId': 'cbfcd29c-16ae-xxxx-xxxx-b9ffec8de75a', 'installationId': 'fece8ccc-cb84-xxxx-xxxx-ac72820ef0c3', 'sessionId': 'e1cbc287-1850-xxxx-xxxx-3a235769f7fb', 'executionEnvironment': 'CLI', 'pyversion': '3.7.6', 'samcliVersion': '0.53.0'}}]}
HTTPSConnectionPool(host='aws-serverless-tools-telemetry.us-west-2.amazonaws.com', port=443): Read timed out. (read timeout=0.1)
Traceback (most recent call last):
File "D:\obj\windows-release\37amd64_Release\msi_python\zip_amd64\runpy.py", line 193, in _run_module_as_main
File "D:\obj\windows-release\37amd64_Release\msi_python\zip_amd64\runpy.py", line 85, in _run_code
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\__main__.py", line 12, in <module>
cli(prog_name="sam")
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\core.py", line 782, in main
rv = self.invoke(ctx)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\core.py", line 610, in invoke
return callback(*args, **kwargs)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\decorators.py", line 73, in new_func
return ctx.invoke(f, obj, *args, **kwargs)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\click\core.py", line 610, in invoke
return callback(*args, **kwargs)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\lib\telemetry\metrics.py", line 96, in wrapped
raise exception # pylint: disable=raising-bad-type
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\lib\telemetry\metrics.py", line 62, in wrapped
return_value = func(*args, **kwargs)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\commands\build\command.py", line 129, in cli
mode,
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\commands\build\command.py", line 194, in do_cli
artifacts = builder.build()
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\lib\build\app_builder.py", line 117, in build
function.metadata)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\lib\build\app_builder.py", line 271, in _build_function
options)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\lib\build\app_builder.py", line 369, in _build_function_on_container
container.wait_for_logs(stdout=stdout_stream, stderr=stderr_stream)
File "C:\Amazon\AWSSAMCLI\runtime\lib\site-packages\samcli\local\docker\container.py", line 197, in wait_for_logs
raise RuntimeError("Container does not exist. Cannot get logs for this container")
RuntimeError: Container does not exist. Cannot get logs for this container
In my case the reason was different, Action Center's Focus Assist was set to Alarms Only.
This caused the share directory notification to fail, causing the build failure.
So, make sure your Focus Assist is set to OFF.
It seems that many situations can trigger the same error. For more information the --debug option can be used like this:
sam build --use-container --debug
I see that you are using it, because you got extra information like this:
Sending Telemetry: {'metrics': [{'commandRun': {'awsProfileProvided': False, 'debugFlagProvided': True, 'region': '', 'commandName': 'sam build', 'duration': 1292, 'exitReason': 'RuntimeError', 'exitCode': 255, 'requestId': 'cbfcd29c-16ae-xxxx-xxxx-b9ffec8de75a', 'installationId': 'fece8ccc-cb84-xxxx-xxxx-ac72820ef0c3', 'sessionId': 'e1cbc287-1850-xxxx-xxxx-3a235769f7fb', 'executionEnvironment': 'CLI', 'pyversion': '3.7.6', 'samcliVersion': '0.53.0'}}]}
HTTPSConnectionPool(host='aws-serverless-tools-telemetry.us-west-2.amazonaws.com', port=443): Read timed out. (read timeout=0.1)
Traceback (most recent call last):
In my case I did suppose that the error was sending the telemetry.
My guess is that somehow the build process need pass the region. In my case it is not us-west-2.
Anyway, I disabled it as specified in the documentation and it now works.
In my case local disk in my cloud9 was almost full, so I had to delete some docker images that comes pre-installed with cloud9.
To remove an image use
docker rmi Image
This will clear up space and your build will not fail the next time.

Cloud composer issue with datasets in Australia region

I was trying to use cloud composer to schedule and orchestrate Bigquery jobs. Bigquery tables are in australia-southeast1 region.The cloud composer environment was created in us-central1 region(As composer is not available in Australia region). When I try below command , it throws a vague error. The same setup worked fine when I tried with datasets residing in EU and US.
Command:
gcloud beta composer environments run bq-schedule --location us-central1 test -- my_bigquery_dag input_gl 8-02-2018
Error:
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 7, in <module>
exec(compile(f.read(), __file__, 'exec'))
File "/usr/local/lib/airflow/airflow/bin/airflow", line 27, in <module>
args.func(args)
File "/usr/local/lib/airflow/airflow/bin/cli.py", line 528, in test
ti.run(ignore_task_deps=True, ignore_ti_state=True, test_mode=True)
File "/usr/local/lib/airflow/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/models.py", line 1583, in run
session=session)
File "/usr/local/lib/airflow/airflow/utils/db.py", line 50, in wrapper
result = func(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/models.py", line 1492, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/airflow/airflow/contrib/operators/bigquery_operator.py", line 98, in execute
self.create_disposition, self.query_params)
File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 499, in run_query
return self.run_with_configuration(configuration)
File "/usr/local/lib/airflow/airflow/contrib/hooks/bigquery_hook.py", line 868, in run_with_configuration
err.resp.status)
Exception: ('BigQuery job status check failed. Final error was: %s', 404)
Is there any workaround to resolve this issue?
Because your dataset resides in australia-southeast1, BigQuery created a job in the same location by default, which is australia-southeast1. However, the Airflow in your Composer environment was trying to get the job's status without specifying location field.
Reference: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
This has been fixed by my PR and it has been merged to master.
To work around this, you can extend the BigQueryCursor and override the run_with_configuration() function with location support.

aws no credentials error

I am trying to setup dynamic thumbnail service thumbor and to support s3 as storage, I need to setup this community powered pip library for aws.
Its working well on my local environment but when I am trying to host it on one of our servers, I am getting NoCredentialsError. I am assuming this is because of difference versions of botocore (latest one and one installed by pip library). Here is error log:
File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 774, in get_component
# client config from the session
File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 174, in <lambda>
self._components.lazy_register_component(
File "/usr/local/lib/python2.7/dist-packages/botocore/session.py", line 453, in get_data
- agent_version is the value of the `user_agent_version`
File "/usr/local/lib/python2.7/dist-packages/botocore/loaders.py", line 119, in _wrapper
data = func(self, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/botocore/loaders.py", line 364, in load_data
DataNotFoundError: Unable to load data for: _endpoints
2016-04-24 12:14:34 tornado.application:ERROR Future exception was never retrieved: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/tornado/gen.py", line 230, in wrapper
yielded = next(result)
File "/usr/local/lib/python2.7/dist-packages/thumbor/handlers/imaging.py", line 31, in check_image
exists = yield gen.maybe_future(self.context.modules.storage.exists(kw['image'][:self.context.config.MAX_ID_LENGTH]))
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 455, in wrapper
future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 215, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 443, in wrapper
result = f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tc_aws/aws/storage.py", line 107, in exists
self.storage.get(file_abspath, callback=return_data)
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 455, in wrapper
future.result()
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 215, in result
raise_exc_info(self._exc_info)
File "/usr/local/lib/python2.7/dist-packages/tornado/concurrent.py", line 443, in wrapper
result = f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tc_aws/aws/bucket.py", line 44, in get
Key=self._clean_key(path),
File "/usr/local/lib/python2.7/dist-packages/tornado_botocore/base.py", line 97, in call
return self._make_api_call(operation_name=self.operation, api_params=kwargs, callback=callback)
File "/usr/local/lib/python2.7/dist-packages/tornado_botocore/base.py", line 60, in _make_api_call
operation_model=operation_model, request_dict=request_dict, callback=callback)
File "/usr/local/lib/python2.7/dist-packages/tornado_botocore/base.py", line 54, in _make_request
request_dict=request_dict, operation_model=operation_model, callback=callback)
File "/usr/local/lib/python2.7/dist-packages/tornado_botocore/base.py", line 32, in _send_request
request = self.endpoint.create_request(request_dict, operation_model)
File "/usr/local/lib/python2.7/dist-packages/botocore/endpoint.py", line 126, in create_request
operation_name=operation_model.name)
File "/usr/local/lib/python2.7/dist-packages/botocore/hooks.py", line 226, in emit
return self._emit(event_name, kwargs)
File "/usr/local/lib/python2.7/dist-packages/botocore/hooks.py", line 209, in _emit
response = handler(**kwargs)
File "/usr/local/lib/python2.7/dist-packages/botocore/signers.py", line 90, in handler
return self.sign(operation_name, request)
File "/usr/local/lib/python2.7/dist-packages/botocore/signers.py", line 124, in sign
signer.add_auth(request=request)
File "/usr/local/lib/python2.7/dist-packages/botocore/auth.py", line 626, in add_auth
raise NoCredentialsError
NoCredentialsError: Unable to locate credentials
Could it be fixed with proper ordering in which I install libraries? Because the pip library removes existing newer version of botocore and installs an older version.
EDIT:
I am running processes with supervisor and it seems process cant access aws credentials
EDIT 2:
The issue got resolved with proper configuration of supervisor. The user for process started by supervisor did not have access to config file
The issue got resolved with proper configuration of supervisor. The user for subprocess started by supervisor did not have access to aws config file. So it was working with local environment or creating process separately but not with supervisor.