In one of our GCP projects, multiple BigQuery scheduled queries are running, but recently the jobs started failing with the following error:
Error Message: Already Exists: Job <PROJECT ID>:US.scheduled_query_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx; JobID:<PROJECT NUMBER>:scheduled_query_xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
The write preference of these queries is "WRITE_APPEND". These jobs were created using GCP console.
Once retriggered these jobs run successfully with a new job ID.
Please help me understand why an already-used job ID is being allocated to these scheduled queries, and suggest a fix if one is available.
Related
I have an AWS Glue PySpark job that runs for a long time after a certain command. After that command nothing is written to the log, not even a simple "print hello" statement.
How can I debug an AWS Glue PySpark job that is long-running and not even writing logs? The job is not throwing any error; it shows a running status in the console.
AWS Glue is based on Apache Spark, which means that no actual execution happens until an action is called. So if you put print statements in between and see them in the logs, that doesn't mean your job has executed up to that point. As your job is long-running, check this article by AWS, which explains Debugging Demanding Stages and Straggler Tasks. Also this is a good blog to take a look at.
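The lazy-evaluation point above can be illustrated with a plain-Python analogy. This is not Spark or Glue code: a generator stands in for a transformation and `list()` stands in for an action, and the names `slow_transform` and `log` are made up for the illustration.

```python
def slow_transform(rows, log):
    # Generator body: nothing here runs until a consumer iterates over it,
    # much like a Spark transformation that has not yet met an action.
    for row in rows:
        log.append(f"processing {row}")
        yield row * 2


log = []
pipeline = slow_transform([1, 2, 3], log)  # "transformation": only builds the plan
log.append("print hello")                  # logged before any row is processed
result = list(pipeline)                    # "action": the real work happens here

for line in log:
    print(line)
```

The log message placed after the transformation is recorded before any row is processed, so a log line's position in the script says little about how far the actual computation has progressed.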
I have simple Dataprep jobs that transfer GCS data to BQ. Until today, the scheduled jobs were running fine, but today two jobs failed and two jobs succeeded only after taking between half an hour and an hour.
The error message I am getting is below:
java.lang.RuntimeException: Failed to create job with prefix beam_load_clouddataprepcmreportalllobmedia4505510bydataprepadmi_aef678fce2f441eaa9732418fc1a6485_2b57eddf335d0c0b09e3000a805a73d6_00001_00000, reached max retries: 3, last failed job:
I ran the same job again; it again took a very long time and failed, but this time with a different message:
Workflow failed. Causes: The Dataflow job appears to be stuck because no worker activity has been seen in the last 1h. Please check the worker logs in Stackdriver Logging. You can also get help with Cloud Dataflow at https://cloud.google.com/dataflow/support.
Any pointers or direction on the possible cause? Links or troubleshooting tips for Dataprep or Dataflow jobs would also be appreciated.
Thank you
There could be a lot of potential causes for the jobs getting stuck: transient issues, a quota or limit being reached, a change in data format or size, or another issue with the resources being used. I suggest starting the troubleshooting from the Dataflow side.
Here are some useful resources that can guide you through the most common job errors, and how to troubleshoot them:
Troubleshooting your pipeline
Dataflow common errors
In addition, you could check the Google Issue Tracker for Dataprep and Dataflow to see if the issue has been reported before:
Issue tracker for Dataprep
Issue tracker for Dataflow
You can also look at the GCP status dashboard to rule out a widespread issue with one of the services:
Google Cloud Status Dashboard
Finally, if you have GCP support you can reach out directly to support. If you don't have support, you can use the Issue tracker to create a new issue for Dataprep and report the behavior you're seeing.
Currently my data goes through the following steps:
New objects in a GCS bucket trigger a Google Cloud Function that creates a BigQuery job to load this data into BigQuery.
I need a low-cost solution to know when this BigQuery job has finished, and to trigger a Dataflow pipeline only after the job is completed.
Obs:
I know about the BigQuery alpha trigger for Google Cloud Functions, but I don't know if it is a good idea. From what I saw, this trigger uses the job ID, which apparently cannot be fixed, so I would have to deploy the function again whenever a job runs. And of course it's an alpha solution.
I read about a Stackdriver Logging -> Pub/Sub -> Google Cloud Function -> Dataflow solution, but I didn't find any log entry that indicates that the job finished.
My files are large, so it isn't a good idea to use a Google Cloud Function to wait until the job finishes.
Regarding what you said about Stackdriver Logging: you can use it with this filter
resource.type="bigquery_resource"
protoPayload.serviceData.jobCompletedEvent.job.jobStatus.state="DONE"
severity="INFO"
You can add a dataset filter as well if needed.
Then create a sink on this advanced filter that exports to a Pub/Sub topic, and have a Cloud Function triggered by that topic run your Dataflow job.
If this doesn't match your expectation, can you detail why?
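As a rough sketch of the receiving side, assuming the sink delivers matching log entries to a Pub/Sub topic that triggers a Python background function: the function decodes the log entry and reads the same `jobCompletedEvent` fields the filter uses. The names `completed_job` and `on_bigquery_log` are hypothetical, and the Dataflow launch is left as a placeholder.

```python
import base64
import json


def completed_job(event):
    """Return (project_id, job_id) for a successfully finished job, else None."""
    # Pub/Sub delivers the exported log entry as base64-encoded JSON in "data".
    entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    job = (
        entry.get("protoPayload", {})
        .get("serviceData", {})
        .get("jobCompletedEvent", {})
        .get("job", {})
    )
    status = job.get("jobStatus", {})
    # Assumption: a failed job also ends in state DONE but carries an error object.
    if status.get("state") != "DONE" or status.get("error"):
        return None
    name = job.get("jobName", {})
    return name.get("projectId"), name.get("jobId")


def on_bigquery_log(event, context):
    """Entry point for a Pub/Sub-triggered function (hypothetical name)."""
    job = completed_job(event)
    if job is None:
        return
    project_id, job_id = job
    # Placeholder: launch the Dataflow pipeline here (not shown).
    print(f"BigQuery job {job_id} in {project_id} finished; start Dataflow now")
```

Because the function only reacts to the completion event, it never has to wait for the load job itself, which keeps its runtime and cost low even for large files.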
You can look at Cloud Composer, which is managed Apache Airflow, for orchestrating jobs in a sequential fashion. You define a DAG, and Composer executes each node of the DAG and checks dependencies to ensure that tasks run either in parallel or sequentially, based on the conditions you have defined.
You can take a look at the example mentioned here - https://github.com/GoogleCloudPlatform/professional-services/tree/master/examples/cloud-composer-examples/composer_dataflow_examples
I have an instance running a database on the Google Cloud Platform (MySQL Second Generation master). It is currently taking a backup of the database, and has been doing so for more than 13 hours!
When I try to log into my database through my terminal, I get the following error message:
ERROR: (gcloud.sql.connect) HTTPError 409: Operation failed because another operation was already in progress.
Any idea why it has taken so many hours to create a backup? Anything I can do to be able to view my database in the meantime?
All help is welcome - thank you!
Any idea why it has taken so many hours to create a backup?
This question can't be answered without inspecting your project and metrics. I suggest you either open a technical support case, if you have technical support, or raise your issue here with your project number and Cloud SQL instance.
Anything I can do to be able to view my database in the meantime?
While a backup is in progress you cannot log in to the instance. The best way to access the data (read-only) is to set up a read replica, which you will be able to access even while a backup of the master instance is in progress.
In Google Cloud ML (Machine Learning), I submitted a job, but it failed due to a Python error in the code.
After fixing the error, how can I re-run the job? Should I submit a new job?
When I'm done, how do I delete the job?
The online documentation is not complete.
Thanks
When you're ready to re-try the job, just submit a new job with a new job name.
There is no way to delete jobs since we want to provide you with a record of previous jobs. Jobs will reach a terminal state (FAILED, SUCCEEDED, or CANCELLED) in which they are no longer consuming any resources. However, the jobs will continue to show up in the UI or in the API if you list jobs.
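Since old jobs stay on record, each resubmission needs a name that has not been used before. One way to get that is to append a timestamp to a fixed prefix. The helper below is only a sketch: `unique_job_name` is a made-up name, and it assumes job names must start with a letter and contain only letters, digits and underscores.

```python
import re
import time


def unique_job_name(prefix):
    """Build a job name from a prefix plus the current local time."""
    # Replace anything other than ASCII letters, digits and underscores.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", prefix)
    # Assumption: names must start with a letter, so add one if needed.
    if not safe or not safe[0].isalpha():
        safe = "job_" + safe
    return f"{safe}_{time.strftime('%Y%m%d_%H%M%S')}"


print(unique_job_name("my-trainer"))
```

The resulting string can be passed as the job name when submitting the retry. Two submissions within the same second would still collide, so a counter or random suffix may be needed for automated retries.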