dsub: Google Cloud error ("exit status 141")

I was trying to run some whole-genome sequencing samples on Google Cloud using dsub. The dsub commands work fine for some samples but not others. I have tried reducing the number of parallel threads and increasing the memory and disk, but it still fails. Since each run takes about two days, the trial-and-error approach is pretty expensive! Any help/tips would be highly appreciated!
My command is:
dsub \
--project "${MY_PROJECT}" \
--zones "us-central1-a" \
--logging "${LOGGING}" \
--vars-include-wildcards \
--disk-size 800 \
--min-ram 60 \
--image "us.gcr.io/xxx-yyy-zzz/data" \
--tasks "${SCRIPT_DIR}"/tBOWTIE2.tsv \
--command 'bismark --bowtie2 --bam --parallel 2 "${GENOME_REFERENCE}" -1 "${INPUT_FORWARD}" -2 "${INPUT_REVERSE}" -o "${OUTPUT_DIR}"' \
--wait
The dstat command with the '--full' option shows the error as:
status: FAILURE
status-detail: "11: Docker run failed"
The last line in the log file on Google Cloud just states "(exit status 141)".
many thanks!
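For what it's worth: an exit status above 128 conventionally means the process was killed by signal (status - 128), and 141 - 128 = 13 is SIGPIPE. So the failing step was most likely writing to a pipe whose reader had already exited (tools like bismark typically pipe data between bowtie2, samtools and gzip internally, so a crash in a downstream stage can surface this way). A quick shell illustration of the convention:
# 'yes' writes forever; 'head' exits after one line and closes the pipe,
# so 'yes' is killed by SIGPIPE and reports 128 + 13 = 141.
yes | head -n 1 > /dev/null
echo "${PIPESTATUS[0]}" # prints 141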

Related

TheGraph: one graph-node for several blockchains?

graph-node from TheGraph can ingest data from a blockchain.
From https://github.com/graphprotocol/graph-node/blob/master/docs/getting-started.md
cargo run -p graph-node --release -- \
--postgres-url postgresql://<USERNAME><:PASSWORD>@localhost:5432/<POSTGRES_DB_NAME> \
--ethereum-rpc <ETHEREUM_NETWORK_NAME>:https://mainnet.infura.io \
--ipfs 127.0.0.1:5001 \
--debug
So you run it with --ethereum-rpc mainnet:https://mainnet.infura.io
But how can one graph-node serve several blockchains, e.g. Ethereum mainnet and an Ethereum testnet?
You can pass --ethereum-rpc multiple times on the command line, e.g.:
cargo run -p graph-node --release -- \
--postgres-url postgresql://<USERNAME><:PASSWORD>@localhost:5432/<POSTGRES_DB_NAME> \
--ethereum-rpc mainnet:https://mainnet.infura.io \
--ethereum-rpc goerli:https://... \
--ipfs 127.0.0.1:5001 \
--debug
Ref: https://github.com/graphprotocol/graph-node/pull/1027
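Note that, as far as I understand graph-node's matching, each network name you pass (mainnet, goerli, ...) must correspond to the network field declared in a subgraph's manifest; a subgraph declared for mainnet will only be indexed against the endpoint registered under that name.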

How to resolve gcloud crashed (ReadTimeout): HTTPSConnectionPool(host='cloudfunctions.googleapis.com', port=443): Read timed out. (read timeout=300)

I receive this error when triggering a Cloud Function using the gcloud command from the terminal:
gcloud functions call function_name
On the Cloud Function log page no error is shown and the task finishes without problems; however, after the task finishes, this error shows up in the terminal:
gcloud crashed (ReadTimeout): HTTPSConnectionPool(host='cloudfunctions.googleapis.com', port=443): Read timed out. (read timeout=300)
Note: my function's timeout is set to 540 seconds and it takes ~320 seconds to finish the job.
I think the issue is that gcloud functions call times out after 300 seconds, and that timeout can't be raised to match a longer-running Cloud Function.
I created a simple Golang Cloud Function:
package function

import ("fmt"; "log"; "net/http"; "time")

func HelloFreddie(w http.ResponseWriter, r *http.Request) {
    log.Println("Sleeping")
    time.Sleep(400 * time.Second)
    log.Println("Resuming")
    fmt.Fprint(w, "Hello Freddie")
}
And deployed it:
gcloud functions deploy ${NAME} \
--region=${REGION} \
--allow-unauthenticated \
--entry-point="HelloFreddie" \
--runtime=go113 \
--source=${PWD} \
--timeout=520 \
--max-instances=1 \
--trigger-http \
--project=${PROJECT}
Then I timed it using gcloud functions call:
time \
gcloud functions call ${NAME} \
--region=${REGION} \
--project=${PROJECT}
And this timed out:
ERROR: gcloud crashed (ReadTimeout): HTTPSConnectionPool(host='cloudfunctions.googleapis.com', port=443): Read timed out. (read timeout=300)
real 5m1.079s
user 0m0.589s
sys 0m0.107s
NOTE 5m1s ~== 300s
But, using curl:
time \
curl \
--request GET \
--header "Authorization: Bearer $(gcloud auth print-access-token)" \
$(\
gcloud functions describe ${NAME} \
--region=${REGION} \
--project=${PROJECT} \
--format="value(httpsTrigger.url)")
Yields:
Hello Freddie
real 6m43.048s
user 0m1.210s
sys 0m0.167s
NOTE 6m43s ~== 400s
So, gcloud functions call times out after 300 seconds and this is non-configurable.
Submitted an issue to Google's Issue Tracker.

Google Cloud Function fails to build

I'm trying to update a cloud function that has been working for over a week now.
But when I try to update the function today, I get a BUILD FAILED: BUILD HAS TIMED OUT error.
[screenshot: build failure error]
I am using the Google Cloud Console to deploy the Python function, not Cloud Shell. I even tried to make a new copy of the function, and that fails too.
Looking at the logs, it says INVALID_ARGUMENT, but I'm just using the console and haven't changed anything apart from the Python code compared to the previous build that I deployed successfully last week.
Error logs
{
insertId: "fjw53vd2r9o"
logName: " my log name "
operation: {…}
protoPayload: {
#type: "type.googleapis.com/google.cloud.audit.AuditLog"
authenticationInfo: {…}
methodName: "google.cloud.functions.v1.CloudFunctionsService.UpdateFunction"
requestMetadata: {…}
resourceName: " my function name"
serviceName: "cloudfunctions.googleapis.com"
status: {
code: 3
message: "INVALID_ARGUMENT"
}
}
receiveTimestamp: "2020-02-05T18:04:18.269557510Z"
resource: {…}
severity: "ERROR"
timestamp: "2020-02-05T18:04:18.241Z"
}
I even tried increasing the timeout parameter to 540 seconds and I still get the build error.
[screenshot: timeout parameter setting]
Can someone help, please?
In the future, please copy and paste the text of errors and logs rather than referencing screenshots; it's easier to parse and probably more permanent.
It's possible that there's an intermittent issue with the service (in your region) that is causing your problems. Does the issue persist?
You can check the status dashboard for service issues, though there is no entry specific to Cloud Functions:
https://status.cloud.google.com/
I just deployed and updated a Golang Function in us-central1 without issues.
Which language/runtime are you using?
Which region?
Are you confident that your updates to the Function are correct?
A more effective albeit dramatic way to test this would be to create a new (temporary) project and try to deploy the function there (possibly to a different region too).
NB The timeout setting applies to the Function's invocations, not to its deployment.
Example (using gcloud)
PROJECT=[[YOUR-PROJECT]]
BILLING=[[YOUR-BILLING]]
gcloud projects create ${PROJECT}
gcloud beta billing projects link ${PROJECT} --billing-account=${BILLING}
gcloud services enable cloudfunctions.googleapis.com --project=${PROJECT}
touch function.go go.mod # NB: before deploying, function.go must define HelloFreddie and go.mod must declare the module
# Deploy
gcloud functions deploy fred \
--region=us-central1 \
--allow-unauthenticated \
--entry-point=HelloFreddie \
--trigger-http \
--source=${PWD} \
--project=${PROJECT} \
--runtime=go113
# Update
gcloud functions deploy fred \
--region=us-central1 \
--allow-unauthenticated \
--entry-point=HelloFreddie \
--trigger-http \
--source=${PWD} \
--project=${PROJECT} \
--runtime=go113
# Test
curl \
--request GET \
$(\
gcloud functions describe fred \
--region=us-central1 \
--project=${PROJECT} \
--format="value(httpsTrigger.url)")
Hello Freddie
Logs:
gcloud logging read "resource.type=\"cloud_function\" resource.labels.function_name=\"fred\" resource.labels.region=\"us-central1\" protoPayload.methodName=(\"google.cloud.functions.v1.CloudFunctionsService.CreateFunction\" OR \"google.cloud.functions.v1.CloudFunctionsService.UpdateFunction\")" \
--project=${PROJECT} \
--format="json(protoPayload.methodName,protoPayload.status)"
[
{
"protoPayload": {
"methodName": "google.cloud.functions.v1.CloudFunctionsService.CreateFunction"
}
},
{
"protoPayload": {
"methodName": "google.cloud.functions.v1.CloudFunctionsService.CreateFunction",
"status": {}
}
},
{
"protoPayload": {
"methodName": "google.cloud.functions.v1.CloudFunctionsService.UpdateFunction"
}
},
{
"protoPayload": {
"methodName": "google.cloud.functions.v1.CloudFunctionsService.UpdateFunction",
"status": {}
}
}
]

Google Cloud Dataproc: cluster create errors (debconf DbDriver config.dat locked)

Recently, I have experienced occasional errors while attempting to create Dataproc clusters in GCP. The creation command is similar to:
gcloud dataproc clusters create ${CLUSTER_NAME} \
--zone "us-east1-b" \
--master-machine-type "n1-standard-16" \
--master-boot-disk-size 150 \
--num-workers ${WORKER_NODE_COUNT:-9} \
--worker-machine-type "n1-standard-16" \
--worker-boot-disk-size 25 \
--project ${PROJECT_NAME} \
--properties 'yarn:yarn.log-aggregation-enable=true'
Very intermittently, the error I receive is:
ERROR: (gcloud.dataproc.clusters.create) Operation [projects/PROJECT/regions/global/operations/UUID] failed: Multiple Errors:
- Failed to initialize node random-name-m. See output in: gs://dataproc-UUID-us/google-cloud-dataproc-metainfo/UUID/random-name-m/dataproc-startup-script_output
- Failed to initialize node random-name-w-0. See output in: gs://dataproc-UUID-us/google-cloud-dataproc-metainfo/UUID/random-name-w-0/dataproc-startup-script_output
- Failed to initialize node random-name-w-1. See output in: gs://dataproc-UUID-us/google-cloud-dataproc-metainfo/UUID/random-name-w-1/dataproc-startup-script_output
- Worker random-name-w-8 unable to register with master random-name-m. This could be because it is offline, or network is misconfigured..
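I pulled each node's startup output straight from the bucket referenced in the error, e.g. for the master node:
gsutil cat gs://dataproc-UUID-us/google-cloud-dataproc-metainfo/UUID/random-name-m/dataproc-startup-script_output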
The last lines of that startup-script output (dataproc-startup-script_output) are:
+ debconf-set-selections
debconf: DbDriver "config": /var/cache/debconf/config.dat is locked by another process: Resource temporarily unavailable
++ logstacktrace
++ local err=1
++ local code=1
++ set +o xtrace
ERROR: 'debconf-set-selections' exited with status 1
Call tree:
0: /usr/local/share/google/dataproc/startup-script-cloud_datarefinery_image_20180803_nightly-RC04.sh:490 main
Exiting with status 1
This one is really starting to annoy me! Any ideas/thoughts/resolutions are much appreciated!
A fix for this issue will be rolling out over the course of next week's release.
You can check the release notes to see when the fix has rolled out:
https://cloud.google.com/dataproc/docs/release-notes
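In the meantime, if you hit the same debconf/dpkg lock contention in your own initialization actions, a common workaround is to retry until the lock is free. A minimal sketch, assuming the contention is transient (retry_locked and my-selections.txt are names made up for illustration):
# Retry a command that fails while another process holds the
# debconf/dpkg database lock ("Resource temporarily unavailable").
retry_locked() {
  local attempt
  for attempt in $(seq 1 10); do
    "$@" && return 0
    echo "attempt ${attempt} failed (lock contention?); retrying in 10s" >&2
    sleep 10
  done
  return 1
}

retry_locked debconf-set-selections my-selections.txt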

Hyperledger Fabric "panic: Error while trying to open DB: resource temporarily unavailable" when starting a peer

I am trying to run multiple peers in different terminals on an Ubuntu 16.04 LTS machine. I am able to generate the certs for the configuration and the channel transaction using the cryptogen and configtxgen tools. I am also able to start one peer (peer0) using the configuration below (start-peer0.sh):
CORE_PEER_ENDORSER_ENABLED=true \
CORE_PEER_PROFILE_ENABLED=true \
CORE_PEER_ADDRESS=peer0:7051 \
CORE_PEER_CHAINCODELISTENADDRESS=peer0:7052 \
CORE_PEER_ID=org0-peer0 \
CORE_PEER_LOCALMSPID=Org0MSP \
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer0:7051 \
CORE_PEER_GOSSIP_USELEADERELECTION=true \
CORE_PEER_GOSSIP_ORGLEADER=false \
CORE_PEER_TLS_ENABLED=false \
CORE_PEER_TLS_KEY_FILE=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer0.org0/tls/server.key \
CORE_PEER_TLS_CERT_FILE=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer0.org0/tls/server.crt \
CORE_PEER_TLS_ROOTCERT_FILE=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer0.org0/tls/ca.crt \
CORE_PEER_TLS_SERVERHOSTOVERRIDE=peer0 \
CORE_VM_DOCKER_ATTACHSTDOUT=true \
CORE_PEER_MSPCONFIGPATH=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer0.org0/msp \
peer node start --peer-defaultchain=false
But when I try to start another peer (peer1) from a different terminal, the following error occurs:
"panic: Error while trying to open DB: resource temporarily unavailable"
start-peer1.sh
CORE_PEER_ENDORSER_ENABLED=true \
CORE_PEER_PROFILE_ENABLED=true \
CORE_PEER_ADDRESS=peer1:7053 \
CORE_PEER_CHAINCODELISTENADDRESS=peer1:7054 \
CORE_PEER_ID=org0-peer1 \
CORE_PEER_LOCALMSPID=Org0MSP \
CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer1:7053 \
CORE_PEER_GOSSIP_USELEADERELECTION=true \
CORE_PEER_GOSSIP_ORGLEADER=false \
CORE_PEER_TLS_ENABLED=false \
CORE_PEER_TLS_KEY_FILE=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer1.org0/tls/server.key \
CORE_PEER_TLS_CERT_FILE=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer1.org0/tls/server.crt \
CORE_PEER_TLS_ROOTCERT_FILE=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer1.org0/tls/ca.crt \
CORE_PEER_TLS_SERVERHOSTOVERRIDE=peer1 \
CORE_VM_DOCKER_ATTACHSTDOUT=true \
CORE_PEER_MSPCONFIGPATH=/root/bcnetwork/conf/crypto-config/peerOrganizations/org0/peers/peer1.org0/msp \
peer node start --peer-defaultchain=false
Both peers belong to the same organization.
The goal of this configuration is to deploy the same chaincode on both peers and check that, if invoke is called on one peer, the other peer returns the updated state of the ledger when queried.
Any suggestions or guidance would be a great help.
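For what it's worth, this panic usually means the second peer process is trying to open the same LevelDB files as the first: both start scripts leave the ledger location at its default (peer.fileSystemPath, i.e. /var/hyperledger/production), and LevelDB's file lock makes the second open fail with "resource temporarily unavailable". The usual fix is to give each peer process its own directory via the corresponding environment variable; a minimal sketch (the -peer0/-peer1 directory names are just examples), adding to the environment block in start-peer0.sh:
CORE_PEER_FILESYSTEMPATH=/var/hyperledger/production-peer0 \
and in start-peer1.sh:
CORE_PEER_FILESYSTEMPATH=/var/hyperledger/production-peer1 \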