"Exited sync due to fetch errors" while fetching B2G code - build

I am trying to prepare my first B2G build for my Intex Cloud FX phone using these steps. I found out from the Firefox OS Phones page that my phone's codename is tarako. However, running ./config.sh tarako fails every time.
The article does note the following:
Note: it is possible for config.sh to fail with git-related fetching errors such as the following:
Fetching projects: 95% (118/124) error: Exited sync due to fetch errors
This appears to be caused by a connection error on the Android repo source repository. In this case, you will want to rerun config.sh. After a short while, it will automatically resume where it left off. You might have to do this several times until it finally fetches all projects.
But I've tried exactly that several times, and the process aborts with the same error every time.
Here's the log:
Get https://github.com/mozilla-b2g/b2g-manifest
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
curl: (22) The requested URL returned error: 404 Not Found
Server does not provide clone.bundle; ignoring.
remote: Counting objects: 1725, done.
remote: Total 1725 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (1725/1725), 708.93 KiB | 62.00 KiB/s, done.
Resolving deltas: 100% (1019/1019), done.
From https://github.com/mozilla-b2g/b2g-manifest
* [new branch] master -> origin/master
* [new branch] revert-203-bug1025788-v2 -> origin/revert-203-bug1025788-v2
* [new branch] v1-train -> origin/v1-train
* [new branch] v1.0.0 -> origin/v1.0.0
* [new branch] v1.0.1 -> origin/v1.0.1
* [new branch] v1.1.0hd -> origin/v1.1.0hd
* [new branch] v1.2 -> origin/v1.2
* [new branch] v1.2f -> origin/v1.2f
* [new branch] v1.3 -> origin/v1.3
* [new branch] v1.3t -> origin/v1.3t
* [new branch] v1.4 -> origin/v1.4
* [new branch] v2.0 -> origin/v2.0
* [new branch] v2.1 -> origin/v2.1
* [new tag] B2G_1_0_1_20130213094222 -> B2G_1_0_1_20130213094222
* [new tag] B2G_1_1_0_hd_20130530182315 -> B2G_1_1_0_hd_20130530182315
* [new tag] B2G_1_1_0_hd_20130530182315_BASE -> B2G_1_1_0_hd_20130530182315_BASE
* [new tag] closing-nightly -> closing-nightly
Your identity is: John Bupit
If you want to change this, please re-run 'repo init' with --config-name
repo has been initialized in /home/jbupit/b2g/B2G
Fetching project gecko.git
Fetching project moztt
Fetching project platform/hardware/libhardware
Fetching project platform/system/bluetooth
Fetching projects: 1% (1/84) Fetching project platform/external/safe-iop
Fetching projects: 2% (2/84) Fetching project platform/abi/cpp
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/system/bluetooth/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/hardware/libhardware/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/abi/cpp/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/external/safe-iop/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/abi/cpp/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/system/bluetooth/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/external/safe-iop/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
fatal: unable to access 'http://sprdsource.spreadtrum.com:8085/b2g/android/platform/hardware/libhardware/': Failed connect to sprdsource.spreadtrum.com:8085; Connection timed out
error: Cannot fetch platform/system/bluetooth
Fetching project gonk-misc
Fetching projects: 3% (3/84) error: Cannot fetch platform/external/safe-iop
error: Cannot fetch platform/abi/cpp
error: Cannot fetch platform/hardware/libhardware
error: Exited sync due to fetch errors
Repo sync failed
What should I do? Is there another way of downloading the code?

I also ran into this; the server seems to be very unstable, or to be missing content at times.
You may be able to piece together what you need from https://git.mozilla.org/?a=project_list&s=sprd&btnS=Search
You will need to edit the manifest XML with new URLs and change config.sh to use that (sketch below).
The original sync was discussed in https://bugzilla.mozilla.org/show_bug.cgi?id=982360 and https://bugzilla.mozilla.org/show_bug.cgi?id=1014102
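For instance, something along these lines. This is only a sketch: the mirror URL is a placeholder you would need to fill in from the search above, and the GITREPO/BRANCH variables are what B2G's config.sh honored at the time (confirm against your copy):
# Clone the manifest repo and point the unreachable Spreadtrum remote
# at a reachable mirror (MIRROR/PATH is a placeholder, not a real path):
git clone https://github.com/mozilla-b2g/b2g-manifest.git ~/b2g-manifest
sed -i 's#http://sprdsource.spreadtrum.com:8085/b2g/android#https://git.mozilla.org/MIRROR/PATH#' ~/b2g-manifest/tarako.xml
git -C ~/b2g-manifest commit -am "use mirrors for sprd projects"
# Tell config.sh to use the edited manifest instead of the default one:
GITREPO=file://$HOME/b2g-manifest BRANCH=master ./config.sh tarako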

Related

Spark Job Crashes with error in prelaunch.err

We are running a Spark job which runs close to 30 scripts one by one. It usually takes 14-15 hours, but this time it failed after 13. Here are the details:
Command:spark-submit --executor-memory=80g --executor-cores=5 --conf spark.sql.shuffle.partitions=800 run.py
Setup: Running spark jobs via jenkins on AWS EMR with 16 spot nodes
Error: since the YARN log is huge (270 MB+), here are some extracts from it:
[2022-07-25 04:50:08.646] Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
ermediates/master/email/_temporary/0/_temporary/attempt_202207250435265404741257029168752_0641_m_000599_168147 s3://memberanalytics-data-out-prod/pipelined_intermediates/master/email/_temporary/0/task_202207250435265404741257029168752_0641_m_000599 using algorithm version 1
22/07/25 04:37:05 INFO FileOutputCommitter: Saved output of task 'attempt_202207250435265404741257029168752_0641_m_000599_168147' to s3://memberanalytics-data-out-prod/pipelined_intermediates/master/email/_temporary/0/task_202207250435265404741257029168752_0641_m_000599
22/07/25 04:37:05 INFO SparkHadoopMapRedUtil: attempt_202207250435265404741257029168752_0641_m_000599_168147: Committed
22/07/25 04:37:05 INFO Executor: Finished task 599.0 in stage 641.0 (TID 168147). 9341 bytes result sent to driver
22/07/25 04:49:36 ERROR YarnCoarseGrainedExecutorBackend: Executor self-exiting due to : Driver ip-10-13-52-109.bjw2k.asg:45383 disassociated! Shutting down.
22/07/25 04:49:36 INFO MemoryStore: MemoryStore cleared
22/07/25 04:49:36 INFO BlockManager: BlockManager stopped
22/07/25 04:50:06 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
22/07/25 04:50:06 ERROR Utils: Uncaught exception in thread shutdown-hook-0
java.lang.InterruptedException
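A hedged note on reading this: "Driver ... disassociated" in an executor log usually means the driver itself died (out of memory, or its node was reclaimed, e.g. a lost spot instance), so the interesting error is in the driver container rather than in this executor. Something like the following pulls just that part out of the huge aggregate log (the application id is a placeholder):
# Fetch only the stderr files from YARN log aggregation and scan them:
yarn logs -applicationId application_XXXXXXXXXXXXX_XXXX -log_files stderr | grep -iE -B2 -A10 'OutOfMemory|SIGKILL|disassociated'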

Heroku Redis Error while reading from socket: (104, 'Connection reset by peer')

I'm running a Heroku Redis instance that always worked fine, but after the upgrade, when I try to run Celery workers, I get the following error:
[2021-07-23 11:06:08,135: ERROR/MainProcess] consumer: Cannot connect to redis://:**@ec2-54-***-***-*.eu-west-1.compute.amazonaws.com:*****//: Error while reading from socket: (104, 'Connection reset by peer').
Trying again in 12.00 seconds... (6/100)
Everything seems to be up to date and I can't figure out how to get it running again. I tried dropping the Redis instance and creating it from scratch, but nothing changed.
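A hedged guess, given that this started right after an upgrade: Heroku Redis 6 requires TLS, and its self-signed certificate makes clients drop the connection unless certificate verification is relaxed. Celery's Redis transport accepts this as a query parameter on a rediss:// URL (its documented values include CERT_NONE); the app and project names below are placeholders:
# Inspect the URL Heroku is injecting (it may now be rediss:// not redis://):
heroku config:get REDIS_URL -a my-heroku-app
# Relax certificate verification on the broker URL before starting workers:
export CELERY_BROKER_URL="${REDIS_URL}?ssl_cert_reqs=CERT_NONE"
celery -A myproject worker --loglevel=INFO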

Gahp server (failure issues) exited with status 1 unexpectedly

I am working on a web-based tool (named cloudcopasi) which takes jobs from a user and submits them to Bosco resources (compute nodes). I am using Bosco (HTCondor 8.8.12) on CentOS 7 Linux. The web interface lets a user add a Bosco pool and submit jobs to it. However, when I try to submit a job, it fails. I also tested the pool with the following command:
bosco_cluster --test
It gives me the following GAHP error:
…..
Testing bosco submission...Passed!
Submission and log files for this job are in /home/cloudcopasi/bosco/local.bosco/bosco-test/boscotest.LTA07r
Waiting for jobmanager to accept job...Passed
Checking for submission to remote slurm cluster (could take ~30 seconds)...Failed
Showing last 5 lines of logs:
01/06/21 13:34:03 [3800] Gahp Server (pid=3815) exited with status 1 unexpectedly
01/06/21 13:34:08 [3800] gahp server not up yet, delaying ping
01/06/21 13:34:08 [3800] No jobs left, shutting down
01/06/21 13:34:08 [3800] Got SIGTERM. Performing graceful shutdown.
01/06/21 13:34:08 [3800] **** condor_gridmanager (condor_GRIDMANAGER) pid 3800 EXITING WITH STATUS 0
I am not sure what I am missing, and I don't understand how to solve this "Gahp server" issue.
Any help is highly appreciated.
Thank you.
This is probably an ssh failure (network, authentication, or authorization). Bosco runs the following command to access the remote cluster submit host:
<sbin>/remote_gahp <user>@<hostname> batch_gahp
You can run it on the command line to get more details about what's going wrong; remote_gahp is a bash script, so you can dig in further if necessary. For example:
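(The user and host below are placeholders for your own pool entry; condor_config_val locates the real sbin directory.)
# Run the wrapper by hand to surface the underlying ssh error:
"$(condor_config_val SBIN)"/remote_gahp cloudcopasi@cluster.example.edu batch_gahp
# If it fails straight away, test the ssh session itself with verbose output:
ssh -v cloudcopasi@cluster.example.edu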

Tendermint: error when starting the tendermint node as per the sample

I tried to create an application as per the guide in the Tendermint documentation.
After I started the application and the tendermint node, I got the error below. I use go version go1.15.5 linux/amd64 and tendermint v0.34.0-rc4-148-g095e9cd.
[bc@localhost kvstore]$ TMHOME="/tmp/example" tendermint node --proxy_app=unix://example.sock
I[2020-12-01|13:16:59.697] Version info module=main software=v0.34.0-rc4-148-g095e9cd block=11 p2p=8
I[2020-12-01|13:16:59.702] Starting Node service module=main impl=Node
I[2020-12-01|13:16:59.703] Starting StateSync service module=statesync impl=StateSync
I[2020-12-01|13:16:59.738] Started node module=main nodeInfo="{ProtocolVersion:{P2P:8 Block:11 App:0} DefaultNodeID:3cf5ea6219c57fd906c042f767748988ba070db7 ListenAddr:tcp://0.0.0.0:26656 Network:test-chain-1mrgVg Version:v0.34.0-rc4-148-g095e9cd Channels:40202122233038606100 Moniker:localhost.localdomain Other:{TxIndex:on RPCAddress:tcp://127.0.0.1:26657}}"
E[2020-12-01|13:17:00.758] Stopping abci.socketClient for error: read message: EOF module=abci-client connection=consensus
E[2020-12-01|13:17:00.758] consensus connection terminated. Did the application crash? Please restart tendermint module=proxy err="read message: EOF"
E[2020-12-01|13:17:00.758] Error in proxyAppConn.BeginBlock module=state err="read message: EOF"
E[2020-12-01|13:17:00.758] Error on ApplyBlock module=consensus err="read message: EOF"
I[2020-12-01|13:17:00.758] captured terminated, exiting... module=main
I[2020-12-01|13:17:00.758] Stopping Node service module=main impl=Node
I[2020-12-01|13:17:00.758] Stopping Node module=main
I[2020-12-01|13:17:00.760] Stopping StateSync service module=statesync impl=StateSync
I[2020-12-01|13:17:00.760] Closing rpc listener module=main listener="&{Listener:0xc00000d440 sem:0xc000039200 closeOnce:{done:0 m:{state:0 sema:0}} done:0xc000039260}"
E[2020-12-01|13:17:00.760] Error serving server module=main err="accept tcp 127.0.0.1:26657: use of closed network connection"
KVStore
[bc@localhost kvstore]$ ./example
badger 2020/12/01 13:16:55 INFO: All 0 tables opened in 0s
badger 2020/12/01 13:16:55 INFO: Replaying file id: 0 at offset: 0
badger 2020/12/01 13:16:55 INFO: Replay took: 6.807µs
badger 2020/12/01 13:16:55 DEBUG: Value log discard stats empty
I[2020-12-01|13:16:55.373] Starting ABCIServer service impl=ABCIServer
I[2020-12-01|13:16:55.405] Waiting for new connection...
I[2020-12-01|13:16:59.692] Accepted a new connection
I[2020-12-01|13:16:59.692] Waiting for new connection...
I[2020-12-01|13:16:59.692] Accepted a new connection
I[2020-12-01|13:16:59.692] Waiting for new connection...
I[2020-12-01|13:16:59.692] Accepted a new connection
I[2020-12-01|13:16:59.692] Waiting for new connection...
I[2020-12-01|13:16:59.692] Accepted a new connection
I[2020-12-01|13:16:59.692] Waiting for new connection...
E[2020-12-01|13:17:00.758] Connection error err="error reading message: proto: wrong wireType = 2 for field Height"
E[2020-12-01|13:17:00.761] Connection was closed by client
E[2020-12-01|13:17:00.761] Connection was closed by client
E[2020-12-01|13:17:00.761] Connection was closed by client
go.mod
module github.com/me/example

go 1.15

require (
	github.com/dgraph-io/badger v1.6.2
	github.com/tendermint/tendermint v0.34.0-rc4
)
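A hedged observation on the versions above: the node binary reports v0.34.0-rc4-148-g095e9cd while go.mod pins the app to v0.34.0-rc4, and a proto "wrong wireType" error is the classic symptom of the two sides speaking different ABCI protocol versions. A sketch of aligning them (the commit hash is taken from the node's version string; verify it matches your binary):
# Rebuild the app against the same tendermint commit the node was built from:
go get github.com/tendermint/tendermint@095e9cd
go mod tidy
go build -o example .
# Then restart both the app and the node.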

Spark EMR Cluster is removing executors when run because they are idle

I have a Spark application that was running fine in standalone mode. I'm now trying to get the same application to run on an AWS EMR cluster, but currently it's failing.
The message is one I've not seen before and implies that the workers are not receiving jobs and are being shut down.
16/11/30 14:45:00 INFO ExecutorAllocationManager: Removing executor 3 because it has been idle for 60 seconds (new desired total will be 7)
16/11/30 14:45:00 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 2
16/11/30 14:45:00 INFO ExecutorAllocationManager: Removing executor 2 because it has been idle for 60 seconds (new desired total will be 6)
16/11/30 14:45:00 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 4
16/11/30 14:45:00 INFO ExecutorAllocationManager: Removing executor 4 because it has been idle for 60 seconds (new desired total will be 5)
16/11/30 14:45:01 INFO YarnClientSchedulerBackend: Requesting to kill executor(s) 7
16/11/30 14:45:01 INFO ExecutorAllocationManager: Removing executor 7 because it has been idle for 60 seconds (new desired total will be 4)
The DAG shows the workers initialised, then a collect (a relatively small one), and then shortly afterwards they all fail.
Dynamic allocation was enabled, so one theory was that the driver wasn't sending them any tasks and they timed out. To test this I spun up another cluster without dynamic allocation, and the same thing happened.
The master is set to yarn.
Any help is massively appreciated, thanks.
16/11/30 14:49:16 INFO BlockManagerMaster: Removal of executor 21 requested
16/11/30 14:49:16 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asked to remove non-existent executor 21
16/11/30 14:49:16 INFO BlockManagerMasterEndpoint: Trying to remove executor 21 from BlockManagerMaster.
16/11/30 14:49:24 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1480517110174_0001_01_000049 on host: ip-10-138-114-125.ec2.internal. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_1480517110174_0001_01_000049
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
My step is quite simple: spark-submit --deploy-mode client --master yarn --class Run app.jar
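One hedged note: EMR sets spark.dynamicAllocation.enabled=true by default, so a cluster "without dynamic allocation" still has it on unless the submit disables it explicitly. Pinning a fixed executor count rules the 60-second idle timeout out (the executor count here is an arbitrary example):
spark-submit --deploy-mode client --master yarn --class Run \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 8 \
  app.jar
# Or keep dynamic allocation but lengthen the idle timeout:
#   --conf spark.dynamicAllocation.executorIdleTimeout=300s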