Hi, I am trying to run Spark on AWS, but I get the errors below when I run it on an AWS cluster.
log4j:ERROR Could not read configuration file from URL [file:/etc/spark/conf/log4j.properties].
java.io.FileNotFoundException: /etc/spark/conf/log4j.properties (No such file or directory)
at java.io.FileInputStream.open(Native Method)
and
17/09/28 19:50:16 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 9AA4363BB8D75166), S3 Extended Request ID: UzSFXBXoJgTar68arZIr6X68gKuIVLOmUub/u5gwnZ9QYC+QpqKZhr7M848mj0OdijyKXGYTJ3I=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1286)
Related
The error is as follows:
22/09/14 10:09:30 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 4 times, most recent failure: Lost task 2.3 in stage 0.0 (TID 7, ip-172-34-112-69.us-west-2.compute.internal, executor 1): com.amazonaws.services.glue.util.FatalException: Unable to parse file: *****_20220901.csv
We are trying to run JanusGraph 0.6.2 using AWS OpenSearch (Elasticsearch) 7.10 as the indexing backend. Things work fine with version 6.x, but when we try to connect to version 7.x we encounter the following exception.
org.janusgraph.diskstorage.PermanentBackendException: method [PUT], host [https://vpc-xxxxxx.us-east-2.es.amazonaws.com:443], URI [/_cluster/settings], status line [HTTP/1.1 401 Unauthorized]
{"Message":"Your request: '/_cluster/settings' payload is not allowed."}
JanusGraph version info:
86 [main] INFO org.janusgraph.graphdb.server.JanusGraphServer - JanusGraph Version: 0.6.2
86 [main] INFO org.janusgraph.graphdb.server.JanusGraphServer - TinkerPop Version: 3.5.3
A more detailed stack trace is below:
3115 [main] INFO org.janusgraph.diskstorage.Backend - Configuring index [search]
3387 [main] INFO com.newforma.janusgraph.es.awsauth.AWSV4AuthHttpClientConfigCallback - Initialized AWSV4AuthHttpClientConfigCallback for region us-east-2
3782 [main] WARN org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager - Graph [graph] configured at [/etc/opt/janusgraph/janusgraph.properties] could not be instantiated and will not be available in Gremlin Server. GraphFactory message: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
java.lang.RuntimeException: GraphFactory could not instantiate this Graph implementation [class org.janusgraph.core.JanusGraphFactory]
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:84)
at org.apache.tinkerpop.gremlin.structure.util.GraphFactory.open(GraphFactory.java:80)
... 14 more
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
at org.janusgraph.util.system.ConfigurationUtil.instantiate(ConfigurationUtil.java:79)
at org.janusgraph.diskstorage.Backend.getImplementationClass(Backend.java:527)
at org.janusgraph.diskstorage.Backend.getIndexes(Backend.java:511)
at org.janusgraph.diskstorage.Backend.<init>(Backend.java:239)
at org.janusgraph.core.JanusGraphFactory.open(JanusGraphFactory.java:127)
... 19 more
Caused by: org.janusgraph.diskstorage.PermanentBackendException: method [PUT], host [https://vpc-xxxxxx.us-east-2.es.amazonaws.com:443], URI [/_cluster/settings], status line [HTTP/1.1 401 Unauthorized]
{"Message":"Your request: '/_cluster/settings' payload is not allowed."}
at org.janusgraph.diskstorage.es.ElasticSearchIndex.setupMaxOpenScrollContextsIfNeeded(ElasticSearchIndex.java:445)
at org.janusgraph.diskstorage.es.ElasticSearchIndex.<init>(ElasticSearchIndex.java:388)
... 32 more
Caused by: org.elasticsearch.client.ResponseException: method [PUT], host [https://vpc-xxxxxx.us-east-2.es.amazonaws.com:443], URI [/_cluster/settings], status line [HTTP/1.1 401 Unauthorized]
{"Message":"Your request: '/_cluster/settings' payload is not allowed."}
at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:326)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:296)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:270)
at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.performRequest(RestElasticSearchClient.java:482)
at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.performRequest(RestElasticSearchClient.java:473)
at org.janusgraph.diskstorage.es.rest.RestElasticSearchClient.updateClusterSettings(RestElasticSearchClient.java:269)
at org.janusgraph.diskstorage.es.ElasticSearchIndex.setupMaxOpenScrollContextsIfNeeded(ElasticSearchIndex.java:443)
From the stack trace, it appears that JanusGraph was trying to set a high value for the Elasticsearch setting search.max_open_scroll_context, which is 500 by default.
AWS OpenSearch (Elasticsearch) from 7.x onwards doesn't let us change this cluster setting.
I tried the following from Kibana and got a similar response. This operation was supported in AWS managed Elasticsearch 6.x.
PUT _cluster/settings
{
  "persistent": {
    "search.max_open_scroll_context": 1024
  },
  "transient": {
    "search.max_open_scroll_context": 1024
  }
}
401 - Unauthorized
{"Message":"Your request: '/_cluster/settings' payload is not allowed."}
We can stop JanusGraph from setting max_open_scroll_context at startup by setting the property index.[x].elasticsearch.setup-max-open-scroll-contexts to false.
You can read more on this in the Elasticsearch section of the configuration reference: https://docs.janusgraph.org/configs/configuration-reference/#indexxelasticsearch
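As a minimal sketch, the line below shows what this could look like in janusgraph.properties. It assumes the index is named search, which is taken from the "Configuring index [search]" log line above; substitute your own index name for [x] if it differs.

```properties
# Assumption: the index name is "search" (from the "Configuring index [search]" log line).
# Stops JanusGraph from issuing the PUT /_cluster/settings request at startup.
index.search.elasticsearch.setup-max-open-scroll-contexts=false
```

With this set, the cluster keeps whatever search.max_open_scroll_context value it already has (500 by default).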
Using AWS CDK, I am trying to deploy a Docker image with a Lambda function on AWS, and I am getting the following error.
[100%] fail: docker login --username AWS --password-stdin https://XXXXXXXXXXXX.dkr.ecr.us-east-1.amazonaws.com exited with error code 1: Error saving credentials: error storing credentials - err: exit status 1, out: `Post "http://ipc/registry/credstore-updated": dial unix /Users/my_mac/Library/Containers/com.docker.docker/Data/backend.sock: connect: connection refused`
❌ MyService (prj-development) failed: Error: Failed to publish one or more assets. See the error messages above for more information.
at publishAssets (/Users/my_mac/.npm/_npx/8365afa3375eae8d/node_modules/aws-cdk/lib/util/asset-publishing.ts:44:11)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at CloudFormationDeployments.publishStackAssets (/Users/my_mac/.npm/_npx/8365afa3375eae8d/node_modules/aws-cdk/lib/api/cloudformation-deployments.ts:464:7)
at CloudFormationDeployments.deployStack (/Users/my_mac/.npm/_npx/8365afa3375eae8d/node_modules/aws-cdk/lib/api/cloudformation-deployments.ts:339:7)
at CdkToolkit.deploy (/Users/my_mac/.npm/_npx/8365afa3375eae8d/node_modules/aws-cdk/lib/cdk-toolkit.ts:209:24)
at initCommandLine (/Users/my_mac/.npm/_npx/8365afa3375eae8d/node_modules/aws-cdk/lib/cli.ts:341:12)
Failed to publish one or more assets. See the error messages above for more information.
make: *** [deploy-local] Error 1
What can I do, please?
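Start the Docker app/daemon on your machine before deploying. The "connection refused" on Docker's backend.sock in the error suggests that Docker was not running when CDK tried to log in to ECR.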
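If you want to catch this before running the deployment, here is a rough sketch of a pre-deploy check. It assumes the docker CLI is on your PATH and that `docker info` exits with a non-zero status when it cannot reach the daemon. The function name docker_daemon_running and the injectable run parameter are made up for this example and are not part of CDK or Docker.

```python
import subprocess


def docker_daemon_running(run=subprocess.run):
    """Return True if `docker info` exits with status 0.

    `run` is injectable only so the check can be exercised without Docker.
    """
    try:
        result = run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # docker CLI not installed, or the daemon did not answer in time
        return False
    return result.returncode == 0
```

You could call docker_daemon_running() at the start of your deploy script and stop with a clear message if it returns False, instead of waiting for the asset publishing step to fail.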
I have installed torch on my local machine and it works correctly, but when I deploy it on AWS it shows the following error.
I am trying to deploy my Django app on AWS Elastic Beanstalk, but I am facing the following error.
2021-04-20 18:00:43 INFO Environment update is starting.
2021-04-20 18:00:46 INFO Deploying new version to instance(s).
2021-04-20 18:00:57 ERROR Instance deployment failed to install application dependencies. The deployment failed.
2021-04-20 18:00:57 ERROR Instance deployment failed. For details, see 'eb-engine.log'.
2021-04-20 18:01:01 ERROR [Instance: i-0eee746bc342a71cd] Command failed on instance. Return code: 1 Output: Engine execution has encountered an error..
2021-04-20 18:01:01 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2021-04-20 18:01:01 ERROR Unsuccessful command execution on instance id(s) 'i-0eee746bc342a71cd'. Aborting the operation.
2021-04-20 18:01:01 ERROR Failed to deploy application.
My eb-engine.log
Downloading https://download.pytorch.org/whl/cpu/torchvision-0.6.1%2Bcpu-cp37-cp37m-linux_x86_64.whl (5.7 MB)
Collecting torch==1.8.1+cpu
2021/04/20 18:00:57.931655 [ERROR] An error occurred during execution of command [app-deploy] - [InstallDependency]. Stop running the command. Error: fail to install dependencies with requirements.txt file with error Command /bin/sh -c /var/app/venv/staging-LQM1lest/bin/pip install -r requirements.txt failed with error exit status 2. Stderr:ERROR: Exception:
Traceback (most recent call last):
File "/var/app/venv/staging-LQM1lest/lib/python3.7/site-packages/pip/_vendor/resolvelib/resolvers.py", line 171, in _merge_into_criterion
crit = self.state.criteria[name]
KeyError: 'torch'
.
.
.
.
File "/var/app/venv/staging-LQM1lest/lib/python3.7/site-packages/pip/_vendor/msgpack/fallback.py", line 671, in _unpack
ret[key] = self._unpack(EX_CONSTRUCT)
File "/var/app/venv/staging-LQM1lest/lib/python3.7/site-packages/pip/_vendor/msgpack/fallback.py", line 684, in _unpack
return bytes(obj)
MemoryError
My requirements.txt
-f https://download.pytorch.org/whl/torch_stable.html
torchvision==0.6.1+cpu
torch==1.8.1+cpu
Can someone explain what I am doing wrong?
Thank you so much
We have a process that will download files from S3, make changes to the files, and then upload the updated file back to S3. This works fine 99+% of the time. However, it seems that there are transient issues with S3 that cause this to fail for short periods of time, generating 403 (Forbidden) responses.
For example, here are log entries from one such incident the other day:
2018-05-02 19:01:19 INFO Downloaded file
2018-05-02 19:01:20 INFO Uploaded file
2018-05-02 19:01:20 INFO Updated key (renamed file)
2018-05-02 19:27:26 INFO Downloaded file
2018-05-02 19:27:26 ERROR Failed to download file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 19:27:26 INFO Downloaded file
2018-05-02 19:27:26 ERROR Failed to download file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 19:27:27 ERROR Failed to download file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 19:27:27 INFO Downloaded file
2018-05-02 19:27:27 INFO Uploaded file
2018-05-02 19:27:28 ERROR Failed to upload file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 19:27:28 INFO Downloaded file
2018-05-02 19:27:28 ERROR Failed to download file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 20:30:32 INFO Downloaded file
2018-05-02 20:30:32 ERROR Failed to download file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 20:30:32 INFO Downloaded file
2018-05-02 20:30:32 ERROR Failed to download file, cause: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; )
2018-05-02 20:30:32 INFO Downloaded file
2018-05-02 20:30:33 INFO Uploaded file
These entries were all from the same file. It was successfully downloaded, modified and uploaded again. 30 minutes later, it took 4 attempts to download it, then the upload failed. 3 minutes after that, it took 3 attempts to download, then it was successfully uploaded.
We are using the AWS Java SDK client for this. Has anyone had a similar experience and figured out how to resolve it? Is it considered normal for S3 calls to fail occasionally even though the requests are valid?
I had similar issues with other cloud object storage providers, though not with S3.
The solution we chose was to handle the 403 response (logging it so we could look for a pattern or a specific file object) and retry the request up to a maximum number of times.
Generally, after the first 403 response, the second request succeeded with a 200 OK.
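The retry approach described above can be sketched roughly as follows. This is a generic Python illustration, not code from our project and not the AWS SDK's own retry mechanism. ForbiddenError is a hypothetical stand-in for whatever exception your client raises on a 403, and call_with_retry, max_attempts and delay_seconds are names invented for the example.

```python
import time


class ForbiddenError(Exception):
    """Hypothetical stand-in for a storage client's 403 (Forbidden) error."""


def call_with_retry(operation, max_attempts=4, delay_seconds=0.5,
                    sleep=time.sleep, log=print):
    """Call operation(); on ForbiddenError, log it and retry.

    Gives up and re-raises after max_attempts failed attempts.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ForbiddenError as err:
            # Log every 403 so a pattern (time window, specific object) can be found later.
            log(f"attempt {attempt}/{max_attempts} got 403: {err}")
            if attempt == max_attempts:
                raise
            # Wait a little longer after each failure before trying again.
            sleep(delay_seconds * attempt)
```

In the Java SDK case you would wrap the download and upload calls the same way, catching AmazonS3Exception and checking that the status code is 403 before retrying.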
In our case, the problems were resolved after the provider updated a few of the problematic nodes. That may give you a clue about how to work around the inconsistency:
Try creating a different bucket and moving your files there. It might be a bug in that specific bucket's configuration on the AWS side. Keep it under observation and check whether the 403 cases decrease or disappear.
Create a bucket in another region (perhaps the closest one). This can give you better clues about possible networking issues.
Use a different object storage service. If the issue still arises, the most probable cause is that one of the modules you use is inconsistent with the current version/protocol of S3. Make sure you update any S3 or AWS wrapper library in your project to the latest version.