Slow upload speed in aws deploy push command

I am trying to use AWS CodeDeploy. I run the aws deploy push --debug command. The file to be uploaded is around 250 KB, but the upload doesn't finish. The following logs are displayed:
2017-10-27 11:11:40,601 - MainThread - botocore.auth - DEBUG - CanonicalRequest:
PUT
/frontend-deployer/business-services-0.0.1-SNAPSHOT-classes.jar
partNumber=39&uploadId=.olvaJkxreDZf1ObaHCMtHmkQ5DFE.uZ9Om0sxZB08YG3tqRWBxmGLTFWSYQaj9mHl26LPJk..Stv_vPB5NMaV.zAqsYX6fZz_S3.uN5J4FlxHZFXoeTkMiBSYQB2C.g
content-md5:EDXgvJ8Tt5tHYZ6Nkh7epg==
host:s3.us-east-2.amazonaws.com
x-amz-content-sha256:UNSIGNED-PAYLOAD
x-amz-date:20171027T081140Z
content-md5;host;x-amz-content-sha256;x-amz-date
UNSIGNED-PAYLOAD
...
2017-10-27 11:12:12,035 - MainThread - botocore.endpoint - DEBUG - Sending http request: <PreparedRequest [PUT]>
2017-10-27 11:12:12,035 - MainThread - botocore.awsrequest - DEBUG - Waiting for 100 Continue response.
2017-10-27 11:12:12,189 - MainThread - botocore.awsrequest - DEBUG - 100 Continue response seen, now sending request body.
Even though the file is fairly small (250 KB), the upload doesn't finish.
On the other hand, an upload via the aws s3 cp command takes one second.
How can I increase the upload speed of the aws deploy push command?
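For reference, aws deploy push is documented as a convenience command that bundles the revision, uploads it to S3, and registers it with CodeDeploy, so one possible workaround sketch (the application name, bucket, and key below are placeholders) is to perform the two steps separately:
# Upload the revision bundle with plain aws s3 cp, which is already fast here
aws s3 cp my-revision.zip s3://<bucket-name>/my-revision.zip
# Register the uploaded object as a CodeDeploy application revision
aws deploy register-application-revision \
    --application-name <application-name> \
    --s3-location bucket=<bucket-name>,key=my-revision.zip,bundleType=zip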

Related

Why doesn't Dask dashboard update when I run some code?

I'm trying to recreate the behaviour of the Dask dashboard as illustrated in this YouTube video: https://www.youtube.com/watch?time_continue=1086&v=N_GqzcuGLCY. I can see my dashboard, but it doesn't update when I run a computation.
I'm running everything on my local machine (Kubuntu 18.04).
I used anaconda to set up my environment, including
python 2.7.14
dask 0.17.4
dask-core 0.17.4
bokeh 1.0.4
tornado 4.5.1
I set up my scheduler from the command line
dask-scheduler
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Clear task state
distributed.scheduler - INFO - Scheduler at: tcp://192.168.1.204:8786
distributed.scheduler - INFO - bokeh at: :8787
distributed.scheduler - INFO - Local Directory: /tmp/scheduler-bYQe2p
distributed.scheduler - INFO - -----------------------------------------------
distributed.scheduler - INFO - Register tcp://127.0.0.1:35007
distributed.scheduler - INFO - Starting worker compute stream, tcp://127.0.0.1:35007
...and a worker too.
dask-worker localhost:8786
distributed.nanny - INFO - Start Nanny at: 'tcp://127.0.0.1:36345'
distributed.worker - INFO - Start worker at: tcp://127.0.0.1:44033
distributed.worker - INFO - Listening to: tcp://127.0.0.1:44033
distributed.worker - INFO - bokeh at: 127.0.0.1:8789
distributed.worker - INFO - nanny at: 127.0.0.1:36345
distributed.worker - INFO - Waiting to connect to: tcp://localhost:8786
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Threads: 4
distributed.worker - INFO - Memory: 16.70 GB
distributed.worker - INFO - Local Directory: /home/fergal/orbital/repos/projects/safegraph/dask/dask-worker-space/worker-QjJ1ke
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO - Registered to: tcp://localhost:8786
distributed.worker - INFO - -------------------------------------------------
Then my code, borrowed from the video, is
from dask.distributed import Client
import dask.array as da
client = Client(processes=False)
print(client)
x = da.random.random((10000, 10000, 10), chunks=(1000,1000,5))
y = da.random.random((10000, 10000, 10), chunks=(1000,1000,5))
z = (da.arcsin(x) + da.arcsin(y)).sum(axis=(1,2))
z.visualize('eg.svg')
z.compute()
The code runs and produces a graph via Graphviz. The Bokeh server is accessible at 127.0.0.1:8787/status and displays a big blue block at the top right, as in the first few seconds of the video. But when I run my code, the web page doesn't update to show a running computation, nor does it show any results when the computation is finished. I would expect to see something like what appears around time 1:20 in the video.
I'm undoubtedly neglecting to set something up properly, but I can't find any clues in either the documentation or on Stack Overflow. So what am I doing wrong?
I found a solution.
Update dask to 1.1.5, then shut down and restart the dask-scheduler (and dask-worker). I'm guessing my problem was that the version of dask from the default conda channel was out of date, so I downloaded the newer version from conda-forge.
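For reference, a minimal sketch of that update path (channel and package names as on conda-forge; exact versions may differ):
# Install the newer dask/distributed from conda-forge instead of the defaults channel
conda install -c conda-forge dask distributed
# Restart the scheduler and worker (in separate terminals, as above) so they pick up the new version
dask-scheduler
dask-worker localhost:8786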

AWS CLI: ECS register-task-definition and the requires-compatibilities option

I'm trying to adapt my CircleCI config file to build my Node.js app into a Docker image and deploy it to AWS ECS. I started with this config.yml file from ricktbaker and I'm trying to make it work on Fargate.
When I initially ran these changes in CircleCI, I got this error:
An error occurred (InvalidParameterException) when calling the UpdateService operation: Task definition does not support launch_type FARGATE.
It looks like I should be able to modify line 71 with the requires-compatibilities option to change how the task definition is registered, but I keep getting an error I can't figure out.
json=$(aws ecs register-task-definition --container-definitions "$task_def" --family "$FAMILY" --requires-compatibilities "FARGATE")
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:
aws help
aws <command> help
aws <command> <subcommand> help
Unknown options: --requires-compatibilities, FARGATE
Am I adding the option incorrectly? It seems to match the AWS docs... Thanks for any tips.
I tried adding the debug option as well, but I don't see anything particularly helpful in the log (slightly redacted, below).
2019-03-13 03:05:45,948 - MainThread - awscli.clidriver - DEBUG - CLI version: aws-cli/1.11.76 Python/2.7.15 Linux/4.4.0-141-generic botocore/1.5.39
2019-03-13 03:05:45,948 - MainThread - awscli.clidriver - DEBUG - Arguments entered to CLI: ['ecs', 'register-task-definition', '--container-definitions', 'MYCONTAINERDEFINITION', '--family', 'MYTASKNAME', '--debug', '--requires-compatibilities', 'FARGATE']
2019-03-13 03:05:45,948 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function add_scalar_parsers at 0x7fd7e93fbb90>
2019-03-13 03:05:45,948 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function inject_assume_role_provider_cache at 0x7fd7e985d398>
2019-03-13 03:05:45,949 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /usr/lib/python2.7/site-packages/botocore/data/ecs/2014-11-13/service-2.json
2019-03-13 03:05:45,962 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.ecs: calling handler <function register_retries_for_service at 0x7fd7ea57ecf8>
2019-03-13 03:05:45,962 - MainThread - botocore.handlers - DEBUG - Registering retry handlers for service: ecs
2019-03-13 03:05:45,963 - MainThread - botocore.hooks - DEBUG - Event building-command-table.ecs: calling handler <function add_waiters at 0x7fd7e9381d70>
2019-03-13 03:05:45,966 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /usr/lib/python2.7/site-packages/botocore/data/ecs/2014-11-13/waiters-2.json
2019-03-13 03:05:45,967 - MainThread - awscli.clidriver - DEBUG - OrderedDict([(u'family', <awscli.arguments.CLIArgument object at 0x7fd7e8f066d0>), (u'task-role-arn', <awscli.arguments.CLIArgument object at 0x7fd7e8f06950>), (u'network-mode', <awscli.arguments.CLIArgument object at 0x7fd7e8f06990>), (u'container-definitions', <awscli.arguments.ListArgument object at 0x7fd7e8f069d0>), (u'volumes', <awscli.arguments.ListArgument object at 0x7fd7e8f06a10>), (u'placement-constraints', <awscli.arguments.ListArgument object at 0x7fd7e8f06a50>)])
2019-03-13 03:05:45,967 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.ecs.register-task-definition: calling handler <function add_streaming_output_arg at 0x7fd7e9381140>
2019-03-13 03:05:45,968 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.ecs.register-task-definition: calling handler <function add_cli_input_json at 0x7fd7e98661b8>
2019-03-13 03:05:45,968 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.ecs.register-task-definition: calling handler <function unify_paging_params at 0x7fd7e9402ed8>
2019-03-13 03:05:45,971 - MainThread - botocore.loaders - DEBUG - Loading JSON file: /usr/lib/python2.7/site-packages/botocore/data/ecs/2014-11-13/paginators-1.json
2019-03-13 03:05:45,972 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.ecs.register-task-definition: calling handler <function add_generate_skeleton at 0x7fd7e947e320>
2019-03-13 03:05:45,972 - MainThread - botocore.hooks - DEBUG - Event before-building-argument-table-parser.ecs.register-task-definition: calling handler <bound method CliInputJSONArgument.override_required_args of <awscli.customizations.cliinputjson.CliInputJSONArgument object at 0x7fd7e8f06a90>>
2019-03-13 03:05:45,972 - MainThread - botocore.hooks - DEBUG - Event before-building-argument-table-parser.ecs.register-task-definition: calling handler <bound method GenerateCliSkeletonArgument.override_required_args of <awscli.customizations.generatecliskeleton.GenerateCliSkeletonArgument object at 0x7fd7e8f1e890>>
Your command line format is correct, i.e. register-task-definition --requires-compatibilities "FARGATE".
Since Fargate is quite new, you may need to make sure your awscli is a recent version.
What is your installed awscli version? The latest version is 1.16.123.
The recommended way to upgrade is pip3 install awscli --upgrade --user.
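As a sketch (assuming a recent awscli; the network mode and task-level cpu/memory values below are illustrative, since Fargate task definitions also require them):
# Check the installed version, then upgrade if it predates Fargate support
aws --version
pip3 install awscli --upgrade --user
# Fargate task definitions also need awsvpc networking and task-level cpu/memory
json=$(aws ecs register-task-definition \
    --container-definitions "$task_def" \
    --family "$FAMILY" \
    --requires-compatibilities "FARGATE" \
    --network-mode awsvpc \
    --cpu 256 \
    --memory 512)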
Hope this helps.

Cannot assume role through AWS config file

We've always used the following to assume a role for longer than an hour on a remote machine:
# Prep environment to use roles.
unset AWS_CONFIG_FILE
unset AWS_DEFAULT_REGION
unset AWS_DEFAULT_PROFILE
CONFIG_FILE=$(mktemp)
# Creates temp file with instance profile credentials as default
# AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, ROLE_ARN are available from the environment.
printf "[default]\naws_access_key_id=$AWS_ACCESS_KEY_ID\naws_secret_access_key=$AWS_SECRET_ACCESS_KEY\n[profile role_profile]\nrole_arn = $ROLE_ARN\nsource_profile = default" > $CONFIG_FILE
# make sure instance profile takes precedence
unset AWS_ACCESS_KEY_ID
unset AWS_SECRET_ACCESS_KEY
unset AWS_SESSION_TOKEN
export AWS_CONFIG_FILE=$CONFIG_FILE
export AWS_DEFAULT_REGION=us-east-1
export AWS_DEFAULT_PROFILE=role_profile
Unfortunately, this method recently started to fail. We can reproduce the failure just by running:
aws sts get-caller-identity
Adding the --debug flag to the last command:
09:11:47 2018-06-21 14:11:47,731 - MainThread - awscli.clidriver - DEBUG - CLI version: aws-cli/1.15.40 Python/2.7.12 Linux/4.9.76-3.78.amzn1.x86_64 botocore/1.10.40
...
09:11:47 2018-06-21 14:11:47,811 - MainThread - botocore.hooks - DEBUG - Event choose-signer.sts.GetCallerIdentity: calling handler <function set_operation_specific_signer at 0x7f22d19a6ed8>
09:11:47 2018-06-21 14:11:47,812 - MainThread - botocore.credentials - WARNING - Refreshing temporary credentials failed during mandatory refresh period.
09:11:47 Traceback (most recent call last):
09:11:47 File "/var/lib/jenkins/.local/lib/python2.7/site-packages/botocore/credentials.py", line 432, in _protected_refresh
...
09:11:47 raise KeyError(cache_key)
09:11:47 KeyError: 'xxxx' (redacted)
09:11:47 2018-06-21 14:11:47,814 - MainThread - awscli.clidriver - DEBUG - Exiting with rc 255
Apparently a key is missing from a Python "cache" dictionary.
The obvious solution is just to find and remove the cache:
rm ~/.aws/cli/cache/*
This doesn't explain how this started happening, though (and if it will happen again). Can anyone explain what happened?
Probably, the permissions inside ~/.aws/cli are wrong.
Check the permissions:
ls -la ~/.aws/cli
ls -la ~/.aws/cli/cache
Your files may have the wrong permissions or ownership; correct them and the aws cli commands should work again.
The permissions needed for files inside ~/.aws/cli/cache are -rw-------.
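A quick sketch of tightening them (the directory mode here is a conventional choice, not something the error message requires):
# Files inside ~/.aws/cli/cache should be -rw------- (0600)
chmod 700 ~/.aws/cli ~/.aws/cli/cache
chmod 600 ~/.aws/cli/cache/*
# If the cache itself is stale or corrupt, clearing it forces a fresh assume-role call (as in the question)
rm ~/.aws/cli/cache/*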
Hope it helps.

Uploading a 155 GB file to S3 bucket

I'm using S3Express on Windows to upload a 155 GB file to an S3 bucket using the following command:
put E:\<Folder-name>\<File-name>.csv s3://data.<company-name>.org/<Folder-name>/ -mul:100 -t:2
But the upload doesn't seem to start at all. It was stuck at the following:
Max. Threads: 2
Using MULTIPART UPLOADS (part size:100MB)
S3 Bucket: s3:
S3 Folder: data.<company-name>.org/<Folder-name>/
Selecting Files ...
Press 'Esc' to stop ...
Selected Files to Upload: 1 (154.61GB = 166013845848B) - Use [-showfiles] to list files.
Uploading files to S3...
Press 'Esc' to stop ...
[s=Status] [p=in Progress] [e=Errors] [w=Warnings] [k=sKipped] [d=Dupl] [o=Out]
before throwing the following error:
Error initializing upload for file : E:\<Folder-name>\<File-name>.csv
E:\<Folder-name>\<File-name>.csv: com_err:7 - Failed to connect to s3 port 443: Timed out - Failed to retrieve list of active multipart uploads
------------------------------------------------------------------------------
Done.
Errors (1):
E:\<Folder-name>\<File-name>.csv - com_err:7 - Failed to connect to s3 port 443: Timed out - Failed to retrieve list of active multipart uploads
------------------------------------------------------------------------------
Threads: 0 - Transf.: 0B (0B/sec) ET: 0h 25m 30s
Current Bandwidth: 0B/sec (0B left) - Temporary Network Errors: 4
Compl. Files: 0 of 1 (0B of 154.61GB) (0%) - Skip: 0 - Err: 1 (154.61GB)
I'm new to S3Express and AWS in general.
Any help would be much appreciated.
TIA.
Remove s3:// from the beginning of the destination. According to the docs, the format is bucket_name/object_name. There is no s3:// prefix.
These lines should have been a clue to you that something is wrong with your invocation:
S3 Bucket: s3:
S3 Folder: data.<company-name>.org/<Folder-name>/
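With the s3:// prefix removed, the invocation from the question becomes something like (same placeholders and options):
put E:\<Folder-name>\<File-name>.csv data.<company-name>.org/<Folder-name>/ -mul:100 -t:2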

Elastic Beanstalk CloudWatch Log streaming stops working – How to debug

My Elastic Beanstalk environment stops streaming Node.js events to CloudWatch Logs. Streaming works fine for a few minutes on a new instance; after a few minutes, no more logs show up in CloudWatch.
I set up AWS Elastic Beanstalk to stream logs to CloudWatch under Configuration > Software Configuration > CloudWatch Logs > Log Streaming (true). I deactivated log streaming and reactivated it as a test. Taking a look at CloudWatch:
The last eb-activity log is about 10 minutes old
The error log is not available (on either of the instances)
nginx/access.log is a few seconds old
nodejs.log is about an hour old (shortly after relaunching the instance)
Every health check writes a log entry into nodejs.log every few seconds, though.
I did not find any logs on the EC2 instance regarding log streaming.
Has anyone had similar issues?
How do I make Elastic Beanstalk stream Node.js logs to CloudWatch Logs?
--- EDIT
[ec2-user@ip-###-##-##-## log]$ cat /var/log/awslogs.log
2017-03-07 11:01:05,928 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Detected file rotation, notifying reader
2017-03-07 11:01:05,928 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Reader is still alive.
2017-03-07 11:01:05,928 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:05,928 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:06,052 - cwlogs.push.reader - INFO - 31861 - Thread-8 - No data is left. Reader is leaving.
2017-03-07 11:01:10,929 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Removing dead reader [2177a5cce5ed29525de329bfdc292ff1, /var/log/nginx/access.log]
2017-03-07 11:01:10,929 - cwlogs.push.stream - INFO - 31861 - Thread-1 - Starting reader for [92257964a10edeb586f084f4f2ba35de, /var/log/nginx/access.log]
2017-03-07 11:01:10,930 - cwlogs.push.reader - INFO - 31861 - Thread-11 - Start reading file from 0.
2017-03-07 11:01:10,930 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:10,930 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:15,931 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:15,931 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:16,788 - cwlogs.push.publisher - INFO - 31861 - Thread-7 - Log group: /aws/elasticbeanstalk/production/var/log/nginx/access.log, log stream: i-0bd24767864801e2c, queue size: 0, Publish batch: {'skipped_events_count': 0, 'first_event': {'timestamp': 1488884470930, 'start_position': 0L, 'end_position': 114L}, 'fallback_events_count': 0, 'last_event': {'timestamp': 1488884472931, 'start_position': 341L, 'end_position': 454L}, 'source_id': '92257964a10edeb586f084f4f2ba35de', 'num_of_events': 4, 'batch_size_in_bytes': 554}
2017-03-07 11:01:20,932 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:20,932 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:25,933 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/error.log*'.
2017-03-07 11:01:25,933 - cwlogs.push.stream - WARNING - 31861 - Thread-1 - No file is found with given path '/var/log/httpd/access.log*'.
2017-03-07 11:01:27,881 - cwlogs.push.publisher - INFO - 31861 - Thread-7 - Log group: /aws/elasticbeanstalk/production/var/log/nginx/access.log, log stream: i-0bd24767864801e2c, queue size: 0, Publish batch: {'skipped_events_count': 0, 'first_event': {'timestamp': 1488884481933, 'start_position': 454L, 'end_position': 568L}, 'fallback_events_count': 0, 'last_event': {'timestamp': 1488884482934, 'start_position': 568L, 'end_position': 681L}, 'source_id': '92257964a10edeb586f084f4f2ba35de', 'num_of_events': 2, 'batch_size_in_bytes': 277}
When Andrew (@andrew-ferk) and I activated log streaming, it created all the log groups and streams in CloudWatch with the current log. After we deployed again, we noticed the logs stopped. This is because the CloudWatch Logs agent hashes the first line of the log file; if it has seen that hash before, it treats the file as if it has already been processed.
If you are using npm start, the first lines will be your application's name and version.
You can add CMD date && npm start to your Dockerfile to produce a different first line on each deploy, or run npm in silent mode (as long as your first output is unique).
Also, according to the AWS docs, you should add the following policy to your Elastic Beanstalk instance profile before enabling the feature:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:GetLogEvents",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams",
        "logs:PutRetentionPolicy"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
The following FAQs might be helpful:
CloudWatch Logs Agent FAQs
Why can’t I push log data to CloudWatch Logs with the awslogs agent?
Some things to check if you are streaming custom log files:
eb ssh into the instance and look at /var/log/awslogs.log. If that doesn't even mention "Loading additional configs from (your awslogs config file)", make sure you are installing your config file correctly and restarting the awslogs service after installing it (presumably using .ebextensions). See "Custom Log File Streaming" in Using Elastic Beanstalk with Amazon CloudWatch Logs, and the commands section in logs-streamtocloudwatch-linux.config for how to restart the awslogs service.
The CloudWatch Logs Agent is stateful. If the first few lines of your log file are blank or never change, you may need to set file_fingerprint_lines. See CloudWatch Logs Agent Reference.
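As an illustration, a sketch of an awslogs agent stanza for a custom log file with file_fingerprint_lines set; the config path, group name, stream name, and datetime format below are placeholders, following the CloudWatch Logs Agent Reference:
# /etc/awslogs/config/nodejs.conf (placeholder path)
[/var/log/nodejs/nodejs.log]
file = /var/log/nodejs/nodejs.log
log_group_name = /aws/elasticbeanstalk/production/var/log/nodejs/nodejs.log
log_stream_name = {instance_id}
datetime_format = %Y-%m-%d %H:%M:%S
initial_position = start_of_file
# hash lines 2-5 instead of line 1, in case the first line is blank or identical across deploys
file_fingerprint_lines = 2-5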