EBS Direct APIs: Starting a snapshot and putting data into it

I'm trying to (1) create a new EBS snapshot using the EBS direct APIs and (2) put data into the newly-created snapshot. I keep getting the following error at step #2:
"Error parsing parameter '--block-data': Blob values must be a path to a file."
I'm sure it has to do with the file path (/tmp/data) in step #2, but I'm not sure what that file should be or what exactly should be in it.
All help is appreciated. TY!
Here are my CLI commands (from the EC2 instance I'm trying to snapshot):
Start a Snapshot:
aws ebs start-snapshot --volume-size 8 --timeout 60 --client-token 550e8400-e29b-41d4-a716-446655440000
OUTPUT:
{
"Status": "pending",
"KmsKeyArn": "arn:aws:kms:us-east-1:721340000000:key/a0919dc2-5e54-4a66-b52bEXAMPLE",
"BlockSize": 524288,
"VolumeSize": 8,
"StartTime": 1663609346.678,
"SnapshotId": "snap-0d0b369bf6EXAMPLE",
"OwnerId": "7213410000000"
}
Put data into the newly-created snapshot:
aws ebs put-snapshot-block --snapshot-id snap-0d0b369bf6EXAMPLE --block-index 1000 --data-length 524288 --block-data /tmp/data --checksum UK3qYfpOd6sRG4FHFgl6v9Bfg6IHtH60Upu9TXXXXXX= --checksum-algorithm SHA256
This has been my guide: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/writesnapshots.html
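For reference, a minimal sketch of how the block file and its checksum might be prepared, assuming the goal is to write one full 512 KiB block of test data (/dev/urandom is only a stand-in data source; the snapshot ID and block index are taken from the commands above):
# Sketch only: create a 512 KiB block file, compute its Base64-encoded SHA256
# checksum, then write it as block 1000 of the pending snapshot.
dd if=/dev/urandom of=/tmp/data bs=524288 count=1
CHECKSUM=$(openssl dgst -sha256 -binary /tmp/data | base64)
aws ebs put-snapshot-block \
    --snapshot-id snap-0d0b369bf6EXAMPLE \
    --block-index 1000 \
    --data-length 524288 \
    --block-data /tmp/data \
    --checksum "$CHECKSUM" \
    --checksum-algorithm SHA256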

Related

AWS cloud watch metrics with ASG name changes

In AWS CloudWatch we have one dashboard per environment.
Each dashboard has N plots.
Some plots use the Auto Scaling Group (ASG) name to find the data to plot.
Example of such a plot (from the widget's edit view, Source tab):
{
"metrics": [
[ "production", "mem_used_percent", "AutoScalingGroupName", "awseb-e-rv8y2igice-stack-AWSEBAutoScalingGroup-3T5YOK67T3FD" ]
],
... other params removed for brevity ...
"title": "Used Memory (%)",
}
Every time we deploy, the ASG name changes (we deploy using CodeDeploy with Elastic Beanstalk (EB) configuration files from source).
I need to manually find the new name and update the N plots one by one.
The strange thing is that this happens for production and staging environments, but not for integration.
All 3 should be copies of one another, with different settings coming from the EB configuration files, so I don't know what is going on.
In any case, what (I think) I need is one of:
option 1: prevent the ASG name change upon deploy
option 2: dynamically update the plots with the new name
option 3: plot the same data without using the ASG name (but the alternatives I find are the EC2 instance ID, which also changes, and ImageId and InstanceType, which are shared by more than one EC2 instance, so they won't work either)
My online-search-fu has turned up empty.
More Info:
I'm publishing these metrics with the CloudWatch agent by adjusting the conf file, as per the docs here:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Agent-on-EC2-Instance.html
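For reference, a rough sketch of the agent configuration fragment that publishes this metric with the ASG name as a dimension (the file path is the agent's default config location; the exact fragment is an assumption to compare against your own conf file):
# Sketch of the relevant part of the CloudWatch agent config: collect
# mem_used_percent and append the AutoScalingGroupName dimension.
cat > /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json <<'EOF'
{
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
    },
    "metrics_collected": {
      "mem": {
        "measurement": ["mem_used_percent"]
      }
    }
  }
}
EOF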
Have a look at CloudWatch Search Expression Syntax. It allows you to use tokens for searching, e.g.:
SEARCH(' {AWS/CWAgent, AutoScalingGroupName} MetricName="mem_used_percent" rv8y2igice', 'Average', 300)
which would replace the entry for metrics like so:
"metrics": [
[ { "expression": "SEARCH(' {AWS/CWAgent, AutoScalingGroupName} MetricName=\"mem_used_percent\" rv8y2igice', 'Average', 300)", "label": "Expression1", "id": "e1" } ]
]
Simply search for the desired result in the console; results that match the search appear.
To graph all of the metrics that match your search, choose Graph search,
and find the exact search expression that you want in the Details on the Graphed metrics tab.
SEARCH('{CWAgent,AutoScalingGroupName,ImageId,InstanceId,InstanceType} mem_used_percent', 'Average', 300)

Mounting a volume for AWS Batch

We have some AWS Batch processes that run nicely, using images from ECS. We do not assign any volumes or storage, and it seems we get 8 GB by default. I'm not actually sure why/where that is defined.
Anyway, we now have a situation where we need more space. It's only temporary processing space - we need to extract an archive, convert it, re-compress it, and then upload it to S3. We already have this process; it's just that we've now run out of space in our 8 GB allowance.
So, just to be absolutely sure, how should we go about adding this space? I see a few things about connecting EFS to the instance - is that a good use case? Are there considerations regarding multiple jobs running at the same time, etc.? (There are scenarios where this is allowed, since it's a generic unzipper process that gets called many times.)
So the requirement is a throwaway storage volume that doesn't need to persist; it can disappear once the AWS Batch job finishes. The data files that have currently blown it up are 9 GB, and I'm not sure how much our image itself uses. Alpine Linux, so presumably not a huge amount.
Or, of course, if we can simply tune that initial 8 GB up by a couple of GB, then we're laughing...
The only info I'm finding is about specifying a custom launch template in the compute environment definition, and the launch template setup is described here: https://aws.amazon.com/premiumsupport/knowledge-center/batch-mount-efs/
As of now, EC2 instances on Batch have 30 GB of EBS storage attached by default. This can be tuned by adding a custom launch template, as shown below:
{
    "LaunchTemplateName": "increase-container-volume-encrypt",
    "LaunchTemplateData": {
        "BlockDeviceMappings": [
            {
                "DeviceName": "/dev/xvda",
                "Ebs": {
                    "Encrypted": true,
                    "VolumeSize": 100,
                    "VolumeType": "gp2"
                }
            }
        ]
    }
}
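The JSON above is already in the shape that the CLI's --cli-input-json flag expects, so (as an assumed workflow, with a placeholder file name) the template could be created like this:
# Sketch: create the launch template from the JSON above, saved locally as
# launch-template.json (file name is a placeholder).
aws ec2 create-launch-template --cli-input-json file://launch-template.json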
Then, add it to the compute environment definition:
{
    "computeEnvironmentName": "",
    "type": "MANAGED",
    "state": "ENABLED",
    "computeResources": {
        "type": "EC2",
        "allocationStrategy": "BEST_FIT_PROGRESSIVE",
        "minvCpus": 2,
        "maxvCpus": 20,
        "desiredvCpus": 2,
        "instanceTypes": [
            "c6i"
        ],
        "imageId": "",
        "subnets": [
            ""
        ],
        "launchTemplate": {
            "launchTemplateName": "increase-container-volume-encrypt"
        }
    }
}
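Similarly, once the blanks (name, image ID, subnets) are filled in, the compute environment definition above could be applied like this (the file name is a placeholder, and the definition may also need fields the snippet omits, such as a service role):
# Sketch: create the Batch compute environment from the definition above,
# saved locally as compute-environment.json.
aws batch create-compute-environment --cli-input-json file://compute-environment.json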

Linking to a Google Cloud bucket file in a terminal command?

I'm trying to find my way with Google Cloud.
I have a Debian VM instance that I am running a server on. It is installed and working via an SSH connection in a browser window. The command to start the server is "./ninjamsrv config-file-path.cfg"
I have the config file in my default Google Firebase storage bucket, as I will need to update it regularly.
I want to start the server referencing the cfg file in the bucket, e.g:
"./ninjamsrv gs://my-bucket/ninjam-config.cfg"
But the file is not found:
error opening configfile 'gs://my-bucket/ninjam-config.cfg'
Error loading config file!
However if I run:
"gsutil acl get gs://my-bucket/"
I see:
[
    {
        "entity": "project-editors-XXXXX",
        "projectTeam": {
            "projectNumber": "XXXXX",
            "team": "editors"
        },
        "role": "OWNER"
    },
    {
        "entity": "project-owners-XXXXX",
        "projectTeam": {
            "projectNumber": "XXXXX",
            "team": "owners"
        },
        "role": "OWNER"
    },
    {
        "entity": "project-viewers-XXXXX",
        "projectTeam": {
            "projectNumber": "XXXXX",
            "team": "viewers"
        },
        "role": "READER"
    }
]
Can anyone advise what I am doing wrong here? Thanks
The first thing to verify is whether the error thrown is indeed a permission one. Checking the logs related to the VM's operations will provide more detail on that, and a 403 error code would confirm that this is a permission issue. If the VM is a Compute Engine one, you can refer to this documentation about logging.
If the error is indeed a permission one, then you should verify if the permissions for this object are set as “fine-grained” access. This would mean that each object would have its own set of permissions, regardless of the bucket-level access set. You can read more about this here. You could either change the level of access to “uniform” which would grant access to all objects in the relevant bucket, or make the appropriate permissions change for this particular object.
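As a rough illustration of those two options (the bucket and object names are taken from the question; the service account email is a placeholder you would replace with the VM's own):
# Option A (sketch): switch the bucket to uniform bucket-level access.
gsutil uniformbucketlevelaccess set on gs://my-bucket
# Option B (sketch): grant read access on just the config object to the VM's service account.
gsutil acl ch -u my-vm-sa@my-project.iam.gserviceaccount.com:R gs://my-bucket/ninjam-config.cfg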
If the issue is not a permission one, then I would recommend trying to start the server from the same .cfg file hosted on the local directory of the VM. This might point the error at the file itself, and not its hosting on Cloud Storage. In case the server starts successfully from there, you may want to re-upload the file to GCS in case the file got corrupted during the initial upload.
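A minimal sketch of that last check, using only the paths from the question (the local destination is arbitrary):
# Sketch: copy the config file to the VM's local disk, then start the server
# from the local copy to rule out the gs:// path itself as the problem.
gsutil cp gs://my-bucket/ninjam-config.cfg /tmp/ninjam-config.cfg
./ninjamsrv /tmp/ninjam-config.cfg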

How to get the Initialization Vector (IV) from the AWS Encryption CLI?

I'm encrypting a file using the AWS Encryption CLI using a command like so:
aws-encryption-cli --encrypt --input test.mp4 --master-keys key=arn:aws:kms:us-west-2:123456789012:key/example-key-id --output . --metadata-output -
From the output of the command, I can clearly see that it's using an Initialization Vector (IV) of length 12, which is great, but how do I actually view the IV? In order to pass the encrypted file to another service, like AWS Elastic Transcoder, where it will do the decryption itself, I need to know what IV was used to encrypt the file.
{
    "header": {
        "algorithm": "AES_256_GCM_IV12_TAG16_HKDF_SHA384_ECDSA_P384",
        "content_type": 2,
        "encrypted_data_keys": [{
            "encrypted_data_key": "...............",
            "key_provider": {
                "key_info": "............",
                "provider_id": "..........."
            }
        }],
        "encryption_context": {
            "aws-crypto-public-key": "..............."
        },
        "frame_length": 4096,
        "header_iv_length": 12,
        "message_id": "..........",
        "type": 128,
        "version": "1.0"
    },
    "input": "/home/test.mp4",
    "mode": "encrypt",
    "output": "/home/test.mp4.encrypted"
}
Unfortunately, you won't be able to use the AWS Encryption SDK CLI to encrypt data for Amazon Elastic Transcoder's consumption.
One of the primary benefits of the AWS Encryption SDK is the message format [1], which packages all necessary information about the encrypted message into a binary blob and provides a more scalable way of handling large messages. Extracting the data primitives from that blob is not recommended, and even if you did, they may or may not be directly compatible with another system, depending on how you used the AWS Encryption SDK and what that other system expects.
In the case of Elastic Transcoder, they expect the raw ciphertext encrypted using the specified AES mode [2]. This is not compatible with the AWS Encryption SDK format.
[1] https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/message-format.html
[2] https://docs.aws.amazon.com/elastictranscoder/latest/developerguide/create-job.html#create-job-request-inputs-encryption
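For contrast, a very rough sketch of what producing a raw ciphertext with an explicit, known IV could look like outside the Encryption CLI (openssl and AES-CBC here are assumptions for illustration only; check Elastic Transcoder's input-encryption requirements [2] for the exact modes and key/IV encoding it accepts):
# Sketch only: generate a random data key and IV, keep both, and encrypt the
# file directly so the IV is known and can be supplied to the decrypting service.
KEY=$(openssl rand -hex 32)   # 256-bit key, hex-encoded
IV=$(openssl rand -hex 16)    # 128-bit IV, hex-encoded
openssl enc -aes-256-cbc -K "$KEY" -iv "$IV" -in test.mp4 -out test.mp4.enc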

Elasticsearch Snapshot Fails With RepositoryMissingException

Three-node Elasticsearch cluster on AWS. Bigdesk and Head both show a healthy cluster. All three nodes are running ES 1.3 and the latest Amazon Linux updates. When I fire off a snapshot request like:
http://localhost:9200/_snapshot/taxanalyst/201409031540-snapshot?wait_for_completion=true
the server churns away for several minutes before responding with the following:
{
    "snapshot": {
        "snapshot": "201409031521-snapshot",
        "indices": [
            "docs",
            "pdflog"
        ],
        "state": "PARTIAL",
        "start_time": "2014-09-03T19:21:36.034Z",
        "start_time_in_millis": 1409772096034,
        "end_time": "2014-09-03T19:28:48.685Z",
        "end_time_in_millis": 1409772528685,
        "duration_in_millis": 432651,
        "failures": [
            {
                "node_id": "ikauhFYEQ02Mca8fd1E4jA",
                "index": "pdflog",
                "reason": "RepositoryMissingException[[faxmanalips] missing]",
                "shard_id": 0,
                "status": "INTERNAL_SERVER_ERROR"
            }
        ],
        "shards": {
            "total": 10,
            "failed": 1,
            "successful": 9
        }
    }
}
These are three nodes on three different virtual EC2 machines, but they're able to communicate via 9300/9200 without any problems. Indexing and searching works as expected. There doesn't appear to be anything in the elasticsearch log files that speaks to the server error.
Does anyone know what's going on here, or at least where a good place to start would be?
UPDATE: It turns out that each of the nodes in the cluster needs to have a snapshot directory that matches the directory specified when you register the snapshot repository with the Elasticsearch cluster.
I guess the next question is: when you want to tgz up the snapshot directory so you can archive it, or provision a backup cluster, is it sufficient to just tgz the snapshot directory on the master node? Or do you have to somehow consolidate the snapshot directories of all the nodes? (That can't be right, can it?)
Elasticsearch supports a shared file system repository, which uses a shared file system to store snapshots.
In order to register the shared file system repository, it is necessary to mount the same shared filesystem to the same location on all master and data nodes.
All you need to do is put the same repository path in elasticsearch.yml on all 3 nodes.
For example:
path.repo: ["/my_repository"]
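Once that is in place on every node and the shared filesystem is mounted at the same path everywhere, the repository itself is registered through the snapshot API, roughly like this (the repository name and location are the placeholder values from above):
# Sketch: register a shared-filesystem snapshot repository named my_repository.
curl -XPUT 'http://localhost:9200/_snapshot/my_repository' -d '{
  "type": "fs",
  "settings": {
    "location": "/my_repository"
  }
}'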
I think you are looking for this AWS plugin for Elasticsearch (I guess you already installed it to configure your cluster): https://github.com/elasticsearch/elasticsearch-cloud-aws#s3-repository
It will allow you to create a repository mapped to an S3 bucket.
To use (create/restore/whatever) a snapshot, you need to create a repository first. Then, when you perform actions on a snapshot, Elasticsearch will manage it directly in your S3 bucket.
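As a rough example of what that registration looks like with the plugin installed (the bucket name and region are placeholders):
# Sketch: register an S3-backed snapshot repository via the cloud-aws plugin.
curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
  "type": "s3",
  "settings": {
    "bucket": "my-snapshot-bucket",
    "region": "us-east-1"
  }
}'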