What is root snapshot in Snapshot or EBS?

When I try to run a Python script that builds an AMI from a snapshot, it fails with:
botocore.exceptions.ClientError: An error occurred (InvalidBlockDeviceMapping) when calling the RegisterImage operation: No root snapshot specified in device mapping.
Everything looks right when I check, and I can't find any root snapshot details in EBS.
response = client.register_image(
    BlockDeviceMappings=[
        {
            'DeviceName': '/dev/sdb',
            'Ebs': {
                'SnapshotId': destination_snapshot_id
            },
        },
    ],
    EnaSupport=True,
    Name="jenkins-slave-" + str(int(time.time())),
    VirtualizationType='hvm',
    RootDeviceName='/dev/sda1'
)

RootDeviceName must match one of the DeviceName entries in BlockDeviceMappings[].
In Kanthi K's case, /dev/sda1 does not match /dev/sdb.
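A minimal sketch of a corrected call, assuming the snapshot really holds the root filesystem, a boto3 EC2 client named client, and the destination_snapshot_id variable from the question's script, could look like this:

import time

import boto3

client = boto3.client('ec2')

# Map the root snapshot to the same device name that RootDeviceName points at.
# destination_snapshot_id comes from the question's script.
response = client.register_image(
    Name='jenkins-slave-' + str(int(time.time())),
    BlockDeviceMappings=[
        {
            'DeviceName': '/dev/sda1',  # matches RootDeviceName below
            'Ebs': {'SnapshotId': destination_snapshot_id},
        },
    ],
    RootDeviceName='/dev/sda1',
    VirtualizationType='hvm',
    EnaSupport=True,
)
print(response['ImageId'])

Alternatively, keeping the mapping on /dev/sdb and pointing RootDeviceName at /dev/sdb should also satisfy the check; the key point is that the two values agree.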

Related

AWS SSM error while targets.1.member.values failed to satisfy constraint: Member must have length less than or equal to 50

I am trying to run an SSM command on more than 50 EC2 instances in my fleet. Using AWS boto3's SSM client, I am running a specific command on my nodes. My code is given below; after running it, an unexpected error shows up.
# running ec2 instances
instances = client.describe_instances()
instance_ids = [
    i["InstanceId"] for r in instances["Reservations"] for i in r["Instances"]
]  # might contain more than 50 instances

# run command
run_cmd_resp = ssm_client.send_command(
    Targets=[
        {"Key": "InstanceIds", "Values": instance_ids},
    ],
    DocumentName="AWS-RunShellScript",
    DocumentVersion="1",
    Parameters={
        "commands": ["#!/bin/bash", "ls -ltrh", "# some commands"]
    }
)
On executing this, I get the error below:
An error occurred (ValidationException) when calling the SendCommand operation: 1 validation error detected: Value '[...91 instance IDs...]' at 'targets.1.member.values' failed to satisfy constraint: Member must have length less than or equal to 50.
How do I run the SSM command on my whole fleet?
As shown in the error message and the boto3 documentation (link), the number of instances in one send_command call is limited to 50. To run the SSM command on all instances, splitting the original list into chunks of 50 is one solution.
FYI: if your account has a fair number of instances, describe_instances() can't retrieve all instance info in one API call, so it is better to check whether NextToken is in the response.
ref: How do you use "NextToken" in AWS API calls
# running ec2 instances
instances = client.describe_instances()
instance_ids = [
    i["InstanceId"] for r in instances["Reservations"] for i in r["Instances"]
]
# keep paginating while the response carries a NextToken
while "NextToken" in instances:
    instances = client.describe_instances(NextToken=instances["NextToken"])
    instance_ids += [
        i["InstanceId"] for r in instances["Reservations"] for i in r["Instances"]
    ]

# run command in batches of at most 50 instance IDs
for i in range(0, len(instance_ids), 50):
    target_instances = instance_ids[i : i + 50]
    run_cmd_resp = ssm_client.send_command(
        Targets=[
            {"Key": "InstanceIds", "Values": target_instances},
        ],
        DocumentName="AWS-RunShellScript",
        DocumentVersion="1",
        Parameters={
            "commands": ["#!/bin/bash", "ls -ltrh", "# some commands"]
        }
    )
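As an aside, a shorter sketch of the same approach using boto3's built-in paginator (so NextToken handling is implicit) might look like the following; the client names and the command are placeholders:

import boto3

ec2 = boto3.client("ec2")
ssm = boto3.client("ssm")

# The paginator walks every page of describe_instances for us.
instance_ids = []
for page in ec2.get_paginator("describe_instances").paginate():
    for reservation in page["Reservations"]:
        instance_ids += [i["InstanceId"] for i in reservation["Instances"]]

# send_command accepts at most 50 instance IDs per call, so chunk the list.
for i in range(0, len(instance_ids), 50):
    ssm.send_command(
        Targets=[{"Key": "InstanceIds", "Values": instance_ids[i:i + 50]}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["ls -ltrh"]},
    )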
Finally, following Rohan Kishibe's answer, I implemented the batched execution below for the SSM RunShellScript.
import math

import boto3

ssm = boto3.client("ssm")

ec2_ids_all = [...]  # all instance IDs fetched by pagination
PG_START, PG_STOP = 0, 50
PG_SIZE = 50
PG_COUNT = math.ceil(len(ec2_ids_all) / PG_SIZE)

for page in range(PG_COUNT):
    cmd = ssm.send_command(
        Targets=[{"Key": "InstanceIds", "Values": ec2_ids_all[PG_START:PG_STOP]}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["ls -ltrh", "# other commands"]},
    )
    PG_START += PG_SIZE
    PG_STOP += PG_SIZE
In the above way, the total set of instance IDs is distributed into batches and executed accordingly. One can also save the command IDs and the batch instance IDs in a mapping for later use.
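For example, a rough sketch of that bookkeeping (a hypothetical command_map dict, reusing ssm, ec2_ids_all, PG_COUNT and PG_SIZE from the snippet above) could be:

# Remember which batch of instance IDs each CommandId was sent to,
# so results can be collected later (e.g. via ssm.list_command_invocations).
command_map = {}
for page in range(PG_COUNT):
    batch = ec2_ids_all[page * PG_SIZE:(page + 1) * PG_SIZE]
    cmd = ssm.send_command(
        Targets=[{"Key": "InstanceIds", "Values": batch}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["ls -ltrh", "# other commands"]},
    )
    command_map[cmd["Command"]["CommandId"]] = batch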

How to sync 1) EC2 volume attachment's name with 2) a variable in EC2's user-data?

Rephrasing the title: how can I avoid having to guess the attached EC2 volume/device name in the generated aws_instance user_data? That is, how can I avoid being forced to keep an additional attached_device_actual_name variable in the Terraform locals below?
Here's the relevant Terraform configuration:
locals {
  attached_device_name        = "/dev/sdf"      # Used in `aws_volume_attachment`.
  attached_device_actual_name = "/dev/nvme1n1"  # Used in `templatefile`.
}

resource "aws_instance" "foo" {
  user_data = templatefile("./user-data.sh.tpl", {
    attached_device_name = local.attached_device_actual_name
  })
}

resource "aws_volume_attachment" "foo" {
  device_name = local.attached_device_name
  instance_id = aws_instance.foo.id
}
The docs say
The device names that you specify for NVMe EBS volumes in a block device mapping are renamed using NVMe device names (/dev/nvme[0-26]n1).
Does the above-quoted part, "device names [...] are renamed", also imply that one should not use these reserved /dev/nvme... names? Indeed, if I set local.attached_device_name to /dev/nvme1n1, which happens to be a "correct" guess in this case, this error pops up:
Error: Error attaching volume (vol-some_id) to instance (i-some_id), message: "Value (/dev/nvme1n1) for parameter device is invalid. /dev/nvme1n1 is not a valid EBS device name.", code: "InvalidParameterValue"
"/dev/nvme1n1 is not a valid EBS device name."
The goal was to have user_data synced with the attached-volume's name and then be able to wait for the volume:
DEVICE="${attached_device_name}"
while [ ! -e "$${DEVICE}" ]; do
  echo "Waiting for $${DEVICE} ..."
  sleep 1
done
Env:
Terraform 1.1.2
hashicorp/aws 4.10.0

Pyspark - read data from elasticsearch cluster on EMR

I am trying to read data from Elasticsearch in PySpark, using the elasticsearch-hadoop API in Spark. The ES cluster sits on AWS EMR and requires credentials to sign in. My script is as below:
from pyspark import SparkContext, SparkConf
sc.stop()
conf = SparkConf().setAppName("ESTest")
sc = SparkContext(conf=conf)
es_read_conf = {
    "es.host": "vhost",
    "es.nodes": "node",
    "es.port": "443",
    "es.query": '{ "query": { "match_all": {} } }',
    "es.input.json": "true",
    "es.net.https.auth.user": "aws_access_key",
    "es.net.https.auth.pass": "aws_secret_key",
    "es.net.ssl": "true",
    "es.resource": "index/type",
    "es.nodes.wan.only": "true"
}
es_rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=es_read_conf
)
PySpark keeps throwing this error:
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: [HEAD] on [index] failed; server[node:443] returned [403|Forbidden:]
I checked everything and it all made sense except for the user and pass entries: would the AWS access key and secret key work here? We don't want to use the console username and password here for security reasons. Is there a different way to do the same thing?

Automating network interface related configurations on red hat ami-7.5

I have an ENI created, and I need to attach it as a secondary ENI to my EC2 instance dynamically using CloudFormation. As I am using a Red Hat AMI, I have to go ahead and manually configure RHEL, which includes the steps mentioned in the post below.
Manually Configuring secondary Elastic network interface on Red hat ami- 7.5
Can someone please tell me how to automate all of this using CloudFormation? Is there a way to do all of it using user data in a CloudFormation template? Also, I need to make sure the configuration survives a reboot of my EC2 instance (currently the configuration is lost after a reboot).
Though it's not complete automation, you can do the following to make sure the ENI comes up after every reboot of your EC2 instance (RHEL instances only). If anyone has a better suggestion, kindly share.
vi /etc/systemd/system/create.service
Add the content below:
[Unit]
Description=XYZ
After=network.target
[Service]
ExecStart=/usr/local/bin/my.sh
[Install]
WantedBy=multi-user.target
Change permissions and enable the service
chmod a+x /etc/systemd/system/create.service
systemctl enable /etc/systemd/system/create.service
The shell script below does the ENI configuration on RHEL.
vi /usr/local/bin/my.sh
Add the content below:
#!/bin/bash
# Private IP of the secondary ENI, looked up by its MAC address in the instance metadata
my_eth1=`curl http://169.254.169.254/latest/meta-data/network/interfaces/macs/0e:3f:96:77:bb:f8/local-ipv4s/`
echo "this is the value--" $my_eth1 "hoo"
# Default gateway of the primary interface
GATEWAY=`ip route | awk '/default/ { print $3 }'`
# Write the network and eth1 interface configuration files
printf "NETWORKING=yes\nNOZEROCONF=yes\nGATEWAYDEV=eth0\n" >/etc/sysconfig/network
printf "\nBOOTPROTO=dhcp\nDEVICE=eth1\nONBOOT=yes\nTYPE=Ethernet\nUSERCTL=no\n" >/etc/sysconfig/network-scripts/ifcfg-eth1
# Bring up eth1 and route its traffic through a separate routing table (table 2)
ifup eth1
ip route add default via $GATEWAY dev eth1 tab 2
ip rule add from $my_eth1/32 tab 2 priority 600
Start the service
systemctl start create.service
You can check whether the script ran fine with:
journalctl -u create.service -b
I still need to figure out joining the secondary ENI from within Linux, but below is the Python script I wrote to have the instance find the corresponding ENI and attach it to itself. Basically, the script works by taking a predefined Name tag on both the ENI and the instance, then joining the two together.
Pre-reqs for setting this up are:
An IAM role on the instance that allows access to the S3 bucket where the script is stored
Install pip and the AWS CLI in the user data section:
curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py
pip install awscli --upgrade
aws configure set default.region YOUR_REGION_HERE
pip install boto3
sleep 180
Note on the sleep 180 command: I have the ENI swap between instances in an autoscaling group. This allows an extra 3 minutes for the other instance to shut down and drop the ENI so the new one can pick it up. It may or may not be necessary for your use case.
AWS CLI command in user data to download the file onto the instance (example below)
aws s3api get-object --bucket YOURBUCKETNAME --key NAMEOFOBJECT.py /home/ec2-user/NAMEOFOBJECT.py
# coding: utf-8
import boto3
import sys
import time

client = boto3.client('ec2')

# Get the ENI ID
eni = client.describe_network_interfaces(
    Filters=[
        {
            'Name': 'tag:Name',
            'Values': ['Put the name of your ENI tag here']
        },
    ]
)
eni_id = eni['NetworkInterfaces'][0]['NetworkInterfaceId']

# Get ENI status
eni_status = eni['NetworkInterfaces'][0]['Status']
print('Current Status: {}\n'.format(eni_status))

# Detach if in use
if eni_status == 'in-use':
    eni_attach_id = eni['NetworkInterfaces'][0]['Attachment']['AttachmentId']
    eni_detach = client.detach_network_interface(
        AttachmentId=eni_attach_id,
        DryRun=False,
        Force=False
    )
    print(eni_detach)

# Wait until ENI is available
print('start\n-----')
while eni_status != 'available':
    print('checking...')
    eni_state = client.describe_network_interfaces(
        Filters=[
            {
                'Name': 'tag:Name',
                'Values': ['Put the name of your ENI tag here']
            },
        ]
    )
    eni_status = eni_state['NetworkInterfaces'][0]['Status']
    print('ENI is currently: ' + eni_status + '\n')
    if eni_status != 'available':
        time.sleep(10)
print('end')

# Get the instance ID
instance = client.describe_instances(
    Filters=[
        {
            'Name': 'tag:Name',
            'Values': ['Put the tag name of your instance here']
        },
        {
            'Name': 'instance-state-name',
            'Values': ['running']
        }
    ]
)
instance_id = instance['Reservations'][0]['Instances'][0]['InstanceId']

# Attach the ENI
response = client.attach_network_interface(
    DeviceIndex=1,
    DryRun=False,
    InstanceId=instance_id,
    NetworkInterfaceId=eni_id
)
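As an aside, the manual polling loop above can likely be replaced with boto3's built-in EC2 waiter for network interfaces; a small sketch, reusing the same tag filter, would be:

import boto3

client = boto3.client('ec2')

# Poll describe_network_interfaces until the ENI reports the 'available' state.
waiter = client.get_waiter('network_interface_available')
waiter.wait(
    Filters=[
        {
            'Name': 'tag:Name',
            'Values': ['Put the name of your ENI tag here']
        },
    ]
)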

AWS mounting old volume from old instance to new instance

I detached the old volume from the old instance and attached it to the new instance in the AWS console,
and I followed this question: Add EBS to Ubuntu EC2 Instance
When I run sudo mount /vol, it shows me the error:
mount: wrong fs type, bad option, bad superblock on /dev/xvdf,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
The output of 'dmesg | tail' is below
[    9.158108] audit: type=1400 audit(1481970181.964:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/NetworkManager/nm-dhcp-helper" pid=705 comm="apparmor_parser"
[    9.158434] audit: type=1400 audit(1481970181.964:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/connman/scripts/dhclient-script" pid=705 comm="apparmor_parser"
[    9.178292] audit: type=1400 audit(1481970181.984:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=761 comm="apparmor_parser"
[    9.341874] audit: type=1400 audit(1481970182.148:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/lxd/lxd-bridge-proxy" pid=763 comm="apparmor_parser"
[   11.673698] random: nonblocking pool is initialized
[   11.766032] EXT4-fs (xvda1): resizing filesystem from 2094474 to 2095139 blocks
[   11.766371] EXT4-fs (xvda1): resized filesystem to 2095139
[   12.716500] cgroup: new mount options do not match the existing superblock, will be ignored
[  236.029463] blkfront: xvdf: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[  236.038716] xvdf: xvdf1
The old volume's attachment information in the AWS console is below:
VOLUME_ID (NEW_INSTANCE_NAME):/dev/sdf (attached)
Your volume has a partition table, as evidenced by...
[ 236.038716] xvdf: xvdf1
...so you need to mount the partition, not the volume.
sudo mount /dev/xvdf1 /path/to/mount-point
You can also see this using lsblk.