Try as I might, I cannot get the import-image task to work. I'm looking for a working example that I can reproduce, preferably starting with a "raw" disk image.
Most recent problems:
"Unsupported kernel version" when using an image that works fine when converted with the mouse instead of the API (posted to EC2 forum, no response: https://forums.aws.amazon.com/thread.jspa?threadID=221844)
"No valid partitions" when using a VirtualBox VMDK image that boots just fine in VirtualBox.
I ran into a similar issue when I tried importing FreeBSD OVAs. According to the prerequisites/checklist, Amazon does not yet support VM import for FreeBSD, and that is what produces the "No valid partitions" error.
Also, LUKS-encrypted partitions produced that same error for me (on Ubuntu).
For "Unsupported kernel version", here is my output of that same error:
c:\Users\XXXXX\Documents>aws ec2 describe-import-image-tasks --import-task-ids "import-ami-fgacu4yu"
{
    "ImportImageTasks": [
        {
            "Status": "deleted",
            "SnapshotDetails": [
                {
                    "UserBucket": {
                        "S3Bucket": "myautomationbucket",
                        "S3Key": "ubuntu14.04-patched.ova"
                    },
                    "DiskImageSize": 843476480.0,
                    "Format": "VMDK"
                }
            ],
            "Description": "Optimus Custom Ubuntu14.04",
            "StatusMessage": "ClientError: Unsupported kernel version 4.2.0-36-generic",
            "ImportTaskId": "import-ami-XXXXXXXX"
        }
    ]
}
AWS publishes a list of known-good kernels, but it isn't very detailed for my favorite flavor, Ubuntu:
http://docs.amazonaws.cn/en_us/AWSEC2/latest/WindowsGuide/VMImportPrerequisites.html
So what I did was downgrade the kernel to one of the acceptable versions.
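On Ubuntu the downgrade looks roughly like this (a sketch assuming an apt-based 14.04 system; the kernel version below is only an example, pick one that matches the supported list):

# See which older kernel packages are available
apt-cache search linux-image | grep generic

# Install a known-good 3.13-series kernel alongside the current one (example version)
sudo apt-get install linux-image-3.13.0-117-generic

# Remove the unsupported 4.2.x kernel and regenerate the GRUB config
sudo apt-get remove linux-image-4.2.0-36-generic
sudo update-grub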
I obtained how to get what was "acceptable" by performing this command on an existing, known good running instance in my EC2:
c:\Users\XXXXXX\Documents>aws ec2 describe-instance-attribute --instance-id i-12345678 --attribute kernel --region us-east-1
{
    "InstanceId": "i-12345678",
    "KernelId": {
        "Value": "aki-825ea7eb"
    }
}
So aki-825ea7eb is the supported kernel ID. That isn't very helpful by itself, so after some research I realized that AWS may only support a limited list of kernels due to the constraints of their existing platform -- they are not running ESXi, you know. ;)
I searched around, found this guide useful, and followed its instructions for 13.04: https://www.linode.com/docs/tools-reference/custom-kernels-distros/run-a-distributionsupplied-kernel-with-pvgrub
I performed steps 1-4, skipped steps 5-8, then performed step 9 and step 15.
After applying those steps to my VM, repackaging it as an OVA, and running my vmimport again, it imported successfully and produced a working instance.
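For a reproducible starting point, the import itself can be kicked off from the CLI; something like the following matches the task shown above (the bucket, key, and task ID are the values from my output, substitute your own):

# Start the import from the OVA already uploaded to S3
aws ec2 import-image --description "Optimus Custom Ubuntu14.04" --disk-containers "Format=ova,UserBucket={S3Bucket=myautomationbucket,S3Key=ubuntu14.04-patched.ova}"

# Poll the task until it finishes (or fails with a StatusMessage as above)
aws ec2 describe-import-image-tasks --import-task-ids import-ami-XXXXXXXX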
Hope this helps.
We are using Dataflow Flex Templates and following this guide (https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates) to stage and launch jobs. This works in our environment. However, when I SSH onto the Dataflow VM and run docker ps, I see it referencing a different Docker image from the one we specify in our template (underlined in green):
The template I am launching from is as follows and jobs are created using gcloud beta dataflow flex-template run:
{
    "image": "gcr.io/<MY PROJECT ID>/samples/dataflow/streaming-beam-sql:latest",
    "metadata": {
        "description": "An Apache Beam streaming pipeline that reads JSON encoded messages from Pub/Sub, uses Beam SQL to transform the message data, and writes the results to a BigQuery table",
        "name": "Streaming Beam SQL",
        "parameters": [
            {
                "helpText": "Pub/Sub subscription to read from.",
                "label": "Pub/Sub input subscription.",
                "name": "inputSubscription",
                "regexes": [
                    ".*"
                ]
            },
            {
                "helpText": "BigQuery table spec to write to, in the form 'project:dataset.table'.",
                "is_optional": true,
                "label": "BigQuery output table",
                "name": "outputTable",
                "regexes": [
                    "[^:]+:[^.]+[.].+"
                ]
            }
        ]
    },
    "sdkInfo": {
        "language": "JAVA"
    }
}
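For context, the launch command looks roughly like this (the job name, bucket path, and parameter values here are placeholders, not our real ones):

gcloud beta dataflow flex-template run streaming-beam-sql-job \
    --template-file-gcs-location gs://MY_BUCKET/templates/streaming-beam-sql.json \
    --parameters inputSubscription=projects/MY_PROJECT/subscriptions/MY_SUB,outputTable=MY_PROJECT:MY_DATASET.MY_TABLE \
    --region us-central1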
So I would expect the output of docker ps to show gcr.io/<MY PROJECT ID>/samples/dataflow/streaming-beam-sql as the image on Dataflow. When I launch the image from GCR to run on a GCE instance I get the following output when running docker ps:
Should I expect to see the name of the image I have referenced in the Dataflow template on the Dataflow VM? Or have I missed a step somewhere?
Thanks!
TL;DR: You are looking in the worker VM instead of the launcher VM.
With Flex Templates, when you run the job, Dataflow first creates a launcher VM, which pulls your container and runs it to generate the job graph. That VM is destroyed once this step completes. The worker VMs are then started to actually run the generated job graph; they have no need for your container, which is used only to generate the job graph from the parameters you pass.
In your case, you are searching for your image on the worker VM. The launcher VM is short-lived and its name starts with launcher-*********************. If you SSH into that VM and run docker ps, you will see your container image.
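To catch it, look for the launcher VM while the job is still starting; something along these lines should list it (the name filter is just based on the prefix above):

# List Compute Engine instances whose names start with "launcher-"
gcloud compute instances list --filter="name~^launcher-"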
That is what I get when I follow the instructions at the official Docker tutorial here: tutorial link
I uploaded my Dockerrun.aws.json file and followed all other instructions.
The logs show nothing even when I click Request:
Does anyone have a clue as to what I need to do, i.e., why would not having a default VPC even matter here? I have only used my AWS account to set up Linux EC2 instances for a Deep Learning nanodegree at Udacity (I briefly tried to set up a VPC just for practice, but I'm sure I deleted/terminated everything once I found out it isn't included in the free tier).
The author of the official tutorial forgot to mention that you have to add the tag to the image name in the Dockerrun.aws.json file, as shown below (edited in gedit or any other editor), where :firsttry is the tag:
{
    "AWSEBDockerrunVersion": "1",
    "Image": {
        "Name": "hockeymonkey96/catnip:firsttry",
        "Update": "true"
    },
    "Ports": [
        {
            "ContainerPort": "5000"
        }
    ],
    "Logging": "/var/log/nginx"
}
It works:
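For the tag in Dockerrun.aws.json to resolve, the same tag also has to exist on Docker Hub. Roughly (assuming the image was built locally as hockeymonkey96/catnip:latest; adjust to whatever your local tag is):

# Re-tag the local image and push it so Elastic Beanstalk can pull hockeymonkey96/catnip:firsttry
docker tag hockeymonkey96/catnip:latest hockeymonkey96/catnip:firsttry
docker push hockeymonkey96/catnip:firsttry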
I'm trying to use a large Docker image (the image is on Docker Hub here, about 18 GB) as a job definition for AWS Batch. I'm getting the following error about running out of space:
CannotPullContainerError: write /var/lib/docker/tmp/GetImageBlob#######: no space left on device
The CloudFormation JSON section that defines the job is here:
"JobDef3": {
"Type": "AWS::Batch::JobDefinition",
"Properties": {
"Type": "container",
"ContainerProperties": {
"Image": {
"Fn::Join": [
"",
[
"cornhundred/",
"dockerized-cellranger-nick:latest"
]
]
},
"Vcpus": 1,
"Command": ["some command"],
"Memory": 3000,
},
"RetryStrategy": {
"Attempts": 1
}
}
},
How can I get AWS to increase the amount of space available so that I can run this image?
I was able to run the docker container by moving the large files (~15GB reference genome files) out of the docker image and downloading them after running the container. I also needed to make a custom Amazon Machine Image (AMI, see AWS Batch Genomics for an example) and attach a volume to handle the large reference genome files since the default container was not large enough.
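The download-at-runtime pattern looks roughly like this as an entrypoint script (a sketch only; the bucket, mount point, and final command are made up for illustration):

#!/bin/bash
# entrypoint.sh: fetch the large reference files onto an attached volume
# instead of baking them into the image (bucket and paths are hypothetical)
set -e
REF_DIR=/mnt/reference
mkdir -p "$REF_DIR"
aws s3 sync s3://my-genomics-bucket/reference "$REF_DIR"
# ...then hand off to the actual pipeline (placeholder command)
exec cellranger "$@"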
I had a similar issue. Clearing up unused Docker images and volumes didn't work for me (i.e. neither docker container prune nor docker system prune helped).
I saw another page saying that restarting Docker fixed it for that user, but when I ran service docker restart I got this error: /etc/init.d/docker: line 35: ulimit: open files: cannot modify limit: Operation not permitted
To try to fix that, I found sites suggesting updates to the ulimit values in some configuration files, but when I tried to save the file with the updated parameters I got write error (file system full?)
At that point I realized (as with the initial error you showed) that I needed to clean up and remove files.
I ran du -h from the root folder and saw that /var/lib/docker/tmp/ (the folder in the error message I hit and the one you posted above) used far more disk space than the other folders.
So I removed older files there, and I no longer got that error message.
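In command form, the cleanup amounted to something like this (a rough reconstruction; check what is actually safe to delete on your host first):

# Find which directories under /var/lib/docker are using the space
sudo du -h --max-depth=1 /var/lib/docker | sort -h

# Remove leftover partial image downloads from the tmp directory
sudo rm -rf /var/lib/docker/tmp/*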
Google recently added support for GPUs in their cloud service.
I'm trying to follow the instructions found here to start a machine with a GPU. Running this script on Windows:
gcloud beta compute instances create gpu-instance-1^
--machine-type n1-standard-2^
--zone us-east1-d^
--accelerator type=nvidia-tesla-k80,count=1^
--image-family ubuntu-1604-lts^
--image-project ubuntu-os-cloud^
--maintenance-policy TERMINATE^
--restart-on-failure^
with gcloud command line tool version 146.0.0 fails, saying:
ERROR: (gcloud.beta.compute.instances.create) unknown collection [compute.acceleratorTypes]
Any ideas?
I was never able to get the gcloud utility working, but using the API did work. Of note: when posting the API request (the instructions are on the same page as the gcloud instructions, here), the key that creates an instance with a GPU is guestAccelerators. This key does not have an analogous option in gcloud.
Here is the API request as it appears on the instructions page linked above.
POST https://www.googleapis.com/compute/beta/projects/[PROJECT_ID]/zones/[ZONE]/instances?key={YOUR_API_KEY}
{
    "machineType": "https://www.googleapis.com/compute/beta/projects/[PROJECT_ID]/zones/[ZONE]/machineTypes/n1-highmem-2",
    "disks": [
        {
            "type": "PERSISTENT",
            "initializeParams": {
                "diskSizeGb": "[DISK_SIZE]",
                "sourceImage": "https://www.googleapis.com/compute/beta/projects/[IMAGE_PROJECT]/global/images/family/[IMAGE_FAMILY]"
            },
            "boot": true
        }
    ],
    "name": "[INSTANCE_NAME]",
    "networkInterfaces": [
        {
            "network": "https://www.googleapis.com/compute/beta/projects/[PROJECT_ID]/global/networks/[NETWORK]"
        }
    ],
    "guestAccelerators": [
        {
            "acceleratorCount": [ACCELERATOR_COUNT],
            "acceleratorType": "https://www.googleapis.com/compute/beta/projects/[PROJECT_ID]/zones/[ZONE]/acceleratorTypes/[ACCELERATOR_TYPE]"
        }
    ],
    "scheduling": {
        "onHostMaintenance": "terminate",
        "automaticRestart": true
    },
    "metadata": {
        "items": [
            {
                "key": "startup-script",
                "value": "[STARTUP_SCRIPT]"
            }
        ]
    }
}
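One way to send that request is with curl, using gcloud for the access token (assuming the body above is saved as gpu-instance.json; the file name and bracketed placeholders still need filling in):

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @gpu-instance.json \
    "https://www.googleapis.com/compute/beta/projects/[PROJECT_ID]/zones/[ZONE]/instances"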
Sometimes you need to ensure you have the latest version of the gcloud utility installed in order to use certain GCP features.
Try running the command below, or see the docs on how to update the gcloud utility:
gcloud components update
https://cloud.google.com/sdk/gcloud/reference/components/update
I have Packer configured to use the amazon-ebs builder to create a custom AMI from the Red Hat 6 image supplied by Red Hat. I'd really like Packer to post-process the custom AMI into a VirtualBox image for local testing. I've tried adding a simple post-processor to my Packer JSON as follows:
"post-processors": [
{
"type": "vagrant",
"keep_input_artifact": false
}
],
But all I end up with is a tiny .box file. When I add this to vagrant, it just seems to be a wrapper for my original AMI in Amazon:
$ vagrant box list
packer (aws, 0)
I was hoping to see something like this:
rhel66 (virtualbox, 0)
Can packer convert my AMI into a virtualbox image?
The post-processor in your example just wraps that image in a Vagrant box. The image was an AWS AMI, so no, it didn't change anything. To get a VirtualBox image you'd have to convert it.
Per the docs have you tried:
{
    "type": "virtualbox",
    "only": ["virtualbox-iso"],
    "artifact_type": "vagrant.box",
    "metadata": {
        "provider": "virtualbox",
        "version": "0.0.1"
    }
}
The above is untested. AWS provides some docs on exporting here.
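If you go the export route, the instance export call looks roughly like this (instance ID, bucket, and prefix are placeholders; VirtualBox isn't a direct target, so you would export a VMDK/OVA and import it into VirtualBox yourself):

# Export a running instance to S3 as an OVA containing a VMDK disk
aws ec2 create-instance-export-task \
    --instance-id i-1234567890abcdef0 \
    --target-environment vmware \
    --export-to-s3-task DiskImageFormat=VMDK,ContainerFormat=ova,S3Bucket=my-export-bucket,S3Prefix=exports/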