I have a question regarding Google Cloud custom images and whether/how credentials are stored in them. Namely, if I customize a VM and save the machine image with public access, am I possibly exposing credentials?
In particular, I'm working on a cloud-based application that relies on a "custom" image which has both gsutil and docker installed. Basic GCE VMs have gsutil pre-installed but do not have docker. On the other hand, the Container-Optimized OS has docker but does not have gsutil. Hence, I'm starting from a basic Debian image and installing docker to get what I need.
Ideally, when I distribute my application, I would like to just expose that customized image for public use; this way, users will not have to spend extra effort to make their own images.
My concern, however, is that since I have used gsutil on the customized VM, persisting this disk to an image might inadvertently save some credentials related to my project (if so, where are they stored?). Hence, anyone using my image would also get those credentials.
I tried to reproduce your situation. I created a custom image from the disk of an instance that could access my project's Storage buckets. Then, I shared the image with another user in a different project. The user could create an instance out of that shared image. However, when he tried to access my project's buckets, he encountered an AccessDeniedException error.
Based on this reproduction and my investigation, your credentials are not exposed with the image. IAM permissions are granted through roles assigned to a user, a group, or a service account; sharing an image does not grant those roles to anyone else.
Furthermore (as Patrick W mentioned below), anything you run from within a GCE VM instance will use the VM's service account (unless otherwise specified). As long as the service account has access to the bucket, so will your applications (including docker containers).
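As a quick way to see this for yourself, you can query the metadata server from inside the VM to check which service account gsutil (and anything else on the instance) is using. A minimal sketch in Python, assuming the requests library is available on the instance:

    # Query the GCE metadata server from inside the VM to see which
    # service account gsutil, gcloud and docker containers will use.
    # These credentials are fetched at runtime from the metadata server,
    # not stored on the boot disk, so they are not baked into an image
    # created from that disk.
    import requests

    METADATA_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                    "instance/service-accounts/default/")
    HEADERS = {"Metadata-Flavor": "Google"}

    email = requests.get(METADATA_URL + "email", headers=HEADERS).text
    scopes = requests.get(METADATA_URL + "scopes", headers=HEADERS).text

    print("Service account:", email)
    print("Access scopes:", scopes)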
Related
I know there is a good tutorial on how to create Jupyter notebooks on AWS SageMaker "the easy way".
Do you know if it is possible to allow 10 students who do not have AWS accounts to create and edit Jupyter notebooks?
Enabling multiple users to leverage the same notebook (in this case, without authentication) will involve managing your Security Groups to enable open access. You can filter that access to a known IP address range, for example if your students are accessing it from a classroom or campus network.
Tips for this are available in this answer and this page from the documentation, diving into network configurations for SageMaker hosted notebook instances.
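For the IP-range filtering mentioned above, here is a minimal boto3 sketch; the security group ID and CIDR range are placeholders you would replace with your own values:

    # Allow inbound HTTPS only from a known IP range (e.g. a classroom
    # or campus network) instead of opening the notebook to 0.0.0.0/0.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",  # placeholder security group ID
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "IpRanges": [{
                "CidrIp": "203.0.113.0/24",  # placeholder campus range
                "Description": "campus network",
            }],
        }],
    )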
As for enabling students to spin up their own notebooks, I'm not sure it's possible to enable completely unauthenticated AWS-level resource provisioning. However, once you've spun up a single managed notebook instance yourself, students can create their own notebooks directly in Jupyter from the browser after navigating to the publicly available IP. You may need to attach a new SageMaker IAM role that enables notebook creation (amongst other things, depending on the workload requirements). Depending on the computational needs (number, duration, and types of concurrent workloads), different combinations of instance count and instance type will be optimal for avoiding computational bottlenecks.
How do I export a Google Compute Engine instance image? I want to reuse the instance in another Google Cloud account. Is that possible?
Here's the documentation for exporting an image, and here for importing an image.
You can also share images across projects by granting the "compute.imageUser" IAM role, as explained in this article:
“For example, assume that a User A owns Project A and wants to create VM instances using images owned by Project B. The owner of Project B must grant User A the compute.imageUser role on Project B. This grants User A the ability to use the images from Project B to create instances in Project A.”
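For illustration, once the compute.imageUser role has been granted (for example via the console or gcloud), User A can reference Project B's image directly when creating an instance. A rough sketch with the google-cloud-compute Python client; project, zone, and image names are placeholders:

    # Create an instance in project-a from an image owned by project-b.
    # This assumes the caller has already been granted
    # roles/compute.imageUser on project-b (or on the image itself).
    from google.cloud import compute_v1

    instance = compute_v1.Instance(
        name="vm-from-shared-image",
        machine_type="zones/us-central1-a/machineTypes/e2-medium",
        disks=[compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/project-b/global/images/my-custom-image",
            ),
        )],
        network_interfaces=[compute_v1.NetworkInterface(
            network="global/networks/default",
        )],
    )

    client = compute_v1.InstancesClient()
    operation = client.insert(project="project-a", zone="us-central1-a",
                              instance_resource=instance)
    operation.result()  # wait for the instance to be created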
Yes it is possible.
First of all, an Instance and an Image are two different things: an Image is more like an installable package, and an Instance is the actual installed version.
What might be confusing here, with respect to Google Cloud, is the terminology. Google refers to creating an instance from an image as exporting an image, which is contrary to VMware or other hypervisor-based virtualization terminology, where creating an instance is done by importing a VM (template), and exporting a VM is the reverse of installation, i.e. creating a re-loadable package from an installed VM.
In Google Cloud, the process of creating an image from a VM instance is simply called creating an image, and it is documented here.
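As a rough sketch of that step with the google-cloud-compute Python client (project, zone, disk, and image names are placeholders; ideally stop the instance or snapshot the disk first so the image is consistent):

    # Create a reusable image from the boot disk of an existing instance.
    # The resulting image can then be shared with another project or
    # exported for use in another account.
    from google.cloud import compute_v1

    image = compute_v1.Image(
        name="my-reusable-image",
        source_disk="projects/my-project/zones/us-central1-a/disks/my-instance-disk",
    )

    client = compute_v1.ImagesClient()
    operation = client.insert(project="my-project", image_resource=image)
    operation.result()  # wait for the image to be created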
Short background: we're a small business but our clients are much larger businesses. We have some software they subscribe to, which is deployed to AWS Elastic Beanstalk. Clients have their own devops teams, unlike us, and will need to manage some of the technical support. They will need access to the AWS account running the software so they can do things like reboot the server, clear the database if they screw it up, change the EC2 instance type, etc. This is OK, but we want to prevent the software from being downloaded outside of the AWS account.
The software is a Java WAR running on Tomcat on a single Elastic Beanstalk instance. We only care about limiting access to the WAR file (not the database, for example).
The beanstalk application versions page appears to have no way to download the WAR file, which is good. They could SSH into the underlying EC2 instance, though, so presumably they could just copy the WAR out of the Tomcat directory. Given the complexity of AWS, there are probably other ways they could get access to the WAR file too (e.g. clone the EBS volume and attach it to another EC2 instance).
I assume that the machine images available for purchase via the AWS Marketplace must have some form of copy protection, but I've not been able to find any details on this. Also, it looks like AWS only accepts marketplace vendors who are much larger than us, so the marketplace option may not be open to us.
Any idea how I could prevent access to the WAR file running on elastic beanstalk while still allowing the client access to the AWS account? (Or at least make access hard).
The only solution that comes to mind would be removing any EC2 SSH key pairs from the account and specifically denying them access to ec2:CreateKeyPair. Really, what you need to be doing is granting them least-privilege access to the account, that is, specifically granting them access only to those actions they absolutely need.
This will go a long way, but against someone with sufficient knowledge of AWS it's going to be an uphill battle to ensure you give them enough access to do what they need while not giving them more than you want. I'd question whether a legal option (contracts, licenses, etc.) would be better protection for this.
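As a concrete example of that explicit deny, here is a minimal boto3 sketch that creates a customer-managed policy you could attach to the client's users or roles; the policy name is a placeholder, and in practice you would pair it with narrowly scoped allow statements:

    # Explicitly deny actions that would let a user add their own SSH
    # key pairs; attach this alongside a least-privilege allow policy.
    import json
    import boto3

    iam = boto3.client("iam")

    policy_document = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": [
                "ec2:CreateKeyPair",
                "ec2:ImportKeyPair",
            ],
            "Resource": "*",
        }],
    }

    iam.create_policy(
        PolicyName="deny-ssh-keypair-creation",  # placeholder name
        PolicyDocument=json.dumps(policy_document),
    )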
I'm planning to start a small business and submit a Linux AMI to Amazon's AWS Marketplace. As I'm reading the seller's guide, I see this:
"AMIs MUST allow OS-level administration capabilities to allow for compliance requirements, vulnerability updates and log file access. For Linux-based AMIs this is through SSH." (6.2.2)
How can I protect my source code if anyone who uses my product can SSH to the machine and poke around? Can I lock down certain folders yet still allow "os-level administration"?
Here is a bit of context if needed:
I'm using Ubuntu Server 16.04 LTS (HVM), SSD Volume Type (ami-cd0f5cb6) as my base AMI
I'm provisioning a slightly modified MySQL database that I want my customers to be able to access. This is their primary way of interacting with my service.
I'm building a django web service that will come packaged on the AMI. This is what I'd like to lock down and prevent access to.
Whether or not you provide SSH access, it will always be possible for your users to mount the root EBS volume of your AMI on another EC2 instance to investigate its contents, so disabling SSH or making certain files unreadable for an SSH user doesn't help you in this regard.
Instead of trying to keep users away from your source code, I suggest you simply state clearly in the terms of service what users are allowed to do with it and what not.
Even large companies provide OS-images which contain the source code of their applications (whenever they use a scripting language) in clear form or just slightly obfuscated.
I'm using Packer, and I'm new to creating machine images. Although I've created and deployed Docker containers before.
One concept I've found useful with Docker images and would like to apply to machine image building is using the exact same image for staging/testing as the one deployed to production. The different environments then behave differently because different environment variable values are passed in on startup, which in the case of Docker containers is often handled by a startup script ("entrypoint" in Docker terminology).
This has worked fine for me, but now I need to handle SSL certificates (actual files) being different between staging and production. In the case of Docker containers, you could just mount different volumes to the container. But I can't do that with machine images.
So how do people handle this scenario with machine images? Should I store these important files encrypted externally, and curl them in a startup script?
You could consider using a configuration management tool such as Ansible or Puppet to do any environment- or host-specific configuration you need once Packer has built the bulk of the image.
Alternatively, you could do as you mentioned and simply have a startup script curl the appropriate SSL certs (or any other environment-specific files/config) from some location. Since you've tagged your question with amazon-web-services, you could use separate private S3 buckets for staging and production and only allow certain instances access to the relevant bucket via IAM roles. That protects the data from being viewed by others or by the wrong environment, and also reduces the need to encrypt the data and then manage keys as well.
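A minimal sketch of that startup-time fetch using boto3 rather than curl, relying on the instance's IAM role for access; the bucket name, object keys, and DEPLOY_ENV variable are placeholders you would adapt to your setup:

    # On boot, pull the SSL certs for this environment from a private,
    # environment-specific S3 bucket. Access comes from the instance's
    # IAM role, so no credentials are baked into the image.
    import os
    import boto3

    ENVIRONMENT = os.environ.get("DEPLOY_ENV", "staging")  # e.g. set via user data
    BUCKET = f"my-app-{ENVIRONMENT}-secrets"                # placeholder bucket name
    DEST_DIR = "/etc/ssl/myapp"

    os.makedirs(DEST_DIR, exist_ok=True)
    s3 = boto3.client("s3")
    for key in ("ssl/server.crt", "ssl/server.key"):        # placeholder object keys
        s3.download_file(BUCKET, key,
                         os.path.join(DEST_DIR, os.path.basename(key)))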
When you launch EC2 instances using your AMI, you can specify tags. Inside the instances you can use the AWS CLI to read these tags, so you can craft a script that runs when the system starts and loads whatever external files you want based on the tag values (as #ydaetskcoR suggested, from a private S3 bucket).
This is also useful: Find out the instance id from within an ec2 machine
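Putting the two together, a small sketch that reads the instance ID from the metadata service and then looks up this instance's tags with boto3; the "environment" tag key is a placeholder, and instances enforcing IMDSv2 additionally require a session token for the metadata call:

    # Discover this instance's ID via the EC2 metadata service, then read
    # its tags so a startup script can branch on e.g. an "environment" tag.
    import boto3
    import requests

    instance_id = requests.get(
        "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
    ).text

    ec2 = boto3.client("ec2", region_name="us-east-1")
    response = ec2.describe_tags(
        Filters=[{"Name": "resource-id", "Values": [instance_id]}]
    )
    tags = {t["Key"]: t["Value"] for t in response["Tags"]}

    print("environment tag:", tags.get("environment"))  # placeholder tag key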