SageMaker Studio domain creation: is it possible to mount a previous EFS?

Changing some settings currently requires recreating the domain. However, I have an existing EFS volume that I would like to retain. Is it possible to remount this EFS directory to the new domain?

It is not currently possible to attach an existing EFS to a new domain. If you have multiple users and data, I'd recommend the steps in the Backup and recovery section here.
If it is only a handful of user profiles, you can simply download the files or use S3 as intermediate storage.
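If you go the S3 route, a rough sketch of the copy looks like the following; the EFS file system ID, region, bucket name, and numeric home directory are all placeholders you'd replace with your own values:
# On a helper EC2 instance in the same VPC: mount the old domain's EFS and push a user's home to S3.
sudo mkdir -p /mnt/old-efs
sudo mount -t nfs4 -o nfsvers=4.1 fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /mnt/old-efs
aws s3 sync /mnt/old-efs/200005 s3://my-studio-backup/200005
# Later, from a terminal inside the new domain (logged in as the matching user), pull it back down:
aws s3 sync s3://my-studio-backup/200005 ~/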

Related

Identifying user from AWS Sagemaker Studio generated EFS storage

When a SageMaker Studio domain is created, an EFS volume is associated with the domain. As the assigned users log into SageMaker Studio, a corresponding home directory is created for each of them.
Using a separate EC2 instance, I mounted the EFS storage that was created, to see whether it is possible to look at each of the individual home directories. I noticed that each of these home directories is named with a number (e.g. 200000, 200005). Is there a specific rule for how these folders are named? Is it possible to trace the folders back to a particular user, or is this obfuscation by design?
(I am currently doing this exploration on my personal AWS account.)
Yes. If you list and describe the domain's user profiles, you'll get back each user's HomeEfsFileSystemUid value.
Here's a CLI example:
aws sagemaker describe-user-profile --domain-id d-lcn1vbt47yku --user-profile-name default-1588670743757
{
    ...
    "UserProfileName": "default-1588670743757",
    "HomeEfsFileSystemUid": "200005",
    ...
}
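If you want the mapping for every user in the domain at once, a small loop over list-user-profiles works too; the domain id below is just the one from the example above:
for profile in $(aws sagemaker list-user-profiles \
      --domain-id-equals d-lcn1vbt47yku \
      --query 'UserProfiles[].UserProfileName' --output text); do
  # Print "user-profile-name  home-efs-uid" for each profile.
  aws sagemaker describe-user-profile \
      --domain-id d-lcn1vbt47yku --user-profile-name "$profile" \
      --query '[UserProfileName, HomeEfsFileSystemUid]' --output text
done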

How can we share files between multiple Windows VMs in GCP?

I have 10 Windows VMs, and I want a persistent disk (PD) mounted read-write on all of them. But I came to know that we cannot mount a disk to multiple VMs with read-write, so I am looking for an option where I can access a disk from any of those VMs. For Linux we can use GCSFuse to mount Cloud Storage as a disk; do we have any option for Windows where we can mount a single disk / Cloud Storage bucket on multiple Windows VMs?
If you want it specifically to be a GCP disk, your best option will be to set up an additional Windows instance and share an SMB disk with the other instances.
Another option, if you don't want to get too messy, would be the Filestore service ( https://cloud.google.com/filestore/ ), which is NFS as a service, provided you have an NFS client for your Windows version.
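A rough sketch of the Filestore approach, assuming a Windows Server image where the NFS client feature is available (the instance name, zone, tier, capacity, and the share's IP are placeholders, and flags can differ between gcloud versions):
# From Cloud Shell or your workstation: create the Filestore share.
gcloud filestore instances create shared-fs \
    --zone=us-central1-a --tier=BASIC_HDD \
    --file-share=name=vol1,capacity=1TB --network=name=default
# On each Windows Server VM (elevated PowerShell): enable the NFS client, then mount the share.
Install-WindowsFeature -Name NFS-Client
mount -o anon \\10.0.0.2\vol1 Z: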
I believe you could use Google Cloud Storage buckets, which could be an intermediate transfer point between your instances, regardless of OS.
Upload your files from your workstation to a Cloud Storage bucket. Then, download those files from the bucket to your instances. When you need to transfer files in the other direction, reverse the process. Upload the files from your instance and then download those files to your workstation.
To achieve this, follow these steps:
1. Create a new Cloud Storage bucket, or identify an existing bucket that you want to use to transfer files.
2. Upload files to the bucket.
3. Connect to your instance using RDP.
4. Upload/download files from the bucket.
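A minimal sketch of those steps with gsutil, assuming the Cloud SDK is installed on both your workstation and the VMs (the bucket name and paths are made up):
# On your workstation: create the transfer bucket and upload the files.
gsutil mb gs://my-transfer-bucket
gsutil -m cp -r ./files gs://my-transfer-bucket/
# Inside a Windows VM (over RDP): download them.
gsutil -m cp -r gs://my-transfer-bucket/files C:\data\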
However, there are other options, like running file servers on Compute Engine, or the following:
Cloud Storage
Compute Engine persistent disks
Single Node File Server
Elastifile
Quobyte
Avere vFXT
These options have their advantages and disadvantages; for more details, see the links attached to each of these options.

Setting up AWS for data processing: S3 or EBS?

Hey there, I am new to AWS and trying to piece together the best way to do this.
I have thousands of photos I'd like to upload and process on AWS. The software is Agisoft Photoscan, and it runs in stages. For the first stage I'd like to use an instance geared towards CPU/memory usage, and for the second stage one geared towards GPU/memory.
What is the best way to do this? Do I create a new volume for each project in EC2 and attach that volume to each instance when I need to? I see people saying to use S3, do I just create a bucket for each project and then attach the bucket to my instances?
Sorry for the basic questions; the more I read, the more questions I seem to have.
I'd recommend starting with S3 and seeing if it works - it will be cheaper and easier to set up. Switch to EBS volumes if you need to, but I doubt you will need to.
You could create a bucket for each project, or you could just create one bucket and segregate the images based on a file-name prefix (i.e. project1-image001.jpg).
You don't 'attach' buckets to EC2, but you should assign an IAM role to the instances as you create them, and then you can grant that IAM role permissions to access the S3 bucket(s) of your choice.
Since you don't have a lot of AWS experience, keep things simple, and using S3 is about as simple as it gets.
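For example, with a per-project prefix the round trip could look like this (the bucket name and paths are placeholders, and the instance-side command assumes the IAM role mentioned above is attached):
# From your workstation: push a project's photos up under its own prefix.
aws s3 sync ./project1/photos s3://my-photoscan-data/project1/photos/
# On the processing instance: pull them down before running Photoscan.
aws s3 sync s3://my-photoscan-data/project1/photos/ /data/project1/photos/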
You can go with AWS S3 to upload the photos; AWS S3 is similar to Google Drive.
If you want to use AWS EBS volumes instead of S3, the problems you may face are:
An EBS volume is accessible within a single Availability Zone, not across the region, which means you have to create snapshots to move it to another Availability Zone. S3, on the other hand, is global.
EBS volumes are not designed for storing multimedia files; an EBS volume is like a hard drive, and once you launch an EC2 instance you need to attach the volume to it.
As per best practice, use AWS S3.
Based on your use case, you can create a bucket for each project, or you can use a single bucket with multiple folders to identify the projects.
Create an AWS IAM role with S3 access permissions and attach it to the EC2 instance. There is no need to use AWS credentials in the project: the EC2 instance will use the role to access S3, and since the role doesn't have permanent credentials, they keep rotating automatically.
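A hedged sketch of wiring that role up from the CLI (the role, profile, and instance names are placeholders; trust-policy.json must allow ec2.amazonaws.com to assume the role, and you'd normally scope the policy to your bucket rather than use the broad managed one):
aws iam create-role --role-name photoscan-s3-role \
    --assume-role-policy-document file://trust-policy.json
aws iam attach-role-policy --role-name photoscan-s3-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
aws iam create-instance-profile --instance-profile-name photoscan-s3-profile
aws iam add-role-to-instance-profile --instance-profile-name photoscan-s3-profile \
    --role-name photoscan-s3-role
aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=photoscan-s3-profile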

Is it possible to use S3 buckets to create and grant admin privileges on different directories in my EC2 instance?

I have an EC2 instance that I use as sort of a staging environment for small websites and custom WordPress sites.
What I'm trying to find out is: can I create a bucket for /var/www/html/site1 and assign FTP access to Developer X to work on this particular site within this particular bucket?
No. Directories on your EC2 instance have no relationship with S3.*
If you want to set up permissions for files stored on your EC2 instance, you'll have to do it by making software configuration changes on that instance, just as if it were any other Linux-based server.
*: Assuming you haven't set up something weird like s3fs, which I assume isn't the case here.
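On the instance itself, a plain-Linux sketch of the kind of per-site access you're after might look like this (the user, group, and path are hypothetical, and SFTP over SSH stands in for plain FTP):
# Create the developer's account and a group that owns just this site.
sudo adduser devx
sudo groupadd site1
sudo usermod -aG site1 devx
# Give the group write access to the one docroot, with setgid so new files inherit the group.
sudo chgrp -R site1 /var/www/html/site1
sudo chmod -R 2775 /var/www/html/site1
Developer X can then connect over SFTP and only write inside /var/www/html/site1, with no S3 involved at all.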

Shared File Systems between multiple AWS EC2 instances

I have a couple of Windows Server instances running on Amazon EC2 and would like to make them a bit more fault tolerant by running a duplicate instance behind load balancers.
The problem is the instance-specific data: for example, it does no good to fail over from one web server to another if the contents of the document root, i.e. C:/htdocs/ (Apache) or C:/Repositories (VisualSVN Server), are not identical.
Is there a way to share a volume across two or more instances?
My idea is to share a folder between EC2 instances:
I read it's not possible to attach the same EBS volume to multiple instances. I believe AWS is not NFS-friendly either, in case I want to mount them over NFS.
And finally, I've also checked an S3 bucket mounted with s3fs, but I found out it's not a good option either.
Can anyone help point me in the right direction?
You are right, at the moment it is not possible to add an EBS volume to multiple instances. To create a common storage for all instances, there are options like NFS, mounting S3 buckets or using a distributed cluster filesystem like GlusterFS.
However, in most cases you can simplify your setup. Try to offload static assets to another (static) domain, or even host them on a website-enabled S3 bucket. This way you only have to care about the dynamic application logic or scripts on your app servers.
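If you go the website-enabled-bucket route, a minimal sketch looks like this (the bucket name is a placeholder, and whether public-read ACLs work depends on the bucket's public access settings):
aws s3 mb s3://example-static-assets
aws s3 website s3://example-static-assets --index-document index.html --error-document error.html
aws s3 sync ./static s3://example-static-assets/ --acl public-read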
Also try to use some automated deployment and/or configuration management tools. With these you can for example create new machines easily, or you can use them to deploy the latest code on your machines.