How (or can) I backup a Google Artifact Registry instance? - google-cloud-platform

I'm looking at moving to using Artifact Registry as a replacement for Artifactory and I wondering how, or if, I would backup the stored files to guard against a server failure?
Maybe that's not a thing I need to worry about with GCP?

Artifact Registry is a managed Service. you don't have access to the underlying Infrastructure powering the service so you have to worry about backup.
You can read more about this here https://cloud.google.com/artifact-registry/docs/repositories/repo-locations

Related

How to Deploy a docker container with volume in Cloud Run

I am trying to publish an application I wrote in .NET Core with docker and a mounted volume. I can't really figure out or see any clear solution to my issue that will be cheap (Its for a university project.)
I tried running a docker-compose via a cloudbuild.yml linked in this post with no luck, also tried to put my dbfile in a firebase project and tried to access it via the program but it didn't work. I also read in the GCP documentation that i can probably use Filestore but the pricing is way out of budget for me. I need to publish an SQLite so my server can work correctly, that's it.
Any help would be really apreciated!
Basically, you can't mount volume in Cloud Run. It's a stateless environment and you can't persist data on it. You have to use external storage to persist your data. See the runtime contract
WIth the 2nd execution runtime environment, you can now mount Cloud Storage bucket with GCSFuse, and Filestore path with NFS

Google Container Registry : Prevent a group of users to push reserved tag names

I need to restrict some users to push 'latest' or 'master' tags to a shared GCR repository, only automated process like jenkins should be able to push this tags, is that possible?
Is there a way to do this like AWS Iam policies and conditions?
I think not but it's an interesting question.
I wondered whether IAM conditions could be used but neither Container Registry nor Artifact Registry are resources that accept conditional bindings.
Container Registry uses Cloud Storage and Cloud Storage is a resource type that accepts bindings (albeit only on buckets). However, I think tags aren't manifest (no pun intended) at the GCS level.
One way to approach this would be limit container pushes to your automated processes and then add some process (workflow) in which developers can request restricted tags and have these applied only after approval.
Another approach would be to audit changes to the registry.
Google Artifact Registry (GAR) is positioned as a "next generation" (eventual replacement?) of GCR. With it, you can have multiple repositories within a project that could be used as a way to provide "free-for-all" and "restricted" repositories. I think (!?) even with GAR, you are unable to limit pushes by tag.
You could submit a feature request on Google's Issue Tracker for registries but, given Google's no new features on GCR, you may be out of luck.

How to backup ETCD of google kubernetes engine?

For google kubernetes engine, the master node, and ETCD cluster is abstracted away from the me the user.
Most of the ETCD backup guide (such as) assumes I have the endpoint or file system access to perform backups respectively.
As such - how do I perform such a backup, and restoration of ETCD in GKE?
Or would GKE provide subsequently a managed backup/restore service similar to cloud SQL?
Also if a full backup is not possible, even namespace backups will be great.
To clarify the scenario to guard against is not "if google goes down", but "if we do something stupid"
GKE backend is completely managed and thus, there is no way to access the etcd API. Even if you could access the cluster etcd, there are no guarantees of backwards compatibility for the storage backend. So the storage layer could change.
You'll have to use the kubernetes API which is backwards compatible for any backups you might want. There is some discussion on the kubernetes users google group here which should clarify this further.

Prevent an elastic beanstalk app being downloaded using the AWS account?

Short background, we're a small business but our clients are much larger businesses. We have some software they subscribe to which is deployed to AWS elastic beanstalk. Clients have their own devops teams, unlike us, and will need to manage some of the technical support. They will need access to the AWS account running the software, so they can do things like reboot the server, clear the database if they screw it up, change the EC2 instance type etc. This is OK but we want to prevent the software being downloaded outside of the AWS account.
The software is a java WAR running on Tomcat, on a single elastic beanstalk instance. We only care about limiting access to the WAR file (not the database for example).
The beanstalk application versions page appears to have no way to download the WAR file - which is good. They could SSH into the underlying EC2 instance though so presumably they could just copy the WAR out of the tomcat directory. Given the complexity of AWS there's probably other ways they could get access the WAR file too (e.g. clone the EBS volume and attach to another EC2 instance).
I assume that the machine instances available for purchase via AWS marketplace must have some form of copy protection but I've not been able to find any details on this. Also it looks like AWS only accepts marketplace vendors who are much larger than us, so marketplace option may not be open to us.
Any idea how I could prevent access to the WAR file running on elastic beanstalk while still allowing the client access to the AWS account? (Or at least make access hard).
The only solution that comes to mind for this would be, removing any EC2 SSH Key Pairs from the account, and specifically denying them access to ec2:CreateKeyPair. Really, what you need to be doing is granting them least privilege access to the account, that is, specifically granting them access only to those actions they absolutely need.
This will go a long way, but with sufficient knowledge of AWS, it's going to be an uphill battle trying to ensure that you give them enough access to do what they need, while not giving them more than you want. I'd question if a legal option (like contracts, licenses, etc) would be a better protection for this.

How to setup shared persistent storage for multiple AWS EC2 instances?

I have a service hosted on Amazon Web Services. There I have multiple EC2 instances running with the exact same setup and data, managed by an Elastic Load Balancer and scaling groups.
Those instances are web servers running web applications based on PHP. So currently there are the very same files etc. placed on every instance. But when the ELB / scaling group launches a new instance based on load rules etc., the files might not be up-to-date.
Additionally, I'd rather like to use a shared file system for PHP sessions etc. than sticky sessions.
So, my question is, for those reasons and maybe more coming up in the future, I would like to have a shared file system entity which I can attach to my EC2 instances.
What way would you suggest to resolve this? Are there any solutions offered by AWS directly so I can rely on their services rather than doing it on my on with a DRBD and so on? What is the easiest approach? DRBD, NFS, ...? Is S3 also feasible for those intends?
Thanks in advance.
As mentioned in a comment, AWS has announced EFS (http://aws.amazon.com/efs/) a shared network file system. It is currently in very limited preview, but based on previous AWS services I would hope to see it generally available in the next few months.
In the meantime there are a couple of third party shared file system solutions for AWS such as SoftNAS https://aws.amazon.com/marketplace/pp/B00PJ9FGVU/ref=srh_res_product_title?ie=UTF8&sr=0-3&qid=1432203627313
S3 is possible but not always ideal, the main blocker being it does not natively support any filesystem protocols, instead all interactions need to be via an AWS API or via http calls. Additionally when looking at using it for session stores the 'eventually consistent' model will likely cause issues.
That being said - if all you need is updated resources, you could create a simple script to run either as a cron or on startup that downloads the files from s3.
Finally in the case of static resources like css/images don't store them on your webserver in the first place - there are plenty of articles covering the benefit of storing and accessing static web resources directly from s3 while keeping the dynamic stuff on your server.
From what we can tell at this point, EFS is expected to provide basic NFS file sharing on SSD-backed storage. Once available, it will be a v1.0 proprietary file system. There is no encryption and its AWS-only. The data is completely under AWS control.
SoftNAS is a mature, proven advanced ZFS-based NAS Filer that is full-featured, including encrypted EBS and S3 storage, storage snapshots for data protection, writable clones for DevOps and QA testing, RAM and SSD caching for maximum IOPS and throughput, deduplication and compression, cross-zone HA and a 100% up-time SLA. It supports NFS with LDAP and Active Directory authentication, CIFS/SMB with AD users/groups, iSCSI multi-pathing, FTP and (soon) AFP. SoftNAS instances and all storage is completely under your control and you have complete control of the EBS and S3 encryption and keys (you can use EBS encryption or any Linux compatible encryption and key management approach you prefer or require).
The ZFS filesystem is a proven filesystem that is trusted by thousands of enterprises globally. Customers are running more than 600 million files in production on SoftNAS today - ZFS is capable of scaling into the billions.
SoftNAS is cross-platform, and runs on cloud platforms other than AWS, including Azure, CenturyLink Cloud, Faction cloud, VMware vSPhere/ESXi, VMware vCloud Air and Hyper-V, so your data is not limited or locked into AWS. More platforms are planned. It provides cross-platform replication, making it easy to migrate data between any supported public cloud, private cloud, or premise-based data center.
SoftNAS is backed by industry-leading technical support from cloud storage specialists (it's all we do), something you may need or want.
Those are some of the more noteworthy differences between EFS and SoftNAS. For a more detailed comparison chart:
https://www.softnas.com/wp/nas-storage/softnas-cloud-aws-nfs-cifs/how-does-it-compare/
If you are willing to roll your own HA NFS cluster, and be responsible for its care, feeding and support, then you can use Linux and DRBD/corosync or any number of other Linux clustering approaches. You will have to support it yourself and be responsible for whatever happens.
There's also GlusterFS. It does well up to 250,000 files (in our testing) and has been observed to suffer from an IOPS brownout when approaching 1 million files, and IOPS blackouts above 1 million files (according to customers who have used it). For smaller deployments it reportedly works reasonably well.
Hope that helps.
CTO - SoftNAS
For keeping your webserver sessions in sync you can easily switch to Redis or Memcached as your session handler. This is a simple setting in the PHP.ini and they can all access the same Redis or Memcached server to do sessions. You can use Amazon's Elasticache which will manage the Redis or Memcache instance for you.
http://phpave.com/redis-as-a-php-session-handler/ <- explains how to setup Redis with PHP pretty easily
For keeping your files in sync is a little bit more complicated.
How to I push new code changes to all my webservers?
You could use Git. When you deploy you can setup multiple servers and it will push your branch (master) to the multiple servers. So every new build goes out to all webserver.
What about new machines that launch?
I would setup new machines to run a rsync script from a trusted source, your master web server. That way they sync their web folders with the master when they boot and would be identical even if the AMI had old web files in it.
What about files that change and need to be live updated?
Store any user uploaded files in S3. So if user uploads a document on Server 1 then the file is stored in s3 and location is stored in a database. Then if a different user is on server 2 he can see the same file and access it as if it was on server 2. The file would be retrieved from s3 and served to the client.
GlusterFS is also an open source distributed file system used by many to create shared storage across EC2 instances
Until Amazon EFS hits production the best approach in my opinion is to build a storage backend exporting NFS from EC2 instances, maybe using Pacemaker/Corosync to achieve HA.
You could create an EBS volume that stores the files and instruct Pacemaker to umount/dettach and then attach/mount the EBS volume to the healthy NFS cluster node.
Hi we currently use a product called SoftNAS in our AWS environment. It allows us to chooses between both EBS and S3 backed storage. It has built in replication as well as a high availability option. May be something you can check out. I believe they offer a free trial you can try out on AWS
We are using ObjectiveFS and it is working well for us. It uses S3 for storage and is straight forward to set up.
They've also written a doc on how to share files between EC2 instances.
http://objectivefs.com/howto/how-to-share-files-between-ec2-instances