How to ensure consistency between multiple GCP Cloud Memorystore instances?

I have my application caching some data in Cloud Memorystore. The application has multiple instances running in the same region: AppInstanceA caches to MemStoreA and AppInstanceB caches to MemStoreB.
A particular user action in the app should trigger cache evictions.
Is there an option in GCP to evict the entries on both MemStoreA and MemStoreB, regardless of which app instance the action is triggered from?
Thanks

You can use Pub/Sub for this:
Create a topic.
Publish to the topic whenever you have a key to invalidate.
Create one subscription per Memorystore instance.
Attach one function (the same function every time) per subscription, with an environment variable that specifies which instance to use.
This way, the functions are triggered in parallel, and you can expect the key to be invalidated in all Memorystore instances at roughly the same time.
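A minimal sketch of such a function, assuming a Python Cloud Function with the redis-py package; the invalidate entry point and the REDIS_HOST/REDIS_PORT environment variables are illustrative names, and the function also needs a Serverless VPC Access connector to reach Memorystore:

```python
# Sketch: one deployment of this function per Memorystore (Redis) instance,
# each subscription pointing at a deployment whose REDIS_HOST env var
# names its own instance. Names here are illustrative.
import base64
import os

import redis  # redis-py, assumed listed in requirements.txt

client = redis.Redis(
    host=os.environ["REDIS_HOST"],
    port=int(os.environ.get("REDIS_PORT", "6379")),
)

def invalidate(event, context):
    """Pub/Sub-triggered entry point; the message payload is the key to evict."""
    key = base64.b64decode(event["data"]).decode("utf-8")
    client.delete(key)
```

Publishing the key to the topic from the application then fans the eviction out to every subscription, and therefore to every instance.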

Related

How to configure automatic failover of storage account when using AzureWebJobsStorage for Web Job Timer Triggers

I have an Azure Web Job that runs on a TimerTrigger to put some messages on a Service Bus queue. I have deployed this Web Job on 2 separate regions, for high availability in case one region goes down. As per https://github.com/Azure/azure-webjobs-sdk-extensions/wiki/TimerTrigger, I can see that the distributed lock mechanism is working perfectly and the timer is only executing in one region at a time, so there are no duplicate requests coming through.
However, the Web Jobs in both regions are using the same common storage account, and the storage account is deployed to just one region. I can't use 2 separate storage accounts, because then I lose the distributed lock functionality. I know that Azure provides geo-redundant storage for my storage account, so the data is replicated to a secondary region.
My question is - in the event of a disaster in one region (specifically the primary region of the storage account), is there a way to have the web job automatically fail over to the secondary endpoint? Right now, I have the "AzureWebJobsStorage" application setting specified as one of the shared access keys of the storage account.
Appreciate any pointers!
I'm not an expert on the storage SDK, but I've linked a few docs below that may help walk you through making your app highly available.
https://learn.microsoft.com/en-us/azure/storage/common/storage-disaster-recovery-guidance?toc=/azure/storage/blobs/toc.json
https://learn.microsoft.com/en-us/azure/storage/common/geo-redundant-design?tabs=legacy
https://learn.microsoft.com/en-us/azure/storage/blobs/storage-create-geo-redundant-storage?tabs=dotnet11
The caveat with geo-redundant storage is that the secondary endpoint is read-only until a failover is initiated. That said, I did find the GeoRedundantSecondaryUri property on BlobClientOptions, which uses the secondary address as part of a retry policy.
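The answer above refers to the .NET SDK; as a rough sketch of the same idea with the Python azure-storage-blob v12 package (assuming its secondary_hostname keyword, with placeholder account name and key):

```python
# Sketch: a blob client that can retry reads against the RA-GRS secondary.
# The account name and key are placeholders; RA-GRS secondary endpoints
# follow the "<account>-secondary" hostname pattern.
from azure.storage.blob import BlobServiceClient

account = "mystorageaccount"  # placeholder
client = BlobServiceClient(
    account_url=f"https://{account}.blob.core.windows.net",
    credential="<account-key>",  # placeholder shared key
    secondary_hostname=f"{account}-secondary.blob.core.windows.net",
)
```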

Caching results of a lambda function

We are developing a serverless application. The application has "users" that get added, "groups" that have "permissions" on different "resources". To check if a user has permission to do an action on a resource, there would be some calculations we will need to do. (We are using DynamoDB)
Basically, before every action, we will need to check if the user has permission to do that particular action on the given resource. I was thinking we could have a lambda function that checks that from a cache and If not in the cache, hits the DB, does the calculation, writes in the cache, and returns.
What kind of cache would be best to use here? We are going to be calling this internally from the backend itself.
Is API Gateway still the way to go?
How about ElastiCache for this purpose? Can we use it without having to configure a VPC? We are trying not to use a VPC in our application.
Any better ways?
They are all good options!
ElastiCache is designed for caching data. API Gateway can also cache results.
An alternative is to keep the data "inside" the AWS Lambda function by using global variables. The values will remain present the next time the Lambda function is invoked, so you could cache results and an expiry time. Note, however, that Lambda might launch multiple containers if the function is frequently run (even in parallel), or not run for some time. Therefore, you might end up with multiple caches.
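A minimal sketch of that global-variable approach, where has_permission, check_permission_in_db, and the event fields are hypothetical stand-ins for your actual permission logic:

```python
# Sketch of caching in the Lambda execution environment's global scope.
# _CACHE survives across warm invocations of the same container.
import time

_CACHE = {}          # (user_id, resource_id) -> (allowed, expiry_timestamp)
_TTL_SECONDS = 300   # tune to how stale a permission check may be

def check_permission_in_db(user_id, resource_id):
    """Placeholder for the DynamoDB lookup + permission calculation."""
    raise NotImplementedError

def has_permission(user_id, resource_id):
    key = (user_id, resource_id)
    entry = _CACHE.get(key)
    if entry and entry[1] > time.time():
        return entry[0]  # cache hit, still fresh
    allowed = check_permission_in_db(user_id, resource_id)
    _CACHE[key] = (allowed, time.time() + _TTL_SECONDS)
    return allowed

def handler(event, context):
    # Hypothetical event shape.
    if not has_permission(event["userId"], event["resourceId"]):
        return {"statusCode": 403}
    return {"statusCode": 200}
```

The caveat from the answer applies: each warm container keeps its own _CACHE, so a permission revoked in the database can stay cached until the TTL expires in every container.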
I'd say the simplest option would be API Gateway's cache.
Where is the permissions map (user <-> resource) stored?
This AWS blog post might be interesting (it's about caching in the Lambda execution environment's memory), since you could use a DynamoDB table for that.

Is there a way to turn on SageMaker model endpoints only when I am receiving inference requests

I have created a model endpoint which is InService and deployed on an ml.m4.xlarge instance. I am also using API Gateway to create a RESTful API.
Questions:
Is it possible to have my model endpoint InService (or on standby) only when I receive inference requests? Maybe by writing a Lambda function or something that turns off the endpoint (so that it does not keep accumulating the per-hour charges).
If Q1 is possible, would this cause some weird latency issues for end users? It usually takes a couple of minutes for model endpoints to be created when I configure them for the first time.
If Q1 is not possible, how would choosing a cheaper instance type affect the time it takes to perform inference (say I'm only using the endpoints for an application that has a low number of users)?
I am aware of this site that compares different instance types (https://aws.amazon.com/sagemaker/pricing/instance-types/)
But does "moderate" network performance mean that real-time inference may take longer?
Any recommendations are much appreciated. The goal is not to burn money when users are not requesting predictions.
How large is your model? If it is under the 50 MB size limit required by AWS Lambda and the dependencies are small enough, there could be a way to rely directly on Lambda as an execution engine.
If your model is larger than 50 MB, there might still be a way to run it by storing it on EFS. See EFS for Lambda.
If you're willing to wait 5-10 minutes for SageMaker to launch the endpoint, you can accomplish this by doing the following:
1. Set up a Lambda function (or create a method in an existing function) to check your endpoint status when the API is called. If the status != 'InService', call the function in #2.
2. Create another method that, when called, launches your endpoint and creates a metric alarm in CloudWatch to monitor your primary Lambda function's invocations. When the invocations fall below your desired threshold per period, the alarm calls the function in #3.
3. Create a third method to delete your endpoint and the alarm when called. Technically, the alarm can't call a Lambda function directly, so you'll need to create a topic in SNS and subscribe this function to it.
A boto3 sketch of these three methods follows.
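This is a rough sketch, not a drop-in implementation: the endpoint, endpoint config, alarm, function, and topic names are all placeholders, and the endpoint config is assumed to already exist.

```python
# Sketch of the three methods above, using boto3.
import boto3
from botocore.exceptions import ClientError

sm = boto3.client("sagemaker")
cw = boto3.client("cloudwatch")

ENDPOINT = "my-endpoint"                # placeholder
ENDPOINT_CONFIG = "my-endpoint-config"  # placeholder
IDLE_ALARM = "endpoint-idle"
TEARDOWN_TOPIC = "arn:aws:sns:us-east-1:123456789012:endpoint-teardown"

def ensure_endpoint(event, context):
    """#1: called when the API is hit; launch the endpoint if needed."""
    try:
        status = sm.describe_endpoint(EndpointName=ENDPOINT)["EndpointStatus"]
    except ClientError:
        status = "NotFound"
    if status != "InService":
        launch_endpoint()
    return status

def launch_endpoint():
    """#2: create the endpoint plus an idle alarm on the API function."""
    sm.create_endpoint(EndpointName=ENDPOINT, EndpointConfigName=ENDPOINT_CONFIG)
    cw.put_metric_alarm(
        AlarmName=IDLE_ALARM,
        Namespace="AWS/Lambda",
        MetricName="Invocations",
        Dimensions=[{"Name": "FunctionName", "Value": "my-api-function"}],
        Statistic="Sum",
        Period=3600,
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="LessThanThreshold",
        TreatMissingData="breaching",   # no traffic at all should also fire
        AlarmActions=[TEARDOWN_TOPIC],  # alarm -> SNS -> teardown function
    )

def teardown(event, context):
    """#3: subscribed to the SNS topic; delete the endpoint and the alarm."""
    sm.delete_endpoint(EndpointName=ENDPOINT)
    cw.delete_alarms(AlarmNames=[IDLE_ALARM])
```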
Good luck!

Using S3 to store application configuration files

I'm creating a simple web app that needs to be deployed to multiple regions in AWS. The application requires some dynamic configuration which is managed by a separate service. When the configuration is changed through this service, I need those changes to propagate to all web app instances across all regions.
I considered using cross-region replication with DynamoDB to do this, but I do not want to incur the added cost of running DynamoDB in every region, plus the replication cost. Then the thought occurred to me of using S3, which is inherently cross-region.
Basically, the configuration service would write all configurations to S3 as static JSON files. Each web app instance will periodically check S3 to see if any of the config files have changed since the last check, and download the new config if necessary. The configuration changes are not time-sensitive, so polling for changes every 5-10 minutes should suffice.
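A minimal sketch of what each instance could run, assuming boto3 and placeholder bucket/key names; comparing the object's ETag avoids re-downloading an unchanged config:

```python
# Sketch: poll S3 for config changes by comparing ETags, re-downloading
# only when the object has actually changed. Bucket/key are placeholders.
import json
import time

import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-config-bucket", "app/config.json"  # placeholders

last_etag = None
config = {}

def refresh_config():
    global last_etag, config
    head = s3.head_object(Bucket=BUCKET, Key=KEY)
    if head["ETag"] != last_etag:
        body = s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read()
        config = json.loads(body)
        last_etag = head["ETag"]

if __name__ == "__main__":
    while True:
        refresh_config()
        time.sleep(600)  # poll every 10 minutes
```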
Have any of you used a similar approach to manage app configurations before? Do you think this is a smart solution, or do you have any better recommendations?
The right tool for this depends on the size of the configuration and the granularity with which you need to read it.
You can use both DynamoDB and S3 from a single region to serve your application in all regions. You can read a configuration file in S3 from all the regions, and you can read the configuration records from a single DynamoDB table from all the regions. There is some latency due to the distance around the globe, but for reading configuration it shouldn't be much of an issue.
If you need the whole set of configuration every time that you are loading the configuration, it might make more sense to use S3. But if you need to read small parts of a large configuration, by different parts of your application and in different times and schedule, it makes more sense to store it in DynamoDB.
In both options, the cost of the configuration is tiny: the cost of a text file in S3 and a few GETs against that file should be almost free. The same low cost is expected in DynamoDB, as you probably have only a few KB of data and the number of reads per second is very low (5 read capacity units per second is more than enough). Even if you decide to replicate the data to all regions, it will still be almost free.
I have an application I wrote that works in exactly the manner you suggest, and it works great. As was pointed out, S3 is not 'inherently cross-region', but it is inherently durable across multiple availability zones, and that combined with cross-region replication should be more than sufficient.
In my case, my application is also not time-sensitive to config changes. Nonetheless, besides having the app poll on a regular basis (in my case once per hour, or after every long-running job), I also have each application subscribed to an SNS endpoint, so that when the config file changes on S3, an SNS event is raised and the applications are notified that a change occurred. In some cases the applications get the config changes right away; if for whatever reason they are unable to process the SNS event immediately, they 'catch up' at the top of every hour, when the server reboots, and/or in the worst case by polling S3 for changes every 60 minutes.
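For reference, a sketch of wiring that S3-to-SNS notification with boto3; the bucket name, topic ARN, and key prefix are placeholders, and the topic's access policy is assumed to already allow S3 to publish:

```python
# Sketch: have S3 publish an event to an SNS topic whenever the config
# object is overwritten, so instances learn of changes without waiting
# for the next poll.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_notification_configuration(
    Bucket="my-config-bucket",  # placeholder
    NotificationConfiguration={
        "TopicConfigurations": [{
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:config-changed",
            "Events": ["s3:ObjectCreated:Put"],
            # Only fire for objects under the config prefix.
            "Filter": {"Key": {"FilterRules": [
                {"Name": "prefix", "Value": "app/"},
            ]}},
        }]
    },
)
```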

Limiting EC2 resources used by AWS data pipeline during DynamoDB table backups

I need to back up 6 DynamoDB tables every couple of hours. I've created 6 pipelines from templates, and they ran great, except that they created 6 or more virtual machines which were mostly staying up. That's more expense than I can afford.
Does anyone have experience optimizing this kind of scenario?
Some solutions that come to mind are:
One:
To ensure that EC2 resources are being terminated, you can set the terminateAfter property on the EC2 resource definition. The semantics of terminateAfter are discussed here - How does AWS Data Pipeline run an EC2 instance?.
Two:
This thread on the AWS forum discusses how existing EC2 instance may be used by data pipeline.
Three:
Using the backup pipeline template always creates a single pipeline with a single Activity for the backup that reads from a single source and writes to a single destination. You can view the JSON source of the pipeline in the AWS console and write a similar pipeline with multiple Activity instances - one for each table you want to backup. Since the pipeline definition will only have one EMR resource, only that EMR resource will do the work of all the activities.
You can set the field maxActiveInstances on the Ec2Resource object.
maxActiveInstances: The maximum number of concurrent active instances of a component. For activities, setting this to 1 runs instances in strict chronological order. A value greater than 1 allows different instances of the activity to run concurrently and requires you to ensure your activity can tolerate concurrent execution.
See this: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-ec2resource.html
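As a sketch of how both fields could be set on the pipeline's Ec2Resource via boto3 (the pipeline id and field values here are placeholders):

```python
# Sketch: push a pipeline definition that caps the EC2 resource's lifetime
# and concurrency via terminateAfter and maxActiveInstances.
import boto3

dp = boto3.client("datapipeline")
dp.put_pipeline_definition(
    pipelineId="df-EXAMPLE",  # placeholder
    pipelineObjects=[{
        "id": "Ec2Instance",
        "name": "Ec2Instance",
        "fields": [
            {"key": "type", "stringValue": "Ec2Resource"},
            # Terminate the instance after this much time, so it never idles.
            {"key": "terminateAfter", "stringValue": "30 Minutes"},
            # Allow at most one concurrent instance of this resource.
            {"key": "maxActiveInstances", "stringValue": "1"},
        ],
    }],
)
```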
Aravind. R