DSX scheduled jobs do not refresh internal notebooks - data-science-experience

When creating a scheduled job, only the notebooks that are shared (via a public link) are updated every hour, but not any of the notebooks that are viewed internally, that is, by the collaborators of the project. Is this planned to be fixed at some point?

How is viewing a notebook supposed to update anything? I assume you mean that updates from internal collaborators are not picked up by the scheduled job.
Ask the internal collaborators to use "Save Version" instead of "Save" when they want their changes to be picked up by the scheduled job. Sharing a notebook implicitly saves a version instead of a snapshot, but you can do it explicitly as well, without sharing.

Related

Does a GCP scheduled VM lose its schedule when the VM is manually stopped/started?

I have previously scheduled my VM (e2-micro in europe-central2-a) successfully. It has a schedule assigned to start and stop at certain hours... in order to save machine use time at night. It worked until I stopped the instance manually from the CLI for some work.
Since then, the schedule has not been forcing the instance to start and stop, and I have to do it manually.
The machine is working fine, no issues, but it doesn't resume the previously set schedule if I stop/start it manually (for some manual work through an SSH connection). Is there a specific procedure to get it back on schedule (maybe I missed some instructions for this case)?

GCP Cloud Run: send a request to all running instances

I have a REST API running on Cloud Run that implements a cache, which needs to be cleared maybe once a week when I update a certain property in the database. Is there any way to send an HTTP request to all running instances of my application? Right now my understanding is that even if I send multiple requests and there are 5 instances, they could all go to one instance. So is there a way to do this?
Let's go back to basics:
Cloud Run instances start based on a revision/image.
In your use case, where you have 5 instances running and you suddenly need to restart them because restarting resolves the problem (such as clearing/rebuilding the cache), what you need to do is trigger a change in the service/config so that a new revision gets created.
This will automatically replace your instances, that is, stop and relaunch all of them on the fly.
You have a couple of options here; choose whichever suits you:
If you have your services defined as YAML files, the easiest is to run the replace command:
gcloud beta run services replace myservice.yaml
Otherwise, add an environment variable, such as a date that you increment; this will yield a new revision (a change in environment variables means a new config, and a new config means a new revision).
gcloud run services update SERVICE --update-env-vars KEY1=VALUE1,KEY2=VALUE2
As these operations are executed, you will see a new revision created, and your active instances will be replaced on their next request with fresh new instances that will build the new cache.
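A hedged sketch of that second option (the service name "myservice" and the variable name are placeholders, not from the original answer): use the current timestamp so each invocation forces a fresh revision.
# Force a new revision (and therefore fresh instances) by changing an env var.
# CACHE_BUST means nothing to the app; it only makes the config differ.
gcloud run services update myservice --update-env-vars CACHE_BUST=$(date +%s)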
You can't directly reach all the active instances; that's the magic (and the tradeoff) of serverless: you don't really know what is running! If you implement a cache on Cloud Run, you need a way to invalidate it:
Either based on duration: when an entry expires, refresh it.
Or by explicit invalidation, which you can't do per instance on Cloud Run.
The other way to see this use case is that you have a cache shared between all your instances, and thus you need a shared cache, something like Memorystore. You can have a single Cloud Run instance invalidate and recreate it, and all the other instances will use it.
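As a hedged illustration of that shared-cache approach (the Memorystore host IP and key name are placeholders), a single instance or admin job can invalidate an entry and every instance sees the change:
# Delete the shared cache entry in Memorystore (Redis); the next request on
# any Cloud Run instance then rebuilds the cache from the database.
redis-cli -h 10.0.0.3 -p 6379 DEL cache:config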

AWS SageMaker: Jupyter Notebook kernel keeps dying

I get disconnected every now and then when running a piece of code in Jupyter notebooks on SageMaker. I usually just restart my notebook and run all the cells again. However, I want to know if there is a way to reconnect to my instance without losing my progress. At the moment, it shows "No Kernel" in the bottom bar, but my file seems active in the kernel sessions tab. Can I recover my notebook's variables and contents? Also, is there a way to prevent future kernel disconnections?
Note that I reverted back to tornado 5.1.1, which seems to decrease the number of disconnections, but it still happens every now and then.
Often, disconnections are caused by inactivity when a job runs for a long time with no user input. If it's pre-processing that's taking a long time, you could increase the instance size of the processing job so that it executes faster, or increase the instance count. If you're using EMR, since December 2021 you can run an EMR Spark query directly on the EMR cluster from SageMaker Studio:
https://aws.amazon.com/about-aws/whats-new/2021/12/amazon-sagemaker-studio-data-notebook-integration-emr/
There's a useful blog post at https://aws.amazon.com/blogs/machine-learning/build-amazon-sagemaker-notebooks-backed-by-spark-in-amazon-emr/ that should get you up and running.
Please let me know if you need more information, or vote for the answer if it's useful. :-)
For me, a quick solution was to open a terminal instead, save the notebook file as a Python file, and run it from the terminal within SageMaker.
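A hedged sketch of that workflow (the notebook name is a placeholder): convert the notebook to a script and run it detached, so it survives kernel or websocket disconnects.
# Convert the notebook into a plain Python script (produces analysis.py).
jupyter nbconvert --to script analysis.ipynb
# Run it detached from the browser session; output is captured in a log file.
nohup python analysis.py > analysis.log 2>&1 &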

How to auto-update AWS Windows EC2 instances when updates become available

I am working with AWS EC2 Windows instances, and my goal is to associate them with a maintenance window or a patch baseline (I'm not sure which) to schedule an automation so that, when updates for an instance become available, it automatically updates itself. I have created a maintenance window for the instances, but I think my issue is figuring out how to link up a system that checks for updates and runs them when they become available.
What you're looking for is the Patch Manager feature of the EC2 Systems Manager service: http://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-patch.html.
There is also a handy getting started blog post available here: https://aws.amazon.com/blogs/mt/getting-started-with-patch-manager-and-amazon-ec2-systems-manager/
Creating a Maintenance Window is the right first step; this will let you control when you want to patch your instances, which instances you want to patch, and how you want them to be patched.
To define which instances you want to patch, the easiest way is to tag your instances (e.g. create a tag with the key 'Type' and value 'Patching'), but if you have a reasonably small number of instances and don't launch new instances on a regular basis, you can also add them individually, by instance ID, to the Maintenance Window as targets. If you regularly launch new instances (either manually or as part of an Auto Scaling group), tagging is convenient, as those instances will be picked up automatically for patching.
Once you've added your instances as targets of your maintenance window, the next step is to add a task to it. Specifically, you want to add the Run Command task 'AWS-RunPatchBaseline' and run it against the target you created above (making sure to set Operation to 'Install').
This completes the minimum steps needed to patch all of your instances whenever the maintenance window runs. Every time the maintenance window runs, the AWS-RunPatchBaseline command will be sent to your instances and all approved patches will be installed and patch compliance reported.
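If you'd rather script those two steps, here is a hedged AWS CLI sketch; the window ID, target ID, and tag values are placeholders, and depending on your account setup you may also need to pass --service-role-arn:
# 1) Register the tagged instances as targets of the maintenance window.
aws ssm register-target-with-maintenance-window \
    --window-id mw-0123456789abcdef0 \
    --resource-type INSTANCE \
    --targets "Key=tag:Type,Values=Patching"
# 2) Register the AWS-RunPatchBaseline Run Command task against that target,
#    with Operation set to Install.
aws ssm register-task-with-maintenance-window \
    --window-id mw-0123456789abcdef0 \
    --targets "Key=WindowTargetIds,Values=<window-target-id>" \
    --task-arn AWS-RunPatchBaseline \
    --task-type RUN_COMMAND \
    --task-invocation-parameters '{"RunCommand":{"Parameters":{"Operation":["Install"]}}}' \
    --priority 1 --max-concurrency 2 --max-errors 1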
If you want more control over exactly which patches are approved you can also create a custom patch baseline and define specific rules controlling which patches to approve when. If you choose to do so (if not, the default patch baseline is used), you'll also want to set the 'Patch Group' tag on your instances to define which patch baseline to use for which instance. That's described in more detail in the documentation.
Hope this helps, feel free to ping me otherwise.
/Mats

AWS - Keeping AMIs updated

Let's say I've created an AMI from one of my EC2 instances. Now, I can add this manually to the LB or let the Auto Scaling group do it for me (based on the conditions I've provided). Up to this point everything is fine.
Now, let's say my developers have added new functionality and I pull the new code onto the existing instances. Note that the AMI is not updated at this point and still has the old code. My question is about how I should handle this situation so that when the Auto Scaling group creates a new instance from my AMI, it'll have the latest code.
Two ways come to mind; please let me know if you have any other solutions:
a) keep AMIs updated all the time, meaning that whenever there's a pull request, the old AMI should be removed (deleted) and replaced with the new one.
b) have a start-up script (cloud-init) on the AMI that will pull the latest code from the repository on initial launch (by storing the repository credentials on the instance and pulling the code directly from git).
Which one of these methods is better? And if both are no good, what's the best practice to achieve this goal?
Given that (almost) anything on AWS can be automated using the API, it again comes down to the specific use case at hand.
At the outset, the usual recommendation is to have a base AMI with the necessary packages installed and configured, plus an init script that downloads the source code so it is always the latest. The very important factor to weigh here is the time taken to check out or pull the code, configure the instance, and make it ready to put to work. If that time is very long, this strategy is a bad idea for auto-scaling: the warm-up time, combined with Auto Scaling and CloudWatch statistics, may produce a different result than you expect [maybe / maybe not, but the probability is not zero]. This is when you might consider baking a new AMI frequently, which minimizes the time instances need to prepare themselves for the war against the traffic.
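For illustration, baking an AMI is a single API call; this is a hedged sketch in which the instance ID and image name are placeholders:
# Bake a new AMI from the updated instance. --no-reboot avoids downtime,
# at the cost of filesystem-consistency guarantees during the snapshot.
aws ec2 create-image \
    --instance-id i-0123456789abcdef0 \
    --name "myapp-$(date +%Y%m%d-%H%M)" \
    --no-reboot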
I would recommend measuring and seeing whichever is convenient and cost-effective; it costs real money to pull down an instance and relaunch it from the AMI, but that's the tradeoff you need to make.
I have answered somewhat open-endedly because the question is also somewhat open-ended.
People have started using Chef, Ansible, and Puppet, which perform configuration management. These tools add a different level of automation altogether; you may want to explore that option as well. A similar approach is using Docker or other containers.
a) keep AMIs updated all the time, meaning that whenever there's a pull request, the old AMI should be removed (deleted) and replaced with the new one.
You shouldn't store your source code in the AMI. That introduces a maintenance nightmare and issues with autoscaling, as you have identified.
b) have a start-up script (cloud-init) on the AMI that will pull the latest code from the repository on initial launch (by storing the repository credentials on the instance and pulling the code directly from git).
Which one of these methods is better? And if both are no good, what's the best practice to achieve this goal?
Your second item, downloading the source on server startup, is the correct way to go about this.
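As a hedged sketch of such a start-up script (the path, branch, and service name are placeholders, and it assumes deploy credentials are already present on the AMI):
#!/bin/bash
# cloud-init user data: runs at first boot of a new instance.
set -e
cd /opt/myapp
# Pull the latest code; assumes read-only deploy credentials on the AMI.
git pull origin main
# Restart the application so it serves the fresh code.
systemctl restart myapp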
Other options would be to use AWS CodeDeploy or some other deployment service to deploy updates. A deployment service could also be used to deploy updates to existing instances while still allowing new instances to download the latest code automatically at startup.