Is it possible to have SageMaker output Objective Metrics during a training job? - amazon-web-services

In SageMaker hyperparameter tuning jobs, you can use a regular expression to parse your logs and surface an objective metric in the web console. Is it possible to do this during a normal training job?
It would be great to have this feature so I don't need to look through all the logs to find the metric.

Thank you for your suggestion! We will incorporate your feedback into our roadmap planning and prioritize this feature accordingly. As always, we deliver features as quickly as we can when we see strong customer need for them.
Thanks for using Amazon SageMaker!
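
For reference, the regex-based metric parsing the question describes is configured along these lines with the SageMaker Python SDK for a tuning job (a minimal sketch; the image URI, role, metric name, regex, and S3 paths are placeholders):

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner

session = sagemaker.Session()

estimator = Estimator(
    image_uri="<your-training-image>",        # placeholder
    role="<your-sagemaker-execution-role>",   # placeholder
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)

# Each definition tells SageMaker how to pull a metric value out of the training logs.
metric_definitions = [
    {"Name": "validation:accuracy", "Regex": r"validation accuracy: ([0-9\.]+)"},
]

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:accuracy",
    hyperparameter_ranges={"learning_rate": ContinuousParameter(1e-4, 1e-1)},
    metric_definitions=metric_definitions,
    max_jobs=4,
    max_parallel_jobs=2,
)

tuner.fit({"train": "s3://<bucket>/train"})   # placeholder S3 URI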

Related

Best way to ingest data to BigQuery

I have heterogeneous sources like flat files residing on-prem, JSON on SharePoint, APIs which serve data, and so on. Which is the best ETL tool to bring data into a BigQuery environment?
I'm a kindergarten student in GCP :)
Thanks in advance
There are many solutions to achieve this. It depends on several factors, some of which are:
frequency of data ingestion
whether or not the data needs to be manipulated before being written into BigQuery (your files may not be formatted correctly)
whether this is going to be done manually or automated
size of the data being written
If you are just looking for an ETL tool, you can find many. If you plan to scale this to many pipelines, you might want to look at a more advanced tool like Airflow, but if you just have a few one-off processes you could set up a Cloud Function within GCP to accomplish this. You can schedule it (via cron), invoke it through an HTTP endpoint, or trigger it with Pub/Sub. You can see an example of how this is done here
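For the Cloud Function route, a minimal sketch in Python (triggered by a file landing in a Cloud Storage bucket; the dataset/table name is an assumption and the uploaded file is assumed to be newline-delimited JSON):

# main.py -- Cloud Function triggered by a google.storage.object.finalize event.
# Loads the newly uploaded file straight into BigQuery.
from google.cloud import bigquery

def load_to_bigquery(event, context):
    client = bigquery.Client()
    uri = f"gs://{event['bucket']}/{event['name']}"

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
        autodetect=True,                  # let BigQuery infer the schema
        write_disposition="WRITE_APPEND",
    )

    # "raw_zone.events" is a placeholder dataset.table in your project.
    load_job = client.load_table_from_uri(uri, "raw_zone.events", job_config=job_config)
    load_job.result()                     # wait for the load job to finish
    print(f"Loaded {uri} into raw_zone.events")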
After several tries at data lake/data warehouse design and architecture, I can recommend only one thing: ingest your data into BigQuery as soon as possible, no matter the format or transformation.
Then, in BigQuery, run queries to format, clean, aggregate, and add value to your data. It's not ETL, it's ELT: you start by loading your data and then you transform it.
It's quicker, cheaper, simpler, and based only on SQL.
It works only if you use ONLY BigQuery as the destination.
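To make the ELT idea concrete, here is a rough sketch with the BigQuery Python client (dataset, table, and column names are invented for illustration): load the raw data first, then do all the transformation in SQL inside BigQuery.

# Sketch: transform already-loaded raw data entirely inside BigQuery (ELT).
from google.cloud import bigquery

client = bigquery.Client()

# All cleaning and aggregation happens in SQL; table and column names are invented.
transform_sql = """
CREATE OR REPLACE TABLE analytics.daily_orders AS
SELECT
  DATE(order_ts) AS order_date,
  customer_id,
  SUM(CAST(amount AS NUMERIC)) AS total_amount
FROM raw_zone.events
GROUP BY order_date, customer_id
"""

client.query(transform_sql).result()   # the transformation runs in BigQuery, not locally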
If you are starting from scratch and have no legacy tools to carry with you, the following GCP managed products target your use case:
Cloud Data Fusion, "a fully managed, code-free data integration service that helps users efficiently build and manage ETL/ELT data pipelines"
Cloud Composer, "a fully managed data workflow orchestration service that empowers you to author, schedule, and monitor pipelines"
Dataflow, "a fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing"
(Without considering a myriad of data integration tools and fully customized solutions using Cloud Run, Scheduler, Workflows, VMs, etc.)
Choosing one depends on your technical skills, real-time processing needs, and budget. As mentioned by Guillaume Blaquiere, if BigQuery is your only destination, you should try to leverage BigQuery's processing power on your data transformation.

AWS Glue jobs status Dashboard

In our project, a total of 10 Glue jobs run daily. I would like to build a dashboard showing each job's status (succeeded or failed) over the last 7 days. I tried to achieve this in CloudWatch with metrics, but was not able to do it. Please give me an idea of how to build this dashboard.
Probably a little late for the original questioner, but maybe helpful for others.
We had a similar task in our project. We have many jobs and need to monitor success and failure. In our experience, the built-in metrics aren't really reliable, nor do they really answer the question of whether a job was successful or not.
But we found a good way for us: generating custom metrics in a generic way for all jobs. This also works retroactively for existing jobs, without having to change their code.
I wrote an article about it: https://medium.com/#ettefette/metrics-for-aws-glue-jobs-as-you-know-them-from-lambda-functions-e5e1873c615c
We have set CloudWatch alerts based on these metrics, and we use them in our Grafana dashboard to monitor the Glue jobs.
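Not the article's exact method, but one generic way to get such metrics is to route Glue "Job State Change" events from EventBridge to a small Lambda that writes a custom CloudWatch metric per job (the namespace and metric names below are assumptions):

# lambda_function.py -- invoked by an EventBridge rule matching
# source "aws.glue" and detail-type "Glue Job State Change".
import boto3

cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    detail = event.get("detail", {})
    job_name = detail.get("jobName", "unknown")
    state = detail.get("state", "UNKNOWN")   # e.g. SUCCEEDED, FAILED, TIMEOUT

    # One data point per run, recorded under "Succeeded" or "Failed" for that job.
    metric_name = "Succeeded" if state == "SUCCEEDED" else "Failed"
    cloudwatch.put_metric_data(
        Namespace="Custom/GlueJobs",         # namespace is an assumption
        MetricData=[{
            "MetricName": metric_name,
            "Dimensions": [{"Name": "JobName", "Value": job_name}],
            "Value": 1,
            "Unit": "Count",
        }],
    )

A 7-day Sum of these metrics, split by the JobName dimension, can then be graphed in a CloudWatch dashboard or in Grafana.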

Training Yolact on Google Colab+ without timing out

I want to train Yolact on a custom dataset using Google Colab+.
Is it possible to train on Colab+, or does it time out too easily?
Thank you!
Yes, you can train your model on Colab+. The problem is that Colab has a relatively short session lifetime compared with other cloud platforms such as AWS SageMaker or Google Cloud. I run the code below to extend that time a bit.
%%javascript
// Keep the Colab session alive by clicking the "connect" button every 50 seconds.
function ClickConnect() {
    console.log("Working");
    document.querySelector("colab-toolbar-button#connect").click();
}
setInterval(ClickConnect, 50000);

Analyze a number value under different conditions with Google Cloud Platform Logging

I'm struggling to find out how to use GCP Logging to log a number value for analysis. I'm looking for a link to a tutorial or something (or a better third-party service to do this).
Context: I have a service that I'd like to test different conditions for the function execution time and analyze it with google-cloud-platform logging.
Example Log: { condition: 1, duration: 1000 }
Desire: Create a graph using GCP logs to compare conditions 1 and 2.
Is there a tutorial somewhere for this? Or maybe there is a better 3rd party service to use?
PS: I'm using the Node Google Cloud Logging client, which only talks about text logs.
PPS: I considered doing this in Loggly, but ended up getting lost in their documentation and UI.
There are many tools that you could use to solve this problem. However, you suggest a willingness to use Google Cloud Platform services (e.g. Stackdriver Monitoring, now Cloud Monitoring), so I'll provide some guidance using them.
NOTE: Please read around the topic and understand the costs involved with using e.g. Cloud Monitoring before you commit to an approach.
Conceptually, the data you're logging (!) more closely matches a metric. However, this approach would require you to add some form of metrics library (see OpenTelemetry: Node.js) to your code and instrument it to record the values that interest you.
You could then use e.g. Google Cloud Monitoring to graph your metric.
Since you're already producing a log with the data you wish to analyze, you can use Log-based metrics to create a metric from your logs. You may be interested in reviewing the content for distribution metric.
Once you have a metric (created either directly or from logs), you can then graph the resulting data in Cloud Monitoring. For logs-based metrics, see the Monitoring documentation.
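If it helps, the first step for the logs-based route is emitting the payload as structured JSON rather than plain text. A minimal sketch with the Python logging client (the Node client can write JSON payloads in a similar way; the logger name is an assumption):

# Write the example payload as a structured (jsonPayload) log entry so that a
# logs-based distribution metric can extract the "duration" field.
from google.cloud import logging as cloud_logging

client = cloud_logging.Client()
logger = client.logger("perf-experiments")   # logger name is an assumption

logger.log_struct(
    {"condition": 1, "duration": 1000},      # matches the example log in the question
    severity="INFO",
)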
For completeness, and to provide an alternative approach to producing and analyzing metrics, see the open-source tool Prometheus. Using a third-party Prometheus client library for Node.js, you could instrument your code to produce a metric. You would then configure Prometheus to scrape your app for its metrics and graph the results for you.
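As a sketch of that route (shown with the Python prometheus_client purely for brevity, since the question's service is Node, where prom-client plays the same role; the metric and label names are invented):

# Sketch: record per-condition execution times as a Prometheus histogram.
import random
import time

from prometheus_client import Histogram, start_http_server

# The "condition" label lets you compare condition 1 vs condition 2 on one graph.
DURATION = Histogram(
    "function_duration_seconds",
    "Execution time of the function under test",
    ["condition"],
)

def run_experiment(condition: int) -> None:
    start = time.perf_counter()
    time.sleep(random.uniform(0.5, 1.5))     # stand-in for the real work
    DURATION.labels(condition=str(condition)).observe(time.perf_counter() - start)

if __name__ == "__main__":
    start_http_server(8000)                  # exposes /metrics for Prometheus to scrape
    while True:
        run_experiment(random.choice([1, 2]))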

How can I monitor usage of each IoT device separately on AWS using the rules engine? Is there any other way to do the same?

We are currently using the AWS IoT messaging and shadow services, and the total usage can be monitored using CloudWatch, but I want to monitor usage per device. I am new to AWS, so the only way I can think of is to make a rule which gets triggered every time a message is published, extract the thing id from the topic, and increase the counter for that thing in DynamoDB. How can I do it step by step? I have followed this tutorial but it doesn't work. Is there a better way to do the same?
I would look into IoT analytics software; there are a lot of companies which do this type of thing. You could even build your own with open-source software, but it would require you to learn and stand up the ELK stack, along with your own instrumentation. I work for a company (AppDynamics) which offers these capabilities along with other application performance monitoring. Have a look at our IoT solution.
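As for the rule-plus-DynamoDB counter the question describes, a minimal sketch would be an IoT rule whose SQL captures the thing id from the topic (for example SELECT *, topic(2) AS thing_id FROM 'devices/+/data') and forwards each message to a Lambda like the one below; the table name, key, and topic layout are assumptions:

# lambda_function.py -- target of an AWS IoT rule action; increments a
# per-device message counter in DynamoDB.
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("DeviceUsage")        # table name is an assumption

def handler(event, context):
    # "thing_id" is added to the payload by the rule SQL (topic(2) AS thing_id).
    thing_id = event["thing_id"]

    table.update_item(
        Key={"thing_id": thing_id},
        UpdateExpression="ADD message_count :one",
        ExpressionAttributeValues={":one": 1},
    )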