What is AWS SageMaker mAP referring to? - amazon-web-services

I'm running a model on AWS SageMaker, using their example object detection Jupyter notebook (https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/object_detection_pascalvoc_coco/object_detection_recordio_format.ipynb). In the results it gives the following:
validation mAP =(0.111078678154)
I was wondering what this mAP score is referring to?
I've used TensorFlow, which reports an averaged mAP (averaged over IoU thresholds from .5 to .95 in .05 increments), mAP@.5IoU, and mAP@.75IoU. I've checked the SageMaker documentation, but cannot find anything that defines mAP.
Is it safe to assume that the mAP score SageMaker reports is the "averaged mAP(averages from .5IoU to .95IoU with .05 increments)"?

Heyo,
The mAP score is the mean average precision score that is widely used for object detection (https://docs.aws.amazon.com/sagemaker/latest/dg/object-detection-tuning.html).
Take a look at this link for more info on mAP: https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173
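To make the variants named in the question concrete: they differ only in the IoU threshold at which a predicted box counts as a match for a ground-truth box. The sketch below is illustrative only. The function names `iou` and `matches_at` are hypothetical, boxes are assumed to be (xmin, ymin, xmax, ymax) tuples, and nothing here states which variant SageMaker reports.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes have zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Thresholds .50, .55, ..., .95 used by the averaged variant.
AVERAGED_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at(pred_box, gt_box, thresholds=AVERAGED_THRESHOLDS):
    """For each IoU threshold, report whether the prediction would count as a match."""
    overlap = iou(pred_box, gt_box)
    return {t: overlap >= t for t in thresholds}
```

A prediction with moderate overlap is a hit at .5 IoU but a miss at the stricter thresholds, which is why the averaged score is usually lower than mAP@.5IoU for the same model.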

Related

Correct approach to improve/retrain an offline model

I have a recommendation system that was trained using Behavior Cloning (BC) with offline data generated using a supervised learning model converted to batch format using the approach described here. Currently, the model is exploring using an e-greedy strategy. I want to migrate from BC to MARWIL changing the beta.
There are a few ways to do that:
1. Convert the data used to train the BC algorithm together with the agent's new data, and retrain from scratch using MARWIL.
2. Convert the new data generated by the agent and combine it with the previously converted data used to train the BC algorithm, using the input parameter, doing something similar to what is described here, and retrain from scratch using MARWIL.
3. Convert the new data generated by the agent and combine it with the previously converted data used to train the BC algorithm, using the input parameter, doing something similar to what is described here, and retrain with MARWIL starting from the restored BC agent.
Questions:
Following option 1:
Given that the new data slice would be very small compared with the previous one, would the model learn anything new?
When should we stop using the original data?
Following option 2:
Given that the new data slice would be very small compared with the previous one, would the model learn anything new?
When should we stop using the original data?
This approach works for trajectories associated with new episode IDs, but will it extend the trajectories of episodes already present in the original batch?
Following option 3:
Given that the new data slice would be very small compared with the previous one, would the model learn anything new?
When should we stop using the original data?
This approach works for trajectories associated with new episode IDs, but will it extend the trajectories of episodes already present in the original batch?
The retraining would update the networks' weights using the new data points, but how many iterations should we use for that?
How can we prevent catastrophic forgetting?

How are confidence scores calculated in AWS SageMaker GroundTruth?

AWS's SageMaker/GroundTruth Labelling jobs return a confidence score for each human-annotated label.
However, the score is not a direct function of the responses of the N workers who labeled the task.
For example, on tasks with all three workers assigning different labels the score varies (0.61, 0.55, 0.68). And where 2/3 agree, the score varies also (0.95, 0.91).
"Automated data labelling" is disabled, which indicates that all items are labeled by a human, rather than being fully/partially automatically classified.
How does AWS calculate these confidence scores?
I can't find the details, so leaving this question open hoping for a real answer. But this is what I can find out so far:
Each labelling job has an AnnotationConsolidationConfig parameter, which lets you control how the confidence score is calculated using an AWS Lambda function.
The default for single-image classification is described as:
"a variant of the Expectation Maximisation approach.
It estimates parameters for each worker and uses Bayesian inference to estimate the true class based on the class annotations from individual workers."
However, it appears that regular AWS users cannot view the function itself due to lack of permissions.
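The quoted description at least suggests why the score is not a simple function of the vote count: if each worker has their own estimated reliability, the same pattern of agreement gives different posteriors depending on who voted. The sketch below is not AWS's function. It is a toy model under loudly stated assumptions: a uniform prior over classes, a single known accuracy per worker (the Expectation Maximisation step that would estimate those accuracies is left out), and wrong answers spread evenly over the other classes. The name `consolidate` is hypothetical.

```python
def consolidate(labels, accuracies, num_classes):
    """Return (most probable class, posterior probability of that class).

    labels[i] is the class chosen by worker i; accuracies[i] is that
    worker's assumed probability of answering correctly.
    """
    likelihoods = []
    for candidate in range(num_classes):
        likelihood = 1.0
        for label, acc in zip(labels, accuracies):
            if label == candidate:
                likelihood *= acc
            else:
                # Assumption: errors are uniform over the remaining classes.
                likelihood *= (1.0 - acc) / (num_classes - 1)
        likelihoods.append(likelihood)
    total = sum(likelihoods)
    # Uniform prior, so the posterior is the normalised likelihood.
    posterior = [value / total for value in likelihoods]
    best = max(range(num_classes), key=posterior.__getitem__)
    return best, posterior[best]


# Three workers all disagree, but with different assumed reliabilities.
print(consolidate([0, 1, 2], [0.9, 0.7, 0.6], num_classes=3))
print(consolidate([0, 1, 2], [0.8, 0.7, 0.7], num_classes=3))
```

Under this toy model, two tasks where all three workers disagree get different confidence values, which matches the kind of variation described in the question without confirming how AWS actually computes it.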

Evaluating previous checkpoints

I'm fairly new to TensorFlow, and am experimenting with BERT in TensorFlow. I notice that the example scripts store a checkpoint every 1000 epochs. For each checkpoint they store .data, .index and .meta files. They also create an eval folder containing an events.out.tfevents.* file.
They also write an eval_results.txt file containing the evaluation results for the latest checkpoint.
I want to look at the eval results for previous checkpoints, both to see progress and to see whether I am overfitting.
I had some issues getting TensorBoard running. Is this kind of data stored in the .meta or .index files? Do I need TensorBoard to see this data, or are there other ways? Or do I have to rerun predictions manually by loading each individual checkpoint?

Unable to deploy a Cloud ML model

When I try to deploy my trained model to Google Cloud ML, I get the following error:
Create Version failed.Model validation failed: Model metagraph does not have inputs collection.
What does this mean, and how can I get around it?
The TensorFlow model deployed on Cloud ML did not have a collection named "inputs". This collection should name all the input tensors for your graph. Similarly, a collection named "outputs" is required to name the output tensors for your graph. Assuming your graph has two input tensors x and y, and one output tensor scores, this can be done as follows:
tf.add_to_collection("inputs", json.dumps({"x": x.name, "y": y.name}))
tf.add_to_collection("outputs", json.dumps({"scores": scores.name}))
Here "x", "y" and "scores" become aliases for the actual tensor names (x.name, y.name and scores.name).

sequential/online kmeans clustering, how does it work? Existing codes?

I'm a little confused about online kmeans clustering. I know that it allows me to cluster with just one data point at a time. But is this all limited to one session? Suppose I have a bunch of data clustered via this method and I have the clustering result: would I be able to add more data to the clusters in the future?
I've also been looking for implementations of this, to no avail. Does anyone know of any?
Update:
To clarify, here is how my code works right now:
1. Images are taken from a live video feed; once enough pictures are saved, get the kmeans of their SIFT features.
2. Repeat step 1 with a new batch of live feed pictures and get kmeans again. Combine the kmeans vectors with the previous kmeans, like: [A B]
You can see that this is bad, because I quickly get too many clusters, and each batch of clusters will definitely overlap with another batch.
What I want:
1. Images are taken from a live video feed; once the pictures are saved, get kmeans.
2. Repeat step 1 and get kmeans again, which updates the previous clusters and adds new ones to them.
Nothing that I've seen could accommodate that, unless I'm just not understanding them correctly.
If you look at the original (!) publications, the method proposed by MacQueen - where the name k-means comes from - was in fact an online algorithm. I'm not sure if MacQueen did multiple passes over the data to improve the result. I believe he used a single pass, and objects would never be reassigned to a different cluster. If so, it was already an online algorithm!
Means are commonly computed as sum / count. This is not very sensible from a numerical point of view. E.g. in the classic Knuth book you can find a method for incrementally updating means. Wikipedia has it also.
Things get slightly more complicated once you actually want to reassign earlier points. But usually in a streaming context you do not know the previous points, so you cannot do that anyway.
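A minimal sketch of the single-pass idea described above, using the incremental mean update instead of sum / count. This is an illustration under assumptions, not MacQueen's exact procedure: the class name `OnlineKMeans` and method `partial_fit` are hypothetical, the first k points seed the centroids, and points are never reassigned.

```python
class OnlineKMeans:
    """Sequential k-means: each point updates its nearest centroid once."""

    def __init__(self, k):
        self.k = k
        self.centroids = []  # one list of floats per cluster
        self.counts = []     # number of points absorbed by each cluster

    @staticmethod
    def _sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def partial_fit(self, point):
        """Absorb one point and return the index of the cluster it joined."""
        point = [float(x) for x in point]
        # Assumption: the first k points become the initial centroids.
        if len(self.centroids) < self.k:
            self.centroids.append(point)
            self.counts.append(1)
            return len(self.centroids) - 1
        nearest = min(range(self.k),
                      key=lambda i: self._sqdist(self.centroids[i], point))
        self.counts[nearest] += 1
        n = self.counts[nearest]
        centroid = self.centroids[nearest]
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        for d in range(len(centroid)):
            centroid[d] += (point[d] - centroid[d]) / n
        return nearest


km = OnlineKMeans(k=2)
for p in [[0, 0], [10, 10], [2, 0], [10, 12]]:
    km.partial_fit(p)
# Later, possibly in another session, more data can be added to the same model.
km.partial_fit([3, 0])
```

Because the whole state is just the centroids and their counts, it can be saved and restored, so new batches update the existing clusters rather than producing a second set to concatenate. For an existing implementation, scikit-learn's MiniBatchKMeans also provides a partial_fit method for incremental updates, although it works on mini-batches rather than single points.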