Getting error "INTERNAL" when training a model with AutoML - google-cloud-platform

I'm training a small model with AutoML entity extraction, but the training keeps failing with the error message "INTERNAL" and no other details.
I'm doing this from the Google Cloud console, and I've followed the same steps I've used successfully to train other models.
The dataset has two labels with a few hundred text items each, so I doubt it's a timeout or anything like that.
What might be causing this and is there a way to debug/get more visibility?

It could be that your dataset contains duplicate columns, which is not currently supported. If that's not the case, I'd suggest reaching out to GCP Support so they can check it internally.
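If you want a quick local sanity check before contacting support, something like the sketch below scans a JSONL entity-extraction import for repeated text items (for a tabular source you would check for repeated column names instead). The file name and the textContent field name are assumptions; adjust them to your import schema.

    # Minimal duplicate-check sketch for a JSONL entity-extraction import file.
    # "training_items.jsonl" and the "textContent" field are assumptions.
    import json
    from collections import Counter

    texts = Counter()
    with open("training_items.jsonl") as f:
        for line in f:
            item = json.loads(line)
            # Fall back to the whole record if the text field is named differently.
            texts[item.get("textContent", json.dumps(item, sort_keys=True))] += 1

    duplicates = {text: n for text, n in texts.items() if n > 1}
    print(f"{len(duplicates)} duplicated text items")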

Related

Possible to do incremental training with AWS Comprehend?

I am looking at AWS Comprehend for a text classification task which will involve an active learning component. I am trying to understand if it's possible to incrementally train a custom Comprehend model using batches of newly annotated data, or if it only supports training from scratch. In this blog post it sounds like they are stitching the annotated data back together with the original training data (i.e. retraining from scratch each time), but I don't see the mentioned CloudFormation template (part 1 has the template for training/deployment, but part 2 seems to be talking about another template).
Is it possible to do incremental training with Comprehend? Or would I need to use a custom text classification model through SageMaker and then do incremental training that way? I am attempting to do the following:
1. Get a pretrained model
2. Fine-tune it on my own classification data
3. Incrementally train it on annotated low-confidence predictions
Steps 1 and 2 can be done with AWS Comprehend, but I'm not sure about 3. Thanks
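If Comprehend only supports training from a full dataset (which is what the stitch-together approach in the blog post suggests), step 3 would in practice mean merging the new annotations into the original training CSV and creating a new classifier version from the combined file. A minimal boto3 sketch, with placeholder bucket, classifier name, and role ARN:

    # Sketch of the "retrain from scratch" workaround: assumes the combined
    # (original + newly annotated) CSV has already been uploaded to S3.
    import boto3

    comprehend = boto3.client("comprehend", region_name="us-east-1")

    response = comprehend.create_document_classifier(
        DocumentClassifierName="my-classifier-v2",  # placeholder name for the new version
        DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendDataAccess",  # placeholder
        LanguageCode="en",
        InputDataConfig={
            # CSV of "label,text" rows containing both the original training data
            # and the newly annotated low-confidence predictions.
            "S3Uri": "s3://my-bucket/training/combined_annotations.csv",  # placeholder
        },
    )
    print(response["DocumentClassifierArn"])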

Google Vertex AI AutoML training fails due to BigQuery dataset being too large

I am currently training some models via Google's AutoML feature contained within their Vertex AI products.
The normal pipeline is creating a dataset, which I do by creating a table in BigQuery, and then starting the training process.
This has normally worked before, but for my latest dataset I get the following error message:
Training pipeline failed with error message: The size of source BigQuery table is larger than 107374182400 bytes.
While it seemed unlikely to me that the table is actually too large for AutoML, I tried re-training on a new dataset that's a 50% sample of the original table, but the same error occurred.
Is my dataset really too large for AutoML to handle, or is there another issue?
AutoML Tables has limits along several dimensions -- not only size in bytes (100 GB, i.e. the 107374182400 bytes in the error message, is the maximum supported size), but also the number of rows (up to roughly 200 million) and the number of columns (up to 1,000).
You can find more details in the AutoML Tables limits documentation.
Is your source data within those limits?
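A quick way to check is to read the table's metadata with the BigQuery Python client (the project, dataset, and table names below are placeholders):

    # Check the source table against the documented AutoML Tables limits.
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")
    table = client.get_table("my-project.my_dataset.training_table")

    print(f"rows:    {table.num_rows:,}")
    print(f"bytes:   {table.num_bytes:,} ({table.num_bytes / 1024**3:.1f} GiB)")
    print(f"columns: {len(table.schema)}")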

batch predictions in GCP Vertex AI

While trying out batch predictions in GCP Vertex AI for an AutoML model, the batch prediction results span several files (which is not convenient from a user perspective). If it were a single batch prediction result file, i.e. one file covering all the records, the procedure would be much simpler.
For instance, I had 5585 records in my input dataset file. The batch prediction results comprise 21 files, each containing 200-300 records, covering the 5585 records altogether.
Batch predictions for image, text, video, and tabular AutoML models run as distributed jobs, which means the data is spread across an arbitrary cluster of virtual machines and processed in an unpredictable order; as a result, the prediction results are stored across multiple files in Cloud Storage. Because the batch prediction output files are not generated in the same order as the input file, a feature request has been raised, and you can track updates on this request from this link.
We cannot provide an ETA at this moment, but you can follow the progress in the issue tracker and 'STAR' the issue (via this link) to receive automatic updates and give it traction.
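In the meantime, a possible workaround is to stitch the shards back together yourself. Since each shard is a JSON Lines file, simple concatenation preserves all records; the sketch below uses the Cloud Storage Python client with placeholder bucket, prefix, and output names. For up to 32 shards, gsutil compose can also merge the objects directly in Cloud Storage.

    # Sketch: merge sharded batch prediction output into one local JSONL file.
    # Bucket name, prefix, and output file name are placeholders.
    from google.cloud import storage

    client = storage.Client()

    with open("all_predictions.jsonl", "w") as out:
        for blob in client.list_blobs("my-prediction-bucket", prefix="prediction-my-model/"):
            if blob.name.endswith(".jsonl"):
                text = blob.download_as_text()
                out.write(text if text.endswith("\n") else text + "\n")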
However, if you are running batch prediction for a tabular AutoML model, you have the option of choosing BigQuery as the output destination, where all of the prediction output is stored in a single table; you can then export the table data to a single CSV file.
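That export can be done with the BigQuery Python client as well; the sketch below uses placeholder table and bucket names and assumes the prediction table stays under BigQuery's 1 GB single-file export limit (otherwise a wildcard URI and multiple output files are required).

    # Sketch: export the BigQuery batch prediction output table to a single CSV in GCS.
    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")

    extract_job = client.extract_table(
        "my-project.prediction_dataset.predictions_table",  # placeholder table
        "gs://my-bucket/exports/predictions.csv",            # placeholder destination
        job_config=bigquery.ExtractJobConfig(destination_format="CSV"),
    )
    extract_job.result()  # wait for the export job to finish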

How to make an AutoML model with Healthcare Entity Extraction as the base model?

I am facing a problem while building a custom model with AutoML. I am supplying AutoML with JSONL training data labeled DISEASE.
My service account has the healthcare.nlpservce.analyzeEntities permission, and before starting training I am choosing the option to Enable Healthcare Entity Extraction.
But still, after 4+ hours of training, the model detects only the DISEASE label.
It is not detecting Problems, Procedures, etc.
I am following the steps mentioned in the documentation.
Attached is a photo of the service account's permissions (no utilization analysis).
Can anyone please point me in the right direction?

How to explicitly set SageMaker Autopilot's validation set?

The example notebook: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/autopilot/autopilot_customer_churn.ipynb states that in the Analyzing Data step:
The dataset is analyzed and Autopilot comes up with a list of ML pipelines that should be tried out on the dataset. The dataset is also split into train and validation sets.
Presumably, Autopilot uses this validation set to select the best-performing model candidates to return to the user. However, I have not found a way to manually set the validation set used by SageMaker Autopilot.
For example, Google AutoML allows users to add TRAIN, VALIDATE, and TEST keywords to a data_split column to manually set which data points go into which set (roughly as sketched below).
Is something like this currently possible with SageMaker Autopilot?
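For reference, the Google AutoML-style manual split mentioned above amounts to adding a data_split column before import. A small pandas sketch, with a hypothetical file name and illustrative ratios:

    # Sketch: add a Google AutoML-style manual split column to a tabular dataset.
    # File names and the 80/10/10 ratios are illustrative, not requirements.
    import numpy as np
    import pandas as pd

    df = pd.read_csv("training_data.csv")  # hypothetical input file

    rng = np.random.default_rng(seed=42)
    df["data_split"] = rng.choice(
        ["TRAIN", "VALIDATE", "TEST"], size=len(df), p=[0.8, 0.1, 0.1]
    )
    df.to_csv("training_data_with_split.csv", index=False)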
I'm afraid you can't do this at the moment. The validation set is indeed built by Autopilot itself.