Error trying to create a GCP Pub Sub BigQuery subscriptions - google-cloud-platform

I am trying to create a GCP Pub Sub BigQuery subscriptions using the console: https://cloud.google.com/pubsub/docs/bigquery
However, I get the following error message:
API returned error: ‘Request contains an invalid argument.’
Any help would be appreciated.
NOTE
When the big query table does not exist I get the following error:
2.Pub Sub schema was not deleted

Update:
Actually, despite the generic error message, turns out to be a straightforward issue.
To use the "write metadata" option the BigQuery table must previously have the following field structure:
subscription_name (string),
message_id (string)
publish_time (timestamp)
data (bytes, string or json)
attributes (string or json)
they are better described in the docs here
Once they are created, this option works fine.
I believe that problem with the "Use topic schema" is also related to the schema used in the topic, since the table must have already the same structure (but there is a need to check it in your configuration). If your topic follow an avro schema, this might help: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro#avro_conversions
---------- previous answer
Not a definitive answer, but am having the same problem and figured out that it is somewhat related to the following options:
Use topic schema
Write metadata
Unchecking then make it work.
The same thing happens algo using terraform to try to set up the infrastructure.
I am still investigating if it is a bug or perhaps an error in my schema definition, but maybe it can be a starting point to you as well

A 404 error would indicate that either the BigQuery table does not exist or the Pub/Sub schema associated with the topic was deleted. For the former, ensure that the project, data set, and table names all match the names of the existing table to which you want to write data. For the latter, you could look at the topic details page and make sure that the schema name is not _deleted-schema_.

Related

Tagging AWS glue tables

I wanted to know if tagging of tables is possible in aws glue in any way. I know that in documentation , it is not given that it is possible for tables. It also is given that the tagging of a database is also not possible but I was able to do so using tag_resource boto3 call as specifed in this link.
This is the error I am getting if I try same tagging a table.
botocore.errorfactory.InvalidInputException: An error occurred
(InvalidInputException) when calling the TagResource operation
Just wanted to double check to see if it is not possible or I am missing something here.

Azure Text Analytics API to Power BI - error: Web.Contents failed to get contents

My goal is to get a new column in Power BI with keyphrases based on a column with text data. I try to connect the Azure text analytics API to PowerBI. I use this tutorial:
https://learn.microsoft.com/nl-nl/azure/cognitive-services/text-analytics/tutorials/tutorial-power-bi-key-phrases
After I invoke the custom function, and set the authentication and privacy to "anonymous" and "public", the KeyPhrases column I get only contains the values "Error" with the following description:
An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from 'https://******.cognitiveservices.azure.com/.cognitiveservices.azure.com/text/analytics/v2.1/keyPhrases' (404): Resource Not Found
Details:
DataSourceKind=Web
DataSourcePath=https://*******.cognitiveservices.azure.com/.cognitiveservices.azure.com/text/analytics/v2.1/keyPhrases
Url=https://******.cognitiveservices.azure.com/.cognitiveservices.azure.com/text/analytics/v2.1/keyPhrases
Also, not sure if it is related to my issue, but I see the following warning on my Azure account in the Networking menu:
"VNet setting is not supported for current API type or resource location."
I checked all the steps in the tutorial, I re-entered the authentication and privacy settings. Also, I tried the same for the sentiment analysis function. Finally, I tried everything on a different and very simplistic dataset.
Not sure what the cause of my issue is and how to solve it.
Any suggestions would be much appreciated.
Best, Rosanne
Look at your error message:
'https://******.cognitiveservices.azure.com/.cognitiveservices.azure.com/text/analytics/v2.1/keyPhrases' (404): Resource Not Found Details: DataSourceKind=Web DataSourcePath=https://*******.cognitiveservices.azure.com/.cognitiveservices.azure.com/text/analytics/v2.1/keyPhrases Url=https://******.cognitiveservices.azure.com/.cognitiveservices.azure.com/text/analytics/v2.1/keyPhrases
It throw a 404, so you are pointing to the wrong URL.
And s you can see in the beginning of your url:
https://******.cognitiveservices.azure.com/.cognitiveservices.azure.com << here you have twice ".cognitiveservices.azure.com/" so you url setup is wrong.
I don't know exactly how it is setup on your side, but you may have provided a region or endpoint during it and it's here where you put the wrong value.

AWS Amplify filter for #searchable annotation

Currently I am using a DynamoDB instance for my social media application. While designing the schema I sticked to the "one table" rule. So I am putting every data in the same table like posts, users, comments etc. Now I want to make flexible queries for my data. Here I found out that I could use the #searchable annotation to create an Elastic Search instance for a table which is annotated with #model
In my GraphQL schema I only have one #model, since I only have one table. My problem now is that I don't want to make everything in the table searchable, since that would be most likely very expensive. There are some data which don't have to be added to the Elastic Search instance (For example comment related data). How could I handle it ? Do I really have to split my schema down into multiple tables to be able to manage the #searchable annotation ? Couldn't I decide If the row should be stored to the Elastic Search with help of the Partitionkey / Primarykey, acting like a filter ?
The current implementation of the amplify-cli uses a predefined python Lambda that are added once we add the #searchable directive to one of our models.
The Lambda code can not be edited and currently, there is no option to define a custom Lambda, you read about it
https://github.com/aws-amplify/amplify-cli/issues/1113
https://github.com/aws-amplify/amplify-cli/issues/1022
If you want a custom Lambda where you can filter what goes to the Elasticsearch Instance, you can follow the steps described here https://github.com/aws-amplify/amplify-cli/issues/1113#issuecomment-476193632
The closest you can get is by creating a template in amplify\backend\api\myapiname\stacks\ where you can manage all the resources related to Elasticsearch. A good start point is to
Add #searchable to one of your model in the schema.grapql
Run amplify api gql-compile
Copy the generated template in the build folder, \amplify\backend\api\myapiname\build\stacks\SearchableStack.json to amplify\backend\api\myapiname\stacks\
Remove the #searchable directive from the model added in step 1
Start editing your new template copied in step 3
Add a Lambda and use it in the template as the resolver for the DynamoDB Stream
Using this approach will give you total control of the resources related to the Elasticsearch service, but, will also require to do it all by your own.
Or, just go by creating a table for each model.
Hope it helps
It is now possible to override the generated streaming function code as well.
thanks to the AWS Support for the information provided
leaved a message on the related github issue as well https://github.com/aws-amplify/amplify-category-api/issues/437#issuecomment-1351556948
All you need is to run
amplify override api
edit the corresponding overrode.ts
change the code with the resources.opensearch.OpenSearchStreamingLambdaFunction.code
resources.opensearch.OpenSearchStreamingLambdaFunction.functionName = 'python_streaming_function';
resources.opensearch.OpenSearchStreamingLambdaFunction.handler = 'index.lambda_handler';
resources.opensearch.OpenSearchStreamingLambdaFunction.code = {
zipFile: `
# python streaming function customized code goes here
`
}
Resources:
[1] https://docs.amplify.aws/cli/graphql/override/#customize-amplify-generated-resources-for-searchable-opensearch-directive
[2]AWS::Lambda::Function Code - Properties - https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-lambda-function-code.html#aws-properties-lambda-function-code-properties

How can I deal with `No enum constant` error in AWS Glue

I am creating a JOB in AWS Glue but it shows below error in the last step.
{"service":"AWSGlue","statusCode":400,"errorCode":"InvalidInputException","requestId":"ad9ee511-adb8-11e9-9bbf-9d08424a9846","errorMessage":"No enum constant com.amazonaws.services.glue.FileFormat.UNKNOWN","type":"AwsServiceError"}
The column mapping is shown correctly as below screenshot.
what I don't understand is which input field has an invalid data? Glue doesn't give me more detailed information about this error. How can I debug this issue?
I faced the same issue, because accidentally I tried to write the data to my DynamoDB table (as my Target). When I changed to S3. It worked.
You can learn more about it on AWS Documentation
Here is the issue raised in amazon related to this.

AWS DynamoDB resource not found exception

I have a problem with connection to DynamoDB. I get this exception:
com.amazonaws.services.dynamodb.model.ResourceNotFoundException:
Requested resource not found (Service: AmazonDynamoDB; Status Code:
400; Error Code: ResourceNotFoundException; Request ID: ..
But I have a table and region is correct.
From the docs it's either you don't have a Table with that name or it is in CREATING status.
I would double check to verify that the table does in fact exist, in the correct region, and you're using an access key that can reach it
My problem was stupid but maybe someone has the same... I changed recently the default credentials of aws (~/.aws/credentials), I was testing in another account and forgot to rollback the values to the regular account.
I spent 1 day researching the problem in my project and now I should repay a debt to humanity and reduce the entropy of the universe a little.
Usually, this message says that your client can't reach a table in your DB.
You should check the next things:
1. Your database is running
2. Your accessKey and secretKey are valid for the database
3. Your DB endpoint is valid and contains correct protocol ("http://" or "https://"), and correct hostname, and correct port
4. Your table was created in the database.
5. Your table was created in the database in the same region that you set as a parameter in credentials. Optional, because some
database environments (e.g. Testcontainers Dynalite) don't have an incorrect value for the region. And any nonempty region value will be correct
In my case problem was that I couldn't save and load data from a table in tests with DynamoDB substituted by Testcontainers and Dynalite. I found out that in our project tables creates by Spring component marked with #Component annotation. And in tests, we are using a global setting for lazy loading components to test, so our component didn't load by default because no one call it in the test explicitly. ¯_(ツ)_/¯
If DynamoDB table is in a different region, make sure to set it before initialising the DynamoDB by
AWS.config.update({region: "your-dynamoDB-region" });
This works for me:)
Always ensure that you do one of the following:
The right default region is set up in the AWS CLI configuration files on all the servers, development machines that you are working on.
The best choice is to always specify these constants explicitly in a separate class/config in your project. Always import this in code and use it in the boto3 calls. This will provide flexibility if you were to add or change based on the enterprise requirements.
If your resources are like mine and all over the place, you can define the region_name when you're creating the resource.
I do this for all my instantiations as it forces me to think about what I'm putting/calling where.
boto3.resource("dynamodb", region_name='us-east-2')
I was getting this issue in my .NetCore Application.
Following fixed the issue for me in Startup class --> ConfigureServices method
services.AddDefaultAWSOptions(
new AWSOptions
{
Region = RegionEndpoint.GetBySystemName("eu-west-2")
});
I got Error warning Lambda : lifecycleIteration=0 lambda handler returned an error: ResourceNotFoundException: Requested resource not found
I spent 1 week to fix the issue.
And so its root cause and steps to find issue is mentioned in below Git Issue thread and fixed it.
https://github.com/soto-project/soto/issues/595