Replicating data from SQL Server to BigQuery - google-cloud-platform

I've been trying to follow Google's instructions on replicating data from SQL Server to BigQuery, available here: https://cloud.google.com/data-fusion/docs/tutorials/replicating-data/sqlserver-to-bigquery. Following the instructions to the letter, step by step, always results in this odd error when creating the Cloud Data Fusion instance:
Invalid argument (HTTP 400): retry budget exhausted (3 attempts): cloud-control2-saas::GCE_BAD_REQUEST: Invalid value for field 'networkPeering.name': '*******'. Must be a match of regex '(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)'.
**** is the project ID with the VPC network name appended after a dash, and it looks a bit like this (I've changed the values):
website.com:api-project-0000000000-default
This value is being assigned somewhere by Google; I am not given a choice to select or enter it anywhere in the instructions when creating the instance.
Googling the error doesn't show me anything useful, and sadly I do not have the budget to acquire GCP support in this instance to ask them why their instructions appear not to work.
I've already checked quotas, billing, service account permissions, etc. I've also tried both a new VPC as well as a shared VPC with all the settings from the guide.
I would appreciate it if someone more experienced in this area could point me in the right direction, or if someone has an idea of where else to check for what could be wrong.
The instructions do point at creating a peering connection, but they require the Cloud Data Fusion instance to be created before configuring the peering connection, and since I can't create the Cloud Data Fusion instance, I am unsure what exactly I am supposed to do.
Appreciate the help!

According to this documentation, I assume you're creating a VPC network before creating a private instance.
networkPeering.name is a combination of your project ID and VPC network name. The error you're getting is due to the naming of the networkPeering value: it does not match the regex (?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?), which in your case is because of the project ID website.com:api-project-xxxxxxxxx (the '.' and ':' characters are not allowed by that pattern).
Also note that the networkPeering name may be at most 63 characters long, as implied by the regex.
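If it helps, a quick way to check whether a candidate name would pass this validation is to test it against the same regex from the error message. This is just a minimal Python sketch; the example values are placeholders based on the question:

import re

# The pattern the Compute Engine API enforces on networkPeering.name,
# copied from the error message above (max 63 characters, lowercase
# letters, digits and hyphens, starting with a letter).
NAME_RE = re.compile(r"(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?)")

def is_valid_peering_name(name):
    return NAME_RE.fullmatch(name) is not None

print(is_valid_peering_name("api-project-0000000000-default"))              # True
print(is_valid_peering_name("website.com:api-project-0000000000-default"))  # False: '.' and ':' are rejected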

Related

Google Analytics not filtering internal traffic

I know there have been similar questions in the past but I have tried many solutions given online to no avail. I am just not able to hide internal traffic for Google Analytics on my Django site.
I am setting the filter from Admin->View->Filters. I have tried both Predefined and Custom, with a fixed IP as well as a regex pattern. (Yes, I have double-checked my IP on whatismyip.com and I am using the right one.)
I read somewhere that it takes time for the filters to come into effect, so I even waited for 24 hours, but I still see a lot of internal traffic.
Google Tag Assistant is also tracking the pages when I access them from an internal IP (not sure if it's supposed to know about the filters).
Not sure where I could be going wrong.
(I am using a reverse proxy, but hopefully that doesn't change anything since the Google Analytics code runs on the client side.)
Do not use any filter on the default view (called 'All Website Data'). Create a separate view and then create a filter on it. That will work.
(After I struggled with this for a few days, this response is what led me to the fix above.)
I struggled with this as well, so here is what I found out.
Note that real-time reporting can take up to two hours to catch up with and reflect analytics configuration changes such as the addition of filters.
Possible solutions
1) As suggested in the other answer, leave the default view as default and create an additional view for the filters:
The default view collects all traffic. You need to create a new view for which you can apply your filter. Check out item 3 here: https://support.google.com/analytics/answer/1009618?hl=en
How to add a new view: https://support.google.com/analytics/answer/1009714?hl=en
2) Filter the IPv6 address, not IPv4:
Exclude the IPv6 address as mentioned in the post above. This is the one that "what is my IP address" returns; it's not the IPv4 syntax (xxx.xxx.xxx.xxx). However, I have noticed that wired machines that stay connected seem to keep the same IPv6 address (the 31-digit sequence), whereas wireless devices (mobile phones, tablets) tend to be dynamic. However, as posted above, if you use just the first 15 digits of the sequence with the "begins with" filter type, it will block the devices using the same shared router (i.e. the internet router in your home).
About filtering only the first 15 digits:
I think it is meant to filter the first four blocks, so if your IPv6 address looks like 2601:191:c001:2f9:5c5a:1c20:61b6:675a, then filter IPs that begin with 2601:191:c001:2f9:.
Information found here.
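In case it helps, here is a minimal Python sketch of deriving that "begins with" prefix from a full IPv6 address. The address is the one from the example above, and taking four blocks mirrors the suggestion to filter on the network part:

import ipaddress

def ga_filter_prefix(ipv6, blocks=4):
    """Return the first `blocks` groups of an IPv6 address, for use in a
    'begins with' Google Analytics filter."""
    ipaddress.IPv6Address(ipv6)  # raises ValueError if the address is malformed
    # Note: this keeps the notation exactly as written; an address compressed
    # with '::' inside its first four groups would need expanding first.
    return ":".join(ipv6.split(":")[:blocks]) + ":"

print(ga_filter_prefix("2601:191:c001:2f9:5c5a:1c20:61b6:675a"))
# -> 2601:191:c001:2f9: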

Amazon Web Services - Length of HostIds returned

I'm making a platform that requires allocating hosts programmatically. The API returns a HostId, which is a string. (Source: boto3 docs.) If anyone has experience dealing with AWS, could you tell me whether this string has a constant length? And if it does, how long is it?
This is from the perspective of designing a database, specifically for setting a maximum length on the field. I don't want to give the host IDs superfluous space.
Since December 2016, AWS has moved to a new format, which is a resource identifier followed by a 17-character string.
Each resource has its own prefix, for example:
an instance is i-xxxx....
an EBS snapshot is snap-xxxx....
reserved instances are r-xxx....
...
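To make the sizing concrete, here is a minimal Python sketch of what a long-format Dedicated Host ID looks like in practice: the "h-" prefix plus 17 hex characters, 19 characters in total. Treat the exact length as an observation rather than a documented guarantee, so leave some headroom in the column:

import re

# Long-format (post-2016) Dedicated Host IDs as observed in practice:
# "h-" followed by 17 lowercase hex characters, e.g. h-0123456789abcdef0.
LONG_HOST_ID_RE = re.compile(r"h-[0-9a-f]{17}")

def looks_like_long_host_id(host_id):
    return LONG_HOST_ID_RE.fullmatch(host_id) is not None

print(looks_like_long_host_id("h-0123456789abcdef0"))  # True  (19 characters)
print(looks_like_long_host_id("h-1a2b3c4d"))           # False (older short format)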

AWS DynamoDB resource not found exception

I have a problem connecting to DynamoDB. I get this exception:
com.amazonaws.services.dynamodb.model.ResourceNotFoundException: Requested resource not found (Service: AmazonDynamoDB; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: ..
But I have a table and the region is correct.
From the docs, it's either that you don't have a table with that name or that it is in CREATING status.
I would double-check that the table does in fact exist, in the correct region, and that you're using an access key that can reach it.
My problem was a stupid one, but maybe someone has the same... I had recently changed my default AWS credentials (~/.aws/credentials) while testing in another account and forgot to roll the values back to the regular account.
I spent a day researching the problem in my project, and now I should repay a debt to humanity and reduce the entropy of the universe a little.
Usually, this message means that your client can't reach a table in your DB.
You should check the following (a minimal boto3 sanity check covering most of these points is sketched after the list):
1. Your database is running
2. Your accessKey and secretKey are valid for the database
3. Your DB endpoint is valid and contains the correct protocol ("http://" or "https://"), hostname, and port
4. Your table was created in the database.
5. Your table was created in the database in the same region that you set as a parameter in your credentials. (This one is optional, because some database environments, e.g. Testcontainers Dynalite, don't validate the region, so any non-empty region value will work.)
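Here is the sanity check mentioned above, a minimal sketch in boto3; the table name and region are placeholders:

import boto3
from botocore.exceptions import ClientError

TABLE_NAME = "my-table"   # hypothetical table name
REGION = "us-east-1"      # hypothetical region

client = boto3.client("dynamodb", region_name=REGION)
try:
    status = client.describe_table(TableName=TABLE_NAME)["Table"]["TableStatus"]
    print(f"{TABLE_NAME} exists in {REGION}, status: {status}")  # e.g. CREATING vs ACTIVE
except ClientError as err:
    if err.response["Error"]["Code"] == "ResourceNotFoundException":
        # The table really isn't visible with these credentials/region/endpoint.
        print(f"{TABLE_NAME} not found in {REGION}")
    else:
        raise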
In my case the problem was that I couldn't save and load data from a table in tests where DynamoDB was substituted by Testcontainers and Dynalite. I found out that in our project the tables are created by a Spring component marked with the @Component annotation. And in tests, we use a global setting that lazily loads components, so our component didn't load by default because nothing called it explicitly in the test. ¯\_(ツ)_/¯
If the DynamoDB table is in a different region, make sure to set it before initialising the DynamoDB client:
AWS.config.update({ region: "your-dynamoDB-region" });
This works for me :)
Always ensure that you do one of the following:
The right default region is set up in the AWS CLI configuration files on all the servers and development machines that you are working on.
The best choice is to always specify these constants explicitly in a separate class/config in your project. Always import this in code and use it in the boto3 calls. This will provide flexibility if you need to add or change regions based on enterprise requirements.
If your resources are like mine and all over the place, you can define the region_name when you're creating the resource.
I do this for all my instantiations as it forces me to think about what I'm putting/calling where.
boto3.resource("dynamodb", region_name='us-east-2')
I was getting this issue in my .NET Core application.
The following fixed the issue for me in the Startup class --> ConfigureServices method:
services.AddDefaultAWSOptions(
    new AWSOptions
    {
        // Explicitly pin the region used by the AWS SDK clients
        Region = RegionEndpoint.GetBySystemName("eu-west-2")
    });
I got this error warning from Lambda: lifecycleIteration=0 lambda handler returned an error: ResourceNotFoundException: Requested resource not found
I spent a week fixing the issue.
The root cause and the steps to find it are described in the GitHub issue thread below, where it was fixed:
https://github.com/soto-project/soto/issues/595

Trying to set up AWS IoT button for the first time: Please correct validation errors with your trigger

Has anyone successfully set up their AWS IoT button?
When stepping through with default values I keep getting this message: Please correct validation errors with your trigger. But there are no validation errors on any of the setup pages, or the page with the error message.
I hate asking a broad question like this but it appears no one has ever had this error before.
This has been driving me nuts for a week!
I got it to work by using Custom IoT Rule instead of IoT Button as the IoT Type. The default rule name is iotbutton_xxxxxxxxxxxxxxxx and the default SQL statement is SELECT * FROM 'iotbutton/xxxxxxxxxxxxxxxx' (xxx... = the button's serial number).
Make sure you copy the policy from the sample code into the execution role - I know that has tripped up a lot of people.
I was getting the same error. The cause turned out to be that I had multiple certificates associated with the button. This was caused by me starting over on the wizard, generating a cert & key, and loading the cert & key again. While on the device itself this doesn't seem to be a problem, the result was that on AWS I had multiple certs associated with the device.
Within the AWS IoT Resources view I eventually managed to delete all resources. It took some fiddling to get the certs detached so they could be deleted. Once I had deleted all the resources, I returned to the wizard, created yet another cert & key pair, pushed the Lambda code, and everything works.

Is it possible to get a time for state transition for an Amazon EC2 instance?

I'm accessing EC2 with the aws-sdk for Ruby. I have an array of instances from describe_instances().
This provides me with the state of the instances and even a state transition reason. But how can I get a time for the state transition?
Edit
So I have:
client = Aws::EC2::Client.new
resp = client.describe_instances({ filters: filters })
and I would need
resp.reservations[0].instances[0].state_transition_time #=> Time
similar to
resp.reservations[0].instances[0].state_transition_reason #=> String
This information is not available via the Amazon EC2 API at this time. The aws-sdk gem returns all of the information available from the DescribeInstances operation as documented here: http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeInstances.html
The State Transition Reason is not always populated with a date and time and may not even be populated at all, per the documentation. I have not found any hints in the documentation that specify the conditions in which you DO get a date/time, but in my experience, the date/time is present in the State Transition Reason for between 30 and 90 days. After that, the reason seems to persist, but the date is dropped from the string.
All of the documentation that I can find is listed here:
Attribute Definition
EC2 API - Ruby
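As noted above, when a timestamp does appear it is only embedded in the state_transition_reason string, so the best you can do is parse it out yourself. Here is a minimal sketch (shown with boto3 for brevity; the same regex works on the string the Ruby SDK returns). The parenthesised 'YYYY-MM-DD HH:MM:SS GMT' form is an observed convention, not a documented guarantee, so treat a non-match as "no time available":

import re
import boto3

# Pattern for the timestamp sometimes embedded in StateTransitionReason,
# e.g. "User initiated (2016-01-31 23:45:21 GMT)".
TIME_RE = re.compile(r"\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} GMT)\)")

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region
for reservation in ec2.describe_instances()["Reservations"]:
    for instance in reservation["Instances"]:
        reason = instance.get("StateTransitionReason", "")
        match = TIME_RE.search(reason)
        when = match.group(1) if match else "unknown"
        print(instance["InstanceId"], instance["State"]["Name"], when)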