Below is my code for writing into a BigQuery table:
from google.cloud import bigquery

response = bigquery.tabledata.insertAll(projectId=PROJECT_ID,
                                        datasetId=DATASET_ID,
                                        tableId=TABLE_ID,
                                        body=data).execute()
However, I'm getting the following error:
no module tabledata in google.cloud.bigquery
Can anyone help me with this?
Which API should I use here?
Please check the Streaming data into BigQuery documentation. When using Python, you need to use the following function:
insert_rows(table, rows, selected_fields=None, **kwargs)
which inserts rows into a table via the streaming API. For more information, refer to the BigQuery Python API reference documentation.
You can check the Python streaming insert example:
from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# TODO(developer): Set table_id to the ID of the table to receive the rows.
table_id = "your-project.your_dataset.your_table"

table = client.get_table(table_id)  # Make an API request.

rows_to_insert = [(u"Phred Phlyntstone", 32), (u"Wylma Phlyntstone", 29)]

errors = client.insert_rows(table, rows_to_insert)  # Make an API request.
if errors == []:
    print("New rows have been added.")
else:
    print("Encountered errors while inserting rows: {}".format(errors))
You can also use the REST API directly by calling the tabledata.insertAll method, which lets you see the raw request and response. You need to specify projectId, datasetId and tableId. Below is a JavaScript code snippet that performs the request:
function execute() {
  return gapi.client.bigquery.tabledata.insertAll({
    "projectId": "<your_projectId>",
    "datasetId": "<your_datasetId>",
    "tableId": "<your_tableId>",
    "resource": {}
  })
  .then(function(response) {
      // Handle the results here (response.result has the parsed body).
      console.log("Response", response);
    },
    function(err) { console.error("Execute error", err); });
}
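If you prefer to stay in Python but still call the REST method directly, here is a rough sketch using the google-api-python-client (the column names full_name and age are just an assumed example schema, not something from your table):
from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials

# Sketch only: build the low-level BigQuery v2 service with application default credentials.
credentials = GoogleCredentials.get_application_default()
bigquery_service = build('bigquery', 'v2', credentials=credentials)

# The insertAll body carries the rows to stream; "json" holds column name/value pairs.
body = {
    "rows": [
        {"insertId": "row-1", "json": {"full_name": "Phred Phlyntstone", "age": 32}},
        {"insertId": "row-2", "json": {"full_name": "Wylma Phlyntstone", "age": 29}},
    ]
}

response = bigquery_service.tabledata().insertAll(
    projectId="<your_projectId>",
    datasetId="<your_datasetId>",
    tableId="<your_tableId>",
    body=body).execute()
print(response)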
Let me know about the results.
The code below exports all the findings from Security Hub to an S3 bucket using a Lambda function. The filters are set to export only CIS AWS Foundations Benchmark findings. There are more than 20 accounts added as members in Security Hub. The issue I'm facing is that, even though I'm using the NextToken configuration, the output doesn't contain information about all the accounts; instead, it just contains data from one of the accounts, seemingly at random.
Can somebody look into the code and let me know what could be the issue, please?
import boto3
import json
from botocore.exceptions import ClientError
import time
import glob

client = boto3.client('securityhub')
s3 = boto3.resource('s3')
storedata = {}

_filter = Filters={
    'GeneratorId': [
        {
            'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
            'Comparison': 'PREFIX'
        }
    ],
}

def lambda_handler(event, context):
    response = client.get_findings(
        Filters={
            'GeneratorId': [
                {
                    'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
                    'Comparison': 'PREFIX'
                },
            ],
        },
    )
    results = response["Findings"]
    while "NextToken" in response:
        response = client.get_findings(Filters=_filter, NextToken=response["NextToken"])
        results.extend(response["Findings"])
        storedata = json.dumps(response)
        print(storedata)
    save_file = open("/tmp/SecurityHub-Findings.json", "w")
    save_file.write(storedata)
    save_file.close()
    for name in glob.glob("/tmp/*"):
        s3.meta.client.upload_file(name, "xxxxx-security-hubfindings", name)
I am also getting a TooManyRequestsException error now.
The problem is in this code that paginates the security findings results:
while "NextToken" in response:
response = client.get_findings(Filters=_filter,NextToken=response["NextToken"])
results.extend(response["Findings"])
storedata = json.dumps(response)
print(storedata)
The value of storedata after the while loop has completed is the last page of security findings, rather than the aggregate of the security findings.
However, you're already aggregating the security findings in results, so you can use that:
save_file = open("/tmp/SecurityHub-Findings.json", "w")
save_file.write(json.dumps(results))
save_file.close()
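As an alternative sketch (not your original code), boto3 also ships a paginator for get_findings that handles NextToken for you, so every page ends up in the aggregated list before the file is written:
import json

import boto3

# Sketch only: the paginator iterates the NextToken internally, so all pages are aggregated.
client = boto3.client('securityhub')
paginator = client.get_paginator('get_findings')

cis_filter = {
    'GeneratorId': [
        {
            'Value': 'arn:aws:securityhub:::ruleset/cis-aws-foundations-benchmark',
            'Comparison': 'PREFIX'
        }
    ],
}

results = []
for page in paginator.paginate(Filters=cis_filter):
    results.extend(page['Findings'])

with open('/tmp/SecurityHub-Findings.json', 'w') as save_file:
    json.dump(results, save_file)
Since each page is still a separate API call, the TooManyRequestsException usually means you also need retries/backoff, for example via botocore's Config(retries={'max_attempts': 10}) when creating the client.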
I am trying to query the AWS Cost Explorer API for the cost forecast using boto3. Here is the code:
import boto3
client = boto3.client('ce', region_name='us-east-1', aws_access_key_id=key_id, aws_secret_access_key=secret_key)
# the args dict contains the request filters
data = client.get_cost_forecast(**args)
The result is:
AttributeError: 'CostExplorer' object has no attribute 'get_cost_forecast'
But the actual documentation for the API says that it provides the get_cost_forecast() function.
There is no method get_cost_forecast in your version; you can refer to the document below to get the cost forecast:
Boto3 CostForecast
e.g.:
import boto3

client = boto3.client('ce')
response = client.get_cost_forecast(
    TimePeriod={
        'Start': 'string',
        'End': 'string'
    },
    Metric='BLENDED_COST'|'UNBLENDED_COST'|'AMORTIZED_COST'|'NET_UNBLENDED_COST'|'NET_AMORTIZED_COST'|'USAGE_QUANTITY'|'NORMALIZED_USAGE_AMOUNT',
    Granularity='DAILY'|'MONTHLY'|'HOURLY',
    PredictionIntervalLevel=123
)
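For reference, a concrete call might look like the sketch below (the dates are placeholders; adjust the TimePeriod to the window you want to forecast):
import boto3

client = boto3.client('ce', region_name='us-east-1')

# Placeholder dates for the forecast window.
response = client.get_cost_forecast(
    TimePeriod={'Start': '2019-07-01', 'End': '2019-08-01'},
    Metric='UNBLENDED_COST',
    Granularity='MONTHLY'
)
print(response['Total']['Amount'], response['Total']['Unit'])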
So, I figured out that the version of botocore I am using (1.8.45) does not support the get_cost_forecast() method. An upgrade to version 1.9.71 is needed. I hope this will help other people facing this issue.
I am trying to write tests for a serverless application using the AWS serverless framework. I am facing a weird issue. Whenever I try to mock S3 or DynamoDB using moto, it does not work. Instead of mocking, the boto3 call actually goes to my AWS account and tries to do things there.
This is not desirable behaviour. Could you please help?
Sample Code:
import datetime
import boto3
import uuid
import os
from moto import mock_dynamodb2
from unittest import mock, TestCase
from JobEngine.job_engine import check_duplicate


class TestJobEngine(TestCase):

    @mock.patch.dict(os.environ, {'IN_QUEUE_URL': 'mytemp'})
    @mock.patch('JobEngine.job_engine.logger')
    @mock_dynamodb2
    def test_check_duplicate(self, mock_logger):
        id = 'ABCD123'
        db = boto3.resource('dynamodb', 'us-east-1')
        table = db.create_table(
            TableName='my_table',
            KeySchema=[
                {
                    'AttributeName': 'id',
                    'KeyType': 'HASH'
                }
            ],
            AttributeDefinitions=[
                {
                    'AttributeName': 'id',
                    'AttributeType': 'S'
                }
            ],
            ProvisionedThroughput={
                'ReadCapacityUnits': 1,
                'WriteCapacityUnits': 1
            }
        )
        table.meta.client.get_waiter('table_exists').wait(TableName='my_table')
        table.put_item(
            Item={
                'id': {'S': id},
                ... other data ...
            }
        )
        res = check_duplicate(id)

        self.assertTrue(mock_logger.info.called)
        self.assertEqual(res, True, 'True')
Please see the above code: I am trying to insert a record into the table and then call a function that verifies whether the specified id is already present in the table. I get a 'table already exists' error when I run this code.
If I disable the network, I get an error:
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://dynamodb.us-east-1.amazonaws.com/"
I fail to understand why there is an attempt to connect to AWS if we are trying to mock.
I did some digging and have finally managed to solve this.
See https://github.com/spulec/moto/issues/1793
This issue was due to some incompatibilities between boto and moto. It turns out that everything works fine when we downgrade botocore to 1.10.84.
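If you want to double-check which versions your test environment actually picks up before and after the downgrade, a quick (hypothetical) check is:
# Print the installed versions; the answer above reports that
# botocore 1.10.84 works with the moto release in use.
import boto3
import botocore
import moto

print("boto3:", boto3.__version__)
print("botocore:", botocore.__version__)
print("moto:", moto.__version__)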
New to the BigQuery API. I'm trying to do a basic query and have it save the results to a table.
I am not sure what I am doing wrong with the code below (I have read the similar questions posted about this topic). I don't get an error, but it also doesn't save the results in a table like I want.
Any thoughts/advice?
import argparse

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
bigquery_service = build('bigquery', 'v2', credentials=credentials)
query_request = bigquery_service.jobs()

query_data = {
    'query': (
        'SELECT * '
        'FROM [analytics.ddewber_acq_same_day] limit 5;'),
    'destinationTable': {
        "projectId": 'XXX-XXX-XXX',
        "datasetId": 'analytics',
        "tableId": "ddewber_test12"
    },
    "createDisposition": "CREATE_IF_NEEDED",
    "writeDisposition": "WRITE_APPEND",
}

query_response = query_request.query(
    projectId='XXX-XXX-XXX',
    body=query_data).execute()
See the difference between the Jobs: query API (which you use in your example) and the Jobs: insert API (which you should use here). The query method's request body has no destinationTable, createDisposition or writeDisposition fields, so those settings appear to be silently ignored, which matches the behaviour you describe; in a jobs.insert request they belong inside the configuration.query object.
Hope this gives you direction to fix your code.
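As an illustration only (a rough sketch built from your snippet, not a tested job configuration), the same request expressed as a jobs.insert call would look roughly like this:
from googleapiclient.discovery import build
from oauth2client.client import GoogleCredentials

credentials = GoogleCredentials.get_application_default()
bigquery_service = build('bigquery', 'v2', credentials=credentials)

# With jobs.insert, the query settings live under configuration.query.
job_data = {
    'configuration': {
        'query': {
            'query': 'SELECT * FROM [analytics.ddewber_acq_same_day] LIMIT 5;',
            'destinationTable': {
                'projectId': 'XXX-XXX-XXX',
                'datasetId': 'analytics',
                'tableId': 'ddewber_test12'
            },
            'createDisposition': 'CREATE_IF_NEEDED',
            'writeDisposition': 'WRITE_APPEND'
        }
    }
}

insert_response = bigquery_service.jobs().insert(
    projectId='XXX-XXX-XXX',
    body=job_data).execute()
print(insert_response['jobReference'])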
I want to get the results of multiple queries in a single call to Freebase, as described in this chapter: http://mql.freebaseapps.com/ch04.html. I am using Python for querying. I want to query like this:
{                                        # Start the outer envelope
  "q1": {                                # Query envelope for query named q1
    "query": {First MQL query here}      # Query property of query envelope
  },                                     # End of first query envelope
  "q2": {                                # Start query envelope for query q2
    "query": [{Second MQL query here}]   # Query property of q2
  }                                      # End of second query envelope
}
and get answers like
{
  "q1": {
    "result": {First MQL result here},
    "code": "/api/status/ok"
  },
  "q2": {
    "result": [{Second MQL result here}],
    "code": "/api/status/ok"
  },
  "status": "200 OK",
  "code": "/api/status/ok",
  "transaction_id": [opaque string value]
}
as specified at that link. I also came across some related questions on SO:
Freebase python
Multiple Queries in MQL on Freebase
But they seem to be using the old API at "api.freebase.com". The updated API is at "www.googleapis.com/freebase".
I tried the following code, but it's not working.
import json
import urllib
api_key = "freebase_api_key"
service_url = 'https://www.googleapis.com/freebase/v1/mqlread'
query1 = [{'id': None, 'name': None, 'type': '/astronomy/planet'}]
query2 = [{'id': None, 'name': None, 'type': '/film/film'}]
envelope = {
    'q1': query1,
    'q2': query2
}
encoded = json.dumps(envelope)
params = urllib.urlencode({'query':encoded})
url = service_url + '?' + params
print url
response = json.loads(urllib.urlopen(url).read())
print response
I am getting this error:
{u'error': {u'code': 400, u'message': u'Type /type/object does not have property q1', u'errors': [{u'domain': u'global', u'message': u'Type /type/object does not have property q1', u'reason': u'invalid'}]}}
How can I embed multiple queries into a single MQL query?
I'd suggest looking at the Batch capability of the Python client library for the Google APIs.
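A rough, untested sketch of what that could look like (assuming the Freebase v1 discovery service exposes an mqlread() method, and reusing your two queries):
import json

from googleapiclient.discovery import build

# Sketch only: build the Freebase v1 service via the discovery API.
freebase = build('freebase', 'v1', developerKey='freebase_api_key')

query1 = [{'id': None, 'name': None, 'type': '/astronomy/planet'}]
query2 = [{'id': None, 'name': None, 'type': '/film/film'}]

results = {}

def callback(request_id, response, exception):
    # Each sub-response is delivered with the request_id it was added under ('q1', 'q2').
    if exception is None:
        results[request_id] = response['result']
    else:
        results[request_id] = exception

batch = freebase.new_batch_http_request(callback=callback)
batch.add(freebase.mqlread(query=json.dumps(query1)), request_id='q1')
batch.add(freebase.mqlread(query=json.dumps(query2)), request_id='q2')
batch.execute()

print(results)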