Nothing being written into the Redshift table [closed] - amazon-web-services

Closed. This question was caused by a typo or a problem that can no longer be reproduced, and it is not currently accepting answers.
Closed yesterday.
I have this AWS Lambda function for writing to Redshift. It executes without error but doesn't actually create the table. Does anyone have any thoughts on what might be wrong or what checks I could perform?
import json
import boto3
import botocore.session as bc
from botocore.client import Config

print('Loading function')

secret_arn = 'arn:aws:secretsmanager:<some secret stuff here>'
cluster_id = 'cluster_id'

bc_session = bc.get_session()
region = boto3.session.Session().region_name

session = boto3.Session(
    botocore_session=bc_session,
    region_name=region
)

config = Config(connect_timeout=180, read_timeout=180)
client_redshift = session.client("redshift-data", config=config)

def lambda_handler(event, context):
    query_str = "create table db.lambda_func (id int);"
    try:
        result = client_redshift.execute_statement(
            Database='db',
            SecretArn=secret_arn,
            Sql=query_str,
            ClusterIdentifier=cluster_id
        )
        print("API successfully executed")
        print('RESULT: ', result)

        stmtid = result['Id']
        response = client_redshift.describe_statement(Id=stmtid)
        print('RESPONSE: ', response)
    except Exception as e:
        raise Exception(e)

    return str(result)
RESULT: {'ClusterIdentifier': 'redshift-datalake', 'CreatedAt': datetime.datetime(2023, 2, 16, 16, 56, 9, 722000, tzinfo=tzlocal()), 'Database': 'db', 'Id': '648bd5b6-6d3f-4d12-9435-94e316e8dbaa', 'SecretArn': 'arn:aws:secretsmanager:<secret_here>', 'ResponseMetadata': {'RequestId': '648bd5b6-6d3f-4d12-9435-94e316e8dbaa', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '648bd5b6-6d3f-4d12-9435-94e316e8dbaa', 'content-type': 'application/x-amz-json-1.1', 'content-length': '249', 'date': 'Thu, 16 Feb 2023 16:56:09 GMT'}, 'RetryAttempts': 0}}

RESPONSE: {'ClusterIdentifier': 'redshift-datalake', 'CreatedAt': datetime.datetime(2023, 2, 16, 16, 56, 9, 722000, tzinfo=tzlocal()), 'Duration': -1, 'HasResultSet': False, 'Id': '648bd5b6-6d3f-4d12-9435-94e316e8dbaa', 'QueryString': 'create table db.lambda_func (id int);', 'RedshiftPid': 0, 'RedshiftQueryId': 0, 'ResultRows': -1, 'ResultSize': -1, 'SecretArn': 'arn:aws:secretsmanager:<secret_here>', 'Status': 'PICKED', 'UpdatedAt': datetime.datetime(2023, 2, 16, 16, 56, 9, 904000, tzinfo=tzlocal()), 'ResponseMetadata': {'RequestId': '15e99ba3-8b63-4775-bd4e-c8d4f2aa44b4', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '15e99ba3-8b63-4775-bd4e-c8d4f2aa44b4', 'content-type': 'application/x-amz-json-1.1', 'content-length': '437', 'date': 'Thu, 16 Feb 2023 16:56:09 GMT'}, 'RetryAttempts': 0}}
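For what it's worth, execute_statement is asynchronous, so the 'PICKED' status above only means the statement has been accepted, not that it has finished. A minimal sketch of polling the statement until it reaches a terminal state (reusing client_redshift and stmtid from the handler above; the one-second interval is arbitrary):

import time

# Poll until the statement reaches a terminal state
# (FINISHED, FAILED or ABORTED for the Redshift Data API).
response = client_redshift.describe_statement(Id=stmtid)
while response['Status'] not in ('FINISHED', 'FAILED', 'ABORTED'):
    time.sleep(1)  # arbitrary polling interval
    response = client_redshift.describe_statement(Id=stmtid)

print('Final status:', response['Status'])
if response['Status'] == 'FAILED':
    print('Error:', response.get('Error'))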

Related

KeyError: 'GroupName'

I'm creating an IAM group and trying to print the group name that gets created. When I try that, it's giving me this error
KeyError: 'GroupName'
Here's my function:
import boto3
import botocore.exceptions

def cf_admin_iam_group():
    iam = boto3.client('iam')
    try:
        response = iam.create_group(GroupName='Test')
        print(response['GroupName'])
    except botocore.exceptions.ClientError as error:
        print(error)
When I just run print(response), I get the expected output:
{'Group': {'Path': '/', 'GroupName': 'Test', 'GroupId': 'AGPAXVCO7KXYHZP24FQFZ', 'Arn':
'arn:aws:iam::526299125232:group/Test', 'CreateDate': datetime.datetime(2022, 9, 13, 19, 17, 51, tzinfo=tzutc())},
'ResponseMetadata': {'RequestId': '7b2e7c6c-a811-497a-b2c5-177c70a0464c',
'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '7b2e7c6c-a811-497a-b2c5-177c70a0464c',
'content-type': 'text/xml', 'content-length': '490', 'date': 'Tue, 13 Sep 2022 19:17:51 GMT'}, 'RetryAttempts': 0}}
I'm not sure why running print(response['GroupName']) is giving me an error instead of printing the group name.
response is a dictionary, and GroupName is a key inside the Group value, so you need to use:
print(response['Group']['GroupName'])
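For reference, a minimal sketch of the corrected function, applying the ['Group']['GroupName'] lookup from the answer:

import boto3
import botocore.exceptions

def cf_admin_iam_group():
    iam = boto3.client('iam')
    try:
        response = iam.create_group(GroupName='Test')
        # create_group returns {'Group': {...}}, so the name lives under 'Group'
        print(response['Group']['GroupName'])
    except botocore.exceptions.ClientError as error:
        print(error)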

Empty datapoints received while retrieving AWS S3 Request metrics

Following is my payload
import boto3
from datetime import datetime

cloudwatch = boto3.client('cloudwatch')  # CloudWatch client (created elsewhere in my code)

response = cloudwatch.get_metric_statistics(
    Namespace='AWS/S3',
    Dimensions=[
        {
            'Name': 'BucketName',
            'Value': 'foo-bar'
        },
        {
            'Name': 'StorageType',
            'Value': 'AllStorageTypes'
        }
    ],
    MetricName='BytesUploaded',
    StartTime=datetime(2021, 3, 11),
    EndTime=datetime(2021, 3, 14),
    Period=86400,
    Statistics=[
        'Maximum', 'Average'
    ]
)
and this is the response
{'Label': 'BytesUploaded', 'Datapoints': [], 'ResponseMetadata': {'RequestId': '1c6b02e9-9a8f-48e9-a2fd-1e21fd31a096', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '1c6b02e9-9a8f-48e9-a2fd-1e21fd31a096', 'content-type': 'text/xml', 'content-length': '336', 'date': 'Tue, 16 Mar 2021 05:51:05 GMT'}, 'RetryAttempts': 0}}
From the AWS Console, I'm able to see datapoints for the same time range. I tried increasing the timeframe but it still gives the same result.
Can someone help me please? Thanks.
First, if the Period value is a refresh period, you may need to reduce it; in the example I checked, the period was 300.
Second, try changing EndTime, for example:
from datetime import datetime
from datetime import timedelta

EndTime=datetime.utcnow(),
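Putting the suggestion together, a minimal sketch of the adjusted call (the bucket, metric, and dimensions are taken from the question; the three-day relative window is an assumption):

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client('cloudwatch')

response = cloudwatch.get_metric_statistics(
    Namespace='AWS/S3',
    Dimensions=[
        {'Name': 'BucketName', 'Value': 'foo-bar'},
        {'Name': 'StorageType', 'Value': 'AllStorageTypes'},
    ],
    MetricName='BytesUploaded',
    StartTime=datetime.utcnow() - timedelta(days=3),  # assumed relative window
    EndTime=datetime.utcnow(),                        # per the suggestion above
    Period=86400,  # the answer also suggests trying a smaller period, e.g. 300
    Statistics=['Maximum', 'Average']
)
print(response['Datapoints'])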

How to stream word document in bytes stored in AWS S3 from boto3

Using boto3, I am trying to retrieve a Microsoft Word document stored in S3. However, when I access the object by calling client.get_object(), the content-length of the Word document is 0, while files with .txt extensions return the correct content-length. Is there a way to decode the Word document in order to write its output to a stream?
I have tested this with .txt files and .docx files, and I have also tried using the .decode() method after reading the file, but based on the content being returned, there doesn't seem to be anything to decode.
Accessing a .txt document, I notice that the content-length is 17 (the number of characters in the file) and the contents can be read by calling txt_file['Body'].read()
s3 = boto3.client('s3')
txt_file = s3.get_object(Bucket="test_bucket", Key="test.txt")
>>> txt_file
{
u'Body': <botocore.response.StreamingBody object at 0x7fc5f0074f10>,
u'AcceptRanges': 'bytes',
u'ContentType': 'text/plain',
'ResponseMetadata': {
'HTTPStatusCode': 200,
'RetryAttempts': 0,
'HTTPHeaders': {
'content-length': '17',
'accept-ranges': 'bytes',
'server': 'AmazonS3',
'last-modified': 'Sat, 06 Jul 2019 02:13:45 GMT',
'date': 'Sat, 06 Jul 2019 15:58:21 GMT',
'x-amz-server-side-encryption': 'AES256',
'content-type': 'text/plain'
}
}
}
Accessing a .docx document, I notice that the content-length is 0 (even though the document contains the same string written to the .txt file) and calling word_file['Body'].read() returns the empty string u''
s3 = boto3.client('s3')
word_file = s3.get_object(Bucket="test_bucket", Key="test.docx")
>>> word_file
{
u'Body': <botocore.response.StreamingBody object at 0x7fc5f0074f10>,
u'AcceptRanges': 'bytes',
u'ContentType': 'binary/octet-stream',
'ResponseMetadata': {
'HTTPStatusCode': 200,
'RetryAttempts': 0,
'HTTPHeaders': {
'content-length': '0',
'accept-ranges': 'bytes',
'server': 'AmazonS3',
'last-modified': 'Thu, 04 Jul 2019 21:51:53 GMT',
'date': 'Sat, 06 Jul 2019 15:58:30 GMT',
'x-amz-server-side-encryption': 'AES256',
'content-type': 'binary/octet-stream'
}
}
}
I expect the content-length of both files to output the number of bytes in the file, however, only the .txt file is returning data.
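For reference, a minimal sketch of reading an object's raw bytes into an in-memory stream, assuming the stored .docx object actually contains data (the bucket and key are the placeholders used above):

import io
import boto3

s3 = boto3.client('s3')

obj = s3.get_object(Bucket="test_bucket", Key="test.docx")
data = obj['Body'].read()   # raw bytes of the object
print(len(data))            # should match the object's content-length header

stream = io.BytesIO(data)   # file-like stream that downstream readers can consume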

Dataflow pipeline "lost contact with the service"

I'm running into trouble with an Apache Beam pipeline on Google Cloud Dataflow.
The pipeline is simple: read JSON from GCS, extract text from some nested fields, and write back to GCS.
It works fine when testing with a smaller subset of input files, but when I run it on the full data set I get the following error (after running fine through around 260M items).
Somehow the "worker eventually lost contact with the service":
(8662a188e74dae87): Workflow failed. Causes: (95e9c3f710c71bc2): S04:ReadFromTextWithFilename/Read+FlatMap(extract_text_from_raw)+RemoveLineBreaks+FormatText+WriteText/Write/WriteImpl/WriteBundles/Do+WriteText/Write/WriteImpl/Pair+WriteText/Write/WriteImpl/WindowInto(WindowIntoFn)+WriteText/Write/WriteImpl/GroupByKey/Reify+WriteText/Write/WriteImpl/GroupByKey/Write failed., (da6389e4b594e34b): A work item was attempted 4 times without success. Each time the worker eventually lost contact with the service. The work item was attempted on:
extract-tags-150110997000-07261602-0a01-harness-jzcn,
extract-tags-150110997000-07261602-0a01-harness-828c,
extract-tags-150110997000-07261602-0a01-harness-3w45,
extract-tags-150110997000-07261602-0a01-harness-zn6v
The stack trace shows a "Failed to update work status" / "Progress reporting thread got error" error:
Exception in worker loop: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 776, in run
    deferred_exception_details=deferred_exception_details)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 629, in do_work
    exception_details=exception_details)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 168, in wrapper
    return fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 490, in report_completion_status
    exception_details=exception_details)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 298, in report_status
    work_executor=self._work_executor)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py", line 333, in report_status
    self._client.projects_locations_jobs_workItems.ReportStatus(request))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py", line 467, in ReportStatus
    config, request, global_params=global_params)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py", line 723, in _RunMethod
    return self.ProcessHttpResponse(method_config, http_response, request)
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py", line 729, in ProcessHttpResponse
    self.__ProcessHttpResponse(method_config, http_response, request))
  File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py", line 600, in __ProcessHttpResponse
    http_response.request_url, method_config, request)
HttpError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/qollaboration-live/locations/us-central1/jobs/2017-07-26_16_02_36-1885237888618334364/workItems:reportStatus?alt=json>: response: <{'status': '400', 'content-length': '360', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Wed, 26 Jul 2017 23:54:12 GMT', 'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json; charset=UTF-8'}>, content <{ "error": { "code": 400, "message": "(7f8a0ec09d20c3a3): Failed to publish the result of the work update. Causes: (7f8a0ec09d20cd48): Failed to update work status. Causes: (afa1cd74b2e65619): Failed to update work status., (afa1cd74b2e65caa): Work \"6306998912537661254\" not leased (or the lease was lost).", "status": "INVALID_ARGUMENT" } } >
And Finally:
HttpError: HttpError accessing <https://dataflow.googleapis.com/v1b3/projects/[projectid-redacted]/locations/us-central1/jobs/2017-07-26_18_28_43-10867107563808864085/workItems:reportStatus?alt=json>: response: <{'status': '400', 'content-length': '358', 'x-xss-protection': '1; mode=block', 'x-content-type-options': 'nosniff', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'server': 'ESF', '-content-encoding': 'gzip', 'cache-control': 'private', 'date': 'Thu, 27 Jul 2017 02:00:10 GMT', 'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json; charset=UTF-8'}>, content <{ "error": { "code": 400, "message": "(5845363977e915c1): Failed to publish the result of the work update. Causes: (5845363977e913a8): Failed to update work status. Causes: (44379dfdb8c2b47): Failed to update work status., (44379dfdb8c2e88): Work \"9100669328839864782\" not leased (or the lease was lost).", "status": "INVALID_ARGUMENT" } } >
at __ProcessHttpResponse (/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py:600)
at ProcessHttpResponse (/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py:729)
at _RunMethod (/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py:723)
at ReportStatus (/usr/local/lib/python2.7/dist-packages/apache_beam/runners/dataflow/internal/clients/dataflow/dataflow_v1b3_client.py:467)
at report_status (/usr/local/lib/python2.7/dist-packages/dataflow_worker/workerapiclient.py:333)
at report_status (/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py:298)
at report_completion_status (/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py:490)
at wrapper (/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py:168)
at do_work (/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py:629)
at run (/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py:776)
This looks to me like an error in the Dataflow internals. Can anyone confirm? Are there any workarounds?
The HttpError typically appears after the workflow has failed and is part of the failure/teardown process.
It looks like there were other errors reported in your pipeline as well. Note that if the same elements fail 4 times, the work item is marked as failed.
Try looking at the Stack Traces section in the UI to identify the other errors and their stack traces. Since this only occurs on the larger dataset, consider the possibility of there being malformed elements that only exist in the larger dataset.
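If malformed elements turn out to be the cause, one common guard is to make the extraction step defensive so that a bad record is skipped instead of failing the bundle. A minimal sketch (the JSON field names here are hypothetical):

import json
import logging

def extract_text_from_raw(line):
    # Defensive version of the extraction step used with FlatMap:
    # skip malformed elements instead of raising and failing the work item.
    try:
        record = json.loads(line)
        yield record['nested']['text']  # hypothetical field names
    except (ValueError, KeyError) as e:
        logging.warning('Skipping malformed element: %s', e)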

Facebook graph API : can post on "me/feed" but not on "page_id/feed" (error : 1455002)

I guess the answer to this one is straightforward but I cannot find it. Any help would be very much appreciated.
I. Use case
The application (back-end in Python / Django) should write to a Facebook page.
II. Symptoms
When running the code below on "me/feed", the post is correctly inserted
When running the code below on "PAGE_ID/feed", there is an exception (see below in section IV.)
The scope of the authorisation is publish_stream, manage_pages
Also, the user_token is from a user in the test domain
III. Code
## Getting the user_access_token is dealt with before
from httplib2 import Http
from urllib import urlencode  # Python 2; on Python 3 use urllib.parse.urlencode

h = Http()
data = dict(message="Hello", access_token=user_access_token['access_token'])
resp, content = h.request("https://graph.facebook.com/PAGE_ID/feed", "POST", urlencode(data))
IV. Exception generated (using /PAGE_ID/feed)
resp : Response: {'status': '400', 'content-length': '119', 'expires': 'Sat, 01 Jan 2000 00:00:00 GMT', 'www-authenticate': 'OAuth "Facebook Platform" "invalid_request" "(#1) An unknown error occurred"', 'x-fb-rev': '976458', 'connection': 'keep-alive', 'pragma': 'no-cache', 'cache-control': 'no-store', 'date': 'Tue, 22 Oct 2013 21:45:20 GMT', 'access-control-allow-origin': '*', 'content-type': 'text/javascript; charset=UTF-8', 'x-fb-debug': 'HFItWh64ob+3hErv+rgYdFzHlRBVHP7Pg0Eg4hvqYlY='}

content str: {"error":{"message":"(#1) An unknown error occurred","type":"OAuthException","code":1,"error_data":{"kError":1455002}}}