Archiving to and retrieving from Glacier storage on Amazon AWS - amazon-web-services

I am trying to create an archive and retrieval system in PHP. When the user clicks the archive button, particular files should move from standard storage to Glacier storage, and when the user clicks the restore button, the file in Glacier storage should be retrieved back to standard storage.
Using the AWS PHP SDK 3.0 API, I have successfully moved files to a Glacier vault. For retrieval, an archive-retrieval job is initiated; I got the job ID after 3-5 hours, and using that job ID about 5 hours later I tried the getJobOutput function. I am getting the same response as described in the API documentation, but the restored file never appears in my S3 bucket.
Here is my code to upload to Glacier and restore from Glacier:
public function archiveAndRestore() {
    $this->s3Client = new S3Client(Configure::read('AWScredentials'));
    $this->glacier = GlacierClient::factory(Configure::read('AWScredentials'));

    // Upload to Glacier, reading the source object through the S3 stream wrapper
    $this->s3Client->registerStreamWrapper();
    $context = stream_context_create([
        's3' => ['seekable' => true]
    ]);
    $result = $this->glacier->uploadArchive(array(
        'vaultName' => 'archiveTest',
        'archiveDescription' => 'File Name is archiveTest.txt ',
        'body' => fopen('s3://storage-bucket/Videos/archiveTest.txt', 'r', false, $context),
    ));
    $archiveid = $result->get('archiveId');

    // Initiate an archive-retrieval job for the stored archive
    $jobId = $this->glacier->initiateJob([
        'accountId' => '-',
        'vaultName' => 'archiveTest',
        'jobParameters' => [
            'Type' => 'archive-retrieval',
            'ArchiveId' => 'ORgyyyqsKwoopp110EvFoyqj3G-csmOKLyy3IJnWF9Dpd8BJfwerEhg241nxHf6y6kNUUyhUHOaY4y8QvWBGESmAopa80f6GZ9C05tyyKANhY-qfBUB6YkfTABg',
        ],
    ]);

    // Once the job completes, write the job output back to S3 via the stream wrapper
    $this->s3Client->registerStreamWrapper();
    $context = stream_context_create([
        's3' => ['seekable' => true]
    ]);
    $stream = fopen('s3://storage-bucket/RetrivedFiles/test1.txt', 'w');
    $result = $this->glacier->getJobOutput([
        'accountId' => '-',
        'jobId' => '2dddfffffff9SwZIOPWxcB7TLm_3apNx--2rIiD7SgjOJjjkrerrcN1YCtivh_zsmpLyczY4br-bhyyX0Ev5B7e6-D1',
        'vaultName' => 'archiveTest',
        'saveAs' => $stream,
    ]);
    fclose($stream);
}
According to the documentation (the AWS GetJobOutput operation documentation), the saveAs attribute of the getJobOutput function specifies where the contents of the operation should be downloaded: it can be the path to a file, a resource returned by fopen, or a Guzzle\Http\EntityBodyInterface object. I am giving it a resource pointing to a file in S3, so what could the issue be? Any help is really appreciated. Thanks in advance.
This is the result contained in the response $result, which is exactly the same as mentioned in the documentation:
Aws\Result Object ( [data:Aws\Result:private] => Array ( [body] => GuzzleHttp\Psr7\Stream Object ( [stream:GuzzleHttp\Psr7\Stream:private] => Resource id #25 [size:GuzzleHttp\Psr7\Stream:private] => [seekable:GuzzleHttp\Psr7\Stream:private] => 1 [readable:GuzzleHttp\Psr7\Stream:private] => 1 [writable:GuzzleHttp\Psr7\Stream:private] => 1 [uri:GuzzleHttp\Psr7\Stream:private] => php://temp [customMetadata:GuzzleHttp\Psr7\Stream:private] => Array ( ) ) [checksum] => c176c1843fd0c0fc662lh9bb8de916540e6f9dpk9b22020bbb8388jk6f81d1c2 [status] => 200 [contentRange] => [acceptRanges] => bytes [contentType] => application/octet-stream [archiveDescription] => File Name is children-wide.jpg [#metadata] => Array ( [statusCode] => 200 [effectiveUri] => https://glacier.region-name.amazonaws.com/-/vaults/vaultname/jobs/gFdjAl4xhTAVEnmffgfg-Ao3-xmmjghfmqkCLOR1m34gHLQpMd0a3WKCiRRrItv2bklawwZnq9KeIch3LKs8suZoJwk2_/output [headers] => Array ( [x-amzn-requestid] => NzAiVAfrMQbpSjj-2228iiKWK_VteDwNyFTUR7Kyu0duno [x-amz-sha256-tree-hash] => c176c1843khfullc662f09bb8de916540e6f9dcc9b22020bbb8388de6f81d1c2 [accept-ranges] => bytes [x-amz-archive-description] => File Name is children-wide.jpg [content-type] => application/octet-stream [content-length] => 1452770 [date] => Tue, 31 Jan 2017 03:34:26 GMT [connection] => close ) [transferStats] => Array ( [http] => Array ( [0] => Array ( ) ) ) ) ) )

When you restore a file from Glacier, its storage class does not change back to Standard; it will still show as Glacier. To determine whether a file has actually come back from Glacier, use GetObject instead and look at the Restore value of the result. Set the Range to "bytes=0-0" to skip retrieving the content of the file itself, and be sure to trap for exceptions:
if the object is in Glacier and not restored, AWS will throw an InvalidObjectState error and the script will die if the error is not caught.
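A minimal sketch of that check with the v3 PHP SDK might look like this (assuming an existing $s3Client instance; the bucket and key names are placeholders):

try {
    // Range of bytes=0-0 skips downloading the object body; we only need the headers
    $result = $s3Client->getObject([
        'Bucket' => 'storage-bucket',          // placeholder bucket name
        'Key'    => 'Videos/archiveTest.txt',  // placeholder object key
        'Range'  => 'bytes=0-0',
    ]);
    // Once restored, this holds something like:
    // ongoing-request="false", expiry-date="Thu, 12 Oct 2017 00:00:00 GMT"
    $restoreStatus = $result['Restore'];
} catch (\Aws\S3\Exception\S3Exception $e) {
    if ($e->getAwsErrorCode() === 'InvalidObjectState') {
        // Object is still in Glacier and has not been restored yet
    } else {
        throw $e;
    }
}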
This is what you will see if the item is restored:
["Restore"] => string(68)"ongoing-request="false ", expiry-date=" Thu,
12 Oct 2017 00: 00: 00 GMT ""
And this is what you will get if the item is still in Glacier:
Fatal error: Uncaught exception 'Aws\S3\Exception\S3Exception' with message
'Error executing "GetObject" on "OBJ PATH";
AWS HTTP error: Client error: GET OBJ PATH
resulted in a 403 Forbidden response: InvalidObjectState The
operation is not valid for the (truncated...) InvalidObjectState
(client): The operation is not valid for the object's storage class -
InvalidObjectState The operation is not valid for the object's storage
class 879A42BDC3939282 VjgBNmLxhqesAaOnnUKkIahdr9OlUnTPASmjh8zZNVzLeYEDz+QooqoFjyaeoyXGeAa/IPxTBrA='
GuzzleHttp\Exception\ClientException: Client error: `GET OBJ PATH`
in C:\inetpub\wwwroot\cruisecheap.com\php_includes\SDKs\AWS\vendor\aws\aws-sdk-php\src\WrappedHttpHandler.php on line 192
I hope this helps you and other people having the same problem.

Related

Net::Amazon::S3::Client Produces Error on initiate_multipart_upload

There seems to be some issue triggering initiate_multipart_upload in Net::Amazon::S3::Client. When I do so, I receive the error:
Can't call method "findvalue" on an undefined value at /usr/local/share/perl5/Net/Amazon/S3/Operation/Object/Upload/Create/Response.pm line 17.
I can do a normal put on the object without any error. Everything else seems to be functional, with bucket listings working, etc. Here is a code snippet of what I'm trying:
use Net::Amazon::S3;

our $s3 = Net::Amazon::S3->new({
    aws_access_key_id     => $config{'access_key_id'},
    aws_secret_access_key => $config{'access_key'},
    retry                 => 1
});
our $s3client = Net::Amazon::S3::Client->new( s3 => $s3 );
my $bucket   = $s3client->bucket( name => $bucketName );
my $s3object = $bucket->object( key => 'test.txt' );
print 'Key: ' . $s3object->key;
my $uploadId = $s3object->initiate_multipart_upload;
Net::Amazon::S3::Client doesn't object to creating the bucket or the object.
I've been able to initialize a multipart upload using the Paws::S3 library on this bucket just fine. I switched to Net::Amazon::S3::Client since Paws::S3 seems to get stuck at around the 21% mark of uploading my multipart file.

Unable to implement a logstash pipeline with Kafka as input and S3 as output, with each message persisted as an individual file

How can I create a logstash (https://www.elastic.co/logstash/) pipeline that writes each individual message to its own file in an AWS S3 bucket, with the file name taken from one of the attributes of the Kafka message?
I am able to set up a simple pipeline using the following.
I am using the S3 output plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-outputs-s3.html
and the Kafka input plugin:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html
input {
  kafka {
    bootstrap_servers => "mykafkaserver:9092"
    topics            => "document"
    group_id          => "xLogAna1"
    auto_offset_reset => "earliest"
    max_poll_records  => 1
    fetch_max_bytes   => 10
  }
}
output {
  s3 {
    access_key_id     => "XXXXXXXXXXXXXXX"
    secret_access_key => "SSSSSSSSSSSSSSS"
    region            => "eu-west-1"
    bucket            => "<my-documnt-bucket>"
    size_file         => 1
    time_file         => 5
    codec             => "plain"
  }
}
Since I want individual messages in individual files, I tweaked the parameters max_poll_records and fetch_max_bytes to get individual messages, but it did not help. In the resulting S3 files I am still getting numerous Kafka messages.

How to get the AWS S3 bucket location via a PHP API call?

I have been searching the internet for how to get the AWS S3 bucket region with an API call, or directly in PHP using their library, but have had no luck finding the info.
I have the following info available:
Account credentials, bucket name, access key + secret. This is for multiple buckets that I have access to, and I need to get the region programmatically, so logging in to the AWS console and checking is not an option.
Assuming you have an instance of the AWS PHP Client in $client, you should be able to find the location with $client->getBucketLocation().
Here is some example code:
<?php
$result = $client->getBucketLocation([
    'Bucket' => 'yourBucket',
]);
The result will look like this:
[
    'LocationConstraint' => 'the-region-of-your-bucket',
]
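One caveat worth adding (a sketch, not part of the original answer): buckets in us-east-1 come back with an empty LocationConstraint, so you may want to normalize the value before using it as a region.

$result = $client->getBucketLocation([
    'Bucket' => 'yourBucket',
]);
// Buckets in us-east-1 report an empty LocationConstraint,
// so fall back to 'us-east-1' when the value is empty.
$region = $result['LocationConstraint'] ?: 'us-east-1';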
When you create an S3 client, you can use any of the available AWS regions, even if it's not the one you actually use.
$s3Client = new Aws\S3\S3MultiRegionClient([
    'version' => 'latest',
    'region'  => 'us-east-1',
    'credentials' => [
        'key'    => $accessKey,
        'secret' => $secretKey,
    ],
]);

$region = $s3Client->determineBucketRegion($bucketname);

AWS S3 put object (video) with pre-signed URL, Not working?

I am uploading a video to an S3 bucket via the PHP S3 API with a pre-signed URL.
The mp4 video is uploaded successfully to S3, but it's not streaming and it's not giving any kind of error.
Here are the details.
My PHP file to create a pre-signed URL for the S3 putObject method:
require 'aws/aws-autoloader.php';

use Aws\S3\S3Client;
use Aws\S3\Exception\S3Exception;
use Aws\Exception\AwsException;

$s3Client = new Aws\S3\S3Client([
    'version' => 'latest',
    'region'  => 'ap-south-1',
    'credentials' => [
        'key'    => 'XXXXXXX',
        'secret' => 'XXXXXXX'
    ]
]);

/*echo '<pre>';
print_r($_FILES);die;*/

if (!$_FILES['file']['tmp_name'] || $_FILES['file']['tmp_name'] == '') {
    echo json_encode(array('status' => 'false', 'message' => 'file path is required!'));
    die;
} else {
    $SourceFile = $_FILES['file']['tmp_name'];
    $key        = $_FILES['file']['name'];
    $size       = $_FILES['file']['size'];
}

try {
    $cmd = $s3Client->getCommand('putObject', [
        'Bucket'        => 's3-signed-test',
        'Key'           => $key,
        'SourceFile'    => $SourceFile,
        'debug'         => false,
        'ACL'           => 'public-read-write',
        'ContentType'   => 'video/mp4',
        'CacheControl'  => 'no-cache',
        'ContentLength' => $size
    ]);
    $request = $s3Client->createPresignedRequest($cmd, '+120 minutes');
    // Get the actual presigned URL
    $presignedUrl = (string) $request->getUri();
} catch (S3Exception $e) {
    echo $e->getMessage() . "\n";
    die;
}

echo json_encode(array('status' => 'true', 'signedUrl' => $presignedUrl));
die;
This code is working fine and uploads the mp4 video to the S3 bucket.
But after the upload, when I try to access that video, it doesn't work.
I have also tried with a getObject pre-signed URL, but it's not working either.
Here are the S3 object URLs:
(1) getObject pre-signed URL
https://s3-signed-test.s3.ap-south-1.amazonaws.com/file.mp4?X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIVUO7AT4W4MCPDIA%2F20180402%2Fap-south-1%2Fs3%2Faws4_request&X-Amz-Date=20180402T112848Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Signature=d6b877f9bba5dd2221381f10017c8659fe42342d81f7af940d8693478679a8fc
(2) S3 direct object URL
https://s3.ap-south-1.amazonaws.com/s3-signed-test/file.mp4
My problem is that I am unable to access the video that I uploaded to the S3 bucket with a pre-signed URL. The bucket permission is public and accessible for all origins.
Please let me know if someone has a solution for this.

Return JSON from Athena Query via API

I am able to use the Athena API with startQueryExecution() to create a CSV file of the responses in S3. However, I would like to return a JSON response to my application so I can further process the data. I am trying to return JSON results after I run startQueryExecution() via the API; how can I get the results back as a JSON response?
I am using the AWS PHP SDK [https://aws.amazon.com/sdk-for-php/], however this is relevant to any language, since I cannot find any answers about actually getting a response back; it just saves a CSV file to S3.
$athena = AWS::createClient('athena');
$queryx = 'SELECT * FROM elb_logs LIMIT 20';

$result = $athena->startQueryExecution([
    'QueryExecutionContext' => [
        'Database' => 'sampledb',
    ],
    'QueryString' => 'SELECT request_ip FROM elb_logs LIMIT 20', // REQUIRED
    'ResultConfiguration' => [ // REQUIRED
        'EncryptionConfiguration' => [
            'EncryptionOption' => 'SSE_S3' // REQUIRED
        ],
        'OutputLocation' => 's3://xxxxxx/', // REQUIRED
    ],
]);

// check completion : getQueryExecution()
$exId = $result['QueryExecutionId'];
sleep(6);

$checkExecution = $athena->getQueryExecution([
    'QueryExecutionId' => $exId, // REQUIRED
]);

if ($checkExecution["QueryExecution"]["Status"]["State"] == 'SUCCEEDED') {
    $dataOutput = $athena->getQueryResults([
        'QueryExecutionId' => $result['QueryExecutionId'], // REQUIRED
    ]);

    while (($data = fgetcsv($dataOutput, 1000, ",")) !== FALSE) {
        $num = count($data);
        echo "<p> $num fields in line $row: <br /></p>\n";
        $row++;
        for ($c = 0; $c < $num; $c++) {
            echo $data[$c] . "<br />\n";
        }
    }
}
The Amazon Athena SDK will return the results of a query, and you can then write (send) them as JSON yourself; the SDK will not do this for you.
startQueryExecution() returns a QueryExecutionId. Use this to call getQueryExecution() to determine whether the query is complete. Once the query completes, call getQueryResults().
You can then process each row in the result set.
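As a rough sketch (reusing $athena and $exId from the code above, and assuming a plain SELECT so the first row returned by getQueryResults() is the header row), converting the result set to JSON could look like this:

$results = $athena->getQueryResults([
    'QueryExecutionId' => $exId,
]);

$rows = $results['ResultSet']['Rows'];

// The first row holds the column names for a plain SELECT
$headers = array_map(function ($col) {
    return isset($col['VarCharValue']) ? $col['VarCharValue'] : '';
}, $rows[0]['Data']);

$records = [];
foreach (array_slice($rows, 1) as $row) {
    $values = array_map(function ($col) {
        return isset($col['VarCharValue']) ? $col['VarCharValue'] : null;
    }, $row['Data']);
    $records[] = array_combine($headers, $values);
}

// Send the rows back to the application as JSON
echo json_encode($records);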