What if I did not close the S3Object in the finally clause?
Would the code below have any resource leaks?
class S3ClientClass {

  lazy val amazonS3Client = this.getS3Client()

  private def getS3Client() = {
    AmazonS3ClientBuilder
      .standard()
      .withRegion(Regions.AP_NORTHEAST_1)
      .build()
  }

  def readFromS3(s3Bucket: String, filepath: String): String = {
    var s3object: S3Object = null
    try {
      s3object = amazonS3Client.getObject(s3Bucket, filepath)
      readFromS3(s3object)
    }
    finally {
      if (s3object != null) {
        s3object.close()
      }
    }
  }

  def readFromS3(obj: S3Object): String = {
    val reader = new BufferedReader(new InputStreamReader(obj.getObjectContent))
    reader.lines().collect(Collectors.joining())
  }
}
If you didn't close the S3Object, it would leak resources.
S3Object implements the Closeable interface and should be closed to release the resources it holds; in this case, those resources are the network connection(s) to Amazon S3.
This is also explained in the AWS Developer Blog:
S3Object contains an S3ObjectInputStream that lets you stream down your data over the HTTP connection from Amazon S3. Since the HTTP connection is open and waiting, it’s important to read the stream quickly after calling getObject and to remember to close the stream so that the HTTP connection can be released properly.
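If you can rely on Java's try-with-resources (or scala.util.Using on Scala 2.13+), the close happens automatically and cannot be forgotten. Below is a minimal sketch, assuming the AWS SDK for Java v1 and an already-built AmazonS3 client; the bucket and key are placeholders:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.S3Object;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class S3ReadExample {
    public static String readFromS3(AmazonS3 s3, String bucket, String key) throws IOException {
        // Both the S3Object and the reader are closed automatically, even if reading throws,
        // so the underlying HTTP connection is always released back to the pool.
        try (S3Object s3object = s3.getObject(bucket, key);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(s3object.getObjectContent(), StandardCharsets.UTF_8))) {
            return reader.lines().collect(Collectors.joining("\n"));
        }
    }
}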
I have a Spring Boot application with a POST endpoint that accepts two types of files. Based on the file category, I need to write them to S3 buckets in different regions. For example, a Category 1 file should be written to Frankfurt (eu-central-1) and a Category 2 file to Ohio (us-east-2). Spring Boot accepts a static region (cloud.aws.region.static=eu-central-1) through property configuration, and the connection is established when the application starts, so the AmazonS3 client bean is already created with a connection to Frankfurt.
I need to containerize this entire setup and deploy it in a Kubernetes pod.
What is the recommendation for establishing connections and writing objects to buckets in different regions? How should I implement this? I am looking for a dynamic region-resolution solution rather than a statically created bean per region.
Below is a working piece of code that connects to the Frankfurt bucket and puts the object.
@Service
public class S3Service {

    @Autowired
    private AmazonS3 amazonS3Client;

    public void putObject(MultipartFile multipartFile) {
        ObjectMetadata objectMetaData = new ObjectMetadata();
        objectMetaData.setContentType(multipartFile.getContentType());
        objectMetaData.setContentLength(multipartFile.getSize());
        try {
            PutObjectRequest putObjectRequest = new PutObjectRequest("example-bucket",
                    multipartFile.getOriginalFilename(), multipartFile.getInputStream(), objectMetaData);
            this.amazonS3Client.putObject(putObjectRequest);
        } catch (IOException e) {
            /* Handle Exception */
        }
    }
}
Updated Code (20/08/2021)
@Component
public class AmazoneS3ConnectionFactory {

    private static final Logger LOGGER = LoggerFactory.getLogger(AmazoneS3ConnectionFactory.class);

    @Value("${example.aws.s3.regions}")
    private String[] regions;

    @Autowired
    private DefaultListableBeanFactory beanFactory;

    @Autowired
    private AWSCredentialsProvider credentialProvider;

    @PostConstruct
    public void init() {
        for (String region : this.regions) {
            String amazonS3BeanName = region + "_" + "amazonS3";
            if (!this.beanFactory.containsBean(amazonS3BeanName)) {
                AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard()
                        .withPathStyleAccessEnabled(true)
                        .withCredentials(this.credentialProvider)
                        .withRegion(region)
                        .withChunkedEncodingDisabled(true);
                AmazonS3 awsS3 = builder.build();
                this.beanFactory.registerSingleton(amazonS3BeanName, awsS3);
                LOGGER.info("Bean " + amazonS3BeanName + " did not exist. Created and registered it.");
            }
        }
    }

    /**
     * Returns an {@link AmazonS3} client for a region. Uses the default {@link AWSCredentialsProvider}.
     */
    public AmazonS3 getConnection(String region) {
        String amazonS3BeanName = region + "_" + "amazonS3";
        return this.beanFactory.getBean(amazonS3BeanName, AmazonS3.class);
    }
}
My service layer calls getConnection(region) to obtain the right AmazonS3 client and operate on it.
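For illustration, a hypothetical caller of getConnection() might look like the sketch below; the RegionalS3Service name and the region/bucket parameters are assumptions for the example, not part of the original code:

import java.io.IOException;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.web.multipart.MultipartFile;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

// Hypothetical service layer that resolves the AmazonS3 client per request
// via AmazoneS3ConnectionFactory.getConnection(region).
@Service
public class RegionalS3Service {

    @Autowired
    private AmazoneS3ConnectionFactory connectionFactory;

    public void putObject(String region, String bucket, MultipartFile multipartFile) throws IOException {
        // Pick the client registered for the requested region.
        AmazonS3 amazonS3 = connectionFactory.getConnection(region);

        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentType(multipartFile.getContentType());
        metadata.setContentLength(multipartFile.getSize());

        amazonS3.putObject(new PutObjectRequest(bucket, multipartFile.getOriginalFilename(),
                multipartFile.getInputStream(), metadata));
    }
}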
The only option that I am aware of is to create a different S3 client with S3ClientBuilder, one for each region. You would need to register them as Spring beans with different names so that you can autowire them later.
Update (19/08/2021)
The following should work (sorry for the Kotlin code but it is faster to write):
Class that may contain your configuration for each region.
class AmazonS3Properties(val accessKeyId: String,
val secretAccessKey: String,
val region: String,
val bucket: String)
Configuration for S3 that creates the two S3Clients and stores the bucket for each region (needed later).
@Configuration
class AmazonS3Configuration(private val s3Properties: Map<String, AmazonS3Properties>) {

    lateinit var buckets: Map<String, String>

    @PostConstruct
    fun init() {
        buckets = s3Properties.mapValues { it.value.bucket }
    }

    @Bean(name = ["regionA"])
    fun regionA(): S3Client {
        val regionAProperties = s3Properties.getValue("region-a")
        val awsCredentials = AwsBasicCredentials.create(regionAProperties.accessKeyId, regionAProperties.secretAccessKey)
        return S3Client.builder().region(Region.of(regionAProperties.region)).credentialsProvider { awsCredentials }.build()
    }

    @Bean(name = ["regionB"])
    fun regionB(): S3Client {
        val regionBProperties = s3Properties.getValue("region-b")
        val awsCredentials = AwsBasicCredentials.create(regionBProperties.accessKeyId, regionBProperties.secretAccessKey)
        return S3Client.builder().region(Region.of(regionBProperties.region)).credentialsProvider { awsCredentials }.build()
    }
}
Service that will target one of the regions (Region A)
@Service
class RegionAS3Service(private val amazonS3Configuration: AmazonS3Configuration,
                       @field:Qualifier("regionA") private val amazonS3Client: S3Client) {

    fun save(region: String, byteArrayOutputStream: ByteArrayOutputStream) {
        val inputStream = ByteArrayInputStream(byteArrayOutputStream.toByteArray())
        val contentLength = byteArrayOutputStream.size().toLong()
        amazonS3Client.putObject(
            PutObjectRequest.builder().bucket(amazonS3Configuration.buckets[region]).key("whatever-key").build(),
            RequestBody.fromInputStream(inputStream, contentLength))
    }
}
I am trying to update my AppSync client to authenticate with IAM credentials. In the case of API_KEY I set the API_KEY_HEADER like so: request.addHeader(API_KEY_HEADER, this.apiKey); Is there a similar way to authenticate in a Java client with IAM credentials? Is there a header I can pass the secret and access keys in, like here: https://docs.amplify.aws/lib/graphqlapi/authz/q/platform/js#iam? Or should I just be using a Cognito user pool to authenticate the request?
According to the AWS documentation, we need to sign requests using the process documented here: https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html and the steps listed here: https://docs.aws.amazon.com/general/latest/gr/sigv4_signing.html.
I also found an implementation here: https://medium.com/@tridibbolar/aws-lambda-as-an-appsync-client-fbb0c1ce927d. Adapting the code from that post:
private void signRequest(final Request<AmazonWebServiceRequest> request) {
    final AWS4Signer signer = new AWS4Signer();
    signer.setRegionName(this.region);
    signer.setServiceName("appsync");
    signer.sign(request, this.appsyncCredentials);
}

private Request<AmazonWebServiceRequest> getRequest(final String data) {
    final Request<AmazonWebServiceRequest> request =
        new DefaultRequest<AmazonWebServiceRequest>("appsync");
    request.setHttpMethod(HttpMethodName.POST);
    request.setEndpoint(URI.create(this.appSyncEndpoint));
    final byte[] byteArray = data.getBytes(Charset.forName("UTF-8"));
    request.setContent(new ByteArrayInputStream(byteArray));
    request.addHeader(AUTH_TYPE_HEADER, AWS_IAM_AUTH_TYPE);
    request.addHeader(HttpHeaders.CONTENT_TYPE, APPLICATION_GRAPHQL);
    request.addHeader(HttpHeaders.CONTENT_LENGTH, String.valueOf(byteArray.length));
    signRequest(request);
    return request;
}

private HttpResponseHandler<String> getResponseHandler() {
    final HttpResponseHandler<String> responseHandler = new HttpResponseHandler<String>() {
        @Override
        public String handle(com.amazonaws.http.HttpResponse httpResponse) throws Exception {
            final String result = IOUtils.toString(httpResponse.getContent());
            if (httpResponse.getStatusCode() != HttpStatus.SC_OK) {
                final String errorText = String.format(
                    "Error posting request. Response status code was %s and text was %s. ",
                    httpResponse.getStatusCode(),
                    httpResponse.getStatusText());
                throw new RuntimeException(errorText);
            } else {
                final ObjectMapper objectMapper = new ObjectMapper();
                // custom class to parse the AppSync response.
                final AppsyncResponse response = objectMapper.readValue(result, AppsyncResponse.class);
                if (CollectionUtils.isNotEmpty(response.getErrors())) {
                    final String errorMessages = response
                        .getErrors()
                        .stream()
                        .map(Error::getMessage)
                        .collect(Collectors.joining("\n"));
                    final String errorText = String.format(
                        "Error posting appsync request. Errors were %s. ",
                        errorMessages);
                    throw new RuntimeException(errorText);
                }
            }
            return result;
        }

        @Override
        public boolean needsConnectionLeftOpen() {
            return false;
        }
    };
    return responseHandler;
}

private Response<String> makeGraphQlRequest(final Request<AmazonWebServiceRequest> request) {
    return this.httpClient.requestExecutionBuilder()
        .executionContext(new ExecutionContext())
        .request(request)
        .execute(getResponseHandler());
}
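For context, a minimal (hypothetical) caller of the methods above could build the standard GraphQL JSON payload and run it through getRequest and makeGraphQlRequest. The query text and the surrounding fields (httpClient as an AmazonHttpClient, the endpoint, credentials, and header constants) are assumptions carried over from the snippet:

// Hypothetical usage of the snippet above: post a GraphQL query to AppSync.
// Assumes the surrounding class defines httpClient (an AmazonHttpClient),
// appSyncEndpoint, region, appsyncCredentials and the header constants.
public String listItems() {
    // AppSync expects a JSON body of the form {"query": "...", "variables": {...}}.
    final String payload = "{\"query\": \"query { listItems { id name } }\", \"variables\": {}}";
    final Request<AmazonWebServiceRequest> request = getRequest(payload);
    return makeGraphQlRequest(request).getAwsResponse();
}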
I'm trying to upload a large video file (800 MB) to my S3 bucket, but it appears to time out. It works just fine for smaller files. My project is an ASP.NET Core 2.1 application.
This is the exception that is thrown:
An unhandled exception occurred while processing the request.
SocketException: The I/O operation has been aborted because of either a thread exit or an application request.
Unknown location
IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request.
System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.ThrowException(SocketError error)
TaskCanceledException: The operation was canceled.
GamerPilot.Video.AWSS3Helper.UploadFileAsync(Stream stream, string key, S3CannedACL acl, bool useReducedRedundancy, bool throwOnError, CancellationToken cancellationToken) in AWSS3Helper.cs, line 770
My source code looks like this:
public async Task<IVideo> AddVideoAsync(int instructorId, int lectureId, string videoName, string filePath, CancellationToken cancellationToken = default(CancellationToken))
{
    if (string.IsNullOrEmpty(filePath)) { throw new ArgumentNullException("filePath", "Video filepath is missing"); }
    if (!File.Exists(filePath)) { throw new ArgumentNullException("filePath", "Video filepath does not exist"); }

    //Test video file upload and db row insertion
    using (var stream = File.OpenRead(filePath))
    {
        return await AddVideoAsync(instructorId, lectureId, videoName, stream, cancellationToken);
    }
}

public async Task<IVideo> AddVideoAsync(int instructorId, int lectureId, string videoName, Stream videoFile, CancellationToken cancellationToken = default(CancellationToken))
{
    var video = (Video) await GamerPilot.Video.Helper.Create(_awsS3AccessKey, _awsS3SecretKey, _awsS3BucketName, _awsS3Region)
        .AddVideoAsync(instructorId, lectureId, videoName, videoFile, cancellationToken);

    using (var db = new DbContext(_connectionString))
    {
        db.Videos.Add(video);
        var count = await db.SaveChangesAsync();
    }

    return video;
}
public async Task<IVideo> AddVideoAsync(int instructorId, int lectureId, string videoName, Stream videoFile, CancellationToken cancellationToken = default(CancellationToken))
{
    if (string.IsNullOrEmpty(videoName)) { throw new ArgumentNullException("videoName", "Video name cannot be empty or null"); }
    if (videoFile == null) { throw new ArgumentNullException("video", "Video stream is missing"); }

    var videoNameCleaned = videoName.Replace(" ", "-").ToLower().Replace(".mp4", "");
    var videoKey = string.Join('/', "videos", instructorId, lectureId, videoNameCleaned + ".mp4");

    using (var aws = new AWSS3Helper(_awsS3AccessKey, _awsS3SecretKey, _awsS3BucketName, _awsS3Region))
    {
        try
        {
            //THIS FAILS ------
            await aws.UploadFileAsync(videoFile, videoKey, Amazon.S3.S3CannedACL.PublicRead, true, true, cancellationToken);
        }
        catch (Exception ex)
        {
            throw;
        }
    }

    return new Video
    {
        InstructorId = instructorId,
        LectureId = lectureId,
        Name = videoName,
        S3Key = videoKey,
        S3Region = _awsS3Region.SystemName,
        S3Bucket = _awsS3BucketName,
        Created = DateTime.Now
    };
}
How can I work around this?
There is no general constraint on S3 itself that would prevent you from uploading an 800 MB file. There are, however, requirements around handling retries and timeouts when working with AWS. It is not clear from your question whether you are using Amazon's SDK (I can't find the origin of GamerPilot.Video.AWSS3Helper.UploadFileAsync), but the AWS SDK for .NET should handle this for you if you use it in accordance with the following:
Programming with the AWS SDK for .NET - Retries and Timeouts
Using the AWS SDK for .NET for Multipart Upload (High-Level API)
I am trying to retrieve images from my bucket to send to my mobile apps. Currently the devices access AWS directly, but I am adding a layer of security: my apps (iOS and Android) now make requests to my server, which then responds with DynamoDB and S3 data.
I am trying to follow the documentation and code samples provided by AWS for .NET. They worked seamlessly for DynamoDB, but I am running into problems with S3.
S3 .NET Documentation
My problem is that if I provide no credentials, I get the error:
Failed to retrieve credentials from EC2 Instance Metadata Service
This is expected as I have IAM roles set up and only want my apps and this server (in the future, only this server) to have access to the buckets.
But when I provide the credentials, the same way I provided credentials for DynamoDB, my server waits forever and doesn't receive any responses from AWS.
Here is my C#:
<%@ WebHandler Language="C#" Class="CheckaraRequestHandler" %>

using System;
using System.Web;
using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;
using Amazon.DynamoDBv2.DocumentModel;
using Amazon;
using Amazon.Runtime;
using Amazon.S3;
using Amazon.S3.Model;
using System.IO;
using System.Threading.Tasks;

public class CheckaraRequestHandler : IHttpHandler
{
    private const string bucketName = "MY_BUCKET_NAME";
    private static readonly RegionEndpoint bucketRegion = RegionEndpoint.USEast1;
    public static IAmazonS3 client = new AmazonS3Client("MY_ACCESS_KEY", "MY_SECRET_KEY", RegionEndpoint.USEast1);

    public void ProcessRequest(HttpContext context)
    {
        if (context.Request.HttpMethod.ToString() == "GET")
        {
            string userID = context.Request.QueryString["User"];
            string Action = context.Request.QueryString["Action"];
            if (userID == null)
            {
                context.Response.ContentType = "text/plain";
                context.Response.Write("TRY AGAIN!");
                return;
            }
            if (Action == "GetPhoto")
            {
                ReadObjectDataAsync(userID).Wait();
            }
            var client = new AmazonDynamoDBClient("MY_ACCESS_KEY", "MY_SECRET_KEY", RegionEndpoint.USEast1);
            Console.WriteLine("Getting list of tables");
            var table = Table.LoadTable(client, "TABLE_NAME");
            var item = table.GetItem(userID);
            if (item != null)
            {
                context.Response.ContentType = "application/json";
                context.Response.Write(item.ToJson());
            }
            else
            {
                context.Response.ContentType = "text/plain";
                context.Response.Write("0");
            }
        }
    }

    public bool IsReusable
    {
        get
        {
            return false;
        }
    }

    static async Task ReadObjectDataAsync(string userID)
    {
        string responseBody = "";
        try
        {
            string formattedKey = userID + "/" + userID + "_PROFILEPHOTO.jpeg";
            //string formattedKey = userID + "_PROFILEPHOTO.jpeg";
            //formattedKey = formattedKey.Replace(":", "%3A");
            GetObjectRequest request = new GetObjectRequest
            {
                BucketName = bucketName,
                Key = formattedKey
            };
            using (GetObjectResponse response = await client.GetObjectAsync(request))
            using (Stream responseStream = response.ResponseStream)
            using (StreamReader reader = new StreamReader(responseStream))
            {
                string title = response.Metadata["x-amz-meta-title"]; // Assume you have "title" added as metadata on the object.
                string contentType = response.Headers["Content-Type"];
                Console.WriteLine("Object metadata, Title: {0}", title);
                Console.WriteLine("Content type: {0}", contentType);
                responseBody = reader.ReadToEnd(); // Now you process the response body.
            }
        }
        catch (AmazonS3Exception e)
        {
            Console.WriteLine("Error encountered ***. Message:'{0}' when writing an object", e.Message);
        }
        catch (Exception e)
        {
            Console.WriteLine("Unknown encountered on server. Message:'{0}' when writing an object", e.Message);
        }
    }
}
When I debug, this line waits forever:
using (GetObjectResponse response = await client.GetObjectAsync(request))
This is the same line that throws the credentials error when I don't provide them. Is there something that I am missing here?
Any help would be greatly appreciated.
I suspect that the AWS .NET SDK has some issues, specifically with the async call to S3.
The async call to DynamoDB works perfectly, but the S3 one hangs forever.
What fixed my problem was simply removing the async functionality (even though, in the AWS docs, the async call is supposed to be used).
Before:
using (GetObjectResponse response = await client.GetObjectAsync(request))
After:
using (GetObjectResponse response = myClient.GetObject(request))
Hopefully this helps anyone else encountering this issue.
I am successfully uploading multi-part files to AWS S3, but now I'm attempting to add an MD5 checksum to each part:
static void sendPart(existingBucketName, keyName, multipartRepsonse, partNum,
                     sendBuffer, partSize, vertx, partETags, s3, req, resultClosure)
{
    // Create request to upload a part.
    MessageDigest md = MessageDigest.getInstance("MD5")
    byte[] digest = md.digest(sendBuffer.bytes)
    println(digest.toString())

    InputStream inputStream = new ByteArrayInputStream(sendBuffer.bytes)

    UploadPartRequest uploadRequest = new UploadPartRequest()
            .withBucketName(existingBucketName).withKey(keyName)
            .withUploadId(multipartRepsonse.getUploadId()).withPartNumber(partNum)
            .withInputStream(inputStream)
            .withMD5Digest(Base64.getEncoder().encode(digest).toString())
            .withPartSize(partSize);

    // Upload part and add response to our list.
    vertx.executeBlocking({ future ->
        // Do the blocking operation in here
        // Imagine this was a call to a blocking API to get the result
        try {
            println("Sending chunk for ${keyName}")
            PartETag eTag = s3.uploadPart(uploadRequest).getPartETag()
            partETags.add(eTag);
            println("Etag: " + eTag.ETag)
            req.response().write("Sending Chunk\n")
        } catch(Exception e) {
        }
        def result = "success!"
        future.complete(result)
    }, resultClosure)
}
However I get the following error:
AmazonS3Exception: The XML you provided was not well-formed or did not
validate against our published schema (Service: Amazon S3; Status
Code: 400; Error Code: MalformedXML; Request ID: 91542E819781FDFC), S3
Extended Request ID:
yQs45H/ozn5+xlxV9lRgCQWwv6gQysT6A4ablq7/Epq06pUzy0qGvMc+YAkJjo/RsHk2dedH+pI=
What am I doing incorrectly?
Looks like I was converting the digest incorrectly: Base64.getEncoder().encode(digest) returns a byte[], so calling toString() on it yields the array's identity string (something like "[B@1a2b3c") rather than the Base64 text. Using encodeToString(digest) produces the actual Base64 string that S3 expects for the part's MD5 digest.
static void sendPart(existingBucketName, keyName, multipartRepsonse, partNum,
                     sendBuffer, partSize, vertx, partETags, s3, req, resultClosure)
{
    // Create request to upload a part.
    MessageDigest md = MessageDigest.getInstance("MD5")
    byte[] digest = md.digest(sendBuffer.bytes)

    InputStream inputStream = new ByteArrayInputStream(sendBuffer.bytes)

    UploadPartRequest uploadRequest = new UploadPartRequest()
            .withBucketName(existingBucketName).withKey(keyName)
            .withUploadId(multipartRepsonse.getUploadId()).withPartNumber(partNum)
            .withInputStream(inputStream)
            .withMD5Digest(Base64.getEncoder().encodeToString(digest))
            .withPartSize(partSize)

    // Upload part and add response to our list.
    vertx.executeBlocking({ future ->
        try {
            println("Sending chunk for ${keyName}")
            PartETag eTag = s3.uploadPart(uploadRequest).getPartETag()
            partETags.add(eTag);
            req.response().write("Sending Chunk\n")
        } catch(Exception e) {
        }
        def result = "success!"
        future.complete(result)
    }, resultClosure)
}