TaskCanceledException on large file upload to AWS S3 - amazon-web-services

I'm trying to upload a large video file (800 MB) to my S3 bucket, but it appears to time out. It works just fine for smaller files. My project is an ASP.NET Core 2.1 application.
This is the exception that is thrown:
An unhandled exception occurred while processing the request.
SocketException: The I/O operation has been aborted because of either a thread exit or an application request.
Unknown location
IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request.
System.Net.Sockets.Socket+AwaitableSocketAsyncEventArgs.ThrowException(SocketError error)
TaskCanceledException: The operation was canceled.
GamerPilot.Video.AWSS3Helper.UploadFileAsync(Stream stream, string key, S3CannedACL acl, bool useReducedRedundancy, bool throwOnError, CancellationToken cancellationToken) in AWSS3Helper.cs, line 770
My source code looks like this:
public async Task<IVideo> AddVideoAsync(int instructorId, int lectureId, string videoName, string filePath, CancellationToken cancellationToken = default(CancellationToken))
{
if (string.IsNullOrEmpty(filePath)) { throw new ArgumentNullException("filePath", "Video filepath is missing"); }
if (!File.Exists(filePath)) { throw new ArgumentNullException("filePath", "Video filepath does not exist"); }
//Test video file upload and db row insertion
using (var stream = File.OpenRead(filePath))
{
return await AddVideoAsync(instructorId, lectureId, videoName, stream, cancellationToken);
}
}
public async Task<IVideo> AddVideoAsync(int instructorId, int lectureId, string videoName, Stream videoFile, CancellationToken cancellationToken = default(CancellationToken))
{
var video = (Video) await GamerPilot.Video.Helper.Create(_awsS3AccessKey, _awsS3SecretKey, _awsS3BucketName, _awsS3Region)
.AddVideoAsync(instructorId, lectureId, videoName, videoFile, cancellationToken);
using (var db = new DbContext(_connectionString))
{
db.Videos.Add(video);
var count = await db.SaveChangesAsync();
}
return video;
}
public async Task<IVideo> AddVideoAsync(int instructorId, int lectureId, string videoName, Stream videoFile, CancellationToken cancellationToken = default(CancellationToken))
{
if (string.IsNullOrEmpty(videoName)) { throw new ArgumentNullException("videoName", "Video name cannot be empty or null"); }
if (videoFile == null) { throw new ArgumentNullException("video", "Video stream is missing"); }
var videoNameCleaned = videoName.Replace(" ", "-").ToLower().Replace(".mp4", "");
var videoKey = string.Join('/', "videos", instructorId, lectureId, videoNameCleaned + ".mp4");
using (var aws = new AWSS3Helper(_awsS3AccessKey, _awsS3SecretKey, _awsS3BucketName, _awsS3Region))
{
try
{
//THIS FAILS ------
await aws.UploadFileAsync(videoFile, videoKey, Amazon.S3.S3CannedACL.PublicRead, true, true, cancellationToken);
}
catch (Exception ex)
{
throw;
}
}
return new Video
{
InstructorId = instructorId,
LectureId = lectureId,
Name = videoName,
S3Key = videoKey,
S3Region = _awsS3Region.SystemName,
S3Bucket = _awsS3BucketName,
Created = DateTime.Now
};
}
How can I work around this?

There is no general constraint on S3 itself that would prevent you from uploading an 800 MB file. However, there are requirements for handling retries and timeouts when working with AWS. It is not clear from your question whether you are using Amazon's SDK (I can't find the origin of GamerPilot.Video.AWSS3Helper.UploadFileAsync), but the AWS SDK for .NET should handle this for you if you use it in accordance with the following:
Programming with the AWS SDK for .NET - Retries and Timeouts
Using the AWS SDK for .NET for Multipart Upload (High-Level API)
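If you are on the AWS SDK for .NET, the high-level API mentioned above (Amazon.S3.Transfer.TransferUtility) splits a large upload into parts and retries failed parts for you, and the client config lets you raise the request timeout. A minimal sketch, assuming the same access key/secret/bucket placeholders as in your code; the timeout, retry count, and part size are arbitrary values you would tune:

using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Amazon;
using Amazon.S3;
using Amazon.S3.Transfer;

public static class S3UploadSketch
{
    public static async Task UploadLargeFileAsync(Stream stream, string key, CancellationToken cancellationToken)
    {
        var config = new AmazonS3Config
        {
            RegionEndpoint = RegionEndpoint.USEast1, // your bucket's region
            Timeout = TimeSpan.FromMinutes(30),      // default request timeout is far shorter
            MaxErrorRetry = 4
        };

        using (var s3 = new AmazonS3Client("MY_ACCESS_KEY", "MY_SECRET_KEY", config))
        using (var transfer = new TransferUtility(s3))
        {
            var request = new TransferUtilityUploadRequest
            {
                InputStream = stream,
                BucketName = "MY_BUCKET_NAME",
                Key = key,
                CannedACL = S3CannedACL.PublicRead,
                StorageClass = S3StorageClass.ReducedRedundancy, // mirrors your useReducedRedundancy flag
                PartSize = 16 * 1024 * 1024 // 16 MB parts, uploaded and retried individually
            };

            await transfer.UploadAsync(request, cancellationToken);
        }
    }
}

TransferUtility switches to multipart upload automatically once the content is larger than a threshold, so an 800 MB file is sent as many smaller requests instead of one long-running one.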

Related

Should I close an S3Object?

What if I did not close the s3object in the finally clause?
Are there no resource leaks in the code below?
class S3ClientClass {
lazy val amazonS3Client = this.getS3Client()
private def getS3Client() = {
AmazonS3ClientBuilder
.standard()
.withRegion(Regions.AP_NORTHEAST_1)
.build()
}
def readFromS3(s3Bucket: String, filepath: String): String = {
var s3object: S3Object = null
try {
s3object = amazonS3Client.getObject(s3Bucket, filepath)
readFromS3(s3object)
}
finally {
if (s3object != null) {
s3object.close()
}
}
}
def readFromS3(obj: S3Object): String = {
val reader = new BufferedReader(new InputStreamReader(obj.getObjectContent))
reader.lines().collect(Collectors.joining())
}
}
If you didn't close the S3Object, it would lead to resource leaks.
The S3Object class should be closed to release the resources it holds, as it implements the Closeable interface; resources, in this case, would be network connection(s) to Amazon S3.
This is also explained in the AWS Developer Blog:
S3Object contains an S3ObjectInputStream that lets you stream down your data over the HTTP connection from Amazon S3. Since the HTTP connection is open and waiting, it’s important to read the stream quickly after calling getObject and to remember to close the stream so that the HTTP connection can be released properly.

wso2 identity server custom handler reading from properties file

public class UserRegistrationCustomEventHandler extends AbstractEventHandler {
JSONObject jsonObject = null;
private static final Log log = LogFactory.getLog(UserRegistrationCustomEventHandler.class);
@Override
public String getName() {
return "customClaimUpdate";
}
// handleEvent is the AbstractEventHandler callback that receives the subscribed events.
@Override
public void handleEvent(Event event) throws IdentityEventException {
if (IdentityEventConstants.Event.POST_SET_USER_CLAIMS.equals(event.getEventName())) {
String tenantDomain = (String) event.getEventProperties()
.get(IdentityEventConstants.EventProperty.TENANT_DOMAIN);
String userName = (String) event.getEventProperties().get(IdentityEventConstants.EventProperty.USER_NAME);
Map<String, Object> eventProperties = event.getEventProperties();
String eventName = event.getEventName();
UserStoreManager userStoreManager = (UserStoreManager) eventProperties.get(IdentityEventConstants.EventProperty.USER_STORE_MANAGER);
// String userStoreDomain = UserCoreUtil.getDomainName(userStoreManager.getRealmConfiguration());
@SuppressWarnings("unchecked")
Map<String, String> claimValues = (Map<String, String>) eventProperties.get(IdentityEventConstants.EventProperty
.USER_CLAIMS);
String emailId = claimValues.get("http://wso2.org/claims/emailaddress");
userName = "USERS/"+userName;
JSONObject json = new JSONObject();
json.put("userName",userName );
json.put("emailId",emailId );
log.info("JSON:::::::"+json);
// Sample API
//String apiValue = "http://192.168.1.X:8080/SomeService/user/updateUserEmail?email=sujith@gmail.com&userName=USERS/sujith";
try {
URL url = new URL(cityAppUrl) ;
HttpURLConnection con = (HttpURLConnection) url.openConnection();
con.setConnectTimeout(5000);
con.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
con.setDoOutput(true);
con.setDoInput(true);
con.setRequestMethod("POST");
log.info("CONN:::::::::::::"+con);
OutputStream os = con.getOutputStream();
os.write(cityAppUrl.toString().getBytes("UTF-8"));
os.close();
InputStream in = new BufferedInputStream(con.getInputStream());
String result = org.apache.commons.io.IOUtils.toString(in, "UTF-8");
jsonObject = new JSONObject(result);
log.info("JSON OBJECT:::::::::"+jsonObject);
}
catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
@Override
public void init(InitConfig configuration) throws IdentityRuntimeException {
super.init(configuration);
}
@Override
public int getPriority(MessageContext messageContext) {
return 250;
}
}
I'm using WSO2 Identity Server 5.10.0 and have to push the updated claim value to an API, so I'm using a custom handler and have subscribed to POST_SET_USER_CLAIMS. I have to read the API value from the deployment.toml file in the Java code of the custom handler.
I can fetch the updated claim value in the logs, but I'm not able to get the API value. Can anyone help me here to read the value from the deployment file?
Since the API path is required inside your custom event handler, let's define the API path value as one of the properties of the event handler.
Add the deployment.toml config as follows.
[[event_handler]]
name= "UserRegistrationCustomEventHandler"
subscriptions =["POST_SET_USER_CLAIMS"]
properties.apiPath = "http://192.168.1.X:8080/SomeService/user/updateUserEmail"
Once you restart the server, the identity-event.properties file is populated with the given configs.
Your custom event handler's Java code then needs to read the config from the identity-event.properties file. The file is read at server startup, and every config is loaded into memory.
By adding this to your Java code, you can read the configured value of the property:
configs.getModuleProperties().getProperty("UserRegistrationCustomEventHandler.apiPath")
NOTE: the property name needs to be referenced as <event_handler_name>.<property_name>
Here is a reference to such an event handler's property-loading code snippet: https://github.com/wso2-extensions/identity-governance/blob/68e3f2d5e246b6a75f48e314ee1019230c662b55/components/org.wso2.carbon.identity.password.policy/src/main/java/org/wso2/carbon/identity/password/policy/handler/PasswordPolicyValidationHandler.java#L128-L133

AWS amplify - datastore and file upload offline and sync whenever internet gets

I am using AWS Amplify with Flutter for my application. One of the scenarios is that I need to create an object along with an image/video and save it in the cloud even if there is no internet.
I am using S3 to store the image/video, taking the key from the stored response whenever we have an active internet connection, and saving that stored file's key to the DataStore object before saving the DataStore object. This is fine when we have an internet connection.
But if there is no internet, I still want to do the same without making the user wait until an internet connection is available.
The DataStore only syncs sometimes when the internet connection comes back (not every time).
How can we achieve this using Flutter?
Thanks.
To achieve online/offline support with automated synchronization for your media files, you have to implement it yourself.
To do so, you first have to adapt the read and write methods for your media files so they do not only read/write from/to S3 but also from/to the local filesystem. For simplicity, I recommend using a similar structure for your file tree. Also take into consideration that you need some kind of strategy for resolving merge conflicts (e.g. the online version changed after the offline version).
In addition to that, you have to write a sync method that checks for all files that require sync (upload and download). So that your app does not get stuck, it should add all required sync operations to a queue that is handled in the background. The method may be triggered, for example, on each app start, at a fixed interval, or on each wake lock.
I faced the same problem some months ago in an open source project. We solved it by creating our own class SyncedFile plus some repository and bloc logic to handle the sync. For the merge logic we used the last-modified value of the files.
class SyncedFile {
SyncedFile(this.path) {
key = ValueKey(DateTime.now().toIso8601String());
}
String path;
late Key key;
Future<File?> file() async {
File localCacheFile = await getCachePath();
bool cached = await localCacheFile.exists();
if (!cached) {
print("file not in cache, loading it");
await StorageRepository.downloadFile(localCacheFile, path);
} else {
print("found in cache: $path");
}
cached = await localCacheFile.exists();
if (!cached) {
return null;
}
key = ValueKey(DateTime.now().toIso8601String());
print("returning file: $key");
return localCacheFile;
}
Future<File> getCachePath() async {
Directory appDocDir = await getApplicationDocumentsDirectory();
List<String> pathParts = path.split("/");
pathParts.removeAt(pathParts.length - 1);
String toCreateDir = "";
for (int i = 0; i < pathParts.length; i++) {
toCreateDir += pathParts[i];
if (i != pathParts.length - 1) {
toCreateDir += "/";
}
}
await Directory(appDocDir.path + "/" + toCreateDir).create(recursive: true);
File localCacheFile = File('${appDocDir.path}/$path');
key = ValueKey(DateTime.now().toIso8601String());
print("returning cache path: $key");
return localCacheFile;
}
Future<void> update(String utf8String) async {
File localCacheFile = await getCachePath();
await localCacheFile.writeAsString(utf8String, flush: true);
StorageRepository.uploadFile(localCacheFile, path);
key = ValueKey(DateTime.now().toIso8601String());
}
Future<File?> updateAsBytes(File file) async {
var bytes = await file.readAsBytes();
File localCacheFile = await getCachePath();
await localCacheFile.writeAsBytes(bytes, flush: true);
key = ValueKey(DateTime.now().toIso8601String());
StorageRepository.uploadFile(localCacheFile, path);
print("pic update finished: $key");
return await getCachePath();
}
Future<File?> updateAsPic(XFile xfile) async {
var bytes = await xfile.readAsBytes();
File localCacheFile = await getCachePath();
await localCacheFile.writeAsBytes(bytes, flush: true);
key = ValueKey(DateTime.now().toIso8601String());
print("pic update finished: $key");
StorageRepository.uploadFile(localCacheFile, path);
return await getCachePath();
}
Future<File?> updateAsAudio(File file) async {
File localCacheFile = await getCachePath();
await localCacheFile.writeAsBytes(file.readAsBytesSync(), flush: true);
await StorageRepository.uploadFile(localCacheFile, path);
key = ValueKey(DateTime.now().toIso8601String());
return await getCachePath();
}
Future<void> delete() async {
File localCacheFile = await getCachePath();
await localCacheFile.delete();
await StorageRepository.removeFile(path);
key = ValueKey(DateTime.now().toIso8601String());
}
Future<bool> sync(SyncBloc syncBloc) async {
try {
syncBloc.add(StartLoadingFileEvent());
File localCacheFile = await getCachePath();
bool cached = await localCacheFile.exists();
if (!cached) {
await StorageRepository.downloadFile(localCacheFile, path,
checkConnection: false);
} else {
ListResult listResult = await Amplify.Storage.list(path: path);
if (listResult.items.isEmpty) {
StorageRepository.uploadFile(await getCachePath(), path,
checkConnection: false);
} else {
DateTime? lastModifiedLocal;
try {
lastModifiedLocal = await localCacheFile.lastModified();
} catch (e) {}
DateTime? lastModifiedOnline = listResult.items.first.lastModified;
if (lastModifiedOnline == null) {
await StorageRepository.uploadFile(localCacheFile, path,
checkConnection: false);
} else if (lastModifiedLocal == null) {
await StorageRepository.downloadFile(localCacheFile, path,
checkConnection: false);
} else if (lastModifiedLocal.isAfter(lastModifiedOnline)) {
await StorageRepository.uploadFile(localCacheFile, path,
checkConnection: false);
} else if (lastModifiedLocal.isBefore(lastModifiedOnline)) {
await StorageRepository.downloadFile(localCacheFile, path,
checkConnection: false);
}
}
}
syncBloc.add(LoadedFileEvent());
return true;
} catch (e) {
return false;
}
}
}

Problems with AWS SDK .NET

I am trying to retrieve images from my bucket to send to my mobile apps. I currently have the devices accessing AWS directly; however, I am adding a layer of security and having my apps (iOS and Android) make requests to my server, which will then respond with DynamoDB and S3 data.
I am trying to follow the documentation and code samples provided by AWS for .NET, and they worked seamlessly for DynamoDB, but I am running into problems with S3.
S3 .NET Documentation
My problem is that if I provide no credentials, I get the error:
Failed to retrieve credentials from EC2 Instance Metadata Service
This is expected as I have IAM roles set up and only want my apps and this server (in the future, only this server) to have access to the buckets.
But when I provide the credentials, the same way I provided credentials for DynamoDB, my server waits forever and doesn't receive any responses from AWS.
Here is my C#:
<%@ WebHandler Language="C#" Class="CheckaraRequestHandler" %>
using System;
using System.Web;
using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;
using Amazon.DynamoDBv2.DocumentModel;
using Amazon;
using Amazon.Runtime;
using Amazon.S3;
using Amazon.S3.Model;
using System.IO;
using System.Threading.Tasks;
public class CheckaraRequestHandler : IHttpHandler
{
private const string bucketName = "MY_BUCKET_NAME";
private static readonly RegionEndpoint bucketRegion = RegionEndpoint.USEast1;
public static IAmazonS3 client = new AmazonS3Client("MY_ACCESS_KEY", "MY_SECRET_KEY", RegionEndpoint.USEast1);
public void ProcessRequest(HttpContext context)
{
if (context.Request.HttpMethod.ToString() == "GET")
{
string userID = context.Request.QueryString["User"];
string Action = context.Request.QueryString["Action"];
if (userID == null)
{
context.Response.ContentType = "text/plain";
context.Response.Write("TRY AGAIN!");
return;
}
if (Action == "GetPhoto")
{
ReadObjectDataAsync(userID).Wait();
}
var client = new AmazonDynamoDBClient("MY_ACCESS_KEY", "MY_SECRET_KEY", RegionEndpoint.USEast1);
Console.WriteLine("Getting list of tables");
var table = Table.LoadTable(client, "TABLE_NAME");
var item = table.GetItem(userID);
if (item != null)
{
context.Response.ContentType = "application/json";
context.Response.Write(item.ToJson());
}
else
{
context.Response.ContentType = "text/plain";
context.Response.Write("0");
}
}
}
public bool IsReusable
{
get
{
return false;
}
}
static async Task ReadObjectDataAsync(string userID)
{
string responseBody = "";
try
{
string formattedKey = userID + "/" + userID + "_PROFILEPHOTO.jpeg";
//string formattedKey = userID + "_PROFILEPHOTO.jpeg";
//formattedKey = formattedKey.Replace(":", "%3A");
GetObjectRequest request = new GetObjectRequest
{
BucketName = bucketName,
Key = formattedKey
};
using (GetObjectResponse response = await client.GetObjectAsync(request))
using (Stream responseStream = response.ResponseStream)
using (StreamReader reader = new StreamReader(responseStream))
{
string title = response.Metadata["x-amz-meta-title"]; // Assume you have "title" as metadata added to the object.
string contentType = response.Headers["Content-Type"];
Console.WriteLine("Object metadata, Title: {0}", title);
Console.WriteLine("Content type: {0}", contentType);
responseBody = reader.ReadToEnd(); // Now you process the response body.
}
}
catch (AmazonS3Exception e)
{
Console.WriteLine("Error encountered ***. Message:'{0}' when writing an object", e.Message);
}
catch (Exception e)
{
Console.WriteLine("Unknown encountered on server. Message:'{0}' when writing an object", e.Message);
}
}
}
When I debug, this line waits forever:
using (GetObjectResponse response = await client.GetObjectAsync(request))
This is the same line that throws the credentials error when I don't provide them. Is there something that I am missing here?
Any help would be greatly appreciated.
I suspect that the AWS .NET SDK has some issues specifically with the async call to S3.
The async call to DynamoDB works perfectly, but the S3 one hangs forever.
What fixed my problem was simply removing the async functionality (even though, in the AWS docs, the async call is supposed to be used).
Before:
using (GetObjectResponse response = await client.GetObjectAsync(request))
After:
using (GetObjectResponse response = client.GetObject(request))
Hopefully this helps anyone else encountering this issue.
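For reference, a minimal sketch of the full synchronous variant of the helper above. This assumes the .NET Framework build of the SDK (the synchronous GetObject method is not available in the .NET Core/Standard builds) and reuses the static client, bucketName, and key format from the question:

static string ReadObjectData(string userID)
{
    string responseBody = "";
    try
    {
        string formattedKey = userID + "/" + userID + "_PROFILEPHOTO.jpeg";
        GetObjectRequest request = new GetObjectRequest
        {
            BucketName = bucketName,
            Key = formattedKey
        };

        // Synchronous call: the request completes (or fails) before the method returns,
        // so there is no pending task for ProcessRequest to block on.
        using (GetObjectResponse response = client.GetObject(request))
        using (Stream responseStream = response.ResponseStream)
        using (StreamReader reader = new StreamReader(responseStream))
        {
            responseBody = reader.ReadToEnd();
        }
    }
    catch (AmazonS3Exception e)
    {
        Console.WriteLine("Error encountered ***. Message:'{0}' when reading the object", e.Message);
    }
    return responseBody;
}

The call site in ProcessRequest then becomes string body = ReadObjectData(userID); instead of ReadObjectDataAsync(userID).Wait();.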

How to generate md5 checksum for AWS S3 multi-part upload?

I am successfully uploading multi-part files to AWS S3, but now I'm attempting to add an MD5 checksum to each part:
static void sendPart(existingBucketName, keyName, multipartRepsonse, partNum,
sendBuffer, partSize, vertx, partETags, s3, req, resultClosure)
{
// Create request to upload a part.
MessageDigest md = MessageDigest.getInstance("MD5")
byte[] digest = md.digest(sendBuffer.bytes)
println(digest.toString())
InputStream inputStream = new ByteArrayInputStream(sendBuffer.bytes)
UploadPartRequest uploadRequest = new UploadPartRequest()
.withBucketName(existingBucketName).withKey(keyName)
.withUploadId(multipartRepsonse.getUploadId()).withPartNumber(partNum)
.withInputStream(inputStream)
.withMD5Digest(Base64.getEncoder().encode(digest).toString())
.withPartSize(partSize);
// Upload part and add response to our list.
vertx.executeBlocking({ future ->
// Do the blocking operation in here
// Imagine this was a call to a blocking API to get the result
try {
println("Sending chunk for ${keyName}")
PartETag eTag = s3.uploadPart(uploadRequest).getPartETag()
partETags.add(eTag);
println("Etag: " + eTag.ETag)
req.response().write("Sending Chunk\n")
} catch(Exception e) {
}
def result = "success!"
future.complete(result)
}, resultClosure)
}
However I get the following error:
AmazonS3Exception: The XML you provided was not well-formed or did not
validate against our published schema (Service: Amazon S3; Status
Code: 400; Error Code: MalformedXML; Request ID: 91542E819781FDFC), S3
Extended Request ID:
yQs45H/ozn5+xlxV9lRgCQWwv6gQysT6A4ablq7/Epq06pUzy0qGvMc+YAkJjo/RsHk2dedH+pI=
What am I doing incorrectly?
Looks like I was converting the digest incorrectly. Base64.getEncoder().encode(digest) returns a byte[], and calling toString() on that array does not give the Base64-encoded text; encodeToString(digest) produces the correct string:
static void sendPart(existingBucketName, keyName, multipartRepsonse, partNum,
sendBuffer, partSize, vertx, partETags, s3, req, resultClosure)
{
// Create request to upload a part.
MessageDigest md = MessageDigest.getInstance("MD5")
byte[] digest = md.digest(sendBuffer.bytes)
InputStream inputStream = new ByteArrayInputStream(sendBuffer.bytes)
UploadPartRequest uploadRequest = new UploadPartRequest()
.withBucketName(existingBucketName).withKey(keyName)
.withUploadId(multipartRepsonse.getUploadId()).withPartNumber(partNum)
.withInputStream(inputStream)
.withMD5Digest(Base64.getEncoder().encodeToString(digest))
.withPartSize(partSize)
// Upload part and add response to our list.
vertx.executeBlocking({ future ->
try {
println("Sending chunk for ${keyName}")
PartETag eTag = s3.uploadPart(uploadRequest).getPartETag()
partETags.add(eTag);
req.response().write("Sending Chunk\n")
} catch(Exception e) {
}
def result = "success!"
future.complete(result)
}, resultClosure)
}