Service to upload images and videos to (S3, CloudFront etc) - amazon-web-services

I am looking for a service to upload videos and images from my mobile applications (frontend). I have heard about Amazon S3 and CloudFront. I am looking for a service that will store them, and will also be able to check if they meet certain criteria (for example, maximum file size of 3MB per picture), and return an error to the client if the file doesn't meet the criteria. Does Amazon S3 or CloudFront provide this? If not, is there any other recommended service for this?

You could use the AWS SDK. Here is an example using the Java SDK (Amazon provides SDKs for several languages):
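/**
* Stores the given file in S3 and returns the URL of the stored object
* (or null if the upload fails).
* @param resource the file to upload
* @param bucketName the target bucket
* @param username the user name, used as the key prefix
* @return the URL of the stored object
*/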
public String storeProfileImage(File resource, String bucketName, String username) {
String resourceUrl = null;
if (!resource.exists()) {
throw new IllegalArgumentException("The file " + resource.getAbsolutePath() + " doesn't exist");
}
long lengthInBytes = resource.length();
//For demo purposes (3 MB limit). You should use a configurable property for the max size
if (lengthInBytes > (3 * 1024 * 1024)) {
//Your error handling here
}
AccessControlList acl = new AccessControlList();
acl.grantPermission(GroupGrantee.AllUsers, Permission.Read);
String key = username + "/profilePicture." + FilenameUtils.getExtension(resource.getName());
try {
s3Client.putObject(new PutObjectRequest(bucketName, key, resource).withAccessControlList(acl));
resourceUrl = s3Client.getResourceUrl(bucketName, key);
} catch (AmazonClientException ace) {
LOG.error("A client exception occurred while trying to store the profile" +
" image {} on S3. The profile image won't be stored", resource.getAbsolutePath(), ace);
}
return resourceUrl;
}
You can also perform other operations, e.g. check whether the bucket exists before storing the image:
/**
* Returns the root URL where the bucket name is located.
* <p>Please note that the URL does not contain the bucket name</p>
* @param bucketName The bucket name
* @return the root URL where the bucket name is located.
*/
public String ensureBucketExists(String bucketName) {
String bucketUrl = null;
try {
if (!s3Client.doesBucketExist(bucketName)) {
LOG.warn("Bucket {} doesn't exist. Creating one", bucketName);
s3Client.createBucket(bucketName);
LOG.info("Created bucket: {}", bucketName);
}
bucketUrl = s3Client.getResourceUrl(bucketName, null) + bucketName;
} catch (AmazonClientException ace) {
LOG.error("An error occurred while connecting to S3. Will not execute action" +
" for bucket: {}", bucketName, ace);
}
return bucketUrl;
}
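The size check inside storeProfileImage can also be pulled out into a small helper, so the limit is configurable and the caller gets a clear error to send back to the client. This is only a minimal sketch: the class name UploadSizeValidator and its methods are hypothetical and not part of the AWS SDK.

```java
import java.io.File;

/** Hypothetical helper (not part of the AWS SDK): rejects files above a configurable size limit. */
final class UploadSizeValidator {

    /** 3 MB expressed in bytes; in real code this would come from configuration. */
    static final long DEFAULT_MAX_BYTES = 3L * 1024 * 1024;

    private final long maxBytes;

    UploadSizeValidator(long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        this.maxBytes = maxBytes;
    }

    /** Returns true when a payload of the given size is within the limit. */
    boolean isAllowed(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= maxBytes;
    }

    /** Throws IllegalArgumentException with a readable message when the file is too large. */
    void validate(File file) {
        long size = file.length();
        if (!isAllowed(size)) {
            throw new IllegalArgumentException("File " + file.getName() + " is " + size
                    + " bytes; the maximum allowed is " + maxBytes + " bytes");
        }
    }
}
```

You would call validate(resource) before putObject and translate the exception into whatever error response your API returns to the mobile client.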

Related

fetch batch translate job details using SDK

I was trying to find the LanguagePair and Operation associated with a given AWS Translate job using the Java SDK.
Using the AWS web console, I created a couple of batch jobs to translate a few English sentences to French. In CloudWatch, I could see the metric dimensions as
LanguagePair: en-fr
Operation: TranslateText
Can I retrieve the same information (LanguagePair and Operation) for a given job
using the TranslateAsyncClient.describeTextTranslationJob(...) method?
You can use the TranslateClient to describe a translation job, given the job ID as input. This example uses the synchronous TranslateClient; however, you can use the async version as well.
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.translate.TranslateClient;
import software.amazon.awssdk.services.translate.model.DescribeTextTranslationJobRequest;
import software.amazon.awssdk.services.translate.model.DescribeTextTranslationJobResponse;
import software.amazon.awssdk.services.translate.model.TranslateException;
/**
* To run this Java V2 code example, ensure that you have set up your development environment, including your credentials.
*
* For information, see this documentation topic:
*
* https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
*/
public class DescribeTextTranslationJob {
public static void main(String[] args) {
final String USAGE = "\n" +
"Usage:\n" +
" DescribeTextTranslationJob <id> \n\n" +
"Where:\n" +
" id - a translation job ID value. You can obtain this value from the BatchTranslation example.\n";
if (args.length != 1) {
System.out.println(USAGE);
System.exit(1);
}
String id = args[0];
Region region = Region.US_WEST_2;
TranslateClient translateClient = TranslateClient.builder()
.region(region)
.build();
describeTextTranslationJob(translateClient, id);
translateClient.close();
}
// snippet-start:[translate.java2._describe_jobs.main]
public static void describeTextTranslationJob(TranslateClient translateClient, String id) {
try {
DescribeTextTranslationJobRequest textTranslationJobRequest = DescribeTextTranslationJobRequest.builder()
.jobId(id)
.build();
DescribeTextTranslationJobResponse jobResponse = translateClient.describeTextTranslationJob(textTranslationJobRequest);
System.out.println("The job status is "+jobResponse.textTranslationJobProperties().jobStatus());
System.out.println("The source language is "+jobResponse.textTranslationJobProperties().sourceLanguageCode());
System.out.println("The target language is "+jobResponse.textTranslationJobProperties().targetLanguageCodes());
} catch (TranslateException e) {
System.err.println(e.getMessage());
System.exit(1);
}
// snippet-end:[translate.java2._describe_jobs.main]
}
}
To see all the data you can get back using this code, see this JavaDoc - https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/translate/model/TextTranslationJobProperties.html

AmazonClientException: Data read has a different length than the expected

When I try to upload content to an Amazon S3 bucket, I get an AmazonClientException: Data read has a different length than the expected.
Here is my code.
public Object uploadFile(MultipartFile file) {
String fileName = System.currentTimeMillis() + "_" + file.getOriginalFilename();
log.info("uploadFile-> starting file upload " + fileName);
Path path = Paths.get(file.getOriginalFilename());
File fileObj = new File(file.getOriginalFilename());
try (FileOutputStream os = new FileOutputStream(fileObj)) {
os.write(file.getBytes());
os.close();
String uploadFilePath = bucketName + "/" + uploadPath;
s3Client.putObject(new PutObjectRequest(uploadFilePath, fileName, fileObj));
Files.delete(path);
} catch (IOException ex) {
log.error("error [" + ex.getMessage() + "] occurred while uploading [" + fileName + "] ");
}
log.info("uploadFile-> file uploaded process completed at: " + LocalDateTime.now() + " for - " + fileName);
return "File uploaded : " + fileName;
}
Amazon recommends using the AWS SDK for Java V2 (including its S3 API) rather than V1.
The AWS SDK for Java 2.x is a major rewrite of the version 1.x code base. It’s built on top of Java 8+ and adds several frequently requested features. These include support for non-blocking I/O and the ability to plug in a different HTTP implementation at run time.
To upload content to an Amazon S3 bucket, use this V2 code.
public static String putS3Object(S3Client s3,
String bucketName,
String objectKey,
String objectPath) {
try {
Map<String, String> metadata = new HashMap<>();
metadata.put("myVal", "test");
PutObjectRequest putOb = PutObjectRequest.builder()
.bucket(bucketName)
.key(objectKey)
.metadata(metadata)
.build();
PutObjectResponse response = s3.putObject(putOb,
RequestBody.fromBytes(getObjectFile(objectPath)));
return response.eTag();
} catch (S3Exception e) {
System.err.println(e.getMessage());
System.exit(1);
}
return "";
}
Full example here.
If you are not familiar with V2, please refer to this doc topic:
https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/get-started.html
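The getObjectFile helper called in the V2 example is not shown in this answer. A minimal sketch of what it could look like, assuming it simply reads the whole file into a byte array (the wrapper class name ObjectFileReader is made up for illustration):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical holder for the helper; the original answer does not show this code. */
final class ObjectFileReader {

    /**
     * Reads the entire file at filePath into memory.
     * Fine for small uploads; large files are better streamed or passed by path.
     */
    static byte[] getObjectFile(String filePath) {
        try {
            return Files.readAllBytes(Paths.get(filePath));
        } catch (IOException e) {
            // Wrap the checked exception so callers are not forced to declare it.
            throw new UncheckedIOException("Could not read " + filePath, e);
        }
    }
}
```

Because the byte array has a known length, RequestBody.fromBytes sends exactly the bytes that were read.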
AWS Java SDK 1.x:
s3.putObject(bucketName, key, new File(filePath));
AWS Java SDK 2.x:
PutObjectRequest putObjectRequest = PutObjectRequest
.builder()
.bucket(bucketName)
.key(key)
.build();
s3Client.putObject(putObjectRequest, Paths.get(filePath));

Pentaho ETL Transformation Using Lambda

Is it possible to run Pentaho ETL jobs/transformations using AWS Lambda functions?
I have Pentaho ETL jobs running on a schedule on a Windows server, and we are planning to migrate to AWS. I am considering Lambda functions and would like to understand whether it is possible to schedule the Pentaho ETL jobs using AWS Lambda.
Here is the snippet of code that I was able to run successfully in an AWS Lambda function.
handleRequest: This function is the entry point called by AWS Lambda.
public Integer handleRequest(String input, Context context) {
parseInput(input);
return executeKtr(transName);
}
parseInput: This function parses the string passed to the Lambda function and extracts the KTR name and its parameters with their values. The format of the input is "ktrfilename param1=value1 param2=value2"
public static void parseInput(String input) {
String[] tokens = input.split(" ");
transName = tokens[0].replace(".ktr", "") + ".ktr";
for (int i=1; i<tokens.length; i++) {
params.add(tokens[i]);
}
}
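For comparison, here is a side-effect-free sketch of the same parsing that returns the result instead of writing to static fields, and that rejects tokens not in name=value form. The class name KtrInput is hypothetical, and unlike the code above it only appends ".ktr" when the name does not already end with it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical value object holding the parsed Lambda input. */
final class KtrInput {
    final String transName;
    final Map<String, String> params;

    private KtrInput(String transName, Map<String, String> params) {
        this.transName = transName;
        this.params = params;
    }

    /** Parses "ktrfilename param1=value1 param2=value2" into a name and a parameter map. */
    static KtrInput parse(String input) {
        String[] tokens = input.trim().split("\\s+");
        if (tokens[0].isEmpty()) {
            throw new IllegalArgumentException("Input must start with a KTR file name");
        }
        // Append the extension only when it is missing.
        String name = tokens[0].endsWith(".ktr") ? tokens[0] : tokens[0] + ".ktr";
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            // Limit 2 keeps any '=' characters inside the value intact.
            String[] kv = tokens[i].split("=", 2);
            if (kv.length != 2 || kv[0].isEmpty()) {
                throw new IllegalArgumentException("Expected name=value but got: " + tokens[i]);
            }
            params.put(kv[0], kv[1]);
        }
        return new KtrInput(name, params);
    }
}
```

Validating the tokens up front avoids an ArrayIndexOutOfBoundsException later, when a parameter without "=" would be split in parameterSetting.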
Executing KTR: I am using a Git repo to store all my KTR files, and the KTR to execute is selected by the name passed as a parameter.
public static Integer executeKtr(String ktrName) {
try {
System.out.println("Present Project Directory : " + System.getProperty("user.dir"));
String transName = ktrName.replace(".ktr", "") + ".ktr";
String gitURI = awsSSM.getParaValue("kattle-trans-git-url");
String repoLocalPath = clonePDIrepo.cloneRepo(gitURI);
String path = new File(repoLocalPath + "/" + transName).getAbsolutePath();
File ktrFile = new File(path);
System.out.println("KTR Path: " + path);
try {
/**
* IMPORTANT NOTE FOR LAMBDA: THE .kettle DIRECTORY MUST BE CREATED, OTHERWISE
* THE CODE WILL FAIL IN THE LAMBDA FUNCTION WITH AN ERROR THAT IT CAN'T CREATE
* THE .kettle/kettle.properties FILE.
*
* ALSO SET AN ENVIRONMENT VARIABLE ON THE LAMBDA FUNCTION:
* KETTLE_HOME=/tmp/.kettle
*/
Files.createDirectories(Paths.get("/tmp/.kettle"));
} catch (IOException e) {
e.printStackTrace();
throw new RuntimeException("Error Creating /tmp/.kettle directory");
}
if (ktrFile.exists()) {
KettleEnvironment.init();
TransMeta metaData = new TransMeta(path);
Trans trans = new Trans(metaData);
// SETTING PARAMETERS
trans = parameterSetting(trans);
trans.execute( null );
trans.waitUntilFinished();
if (trans.getErrors() > 0) {
System.out.print("Error Executing transformation");
throw new RuntimeException("There are errors in running transformations");
} else {
System.out.print("Successfully Executed Transformation");
return 1;
}
} else {
System.out.print("KTR File:" + path + " not found in repo");
throw new RuntimeException("KTR File:" + path + " not found in repo");
}
} catch (KettleException e) {
e.printStackTrace();
throw new RuntimeException(e.getMessage());
}
}
parameterSetting: If the KTR accepts a parameter and a value for it is passed when calling the AWS Lambda function, the value is set by the parameterSetting function.
public static Trans parameterSetting(Trans trans) {
String[] transParams = trans.listParameters();
for (String param : transParams) {
for (String p: params) {
String name = p.split("=")[0];
String val = p.split("=")[1];
if (name.trim().equals(param.trim())) {
try {
System.out.println("Setting Parameter:"+ name + "=" + val);
trans.setParameterValue(name, val);
} catch (UnknownParamException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
}
trans.activateParameters();
return trans;
}
CloneGitRepo:
public class clonePDIrepo {
/**
* Clones the given repo to local folder
*
* @param pathWithPwd Git repo URL with the access token included in the URL, e.g.
* https://token_name:token_value@github.com/ktr-git-repo.git
* @return the local repository path as a String
*/
public static String cloneRepo(String pathWithPwd) {
try {
/**
* CREATING A TEMP DIR TO AVOID A FOLDER-EXISTS ERROR. THIS TEMP DIRECTORY CAN LATER
* BE USED TO GET THE ABSOLUTE PATH FOR FILES IN THE DIRECTORY
*/
File pdiLocalPath = Files.createTempDirectory("repodir").toFile();
Git git = Git.cloneRepository().setURI(pathWithPwd).setDirectory(pdiLocalPath).call();
System.out.println("Git repository cloned successfully");
System.out.println("Local Repository Path:" + pdiLocalPath.getAbsolutePath());
return pdiLocalPath.getAbsolutePath();
} catch (Exception e) {
e.printStackTrace();
return null;
}
}
}
awsSSM.getParaValue: Returns the string value of the SSM parameter with the given name.
public static String getParaValue(String paraName) {
try {
Region region = Region.US_EAST_1;
SsmClient ssmClient = SsmClient.builder()
.region(region)
.build();
GetParameterRequest parameterRequest = GetParameterRequest.builder()
.name(paraName)
.withDecryption(true)
.build();
GetParameterResponse parameterResponse = ssmClient.getParameter(parameterRequest);
System.out.println(paraName + " value retrieved from AWS SSM");
ssmClient.close();
return parameterResponse.parameter().value();
} catch (SsmException e) {
System.err.println(e.getMessage());
return null;
}
}
Assumptions:
The Git repo is created with the KTR files in the root of the repo
The Git repo URL exists in AWS SSM with valid tokens to clone the repo
The input string contains the name of the KTR file
The environment variable KETTLE_HOME=/tmp/.kettle is configured on the Lambda function
The Lambda function has the necessary permissions for SSM and S3 VPC Network
Proper security group rules are set up to allow the network access required by the KTR file
I am planning to upload the complete code to Git. I will update this post with the URL of the repository.

AWS QuickSight programmatic access

I have been recently involved in a project where I have to leverage the QuickSight APIs and update a dashboard programmatically. I can perform all the other actions but I am unable to update the dashboard from a template. I have tried a couple of different ideas, but all in vain.
Has anyone already worked with the UpdateDashboard API, or can anyone point me to detailed documentation that would help me understand whether I am missing something?
Thanks.
I got this to work using the AWS QuickSight Java V2 API. To make this work, you need to follow the quick start instructions here:
https://docs.aws.amazon.com/quicksight/latest/user/getting-started.html
You need to get these values:
account - your account number
dashboardId - the dashboard id value
dataSetArn - the data set ARN value
analysisArn - the analysis Arn value
Once you go through the above topic, you will have all of these resources and be ready to call UpdateDashboard. Here is the Java example that updates a dashboard.
package com.example.quicksight;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.quicksight.QuickSightClient;
import software.amazon.awssdk.services.quicksight.model.*;
/*
Before running this code example, follow the Getting Started with Data Analysis in Amazon QuickSight located here:
https://docs.aws.amazon.com/quicksight/latest/user/getting-started.html
This code example uses resources that you created by following that topic such as the DataSet Arn value.
*/
public class UpdateDashboard {
public static void main(String[] args) {
final String USAGE = "\n" +
"Usage: UpdateDashboard <account> <dashboardId> <dataSetArn> <analysisArn>\n\n" +
"Where:\n" +
" account - the account to use.\n\n" +
" dashboardId - the dashboard id value to use.\n\n" +
" dataSetArn - the ARN of the dataset.\n\n" +
" analysisArn - the ARN of an existing analysis";
String account = "<account id>";
String dashboardId = "<dashboardId>";
String dataSetArn = "<dataSetArn>";
String analysisArn = "<Analysis Arn>";
QuickSightClient qsClient = QuickSightClient.builder()
.region(Region.US_EAST_1)
.build();
try {
DataSetReference dataSetReference = DataSetReference.builder()
.dataSetArn(dataSetArn)
.dataSetPlaceholder("Dataset placeholder2")
.build();
// Get a template ARN to use.
String arn = getTemplateARN(qsClient, account, dataSetArn, analysisArn);
DashboardSourceTemplate sourceTemplate = DashboardSourceTemplate.builder()
.dataSetReferences(dataSetReference)
.arn(arn)
.build();
DashboardSourceEntity sourceEntity = DashboardSourceEntity.builder()
.sourceTemplate(sourceTemplate)
.build();
UpdateDashboardRequest dashboardRequest = UpdateDashboardRequest.builder()
.awsAccountId(account)
.dashboardId(dashboardId)
.name("UpdateTest")
.sourceEntity(sourceEntity)
.themeArn("arn:aws:quicksight::aws:theme/SEASIDE")
.build();
UpdateDashboardResponse response = qsClient.updateDashboard(dashboardRequest);
System.out.println("Dashboard " + response.dashboardId() + " has been updated");
} catch (QuickSightException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
}
private static String getTemplateARN(QuickSightClient qsClient, String account, String dataset, String analysisArn) {
String arn = "";
try {
DataSetReference setReference = DataSetReference.builder()
.dataSetArn(dataset)
.dataSetPlaceholder("Dataset placeholder2")
.build();
TemplateSourceAnalysis templateSourceAnalysis = TemplateSourceAnalysis.builder()
.dataSetReferences(setReference)
.arn(analysisArn)
.build();
TemplateSourceEntity sourceEntity = TemplateSourceEntity.builder()
.sourceAnalysis(templateSourceAnalysis)
.build();
CreateTemplateRequest createTemplateRequest = CreateTemplateRequest.builder()
.awsAccountId(account)
.name("NewTemplate")
.sourceEntity(sourceEntity)
.templateId("a9a277fb-7239-4890-bc7a-8a3e82d67a37") // Specify a GUID value
.build();
CreateTemplateResponse response = qsClient.createTemplate(createTemplateRequest);
arn = response.arn();
} catch (QuickSightException e) {
System.err.println(e.awsErrorDetails().errorMessage());
System.exit(1);
}
return arn;
}
}

Accessing specified key from s3 bucket?

I have an S3 bucket xxx. I wrote a Lambda function that reads data from the S3 bucket and writes those details to an RDS PostgreSQL instance. This works with my code. I added a trigger to the Lambda function so that it is invoked when a file lands in S3.
But my code can only read the file named 'SampleData.csv'. Consider my code given below:
public class LambdaFunctionHandler implements RequestHandler<S3Event, String> {
private AmazonS3 s3 = AmazonS3ClientBuilder.standard().build();
public LambdaFunctionHandler() {}
// Test purpose only.
LambdaFunctionHandler(AmazonS3 s3) {
this.s3 = s3;
}
@Override
public String handleRequest(S3Event event, Context context) {
context.getLogger().log("Received event: " + event);
String bucket = "xxx";
String key = "SampleData.csv";
System.out.println(key);
try {
S3Object response = s3.getObject(new GetObjectRequest(bucket, key));
String contentType = response.getObjectMetadata().getContentType();
context.getLogger().log("CONTENT TYPE: " + contentType);
// Read the source file as text
AmazonS3 s3Client = new AmazonS3Client();
String body = s3Client.getObjectAsString(bucket, key);
System.out.println("Body: " + body);
System.out.println();
System.out.println("Reading as stream.....");
System.out.println();
BufferedReader br = new BufferedReader(new InputStreamReader(response.getObjectContent()));
// just saving the excel sheet data to the DataBase
String csvOutput;
try {
Class.forName("org.postgresql.Driver");
Connection con = DriverManager.getConnection("jdbc:postgresql://ENDPOINT:5432/DBNAME","USER", "PASSWORD");
System.out.println("Connected");
// Checking EOF
while ((csvOutput = br.readLine()) != null) {
String[] str = csvOutput.split(",");
String name = str[1];
String query = "insert into schema.tablename(name) values('"+name+"')";
Statement statement = con.createStatement();
statement.executeUpdate(query);
}
System.out.println("Inserted Successfully!!!");
}catch (Exception ase) {
context.getLogger().log(String.format(
"Error getting object %s from bucket %s. Make sure they exist and"
+ " your bucket is in the same region as this function.", key, bucket));
// throw ase;
}
return contentType;
} catch (Exception e) {
e.printStackTrace();
context.getLogger().log(String.format(
"Error getting object %s from bucket %s. Make sure they exist and"
+ " your bucket is in the same region as this function.", key, bucket));
throw e;
}
}
As you can see, my code hard-codes key = "SampleData.csv". Is there any way to get the key of an object in the bucket without specifying a particular file name?
These two links should help:
http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysHierarchy.html
http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingObjectKeysUsingJava.html
You can list objects using prefix and delimiter to find the key you are looking for without passing a specific filename.
If you need the event details from S3, you can enable S3 event notifications to the Lambda function.
You can enable this as follows:
Click on 'Properties' inside your bucket
Click on 'Events'
Click 'Add notification'
Give it a name and select the type of event (e.g. Put, Delete, etc.)
Give a prefix and suffix if necessary, or leave them blank to consider all events
Then set 'Send to' to Lambda function and provide the Lambda ARN.
Now the event details will be sent to the Lambda function in JSON format. You can fetch the details from that JSON. The input will look like this:
{"Records":[{"eventVersion":"2.0","eventSource":"aws:s3","awsRegion":"ap-south-1","eventTime":"2017-11-23T09:25:54.845Z","eventName":"ObjectRemoved:Delete","userIdentity":{"principalId":"AWS:AIDAJASDFGZTLA6UZ7YAK"},"requestParameters":{"sourceIPAddress":"52.95.72.70"},"responseElements":{"x-amz-request-id":"A235BER45D4974E","x-amz-id-2":"glUK9ZyNDCjMQrgjFGH0t7Dz19eBrJeIbTCBNI+Pe9tQugeHk88zHOY90DEBcVgruB9BdU0vV8="},"s3":{"s3SchemaVersion":"1.0","configurationId":"sns","bucket":{"name":"example-bucket1","ownerIdentity":{"principalId":"AQFXV36adJU8"},"arn":"arn:aws:s3:::example-bucket1"},"object":{"key":"SampleData.csv","sequencer":"005A169422CA7CDF66"}}}]}
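You can access the key as objectname = event['Records'][0]['s3']['object']['key'] in Python; in Java, with the S3Event type your handler already receives, the equivalent is event.getRecords().get(0).getS3().getObject().getKey()
and then send this info to RDS.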
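One detail to keep in mind: the object key in an S3 event notification is URL-encoded (for example, a space in the key arrives as "+"), so it is usually worth decoding the key before using it to fetch the object. A minimal sketch using only the JDK; the class and method names are made up for illustration:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical helper for keys taken from S3 event notifications. */
final class S3EventKeys {

    /** Decodes a URL-encoded key from an S3 event into the real object key. */
    static String decodeKey(String rawKey) {
        try {
            return URLDecoder.decode(rawKey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }
}
```

For a key like SampleData.csv the decoded value is unchanged, but a key uploaded as "red flower.jpg" shows up in the event as "red+flower.jpg" and only matches the stored object after decoding.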