I am creating a method that needs an S3 client as a parameter, but I do not know what type I should declare it to be.
this is the doc for S3Client https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Client.html
(Ignore since answered: this is the doc for AmazonS3Client
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html
My question is: which type is recommended, and what are the differences between them? Thank you!)
Update:
I found another S3 client here: the AmazonS3 interface.
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3.html
However, setObjectTagging is supported by the AmazonS3 type but not by the S3Client type.
Does AmazonS3 provide more functionality than S3Client?
What if I need some functionality that exists in AmazonS3 but not in S3Client, or in S3Client but not in AmazonS3?
The AWS SDK for Java has two versions: V1 and V2. AmazonS3Client is the older V1 version while S3Client is the newer V2 version.
Amazon recommends using V2:
The AWS SDK for Java 2.x is a major rewrite of the version 1.x code base. It’s built on top of Java 8+ and adds several frequently requested features. These include support for non-blocking I/O and the ability to plug in a different HTTP implementation at run time.
You can find Amazon S3 V2 code examples in the AWS SDK for Java 2.x Developer Guide here:
Developer guide - AWS SDK for Java 2.x
(At this point, the Amazon S3 Service guide does not have V2 examples in it.)
In addition, you can find all Amazon S3 V2 code examples in the AWS GitHub repository here:
https://github.com/awsdocs/aws-doc-sdk-examples/tree/master/javav2/example_code/s3
If you are not familiar with developing apps using the AWS SDK for Java V2, it's recommended that you start here:
Get started with the AWS SDK for Java 2.x
(This getting-started topic happens to use the Amazon S3 Java V2 API to get you up and running with the AWS SDK for Java V2.)
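To make the V1/V2 distinction concrete, here is a minimal sketch of how each client is typically constructed (the regions are illustrative assumptions, not values from your code):
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

// V1 client (the AmazonS3 interface / AmazonS3Client implementation).
AmazonS3 v1Client = AmazonS3ClientBuilder.standard()
    .withRegion(Regions.US_EAST_1)   // illustrative region
    .build();

// V2 client (S3Client) - the one Amazon recommends for new code.
S3Client v2Client = S3Client.builder()
    .region(Region.US_EAST_1)        // illustrative region
    .build();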
Update:
You stated: "However, setObjectTagging is supported by the AmazonS3 type but not by the S3Client type."
The way to tag an object in an Amazon S3 bucket using the Java V2 API is to use code like this:
// Imports assumed for this snippet (AWS SDK for Java V2).
import software.amazon.awssdk.services.s3.model.GetObjectTaggingRequest;
import software.amazon.awssdk.services.s3.model.GetObjectTaggingResponse;
import software.amazon.awssdk.services.s3.model.PutObjectTaggingRequest;
import software.amazon.awssdk.services.s3.model.Tag;
import software.amazon.awssdk.services.s3.model.Tagging;
import java.util.ArrayList;
import java.util.List;

// First get the existing tag set; otherwise the existing tags are overwritten.
GetObjectTaggingRequest getObjectTaggingRequest = GetObjectTaggingRequest.builder()
    .bucket(bucketName)
    .key(key)
    .build();

GetObjectTaggingResponse response = s3.getObjectTagging(getObjectTaggingRequest);

// The returned tag list is immutable - copy it into a modifiable list.
List<Tag> existingList = response.tagSet();
List<Tag> newTagList = new ArrayList<>(existingList);

// Create a new tag.
Tag myTag = Tag.builder()
    .key(label)
    .value(labelValue)
    .build();

// Add the new tag to the list.
newTagList.add(myTag);

Tagging tagging = Tagging.builder()
    .tagSet(newTagList)
    .build();

PutObjectTaggingRequest taggingRequest = PutObjectTaggingRequest.builder()
    .key(key)
    .bucket(bucketName)
    .tagging(tagging)
    .build();

s3.putObjectTagging(taggingRequest);
Related
I'm trying to implement multi-part upload to Google Storage, but to my surprise it does not seem to be straightforward (I could not find a Java example).
Only mention I found was in the XML API https://cloud.google.com/storage/docs/multipart-uploads
I also found some discussion around a compose API (StorageExample.java#L446), mentioned in google-cloud-java issue 1440.
Any recommendations how to do multipart upload?
I got the multi-part upload working with Kolban's suggestion (for details, check the blog post).
This is how I create the S3 Client and point it to Google Storage
import com.amazonaws.ClientConfiguration
import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicAWSCredentials}
import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration
import com.amazonaws.services.s3.{AmazonS3, AmazonS3ClientBuilder}

// Build a V1 AmazonS3 client whose endpoint points at Google Cloud Storage's
// S3-compatible XML API instead of AWS.
def createClient(accessKey: String, secretKey: String, region: String = "us"): AmazonS3 = {
  val endpointConfig = new EndpointConfiguration("https://storage.googleapis.com", region)
  val credentials = new BasicAWSCredentials(accessKey, secretKey)
  val credentialsProvider = new AWSStaticCredentialsProvider(credentials)

  val clientConfig = new ClientConfiguration()
  clientConfig.setUseGzip(true)
  clientConfig.setMaxConnections(200)
  clientConfig.setMaxErrorRetry(1)

  val clientBuilder = AmazonS3ClientBuilder.standard()
  clientBuilder.setEndpointConfiguration(endpointConfig)
  clientBuilder.withCredentials(credentialsProvider)
  clientBuilder.withClientConfiguration(clientConfig)
  clientBuilder.build()
}
Because I'm doing the upload from the frontend (after I generate signed URLs for each part using the AmazonS3 client; see the sketch after the CORS commands below), I needed to enable CORS.
For testing, I enabled everything for now:
$ gsutil cors get gs://bucket
$ echo '[{"origin": ["*"],"responseHeader": ["Content-Type", "ETag"],"method": ["GET", "HEAD", "PUT", "DELETE", "PATCH"],"maxAgeSeconds": 3600}]' > cors-config.json
$ gsutil cors set cors-config.json gs://bucket
See https://cloud.google.com/storage/docs/configuring-cors#gsutil_1
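For reference, here is a minimal Java sketch of how a pre-signed URL for one part can be generated with the V1 AmazonS3 client; it assumes s3 is the client returned by createClient above, and bucket, key, and partNumber are placeholders rather than values from the blog post:
import java.net.URL;
import com.amazonaws.HttpMethod;
import com.amazonaws.services.s3.model.GeneratePresignedUrlRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;

// Start the multipart upload to obtain an uploadId.
InitiateMultipartUploadResult init = s3.initiateMultipartUpload(
    new InitiateMultipartUploadRequest(bucket, key));
String uploadId = init.getUploadId();

// Pre-sign a PUT URL for one part; the frontend uploads that part's bytes to this URL.
GeneratePresignedUrlRequest presignRequest =
    new GeneratePresignedUrlRequest(bucket, key, HttpMethod.PUT);
presignRequest.addRequestParameter("uploadId", uploadId);
presignRequest.addRequestParameter("partNumber", String.valueOf(partNumber));

URL partUrl = s3.generatePresignedUrl(presignRequest);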
Currently, a Java client library for multipart upload in Cloud Storage is not available. You can raise a feature request for it at this link. As mentioned by John Hanley, the next best thing you can do is a parallel composite upload with gsutil (CLI), JSON and XML support, or a resumable upload with the Java libraries.
With a parallel composite upload, the parallel writes can be done using the JSON or XML API for Google Cloud Storage. Specifically, you write a number of smaller objects in parallel and then (once all of those objects have been written) call the compose request to combine them into one larger object.
If you're using the JSON API, the compose documentation is at: https://cloud.google.com/storage/docs/json_api/v1/objects/compose
If you're using the XML API, the compose documentation is at: https://cloud.google.com/storage/docs/reference-methods#putobject (see the compose query parameter).
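As a rough illustration of the compose step using the google-cloud-storage Java client library (the bucket and object names below are placeholder assumptions):
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

// Compose part objects (written in parallel beforehand) into one final object.
Storage storage = StorageOptions.getDefaultInstance().getService();

Storage.ComposeRequest composeRequest = Storage.ComposeRequest.newBuilder()
    .setTarget(BlobInfo.newBuilder("my-bucket", "final-object").build())
    .addSource("part-0001")
    .addSource("part-0002")
    .build();

Blob composed = storage.compose(composeRequest);
System.out.println("Composed object size: " + composed.getSize());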
There is also an interesting document link provided by Kolban which you can try and work out. I would also like to mention that you can have multipart uploads in Java if you use the Google Drive API (v3); there is a code example that uses the files.create method with uploadType=multipart.
I am using the TapKey SDK in combination with a FlinkeyBox.
So far (SDK 2.12.7), I used to be able to set the BleServiceUuid in the TapkeyEnvironmentConfigBuilder.
Now I've upgraded to the newest SDK version and the method TapkeyEnvironmentConfigBuilder.setBleServiceUuid is simply gone. I can't find it in any migration guide either.
Can someone help?
Indeed this information is missing. We will cover this in the migration guide.
To change the BLE service UUID, you now have to use the TapkeyBleAdvertisingFormatBuilder.
TapkeyBleAdvertisingFormat advertisingFormat = new TapkeyBleAdvertisingFormatBuilder()
    .addV1Format("[serviceUUID]")
    //.addV2Format([domainID])
    .build();

TapkeyServiceFactory tapkeyServiceFactory = new TapkeyServiceFactoryBuilder(this)
    .setBluetoothAdvertisingFormat(advertisingFormat)
    ...
    .build();
New hardware generations will use a new Bluetooth advertising format, which then has to be configured with the V2 format. For now it is sufficient to configure just the V1 format. For more information about how to configure the TapkeyBleAdvertisingFormat, please contact your service provider.
I am trying to save an RDD to S3 with server-side encryption using a KMS key (SSE-KMS), but I am getting the following exception:
Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception:
Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 695E32175EBA568A,
AWS Error Code: InvalidArgument,
AWS Error Message: The encryption method specified is not supported,
S3 Extended Request ID: Pi+HFLg0WsAWtkdI2S/xViOcRPMCi7zdHiaO5n1f7tiwpJe2z0lPY1C2Cr53PnnUCj3358Gx3AQ=
Following is the piece of my test code that writes an RDD to S3 using SSE-KMS for encryption:
val sparkConf = new SparkConf().
  setMaster("local[*]").
  setAppName("aws-encryption")
val sc = new SparkContext(sparkConf)
sc.hadoopConfiguration.set("fs.s3a.access.key", AWS_ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3a.secret.key", AWS_SECRET_KEY)
sc.hadoopConfiguration.setBoolean("fs.s3a.sse.enabled", true)
sc.hadoopConfiguration.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3a.sse.kms.keyId", KMS_ID)
val s3a = new org.apache.hadoop.fs.s3a.S3AFileSystem
val s3aName = s3a.getClass.getName
sc.hadoopConfiguration.set("fs.s3a.impl", s3aName)
val rdd = sc.parallelize(Seq("one", "two", "three", "four"))
println("rdd is: " + rdd.collect())
rdd.saveAsTextFile(s"s3a://$bucket/$objKey")
However, I am able to write the RDD to S3 with AES256 encryption.
Does Spark/Hadoop expect a different value for KMS key encryption instead of "SSE-KMS"?
Can anyone please suggest what I am missing here or doing wrong?
Environment details are as follows:
Spark: 1.6.1
Hadoop: 2.6.0
Aws-Java-Sdk: 1.7.4
Thank you in advance.
Unfortunately, it seems that the existing Hadoop version (i.e. 2.8) does not support SSE-KMS.
Following are the observations:
SSE-KMS is not supported up to Hadoop 2.8.1
SSE-KMS is supposed to be introduced in Hadoop 2.9
In the Hadoop 3.0.0-alpha version, SSE-KMS is supported
The same observation applies to the AWS SDK for Java:
SSE-KMS was introduced in aws-java-sdk 1.9.5
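For reference, once you move to a Hadoop release whose S3A connector supports SSE-KMS (2.9+/3.x), the configuration would look roughly like the sketch below. The property names come from the newer S3A documentation and the key ARN is a placeholder, so please verify both against the Hadoop version you actually deploy:
import org.apache.hadoop.conf.Configuration;

// Sketch for Hadoop 2.9+ / 3.x only - not for the 2.6.0 setup above.
Configuration conf = new Configuration();
conf.set("fs.s3a.access.key", AWS_ACCESS_KEY);   // same variables as in the question
conf.set("fs.s3a.secret.key", AWS_SECRET_KEY);
conf.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS");
// Placeholder key ARN - substitute your own KMS key.
conf.set("fs.s3a.server-side-encryption.key",
    "arn:aws:kms:us-east-1:111122223333:key/your-key-id");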
I need a way to allow a 3rd party app to upload a txt file (350KB and slowly growing) to an s3 bucket in AWS. I'm hoping for a solution involving an endpoint they can PUT to with some authorization key or the like in the header. The bucket can't be public to all.
I've read this: http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html
and this: http://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html
but can't quite seem to find the solution I'm seeking.
I'd suggest using a combination of AWS API Gateway, a Lambda function, and finally S3.
Your clients will call the API Gateway endpoint.
The endpoint will execute an AWS Lambda function that will then write out the file to S3.
Only the Lambda function will need rights to the bucket, so the bucket will remain non-public and protected.
If you already have an EC2 instance running, you could replace the Lambda piece with custom code running on your EC2 instance, but using Lambda will allow you to have a 'serverless' solution that scales automatically and has no minimum monthly cost.
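As a rough sketch of what the Lambda piece could look like in Java (the bucket name, key scheme, and the use of API Gateway's Lambda proxy integration are assumptions on my part, not requirements):
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Receives the uploaded text from API Gateway (proxy integration) and writes it to S3.
// Only this function's execution role needs s3:PutObject on the bucket.
public class UploadHandler implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final String BUCKET = "my-upload-bucket";  // placeholder bucket name
    private final AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

    @Override
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent request, Context context) {
        String key = "uploads/" + System.currentTimeMillis() + ".txt";  // illustrative key scheme
        s3.putObject(BUCKET, key, request.getBody());
        return new APIGatewayProxyResponseEvent()
                .withStatusCode(200)
                .withBody("Stored as " + key);
    }
}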
I ended up using the AWS SDK. It's available for Java, .NET, PHP, and Ruby, so there's a very high probability the 3rd party app is using one of those. See here: http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadObjSingleOpNET.html
In that case, it's just a matter of them using the SDK to upload the file. I wrote a sample version in .NET running on my local machine. First, install the AWSSDK NuGet package. Then, here is the code (taken from an AWS sample):
C#:
var bucketName = "my-bucket";
var keyName = "what-you-want-the-name-of-S3-object-to-be";
var filePath = "C:\\Users\\scott\\Desktop\\test_upload.txt";
var client = new AmazonS3Client(Amazon.RegionEndpoint.USWest2);
try
{
PutObjectRequest putRequest2 = new PutObjectRequest
{
BucketName = bucketName,
Key = keyName,
FilePath = filePath,
ContentType = "text/plain"
};
putRequest2.Metadata.Add("x-amz-meta-title", "someTitle");
PutObjectResponse response2 = client.PutObject(putRequest2);
}
catch (AmazonS3Exception amazonS3Exception)
{
if (amazonS3Exception.ErrorCode != null &&
(amazonS3Exception.ErrorCode.Equals("InvalidAccessKeyId")
||
amazonS3Exception.ErrorCode.Equals("InvalidSecurity")))
{
Console.WriteLine("Check the provided AWS Credentials.");
Console.WriteLine(
"For service sign up go to http://aws.amazon.com/s3");
}
else
{
Console.WriteLine(
"Error occurred. Message:'{0}' when writing an object"
, amazonS3Exception.Message);
}
}
Web.config:
<add key="AWSAccessKey" value="your-access-key"/>
<add key="AWSSecretKey" value="your-secret-key"/>
You get the accesskey and secret key by creating a new user in your AWS account. When you do so, they'll generate those for you and provide them for download. You can then attach the AmazonS3FullAccess policy to that user and the document will be uploaded to S3.
NOTE: this was a POC. In the actual 3rd party app using this, they won't want to hardcode the credentials in the web config for security purposes. See here: http://docs.aws.amazon.com/AWSSdkDocsNET/latest/V2/DeveloperGuide/net-dg-config-creds.html
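For comparison, a roughly equivalent upload with the AWS SDK for Java V1 would look like the sketch below; the bucket, key, and file path are placeholder assumptions, and credentials are resolved through the default provider chain rather than hardcoded in a config file:
import java.io.File;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

// Upload a single text file to S3, mirroring the .NET sample above.
AmazonS3 s3 = AmazonS3ClientBuilder.standard()
    .withRegion(Regions.US_WEST_2)
    .build();

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("text/plain");
metadata.addUserMetadata("title", "someTitle");

PutObjectRequest request = new PutObjectRequest(
        "my-bucket",                                   // placeholder bucket
        "what-you-want-the-name-of-S3-object-to-be",   // placeholder key
        new File("/path/to/test_upload.txt"))          // placeholder file path
    .withMetadata(metadata);

s3.putObject(request);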
Typically, EMRFS consistency is enabled via emrfs-site.xml:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emrfs-configure-consistent-view.html
Does anyone know if these settings can be accessed via the SDK?
To enable the EMRFS consistent view with the Java SDK, an "emrfs-site" configuration needs to be added to the RunJobFlowRequest and the fs.s3.consistent property must be set to true, like this:
Map<String, String> emrfsProperties = new HashMap<>();
emrfsProperties.put("fs.s3.consistent", "true");

RunJobFlowRequest request = new RunJobFlowRequest()
    ....
    .withServiceRole(SERVICE_ROLE)
    .withJobFlowRole(JOB_FLOW_ROLE)
    .withConfigurations(
        new Configuration().withClassification("yarn-site").withProperties(yarnProperties),
        new Configuration().withClassification("emrfs-site").withProperties(emrfsProperties)
    )
    .withInstances(new JobFlowInstancesConfig()
        .withEc2KeyName(EC2_KEYPAIR)
        ....
A full list of EMRFS configuration parameters can be found here
Yes, there is full documentation here:
http://docs.aws.amazon.com/ElasticMapReduce/latest/API/Welcome.html
You need to authorize the connection to your AWS account first; then you can configure your application to your needs using the API.
Look also here:
http://docs.aws.amazon.com/ElasticMapReduce/latest/API/CommonParameters.html
http://docs.aws.amazon.com/ElasticMapReduce/latest/API/EmrConfigurations.html