How to connect to different regional S3 buckets from a Spring Boot application?

I have a Spring Boot application with a POST endpoint that accepts two types of files. Based on the file category, I need to write them to S3 buckets in different regions. Example: category 1 files should be written to Frankfurt (eu-central-1) and category 2 files to Ohio (us-east-2). Spring Boot accepts a static region (cloud.aws.region.static=eu-central-1) through property configuration, and the connection is established at startup, so the AmazonS3 client bean is already created with a connection to Frankfurt.
I need to containerize this entire setup and deploy it in a Kubernetes pod.
What is the recommended way to establish connections and write objects to buckets in different regions? How should I implement this? I'm looking for a dynamic region-resolution solution rather than a statically created bean per region.
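The category-to-region decision itself is plain routing and can live outside the S3 client entirely. A minimal sketch of that lookup, with hypothetical category names and fallback behavior (the real categories would come from the upload request):

```java
import java.util.Map;

// Hypothetical routing table from file category to bucket region.
final class RegionRouter {
    private static final Map<String, String> CATEGORY_TO_REGION = Map.of(
            "category1", "eu-central-1",   // Frankfurt
            "category2", "us-east-2");     // Ohio

    static String regionFor(String category) {
        String region = CATEGORY_TO_REGION.get(category);
        if (region == null) {
            throw new IllegalArgumentException("No bucket region configured for category: " + category);
        }
        return region;
    }
}
```

The service layer can then resolve the region per request and pick the matching client, instead of baking one region into the application properties.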
Below is a working piece of code that connects to the Frankfurt bucket and PUTs the object.
@Service
public class S3Service {

    @Autowired
    private AmazonS3 amazonS3Client;

    public void putObject(MultipartFile multipartFile) {
        ObjectMetadata objectMetaData = new ObjectMetadata();
        objectMetaData.setContentType(multipartFile.getContentType());
        objectMetaData.setContentLength(multipartFile.getSize());
        try {
            PutObjectRequest putObjectRequest = new PutObjectRequest("example-bucket",
                    multipartFile.getOriginalFilename(), multipartFile.getInputStream(), objectMetaData);
            this.amazonS3Client.putObject(putObjectRequest);
        } catch (IOException e) {
            /* Handle exception */
        }
    }
}
Updated Code (20/08/2021)
@Component
public class AmazonS3ConnectionFactory {

    private static final Logger LOGGER = LoggerFactory.getLogger(AmazonS3ConnectionFactory.class);

    @Value("${example.aws.s3.regions}")
    private String[] regions;

    @Autowired
    private DefaultListableBeanFactory beanFactory;

    @Autowired
    private AWSCredentialsProvider credentialProvider;

    @PostConstruct
    public void init() {
        for (String region : this.regions) {
            String amazonS3BeanName = region + "_" + "amazonS3";
            if (!this.beanFactory.containsBean(amazonS3BeanName)) {
                AmazonS3ClientBuilder builder = AmazonS3ClientBuilder.standard()
                        .withPathStyleAccessEnabled(true)
                        .withCredentials(this.credentialProvider)
                        .withRegion(region)
                        .withChunkedEncodingDisabled(true);
                AmazonS3 awsS3 = builder.build();
                this.beanFactory.registerSingleton(amazonS3BeanName, awsS3);
                LOGGER.info("Bean " + amazonS3BeanName + " did not exist. Created and registered it.");
            }
        }
    }

    /**
     * Returns an {@link AmazonS3} client for a region. Uses the default {@link AWSCredentialsProvider}.
     */
    public AmazonS3 getConnection(String region) {
        String amazonS3BeanName = region + "_" + "amazonS3";
        return this.beanFactory.getBean(amazonS3BeanName, AmazonS3.class);
    }
}
My service layer calls getConnection() to obtain the AmazonS3 client for the target region and operates on it.

The only option I am aware of is to create a different S3 client with S3ClientBuilder for each region. You would need to register them as Spring beans under different names so that you can later autowire them.
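The same idea can also be written without pre-registering beans: build each regional client lazily on first use and cache it. A sketch of that caching pattern, with a generic type parameter and a pluggable factory standing in for AmazonS3ClientBuilder (both are placeholders, not the AWS API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Builds one client per region on first use and reuses it afterwards.
// T stands in for AmazonS3 / S3Client; the factory would call the SDK builder.
final class PerRegionClientCache<T> {
    private final Map<String, T> clients = new ConcurrentHashMap<>();
    private final Function<String, T> factory;

    PerRegionClientCache(Function<String, T> factory) {
        this.factory = factory;
    }

    T forRegion(String region) {
        // computeIfAbsent builds the client only once per region, thread-safely.
        return clients.computeIfAbsent(region, factory);
    }
}
```

With the real SDK the factory would be something like region -> AmazonS3ClientBuilder.standard().withRegion(region).build().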
Update (19/08/2021)
The following should work (sorry for the Kotlin code, but it was faster to write).
A class that may contain your configuration for each region:
class AmazonS3Properties(val accessKeyId: String,
                         val secretAccessKey: String,
                         val region: String,
                         val bucket: String)
Configuration for S3 that creates the two S3Clients and stores the bucket name for each region (needed later):
@Configuration
class AmazonS3Configuration(private val s3Properties: Map<String, AmazonS3Properties>) {

    lateinit var buckets: Map<String, String>

    @PostConstruct
    fun init() {
        buckets = s3Properties.mapValues { it.value.bucket }
    }

    @Bean(name = ["regionA"])
    fun regionA(): S3Client {
        val regionAProperties = s3Properties.getValue("region-a")
        val awsCredentials = AwsBasicCredentials.create(regionAProperties.accessKeyId, regionAProperties.secretAccessKey)
        return S3Client.builder().region(Region.of(regionAProperties.region)).credentialsProvider { awsCredentials }.build()
    }

    @Bean(name = ["regionB"])
    fun regionB(): S3Client {
        val regionBProperties = s3Properties.getValue("region-b")
        val awsCredentials = AwsBasicCredentials.create(regionBProperties.accessKeyId, regionBProperties.secretAccessKey)
        return S3Client.builder().region(Region.of(regionBProperties.region)).credentialsProvider { awsCredentials }.build()
    }
}
A service that targets one of the regions (region A):
@Service
class RegionAS3Service(private val amazonS3Configuration: AmazonS3Configuration,
                       @Qualifier("regionA") private val amazonS3Client: S3Client) {

    fun save(region: String, byteArrayOutputStream: ByteArrayOutputStream) {
        val inputStream = ByteArrayInputStream(byteArrayOutputStream.toByteArray())
        val contentLength = byteArrayOutputStream.size().toLong()
        amazonS3Client.putObject(
            PutObjectRequest.builder().bucket(amazonS3Configuration.buckets[region]).key("whatever-key").build(),
            RequestBody.fromInputStream(inputStream, contentLength))
    }
}

Related

DynamoDB client with auto refresh credentials

I'm creating a DynamoDB client using IAM credentials obtained via an STS assume-role call.
@Provides
public DynamoDbEnhancedClient ddbClientProvider() {
    final AWSSecurityTokenServiceClientBuilder stsClientBuilder = AWSSecurityTokenServiceClientBuilder.standard()
            .withClientConfiguration(new ClientConfiguration());
    final AssumeRoleRequest assumeRoleRequest = new AssumeRoleRequest().withRoleSessionName("some.session.name");
    assumeRoleRequest.setRoleArn("arnRole");
    final AssumeRoleResult assumeRoleResult = stsClientBuilder.build().assumeRole(assumeRoleRequest);
    final Credentials creds = assumeRoleResult.getCredentials();
    final AwsSessionCredentials sessionCredentials = AwsSessionCredentials.create(creds.getAccessKeyId(),
            creds.getSecretAccessKey(), creds.getSessionToken());
    final AwsCredentialsProviderChain credsProvider = AwsCredentialsProviderChain.builder()
            .credentialsProviders(StaticCredentialsProvider.create(sessionCredentials))
            .build();
    final DynamoDbClient ddbClient = DynamoDbClient.builder().region(Region.US_EAST_1)
            .credentialsProvider(credsProvider).build();
    return DynamoDbEnhancedClient.builder().dynamoDbClient(ddbClient).build();
}
The main Lambda handler looks like this:
public class LambdaMainHandler {

    private final DynamoDbEnhancedClient ddbClient;

    @Inject
    public LambdaMainHandler(final DynamoDbEnhancedClient client) {
        this.ddbClient = client;
    }

    public LambdaResponse processRequest(final LambdaRequest request) {
        QueryResponse queryResponse = ddbClient.query("...");
        return LambdaResponse.builder().setContent(queryResponse.getByteBuffer()).build();
    }
}
I'm building the DDB client in the LambdaMainHandler constructor.
Since this runs in Lambda behind API Gateway, how do I make sure the credentials are refreshed when they expire while the handler is executing?
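One option is to stop caching a single assume-role result: AWS SDK for Java v2 ships StsAssumeRoleCredentialsProvider, which re-runs the AssumeRole call as the session nears expiry, instead of the static provider built once above. The underlying refresh-on-expiry pattern, sketched in plain Java with hypothetical types and no AWS calls:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical holder for a credential set and its expiry time.
final class TimedCredentials {
    final String sessionToken;
    final Instant expiration;
    TimedCredentials(String sessionToken, Instant expiration) {
        this.sessionToken = sessionToken;
        this.expiration = expiration;
    }
}

// Re-fetches credentials once they are within a safety margin of expiring,
// instead of caching one assume-role result for the life of the process.
final class RefreshingCredentialsProvider {
    private final Supplier<TimedCredentials> fetcher;   // e.g. wraps sts.assumeRole(...)
    private final Duration margin;
    private TimedCredentials cached;

    RefreshingCredentialsProvider(Supplier<TimedCredentials> fetcher, Duration margin) {
        this.fetcher = fetcher;
        this.margin = margin;
    }

    synchronized TimedCredentials resolve(Instant now) {
        if (cached == null || now.isAfter(cached.expiration.minus(margin))) {
            cached = fetcher.get();   // refresh: assume the role again
        }
        return cached;
    }
}
```

In the Lambda above, the provider method would hand the SDK a refreshing provider rather than StaticCredentialsProvider, so long-lived containers keep working after the first session expires.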

Dataflow job is failing

I'm trying to trigger a Dataflow job to process a CSV file and then save the data into PostgreSQL.
The pipeline is written in Java.
This is my pipeline code:
public class DataMappingService {

    DataflowPipelineOptions options;
    Pipeline pipeline;

    private String jdbcUrl;
    private String DB_NAME;
    private String DB_PRIVATE_IP;
    private String DB_USERNAME;
    private String DB_PASSWORD;
    private String PROJECT_ID;
    private String SERVICE_ACCOUNT;

    private static final String SQL_INSERT = "INSERT INTO upfit(upfitter_id, model_number, upfit_name, upfit_description, manufacturer, length, width, height, dimension_unit, weight, weight_unit, color, price, stock_number) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?)";

    public DataMappingService() {
        DB_NAME = System.getenv("DB_NAME");
        DB_PRIVATE_IP = System.getenv("DB_PRIVATEIP");
        DB_USERNAME = System.getenv("DB_USERNAME");
        DB_PASSWORD = System.getenv("DB_PASSWORD");
        PROJECT_ID = System.getenv("GOOGLE_PROJECT_ID");
        SERVICE_ACCOUNT = System.getenv("SERVICE_ACCOUNT_EMAIL");
        jdbcUrl = "jdbc:postgresql://" + DB_PRIVATE_IP + ":5432/" + DB_NAME;
        System.out.println("jdbcUrl: " + jdbcUrl);
        System.out.println("dbUsername: " + DB_USERNAME);
        System.out.println("dbPassword: " + DB_PASSWORD);
        System.out.println("dbName: " + DB_NAME);
        System.out.println("projectId: " + PROJECT_ID);
        System.out.println("service account: " + SERVICE_ACCOUNT);

        options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
        options.setRunner(DataflowRunner.class);
        options.setProject(PROJECT_ID);
        options.setServiceAccount(SERVICE_ACCOUNT);
        options.setWorkerRegion("us-east4");
        options.setTempLocation("gs://upfit_data_flow_bucket/temp");
        options.setStagingLocation("gs://upfit_data_flow_bucket/binaries");
        options.setRegion("us-east4");
        options.setSubnetwork("regions/us-east-4/subnetworks/us-east4-public");
        options.setMaxNumWorkers(3);
    }

    public void processData(String gcsFilePath) {
        try {
            pipeline = Pipeline.create(options);
            System.out.println("pipelineOptions: " + pipeline.getOptions());
            pipeline.apply("ReadLines", TextIO.read().from(gcsFilePath))
                    .apply("SplitLines", new SplitLines())
                    .apply("SplitRecords", new SplitRecord())
                    .apply("ReadUpfits", new ReadUpfits())
                    .apply("write upfits", JdbcIO.<Upfit>write()
                            .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
                                    "org.postgresql.Driver", jdbcUrl)
                                    .withUsername(DB_USERNAME)
                                    .withPassword(DB_PASSWORD))
                            .withStatement(SQL_INSERT)
                            .withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<Upfit>() {
                                private static final long serialVersionUID = 1L;

                                @Override
                                public void setParameters(Upfit element, PreparedStatement query) throws SQLException {
                                    query.setInt(1, element.getUpfitterId());
                                    query.setString(2, element.getModelNumber());
                                    query.setString(3, element.getUpfitName());
                                    query.setString(4, element.getUpfitDescription());
                                    query.setString(5, element.getManufacturer());
                                    query.setString(6, element.getLength());
                                    query.setString(7, element.getWidth());
                                    query.setString(8, element.getHeight());
                                    query.setString(9, element.getDimensionsUnit());
                                    query.setString(10, element.getWeight());
                                    query.setString(11, element.getWeightUnit());
                                    query.setString(12, element.getColor());
                                    query.setString(13, element.getPrice());
                                    query.setInt(14, element.getStockAmmount());
                                }
                            }));
            pipeline.run();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I have created a user-managed service account as the documentation describes (https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#specifying_a_user-managed_worker_service_account), and I'm providing the service account email in the pipeline options.
The service account has the following roles:
roles/dataflow.worker
roles/storage.admin
iam.serviceAccounts.actAs
roles/dataflow.admin
Service Account Token Creator
When I upload the CSV file, the pipeline is triggered, but I'm getting the following error:
Workflow failed. Causes: Subnetwork https://www.googleapis.com/compute/v1/projects/project-name/regions/us-east-4/subnetworks/us-east4-public is not accessible to Dataflow Service account or does not exist
I know that the subnetwork exists, so I'm assuming it's a permission error. The VPC network is created by my organization, as we're not allowed to create our own.
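Before chasing IAM, it is worth checking that the region embedded in the subnetwork path matches the configured job region: the options above set the region to us-east4, while the subnetwork path contains us-east-4. A throwaway consistency check along those lines:

```java
// Extracts the region segment from a subnetwork path such as
// "regions/us-east4/subnetworks/us-east4-public" and compares it
// with the region the job is configured to run in.
final class SubnetworkCheck {
    static String regionOf(String subnetworkPath) {
        String[] parts = subnetworkPath.split("/");
        for (int i = 0; i < parts.length - 1; i++) {
            if (parts[i].equals("regions")) {
                return parts[i + 1];
            }
        }
        throw new IllegalArgumentException("No region segment in: " + subnetworkPath);
    }

    static boolean matches(String subnetworkPath, String jobRegion) {
        return regionOf(subnetworkPath).equals(jobRegion);
    }
}
```

Run against the values from the pipeline options above, the check flags that "regions/us-east-4/subnetworks/us-east4-public" does not name the same region as "us-east4", which would produce exactly a "does not exist" error regardless of permissions.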

AmazonS3ClientBuilder is stuck on build function

I'm using the code suggested in the Amazon documentation for uploading files to Amazon buckets. The code runs on some machines, but on others it never gets past the build() line.
Here is the code:
private static AWSBucketManager instance = null;
private final AmazonS3 s3;
private String clientRegion = Settings.getSettingValue("AWS_REGION");
private String secretKey = Settings.getSettingValue("AWS_SECRETKEY");
private String accessKey = Settings.getSettingValue("AWS_ACCESSKEY");

private AWSBucketManager() {
    LoggingService.writeToLog("Login to aws bucket with basic creds", LogModule.Gateway, LogLevel.Info);
    BasicAWSCredentials creds = new BasicAWSCredentials(accessKey, secretKey);
    LoggingService.writeToLog("Success Login to aws bucket with basic creds", LogModule.Gateway, LogLevel.Info);
    s3 = AmazonS3ClientBuilder.standard()
            .withRegion(clientRegion)
            .withCredentials(new AWSStaticCredentialsProvider(creds))
            .build();
    LoggingService.writeToLog("Login successfully to aws bucket with basic creds", LogModule.Gateway, LogLevel.Info);
}

public static AWSBucketManager getInstance() {
    if (instance == null) {
        instance = new AWSBucketManager();
    }
    return instance;
}
Any idea what is going wrong, or how I can debug it with logs?
The problem occurred because my code is obfuscated with ProGuard.
Adding the following rules solved the issue:
-keep class org.apache.commons.logging.** { *; }
-keepattributes Signature,Annotation
-keep class com.amazonaws.** { *; }

How to upload file in AWS S3 Bucket through endpoint in ASP.NET Core

I am using ASP.NET Core and the AWSSDK.S3 NuGet package.
I am able to upload a file by providing accessKeyID, secretKey, bucketName, and region, like this:
var credentials = new BasicAWSCredentials(accessKeyID, secretKey);
using (var client = new AmazonS3Client(credentials, RegionEndpoint.USEast1))
{
    var request = new PutObjectRequest
    {
        AutoCloseStream = true,
        BucketName = bucketName,
        InputStream = storageStream,
        Key = fileName
    };
    await client.PutObjectAsync(request);
}
But now I am given only a URL to upload files to:
11.11.11.111:/aa-bb-cc-dd-useast1
How do I upload a file through that URL? I am new to AWS and would be grateful for some help.
using Amazon.S3;
using Amazon.S3.Transfer;
using System;
using System.IO;
using System.Threading.Tasks;
namespace Amazon.DocSamples.S3
{
    class UploadFileMPUHighLevelAPITest
    {
        private const string bucketName = "*** provide bucket name ***";
        private const string filePath = "*** provide the full path name of the file to upload ***";
        // Specify your bucket region (an example region is shown).
        private static readonly RegionEndpoint bucketRegion = RegionEndpoint.USWest2;
        private static IAmazonS3 s3Client;

        public static void Main()
        {
            s3Client = new AmazonS3Client(bucketRegion);
            UploadFileAsync().Wait();
        }

        private static async Task UploadFileAsync()
        {
            try
            {
                var fileTransferUtility = new TransferUtility(s3Client);
                // Option 1. Upload a file. The file name is used as the object key name.
                await fileTransferUtility.UploadAsync(filePath, bucketName);
                Console.WriteLine("Upload 1 completed");
            }
            catch (AmazonS3Exception e)
            {
                Console.WriteLine("Error encountered on server. Message:'{0}' when writing an object", e.Message);
            }
        }
    }
}
https://docs.aws.amazon.com/AmazonS3/latest/dev/HLuploadFileDotNet.html
You can use the provided access point in place of the bucket name.
https://docs.aws.amazon.com/sdkfornet/v3/apidocs/items/S3/TPutObjectRequest.html

Exception when trying to connect to AWS Athena using JAVA API

I'm trying to execute a query in AWS Athena using the Java API:
public class AthenaClientFactory {

    String accessKey = "access";
    String secretKey = "secret";
    BasicAWSCredentials awsCredentials = new BasicAWSCredentials(accessKey, secretKey);

    private final AmazonAthenaClientBuilder builder = AmazonAthenaClientBuilder.standard()
            .withRegion(Regions.US_WEST_1)
            .withCredentials(new AWSStaticCredentialsProvider(awsCredentials))
            .withClientConfiguration(new ClientConfiguration().withClientExecutionTimeout(10));

    public AmazonAthena createClient() {
        return builder.build();
    }
}

private static String submitAthenaQuery(AmazonAthena client) {
    QueryExecutionContext queryExecutionContext = new QueryExecutionContext().withDatabase("my_db");
    ResultConfiguration resultConfiguration = new ResultConfiguration().withOutputLocation("my_bucket");
    StartQueryExecutionRequest startQueryExecutionRequest = new StartQueryExecutionRequest()
            .withQueryString("select * from my_db limit 3;")
            .withQueryExecutionContext(queryExecutionContext)
            .withResultConfiguration(resultConfiguration);
    StartQueryExecutionResult startQueryExecutionResult = client.startQueryExecution(startQueryExecutionRequest);
    return startQueryExecutionResult.getQueryExecutionId();
}

public void run() throws InterruptedException {
    AthenaClientFactory factory = new AthenaClientFactory();
    AmazonAthena client = factory.createClient();
    String queryExecutionId = submitAthenaQuery(client);
}
But I get an exception from startQueryExecution. The exception is:
Client execution did not complete before the specified timeout configuration.
Has anyone encountered something similar?
The problem was in withClientExecutionTimeout(10): the timeout is specified in milliseconds, so the client gave up after only 10 ms. Increasing the value to 5000 solved the issue.
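The client execution timeout is a deadline on the entire client-side call, so 10 ms gives Athena almost no time to respond. The failure mode can be reproduced with a plain future; the simulated 100 ms call and the 10/5000 ms deadlines here are illustrative, not the AWS SDK internals:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutDemo {
    // Simulates a remote call that takes roughly 100 ms to answer.
    static CompletableFuture<String> slowCall() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "query-execution-id";
        });
    }

    // Applies a client-side deadline to the whole call, analogous to
    // withClientExecutionTimeout: too small and the call aborts before
    // the server can possibly answer.
    static String callWithTimeout(long timeoutMillis) throws Exception {
        return slowCall().get(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

With a 10 ms deadline the call times out every time; with 5000 ms the same call completes, which mirrors the fix above.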