How can I use Apache Flink to read a Parquet file in HDFS?

I can only find TextInputFormat and CsvInputFormat. So how can I use Apache Flink to read a Parquet file in HDFS?

OK, I have found a way to read Parquet files from HDFS with Apache Flink.
You should add the following dependencies to your pom.xml:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-hadoop-compatibility_2.11</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-avro</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-avro</artifactId>
<version>1.10.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>3.1.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>3.1.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.1</version>
</dependency>
Create an .avsc file to define the schema. Example:
{"namespace": "com.flinklearn.models",
"type": "record",
"name": "AvroTamAlert",
"fields": [
{"name": "raw_data", "type": ["string","null"]}
]
}
Run "java -jar D:\avro-tools-1.8.2.jar compile schema alert.avsc ." to generate Java class and copy AvroTamAlert.java to your project.
Use AvroParquetInputFormat to read parquet file in hdfs:
import com.flinklearn.models.AvroTamAlert
import org.apache.flink.api.scala._
import org.apache.flink.api.scala.hadoop.mapreduce.HadoopInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.parquet.avro.AvroParquetInputFormat

class Main {
  def startApp(): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val job = Job.getInstance()
    // Wrap AvroParquetInputFormat in Flink's Hadoop compatibility input format
    val dIf = new HadoopInputFormat[Void, AvroTamAlert](
      new AvroParquetInputFormat[AvroTamAlert](), classOf[Void], classOf[AvroTamAlert], job)
    FileInputFormat.addInputPath(job, new Path("/user/hive/warehouse/testpath"))
    val dataset = env.createInput(dIf)
    // count() is eager and runs the job itself, so no separate env.execute() call is needed
    println(dataset.count())
  }
}

object Main {
  def main(args: Array[String]): Unit = {
    new Main().startApp()
  }
}
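If you want the Avro payload itself rather than the raw (key, value) pairs, here is a small helper you could drop into the Main class above (my addition, not part of the original answer; it assumes the Scala HadoopInputFormat used above, which yields (Void, AvroTamAlert) tuples):
def printSample(dataset: DataSet[(Void, AvroTamAlert)]): Unit = {
  val alerts = dataset
    .map(_._2)       // drop the Void key, keep the Avro record
    .map(_.toString) // Avro-generated classes render as JSON via toString
  // print() is eager, so this small sample triggers its own execution
  alerts.first(10).print()
}
Calling printSample(dataset) right after env.createInput(dIf) prints the first ten records as JSON.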

Related

I am migrating from Spring Boot 2.7.3 to Spring Boot 3.0.2, where SQSConnectionFactory is not compatible with Spring Boot 3.0.2

Below is the configuration class where we create a bean of the DefaultJmsListenerContainerFactory class:
import com.amazon.sqs.javamessaging.SQSConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jms.annotation.EnableJms;
import org.springframework.jms.config.DefaultJmsListenerContainerFactory;
import org.springframework.jms.support.destination.DynamicDestinationResolver;
import javax.jms.Session;
@Configuration
@EnableJms
public class JmsConfiguration {

    @Bean
    public DefaultJmsListenerContainerFactory jmsListenerContainerFactory(SQSConnectionFactory connectionFactory) {
        DefaultJmsListenerContainerFactory factory =
                new DefaultJmsListenerContainerFactory();
        factory.setConnectionFactory(connectionFactory);
        factory.setDestinationResolver(new DynamicDestinationResolver());
        factory.setConcurrency("3-10");
        factory.setSessionAcknowledgeMode(Session.AUTO_ACKNOWLEDGE);
        return factory;
    }
}
This is the gradle.properties file where we have specified all the versions:
sqsVersion=1.0.8
#sqsVersion=2.0.1
stsVersion=1.11.759
awsCoreVersion=1.11.759
#awsCoreVersion=2.0.1
hibernateVersion=6.1.6.Final
secretsManagerJdbcVersion=1.0.5
secretsManagerCacheVersion=1.0.1
springdocOpenApiUiVersion=1.6.6
ssmVersion=1.11.755
The setConnectionFactory() method does not accept an SQSConnectionFactory object.
I have tried different versions but had no luck.
Please suggest an appropriate version of SQSConnectionFactory that works with Spring Boot 3.0.2.
Using the AWS SDK for Java V1 is not recommended, per the guidelines on the AWS page here:
https://github.com/awsdocs/aws-doc-sdk-examples (look at the table near the end of the page).
The recommended version is AWS SDK for Java V2. Here is the SQSConnectionFactory class you want to use in V2:
https://github.com/awslabs/amazon-sqs-java-messaging-lib/blob/master/src/main/java/com/amazon/sqs/javamessaging/SQSConnectionFactory.java
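As a rough sketch of what that looks like (this assumes the 2.x line of amazon-sqs-java-messaging-lib, which wraps the SDK v2 SqsClient; the region and builder options are placeholders, so verify them against your own setup):
import com.amazon.sqs.javamessaging.{ProviderConfiguration, SQSConnectionFactory}
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.sqs.SqsClient

object SqsJmsFactorySketch {
  // Sketch only: the 2.x messaging library builds the JMS connection factory
  // on top of the AWS SDK v2 SqsClient instead of the v1 AmazonSQS client.
  def connectionFactory(): SQSConnectionFactory = {
    val sqsClient = SqsClient.builder()
      .region(Region.US_EAST_1) // placeholder region
      .build()
    new SQSConnectionFactory(new ProviderConfiguration(), sqsClient)
  }
}
Note that Spring Boot 3 moves JMS to jakarta.jms, so setConnectionFactory() will only accept the factory once the messaging library version you pick exposes a jakarta.jms.ConnectionFactory.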
You can use the AWS SDK for Java V2 with Spring Boot 3, which requires JDK 17. I have verified that the AWS SDK for Java V2 works with Spring Boot 3 using a custom project. My Spring Boot 3 project's POM is:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.0.2</version>
<relativePath/> <!-- lookup parent from repository -->
</parent>
<groupId>aws-spring3</groupId>
<artifactId>ItemTrackerRDSRest3</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>ItemTrackerRDSRest3</name>
<description>Demo project for Spring Boot 3 and AWS</description>
<properties>
<java.version>17</java.version>
</properties>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>bom</artifactId>
<version>2.19.14</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<version>5.9.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>ses</artifactId>
</dependency>
<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
<version>3.23.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>rdsdata</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>protocol-core</artifactId>
</dependency>
<dependency>
<groupId>jakarta.mail</groupId>
<artifactId>jakarta.mail-api</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>com.sun.mail</groupId>
<artifactId>jakarta.mail</artifactId>
<version>1.6.5</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>net.sourceforge.jexcelapi</groupId>
<artifactId>jxl</artifactId>
<version>2.6.12</version>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.6</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-commons</artifactId>
<version>2.7.3</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
</plugin>
</plugins>
</build>
</project>
Now I am able to use the AWS SDK for Java V2 API in a Spring Boot 3 project. My database example successfully queries data from an Amazon Aurora Serverless database.

Scala Spark Read from AWS S3 - com.amazonaws.SdkClientException: Unable to load credentials from service endpoint

I'm currently trying to do a simple read from an S3 bucket I've set up, using Spark 3.0.0 (implemented in Scala 2.12.10). However, I am receiving this error when I submit the script:
No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider :
com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
I'm running the following Spark script:
package org.knd
import scala.util.Properties
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SparkSession, SQLContext}
import io.delta.tables._
import org.apache.spark.sql.functions._
object App {
def main(args: Array[String]) : Unit = {
val spark = SparkSession
.builder()
.appName("covid-delta-lake")
.master("local")
.config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
.config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
.getOrCreate()
val aws_access_key = scala.util.Properties.envOrElse("AWS_ACCESS_KEY", "notAvailable" )
val aws_secret = scala.util.Properties.envOrElse("AWS_SECRET_ACCESS_KEY_ID", "notAvailable" )
spark.sparkContext.hadoopConfiguration.set("fs.s3a.awsAccessKeyId", aws_access_key)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.awsSecretAccessKey", aws_secret)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.endpoint", "s3.amazonaws.com")
print("\n" + "====================HERE====================" + "\n")
val data = spark.read.parquet("s3a://[url-to-my-s3]/*.parquet")
data.show(10)
}
}
I've double-checked my AWS keys and S3 URL, so I'm certain those aren't the issue. I've tried reading from other buckets and am receiving the same error. I've included my POM file below:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.knd</groupId>
<artifactId>delta-lake-scala</artifactId>
<version>1.0-SNAPSHOT</version>
<inceptionYear>2020</inceptionYear>
<properties>
<scala.version>2.12.10</scala.version>
</properties>
<repositories>
<repository>
<id>scala-tools.org</id>
<name>Scala-Tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-Tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>io.delta</groupId>
<artifactId>delta-core_2.12</artifactId>
<version>0.7.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-aws</artifactId>
<version>3.0.0</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-eclipse-plugin</artifactId>
<configuration>
<downloadSources>true</downloadSources>
<buildcommands>
<buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
</buildcommands>
<additionalProjectnatures>
<projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
</additionalProjectnatures>
<classpathContainers>
<classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
<classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
</classpathContainers>
</configuration>
</plugin>
</plugins>
</build>
<reporting>
<plugins>
</plugins>
</reporting>
</project>
According to this source, you can set environment variables using these keys:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
or set the following Hadoop configuration properties:
fs.s3n.awsAccessKeyId
fs.s3n.awsSecretAccessKey
These are different from the environment variable names and configuration keys used in your script.
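For comparison with the script in the question, here is a minimal sketch of wiring the standard names through to the s3a connector (my addition; fs.s3a.access.key and fs.s3a.secret.key are the s3a counterparts of the fs.s3n.* properties listed above):
import org.apache.spark.sql.SparkSession

object S3CredentialsFix {
  def configure(spark: SparkSession): Unit = {
    // Read the standard AWS variable names rather than AWS_ACCESS_KEY /
    // AWS_SECRET_ACCESS_KEY_ID, and hand them to the s3a filesystem.
    val accessKey = sys.env.getOrElse("AWS_ACCESS_KEY_ID", "notAvailable")
    val secretKey = sys.env.getOrElse("AWS_SECRET_ACCESS_KEY", "notAvailable")
    spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", accessKey)
    spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", secretKey)
  }
}
If the two standard environment variables are exported before spark-submit, the EnvironmentVariableCredentialsProvider named in the error message should also pick them up without any extra Hadoop configuration.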

How to create a CodePipeline to build a JAR file from Java code stored on GitHub and deploy it to a Lambda function?

I want to build a CodePipeline that will get the (Java) code from GitHub, build a JAR file, and deploy it to AWS Lambda (or store the JAR in a specific S3 bucket). I only want to use tools provided by the AWS platform.
If I use only CodeBuild, I am able to build the JAR from the GitHub code and store it in S3 (https://docs.aws.amazon.com/codebuild/latest/userguide/getting-started.html), and I use a deployer Lambda function to deploy the code to my service Lambda. Whenever there is any change in the S3 bucket, the deployer Lambda gets triggered.
Drawback: the problem with this is that I have to run CodeBuild manually every time after committing changes to GitHub. I want CodeBuild to detect changes automatically from GitHub.
To solve the above issue I have made a CodePipeline which detects code changes using GitHub webhooks, but it creates a ZIP file instead of a JAR.
So what I am actually trying is:
GitHub (changes) ---> CodeBuild ---> store the JAR file in a specific S3 bucket with a specific name, or deploy it to Lambda
buildspec.yml
version: 0.2
phases:
  build:
    commands:
      - echo Build started on `date`
      - mvn test
  post_build:
    commands:
      - echo Build completed on `date`
      - mvn package
artifacts:
  files:
    - target/testfunction-1.0.0-jar-with-dependencies.jar
First off, CodeDeploy is baffling when it comes to setting up a simple pipeline to update the Lambda when a GitHub commit happens. It shouldn't be this hard. We created the following Lambda function, which processes the CodePipeline job build artifact (ZIP) and pushes the JAR update to Lambda using updateFunctionCode.
import com.amazonaws.services.codepipeline.AWSCodePipeline;
import com.amazonaws.services.codepipeline.AWSCodePipelineClientBuilder;
import com.amazonaws.services.codepipeline.model.FailureDetails;
import com.amazonaws.services.codepipeline.model.PutJobFailureResultRequest;
import com.amazonaws.services.codepipeline.model.PutJobSuccessResultRequest;
import com.amazonaws.services.lambda.AWSLambda;
import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
import com.amazonaws.services.lambda.model.UpdateFunctionCodeRequest;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import org.json.JSONObject;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
/**
* Created by jonathan and josh on 1/22/2019.
* <p>
* Process Code Pipeline Job
*/
#SuppressWarnings("unused")
public class CodePipelineLambdaUpdater {
private static AWSCodePipeline codepipeline = null;
private static AmazonS3 s3 = null;
private static AWSLambda lambda = null;
#SuppressWarnings("UnusedParameters")
public void handler(InputStream inputStream, OutputStream outputStream, Context context) throws IOException {
// Read the the job JSON object
String json = new String(readStreamToByteArray(inputStream), "UTF-8");
JSONObject eventJsonObject = new JSONObject(json);
// Extract the jobId first
JSONObject codePiplineJobJsonObject = eventJsonObject.getJSONObject("CodePipeline.job");
String jobId = codePiplineJobJsonObject.getString("id");
// Initialize the code pipeline client if necessary
if (codepipeline == null) {
codepipeline = AWSCodePipelineClientBuilder.defaultClient();
}
if (s3 == null) {
s3 = AmazonS3ClientBuilder.defaultClient();
}
if (lambda == null) {
lambda = AWSLambdaClientBuilder.defaultClient();
}
try {
// The bucketName and objectKey refer to the intermediate ZIP file produced by CodePipeline
String bucketName = codePiplineJobJsonObject.getJSONObject("data").getJSONArray("inputArtifacts").getJSONObject(0).getJSONObject("location").getJSONObject("s3Location").getString("bucketName");
String objectKey = codePiplineJobJsonObject.getJSONObject("data").getJSONArray("inputArtifacts").getJSONObject(0).getJSONObject("location").getJSONObject("s3Location").getString("objectKey");
// The user parameter is the Lambda function name that we want to update. This is configured when adding the CodePipeline Action
String functionName = codePiplineJobJsonObject.getJSONObject("data").getJSONObject("actionConfiguration").getJSONObject("configuration").getString("UserParameters");
System.out.println("bucketName: " + bucketName);
System.out.println("objectKey: " + objectKey);
System.out.println("functionName: " + functionName);
// Download the object
S3Object s3Object = s3.getObject(new GetObjectRequest(bucketName, objectKey));
// Read the JAR out of the ZIP file. Should be the only file for our Java code
ZipInputStream zis = new ZipInputStream(s3Object.getObjectContent());
ZipEntry zipEntry;
byte[] data = null;
//noinspection LoopStatementThatDoesntLoop
while ((zipEntry = zis.getNextEntry()) != null) {
if (zipEntry.getName().endsWith(".jar")) {
System.out.println("zip file: " + zipEntry.getName());
data = readStreamToByteArray(zis);
System.out.println("Length: " + data.length);
break;
}
}
// If we have data then update the function
if (data != null) {
// Update the lambda function
UpdateFunctionCodeRequest updateFunctionCodeRequest = new UpdateFunctionCodeRequest();
updateFunctionCodeRequest.setFunctionName(functionName);
updateFunctionCodeRequest.setPublish(true);
updateFunctionCodeRequest.setZipFile(ByteBuffer.wrap(data));
lambda.updateFunctionCode(updateFunctionCodeRequest);
System.out.println("Updated function: " + functionName);
// Indicate success
PutJobSuccessResultRequest putJobSuccessResultRequest = new PutJobSuccessResultRequest();
putJobSuccessResultRequest.setJobId(jobId);
codepipeline.putJobSuccessResult(putJobSuccessResultRequest);
} else {
// Fail the job
PutJobFailureResultRequest putJobFailureResultRequest = new PutJobFailureResultRequest();
putJobFailureResultRequest.setJobId(jobId);
FailureDetails failureDetails = new FailureDetails();
failureDetails.setMessage("No data available to update function with.");
putJobFailureResultRequest.setFailureDetails(failureDetails);
codepipeline.putJobFailureResult(putJobFailureResultRequest);
}
System.out.println("Finished");
} catch (Throwable e) {
// Handle all other exceptions
System.out.println("Well that ended badly...");
e.printStackTrace();
PutJobFailureResultRequest putJobFailureResultRequest = new PutJobFailureResultRequest();
putJobFailureResultRequest.setJobId(jobId);
FailureDetails failureDetails = new FailureDetails();
failureDetails.setMessage("Failed with error: " + e.getMessage());
putJobFailureResultRequest.setFailureDetails(failureDetails);
codepipeline.putJobFailureResult(putJobFailureResultRequest);
}
}
private static void copy(InputStream in, OutputStream out) throws IOException {
byte[] buffer = new byte[100000];
for (; ; ) {
int rc = in.read(buffer);
if (rc == -1) break;
out.write(buffer, 0, rc);
}
out.flush();
}
private static byte[] readStreamToByteArray(InputStream in) throws IOException {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try {
copy(in, baos);
} finally {
safeClose(in);
}
return baos.toByteArray();
}
private static InputStream safeClose(InputStream in) {
try {
if (in != null) in.close();
} catch (Throwable ignored) {
}
return null;
}
}
This is the project Maven file.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.yourcompany</groupId>
<artifactId>codepipeline-lambda-updater</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-bom</artifactId>
<version>1.11.487</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-lambda-java-core</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-lambda</artifactId>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-core</artifactId>
</dependency>
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version>1.11.487</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-codepipeline -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-codepipeline</artifactId>
<version>1.11.487</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
<version>2.10.0</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.10.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-lambda-java-log4j2</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.jetbrains</groupId>
<artifactId>annotations</artifactId>
<version>15.0</version>
</dependency>
<!--<dependency>-->
<!--<groupId>com.google.code.gson</groupId>-->
<!--<artifactId>gson</artifactId>-->
<!--<version>2.8.2</version>-->
<!--</dependency>-->
<!-- https://mvnrepository.com/artifact/org.json/json -->
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20180813</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.1</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer
implementation="com.github.edwgiz.mavenShadePlugin.log4j2CacheTransformer.PluginsCacheFileTransformer">
</transformer>
</transformers>
</configuration>
</execution>
</executions>
<dependencies>
<dependency>
<groupId>com.github.edwgiz</groupId>
<artifactId>maven-shade-plugin.log4j2-cachefile-transformer</artifactId>
<version>2.8.1</version>
</dependency>
</dependencies>
</plugin>
</plugins>
</build>
</project>
This baseline should get you started. Embellish the code to do fancier deployments using further SDK calls as you see fit.
CodePipeline artifact locations are different for each pipeline execution so they're isolated.
I think what you'll want to do is produce a JAR file in CodeBuild, which will end up in a CodePipeline artifact with a ZIP format. You can add a second CodeBuild action that accepts the output of the first CodeBuild action (the CodeBuild action will unzip the input artifact for you) and deploys to S3 (this is pretty trivial to script with the AWS CLI).
It's entirely possible to combine both CodeBuild actions, but I like to keep the "build" and "deploy" steps separate.

Make Reactive Kafka work with Confluent Schema Registry Avro Schema

How to make Reactive Kafka (https://github.com/akka/reactive-kafka) work with Confluent Schema Registry Avro Schema? Here is my sample code:
def create(groupId: String)(implicit system: ActorSystem): Source[ConsumerMessage.CommittableMessage[String, Array[Byte]], Consumer.Control] = {
val consumerSettings = ConsumerSettings(system, new StringDeserializer, new ByteArrayDeserializer)
.withBootstrapServers(bootstrap)
.withGroupId(groupId)
.withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
.withProperty("schema.registry.url", schemaRegistryUrl)
.withProperty(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, classOf[String].getName)
.withProperty(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, classOf[KafkaAvroDeserializer].getName)
.withProperty(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, "true")
Consumer.committableSource(consumerSettings, Subscriptions.topics(createDataSetJobRequestTopic))
}
You just need to configure the consumer to use the Confluent deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer.
Set the following properties:
ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG
ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG
In addition, you have to add the property "schema.registry.url" in order to specify at least one address pointing to your schema registry instance.
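For the reactive-kafka settings in the question, one concrete way to do this is sketched below (my addition; note that Kafka does not call configure() on deserializer instances you pass in yourself, so the Confluent deserializer is configured by hand here, and bootstrap/schemaRegistryUrl are taken from the question):
import akka.actor.ActorSystem
import akka.kafka.ConsumerSettings
import io.confluent.kafka.serializers.{KafkaAvroDeserializer, KafkaAvroDeserializerConfig}
import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.{Deserializer, StringDeserializer}
import scala.collection.JavaConverters._

def avroConsumerSettings(groupId: String, bootstrap: String, schemaRegistryUrl: String)
                        (implicit system: ActorSystem): ConsumerSettings[String, AnyRef] = {
  // Configure the Confluent deserializer by hand: deserializer instances passed
  // to the consumer are not configured from the consumer properties.
  val valueDeserializer = new KafkaAvroDeserializer()
  valueDeserializer.configure(
    Map[String, AnyRef](
      "schema.registry.url" -> schemaRegistryUrl,
      KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG -> "true"
    ).asJava,
    false) // false = this is the value (not key) deserializer

  ConsumerSettings(system, new StringDeserializer, valueDeserializer.asInstanceOf[Deserializer[AnyRef]])
    .withBootstrapServers(bootstrap)
    .withGroupId(groupId)
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
}
Consumer.committableSource can then be used with these settings exactly as in the question, with values arriving as deserialized Avro records instead of byte arrays.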
Finally, you have to add the following dependency to your project:
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-avro-serializer</artifactId>
<version>${io.confluent.version}</version>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</exclusion>
</exclusions>
</dependency>
Confluent Documentation

Why does Arquillian give me: Could not read active container configuration?

I am trying to integrate Arquillian into my Maven EJB project, which contains just the EJBs that I use in other separate projects.
I use JBoss EAP 6.
So I have set it up as follows.
I created the arquillian.xml in ejbModule/src/test/resources/:
<container qualifier="jboss" default="true">
<configuration>
<property name="jbossHome">D:\jbdevstudio\jboss-eap-6.2</property>
</configuration>
</container>
In the POM of my project I added the following dependencies:
<dependency>
<groupId>org.jboss.arquillian.container</groupId>
<artifactId>arquillian-jbossas-embedded-6</artifactId>
<version>1.0.0.Alpha5</version>
</dependency>
<dependency>
<groupId>org.jboss.arquillian</groupId>
<artifactId>arquillian-junit</artifactId>
<version>1.0.0.Alpha5</version>
</dependency>
<dependency>
<groupId>org.jboss.arquillian.junit</groupId>
<artifactId>arquillian-junit-container</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.jboss.arquillian.protocol</groupId>
<artifactId>arquillian-protocol-servlet</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.8.2</version>
</dependency>
<profiles>
<profile>
<id>jbossas-embedded-6</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
</profile>
<profile>
<id>arq-jbossas-managed</id>
<dependencies>
<dependency>
<groupId>org.jboss.as</groupId>
<artifactId>jboss-as-arquillian-containermanaged</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</profile>
<profile>
<id>arq-jbossas-remote</id>
<dependencies>
<dependency>
<groupId>org.jboss.as</groupId>
<artifactId>jboss-as-arquillian-containerremote</artifactId>
<scope>test</scope>
</dependency>
</dependencies>
</profile>
</profiles>
The test class:
import javax.inject.Inject;
import org.jboss.arquillian.api.Deployment;
import org.jboss.arquillian.junit.Arquillian;
import org.jboss.shrinkwrap.api.ShrinkWrap;
import org.jboss.shrinkwrap.api.asset.EmptyAsset;
import org.jboss.shrinkwrap.api.spec.JavaArchive;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.oap.subscription.AbstractSubscription;
@RunWith(Arquillian.class)
public class SubscriptionFactoryTest {
@Inject
private SubscriptionFactory subscriptionFactory;
@Deployment
public static JavaArchive getDeployement() {
System.out.println("### testSayHelloEJB");
return ShrinkWrap.create(JavaArchive.class, "subscriptionFactory.jar")
.addClasses(AbstractSubscription.class,SubscriptionFactory.class);
}
@Test
public void getSubscriptionByIdTest() {
System.out.println("### testSayHelloEJB");
}
}
The EJB class:
@Remote(ISubscriptionFactoryRemote.class)
@Local(ISubscriptionFactoryLocal.class)
@Stateless
public class SubscriptionFactory extends AbstractSubscription implements ISubscriptionFactoryRemote {
#Override
public AbstractSubscription getSubscriptionById(final Integer id) {
AbstractSubscription ret = null;
if (id != null) {
// create query
final StringBuilder queryString = new StringBuilder("select c from AbstractSubscription c ");
try {
queryString.append("where c.id = :id");
// create query
Query query = this.getEntityManager().createQuery(queryString.toString());
// set parameter
query = query.setParameter("id", id);
// recovers refCountry
ret = (AbstractSubscription) query.getSingleResult();
} catch (final Exception exc) {
}
}
return ret;
}
}
When I run the test class as a JUnit test, it gives me these errors:
janv. 20, 2015 12:15:34 PM org.jboss.arquillian.impl.client.container.ContainerRegistryCreator getActivatedConfiguration
Infos: Could not read active container configuration: null
The failure trace:
java.lang.NoClassDefFoundError: Lorg/jboss/embedded/api/server/JBossASEmbeddedServer;
at java.lang.Class.getDeclaredFields0(Native Method)
at java.lang.Class.privateGetDeclaredFields(Class.java:2397)
at java.lang.Class.getDeclaredFields(Class.java:1806)
at org.jboss.arquillian.impl.core.Reflections.getFieldInjectionPoints(Reflections.java:74)
... 79 more
Any ideas?
You're using a very old version of Arquillian; I'd use at least version 1.1.0.Final. I also don't think you need a few of the Arquillian dependencies you have defined.
Remove the arquillian-jbossas-embedded-6 and arquillian-junit dependencies.
There are plenty of quickstart examples of how to use Arquillian with JBoss EAP. Have a look at some of their POMs, as it might help.