How to test Elasticsearch index creation?

I would like to write a JUnit test using some kind of embedded Elasticsearch engine in order to test my services, which should create indexes with mappings on start-up. What is the best way to do this?
It would probably also be enough to use ESTestCase. Unfortunately, I cannot find simple usage examples. Could anyone provide one?

There has been no embedded Elasticsearch since 5.x. I would use Testcontainers for this: https://github.com/dadoonet/testcontainers-java-module-elasticsearch
PS: This code will soon move to the Testcontainers repo.
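To make that concrete, a JUnit test against the Testcontainers Elasticsearch module could look roughly like the sketch below. The image tag, the index name, and the spot where the service under test is invoked are assumptions, not part of the original answer:

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.Response;
    import org.elasticsearch.client.RestClient;
    import org.junit.Test;
    import org.testcontainers.elasticsearch.ElasticsearchContainer;

    import static org.junit.Assert.assertEquals;

    public class IndexCreationTest {

        @Test
        public void serviceCreatesIndexWithMappingsOnStartup() throws Exception {
            // Starts a throwaway Elasticsearch in Docker; the image tag is an assumption.
            try (ElasticsearchContainer es = new ElasticsearchContainer(
                    "docker.elastic.co/elasticsearch/elasticsearch:7.17.9")) {
                es.start();

                RestClient client = RestClient.builder(
                        HttpHost.create(es.getHttpHostAddress())).build();

                // Point the service under test at es.getHttpHostAddress() and run its
                // start-up logic here; "my-index" below is a placeholder for the index
                // that logic is expected to create.
                Response response = client.performRequest(new Request("HEAD", "/my-index"));
                assertEquals(200, response.getStatusLine().getStatusCode());

                client.close();
            }
        }
    }

In a real suite you would normally start the container once per class (for example, as a static field) rather than per test, to keep runtimes down.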

Related

Working with Redis in C++ locally: can I use the redis-cpp library with a complex configuration?

I want to work with Redis in C++. My idea is to use a library that is as simple as possible, so I'm working with https://github.com/tdv/redis-cpp. What I need to know, and I'm not sure where to find this information, is: if I have a complex Redis configuration (with Redis Cluster and Redis Sentinel), can I use this library for simple gets/sets when the configuration involves nodes, masters, replicas, etc.? Is that easy, or should I use a more full-featured library like redis-plus-plus for the gets/sets?
I would also like to know whether it's possible to install Redis locally, create a cluster (I know I have to use redis-plus-plus for that), and then try my simple library (redis-cpp) to access and modify the data. All locally, just to give it a try.
Thank you so much!

How to unit test the Kafka Streams DSL when using Schema Registry

Let's say I want to write a unit test for the example shown here:
https://github.com/confluentinc/kafka-streams-examples/blob/5.1.2-post/src/main/java/io/confluent/examples/streams/WikipediaFeedAvroLambdaExample.java
I tried the following methods, both of which did not work out for me:
1) Use TopologyTestDriver.
This class is pretty useful as long as Schema Registry is not involved.
I tried making use of MockSchemaRegistryClient, but it didn't work out.
And even if it had worked, it would require me to create my own serializers, which kind of defeats the purpose of Schema Registry.
2) Use EmbeddedSingleNodeKafkaCluster, defined in the same project.
https://github.com/confluentinc/kafka-streams-examples/blob/5.1.2-post/src/test/java/io/confluent/examples/streams/kafka/EmbeddedSingleNodeKafkaCluster.java
Now this class is really handy and seems to provide an embedded Kafka cluster and Schema Registry. But it does not appear to be available in any published artifact. Consequently, I tried copying the class, but ran into further import issues.
In particular, I was unable to download this artifact: io.confluent:kafka-schema-registry-client:5.0.0:tests
Has anyone been able to make progress with the options mentioned above? Or found a completely different solution?
To do this, I ended up writing a small test library based on Testcontainers: https://github.com/vspiliop/embedded-kafka-cluster. It starts a fully configurable, Docker-based Kafka cluster (broker, ZooKeeper, and Confluent Schema Registry) as part of your tests. Check out the example unit and Cucumber tests.
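For what it's worth, newer versions of the Confluent Avro serdes (5.3.0 and later, if I recall correctly) also make option 1 from the question workable: a schema.registry.url of the form mock://<scope> wires an in-memory MockSchemaRegistryClient into the serde, so no real registry and no hand-written serializers are needed. A rough sketch, where the topic names and the pass-through topology are placeholders for the real example:

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;

    import io.confluent.kafka.streams.serdes.avro.GenericAvroSerde;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.TestInputTopic;
    import org.apache.kafka.streams.TopologyTestDriver;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Produced;
    import org.junit.Test;

    public class WikipediaFeedAvroTest {

        @Test
        public void topologyWorksWithoutARealRegistry() {
            // "mock://<scope>" (Confluent 5.3+) makes the serde use an in-memory
            // MockSchemaRegistryClient instead of contacting a real registry.
            Map<String, String> serdeConfig =
                    Collections.singletonMap("schema.registry.url", "mock://test-scope");

            GenericAvroSerde valueSerde = new GenericAvroSerde();
            valueSerde.configure(serdeConfig, false); // false = value serde, not key serde

            // Stand-in pass-through topology; build the real one from the example here.
            StreamsBuilder builder = new StreamsBuilder();
            builder.stream("input-topic", Consumed.with(Serdes.String(), valueSerde))
                   .to("output-topic", Produced.with(Serdes.String(), valueSerde));

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "avro-topology-test");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted

            try (TopologyTestDriver driver =
                         new TopologyTestDriver(builder.build(), props)) {
                TestInputTopic<String, GenericRecord> input = driver.createInputTopic(
                        "input-topic", Serdes.String().serializer(), valueSerde.serializer());
                // input.pipeInput(key, genericRecord); then read "output-topic" and assert.
            }
        }
    }

The same mock://<scope> URL has to be used for every serde in the test, so that they all share the one in-memory registry.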

Understanding how OpsWorks and custom cookbooks work together

I have my stack on OpsWorks and the app is deploying fine (CakePHP).
Now I have to configure some things like chmod, PHP versions, etc. I've been reading about this, but I don't know exactly what the best way to do it is.
Question 1 - Should I do this with custom deploy JSON or via custom cookbooks?
Question 2 - What's the correct way to work with custom cookbooks? Fork the original AWS repositories, update the recipes, and then use them in my stack?
It depends on what you would like to achieve; you may implement many things, such as:
a recipe, which is invoked only once during a chef-client run.
a lightweight resource provider (LWRP), which supports notifies and can be invoked zero or more times.
a definition, which is available before resource collection and can be invoked zero or more times.
For your second question, first check out Berkshelf, a cookbook manager.
I would suggest forking a project only if the project is dead; otherwise, consider contributing to the existing project so that everybody benefits from it. You can always write your own wrapper cookbook as well; see the Chef wrapper cookbook best practices.

Is there any open source framework that can provide an update service?

I provide a web service, but sometimes I need to update it. When I update the source code, I need to stop the service and then install the new service package.
Here is the problem:
1. I really don't want to stop the service while updating.
2. I only need to update some specific features, but currently I have to update the whole package, which makes the update process long and heavy.
I think EJB could be a solution, but I need more advice.
Any advice is appreciated.
You may want to think along the lines of dynamic class-reloading tools. One tool I am aware of is JRebel; you might want to have a look at it.
Another open-source alternative is spring-loaded, but it is still fairly immature and under constant development.
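To illustrate the underlying idea these tools build on, here is a bare-bones sketch using a throwaway class loader; the jar path and class name are hypothetical, and tools like JRebel go much further (bytecode instrumentation, preserving object state, and so on):

    import java.net.URL;
    import java.net.URLClassLoader;
    import java.nio.file.Paths;

    public class HotReloadSketch {

        // Loads and runs a class from a jar. Calling this again after replacing the
        // jar picks up the new code, because each call uses a fresh class loader and
        // the previously loaded classes become eligible for garbage collection.
        static void runFeature(String jarPath) throws Exception {
            URL[] urls = { Paths.get(jarPath).toUri().toURL() };
            try (URLClassLoader loader =
                         new URLClassLoader(urls, HotReloadSketch.class.getClassLoader())) {
                Class<?> clazz = loader.loadClass("com.example.Feature"); // hypothetical class
                Runnable feature = (Runnable) clazz.getDeclaredConstructor().newInstance();
                feature.run(); // runs whatever version is currently in the jar
            }
        }

        public static void main(String[] args) throws Exception {
            runFeature("/path/to/feature.jar"); // hypothetical path
        }
    }

Note that this only works if the reloadable classes are not also on the application's own classpath, since the parent loader would win; that restriction is exactly what the instrumentation-based tools avoid.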

Automated Testing in Apache Hive

I am about to embark on a project using Apache Hadoop/Hive that will involve a collection of Hive query scripts to produce data feeds for various downstream applications. These scripts seem like ideal candidates for some unit testing - they represent the fulfillment of an API contract between my data store and the client applications, and as such, it's trivial to write what the expected results should be for a given set of starting data. My issue is how to run these tests.
If I were working with SQL queries, I could use something like SQLite or Derby to quickly bring up test databases, load test data, and run a collection of query tests against them. Unfortunately, I am unaware of any such tools for Hive. At the moment, my best thought is to have the test framework bring up a local Hadoop instance and run Hive against that, but I've never done that before and I'm not sure it will work, or that it's the right path.
Also, I'm not interested in a pedantic discussion about whether what I am doing is unit testing or integration testing - I just need to be able to prove my code works.
Hive has a special standalone mode, specifically designed for testing purposes. In this mode it can run without Hadoop. I think it is exactly what you need.
Here is a link to the documentation:
http://wiki.apache.org/hadoop/Hive/HiveServer
I'm working as part of a team supporting a big data and analytics platform, and we also have this kind of issue.
We've been searching for a while, and we found two pretty promising tools: https://github.com/klarna/HiveRunner and https://github.com/bobfreitas/HadoopMiniCluster
HiveRunner is a framework built on top of JUnit for testing Hive queries. It starts a standalone HiveServer with an in-memory HSQL database as the metastore. With it you can stub tables, views, mock samples, etc.
There are some limitations regarding supported Hive versions, but I definitely recommend it.
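To give an idea of what a HiveRunner test looks like, here is a minimal sketch; the database, table, and data are made up:

    import java.util.List;

    import com.klarna.hiverunner.HiveShell;
    import com.klarna.hiverunner.StandaloneHiveRunner;
    import com.klarna.hiverunner.annotations.HiveSQL;
    import org.junit.Assert;
    import org.junit.Test;
    import org.junit.runner.RunWith;

    @RunWith(StandaloneHiveRunner.class)
    public class FeedQueryTest {

        // HiveRunner injects a shell backed by a standalone HiveServer;
        // setup scripts could be listed in files = {...} instead of run inline.
        @HiveSQL(files = {})
        private HiveShell shell;

        @Test
        public void feedReturnsOrdersAboveThreshold() {
            shell.execute("CREATE DATABASE source_db");
            shell.execute("CREATE TABLE source_db.orders (id INT, amount DOUBLE)");
            shell.execute("INSERT INTO source_db.orders VALUES (1, 10.0), (2, 20.0)");

            // The query under test; in practice this would come from your script files.
            List<String> result = shell.executeQuery(
                    "SELECT id FROM source_db.orders WHERE amount > 15");

            Assert.assertEquals(1, result.size());
            Assert.assertEquals("2", result.get(0));
        }
    }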
Hope it helps you =)
You may also want to consider the following blog post, which describes automating unit testing using a custom utility class and Ant: http://dev.bizo.com/2011/04/hive-unit-testing.html
I know this is an old thread, but just in case someone comes across it: I have followed up on the whole minicluster and Hive testing approach, and found that things have changed with MR2 and YARN, but in a good way. I have put together an article and a GitHub repo to give some help with it:
http://www.lopakalogic.com/articles/hadoop-articles/hive-testing/
Hope it helps!