Testing Solr via Embedded Server

I'm writing tests for my Solr indexer application. Following testing best practices, I want the tests to be self-contained: they should just load schema.xml and solrconfig.xml and create a temporary data tree for the indexing and searching tests.
As the application is mostly written in Java, I'm using the SolrJ library, but I'm running into problems (well, I'm lost in the universe of CoreContainer, CoreDescriptor, CoreConfig, SolrCore ...).
Can anyone post some code that creates an embedded server which loads the config and also writes to a data directory passed as a parameter?

You can start with SolrExampleTests, which extends SolrExampleTestBase, which in turn extends AbstractSolrTestCase.
Also this SampleTest.
Also take a look at this and this threads.

Here is an example of a simple test case. solr is the directory that contains your Solr configuration files:
import java.io.IOException;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.util.AbstractSolrTestCase;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.SolrParams;
import org.junit.Before;
import org.junit.Test;
import static org.junit.Assert.assertEquals;
public class SolrSearchConfigTest extends AbstractSolrTestCase {
private SolrServer server;
@Override
public String getSchemaFile() {
return "solr/conf/schema.xml";
}
@Override
public String getSolrConfigFile() {
return "solr/conf/solrconfig.xml";
}
@Before
@Override
public void setUp() throws Exception {
super.setUp();
server = new EmbeddedSolrServer(h.getCoreContainer(), h.getCore().getName());
}
@Test
public void testThatNoResultsAreReturned() throws SolrServerException {
SolrParams params = new SolrQuery("text that is not found");
QueryResponse response = server.query(params);
assertEquals(0L, response.getResults().getNumFound());
}
@Test
public void testThatDocumentIsFound() throws SolrServerException, IOException {
SolrInputDocument document = new SolrInputDocument();
document.addField("id", "1");
document.addField("name", "my name");
server.add(document);
server.commit();
SolrParams params = new SolrQuery("name");
QueryResponse response = server.query(params);
assertEquals(1L, response.getResults().getNumFound());
assertEquals("1", response.getResults().get(0).get("id"));
}
}
See this blog post for more info: Solr Integration Tests

First you need to set your Solr home directory, which contains solr.xml and a conf folder with solrconfig.xml, schema.xml, etc.
After that you can use this simple, basic SolrJ code:
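import java.io.File;
import java.util.Iterator;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.apache.solr.core.CoreContainer;
// The Solr home must contain solr.xml and the core's conf directory
File solrHome = new File("Your/Solr/Home/Dir/");
File configFile = new File(solrHome, "solr.xml");
CoreContainer coreContainer = new CoreContainer(solrHome.toString(), configFile);
SolrServer solrServer = new EmbeddedSolrServer(coreContainer, "Your-Core-Name-in-solr.xml");
SolrQuery query = new SolrQuery("Your Solr Query");
QueryResponse rsp = solrServer.query(query);
SolrDocumentList docs = rsp.getResults();
Iterator<SolrDocument> i = docs.iterator();
while (i.hasNext()) {
System.out.println(i.next().toString());
}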
I hope this helps.
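Both answers assume that a Solr home directory already exists. For a self-contained test, one option is to build a throwaway home with its own data directory before starting the embedded server. The following is only a minimal sketch using the plain JDK: TempSolrHome and its fields are hypothetical names, not part of Solr, and it assumes the config files live in a single source folder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical test helper: builds a throwaway Solr home with its own data directory. */
final class TempSolrHome {
    final Path home;
    final Path dataDir;

    private TempSolrHome(Path home, Path dataDir) {
        this.home = home;
        this.dataDir = dataDir;
    }

    /**
     * Copies schema.xml and solrconfig.xml from sourceConf into <temp>/conf
     * and creates an empty <temp>/data directory next to it.
     */
    static TempSolrHome create(Path sourceConf) throws IOException {
        Path home = Files.createTempDirectory("solr-test-home");
        Path conf = Files.createDirectories(home.resolve("conf"));
        for (String name : new String[] {"schema.xml", "solrconfig.xml"}) {
            Files.copy(sourceConf.resolve(name), conf.resolve(name),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        Path dataDir = Files.createDirectories(home.resolve("data"));
        return new TempSolrHome(home, dataDir);
    }
}
```

The resulting home path can then be passed wherever the answers above use the Solr home. If the CoreContainer constructor that takes solr.xml is used, that file would have to be copied as well. If solrconfig.xml uses the ${solr.data.dir:} placeholder, as the stock example config does, setting the solr.data.dir system property to dataDir before creating the container is one way to redirect the index.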

Related

Running MapReduce on HBase exported table throws Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'

I have taken a backup of my HBase table using the HBase Export utility.
hbase org.apache.hadoop.hbase.mapreduce.Export "FinancialLineItem" "/project/fricadev/ESGTRF/EXPORT"
This kicked off a MapReduce job and transferred all my table data into the output folder.
According to the documentation, the output file is in sequence file format.
So I ran the code below to extract my keys and values from the file.
Now I want to run MapReduce to read the key/value pairs from the output file, but I get the exception below:
java.lang.Exception: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1964)
at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:478)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:671)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
Here is my driver code
package SEQ;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class SeqDriver extends Configured implements Tool
{
public static void main(String[] args) throws Exception{
int exitCode = ToolRunner.run(new SeqDriver(), args);
System.exit(exitCode);
}
public int run(String[] args) throws Exception {
if (args.length != 2) {
System.err.printf("Usage: %s needs two arguments files\n",
getClass().getSimpleName());
return -1;
}
String outputPath = args[1];
FileSystem hfs = FileSystem.get(getConf());
Job job = new Job();
job.setJarByClass(SeqDriver.class);
job.setJobName("SequenceFileReader");
HDFSUtil.removeHdfsSubDirIfExists(hfs, new Path(outputPath), true);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
job.setOutputKeyClass(ImmutableBytesWritable.class);
job.setOutputValueClass(Result.class);
job.setInputFormatClass(SequenceFileInputFormat.class);
job.setMapperClass(MySeqMapper.class);
job.setNumReduceTasks(0);
int returnValue = job.waitForCompletion(true) ? 0:1;
if(job.isSuccessful()) {
System.out.println("Job was successful");
} else if(!job.isSuccessful()) {
System.out.println("Job was not successful");
}
return returnValue;
}
}
Here is my mapper code
package SEQ;
import java.io.IOException;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class MySeqMapper extends Mapper <ImmutableBytesWritable, Result, Text, Text>{
@Override
public void map(ImmutableBytesWritable row, Result value,Context context)
throws IOException, InterruptedException {
}
}

So I will answer my own question.
Here is what was needed to make it work.
Because we use HBase to store our data and this reducer outputs its result to an HBase table, Hadoop is telling us that it doesn't know how to serialize our data. That is why we need to help it. Inside setUp, set the io.serializations variable:
hbaseConf.setStrings("io.serializations", new String[]{hbaseConf.get("io.serializations"), MutationSerialization.class.getName(), ResultSerialization.class.getName()});
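The io.serializations property is a comma-separated list of class names, so the new serializers should be appended to whatever is already configured rather than replace it. As a hedged illustration of that merge in plain Java (SerializationsValue is a hypothetical helper, not a Hadoop or HBase class), the value could be built like this, which also copes with an unset property and avoids duplicate entries:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that builds a value for the io.serializations property. */
final class SerializationsValue {
    /** Joins the existing value (may be null) with extra class names, comma-separated, skipping duplicates. */
    static String append(String existing, String... extraClassNames) {
        List<String> names = new ArrayList<>();
        if (existing != null) {
            for (String name : existing.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty() && !names.contains(trimmed)) {
                    names.add(trimmed);
                }
            }
        }
        for (String name : extraClassNames) {
            if (!names.contains(name)) {
                names.add(name);
            }
        }
        return String.join(",", names);
    }
}
```

With a Hadoop Configuration named conf, the result would be applied with something like conf.set("io.serializations", SerializationsValue.append(conf.get("io.serializations"), ResultSerialization.class.getName())). The job also has to be created from that same configuration, for example with Job.getInstance(conf), because a job created with new Job() starts from a fresh configuration and would not see the setting.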

Tests pass with Playframework 1.2.x but fail with Playframework 1.4.x

I am migrating my application from Play 1.2 + Java 7 to Play 1.4 + Java 8.
With Play 1.2 + Java 7 my test passes.
With Play 1.4 + Java 8 my test fails.
I have reduced the code to the minimum and reproduced the problem. Here are the main parts.
The model is
package models;
import play.db.jpa.Model;
import javax.persistence.Entity;
@Entity
public class Token extends Model {
public String name;
public String role;
}
The controller is
package controllers;
import models.Token;
import play.mvc.Controller;
public class Application extends Controller {
public static void index() {
renderJSON(Token.all().fetch());
}
}
The DB test configuration is
%test.application.mode=dev
%test.db.url=jdbc:h2:mem:play;MODE=MYSQL;LOCK_MODE=0
%test.jpa.ddl=create
The test is
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import org.junit.*;
import org.junit.Before;
import play.test.*;
import play.mvc.*;
import play.mvc.Http.*;
import models.*;
public class ApplicationTest extends FunctionalTest {
@Before
public void before() {
Token.deleteAll();
}
@Test
public void testThatIndexPageWorks() {
{
Response response = GET("/");
assertIsOk(response);
String content = getContent(response);
System.out.println(content);
assertFalse(content.contains("le nom"));
assertFalse(content.contains("identifier"));
}
Token t = new Token();
t.name="le nom";
t.role="identifier";
t.save();
{
Response response = GET("/");
assertIsOk(response);
String content = getContent(response);
System.out.println(content);
assertTrue(content.contains("le nom"));
assertTrue(content.contains("identifier"));
}
}
}
The behaviour is not predictable. It seems that entities saved in the test are committed asynchronously, so what the controller sees depends on thread timing, which was not the case in release 1.2.
I can provide the whole project if necessary.

As I do not want to use fixtures, I have to sync the DB manually: in the test, model.save() runs within a local transaction. That transaction is not closed when GET is called, so the data is not flushed yet.
I thought that this was covered by
jpa FlushModeType COMMIT
That seems to be the case in 1.2.x, but not in 1.4.x.
I modified the test by adding the code snippet below after save() and deleteAll(), and it works fine:
if ( play.db.jpa.JPA.em().getTransaction().isActive()) {
play.db.jpa.JPA.em().getTransaction().commit();
play.db.jpa.JPA.em().getTransaction().begin();
}

How to check out and check in a document outside Alfresco using the REST API?

I have created a web application using Servlets and JSP, and through it I have connected to an Alfresco repository. I am also able to upload documents to Alfresco and view them in the external web application.
Now my requirement is to offer check-in and check-out options for those documents.
I found the REST APIs below for this purpose,
but I do not understand how to use them in servlets to fulfill my requirement.
POST /alfresco/service/slingshot/doclib/action/cancel-checkout/site/{site}/{container}/{path}
POST /alfresco/service/slingshot/doclib/action/cancel-checkout/node/{store_type}/{store_id}/{id}
Can anyone please provide the simple steps or some piece of code to do this task?
Thanks in advance.

Please do not use the internal slingshot URLs for this. Instead, use OpenCMIS from Apache Chemistry. It will save you a lot of time and headaches and it is more portable to other repositories besides Alfresco.
The example below grabs an existing document by path, performs a checkout, then checks in a new major version of the plain text document.
package com.someco.cmis.examples;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.chemistry.opencmis.client.api.Document;
import org.apache.chemistry.opencmis.client.api.ObjectId;
import org.apache.chemistry.opencmis.client.api.Repository;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.SessionFactory;
import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;
import org.apache.chemistry.opencmis.commons.SessionParameter;
import org.apache.chemistry.opencmis.commons.data.ContentStream;
import org.apache.chemistry.opencmis.commons.enums.BindingType;
public class CheckoutCheckinExample {
private String serviceUrl = "http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom"; // Uncomment for Atom Pub binding
private Session session = null;
public static void main(String[] args) {
CheckoutCheckinExample cce = new CheckoutCheckinExample();
cce.doExample();
}
public void doExample() {
Document doc = (Document) getSession().getObjectByPath("/test/test-plain-1.txt");
String fileName = doc.getName();
ObjectId pwcId = doc.checkOut(); // Checkout the document
Document pwc = (Document) getSession().getObject(pwcId); // Get the working copy
// Set up an updated content stream
String docText = "This is a new major version.";
byte[] content = docText.getBytes();
InputStream stream = new ByteArrayInputStream(content);
ContentStream contentStream = session.getObjectFactory().createContentStream(fileName, Long.valueOf(content.length), "text/plain", stream);
// Check in the working copy as a major version with a comment
ObjectId updatedId = pwc.checkIn(true, null, contentStream, "My new version comment");
doc = (Document) getSession().getObject(updatedId);
System.out.println("Doc is now version: " + doc.getProperty("cmis:versionLabel").getValueAsString());
}
public Session getSession() {
if (session == null) {
// default factory implementation
SessionFactory factory = SessionFactoryImpl.newInstance();
Map<String, String> parameter = new HashMap<String, String>();
// user credentials
parameter.put(SessionParameter.USER, "admin"); // <-- Replace
parameter.put(SessionParameter.PASSWORD, "admin"); // <-- Replace
// connection settings
parameter.put(SessionParameter.ATOMPUB_URL, this.serviceUrl); // Uncomment for Atom Pub binding
parameter.put(SessionParameter.BINDING_TYPE, BindingType.ATOMPUB.value()); // Uncomment for Atom Pub binding
List<Repository> repositories = factory.getRepositories(parameter);
this.session = repositories.get(0).createSession();
}
return this.session;
}
}
Note that on the version of Alfresco I tested with (5.1.e), the document must already have the versionable aspect applied for the version label to be incremented; otherwise the checkin will simply overwrite the original.

How to write Elastic unit tests to test query building

I want to write unit tests that test the Elastic query building. I want to test that certain param values produce certain queries.
I started looking into ESTestCase. I see that you can mock a client using ESTestCase. I don't really need to mock the ES node, I just need to reproduce the query building part, but that requires the client.
Has anybody dealt with such an issue?
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import org.elasticsearch.action.search.SearchRequestBuilder;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.test.ESIntegTestCase;
import org.elasticsearch.test.ESTestCase;
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Ignore;
import org.junit.Test;
import com.google.common.collect.Lists;
public class SearchRequestBuilderTests extends ESTestCase {
private static Client client;
@BeforeClass
public static void initClient() {
//this client will not be hit by any request, but it needs to be a non null proper client
//that is why we create it but we don't add any transport address to it
Settings settings = Settings.builder()
.put("", createTempDir().toString())
.build();
client = TransportClient.builder().settings(settings).build();
}
@AfterClass
public static void closeClient() {
client.close();
client = null;
}
public static Map<String, String> createSampleSearchParams() {
Map<String, String> searchParams = new HashMap<>();
searchParams.put(SenseneConstants.ADC_PARAM, "US");
searchParams.put(SenseneConstants.FETCH_SIZE_QUERY_PARAM, "10");
searchParams.put(SenseneConstants.QUERY_PARAM, "some query");
searchParams.put(SenseneConstants.LOCATION_QUERY_PARAM, "");
searchParams.put(SenseneConstants.RADIUS_QUERY_PARAM, "20");
searchParams.put(SenseneConstants.DISTANCE_UNIT_PARAM, DistanceUnit.MILES.name());
searchParams.put(SenseneConstants.GEO_DISTANCE_PARAM, "true");
return searchParams;
}
@Test
public void test() {
BasicSearcher searcher = new BasicSearcher(client); // this is my application's searcher
Map<String, String> searchParams = createSampleSearchParams();
ArrayList<String> filterQueries = Lists.newArrayList();
SearchRequest searchRequest = SearchRequest.create(searchParams, filterQueries);
MySearchRequestBuilder medleyReqBuilder = new MySearchRequestBuilder.Builder(client, "my_index", searchRequest).build();
SearchRequestBuilder searchRequestBuilder = medleyReqBuilder.constructSearchRequestBuilder();
System.out.print(searchRequestBuilder.toString());
// Here I want to assert that the search request builder output is what it should be for the above client params
}
}
I get this, and nothing in the code runs:
Assertions mismatch: -ea was not specified but -Dtests.asserts=true
REPRODUCE WITH: mvn test -Pdev -Dtests.seed=5F09BEDD71BBD14E -Dtests.class=SearchRequestBuilderTests -Dtests.locale=en_US -Dtests.timezone=America/Los_Angeles
NOTE: test params are: codec=null, sim=null, locale=null, timezone=(null)
NOTE: Mac OS X 10.10.5 x86_64/Oracle Corporation 1.7.0_80 (64-bit)/cpus=4,threads=1,free=122894936,total=128974848
NOTE: All tests run in this JVM: [SearchRequestBuilderTests]

Obviously a bit late, but...
This actually has nothing to do with the ES testing framework; it comes from your run settings. Assuming you are running this in Eclipse, it is a duplicate of "Assertions mismatch: -ea was not specified but -Dtests.asserts=true".
Either go to the Eclipse preferences -> JUnit and enable the "Add -ea" checkbox,
or right-click the Eclipse project -> Run As -> Run Configurations -> Arguments tab and add the -ea option under VM arguments.
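To confirm whether the JVM that runs the tests really has assertions switched on, a small check in plain Java can help. This is only a sketch; AssertionStatus is a hypothetical class name and has nothing to do with the Elasticsearch test framework.

```java
/** Hypothetical helper that reports whether this class runs with assertions enabled (-ea). */
final class AssertionStatus {
    static boolean enabled() {
        boolean enabled = false;
        // The assignment inside the assert statement is evaluated only when assertions are on.
        assert enabled = true;
        return enabled;
    }
}
```

Printing AssertionStatus.enabled() from the same run configuration shows true when -ea reaches the JVM and false otherwise.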

How to force an Apache Mahout application to read directly from HDFS

I have implemented an Apache Mahout application (attached below) which does some basic computations. To do so, it has to load the dataset from my local machine. The application is packaged as a jar file and is executed within a Hadoop pseudo-distributed cluster. The terminal command for that is: $ hadoop jar /home/eualin/ApacheMahout/tdunning-MiA-5b8956f/target/mia-0.1-jar-with-dependencies.jar mia.recommender.ch03.IREvaluatorBooleanPrefIntro2 "/home/eualin/Desktop/links-final"
Now, my question is how to do the same, but this time reading the dataset from HDFS (assuming, of course, that the dataset is already stored in HDFS, e.g. in /user/eualin/output/links-final). What should change in that case? This might help: hdfs://localhost:50010/user/eualin/output/links-final
package mia.recommender.ch03;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.DataModelBuilder;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.GenericBooleanPrefDataModel;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.PreferenceArray;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;
import java.io.File;
public class IREvaluatorBooleanPrefIntro2 {
private IREvaluatorBooleanPrefIntro2() {
}
public static void main(String[] args) throws Exception {
if (args.length != 1) {
System.out.println("give file's HDFS path");
System.exit(1);
}
DataModel model = new GenericBooleanPrefDataModel(
GenericBooleanPrefDataModel.toDataMap(
new GenericBooleanPrefDataModel(new FileDataModel(new File(args[0])))));
RecommenderIRStatsEvaluator evaluator =
new GenericRecommenderIRStatsEvaluator();
RecommenderBuilder recommenderBuilder = new RecommenderBuilder() {
@Override
public Recommender buildRecommender(DataModel model) throws TasteException {
UserSimilarity similarity = new LogLikelihoodSimilarity(model);
UserNeighborhood neighborhood =
new NearestNUserNeighborhood(10, similarity, model);
return new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);
}
};
DataModelBuilder modelBuilder = new DataModelBuilder() {
@Override
public DataModel buildDataModel(FastByIDMap<PreferenceArray> trainingData) {
return new GenericBooleanPrefDataModel(
GenericBooleanPrefDataModel.toDataMap(trainingData));
}
};
IRStatistics stats = evaluator.evaluate(
recommenderBuilder, modelBuilder, model, null, 10,
GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD,
1.0);
System.out.println(stats.getPrecision());
System.out.println(stats.getRecall());
}
}

You can't, directly, since the non-distributed code has no knowledge of HDFS. Instead, copy the file to a local location in setup() and then read it from a local file.
