Using Mockito to test Java Hbase API - unit-testing

This is the method that I am testing. It gets some bytes from an HBase database based on a specific id, in this case called dtmid. The reason I want to return specific values is that there is no way to know whether an id will always be in HBase. Also, the column family and column name could change.
@Override
public void execute(Tuple tuple, BasicOutputCollector collector) {
    try {
        if (tuple.size() > 0) {
            Long dtmid = tuple.getLong(0);
            byte[] rowKey = HBaseRowKeyDistributor.getDistributedKey(dtmid);
            Get get = new Get(rowKey);
            get.addFamily("a".getBytes());
            Result result = table.get(get);
            byte[] bidUser = result.getValue("a".getBytes(),
                    "co_created_5076".getBytes());
            collector.emit(new Values(dtmid, bidUser));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
When the bolt makes the call below, I want it to return a specific value (some bytes):
byte[] bidUser = result.getValue("a".getBytes(),
"co_created_5076".getBytes());
This is what I have in my unit test:
@Test
public void testExecute() throws IOException {
    long dtmId = 350000000770902930L;
    final byte[] COL_FAMILY = "a".getBytes();
    final byte[] COL_QUALIFIER = "co_created_5076".getBytes();
    // setting a key value pair to put in result
    List<KeyValue> kvs = new ArrayList<KeyValue>();
    kvs.add(new KeyValue("--350000000770902930".getBytes(), COL_FAMILY, COL_QUALIFIER, Bytes.toBytes("ExpedtedBytes")));
    // I create an instance of Result
    Result result = new Result(kvs);
    // A mock tuple with a single dtmid
    Tuple tuple = mock(Tuple.class);
    bolt.table = mock(HTable.class);
    Result mcResult = mock(Result.class);
    when(tuple.size()).thenReturn(1);
    when(tuple.getLong(0)).thenReturn(dtmId);
    when(bolt.table.get(any(Get.class))).thenReturn(result);
    when(mcResult.getValue(any(byte[].class), any(byte[].class))).thenReturn(Bytes.toBytes("Bytes"));
    BasicOutputCollector collector = mock(BasicOutputCollector.class);
    // Execute the bolt.
    bolt.execute(tuple, collector);
    ArgumentCaptor<Values> valuesArg = ArgumentCaptor
            .forClass(Values.class);
    verify(collector).emit(valuesArg.capture());
    Values d = valuesArg.getValue();
    // casting this object into a byte array.
    byte[] i = (byte[]) d.get(1);
    assertEquals(dtmId, d.get(0));
}
I am using the stub below to return my bytes, but for some reason it is not working.
when(mcResult.getValue(any(byte[].class), any(byte[].class))).thenReturn(Bytes
.toBytes("myBytes"));
For some reason when I capture the values, I still get the bytes that I specified here:
List<KeyValue> kvs = new ArrayList<KeyValue>();
kvs.add(new KeyValue("--350000000770902930".getBytes(),COL_FAMILY, COL_QUALIFIER, Bytes
.toBytes("ExpedtedBytes")));
Result result = new Result(kvs);

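Your stub on mcResult is never used, because table.get() is stubbed to return the real Result built from kvs rather than the mock. How about replacing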
when(bolt.table.get(any(Get.class))).thenReturn(result);
with...
when(bolt.table.get(any(Get.class))).thenReturn(mcResult);


Spring RestTemplate.execute(), how to stub the response that gets passed in to my callback function?

I have the following code. Dictionary is just a wrapper for a List of type String.
public Dictionary getDictionary(int size, String text) {
return restTemplate.execute(url, HttpMethod.GET, null, response -> {
BufferedReader br = new BufferedReader(new InputStreamReader(response.getBody()));
List<String> words = new ArrayList<>();
String line;
while((line = br.readLine()) != null){
if (isMatch(line, size, text)){
words.add(line.toLowerCase());
}
}
br.close();
return new Dictionary(words);
});
}
private boolean isMatch(String word, int size, String text) {
if(word.length() != size) {
return false;
}
return wordUtil.isAnagram(word, text);
}
I'm having a hard time testing this method at the moment. The HTTP call just returns a list of words in plain text, separated by newlines.
I want to write a test where I can stub response.getBody().
That is, I want response.getBody() to return a bunch of words, and then assert that the returned Dictionary contains only the words that have length size and are an anagram of the string text.
Is this possible?
Thanks
It is possible to stub a method taking a callback, and execute the callback when the stub is called.
The idea is to:
- use when / thenAnswer to execute code when the stubbed method is called
- use the invocationOnMock passed to thenAnswer to get the callback instance
- call the callback, providing the necessary params
@Test
void testExecute() {
    String responseBody = "line1\nline2";
    InputStream responseBodyStream = new ByteArrayInputStream(responseBody.getBytes());
    ClientHttpResponse httpResponse = new MockClientHttpResponse(responseBodyStream, 200);
    // When execute() is called, invoke the ResponseExtractor callback with the canned response
    when(restTemplate.execute(any(URI.class), eq(HttpMethod.GET), eq(null), any())).thenAnswer(
        invocationOnMock -> {
            ResponseExtractor<MyDictionary> responseExtractor = invocationOnMock.getArgument(3);
            return responseExtractor.extractData(httpResponse);
        }
    );
    MyDictionary ret = aController.getDictionary(1, "text");
    // assert ret against your expectations
}
Having said that, this seems a bit complicated for the task at hand. IMHO you will be better off separating the HTTP handling from your business logic: extract a method that takes the InputStream, and test that separately.
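As a rough sketch of that extraction, using only the JDK: the class name WordFilter, the method matchingWords, and the sort-based isAnagram are hypothetical stand-ins (the real wordUtil.isAnagram from the question may behave differently, for example with respect to case).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: the filtering logic pulled out of the RestTemplate callback.
class WordFilter {

    // Reads one word per line and keeps those of the given size that are anagrams of text.
    static List<String> matchingWords(InputStream body, int size, String text) {
        List<String> words = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(body, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.length() == size && isAnagram(line, text)) {
                    words.add(line.toLowerCase());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Assumed stand-in for wordUtil.isAnagram: case-insensitive comparison of sorted letters.
    static boolean isAnagram(String a, String b) {
        char[] x = a.toLowerCase().toCharArray();
        char[] y = b.toLowerCase().toCharArray();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        InputStream body = new ByteArrayInputStream(
                "Tab\nbat\ncat\nbats\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(matchingWords(body, 3, "abt"));
    }
}
```

A test can then feed a ByteArrayInputStream straight into matchingWords without mocking RestTemplate at all, and the callback in getDictionary shrinks to a single call wrapping the result in a Dictionary.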

AWS SDK2 java s3 select example - how to get result bytes

I am trying to use the AWS SDK for Java v2 for S3 Select operations, but I am not able to extract the final data. I am looking for an example, if someone has implemented it. I got some ideas from [this post][1] (Fetching specific fields from an S3 document), but I cannot figure out how to get and read the full data.
Basically, I need the equivalent of this v1 SDK code:
```
InputStream resultInputStream = result.getPayload().getRecordsInputStream(
        new SelectObjectContentEventVisitor() {
            @Override
            public void visit(SelectObjectContentEvent.StatsEvent event)
            {
                System.out.println(
                        "Received Stats, Bytes Scanned: " + event.getDetails().getBytesScanned()
                                + " Bytes Processed: " + event.getDetails().getBytesProcessed());
            }
            /*
             * An End Event informs that the request has finished successfully.
             */
            @Override
            public void visit(SelectObjectContentEvent.EndEvent event)
            {
                isResultComplete.set(true);
                System.out.println("Received End Event. Result is complete.");
            }
        }
);
```
In AWS SDK v2, how do I get the result stream?
```
public byte[] getQueryResults() {
    logger.info("V2 query");
    S3AsyncClient s3Client = null;
    s3Client = S3AsyncClient.builder()
            .region(Region.US_WEST_2)
            .build();
    String fileObjKeyName = "upload/" + filePath;
    try{
        logger.info("Filepath: " + fileObjKeyName);
        ListObjectsV2Request listObjects = ListObjectsV2Request
                .builder()
                .bucket(Constants.bucketName)
                .build();
        ......
        InputSerialization inputSerialization = InputSerialization.builder().
                json(JSONInput.builder().type(JSONType.LINES).build()).build();
        OutputSerialization outputSerialization = null;
        outputSerialization = OutputSerialization.builder().
                json(JSONOutput.builder()
                        .build()
                ).build();
        SelectObjectContentRequest selectObjectContentRequest = SelectObjectContentRequest.builder()
                .bucket(Constants.bucketName)
                .key(partFilename)
                .expression(query)
                .expressionType(ExpressionType.SQL)
                .inputSerialization(inputSerialization)
                .outputSerialization(outputSerialization)
                .scanRange(ScanRange.builder().start(0L).end(Constants.limitBytes).build())
                .build();
        final DataHandler handler = new DataHandler();
        CompletableFuture<Void> future = s3Client.selectObjectContent(selectObjectContentRequest, handler);
        // hold it till we get an end event
        EndEvent endEvent = (EndEvent) handler.receivedEvents.stream()
                .filter(e -> e.sdkEventType() == SelectObjectContentEventStream.EventType.END)
                .findFirst()
                .orElse(null);
        // Now, from here how do I get the response bytes?
        // ---> ISSUE: How do I get the result stream bytes?
        return <bytes>
}
```
```
// handler
private static class DataHandler implements SelectObjectContentResponseHandler {
    private SelectObjectContentResponse response;
    private List<SelectObjectContentEventStream> receivedEvents = new ArrayList<>();
    private Throwable exception;

    @Override
    public void responseReceived(SelectObjectContentResponse response) {
        this.response = response;
    }

    @Override
    public void onEventStream(SdkPublisher<SelectObjectContentEventStream> publisher) {
        publisher.subscribe(receivedEvents::add);
    }

    @Override
    public void exceptionOccurred(Throwable throwable) {
        exception = throwable;
    }

    @Override
    public void complete() {
    }
}
```
[1]: https://stackoverflow.com/questions/67315601/fetching-specific-fields-from-an-s3-document
I came across your post while working on the same problem, trying to avoid the v1 SDK.
After hours of searching I found the answer in this pull request: https://github.com/aws/aws-sdk-java-v2/pull/2943/files
The answer is in the SelectObjectContentIntegrationTest.java file:
services/s3/src/it/java/software/amazon/awssdk/services/SelectObjectContentIntegrationTest.java
The way to get the bytes is through the RecordsEvent class. Note that my use case was CSV; I am not sure whether this differs for other file types.
In the complete() method you have access to receivedEvents. Take the first element, which holds the filtered results, and cast it to RecordsEvent; that class provides the payload as bytes:
@Override
public void complete() {
    // Assumes the first received event is a RecordsEvent
    RecordsEvent records = (RecordsEvent) this.receivedEvents.get(0);
    String result = records.payload().asUtf8String();
}
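One caveat: S3 Select can split a result across more than one Records event, so reading only the first event may truncate larger results. The following JDK-only sketch shows how the chunks could be stitched together. It assumes each byte[] was taken, in arrival order, from one RecordsEvent payload (for example via payload().asByteArray()); RecordChunkJoiner and join are hypothetical names, and no AWS types are used here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: concatenates payload chunks in the order they were received.
class RecordChunkJoiner {

    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Two chunks standing in for two Records payloads; a record may be split across them.
        List<byte[]> chunks = List.of(
                "a,b\n1,".getBytes(StandardCharsets.UTF_8),
                "2\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(new String(join(chunks), StandardCharsets.UTF_8));
    }
}
```

In the handler, this would mean filtering receivedEvents down to the RecordsEvent instances inside complete() and joining their payloads, rather than casting element 0.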

Running BeamSql WithoutCoder or Making Coder Dynamic

I am reading data from a file and converting it to BeamRecord, but when I run a query on it, it shows this error:
Exception in thread "main" java.lang.ClassCastException: org.apache.beam.sdk.coders.SerializableCoder cannot be cast to org.apache.beam.sdk.coders.BeamRecordCoder
at org.apache.beam.sdk.extensions.sql.BeamSql$QueryTransform.registerTables(BeamSql.java:173)
at org.apache.beam.sdk.extensions.sql.BeamSql$QueryTransform.expand(BeamSql.java:153)
at org.apache.beam.sdk.extensions.sql.BeamSql$QueryTransform.expand(BeamSql.java:116)
at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:533)
at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:465)
at org.apache.beam.sdk.values.PCollectionTuple.apply(PCollectionTuple.java:160)
at TestingClass.main(TestingClass.java:75)
But when I provide a coder, it runs perfectly.
I am a little confused: the schema of the file data changes on every run because I am using templates. Is there any way to run the query with a default coder, or without a coder?
The code is below for reference.
PCollection<String> ReadFile1 = PBegin.in(p).apply(TextIO.read().from("gs://Bucket_Name/FileName.csv"));
PCollection<BeamRecord> File1_BeamRecord = ReadFile1.apply(new StringToBeamRecord()).setCoder(new Temp().test().getRecordCoder());
PCollection<String> ReadFile2= p.apply(TextIO.read().from("gs://Bucket_Name/FileName.csv"));
PCollection<BeamRecord> File2_Beam_Record = ReadFile2.apply(new StringToBeamRecord()).setCoder(new Temp().test1().getRecordCoder());
new Temp().test1().getRecordCoder() returns hard-coded BeamRecordCoder values, which I need to build at runtime instead.
The conversion from PCollection<String> to PCollection<BeamRecord> is below:
public class StringToBeamRecord extends PTransform<PCollection<String>,PCollection<BeamRecord>> {
    private static final Logger LOG = LoggerFactory.getLogger(StringToBeamRecord.class);

    @Override
    public PCollection<BeamRecord> expand(PCollection<String> arg0) {
        return arg0.apply("Conversion",ParDo.of(new ConversionOfData()));
    }

    static class ConversionOfData extends DoFn<String,BeamRecord> implements Serializable{
        @ProcessElement
        public void processElement(ProcessContext c){
            String Data = c.element().replaceAll(",,",",blank,");
            String[] array = Data.split(",");
            List<String> fieldNames = new ArrayList<>();
            List<Integer> fieldTypes = new ArrayList<>();
            List<Object> Data_Conversion = new ArrayList<>();
            int Count = 0;
            for(int i = 0 ; i < array.length;i++){
                fieldNames.add(new String("R"+Count).toString());
                Count++;
                fieldTypes.add(Types.VARCHAR); // Using a schema I can set it
                Data_Conversion.add(array[i].toString());
            }
            LOG.info("The Size is : "+Data_Conversion.size());
            BeamRecordSqlType type = BeamRecordSqlType.create(fieldNames, fieldTypes);
            c.output(new BeamRecord(type,Data_Conversion));
        }
    }
}
The query is:
PCollectionTuple test = PCollectionTuple.of(
new TupleTag<BeamRecord>("File1_BeamRecord"),File1_BeamRecord)
.and(new TupleTag<BeamRecord>("File2_BeamRecord"), File2_BeamRecord);
PCollection<BeamRecord> output = test.apply(BeamSql.queryMulti(
"Select * From File1_BeamRecord JOIN File2_BeamRecord "));
Is there any way I can make the coder dynamic, or run the query with the default coder?

How to get all rows containing (or equaling) a particular ID from an HBase table?

I have a method that should select the rows whose row key contains the parameter passed in.
HTable table = new HTable(Bytes.toBytes(objectsTableName), connection);
public List<ObjectId> lookUp(String partialId) {
if (partialId.matches("[a-fA-F0-9]+")) {
// create a regular expression from partialId, which can
//match any rowkey that contains partialId as a substring,
//and then get all the row with the specified rowkey
} else {
throw new IllegalArgumentException(
"query must be done with hexadecimal values only");
}
}
I don't know how to finish the code above.
I only know that the following code gets the row with a specified row key from HBase:
String rowkey = "123";
Get get = new Get(Bytes.toBytes(rowkey));
Result result = table.get(get);
You can use a RowFilter with a RegexStringComparator to do that. Or, if you only need the rows whose keys contain a given substring, you can use a RowFilter with a SubstringComparator. This is how you use HBase filters:
public static void main(String[] args) throws IOException {
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "demo");
Scan s = new Scan();
Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator("abc"));
s.setFilter(f);
ResultScanner rs = table.getScanner(s);
for(Result r : rs){
System.out.println("RowKey : " + Bytes.toString(r.getRow()));
//rest of your logic
}
rs.close();
table.close();
}
The above piece of code will give you all the rows which contain abc as a part of their rowkeys.
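For the regular-expression route, here is a small JDK-only sketch of validating the partial id and building a "contains" pattern, which is the kind of string you would hand to RegexStringComparator. PartialIdMatcher, containsRegex and filter are hypothetical names, and the row keys are plain strings standing in for real HBase rows; nothing here talks to HBase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the regex only; not an HBase API.
class PartialIdMatcher {
    private static final Pattern HEX = Pattern.compile("[a-fA-F0-9]+");

    // Builds a regex matching any key that contains partialId as a substring.
    // The leading and trailing ".*" keep it correct under full-match semantics too.
    static String containsRegex(String partialId) {
        if (!HEX.matcher(partialId).matches()) {
            throw new IllegalArgumentException(
                    "query must be done with hexadecimal values only");
        }
        // Safe to concatenate unescaped: the input was validated as hex digits only.
        return ".*" + partialId + ".*";
    }

    // Local stand-in for the server-side filter: keeps the keys the regex matches.
    static List<String> filter(List<String> rowKeys, String partialId) {
        Pattern pattern = Pattern.compile(containsRegex(partialId));
        List<String> hits = new ArrayList<>();
        for (String key : rowKeys) {
            if (pattern.matcher(key).matches()) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(filter(List.of("00ab12", "ffab", "1234"), "ab"));
    }
}
```

Note that the match is case-sensitive, so "AB" would not match a key containing "ab"; whether that is what you want depends on how your row keys are written.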
HTH

Hbase Map/reduce-How to access individual columns of the table?

I have a table called User with two columns, one called visitorId and the other called friend which is a list of strings. I want to check whether the VisitorId is in the friendlist. Can anyone direct me as to how to access the table columns in a map function?
I'm not able to picture how data is output from a map function in hbase.
My code is as follows:
public class MapReduce {
    static class Mapper1 extends TableMapper<ImmutableBytesWritable, IntWritable> {
        private int numRecords = 0;
        private static final IntWritable ONE = new IntWritable(1);
        private Text text = new Text();

        @Override
        public void map(ImmutableBytesWritable row, Result values, Context context) throws IOException {
            // What should I do here?
            ImmutableBytesWritable userKey = new ImmutableBytesWritable(row.get(), 0, Bytes.SIZEOF_INT);
            try {
                context.write(userKey, ONE);
            } catch (InterruptedException e) {
                throw new IOException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "CheckVisitor");
        job.setJarByClass(MapReduce.class);
        Scan scan = new Scan();
        Filter f = new RowFilter(CompareOp.EQUAL, new SubstringComparator("mId2"));
        scan.setFilter(f);
        scan.addFamily(Bytes.toBytes("visitor"));
        scan.addFamily(Bytes.toBytes("friend"));
        // The mapper output value class must match the type the mapper actually writes
        TableMapReduceUtil.initTableMapperJob("User", scan, Mapper1.class, ImmutableBytesWritable.class, IntWritable.class, job);
    }
}
The Result values instance contains the full row from the scanner.
To get the appropriate columns from the Result, I would do something like:
KeyValue VisitorIdVal = values.getColumnLatest(Bytes.toBytes(columnFamily1), Bytes.toBytes("VisitorId"));
KeyValue friendlistVal = values.getColumnLatest(Bytes.toBytes(columnFamily2), Bytes.toBytes("friendlist"));
Here VisitorIdVal and friendlistVal are of type KeyValue (http://archive.cloudera.com/cdh/3/hbase/apidocs/org/apache/hadoop/hbase/KeyValue.html). To get their values out, you can call Bytes.toString(VisitorIdVal.getValue()).
Once you have extracted the values from the columns, you can check whether the "VisitorId" value is in "friendlist".
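That last check could look roughly like the JDK-only sketch below. It assumes the friend list is stored as a single comma-separated UTF-8 string, which the question does not confirm; FriendListCheck and isVisitorInFriendList are hypothetical names, and the byte arrays stand in for what getValue() would return.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper; the comma-separated storage format is an assumption.
class FriendListCheck {

    // Returns true when the visitor id equals one whole entry of the friend list.
    static boolean isVisitorInFriendList(byte[] visitorIdBytes, byte[] friendListBytes) {
        if (visitorIdBytes == null || friendListBytes == null) {
            return false; // a missing column means there is nothing to match
        }
        String visitorId = new String(visitorIdBytes, StandardCharsets.UTF_8).trim();
        String friendList = new String(friendListBytes, StandardCharsets.UTF_8);
        return Arrays.stream(friendList.split(","))
                .map(String::trim)
                .anyMatch(visitorId::equals);
    }

    public static void main(String[] args) {
        byte[] visitor = "mId2".getBytes(StandardCharsets.UTF_8);
        byte[] friends = "mId1, mId2,mId3".getBytes(StandardCharsets.UTF_8);
        System.out.println(isVisitorInFriendList(visitor, friends));
    }
}
```

Comparing whole entries rather than using String.contains avoids false positives such as "mId" matching inside "mId2".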