Load a Bayesian network from a BIF file in WEKA

I want to load a Bayesian network that is stored in a BIF file into WEKA so that I can validate it against a specific test set.
However, it keeps crashing with a NullPointerException because m_MissingValuesFilter is null.
How do I set this filter correctly?
My current code looks as follows:
BayesNet network = new BayesNet();
BIFReader reader = new BIFReader();
network = reader.processFile(path);
Evaluation eval = new Evaluation(testData);
eval.evaluateModel(network, testData);

Despite loading the network from a BIF XML file, you still need to call the buildClassifier method to initialize all the data structures.
Here is a minimal code example:
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.BayesNet;
import weka.classifiers.bayes.net.search.fixed.FromFile;
import weka.core.Instances;
import weka.core.converters.ConverterUtils;
public class Bif {
    public static void main(String[] args) throws Exception {
        // load data
        Instances data = ConverterUtils.DataSource.read("/somewhere/data.arff");
        data.setClassIndex(data.numAttributes() - 1);
        // configure classifier
        FromFile fromFile = new FromFile();
        fromFile.setBIFFile("/somewhere/model.bif");
        BayesNet bayesNet = new BayesNet();
        bayesNet.setSearchAlgorithm(fromFile);
        // build classifier to initialize data structures
        // NB: this won't change the model itself, as it is fixed
        bayesNet.buildClassifier(data);
        // evaluate it
        Evaluation eval = new Evaluation(data);
        eval.evaluateModel(bayesNet, data);
        System.out.println(eval.toSummaryString());
        // you can output the BIF XML like this
        //System.out.println(bayesNet.graph());
    }
}
This approach, however, uses only the structure of the network stored in the BIF file, not the conditional probability distributions (CPDs). The training process reinitializes the CPDs.
If you also want to use the CPDs from the BIF file, you need this slightly hacky approach:
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.net.BIFReader;
import weka.core.Instances;
import weka.core.converters.ConverterUtils;
public class Bif {
    public static void main(String[] args) throws Exception {
        Instances test = ConverterUtils.DataSource.read("/somewhere/data.arff");
        test.setClassIndex(test.numAttributes() - 1);
        String bifFile = "/somewhere/model.bif";
        BIFReader reader = new BIFReader();
        // initialize internal filters etc
        reader.buildClassifier(test);
        // override data structures using BIF file
        reader.processFile(bifFile);
        // output graph to confirm correct model
        //System.out.println(reader.graph());
        // evaluate the model against the test set
        Evaluation eval = new Evaluation(test);
        eval.evaluateModel(reader, test);
        System.out.println(eval.toSummaryString());
    }
}

Related

JUnit/Mockito to test a class containing "code table" like static data loaded from a DB

I have a Java class that holds all my code-table data; e.g. a code table called "Status" has three values, like so:
1 => Good
2 => Bad
3 => Ugly
Simplified, it looks like this:
public class Codes
{
    // initialized eagerly so that init() and getCodeStr() never see a null map
    private static final Map<String, Map<String, String>> CODES = new HashMap<>();
    public static void init(List<String> ctList) throws Exception
    {
        try (Connection con = ...)
        {
            // the SQL executed has an IN(?) where the ctList defines a list of the tables to load
            // executeQuery() returns the ResultSet; execute() only returns a boolean
            ResultSet results = prepStmt.executeQuery();
            while(results.next())
            {
                String id = results.getString("TYP");
                String code = results.getString("CODE");
                String desc = results.getString("DESC");
                if(!CODES.containsKey(id))
                    CODES.put(id, new HashMap<>());
                Map<String, String> ct = CODES.get(id);
                ct.put(code, code);
                ct.put(desc.toUpperCase().trim(), code); // allows searching for code by description
            }
        }
    }
    public static String getCodeStr(String table, String key)
    {
        return CODES.get(table).get(key);
    }
}
Inside my application I refer to this class frequently. A simple example:
String test = Codes.getCodeStr("test_table_id", "test_code_value");
In unit tests I won't be able to connect to a database. So what must I do to be able to test code that uses this class?
One thing I have considered is collecting all the code table data into a file and then initializing from the file based on some switch. I would just have to make sure the file is kept up to date...
Any ideas or suggestions are welcome but I do prefer to avoid/limit injection when I can.
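The file-based idea mentioned above could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name CodesFromFileSketch, the method initFromReader, and the TYP|CODE|DESC line format are all hypothetical and not part of the original class. Taking a Reader rather than a file path lets a test feed the data from a string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class CodesFromFileSketch {
    private static final Map<String, Map<String, String>> CODES = new HashMap<>();

    /** Loads rows of the (assumed) form TYP|CODE|DESC, one row per line. */
    public static void initFromReader(Reader source) throws IOException {
        CODES.clear();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] parts = line.split("\\|", 3);
                if (parts.length != 3) {
                    throw new IOException("Malformed row: " + line);
                }
                Map<String, String> ct =
                        CODES.computeIfAbsent(parts[0].trim(), k -> new HashMap<>());
                String code = parts[1].trim();
                ct.put(code, code);
                // same convention as the DB loader: description maps back to the code
                ct.put(parts[2].toUpperCase().trim(), code);
            }
        }
    }

    /** Returns the code for a key, or null if the table or key is unknown. */
    public static String getCodeStr(String table, String key) {
        Map<String, String> ct = CODES.get(table);
        return ct == null ? null : ct.get(key);
    }

    public static void main(String[] args) throws IOException {
        initFromReader(new StringReader("Status|1|Good\nStatus|2|Bad\nStatus|3|Ugly\n"));
        // looks up the code by its upper-cased description
        System.out.println(getCodeStr("Status", "UGLY"));
    }
}
```

In a test, the same method can be called from a setup step with a StringReader or a FileReader over a checked-in fixture file, so no database connection is needed.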

Class for storing `GeoPoint` in GCP Datastore (Java)

I am trying to store geo_point data in Datastore via the GCP Java client library. I figured out how to do it for Date data, but I can't work out which GeoPoint class to use for this.
import com.google.datastore.v1.Entity;
import static com.google.datastore.v1.client.DatastoreHelper.makeValue;
import java.util.Date;
...
public class WriteToDatastoreFromTwitter {
private static Value dValue(Date k) {
return makeValue(k).setExcludeFromIndexes(true).build();
}
public static void main(String[] args) throws TwitterException {
final Builder builder = Entity.newBuilder().
setKey(key).
putProperties("timestamp", dValue(tweet.getCreatedAt()));
// How can I add a `geo_point` data?
I am simply not sure if I should use classes outside of the datastore package, such as this: https://cloud.google.com/appengine/docs/standard/java/javadoc/com/google/appengine/api/search/GeoPoint
I figured it out myself. The package com.google.type, which the datastore package depends on, has a class LatLng, and I could use it to store geo_point data successfully. Here's how you initialize the object:
import com.google.type.LatLng;
...
LatLng x = LatLng.
    newBuilder().
    setLatitude(loc.getLatitude()).
    setLongitude(loc.getLongitude()).
    build();
and in my case, I stored it by doing
private static Value gValue(LatLng k) {
return makeValue(k).setExcludeFromIndexes(true).build();
}
builder.putProperties("geo_point", gValue(x));

What is the right way to access a class instance's methods outside its scope?

I have the following code:
import std.stdio;
import database;
import router;
import config;
import vibe.d;
void main()
{
    Config config = new Config();
    auto settings = new HTTPServerSettings;
    settings.port = 8081;
    settings.bindAddresses = ["::1", "127.0.0.1"];
    auto router = new URLRouter();
    router.get("/*", serveStaticFiles("./html"));
    Database database = new Database(config);
    database.MySQLConnect(); // all DB methods are declared here
    router.registerRestInterface(new MyRouter(database));
    router.get("*", &myStuff); // all other request
    listenHTTP(settings, router);
    logInfo("Please open http://127.0.0.1:8081/ in your browser.");
    runApplication();
}
void myStuff(HTTPServerRequest req, HTTPServerResponse res) // I need this to handle any accessed URLs
{
    writeln(req.path); // getting URL that was request on server
    // here I need access to DB methods to do processing and return some DB data
}
I needed to add router.get("*", &myStuff); to handle any URLs that do not belong to a REST interface.
The problem is that I do not know how to access the DB methods from myStuff().
Haven't tried it but using 'partial' might be a solution.
https://dlang.org/phobos/std_functional.html#partial
void myStuff(Database db, HTTPServerRequest req, HTTPServerResponse res) { ... }
void main()
{
import std.functional : partial;
...
router.get("*", partial!(myStuff, database));
...
}
Partial creates a function with the first parameter bound to a given value, so the caller does not need to know about it. Personally I don't like globals, singletons, etc. and try to inject dependencies. Although the implementation might become a bit more complex, this simplifies testing a lot.
The example above injects dependencies in a way similar to Constructor Injection as mentioned here:
https://en.wikipedia.org/wiki/Dependency_injection#Constructor_injection
When you inject dependencies like this, you also get a quick overview of the components required to call the function. If the number of dependencies grows, consider other approaches, e.g. injecting a ServiceLocator.
https://martinfowler.com/articles/injection.html#UsingAServiceLocator
Ronny
As an alternative to partial, you could achieve partial application with a closure:
router.get("*", (req, resp) => myStuff(database, req, resp));
// ...
void myStuff(Database db, HTTPServerRequest req, HTTPServerResponse res)
// ...
myStuff now has database injected from the surrounding scope.
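The binding idea behind both variants is not specific to D. As a language-neutral illustration, here is a minimal Java sketch (not vibe.d code; PartialSketch, Database, lookup and handle are hypothetical stand-ins) that fixes the first argument once through a small partial helper and once through a capturing lambda:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

public class PartialSketch {
    /** Binds the first argument of a two-argument function. */
    static <A, B, R> Function<B, R> partial(BiFunction<A, B, R> f, A first) {
        return b -> f.apply(first, b);
    }

    /** Hypothetical stand-in for a database handle. */
    static final class Database {
        String lookup(String path) {
            return "row for " + path;
        }
    }

    /** Handler that needs a Database in addition to the request path. */
    static String handle(Database db, String path) {
        return db.lookup(path);
    }

    public static void main(String[] args) {
        Database database = new Database();
        // Variant 1: explicit partial application of the first parameter.
        Function<String, String> viaPartial = partial(PartialSketch::handle, database);
        // Variant 2: a lambda that captures 'database' from the enclosing scope.
        Function<String, String> viaClosure = path -> handle(database, path);
        System.out.println(viaPartial.apply("/users"));
        System.out.println(viaClosure.apply("/users"));
    }
}
```

In both cases the caller only sees a one-argument function; the dependency is supplied where the handler is registered, not looked up globally.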
I have no experience with vibe.d, but this may be one solution:
Database database;
shared static this(){
Config config = new Config();
database = new Database(config);
}
void main(){
(...)
void myStuff(HTTPServerRequest req, HTTPServerResponse res){
database.whatever;
}

How to retrieve a list of all classes in a module in Haxe? (aka: helper classes)

Considering a Haxe file defines a series of classes like so:
import flash.display.MovieClip;
import flash.display.Sprite;
import haxe.unit.TestCase;
class MainTest extends Sprite {
    public var testcase:Array<Class<TestCase>> = ???;
}
class TestSprite extends TestCase {
    function testBasic() {
        var sprite = new Sprite();
        sprite.x = 0;
        assertEquals(sprite.x, 0);
    }
}
class TestMovieClip extends TestCase {
    function testMovieClip() {
        var mc = new MovieClip();
        mc.nextFrame();
        assertEquals(mc.currentFrame, 2);
    }
}
Is there a way to obtain a list of all the helper classes (ex: TestSprite and TestMovieClip)? Preferably at runtime, but a macro that would return an Array<Class<TestCase>> would work fine too.
I have a small macro helper library called compiletime that can get all classes in a package, or all classes that extend a certain class.
haxelib install compiletime
And then get classes either by base class or by package:
var testcases = CompileTime.getAllClasses("my.package");
var testcases = CompileTime.getAllClasses(TestCase);
var testcases = CompileTime.getAllClasses("my.package",TestCase); // Both!
Now, that is getting them by package, not by module. Getting it by module might work, I'm not sure off the top of my head. But if you were to edit this part of the code:
https://github.com/jasononeil/compiletime/blob/master/src/CompileTime.hx#L208
And change it to also support getting by module, and send me a pull request, I would most certainly merge it :)
Good luck!

Mock a method call with void return type using JMockit or Mockito

I have an unusual kind of method call that I need to test using the JMockit testing framework. First, let us look at the code.
public class MyClass{
    MyPort port;
    public Map<String, String> registerMethod(){
        // the holders must be non-null, otherwise the callee has nothing to fill in
        Holder<String> status = new Holder<>();
        Holder<String> message = new Holder<>();
        // In the real implementation the call below goes to a web service via the Apache CXF framework.
        // The method has a void return type; see the explanation below.
        port.registerService("name", "country", "email", status, message);
        // do some stuff with status and message here.....
        // assumes the JAX-WS Holder, which exposes its content as a public 'value' field
        Map<String, String> response = new HashMap<>();
        response.put("status", status.value);
        response.put("message", message.value);
        return response;
    }
}
Let me explain this a little. The class has a port instance variable that is used to connect to a web service. The web service implementation uses classes auto-generated by the Apache CXF framework to connect to the service and get the response back. My job is to mock this web service call so that I can write test cases for the many similar calls in the real application.
The call to the web service is made by the method port.registerService, which receives name, country and email as parameters. The status and message variables are passed to the same method as parameters too. So instead of returning values for status and message, the method FILLS IN the two objects passed to it, which is very different from the "RETURN" approach.
The problem is that when I mock this call with JMockit, I don't know what to expect from it: there is no return value at all, since it is a void call that fills in the parameters passed to it. So status and message will always be null when I mock this call, because I cannot state any return expectation in JMockit.
If anybody has a solution or suggestion for this problem, please respond. Thanks.
I was not sure what the Holder interface looked like so I made some assumptions. But, this is how you mock a method with a void return type using Mockito:
@SuppressWarnings("unchecked")
@Test
public final void test() {
    // given
    final String expectedStatus = "status";
    final String expectedMessage = "message";
    final MyPort mockPort = mock(MyPort.class);
    final Answer<Void> registerAnswer = new Answer<Void>() { // actual parameter type doesn't matter because it's a void method
        public Void answer(final InvocationOnMock invocation) throws Throwable {
            // Here I'm stubbing out the behaviour of registerService
            final Object[] arguments = invocation.getArguments();
            // I don't actually care about these, but if you wanted the other parameters, this is how you would get them
            // if you wanted to, you could perform assertions on them
            final String name = (String) arguments[0];
            final String country = (String) arguments[1];
            final String email = (String) arguments[2];
            final Holder<String> statusHolder = (Holder<String>) arguments[3];
            final Holder<String> messageHolder = (Holder<String>) arguments[4];
            // assuming the JAX-WS Holder, which exposes a public 'value' field; adjust if your Holder differs
            statusHolder.value = expectedStatus;
            messageHolder.value = expectedMessage;
            // even though it's a void method, we need to return something
            return null;
        }
    };
    doAnswer(registerAnswer).when(mockPort).registerService(anyString(),
            anyString(), anyString(), any(Holder.class), any(Holder.class));
    final MyClass object = new MyClass();
    object.port = mockPort;
    // when
    final Map<String, String> result = object.registerMethod();
    // then
    assertEquals(expectedStatus, result.get("status"));
    assertEquals(expectedMessage, result.get("message"));
}
For reference, these are my imports:
import static org.junit.Assert.assertEquals;
import static org.mockito.Matchers.any;
import static org.mockito.Matchers.anyString;
import static org.mockito.Mockito.doAnswer;
import static org.mockito.Mockito.mock;
import java.util.HashMap;
import java.util.Map;
import org.junit.Test;
import org.mockito.invocation.InvocationOnMock;
import org.mockito.stubbing.Answer;
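If a mocking library is not available, the same "fill in the out-parameters" behaviour can be reproduced with a hand-written fake. The sketch below is self-contained and makes several assumptions: Holder here is a minimal local stand-in modelled on the public value field of the JAX-WS Holder, and MyPort and MyClass are reduced, hypothetical versions of the classes from the question.

```java
import java.util.HashMap;
import java.util.Map;

public class HolderFakeSketch {
    /** Minimal stand-in with the same shape as the JAX-WS Holder (public 'value' field). */
    static final class Holder<T> {
        public T value;
    }

    /** Hypothetical port interface with a void method that writes into its holders. */
    interface MyPort {
        void registerService(String name, String country, String email,
                             Holder<String> status, Holder<String> message);
    }

    /** Reduced version of the class under test. */
    static final class MyClass {
        MyPort port;

        Map<String, String> registerMethod() {
            Holder<String> status = new Holder<>();
            Holder<String> message = new Holder<>();
            port.registerService("name", "country", "email", status, message);
            Map<String, String> response = new HashMap<>();
            response.put("status", status.value);
            response.put("message", message.value);
            return response;
        }
    }

    public static void main(String[] args) {
        MyClass object = new MyClass();
        // Hand-written fake: fills the holders instead of calling a real web service.
        object.port = (name, country, email, status, message) -> {
            status.value = "OK";
            message.value = "registered " + name;
        };
        System.out.println(object.registerMethod());
    }
}
```

The lambda plays the role of the Answer in the Mockito version: the test decides what the void method writes into the holders, and the assertions then check what registerMethod builds from them.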