Comparing entities while unit testing with Hibernate

I am running JUnit tests against an in-memory HSQLDB. Let's say I have a method that inserts some values into the DB, and I want to check that the method inserted them correctly. Note that the order of insertion is not important.
@Test
public void should_insert_correctly() {
    MyEntity[] expectedEntities = new MyEntity[2];
    // init expected entities
    Inserter out = new Inserter(session); // out: object under test
    out.insert();
    List list = session.createCriteria(MyEntity.class).list();
    assertTrue(list.contains(expectedEntities[0]));
    assertTrue(list.contains(expectedEntities[1]));
}
The problem is that I cannot compare the expected entities to the actual ones, because the expected entities' ids differ from the actual ids. Since setId() of MyEntity is private (to prevent ids from being set explicitly), I cannot set all of the entities' ids to 0 and compare them that way.
How can I compare two result sets regardless of their ids?

I found this more practical. Instead of fetching all results at once, I fetch each result by its criteria and assert that it is not null.
@Test
public void should_insert_correctly() {
    Inserter out = new Inserter(session); // out: object under test
    out.insert();

    Criteria criteria;
    criteria = getCriteria(session, 0);
    assertNotNull(criteria.uniqueResult());
    criteria = getCriteria(session, 1);
    assertNotNull(criteria.uniqueResult());
}

private Criteria getCriteria(Session session, int i) {
    Criteria criteria = session.createCriteria(MyEntity.class);
    criteria.add(Restrictions.eq("x", expectedX[i]));
    criteria.add(Restrictions.eq("y", expectedY[i]));
    return criteria;
}

A stateful entity should not override equals -- that is, entities should be compared for equality by reference identity -- so List.contains will not work as you want.
What I do is use reflection to compare the fields of the original and reloaded entities. The function that walks over the fields of the objects ignores transient fields and those annotated with @Transient.
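For illustration, a minimal sketch of such a comparator (my reconstruction, not the answerer's actual utility; it only walks the declared fields of the concrete class and assumes the javax.persistence.Transient annotation):

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public final class EntityFieldComparator {

    // Compares two entities field by field, skipping transient fields and
    // fields annotated with @Transient. Nested entities would need recursion.
    public static boolean fieldsEqual(Object a, Object b) throws IllegalAccessException {
        if (a == null || b == null || a.getClass() != b.getClass()) return false;
        for (Field field : a.getClass().getDeclaredFields()) {
            if (Modifier.isTransient(field.getModifiers())
                    || field.isAnnotationPresent(javax.persistence.Transient.class)) {
                continue; // ignore non-persistent state
            }
            field.setAccessible(true);
            Object va = field.get(a);
            Object vb = field.get(b);
            if (va == null ? vb != null : !va.equals(vb)) return false;
        }
        return true;
    }
}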
I don't find I need to ignore the id. When the object is first flushed to the database, Hibernate allocates it an id. When it is reloaded, the object will have the same id.
The flaw in your test is that you have not set transaction boundaries. You need to save the objects in one transaction. When you commit that transaction, Hibernate will flush the objects to the database and allocate their ids. Then in another transaction load the entities back from the database. You will get another set of objects that should have the same ids and persistent (i.e. non-transient) state.
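A minimal sketch of those boundaries, reusing the session, out, and MyEntity names from the question (assuming the plain Session API and no surrounding transaction):

Transaction tx = session.beginTransaction();
out.insert();                 // Hibernate flushes the entities and allocates ids
tx.commit();
session.clear();              // evict the session cache so list() really reloads

Transaction tx2 = session.beginTransaction();
List reloaded = session.createCriteria(MyEntity.class).list();
// compare 'reloaded' to the expected entities field by field, as above
tx2.commit();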

I would implement the Object.equals(Object) method in your MyEntity class.
List.contains(Object) uses Object.equals(Object) (Source: Java 6 API) to determine if an Object is in this list.
The method session.createCriteria(MyEntity.class).list(); returns a list of new instances with the values you inserted (hopefully).
So you need to compare the values. This is easily done via the implementation of Object.equals(Object).
Clarification edit:
You could ignore the ids in your equals method, so that the comparison only cares about "real values".
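One possible shape for an id-ignoring equals, using the field names x and y from the asker's criteria code (a sketch only; java.util.Objects requires Java 7+, and hashCode must stay consistent with equals):

@Override
public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof MyEntity)) return false;
    MyEntity other = (MyEntity) obj;
    // id deliberately ignored; only the "real values" are compared
    return Objects.equals(x, other.x) && Objects.equals(y, other.y);
}

@Override
public int hashCode() {
    return Objects.hash(x, y); // must not include id either
}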
YAE (Yet Another Edit):
I recommend reading this article about the equals() method: Angelika Langer: Secrets Of Equal. It explains all background information very well.

Related

How to apply filters in DynamoDB in DAO layer?

I used this method to get data from DynamoDB. But how can I apply filters (a where clause) to get data by name and address?
public class CustomerDAO extends AbstractDAO implements DynamodbDAO {
    @Override
    public Class<Customer> getClazz() {
        return Customer.class;
    }

    public List<Customer> getByUserId(String userId) {
        return getDynamoDB().scan(
                getClazz(),
                new DynamoDBScanExpression().withFilterExpression(
                        "USER_ID = :userId")
                        .addExpressionAttributeValuesEntry(":userId",
                                new AttributeValue().withS(userId)));
    }
}
You need to apply a FilterExpression to your scan. Using this expression, you can apply filtering logic to any attribute on each item, as described further in the DynamoDB documentation. This filtering is applied server-side by DynamoDB, so only the results that match are returned to you, but you still "pay" for the read IO it took to fetch the data.
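For the asker's concrete case (filtering by name and address), the same pattern extends to multiple conditions. A sketch, where NAME and ADDRESS are hypothetical attribute names (NAME is a DynamoDB reserved word, hence the #n alias):

public List<Customer> getByNameAndAddress(String name, String address) {
    return getDynamoDB().scan(
            getClazz(),
            new DynamoDBScanExpression()
                    .withFilterExpression("#n = :name AND ADDRESS = :address")
                    .addExpressionAttributeNamesEntry("#n", "NAME")
                    .addExpressionAttributeValuesEntry(":name",
                            new AttributeValue().withS(name))
                    .addExpressionAttributeValuesEntry(":address",
                            new AttributeValue().withS(address)));
}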
On a related matter, scanning a DynamoDB table should typically be avoided. It requires reading every single item in the table and unless your table is very small you will incur a large amount of read IO and it will take a long time to return. You should try to accomplish your queries by using a hash or range key that allows you to directly get to the data you want. If this cannot apply to the primary hash/range key of the table you should look into creating global secondary indexes to allow you to access the data you want directly.
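To illustrate that last point, a query against a hypothetical global secondary index USER_ID-index (hash key USER_ID) would look roughly like this, and reads only the matching items instead of the whole table:

DynamoDBQueryExpression<Customer> query = new DynamoDBQueryExpression<Customer>()
        .withIndexName("USER_ID-index")    // hypothetical GSI name
        .withConsistentRead(false)         // GSIs only support eventually consistent reads
        .withKeyConditionExpression("USER_ID = :userId")
        .addExpressionAttributeValuesEntry(":userId", new AttributeValue().withS(userId));
List<Customer> customers = mapper.query(Customer.class, query); // 'mapper' is a DynamoDBMapper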

Pagination with DynamoDBMapper Java AWS SDK

From the API docs, DynamoDB does support pagination for scan and query operations. The catch here is to set the ExclusiveStartKey of the current request to the value of the LastEvaluatedKey of the previous request to get the next set (logical page) of results.
I'm trying to implement the same, but I'm using DynamoDBMapper, which seems to have a lot more advantages, like tight coupling with data models. So if I wanted to do the above, I'm assuming I would do something like below:
// Mapping of the hash key of the last item in the previous query operation
Map<String, AttributeValue> lastHashKey = ..
DynamoDBQueryExpression expression = new DynamoDBQueryExpression();
...
expression.setExclusiveStartKey(lastHashKey);
List<Table> nextPageResults = mapper.query(Table.class, expression);
I hope my above understanding is correct on paginating using DynamoDBMapper.
Secondly, how would I know that I've reached the end of the results? From the docs, if I use the following API:
QueryResult result = dynamoDBClient.query((QueryRequest) request);
boolean isEndOfResults = (result.getLastEvaluatedKey() == null); // the key is a Map, not a String
Coming back to DynamoDBMapper, how can I know that I've reached the end of the results in this case?
You have a couple different options with the DynamoDBMapper, depending on which way you want go.
query - returns a PaginatedQueryList
queryPage - returns a QueryResultPage
scan - returns a PaginatedScanList
scanPage - returns a ScanResultPage
The important part here is understanding the difference between these methods, and what functionality their returned objects encapsulate.
I'll go over PaginatedScanList and ScanResultPage, but these methods/objects basically mirror each other.
The PaginatedScanList documentation says the following:
Implementation of the List interface that represents the results from a scan in AWS DynamoDB. Paginated results are loaded on demand when the user executes an operation that requires them. Some operations, such as size(), must fetch the entire list, but results are lazily fetched page by page when possible.
This says that results are loaded as you iterate through the list. When you get through the first page, the second page is automatically fetched without you having to explicitly make another request. Lazy loading is the default strategy, but it can be overridden by calling the overloaded methods and supplying a DynamoDBMapperConfig with a different DynamoDBMapperConfig.PaginationLoadingStrategy.
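For instance, a config like the following (a sketch against the v1 SDK API) switches a scan to eager loading, so every page is fetched up front:

DynamoDBMapperConfig eagerConfig = new DynamoDBMapperConfig(
        DynamoDBMapperConfig.PaginationLoadingStrategy.EAGER_LOADING);
// All pages are fetched immediately instead of on demand during iteration.
PaginatedScanList<MyClass> results =
        mapper.scan(MyClass.class, new DynamoDBScanExpression(), eagerConfig);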
This is different from the ScanResultPage. You are given a page of results, and it is up to you to deal with the pagination yourself.
Here is a quick code sample showing example usage of both methods, which I ran against a table of 5 items using DynamoDBLocal:
final DynamoDBMapper mapper = new DynamoDBMapper(client);
final int limit = 2; // page size; the table in this example holds 5 items

// Using 'PaginatedScanList'
final DynamoDBScanExpression paginatedScanListExpression = new DynamoDBScanExpression()
        .withLimit(limit);
final PaginatedScanList<MyClass> paginatedList = mapper.scan(MyClass.class, paginatedScanListExpression);
paginatedList.forEach(System.out::println);

System.out.println();

// Using 'ScanResultPage'
final DynamoDBScanExpression scanPageExpression = new DynamoDBScanExpression()
        .withLimit(limit);
do {
    ScanResultPage<MyClass> scanPage = mapper.scanPage(MyClass.class, scanPageExpression);
    scanPage.getResults().forEach(System.out::println);
    System.out.println("LastEvaluatedKey=" + scanPage.getLastEvaluatedKey());
    scanPageExpression.setExclusiveStartKey(scanPage.getLastEvaluatedKey());
} while (scanPageExpression.getExclusiveStartKey() != null);
And the output:
MyClass{hash=2}
MyClass{hash=1}
MyClass{hash=3}
MyClass{hash=0}
MyClass{hash=4}
MyClass{hash=2}
MyClass{hash=1}
LastEvaluatedKey={hash={N: 1,}}
MyClass{hash=3}
MyClass{hash=0}
LastEvaluatedKey={hash={N: 0,}}
MyClass{hash=4}
LastEvaluatedKey=null

Where are Ember Data's commitDetails' created/updated/deleted OrderedSets set?

Ember Data's Adapter saves edited records in different groups of Ember.OrderedSets, namely: commitDetails.created, commitDetails.updated, and commitDetails.deleted.
model.save() from a model controller's createRecord() will be placed in the commitDetails.created group. model.save() from a model controller's acceptChanges will be placed in the commitDetails.updated group. But I can't find where in the code this placement happens.
I know that they are instantiated in Ember Transaction's commit function (which calls Adapter's commit, in turn calling Adapter's save). Throughout this process, I can't figure out where exactly the records are sorted according to the created/updated/deleted criteria.
I'm not quite clear what you're asking, but if you're looking for where records get added to their appropriate commitDetails set, I believe this is the line you're looking for, in the commitDetails property itself.
Here's the relevant code.
forEach(records, function(record) {
    if (!get(record, 'isDirty')) return;

    record.send('willCommit');

    var adapter = store.adapterForType(record.constructor);

    commitDetails.get(adapter)[get(record, 'dirtyType')].add(record);
});
Let's walk through it.
forEach(records, function(record) {
    if (!get(record, 'isDirty')) return;
The above says, for each record in the transaction, if it's not dirty, ignore it.
record.send('willCommit');
Otherwise, update its state to inFlight.
var adapter = store.adapterForType(record.constructor);
Get the record's adapter.
commitDetails.get(adapter)
Look up the adapter's created/updated/deleted trio object, which was instantiated at the top of this method here. It's simply an object with the 3 properties created, updated, and deleted, whose values are empty OrderedSets.
[get(record, 'dirtyType')]
Get the appropriate OrderedSet from the object we just obtained. For example, if the record we're on has been updated, get(record, 'dirtyType') will return the string updated. The brackets are just standard JavaScript property lookup, and so it grabs the updated OrderedSet from our trio object in the previous step.
.add(record);
Finally, add the record to the OrderedSet. On subsequent iterations of the loop, we'll add other records of the same type, so all created records get added to one set, all updated records get added to another set, and all deleted records get added to the third set.
What we end up with at the end of the entire method, and return from the property, is a Map whose keys are adapters and whose values are these objects with the three properties created, updated, and deleted. Each of those, in turn, is an OrderedSet of all the records in the transaction that have been created, updated, or deleted for that adapter, respectively.
Notice that this computed property is marked volatile, so it will get recomputed each time that someone gets the commitDetails property.

EclipseLink JPA: Can I run multiple queries from one builder?

I have a method that builds and runs a Criteria query. The query does what I want it to, specifically it filters (and sorts) records based on user input.
Also, the query size is restricted to the number of records on the screen. This is important because the data table can be potentially very large.
However, if filters are applied, I want to count the number of records that would be returned if the query was not limited. So this means running two queries: one to fetch the records and then one to count the records that are in the overall set. It looks like this:
public List<Log> runQuery(TableQueryParameters tqp) {
    // get the builder, query, and root
    CriteriaBuilder builder = em.getCriteriaBuilder();
    CriteriaQuery<Log> query = builder.createQuery(Log.class);
    Root<Log> root = query.from(Log.class);

    // build the requested filters
    Predicate filter = null;
    for (TableQueryParameters.FilterTerm ft : tqp.getFilterTerms()) {
        // this section runs through the user input and constructs the
        // predicate
    }
    if (filter != null) query.where(filter);

    // attach the requested ordering
    List<Order> orders = new ArrayList<Order>();
    for (TableQueryParameters.SortTerm st : tqp.getActiveSortTerms()) {
        // this section constructs the Order objects
    }
    if (!orders.isEmpty()) query.orderBy(orders);

    // run the query
    TypedQuery<Log> typedQuery = em.createQuery(query);
    typedQuery.setFirstResult((int) tqp.getStartRecord());
    typedQuery.setMaxResults(tqp.getPageSize());
    List<Log> list = typedQuery.getResultList();

    // if we need the result size, fetch it now
    if (tqp.isNeedResultSize()) {
        CriteriaQuery<Long> countQuery = builder.createQuery(Long.class);
        countQuery.select(builder.count(countQuery.from(Log.class)));
        if (filter != null) countQuery.where(filter);
        tqp.setResultSize(em.createQuery(countQuery).getSingleResult().intValue());
    }
    return list;
}
As a result, I call createQuery twice on the same CriteriaBuilder and I share the Predicate object (filter) between both of them. When I run the second query, I sometimes get the following message:
Exception [EclipseLink-6089] (Eclipse Persistence Services - 2.2.0.v20110202-r8913):
org.eclipse.persistence.exceptions.QueryException
Exception Description: The expression has not been initialized correctly. Only a single ExpressionBuilder should be used for a query. For parallel expressions, the query class must be provided to the ExpressionBuilder constructor, and the query's ExpressionBuilder must always be on the left side of the expression.
Expression: [ Base com.myqwip.database.Log]
Query: ReportQuery(referenceClass=Log )
    at org.eclipse.persistence.exceptions.QueryException.noExpressionBuilderFound(QueryException.java:874)
    at org.eclipse.persistence.expressions.ExpressionBuilder.getDescriptor(ExpressionBuilder.java:195)
    at org.eclipse.persistence.internal.expressions.DataExpression.getMapping(DataExpression.java:214)
Can someone tell me why this error shows up intermittently, and what I should do to fix this?
Short answer to the question: yes you can, but only sequentially.
In the method above, you start creating the first query, then start creating the second, then execute the second, then execute the first.
I had the exact same problem, though I don't know why it's intermittent.
In other words, you start creating your first query, and before having finished it, you start creating and executing another.
Hibernate doesn't complain, but EclipseLink doesn't like it.
If you start with the count query, execute it, and then create and execute the other query (what you've done by splitting it into two methods), EclipseLink won't complain.
see https://issues.jboss.org/browse/SEAMSECURITY-91
It looks like this posting isn't going to draw much more response, so I will close it out with how I resolved it.
Ultimately I ended up breaking my runQuery() method into two methods: runQuery() that fetches the records and runQueryCount() that fetches the count of records without sort parameters. Each method has its own call to em.getCriteriaBuilder(). I have no idea what effect that has on the EntityManager, but the problem has not appeared since.
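For reference, the count half ended up shaped roughly like this (a sketch; buildFilter is a hypothetical helper standing in for the inline predicate-building loop of the original method):

public long runQueryCount(TableQueryParameters tqp) {
    CriteriaBuilder builder = em.getCriteriaBuilder(); // a fresh builder for this query
    CriteriaQuery<Long> countQuery = builder.createQuery(Long.class);
    Root<Log> root = countQuery.from(Log.class);
    countQuery.select(builder.count(root));
    // Rebuild the predicate against this query's own root; never share a
    // Predicate instance between two CriteriaQuery objects.
    Predicate filter = buildFilter(builder, root, tqp);
    if (filter != null) countQuery.where(filter);
    return em.createQuery(countQuery).getSingleResult();
}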
Also, the DAO object that has these methods used to be @ApplicationScoped. It now has no declared scope, so it is constructed on demand by the various @RequestScoped and @ConversationScoped beans that use it. I don't know if this has any effect on the problem, but since it has not reappeared, I will use this as my code pattern from now on. Suggestions welcome.

Unit Testing & Primary Keys

I am new to Unit Testing and think I might have dug myself into a corner.
In your Unit Tests, what is the better way to handle primary keys?
Hopefully an example will paint some context. Say I create several instances of an object (let's say Person).
My unit test is to verify that the correct relationships are created.
My code creates Homer and his children, Bart and Lisa. He also has friends Barney, Karl, and Lenny.
I've separated my data layer with an interface. My preference is to keep the primary key simple, e.g. on save, Person.PersonID = new Random().Next(10000); instead of, say, Barney.PersonID = 9110, Homer.PersonID = 3243, etc.
It doesn't matter what the primary key is; it just needs to be unique.
Any thoughts?
EDIT:
Sorry, I haven't made it clear. My project is set up to use dependency injection. The data layer is totally separate. The focus of my question is: what is practical?
I have a class called Unique which produces unique objects (strings, integers, etc.). It makes sure they're unique per test by keeping an internal static counter. That counter value is incremented per key generated, and included in the key somehow.
So when I'm setting up my test
var foo = new Foo {
    ID = Unique.Integer()
};
I like this as it communicates that the value is not important for this test, just the uniqueness.
I have a similar class called Some that does not guarantee uniqueness. I use it when I need an arbitrary value for a test. It's useful for enums and entity objects.
None of these are thread-safe or anything like that; it's strictly test code.
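A rough Java rendering of such a helper (my reconstruction, not the poster's code; like theirs, deliberately not thread-safe):

public final class Unique {
    private static int counter = 0; // static per-test-run counter

    public static int integer() {
        return ++counter;
    }

    public static String string(String prefix) {
        // Embed the counter so two generated values cannot collide in one run.
        return prefix + "-" + (++counter);
    }
}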
There are several possible corners you may have dug yourself into that could ultimately lead to the question that you're asking.
Maybe you're worried about re-using primary keys and overwriting or incorrectly loading data that's already in the database (say, if you're testing against a dev database as opposed to a clean test database). In that case, I'd recommend you set up your unit tests to create their records' PKs using whatever sequence a normal application would or to test in a clean, dedicated testing database.
Maybe you're concerned about the efficacy of your code with PKs beyond a simple 1,2,3. Rest assured, this isn't something one would typically test for in a straightforward application, because most of it is outside the concern of your application: generating a number from a sequence is the DB vendor's problem, keeping track of a number in memory is the runtime/VM's problem.
Maybe you're just trying to learn what the best practice is for this sort of thing. I would suggest you set up the database by inserting records before executing your test cases, using the same facilities that your application itself will use to insert records; presumably your application code will rely on a database-vended sequence number for PKs, and if so, use that. Finally, after your test cases have executed, your tests should roll back any changes they made to the database to ensure the test is idempotent over multiple executions. This is my rough attempt at describing a design pattern called test fixtures.
Consider using GUIDs. They're unique across space and time, meaning that even if two different computers generated them at the exact same instant, they would be different. In other words, they're guaranteed to be unique. Random numbers are never a good choice; there is a considerable risk of collision.
You can generate a Guid using the static class and method:
Guid.NewGuid();
Assuming this is C#.
Edit:
Another thing, if you just want to generate a lot of test data without having to code it by hand or write a bunch of for loops, look into NBuilder. It might be a bit tough to get started with (Fluent methods with method chaining aren't always better for readability), but it's a great way to create a huge amount of test data.
Why use random numbers? Does the numeric value of the key matter? I would just use a sequence in the database and call nextval.
The essential problem with database unit testing is that primary keys do not get reused. Rather, the database creates a new key each time you create a new record, even if you delete the record with the original key.
There are two basic ways to deal with this:
Read the generated Primary Key, from the database and use it in your tests, or
Use a fresh copy of the database each time you test.
You could put each test in a transaction and roll the transaction back when the test completes, but rolling back transactions doesn't always work with Primary Keys; the database engine will still not reuse keys that have been generated once (in SQL Server anyway).
When a test executes against a database through another piece of code, it ceases to be a unit test. It is called an "integration test" because you are testing the interactions of different pieces of code and how they "integrate" together. Not that it really matters, but it's fun to know.
When you perform a test, the following things should occur:
Begin a db transaction
Insert known (possibly bogus) test items/entities
Call the (one and only one) function to be tested
Test the results
Rollback the transaction
These things should happen for each and every test. With NUnit, you can get away with writing steps 1 and 5 just once in a base class and then inheriting from it in each test class. NUnit will execute [SetUp]- and [TearDown]-decorated methods in a base class, as sketched below.
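The same begin/rollback skeleton, sketched with JUnit 4 and plain JDBC rather than the NUnit setup described above (the connection URL and class names are illustrative only):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import org.junit.After;
import org.junit.Before;

public abstract class TransactionalTestBase {
    protected Connection connection;

    @Before
    public void beginTransaction() throws SQLException {
        connection = DriverManager.getConnection("jdbc:hsqldb:mem:testdb"); // illustrative URL
        connection.setAutoCommit(false); // step 1: begin a db transaction
    }

    @After
    public void rollbackTransaction() throws SQLException {
        connection.rollback(); // step 5: undo everything the test changed
        connection.close();
    }
}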
In step 2, if you're using SQL, you'll have to write your queries such that they return the PK numbers back to your test code.
INSERT INTO Person(FirstName, LastName)
VALUES ('Fred', 'Flintstone');
SELECT SCOPE_IDENTITY(); --SQL Server example, other db vendors vary on this.
Then you can do this
INSERT INTO Person(FirstName, LastName, SpouseId)
VALUES('Wilma', 'Flintstone', @husbandId);
SET @wifeId = SCOPE_IDENTITY();
UPDATE Person SET SpouseId = @wifeId
WHERE Person.Id = @husbandId;
SELECT @wifeId;
or whatever else you need.
In step 4, if you use SQL, you have to re-SELECT your data and test the values returned.
Steps 2 and 4 are less complicated if you are lucky enough to be able to use a decent ORM like (N)Hibernate (or whatever).