neo4j functionality & unit testing - resetting the database

I've created an application in node.js and have Mocha tests to perform automated unit and functionality testing.
I'm now trying to test database functionality, and want the database to be reset between each test for consistency.
Solution 1
Before each test I was running:
MATCH (n) OPTIONAL MATCH (n)-[r]-() DELETE n,r
and then populating the database with Cypher queries obtained using the neo4j-shell dump command. However, the problem with this is that those Cypher queries use the internal Neo4j ids to create the relationships between nodes, and because the delete query above doesn't reset the internal Neo4j id counter to 0, it all goes wrong when you try to run them!
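The reset-and-reseed pattern itself is independent of the client stack; purely as an illustration (the question uses Node.js/Mocha, so this is just a sketch with the official Neo4j Java driver, and the Bolt URI, credentials and seed-file name are placeholders), the per-test reset could look roughly like this:
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;

import java.nio.file.Files;
import java.nio.file.Paths;

public class DatabaseReset {
    // Placeholder connection details - point these at the test instance.
    private static final Driver DRIVER =
            GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "test"));

    /** Wipe the graph, then replay the dumped seed statements before each test. */
    public static void reset() throws Exception {
        try (Session session = DRIVER.session()) {
            // Same cleanup query as above; internal ids are NOT reset, so the
            // seed statements must not rely on them.
            session.run("MATCH (n) OPTIONAL MATCH (n)-[r]-() DELETE n, r");

            // seed.cypher is assumed to hold one standalone statement per line.
            for (String statement : Files.readAllLines(Paths.get("seed.cypher"))) {
                if (!statement.trim().isEmpty()) {
                    session.run(statement);
                }
            }
        }
    }
}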
Solution 2
I then looked at physically shutting down the neo4j server, removing the database directory and then rebooting it and populating it. This works, but it takes around 15 seconds, which is useless when I've got 200+ unit tests to run!
Solution 3
I've also looked at using transactions so that the database could be rolled back once each test has completed, but it seems that all queries would have to go through the transaction endpoint. I don't think this is feasible.
Are there any other ways of doing this? I think solution 1 shows the most promise, but it'd mean going through and changing all my exported cypher queries to avoid using the internal neo4j ids.
For example I'd have to change:
create (_113:`User` {`firstname`:"John", `lastname`:"Smith", `uuid`:"f843c210-26e3-11e5-af31-297c662c0848"})
create (_114:`Instrument` {`name`:"Drums", `uuid`:"f84521a0-26e3-11e5-af31-297c662c0848"})
create _113-[:`PLAYS`]->_114
To:
create (_113:`User` {`firstname`:"John", `lastname`:"Smith", `uuid`:"f843c210-26e3-11e5-af31-297c662c0848"})
create (_114:`Instrument` {`name`:"Drums", `uuid`:"f84521a0-26e3-11e5-af31-297c662c0848"})
MATCH (a:User),(b:Instrument) WHERE a.uuid = 'f843c210-26e3-11e5-af31-297c662c0848' AND b.uuid = 'f84521a0-26e3-11e5-af31-297c662c0848' CREATE UNIQUE (a)-[r:`PLAYS`]->(b) RETURN r
Which is a real pain with a large dataset.
Any thoughts?

As FrobberOfBits kindly suggested, have a look at GraphAware RestTest, which was built precisely for this purpose.

Related

ZF2 Doctrine2 App is very slow because of doctrine method calls

The request time for the homepage of my app is about 5 seconds, although there are only 6 database queries. So I decided to install Xdebug with Webgrind on my local server to profile my app. There I can see that I have a huge number of Doctrine method calls, but I don't really know how to interpret this in order to reduce the number of those calls. Maybe someone could give me a hint.
RestaurantRepository
public function findByCity(City $city) {
    $queryBuilder = $this->createQueryBuilder('restaurant');
    $queryBuilder->addSelect('cuisines')
        ->addSelect('openingHours')
        ->addSelect('address')
        ->addSelect('zipCode')
        ->addSelect('city')
        ->leftJoin('restaurant.cuisines', 'cuisines')
        ->leftJoin('restaurant.openingHours', 'openingHours')
        ->leftJoin('restaurant.meals', 'meals')
        ->innerJoin('restaurant.address', 'address')
        ->innerJoin('address.zipCode', 'zipCode')
        ->innerJoin('zipCode.city', 'city')
        ->where('zipCode.city = :city')
        ->andWhere('restaurant.state <= :state')
        ->setParameter('city', $city)
        ->setParameter('state', Restaurant::STATE_ENABLED)
        ->orderBy('restaurant.state', 'ASC')
        ->addOrderBy('restaurant.name', 'ASC');

    return $queryBuilder->getQuery()->getResult();
}
You are probably loading all associations of some of your entities. It is hard to say where the problem is exactly without more information about your entity definitions and the queries you are executing.
The Doctrine documentation has some suggestions for improving performance (one of them is about lazy loading associations) that might help you get on your way.
Install and enable the ZendDeveloperToolbar module. There you will be able to check how many DB calls you are making with each action.
As you can see in your profiler output, there's a lot of hydration going on under the hood. There are a lot of tutorials on the net about how NOT to use Doctrine. I can't tell anything without looking at what you are doing with your entities.
Also make sure you have caching enabled in production mode so Doctrine doesn't have to parse the mapping information on each request, which is very heavy. You are probably using the annotation driver, which is the slowest one.
I can also see you are using the Zend autoloader, which is inefficient compared to Composer. Simply add your modules/srcs to the autoload section of the composer.json file and let Composer do the autoloading.

Entity Framework error during unit test

I'm using the entity framework.
In one of my unit tests I have a line like:
this.Set<T>().Add(entity);
On executing that line I get:
System.InvalidOperationException : The model backing the
'InvoiceNewDataContext' context has changed since the database was
created. Either manually delete/update the database, or call
Database.SetInitializer with an IDatabaseInitializer instance. For
example, the DropCreateDatabaseIfModelChanges strategy will
automatically delete and recreate the database, and optionally seed it
with new data.
Well I've actually deleted the database and removed the connection string.
I'm surprised this error is happening on adding, as I wouldn't expect it to happen until I saved the data and it discovered there was no database.
In previous projects/solutions, I have been able to add entities to the context for test purposes during unit tests without actually calling SaveChanges.
Would anyone know why this would be happening in my latest projects/solutions?
Are you sure it really didn't use a database in your previous projects? If you do not specify any connection string, it will silently use a default one pointing to a SQL Express database with a local .mdf file, so make sure that isn't happening now.

Check for live Data Source Name Before proceeding

Would it be OK to get a CF app to check for a valid database connection before proceeding to process the request?
This is because there may be instances where the database server is down or being upgraded, and hence an error occurs when a db-dependent request is made.
If there is no connection to the db server, the user can be safely redirected to a safe page.
Or can cfcatch work?
How can this check be done?
Thank you.
In your onRequestStart method of your Application.cfc file, or in an Application.cfm file, you can run a simple query to check that the database is available. Wrap the query in cftry/cfcatch. If the query fails, you can redirect the user in the cfcatch; if it succeeds, you can be reasonably sure that your database is "alive".
I've used such a check in one project. The code may look as follows (not sure if it will work in versions of ColdFusion lower than 8); consider this sample a chunk of a UDF written in CFScript:
// service factory object instance
factory = CreateObject("java","coldfusion.server.ServiceFactory");
// the datasource service
dsService = factory.DatasourceService;
// verify the dsn
return dsService.verifyDataSource(arguments.dsn);
Oh, I have even found a small note in the code I wrote on my old laptop a couple of years ago:
// [performance note] this server check takes 1-3ms at local PC (Kubuntu 7.10, CF8 + Apache2, Sempron 3500+, 1GB RAM)
While the time looks small, I have found that doing this check on each request is not really useful for my application. Anyway, I have a habit of using try/catch extensively for error handling. But if your datasources change frequently, it may make more sense.
Adding an extra query to every request to make sure that the database is up is a patently bad idea. A better approach would be to build a "maintenance mode" switch into your application, that you would manually enable when you are doing planned maintenance (upgrades, etc).
If you want to have a "friendly" page displayed when an error (like database issues) occur, then use the onError() method in Application.cfc and/or the <cferror .../> tag in Application.cfm, as a global error handler.
If you are worried the db could vanish, I would implement a "SELECT 1 AS A" query in your OnRequestStart handler that runs only every N minutes. This can be accomplished by using the query caching feature. I'd start with performing the query every 30 min.

JPA - How to truncate tables between unit tests

I want to cleanup the database after every test case without rolling back the transaction. I have tried DBUnit's DatabaseOperation.DELETE_ALL, but it does not work if a deletion violates a foreign key constraint. I know that I can disable foreign key checks, but that would also disable the checks for the tests (which I want to prevent).
I'm using JUnit 4, JPA 2.0 (Eclipselink), and Derby's in-memory database. Any ideas?
Thanks,
Theo
The simplest way to do this is probably using the nativeQuery jpa method.
@After
public void cleanup() {
    EntityManager em = entityManagerFactory.createEntityManager();
    em.getTransaction().begin();
    em.createNativeQuery("truncate table person").executeUpdate();
    em.createNativeQuery("truncate table preferences").executeUpdate();
    em.getTransaction().commit();
}
Simple: Before each test, start a new transaction and after the test, roll it back. That will give you the same database that you had before.
Make sure the tests don't create new transactions; instead reuse the existing one.
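For example, a minimal JUnit 4 sketch (the persistence unit name "test-pu" is just a placeholder, and it assumes the test and the code under test share this EntityManager):
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

import org.junit.After;
import org.junit.Before;

public abstract class TransactionalTestBase {

    protected static final EntityManagerFactory FACTORY =
            Persistence.createEntityManagerFactory("test-pu"); // placeholder unit name

    protected EntityManager em;

    @Before
    public void beginTransaction() {
        em = FACTORY.createEntityManager();
        em.getTransaction().begin();
    }

    @After
    public void rollbackTransaction() {
        // Undo everything the test did; the database ends up where it started.
        if (em.getTransaction().isActive()) {
            em.getTransaction().rollback();
        }
        em.close();
    }
}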
I am a bit confused as DBUnit will reinitialize the database to a known state before every test.
They also recommend as a best practice not to cleanup or otherwise change the data after the test.
So if it is cleanup you're after to prepare the db for the next test, I would not bother.
Yes, an in-transaction test would make your life much easier, but if rolling back a transaction is not an option then you need to implement compensating transaction(s) during cleanup (in @After). It sounds laborious, and it might be, but if properly approached you may end up with a set of helper methods (in the tests) that compensate for (clean up) the data accumulated during @Before and the tests themselves (using JPA or straight JDBC, whatever makes sense).
For example, if you use JPA and call create methods on entities during tests, you may utilize (using AOP if you fancy, or just helper test methods like us) a pattern across all tests to:
track ids of all entities that have been created during test
accumulate them in order created
replay entity deletes for these entities in reverse order in @After (roughly as sketched below)
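A rough sketch of such a helper (JUnit 4 and plain JPA; the class and method names are made up for illustration):
import java.util.ArrayDeque;
import java.util.Deque;

import javax.persistence.EntityManager;

import org.junit.After;

public abstract class CleanupTrackingTestBase {

    protected EntityManager em; // assumed to be initialised in an @Before method

    // Entities created by @Before methods and by the tests, newest first.
    private final Deque<Object> createdEntities = new ArrayDeque<>();

    /** Use this instead of em.persist() so the entity is remembered for cleanup. */
    protected <T> T persistTracked(T entity) {
        em.persist(entity);
        createdEntities.push(entity); // push/pop gives reverse creation order
        return entity;
    }

    @After
    public void compensatingCleanup() {
        em.getTransaction().begin();
        while (!createdEntities.isEmpty()) {
            Object entity = createdEntities.pop();
            // Re-attach if needed, then delete in reverse creation order
            // so foreign key constraints stay satisfied.
            em.remove(em.contains(entity) ? entity : em.merge(entity));
        }
        em.getTransaction().commit();
    }
}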
My setup is quite similar: it's Derby (embedded) + OpenJPA 1.2.2 + DBUnit. Here's how I handle integration tests for my current task: in every @Before method I run 3 scripts:
Drop DB — an SQL script that drops all tables.
Create DB — an SQL script that recreates them.
A test-specific DB unit XML script to populate the data.
My database has only 12 tables and the test data set is not very big, either — about 50 records. Each script takes about 500 ms to run and I maintain them manually when tables are added or modified.
This approach is probably not recommended for testing big databases, and perhaps it cannot even be considered good practice for small ones; however, it has one important advantage over rolling back the transaction in the @After method: you can actually detect what happens at commit (like persisting detached entities or optimistic lock exceptions).
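A rough JDBC-only sketch of that @Before step (the in-memory Derby URL and the drop.sql/create.sql file names are placeholders, and the statement splitting is deliberately naive):
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

import org.junit.Before;

public abstract class ScriptedDbTestBase {

    @Before
    public void rebuildSchema() throws Exception {
        try (Connection connection =
                     DriverManager.getConnection("jdbc:derby:memory:testdb;create=true");
             Statement statement = connection.createStatement()) {
            for (String script : new String[] {"drop.sql", "create.sql"}) {
                // Naive split on ';' - good enough for simple DDL scripts.
                String contents = new String(Files.readAllBytes(Paths.get(script)));
                for (String sql : contents.split(";")) {
                    if (!sql.trim().isEmpty()) {
                        statement.execute(sql);
                    }
                }
            }
        }
        // ...then load the test-specific DBUnit XML data set here.
    }
}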
Better late than never...
I just had the same problem and came across a pretty simple solution:
set the property "...database.action" to the value "drop-and-create" in your persistence-unit config
close the entity-manager and the entity-manager factory after each test
persistence.xml
<persistence-unit name="Mapping4" transaction-type="RESOURCE_LOCAL">
    <provider>org.eclipse.persistence.jpa.PersistenceProvider</provider>
    <class>...</class>
    <class>...</class>
    <properties>
        ...
        <property name="javax.persistence.schema-generation.database.action" value="drop-and-create" />
        ...
    </properties>
</persistence-unit>
unit-test:
...
@Before
public void setup() {
    factory = Persistence.createEntityManagerFactory(PERSISTENCE_UNIT_NAME);
    entityManager = factory.createEntityManager();
}

@After
public void tearDown() {
    entityManager.clear();
    entityManager.close();
    factory.close();
}
...
I delete the DB file after each run:
boolean deleted = Files.deleteIfExists(Paths.get("pathToDbFile"));
A little dirty but works for me.
Regards
Option 1: You can disable foreign key checks before truncating tables, and enable them again after truncation. This way the checks are still active during the tests themselves (see the sketch after this list).
Option 2: The H2 database destroys the in-memory database when the last connection is closed. I guess Derby supports something similar, or you can switch to H2.
See also: I wrote code to truncate tables before each test using Hibernate, in a related answer: https://stackoverflow.com/a/63747005/471214
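For illustration, a minimal sketch of option 1 as it would look on H2 (which supports SET REFERENTIAL_INTEGRITY); the statement for toggling constraint checking is database-specific, and the table names are placeholders:
import javax.persistence.EntityManager;

import org.junit.After;

public abstract class TruncatingTestBase {

    protected EntityManager em; // assumed to be created in an @Before method

    @After
    public void truncateAllTables() {
        em.getTransaction().begin();
        // H2-specific: switch off FK checks for the cleanup only.
        em.createNativeQuery("SET REFERENTIAL_INTEGRITY FALSE").executeUpdate();
        for (String table : new String[] {"PERSON", "PREFERENCES"}) { // placeholder tables
            em.createNativeQuery("TRUNCATE TABLE " + table).executeUpdate();
        }
        em.createNativeQuery("SET REFERENTIAL_INTEGRITY TRUE").executeUpdate();
        em.getTransaction().commit();
    }
}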

What is a sane way to perform a radical Django Model migration in a production environment?

I have an existing django web app that is in use. I have to radically migrate one key model in my design to a completely new design, but I want to cache all of the existing data for that model and migrate it to the new records in production when ready to deploy.
I can afford to bring my website down for a few hours one night and do whatever I need to do to migrate. What are some sane ways I can do this migration?
It seems any migration would need to:
1) Dump all of the existing data into some format, such as SQL, JSON, XML
2) Migrate the model to the new format
3) Reload the data into the new model using a conversion script
I also thought of trying to store all of the existing data in some other model called "OldModel" (if Model is the name of the existing model) and then migrating the data live.
There is a project to help with migrations that I've heard of: South.
Having said that, I admit we've not used it. We still plan our migrations using a file of SQL statements. Madness, I know, but it has the advantage of testability. You can run it as many times as necessary during development and staging testing before the "big deploy". It can be source controlled, diffed, etc. It can also, therefore, be called from a larger deployment script. Of course, we back up production before running it :-)
If your database does journaling, using the old-fashioned method has the added advantage that there is a transaction history that can be rolled back.
Experiments we've run with JSON, XML and "OldModel" -> "NewModel" style dumps have scaled pretty poorly. Mind you, YMMV... we have quite a large database. By using a script, you can run on your production database without having to offload or reload vast amounts of data. This way even a complicated migration can take seconds, rather than hours.
There are around 5 or 6 tools to help automate some portion of migrations. Several of them are listed in this question and I'll add the others just for completeness.
Next, see S. Lott's answer to this question about migration workflows for a great idea on using version numbers in the model name to make migrations easier, including structuring a standalone script to properly convert the tables. To my mind this is vastly superior to serializing the data for export and then trying to build your new tables by importing.
Finally, I haven't been able to think of a way to do a hot migration properly and haven't seen any hints from anywhere else either, so maintenance downtime is inevitable.
Make all migrations in steps!
If you need to add a field, go ahead and add it, with a default value or being optional. This is safe.
If you need to make an existing optional field required, give it a default first.
If you need to make an existing field with a default not have a default, drop the default after fixing all the code that creates instances.
If you need to change the type of a field, first add a new field that inherits the value from the current one. Then run a script to update the existing instances and populate the new field. Third, change all the code that uses the old field to use the new one. Finally, when no code is left using the original, you can drop it.
For every situation there is a small step you can make. For every bigger change, you can break it down into little ones. This is one place iterative development pays off. Keep good backups in place and don't be afraid to push often! Make the small changes quickly to see if they work.
If you are more comfortable with the Django ORM than with raw SQL, you might consider using Model -> BackupModel -> TestModel -> Model, where all but the last step can be performed without dropping data.
def backup(InModel, OutModel):
    in_objs = InModel.objects.all()
    for obj in in_objs:
        out_obj = OutModel.convert_from(InModel, obj)
        out_obj.save()
Here, you would just make sure that all your models have convert_from methods implemented. These should all be trivial conversions except for BackupModel -> TestModel. In the other cases, nothing but the class would change, all data being identically preserved.
The advantage to this is that before you go rewriting all your interfaces, you can play around with TestModel and make sure that your conversions were what you thought they'd be. If everything goes wrong, you convert from BackupModel->Model, and everything is okay. In a worst-case scenario, you give up on Django's ORM, run back to SQL, and simply rename all your tables that begin with backupmodel__* to model__* in your database.
Disclaimer: I've never done this.