How to remove a many-to-many relationship with spring-data-neo4j?

I have the following entities:
@NodeEntity(label = "A")
public class A {
    @Property(name = "something")
    private String someProperty;
    //... getters and setters
}
@NodeEntity(label = "B")
public class B {
    @Property(name = "someOtherThing")
    private String otherProperty;
    //... getters and setters
}
@RelationshipEntity(type = "AB")
public class AB {
    @StartNode
    private A start;
    @EndNode
    private B end;
    @Property(name = "evenOtherThing")
    private String prop;
    //... getters and setters
}
So, in this situation I have (:A)-[:AB]->(:B). I can have several ABs (meaning I can connect A to B several times, having different properties each time).
With that configuration I can save AB instances without problems, but when it comes to deleting just the relationship, I couldn't find a way to do so, using the spring-data-neo4j methods.
Things that I tried:
1- Custom query:
@Repository
public interface ABRepository extends GraphRepository<AB> {
    @Query("MATCH (a:A)-[ab:AB]->(b:B) WHERE a.something={something} DELETE ab")
    void deleteBySomething(@Param("something") String something);
}
Usage:
@Autowired
ABRepository repository;
//...
repository.deleteBySomething(something);
It didn't work as expected. The A node is removed together with the AB relationship. If I run the query directly against the database, it works as expected.
2- Delete from the repository:
@Repository
public interface ABRepository extends GraphRepository<AB> {
    @Query("MATCH (a:A)-[ab:AB]->(b:B) WHERE a.something={something} RETURN a,ab,b")
    Iterable<AB> findBySomething(@Param("something") String something);
}
Usage:
Iterable<AB> it = repository.findBySomething(something);
repository.delete(it);
Same stuff. The nodes are removed. I tried to iterate over the Iterable<AB> and remove the relationships one by one, without success as well.
3- Nulling the references of A and B inside AB and saving AB:
Same code of the repository, with a different usage:
Iterable<AB> it = repository.findBySomething(something);
for (AB ab : it) {
ab.setA(null);
ab.setB(null);
}
repository.save(it);
Here I'm just trying random stuff. It didn't work as expected. The framework throws an exception stating that the start and end nodes can't be null.
So, what am I doing wrong? What does it take to remove a simple relationship from the database using spring-data-neo4j, without removing the linking nodes?
For the record: my neo4j database is v.3.0.4 and my spring-data-neo4j is v.4.1.4.RELEASE. Running Java 8.

In the end the problem was a sum of two factors.
First: not mentioned in the question, but the way I saved the AB entity wasn't ideal. I was using repository.save(ab) directly, and that can make the framework do some magic with the A and B entities inside. To save just the relationship, without touching the related entities, repository.save(ab, 0) should be used; the second argument is the save depth.
Second: removing entities using a custom query is intuitively faster than fetching the entities and then removing them, so using that approach was my first goal. And here again I was confused by some magic behind the scenes, better described in this question: Spring Data Neo4j 4 returning cached results?
In summary, after removing entities or relationships using custom queries, I should clear the session:
@Autowired
Session session;
//...
repository.deleteBySomething(something);
session.clear();
These two tweaks fixed the weird behavior I was having with the framework.
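To picture why clearing the session helps, here is a toy model in plain Java. It is only a sketch of the general idea (a cache sitting in front of the store that a direct query bypasses), not how the OGM Session is actually implemented; ToySession, deleteByQuery and the map-backed store are hypothetical names made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of a session-level cache in front of a data store (not the real OGM Session). */
class ToySession {
    private final Map<Long, String> store;                    // stands in for the database
    private final Map<Long, String> cache = new HashMap<>();  // stands in for the session's cached state

    ToySession(Map<Long, String> store) {
        this.store = store;
    }

    /** Loads by id, preferring the cached copy over the store. */
    Optional<String> load(long id) {
        if (cache.containsKey(id)) {
            return Optional.of(cache.get(id));
        }
        Optional<String> fromStore = Optional.ofNullable(store.get(id));
        fromStore.ifPresent(value -> cache.put(id, value));
        return fromStore;
    }

    /** Simulates a custom DELETE query: it changes the store but bypasses the cache. */
    void deleteByQuery(long id) {
        store.remove(id);
    }

    /** Drops every cached entry, so the next load goes back to the store. */
    void clear() {
        cache.clear();
    }
}
```

In this model, load keeps returning the cached copy after deleteByQuery until clear() is called, which mirrors the stale results described in the linked question.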

Related

Extending SimpleNeo4jRepository in SDN 6

In SDN+OGM I used the following method to extend the base repository with additional functionality, specifically I want a way to find or create entities of different types (labels):
@NoRepositoryBean
public class MyBaseRepository<T> extends SimpleNeo4jRepository<T, String> {
    private final Class<T> domainClass;
    private final Session session;

    // The constructor name must match the class name
    public MyBaseRepository(Class<T> domainClass, Session session) {
        super(domainClass, session);
        this.domainClass = domainClass;
        this.session = session;
    }

    @Transactional
    public T findOrCreateByName(String name) {
        HashMap<String, String> params = new HashMap<>();
        params.put("name", name);
        params.put("uuid", UUID.randomUUID().toString());
        // we do not use queryForObject in case of broken data with non-unique names
        return this.session.query(
            domainClass,
            String.format("MERGE (x:%s {name:$name}) " +
                "ON CREATE SET x.creationDate = timestamp(), x.uuid = $uuid " +
                "RETURN x", domainClass.getSimpleName()),
            params
        ).iterator().next();
    }
}
This makes it so that I can simply add findOrCreateByName to any of my repository interfaces without the need to duplicate a query annotation.
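The part of findOrCreateByName that does not depend on the session can be pulled out and checked on its own. The sketch below does only that, in plain Java; FindOrCreateQuery, cypherFor and parametersFor are hypothetical names introduced for the illustration, and the query text is the one from the method above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Builds the label-specific MERGE statement and its parameters (hypothetical helper). */
class FindOrCreateQuery {

    /** Uses the simple class name as the node label, as the repository method above does. */
    static String cypherFor(Class<?> domainClass) {
        return String.format(
            "MERGE (x:%s {name:$name}) "
                + "ON CREATE SET x.creationDate = timestamp(), x.uuid = $uuid "
                + "RETURN x",
            domainClass.getSimpleName());
    }

    /** The uuid parameter is only applied by the query when the node is newly created. */
    static Map<String, String> parametersFor(String name) {
        Map<String, String> params = new HashMap<>();
        params.put("name", name);
        params.put("uuid", UUID.randomUUID().toString());
        return params;
    }
}
```

Because the label comes from a class name rather than from user input, formatting it into the query string is reasonably safe here; the name itself still travels as a query parameter.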
I know that SDN 6 supports the automatic creation of a UUID very nicely through @GeneratedValue(UUIDStringGenerator.class), but I also want to add the creation date in a generic way. The method above allows doing that in OGM, but in SDN 6 the API changed and I am a bit lost.
Well, sometimes it helps to write down things. I figured out that the API did not change that much. Basically the Session is replaced with Neo4jOperations and the Class is replaced with Neo4jEntityInformation.
But even more important is that SDN 6 has @CreatedDate, which makes my entire custom code redundant.

Initializing empty relationships in entities

I have entities with 1:1 or 1:M relations to other entities. All relations however are nullable.
I want to proxy some operations to the related entity. I'm giving example below. The problem is that if the relation still does not exist, I have null, so I'm ending up constantly checking for nulls, which obviously is wrong. What I would like to do is to hydrate my entities with empty objects. Reasons:
Doctrine knows what instance should be created for the field anyway. So it should just provide empty instance instead of null
I don't want to fill my code with initializations, like
$object->setSettings(new SettingsEntity)
If the requests should be proxied is somehow disputable, but I want to hide the DB representation from the client code. If my direction however is totally wrong, please point me to the right direction. I may accept that this is responsibility of the model, not of the entity, but Doctrine always returns entities to me
Sure, I can add the initialization either in the constructor of the entity, or provide a getter that creates a new instance of the object if one does not exist. There are a couple of reasons I don't want this:
I don't know how objects are actually hydrated. I assume such initialization should happen in an event and not in the constructor
I don't want to write the code for each entity (at some point, someone will forget to add the initialization in the getter) and want to make it automatically for each relation instead.
Some example code:
/**
 * SomeObject
 * @ORM\Entity()
 * @ORM\Table(
 *     name="some_object"
 * )
 */
class SomeObject implements DataTransfer {
    /**
     * @ORM\OneToOne(targetEntity="Settings", mappedBy="SomeObject")
     */
    protected $settings;

    public function getSettings() {
        return $this->settings;
    }

    public function get() {
        $record = new \stdClass();
        $record->id = $this->getId();
        ...
        $settingsObject = $this->getSettings();
        $record->someKey = $settingsObject ? $settingsObject->getSomeKey() : null;
        $record->someOtherKey = $settingsObject ? $settingsObject->getSomeOtherKey() : null;
        return $record;
    }
}
Any suggestions, including hacking Doctrine, are welcome.
P.S. Doctrine-ORM version is 2.3. I can upgrade if this will help solving the problem.
I won't discuss your proxy-thingie-theory: your code, your design, I don't have enough knowledge of these to have an opinion.
About you knowing how Doctrine hydrates its entities, you can see how it's done in \Doctrine\ORM\UnitOfWork::createEntity. It doesn't seem to invoke the constructor (uses \ReflectionClass::newInstanceWithoutConstructor, which obviously shouldn't use the constructor), but you may be interested in listening to Doctrine's post-load event (part of the lifecycle events logic).
About initializing your null properties, i.e. the code that your post-load event should trigger, you should begin by having a superclass over all of your entities: instead of class SomeObject implements DataTransfer {...}, you'd have class SomeObject extends MyEntity {...} (and have MyEntity implement DataTransfer to keep your interface). This MyEntity class would be a "mapped superclass", it would be annotated with @HasLifecycleCallbacks, and declare a method annotated with @PostLoad. There you have your hook to run your null-to-something code.
For this code to be generic (as it'd be coded from this superclass), you can rely on Doctrine's entity metadata, which retains association mappings and all data that the Unit Of Work needs to figure out its low-level DB-accessing business. It should look like the following:
/** @HasLifecycleCallbacks @MappedSuperclass ... */
class MyEntity implements DataTransfer {
    ...
    /** @PostLoad */
    public function doPostLoad(\Doctrine\Common\Persistence\Event\LifecycleEventArgs $event) { //the argument is needed here, and is passed only since 2.4! If you don't want to upgrade, you can work around by using event listeners, but it's more complicated to implement ;)
        $em = $event->getEntityManager();
        $this->enableFakeMappings($em);
    }

    private function enableFakeMappings(\Doctrine\ORM\EntityManager $em) {
        $mappings = $em->getClassMetadata(get_class($this))->getAssociationMappings(); //try and dump this $mappings array, it's full o'good things!
        foreach ($mappings as $mapping) {
            if (null === $this->{$mapping['fieldName']}) {
                $class = $mapping['targetEntity'];
                $this->{$mapping['fieldName']} = new $class(); //this could be cached in a static and cloned when needed
            }
        }
    }
}
Now, consider the case where you have to new an entity, and want to access its properties without the null values checks: you have to forge a decent constructor for this job. As you still need the Entity Manager, the most straightforward way is to pass the EM to the constructor. In ZF2 (and Symfony I believe) you can have a service locator injected and retrieve the EM from there. Several ways, but it's another story. So, the basic, in MyEntity:
public function __construct(\Doctrine\ORM\EntityManager $em) {
$this->enableFakeMappings($em);
}
Doing this, however, would probably confuse Doctrine when the entity is persisted: what should it do with all these instantiated empty objects? It'll cascade-persist them, which is not what you want (if it is, well, you can stop reading ;)). Sacrificing cascade-persisting, an easy solution would be something like this, still in your superclass:
/** @PrePersist */
public function doPrePersist(\Doctrine\Common\Persistence\Event\LifecycleEventArgs $event) {
    $em = $event->getEntityManager();
    $this->disableFakeMappings($em);
}

/** @PreUpdate */
public function doPreUpdate(\Doctrine\Common\Persistence\Event\LifecycleEventArgs $event) {
    $em = $event->getEntityManager();
    $this->disableFakeMappings($em);
}

private function disableFakeMappings(\Doctrine\ORM\EntityManager $em) {
    $uow = $em->getUnitOfWork();
    $mappings = $em->getClassMetadata(get_class($this))->getAssociationMappings(); //getClassMetadata() needs the class name
    foreach ($mappings as $mapping) {
        if (!$this->{$mapping['fieldName']} instanceof MyEntity) {
            continue;
        }
        //"reset" faked associations: assume they're fake if the object is not yet handled by Doctrine, which breaks the cascading auto-persist... risk nothing, gain nothing, heh? ;)
        //getEntityState() returns its second argument for objects the UnitOfWork does not know yet
        if (\Doctrine\ORM\UnitOfWork::STATE_NEW === $uow->getEntityState($this->{$mapping['fieldName']}, \Doctrine\ORM\UnitOfWork::STATE_NEW)) {
            $this->{$mapping['fieldName']} = null;
        }
    }
}
Hope this helps! :)

What's the lazy strategy and how does it work?

I have a problem. I'm learning JPA and using an embedded OpenEJB container in unit tests, but only @OneToMany(fetch=EAGER) works. Otherwise the collection is always null. I haven't found out how the lazy strategy works: how does the container fill in the data, and under which circumstances does it trigger the loading?
I have read, that the action triggers when the getter is being called. But when I have the code:
@OneToMany(fetch = LAZY, mappedBy="someField")
private Set<AnotherEntities> entities = new HashSet<AnotherEntities>(); // Set is an interface, so a concrete class is needed
...
public Set<AnotherEntities> getEntities() {
    return entities;
}
I'm always getting null. I think the LAZY strategy cannot be tested with an embedded container. The problem might also be in the bidirectional relation.
Does anybody else have similar experiences with JPA testing?
Attachments
The real test case with setup:
@RunWith(UnitilsJUnit4TestClassRunner.class)
@DataSet("dataSource.xml")
public class UnitilsCheck extends UnitilsJUnit4 {
private Persister prs;
public UnitilsCheck() {
Throwable err = null;
try {
Class.forName("org.hsqldb.jdbcDriver").newInstance();
Properties props = new Properties();
props.setProperty(Context.INITIAL_CONTEXT_FACTORY, "org.apache.openejb.client.LocalInitialContextFactory");
props.put("ds", "new://Resource?type=DataSource");
props.put("ds.JdbcDriver", "org.hsqldb.jdbcDriver");
props.put("ds.JdbcUrl", "jdbc:hsqldb:mem:PhoneBookDB");
props.put("ds.UserName", "sa");
props.put("ds.Password", "");
props.put("ds.JtaManaged", "true");
Context context = new InitialContext(props);
prs = (Persister) context.lookup("PersisterImplRemote");
}
catch (Throwable e) {
e.printStackTrace();
err = e;
}
TestCase.assertNull(err);
}
@Test
public void obtainNickNamesLazily() {
TestCase.assertNotNull(prs);
PersistableObject po = prs.findByPrimaryKey("Ferenc");
TestCase.assertNotNull(po);
Collection<NickNames> nicks = po.getNickNames();
TestCase.assertNotNull(nicks);
TestCase.assertEquals("[Nick name: Kutyafája, belongs to Ferenc]", nicks.toString());
}
}
The Persister bean mediates access to the entity beans. The crucial code of the class follows:
@PersistenceUnit(unitName="PhonePU")
protected EntityManagerFactory emf;

public PhoneBook findByPrimaryKey(String name) {
    EntityManager em = emf.createEntityManager();
    PhoneBook phonebook = (PhoneBook)em.find(PhoneBook.class, name);
    em.close();
    return phonebook;
}
The PhoneBook entity is one line of the phone book (i.e. one person). One person can have zero or more nick names. With the EAGER strategy it works. With LAZY the collection is always null. Maybe the problem is in the detaching of objects. (See OpenEJB - JPA Concepts, part Caches and detaching.) But the manual says that the collection can sometimes (in fact quite often) be empty, but not null.
The problem is in the life cycle of an entity. (Geronimo uses OpenJPA, so let's see the OpenJPA tutorial, part Entity Lifecycle Management.) The application uses container-managed transactions. Each method call on the Persister bean runs in its own transaction, and the persistence context depends on the transaction. The entity is detached from its context at the end of the transaction, thus at the end of the method. I tried getting the entity and, on the next line of the same method, getting the collection of nick names, and it worked. So the problem was identified: I cannot load any additional entity data from the data store without re-attaching the entity to a persistence context. The entity is re-attached by the EntityManager.merge() method.
The code needs more corrections. Because the entity cannot obtain the EntityManager reference and re-attach itself, the method returning nick names must be moved to the Persister class. (The comment Heureka marks the critical line re-attaching the entity.)
public Collection<NickNames> getNickNamesFor(PhoneBook pb) {
    //emf is an EntityManagerFactory reference
    EntityManager em = emf.createEntityManager();
    //merge() returns the managed copy; it needs its own name because the parameter pb cannot be redeclared
    PhoneBook attached = em.merge(pb); //Heureka!
    Collection<NickNames> nicks = attached.getNickNames();
    em.close();
    return nicks;
}
The collection is then obtained in this way:
//I have a PhoneBook instance pb
//pb.getNickNames() returns null only
//I have a Persister instance pe
nicks = pe.getNickNamesFor(pb);
That's all.
You can have a look at my second question on this topic, which I have asked on this forum. It is the question OpenJPA - lazy fetching does not work.
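The life-cycle explanation above can be pictured with a toy model in plain Java. This is a sketch under loud assumptions: ToyPersistenceContext, LazyCollection and reattach are hypothetical names, and returning null for an unloaded collection simply mirrors what was observed in this thread; a real JPA provider may behave differently (for example by throwing or by returning an empty collection).

```java
import java.util.List;
import java.util.function.Supplier;

/** Stands in for a persistence context that is open only for the duration of a transaction. */
class ToyPersistenceContext {
    private boolean open = true;

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

/** A collection that is fetched on first access, and only while its context is still open. */
class LazyCollection<T> {
    private final ToyPersistenceContext context;
    private final Supplier<List<T>> loader;  // stands in for the database query
    private List<T> loaded;                  // stays null until the first successful load

    LazyCollection(ToyPersistenceContext context, Supplier<List<T>> loader) {
        this.context = context;
        this.loader = loader;
    }

    /** Returns the elements, or null if nothing was loaded before the context closed. */
    List<T> get() {
        if (loaded == null && context.isOpen()) {
            loaded = loader.get();
        }
        return loaded;
    }

    /** Plays the role of merge(): binds the same loader to a context that is still open. */
    LazyCollection<T> reattach(ToyPersistenceContext newContext) {
        return new LazyCollection<>(newContext, loader);
    }
}
```

Accessing the collection before the context closes, or after re-attaching it to an open context, triggers the load; accessing it for the first time on a detached instance does not.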
How I would write the code
@Entity
public class MyEntity {
    @OneToMany(fetch = LAZY, mappedBy="someField")
    private Set<AnotherEntities> entities;

    // Constructor for JPA
    // Fields aren't initialized here so that each em.load
    // won't create unnecessary objects
    private MyEntity() {}

    // Factory method for the rest
    // Have field initialization with default values here
    public static MyEntity create() {
        MyEntity e = new MyEntity();
        e.entities = new HashSet<AnotherEntities>(); // Set is an interface and cannot be instantiated
        return e;
    }

    public Set<AnotherEntities> getEntities() {
        return entities;
    }
}
Idea no 2:
I just thought that the order of operations in EAGER and LAZY fetching may differ, i.e. EAGER fetching may
Declare field entities
Fetch value for entities (I'd assume null)
Set value of entities to new Set<T>()
while LAZY may
Declare field entities
Set value of entities to new Set<T>()
Fetch value for entities (I'd assume null)
Have to find a citation for this as well.
Idea no 1: (Not the right answer)
What if you'd annotate the getter instead of the field? This should instruct JPA to use getters and setters instead of field access.
In the Java Persistence API, an entity can have field-based or
property-based access. In field-based access, the persistence provider
accesses the state of the entity directly through its instance
variables. In property-based access, the persistence provider uses
JavaBeans-style get/set accessor methods to access the entity's
persistent properties.
From The Java Persistence API - A Simpler Programming Model for Entity Persistence

EF Code First issue on CommitTransaction - using Repository pattern

I am having an issue with EF 4.1 using "Code First". Let me set up my situation before I start posting any code. I have my DbContext class, called MemberSalesContext, in a class library project called Data.EF. I have my POCOs in a separate class library project called Domain. My Domain project knows nothing of Entity Framework, no references, no nothing. My Data.EF project has a reference to the Domain project so that my DB context class can wire up everything in my mapping classes located in Data.EF.Mapping. I am doing all of the mappings in this namespace using the EntityTypeConfiguration class from EntityFramework. All of this is pretty standard stuff. On top of Entity Framework, I am using the Repository pattern and the Specification pattern.
My SQL Server database table has a composite primary key defined. The three columns that are part of the key are Batch_ID, RecDate, and Supplier_Date. This table also has an identity column (database generated value => +1) called XREF_ID, which is not part of the PK.
My mapping class, located in Data.EF.Mapping looks like the following:
public class CrossReferenceMapping : EntityTypeConfiguration<CrossReference>
{
public CrossReferenceMapping()
{
HasKey(cpk => cpk.Batch_ID);
HasKey(cpk => cpk.RecDate);
HasKey(cpk => cpk.Supplier_Date);
Property(p => p.XREF_ID).HasDatabaseGeneratedOption(DatabaseGeneratedOption.Identity);
ToTable("wPRSBatchXREF");
}
}
My MemberSalesContext class (inherits from DbContext) looks like the following:
public class MemberSalesContext : DbContext, IDbContext
{
//...more DbSets here...
public DbSet<CrossReference> CrossReferences { get; set; }
//...more DbSets here...
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
modelBuilder.Conventions.Remove<IncludeMetadataConvention>();
//...more modelBuilder here...
modelBuilder.Configurations.Add<CrossReference>(new CrossReferenceMapping());
//...more modelBuilder here...
}
}
I have a private method in a class that uses my repository to return a list of objects that get iterated over. The list I am referring to is the outermost foreach loop in the example below.
private void CloseAllReports()
{
//* get list of completed reports and close each one (populate batches)
foreach (SalesReport salesReport in GetCompletedSalesReports())
{
try
{
//* aggregate sales and revenue by each distinct supplier_date in this report
var aggregates = BatchSalesRevenue(salesReport);
//* ensure that the entire SalesReport breaks out into Batches; success or failure per SalesReport
_repository.UnitOfWork.BeginTransaction();
//* each salesReport here will result in one-to-many batches
foreach (AggregateBySupplierDate aggregate in aggregates)
{
//* get the batch range (type) from the repository
BatchType batchType = _repository.Single<BatchType>(new BatchTypeSpecification(salesReport.Batch_Type));
//* get xref from repository, *if available*
//* some will have already populated the XREF
CrossReference crossReference = _repository.Single<CrossReference>(new CrossReferenceSpecification(salesReport.Batch_ID, salesReport.RecDate, aggregate.SupplierDate));
//* create a new batch
PRSBatch batch = new PRSBatch(salesReport,
aggregate.SupplierDate,
BatchTypeCode(batchType.Description),
BatchControlNumber(batchType.Description, salesReport.RecDate, BatchTypeCode(batchType.Description)),
salesReport.Zero_Sales_Flag == false ? aggregate.SalesAmount : 1,
salesReport.Zero_Sales_Flag == false ? aggregate.RevenueAmount : 0);
//* populate CrossReference property; this will either be a crossReference object, or null
batch.CrossReference = crossReference;
//* close the batch
//* see PRSBatch partial class for business rule implementations
batch.Close();
//* check XREF to see if it needs to be added to the repository
if (crossReference == null)
{
//*add the Xref to the repository
_repository.Add<CrossReference>(batch.CrossReference);
}
//* add batch to the repository
_repository.Add<PRSBatch>(batch);
}
_repository.UnitOfWork.CommitTransaction();
}
catch (Exception ex)
{
//* log the error
_logger.Log(User, ex.Message.ToString().Trim(), ex.Source.ToString().Trim(), ex.StackTrace.ToString().Trim());
//* move on to the next completed salesReport
}
}
}
All goes well on the first iteration of the outer loop. On the second iteration of the outer loop, the code fails at _repository.UnitOfWork.CommitTransaction(). The error message returned is the following:
"The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges."
In this situation, the database changes on the second iteration were not committed successfully, but the changes in the first iteration were. I have ensured that objects in the outer and inner loops are all unique, adhering to the database primary keys.
Is there something that I am missing here? I am willing to augment my code samples, if it proves helpful. I have done everything within my capabilities to troubleshoot this issue, minus modifying the composite primary key set on the database table.
Can anyone help??? Much thanks in advance! BTW, sorry for the long post!
I am answering my own question here...
My issue had to do with how the composite primary key was being defined in my mapping class. When defining a composite primary key using EF Code First, you must define it like so:
HasKey(cpk => new { cpk.COMPANYID, cpk.RecDate, cpk.BATTYPCD, cpk.BATCTLNO });
As opposed to how I had it defined previously:
HasKey(cpk => cpk.COMPANYID);
HasKey(cpk => cpk.RecDate);
HasKey(cpk => cpk.BATTYPCD);
HasKey(cpk => cpk.BATCTLNO);
The error I was receiving was that the ObjectContext contained multiple elements of the same type that were not unique. This became an issue in my UnitOfWork on CommitTransaction. This is because when the mapping class was instantiated from my DbContext class, it executed the 4 HasKey statements shown above, with only the last one, for property BATCTLNO, becoming the primary key (not composite). Defining them inline, as in my first code sample above, resolves the issue.
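The effect can be reproduced with a small language-neutral toy, written here in Java. It is only a sketch of the idea, not Entity Framework's ObjectStateManager: CrossReferenceRow, ToyIdentityMap, lastKeyOnly and compositeKey are hypothetical names, and the column names are the three key columns from the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** One row of the cross-reference table, reduced to its three key columns. */
final class CrossReferenceRow {
    final int batchId;
    final String recDate;
    final String supplierDate;

    CrossReferenceRow(int batchId, String recDate, String supplierDate) {
        this.batchId = batchId;
        this.recDate = recDate;
        this.supplierDate = supplierDate;
    }
}

/** Toy identity map: tracks at most one object per key value. */
final class ToyIdentityMap {
    private final Function<CrossReferenceRow, List<Object>> keySelector;
    private final Map<List<Object>, CrossReferenceRow> tracked = new HashMap<>();

    ToyIdentityMap(Function<CrossReferenceRow, List<Object>> keySelector) {
        this.keySelector = keySelector;
    }

    /** Registers a row; fails if another tracked row already has the same key. */
    void attach(CrossReferenceRow row) {
        List<Object> key = keySelector.apply(row);
        if (tracked.putIfAbsent(key, row) != null) {
            throw new IllegalStateException("key values conflict with another tracked object: " + key);
        }
    }

    /** Models separate key declarations where only the last one takes effect. */
    static ToyIdentityMap lastKeyOnly() {
        return new ToyIdentityMap(r -> Arrays.<Object>asList(r.supplierDate));
    }

    /** Models a single declaration that lists all three columns. */
    static ToyIdentityMap compositeKey() {
        return new ToyIdentityMap(r -> Arrays.<Object>asList(r.batchId, r.recDate, r.supplierDate));
    }
}
```

Two rows that differ only in Batch_ID are distinct in the database, yet they collide in the lastKeyOnly map and coexist in the compositeKey map, which matches the pattern of rows being written while the in-memory tracking fails.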
Hope this helps someone!

Is using a Singleton to pass credentials in a multi-tenant application a code smell?

I'm currently working on a multi-tenant application that employs Shared DB/Shared Schema approach. IOW, we enforce tenant data segregation by defining a TenantID column on all tables. By convention, all SQL reads/writes must include a Where TenantID = '?' clause. Not an ideal solution, but hindsight is 20/20.
Anyway, since virtually every page/workflow in our app must display tenant specific data, I made the (poor) decision at the project's outset to employ a Singleton to encapsulate the current user credentials (i.e. TenantID and UserID). My thinking at the time was that I didn't want to add a TenantID parameter to each and every method signature in my Data layer.
Here's what the basic pseudo-code looks like:
public class UserIdentity
{
public UserIdentity(int tenantID, int userID)
{
TenantID = tenantID;
UserID = userID;
}
public int TenantID { get; private set; }
public int UserID { get; private set; }
}
public class AuthenticationModule : IHttpModule
{
public void Init(HttpApplication context)
{
context.AuthenticateRequest +=
new EventHandler(context_AuthenticateRequest);
}
private void context_AuthenticateRequest(object sender, EventArgs e)
{
var userIdentity = _authenticationService.AuthenticateUser(sender);
if (userIdentity == null)
{
//authentication failed, so redirect to login page, etc
}
else
{
//put the userIdentity into the HttpContext object so that
//its only valid for the lifetime of a single request
HttpContext.Current.Items["UserIdentity"] = userIdentity;
}
}
}
public static class CurrentUser
{
public static UserIdentity Instance
{
get { return (UserIdentity)HttpContext.Current.Items["UserIdentity"]; } // Items stores object values, so a cast is needed
}
}
public class WidgetRepository: IWidgetRepository{
public IEnumerable<Widget> ListWidgets(){
var tenantId = CurrentUser.Instance.TenantID;
//call sproc with tenantId parameter
}
}
As you can see, there are several code smells here. This is a singleton, so it's already not unit test friendly. On top of that you have a very tight-coupling between CurrentUser and the HttpContext object. By extension, this also means that I have a reference to System.Web in my Data layer (shudder).
I want to pay down some technical debt this sprint by getting rid of this singleton for the reasons mentioned above. I have a few thoughts on what a better implementation might be, but if anyone has any guidance or lessons learned they could share, I would be much obliged.
CurrentUser isn't quite a singleton. I'm not exactly sure what you'd call it. (A singleton by definition can only exist one at a time, and any number of UserIdentity instances can be created at will by outside code and coexist without any issues.)
Personally, I'd take CurrentUser.Instance and either move it to UserIdentity.CurrentUser, or put it together with whatever similar "get the global instance" methods and properties you have. Gets rid of the CurrentUser class, at least. While you're at it, make the property settable at the same place -- it's already settable, just in a way that (1) would look like magic if the two classes weren't shown right next to each other, and (2) makes it harder to change later how the current user identity is set.
Doesn't get rid of the global, but you're not really gonna get around that without passing the UserIdentity to every function that needs it.
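For comparison, the alternative mentioned here, passing the UserIdentity explicitly, could be done once per repository through the constructor instead of once per method. The following is a minimal sketch of that idea in Java rather than C#, under the assumption that a repository is created per request; the Widget class, the in-memory list standing in for the table, and the constructor shape are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Immutable credentials of the current user, as in the question. */
final class UserIdentity {
    private final int tenantId;
    private final int userId;

    UserIdentity(int tenantId, int userId) {
        this.tenantId = tenantId;
        this.userId = userId;
    }

    int getTenantId() {
        return tenantId;
    }

    int getUserId() {
        return userId;
    }
}

/** Hypothetical row type carrying the tenant column. */
final class Widget {
    final int tenantId;
    final String name;

    Widget(int tenantId, String name) {
        this.tenantId = tenantId;
        this.name = name;
    }
}

/** Receives the identity once at construction, so no method needs a tenant parameter. */
final class WidgetRepository {
    private final UserIdentity identity;
    private final List<Widget> table;  // stands in for the database table

    WidgetRepository(UserIdentity identity, List<Widget> table) {
        this.identity = identity;
        this.table = table;
    }

    /** Returns only the widgets that belong to the tenant of the injected identity. */
    List<Widget> listWidgets() {
        List<Widget> result = new ArrayList<>();
        for (Widget widget : table) {
            if (widget.tenantId == identity.getTenantId()) {
                result.add(widget);
            }
        }
        return result;
    }
}
```

With this shape the data layer no longer needs to know where the identity came from, and a test can construct the repository with any identity it likes.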