Is it possible (without writing custom SQL) to have EclipseLink trust me as to whether to perform an update or insert on a merge, rather than performing a select and then an update or insert? If so, how?
In my mind, I'd like to use a transient flag and a custom if statement to determine whether the item is already in the database or not, and instruct EclipseLink to perform the required query. I understand Hibernate provides this via update() and save().
A few notable points:
I have a large number of objects being batch-merged, so persist() is not suitable for me (also, the objects do not exist in the library except when they are passed in for the merge anyway)
Because there are so many objects being merged, it is unlikely that there will be any cache hits, so EclipseLink has been unable to tell whether it has already sent an object in via the cache
Because of the number of objects going in (to a non-local database, in this case), the SELECTs are a problem, especially given that I can tell which will be required before the operation occurs
I'd really rather not switch to Hibernate
Thanks. Perhaps I am missing something obvious!
What you are looking for is called existence checking in EclipseLink, and can be configured using the @ExistenceChecking annotation as described here:
http://eclipse.org/eclipselink/documentation/2.4/jpa/extensions/a_existencechecking.htm
Try specifying @ExistenceChecking(ExistenceType.CHECK_CACHE). While the documentation states that CHECK_CACHE is the default, this applies to native EclipseLink projects; JPA projects default to CHECK_DATABASE to conform to the JPA specification, which requires that merge calls merge into the data from the database if necessary. Using CHECK_CACHE will prevent EclipseLink from querying at all, so you can query yourself based on your own criteria. Existing objects will need to be in the cache though; otherwise there is nothing to merge into, and EclipseLink will have to perform an insert.
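A minimal sketch of what that looks like on an entity (the OrderItem class and its fields are illustrative, not from the question):

import javax.persistence.Entity;
import javax.persistence.Id;
import org.eclipse.persistence.annotations.ExistenceChecking;
import org.eclipse.persistence.annotations.ExistenceType;

@Entity
@ExistenceChecking(ExistenceType.CHECK_CACHE) // skip the existence SELECT; rely on the cache only
public class OrderItem {
    @Id
    private Long id;
    // ... other fields, getters and setters
}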
Another option is to use a customizer to define the DoesExistQuery used for each class. This could allow you to override the checkEarlyReturn method to determine existence however you need.
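A rough sketch of the customizer route (the class name is illustrative, and this uses the simpler assume-existence switch rather than a fully custom DoesExistQuery):

import org.eclipse.persistence.config.DescriptorCustomizer;
import org.eclipse.persistence.descriptors.ClassDescriptor;

public class SkipExistenceCheckCustomizer implements DescriptorCustomizer {
    @Override
    public void customize(ClassDescriptor descriptor) {
        // Tell EclipseLink to assume objects of this class already exist, so no
        // existence SELECT is issued. A custom DoesExistQuery overriding
        // checkEarlyReturn could be set on the query manager here instead.
        descriptor.getQueryManager().assumeExistenceForDoesExist();
    }
}

The customizer is then attached to the entity with @Customizer(SkipExistenceCheckCustomizer.class) from org.eclipse.persistence.annotations.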
The above options still use the JPA merge, and so still require getting the existing data to merge into - meaning selects will still be issued for existing objects not in the cache. If all you are after is an update-all-style statement that updates or inserts the object as is, without tracking only what has changed, you might look at native EclipseLink functionality, such as the UnitOfWork API. Using something like ((EntityManagerImpl)em.getDelegate()).getUnitOfWork().updateObject(entity), or using the UnitOfWork to execute your own UpdateObjectQuery, would avoid the selects for existing objects, at the loss of only sending changes.
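A hedged sketch of the UpdateObjectQuery variant (em and entity are assumed to be an open EntityManager and a fully populated entity; whether this fits depends on your mappings):

import org.eclipse.persistence.internal.jpa.EntityManagerImpl;
import org.eclipse.persistence.queries.UpdateObjectQuery;
import org.eclipse.persistence.sessions.UnitOfWork;

UnitOfWork uow = ((EntityManagerImpl) em.getDelegate()).getUnitOfWork();
UpdateObjectQuery query = new UpdateObjectQuery(entity); // writes the object as is, no prior SELECT
uow.executeQuery(query);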
I found fixes for skipping the select call before insert/update in EclipseLink JPA:
https://www.eclipse.org/eclipselink/documentation/2.4/jpa/extensions/a_existencechecking.htm
For a project, I am trying to create a web app that, among other things, allows training of machine learning agents using Python libraries such as Dedupe or TensorFlow. In cases such as Dedupe, I need to provide an interface for active learning, which I currently realize through jQuery-based AJAX calls to a view that takes and sends the necessary training data.
The problem is that I need this agent object to stay alive throughout multiple view calls and be accessible by each individual call. I have tried realizing this via the built-in cache system using Memcached, but the serialization does not seem to keep all the info intact, and while I am technically able to restore the object from the cache, this appears to break the training algorithm.
Essentially, I want to keep the object alive within the application itself (rather than an external memory store) and be able to access it from another view, but I am at a bit of a loss as to how to realize this.
If someone knows the proper technique to achieve this, I would be very grateful.
Thanks in advance!
To follow up with this question, I have since realized that the behavior shown seemed to have been an effect of trying to use the result of a method call from the object loaded from cache directly in the return properties of a view. Specifically, my code looked as follows:
#model is the object loaded from cache
#this returns the wrong object (same object as on an earlier call)
return JsonResponse({"pairs": model.uncertain_pairs()})
and was changed to the following
#model is the object loaded from cache
#this returns the correct object (calls and returns the model.uncertain_pairs() method properly)
uncertain = model.uncertain_pairs()
return JsonResponse({"pairs": uncertain})
I am unsure whether this specifically happens due to something on the Dedupe or Django side, or due to Python itself, but it has undoubtedly fixed the issue.
To return to the question: Django does seem to be able to properly (de-)serialize objects and their properties in the cache, as long as the cache is set up properly (see Apparent bug storing large keys in django memcached, which I also had to deal with).
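For reference, a minimal sketch of the cache round trip this relies on (the 'dedupe_model' key is illustrative; uncertain_pairs() is the Dedupe call used above):

from django.core.cache import cache

# in the view that builds or updates the agent
cache.set('dedupe_model', model, timeout=None)  # timeout=None caches the object indefinitely

# in a later view call
model = cache.get('dedupe_model')
if model is None:
    # cache miss: re-initialise the agent or return an error
    ...
uncertain = model.uncertain_pairs()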
I want to perform a non-backwards-compatible state upgrade using SignatureConstraint. If it were a backwards-compatible change, for example adding a property, I'd just add a nullable property in the state and that would work. However, I don't have any idea how I should act in the following scenarios:
Scenario 1: A new non-null field is added to the state
Scenario 2: A field was removed from the state
Scenario 3: A field was modified in the state. E.g. a field of type Date was transformed into an object which contains that date and some other fields.
Scenario 4: A field in the state was renamed.
The problem is that explicit upgrades do not support SignatureConstraint and I get the error message "Legacy contract does not satisfy the upgraded contract's constraint", so I need to find a solution for an implicit upgrade.
ContractUpgradeFlow doesn't support upgrading states with SignatureConstraint. However, the flexibility of the signature constraint allows you to add any CorDapp as long as it's signed by the same key. You could easily write a simple flow to mimic an explicit upgrade for the scenarios you mentioned.
Here is what you could do:
Add both CorDapp JAR files (old and updated) to the node's cordapps folder.
Write another CorDapp with a flow that consumes your existing state and outputs the new (upgraded) state - see the sketch after this list.
Add this flow's JAR to the node's cordapps folder.
Execute the new flow to consume the older states and output the upgraded state.
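A minimal sketch of such a flow, assuming hypothetical CompanyStateV1/CompanyStateV2 state classes and a CompanyContractV2 contract, and that our node is the only participant (otherwise you would collect signatures and pass flow sessions to FinalityFlow):

import co.paralleluniverse.fibers.Suspendable
import net.corda.core.contracts.UniqueIdentifier
import net.corda.core.flows.FinalityFlow
import net.corda.core.flows.FlowLogic
import net.corda.core.flows.InitiatingFlow
import net.corda.core.flows.StartableByRPC
import net.corda.core.node.services.queryBy
import net.corda.core.node.services.vault.QueryCriteria
import net.corda.core.transactions.SignedTransaction
import net.corda.core.transactions.TransactionBuilder

@InitiatingFlow
@StartableByRPC
class UpgradeCompanyStateFlow(private val linearId: UniqueIdentifier) : FlowLogic<SignedTransaction>() {
    @Suspendable
    override fun call(): SignedTransaction {
        // Find the legacy state in the vault by its linear id.
        val criteria = QueryCriteria.LinearStateQueryCriteria(linearId = listOf(linearId))
        val oldStateAndRef = serviceHub.vaultService.queryBy<CompanyStateV1>(criteria).states.single()
        val old = oldStateAndRef.state.data

        // Build the replacement state; the field mapping is specific to your CorDapp.
        val upgraded = CompanyStateV2(linearId = old.linearId /*, convert remaining fields here */)

        val txBuilder = TransactionBuilder(oldStateAndRef.state.notary)
            .addInputState(oldStateAndRef)
            .addOutputState(upgraded, CompanyContractV2.ID)
            .addCommand(CompanyContractV2.Commands.Upgrade(), ourIdentity.owningKey)

        txBuilder.verify(serviceHub)
        val signedTx = serviceHub.signInitialTransaction(txBuilder)
        // With a single participant no counterparty sessions are required.
        return subFlow(FinalityFlow(signedTx, emptyList()))
    }
}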
Points to Note:
Make sure to have the correct set of signers to avoid incorrect spending of the states.
This is just the overall idea. The actual way of doing this might get a little complicated depending on the contract rules for the exit transaction of your state.
I would rather add a new upgrade command to cater to this scenario.
Hopefully you get the overall idea and can do the tweaking at your end to perform the upgrade for your use case. Hope this helps!
As a workaround, I made the incompatible change into a compatible one. Here is how it works.
I've created a state that has a propertiesV1 object. This object includes all the fields that CompanyState should have.
@CordaSerializable
@BelongsToContract(CompanyContract::class)
data class CompanyState(
    override val linearId: UniqueIdentifier,
    val propertiesV1: CompanyV1?
) : LinearState
Now when I need to make an incompatible change in the properties, I just add another version of the object to the state.
@CordaSerializable
@BelongsToContract(CompanyContract::class)
data class CompanyState(
    override val linearId: UniqueIdentifier,
    val propertiesV1: CompanyV1?,
    val propertiesV2: CompanyV2?
) : LinearState
Neither the contract nor the flows are changed; they are just updated to handle the propertiesV2 field.
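For example, updated contract or flow code can read whichever version is present along these lines (CompanyV2.fromV1 is a hypothetical conversion helper, not something Corda provides):

val properties: CompanyV2 = state.propertiesV2
    ?: state.propertiesV1?.let { CompanyV2.fromV1(it) }
    ?: throw IllegalArgumentException("CompanyState carries neither propertiesV1 nor propertiesV2")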
I'm working on an events site and have a one-to-many relationship between a production and its performances. When I have a performance object and need its production id, at the moment I have to do
$productionId = $performance->getProduction()->getId();
In cases when I literally just need the production id it seems like a waste to send off another database query to get a value that's already in the object somewhere.
Is there a way round this?
Edit 2013.02.17:
What I wrote below is no longer true. You don't have to do anything in the scenario outlined in the question, because Doctrine is clever enough to load the id fields into related entities, so the proxy objects will already contain the id, and it will not issue another call to the database.
Outdated answer below:
It is possible, but it is inadvisable.
The reason is that Doctrine tries to truly adhere to the principle that your entities should form an object graph, in which foreign keys have no place, because they are just "artifacts" of the way relational databases work.
You should instead:
rewrite the association to be eager loaded, if you always need the related entity
write a DQL query (preferably on a repository) to fetch-join the related entity - see the sketch after this list
let it lazy-load the related entity by calling a getter on it
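A minimal sketch of the fetch-join option in the question's terms (the entity and association names come from the question; the surrounding repository method is illustrative):

// e.g. in a PerformanceRepository method; the join hydrates the Production in the same query
$performance = $em->createQuery(
    'SELECT p, prod FROM Performance p JOIN p.production prod WHERE p.id = :id'
)
    ->setParameter('id', $performanceId)
    ->getOneOrNullResult();

$productionId = $performance->getProduction()->getId(); // no additional SELECT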
If you are not convinced and really want to avoid all of the above, there are two ways (that I know of) to get the id of a related object without triggering a load, and without resorting to tricks like reflection and serialization:
If you already have the object in hand, you can retrieve the inner UnitOfWork object that Doctrine uses internally and use its getEntityIdentifier() method, passing it the unloaded entity (the proxy object). It will return the id without triggering the lazy-load.
Assuming you have many-to-one relation, with multiple articles belonging to a category:
$articleId = 1;
$article = $em->find('Article', $articleId);
$categoryId = $em->getUnitOfWork()->getEntityIdentifier($article->getCategory());
As of 2.2, you will be able to use the IDENTITY DQL function to select just a foreign key, like this:
SELECT IDENTITY(u.Group) AS group_id FROM User u WHERE u.id = ?0
It is already committed to the development versions.
Still, you should really try to stick to one of the "correct" methods.
The Django documentation states that:
If you were relying on “automatic transactions” to provide locking
between select_for_update() and a subsequent write operation — an
extremely fragile design, but nonetheless possible — you must wrap the
relevant code in atomic().
Is the reason this no longer works that autocommit is done at the database layer and not the application layer? Previously the transaction would be held open until a data-altering function was called:
Django’s default behavior is to run with an open transaction which it commits automatically when any built-in, data-altering model
function is called
And since Django 1.6, with autocommit at the database layer, would a select_for_update followed by, for example, a write actually run in two transactions? If this is the case, then hasn't select_for_update become useless, since its point was to lock the rows until a data-altering function was called?
select_for_update will only lock the selected row within the context of a single transaction. If you're using autocommit, it won't do what you think it does, because each query will effectively be its own transaction (including the SELECT ... FOR UPDATE query). Wrap your view (or other function) in transaction.atomic and it'll do exactly what you're expecting it to do.
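A minimal sketch of that pattern (Account is an illustrative model, not from the question):

from django.db import transaction

from myapp.models import Account  # illustrative model

def add_funds(request, account_id):
    with transaction.atomic():
        # The SELECT ... FOR UPDATE lock is held until the atomic block commits or rolls back.
        account = Account.objects.select_for_update().get(pk=account_id)
        account.balance += 100
        account.save()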
This should be a simple scenario - I have a data model with a parent/child relationship. For example's sake, let's say it's Orders and OrderDetails - 1 Order -> many OrderDetails.
I'd like to expose the model via OData using a standard DataService, but with a few limitations.
First, I should only see my Orders. That's simple enough using EntitySetRights.ReadSingle and a QueryInterceptor to make sure the order is in fact mine.
So far, so good! But how can the associated OrderDetail records be exposed in the OData feed in a way that lets me read the OrderDetails for a specific (read: single) Order without giving access to the entire OrderDetails table?
In other words, I want to allow reading my details
myUrl.com/OrderService.svc/Orders(5)/OrderDetails <-- Good! My order is #5
but not everyone's details
myUrl.com/OrderService.svc/OrderDetails <-- Danger, Scary, Keep Out!
Thanks for the help!
This is so-called "containment" - your scenario is described exactly here: http://data.uservoice.com/forums/72027-wcf-data-services-feature-suggestions/suggestions/1012615-support-containment-hierarchical-models-in-odata?ref=title
WCF Data Services doesn't support this out of the box yet.
It is theoretically possible to implement such a restriction with a custom LINQ provider. In your LINQ implementation you could detect the expansion (not that hard) and in that case allow it, while preventing queries to the entity set itself (also rather easy to recognize). For more details on how the LINQ expressions look, please refer to this series: http://blogs.msdn.com/b/vitek/archive/2010/02/25/data-services-expressions-part-1-intro.aspx
It depends on what provider you wanted to use originally. If you already had a custom provider, this is not that hard. If you had a reflection-based provider, it is possible to layer this on top. If you had EF, this might be rather tricky (not sure if it's even possible).