read objects persisted but not yet flushed with doctrine

I'm new to Symfony2 and Doctrine.
Here is the problem as I see it.
I cannot use:
$repository = $this->getDoctrine()->getRepository('entity');
$my_object = $repository->findOneBy($index);
on an object that has been persisted but not yet flushed.
I think getRepository reads from the database, so it will not find an object that has not been flushed.
My question: how can I read objects that are persisted (I think they live somewhere in a "Doctrine session") so that I can reuse them before I flush my entire batch?
Every profile has 256 physical plumes.
Every profile has one plumeOptions record assigned to it.
In plumeOptions, I have a cartridgeplume field, which is a foreign key to PhysicalPlume.
Every plume is identified by an ID (auto-generated) and an index (user-generated).
Rule: say profile 1 has the plume with physical_plume_index 3 connected to it.
Now I want to copy a profile with all its related data to another profile.
A new profile is created, and 256 new plumes are created and copied from the old profile.
I want to link the new profile to the new plume with index 3.
See the code here: http://pastebin.com/WFa8vkt1

I think you might want to have a look at this function:
$entityManager->getUnitOfWork()->getScheduledEntityInsertions()
This returns the entity objects that have been persisted but not yet flushed (those scheduled for insertion).
Hmm, I didn't read your question carefully enough: the call above gives you the full list as an array, but you cannot query it the way you can with getRepository. I will try to find something for you.

I think you might be looking at the problem from the wrong angle. Doctrine is your persistence layer and database access layer. It is the responsibility of your domain model to provide access to objects once they are in memory. So the problem boils down to this: how do you get a reference to an object without going through the persistence layer?
Where do you create the object you need to get hold of later? Can the method or service that creates the object return a reference to the controller, so it can pass it on to the other place you need it? Can you dispatch an event that you listen to elsewhere in your application to get hold of the object?
In my opinion, Doctrine should be used at the startup of the application (as early as possible), to initialize the domain model, and at the shutdown of the application, to persist any changes to the domain model during the request. To use a repository to get hold of objects in the middle of a request is, in my opinion, probably a code smell and you should look at how the application flow can be refactored to remove that need.

This is effectively a business logic problem.
Running a findBy query against the database for objects that have not been flushed yet puts extra load on the DB layer just to look up objects you already have in your function's scope.
Also keep in mind that findOneBy will also match objects saved earlier that have the same attributes.
If you only need to search among the newly created objects, keep them in, for example, a session array variable and iterate over it with foreach.
If you need a mix of already saved items and some new items, treat the two parts separately: one with a foreach, the other with a repository query.
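The "keep the new objects in an array and look there first" idea can be sketched without touching the ORM at all. This is only an illustration in Python, not Doctrine's API: the class name PendingRegistry, the dictionary-shaped plumes and the fallback callable are all assumptions made up for the example.

```python
class PendingRegistry:
    """Holds references to objects created in this request, keyed by a lookup value."""

    def __init__(self):
        self._pending = {}

    def add(self, key, obj):
        # Remember the object so it can be found again before anything is flushed.
        self._pending[key] = obj
        return obj

    def find(self, key, fallback=None):
        # Look among the not-yet-flushed objects first, then ask the fallback
        # (for example a function that wraps a repository query).
        if key in self._pending:
            return self._pending[key]
        return fallback(key) if fallback is not None else None


# Hypothetical usage: copy 256 plumes to a new profile, then link index 3.
registry = PendingRegistry()
for index in range(256):
    registry.add(index, {"index": index, "profile": "new"})

linked = registry.find(3)                                 # found in memory
missing = registry.find(999, fallback=lambda key: None)   # falls through to the "database"
```

The point of the design is that the code creating the objects also hands out the references, so no database round trip is needed until the batch is flushed.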

Related

Django: Making sure a complex object is accessible throughout multiple view calls

For a project, I am trying to create a web app that, among other things, allows training of machine learning agents using Python libraries such as Dedupe or TensorFlow. In cases such as Dedupe, I need to provide an interface for active learning, which I currently implement through jQuery-based AJAX calls to a view that receives and sends the necessary training data.
The problem is that I need this agent object to stay alive throughout multiple view calls and be accessible by each individual call. I have tried realizing this via the built-in cache system using Memcached, but the serialization does not seem to keep all the info intact, and while I am technically able to restore the object from the cache, this appears to break the training algorithm.
Essentially, I want to keep the object alive within the application itself (rather than an external memory store) and be able to access it from another view, but I am at a bit of a loss of how to realize this.
If someone knows the proper technique to achieve this, I would be very grateful.
Thanks in advance!

To follow up with this question, I have since realized that the behavior shown seemed to have been an effect of trying to use the result of a method call from the object loaded from cache directly in the return properties of a view. Specifically, my code looked as follows:
#model is the object loaded from cache
#this returns the wrong object (same object as on an earlier call)
return JsonResponse({"pairs": model.uncertain_pairs()})
and was changed to the following
#model is the object loaded from cache
#this returns the correct object (calls and returns the model.uncertain_pairs() method properly)
uncertain = model.uncertain_pairs()
return JsonResponse({"pairs": uncertain})
I am unsure if this specifically happens due to an implementation from Dedupe or Django side or due to Python, but this has undoubtedly fixed the issue.
To return back to the question, Django does seem to be able to properly (de-)serialize objects and their properties in cache, as long as the cache is set up properly (see Apparent bug storing large keys in django memcached which I also had to deal with)

Repository pattern: isn't getting the entire domain object bad behavior (read method)?

A repository pattern is there to abstract away the actual data source and I do see a lot of benefits in that, but a repository should not use IQueryable to prevent leaking DB information and it should always return domain objects, not DTO's or POCO's, and it is this last thing I have trouble with getting my head around.
If a repository always has to return a domain object, doesn't that mean it fetches far too much data most of the time? Let's say it returns an employee domain object with forty properties, and the service and view layers consuming that object actually use only five of them.
That means the database has fetched a lot of unnecessary data and pumped it across the network. Doing that with one object is hardly noticeable, but if millions of records are pushed across that way and a lot of the data is thrown away every time, is that not considered bad behavior?
Yes, when adding, editing or deleting the object you will use the entire object, but reading the entire object and pushing it to another layer that uses only a fraction of it does not use the underlying database and network in the most optimal way. What am I missing here?

There's nothing preventing you from having a separate read model (which could be a separately stored projection of the domain or a query-time projection) and separating the command and query concerns, as in CQRS.
If you then put something like GraphQL in front of your read side then the consumer can decide exactly what data they want from the full model down to individual field/property level.
Your commands still interact with the full domain model as before (except where it's a performance no-brainer to use set based operations).
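To make the read-model idea a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the Employee and EmployeeSummary classes, the ROWS list standing in for a table, and the query helper standing in for a column-limited SELECT.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    """Full domain object, used by commands (write side)."""
    id: int
    name: str
    email: str
    salary: int
    department: str


@dataclass(frozen=True)
class EmployeeSummary:
    """Read model: only the fields a list view needs."""
    id: int
    name: str


# Stand-in for a database table.
ROWS = [
    {"id": 1, "name": "Ada", "email": "ada@example.com", "salary": 100, "department": "R&D"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "salary": 90, "department": "Ops"},
]


def query(columns, rows=ROWS):
    """Stand-in for SELECT <columns> FROM employees: returns only the requested columns."""
    return [{column: row[column] for column in columns} for row in rows]


def list_employee_summaries():
    """Query side: fetch only the columns the read model declares."""
    columns = [f.name for f in fields(EmployeeSummary)]
    return [EmployeeSummary(**row) for row in query(columns)]


def get_employee(employee_id):
    """Command side: load the full domain object for editing."""
    columns = [f.name for f in fields(Employee)]
    for row in query(columns):
        if row["id"] == employee_id:
            return Employee(**row)
    return None
```

Listing employees touches two columns, while editing one still loads the whole object, which is the split the answer describes.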

Django: How to depend on an external ID, which can be switched?

Consider the following scenario:
Our Django database objects must rely on IDs that are provided by external service A (ESA). This is because we use this ID to pull information about objects that aren't created yet directly from the external service. ESA might shut down soon, so we also pull information about the same objects from external service B (ESB) and save it as a fallback.
Because these IDs are relied on heavily in views and URLs, the ideal scenario would be to use a @property:
@property
def dynamic_id(self):
    return self.ESA_id
And then, if ESA shuts down, we can switch easily by changing dynamic_id to ESB_id. The problem with this though, is that properties cannot be used in queryset filters and various other scenarios, which is also a must in this case.
My current thought is to just save ESA_id, ESB_id, and dynamic_ID as regular fields separately and assign dynamic_ID = ESA_id, and then, in case ESA shuts down, simply go over the objects and do dynamic_ID = ESB_id.
But I feel there must be a better way?

Having ESA_id and ESB_id fields in the same table is a good solution. You can then add some kind of setting (DEFAULT_SERVICE_ID = 'ESA_id' | 'ESB_id') and have your code change the lookup based on this option.
Here you can see an approach to creating filters dynamically:
https://stackoverflow.com/a/310785/1448667
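A minimal sketch of the setting-driven lookup, assuming the field names ESA_id and ESB_id from the question. The helper name external_id_lookup and the model name Item are made up for illustration. The helper only builds the keyword arguments, so it runs without Django.

```python
# In a real project this would live in settings.py.
DEFAULT_SERVICE_ID = "ESA_id"  # switch to "ESB_id" if ESA shuts down


def external_id_lookup(value, suffix=""):
    """Build filter keyword arguments for whichever ID field is currently active.

    `suffix` is an optional Django field lookup such as "in" or "startswith".
    """
    key = DEFAULT_SERVICE_ID + (f"__{suffix}" if suffix else "")
    return {key: value}


# With a Django model (hypothetical name Item) this would be used as:
#   Item.objects.filter(**external_id_lookup("abc123"))
#   Item.objects.filter(**external_id_lookup(["a", "b"], suffix="in"))
```

Because the field name is resolved in one place, switching services is a one-line settings change and querysets keep working, which a property cannot offer.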

best practice dealing with non persistent properties, frequent updates

I am new to Realm and have not found a solution that satisfies me.
I have an application where I can record tours with GPS data and so on (multiple different objects are stored in Realm).
I created a Realm singleton that should do all my Realm work (update, create, delete) for my objects.
Now I have run into the following problem:
I start a tour and record it. First it is created, and everything is fine. Then I reach the point where I have to update my tour object, but only a few properties (basically each new GPS point updates it). An additional requirement is that there can be properties that are not persisted in Realm and exist only on the object instance.
One option is to call realm.add(object, update:true), which overwrites all properties.
I cannot write object.prop1 = asdf, object.pro2 = 345, because I have no write context at this level of my logic. So I can update with realm.create(type, updatedict, update:true) instead,
but the big downside of this approach is that I have to refetch the object to "know" about the changes on my object instance.
So updating some properties of an object results in:
Create a dictionary with the ID (primary key) and the properties to change.
Call update on my Realm singleton, passing all the necessary data.
Call a fetch on my Realm instance to get the new object again, which makes me lose the existing non-persisted property values.
I doubt I'm the first with such a requirement, but I could not find a solution.
Summary:
Realm singleton class handling all Realm actions within a write context
Different Realm object classes, which can have non-persistent properties
Need partial updates for some properties
Don't want Realm code in my view controllers' logic, only in the manager

It's hard to suggest something without any code examples but personally I think not having the ability to update the individual properties of your models is not a good idea.
I think you have 2 options:
Add a method to your RealmSingleton that allows you to get write context (to execute a block inside a write transaction), like:
func updateTour(updateBlock: (Tour) -> Void) {
    // Run the caller's changes inside a single write transaction.
    try! realm.write {
        updateBlock(currentTour)
    }
}
...
RealmSingleton.shared.updateTour { tour in
    tour.property = value
}
Add the convenience methods to update the individual properties of your Tour object:
RealmSingleton.shared.setTourProperty(value)

Repository Pattern

I've got a quick question regarding the use of repositories. But the best way to ask is to show a bit of pseudocode and you guys tell me what the result should be
Get a record from the repository with ID of 1 (assume it exists)
Edit a couple of properties
Query the repository again for an item with ID of 1
Result = ??
Should I get the object with updated values or the object without (original state), bearing in mind that since updating the values of properties (step 2) I have not told the repository to update this record.
I think I should get a copy of the original item and not a reference to the edited version.
Please tell me what is correct.
Cheers

The repository pattern is supposed to act like a collection of your objects, so ideally I think it should return the same object instance, which would have the updates in it.
Generally there is an identity map somewhere so your repositories can keep track of what has already been loaded. With an identity map, when you fetch an object with the same Id you should always get the already loaded object back regardless of how many times. This is how all more sophisticated ORMs work and is generally a good practice. An identity map helps keep things in sync while you are in the same transaction and saves you some data access.
NHibernate's session has an identity map it keeps track of so you don't have to worry about trying to implement your own in your repositories. Also I believe you can use NHibernate's stateless session if you want to load another instance without change tracking, but I'm not positive on that.
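For anyone who wants to see the identity map behavior spelled out, here is a minimal sketch in Python. It is not NHibernate's implementation. The Record and Repository classes and the dict used as storage are assumptions made for the example.

```python
class Record:
    def __init__(self, record_id, name):
        self.id = record_id
        self.name = name


class Repository:
    def __init__(self, storage):
        self._storage = storage    # dict of id -> row data, standing in for the database
        self._identity_map = {}    # id -> already loaded Record instance

    def get(self, record_id):
        # Return the instance loaded earlier, if any, so callers share one object.
        if record_id in self._identity_map:
            return self._identity_map[record_id]
        row = self._storage.get(record_id)
        if row is None:
            return None
        record = Record(record_id, row["name"])
        self._identity_map[record_id] = record
        return record

    def save(self, record):
        # Only an explicit save writes the changes back to storage.
        self._storage[record.id] = {"name": record.name}


storage = {1: {"name": "original"}}
repo = Repository(storage)

first = repo.get(1)       # step 1: load the record with ID 1
first.name = "edited"     # step 2: edit a property, without saving
second = repo.get(1)      # step 3: query the repository again
```

Here second is the very same instance as first, so it carries the edited name, while storage still holds "original" until save is called.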

Judging from your past questions I'm assuming you are using LINQ/C#?
If you are using a DataContext and you haven't called SubmitChanges() then you should get back the original unchanged object.
Just tested it. I was wrong, you get back the changed object.
If you set ObjectTrackingEnabled = false on the DataContext you will get the unchanged object.
