DDD external entity reference - web-services

I have an aggregate root named Project.
This Project can have a sub-entity named GitRepository. This entity represents access to an external git repository (like GitHub or Bitbucket).
Here's an example of how my persisted data should look (I use a NoSQL DB):
Project {
    id: String
    name: String
    gitRepository: GitRepository {
        externalRepoId: String // e.g. reference to the external GitHub repo ID
        externalToken: String
    }
}
The GitRepository entity contains properties that allow me to retrieve more data from the external service (for instance the repository name or collaborators).
What I don't understand about external references is how the aggregate root should be stored (I guess I should store the external GitHub ID references). But then how can I build my complete AR, filling in the data (collaborators, repository name, ...)?
Here's an example of how my aggregate root should look once data is fetched from the external service:
Project {
    id: String
    name: String
    gitRepository {
        externalRepoId: String
        externalToken: String
        repoName: String
        collaborators: List []
        repoCreatedAt: Date
        ...
    }
}
In that case I would have a model in my DB that is different from my domain model. Is that valid?
PS: Another option is to duplicate the data and store every piece of information I can get from GitHub, but in that case I can end up with inconsistencies (if the user updates their data on the external service).

I would really consider whether the external repository is a different aggregate root than the internal one. I know it seems ideologically simpler to say they are the same thing, but they clearly are not. The idea of a "Complete AR" for your project shouldn't include things that can be changed outside of your application. Create a separate AR for your external repo and I think most of your problems will go away.
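For illustration, a minimal sketch (TypeScript, with hypothetical names) of what that separation might look like: the Project aggregate persists only the reference, and a gateway resolves the external data when a screen actually needs it.
    interface GitRepositoryRef {
        externalRepoId: string;
        externalToken: string;
    }

    // The aggregate root you persist: it owns only the reference.
    interface Project {
        id: string;
        name: string;
        gitRepository: GitRepositoryRef;
    }

    // Read model owned by the external service, fetched on demand,
    // never stored inside Project.
    interface GitRepositoryDetails {
        repoName: string;
        collaborators: string[];
        repoCreatedAt: Date;
    }

    // An anti-corruption layer / domain service resolves the reference.
    interface GitHostingGateway {
        fetchDetails(ref: GitRepositoryRef): Promise<GitRepositoryDetails>;
    }
With this split, your persisted model and your domain model agree again: the external fields were never part of your aggregate in the first place.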

Related

Get all entities for namespace -- KindError while trying to fetch all entities of a namespace from datastore appengine python 2.7

I'm trying to get all entities from a namespace in order to delete them in a later step.
I'm using App Engine together with Datastore and the ndb lib, on Python 2.7.
I have a simple query to get all entities:
def get_entities(namespace_id):
    return [entity for entity in ndb.Query(namespace=namespace_id).fetch()]
I also modified it to skip the dunder-kind Datastore Statistics entities from the legacy bundled services:
def get_entities(namespace_id):
    return [entity for entity in ndb.Query(namespace=namespace_id).fetch() if not entity.key.id_or_name.startswith('__')]
Running locally with the Datastore emulator, this works just fine, but I get this error when deployed in the cloud:
KindError: No model class found for kind '__Stat_Ns_Kind_IsRootEntity__'. Did you forget to import it?
I found the post Internal Kinds Returned When Retrieving All Entities Belonging to a Particular Namespace, but no clear answer.
If you have another way to get all the entities for a specific namespace, that would be welcome too!
Per the documentation you have referenced, it's the kind name that begins and ends with two underscores.
Each statistic is accessible as an entity whose kind name begins and ends with two underscores
However, your code is checking for entity key names that start with underscores. You should be checking the kind instead.
Modify your code to
return [entity for entity in ndb.Query(namespace=namespace_id).fetch(keys_only=True) if not entity.kind().startswith('__')]
Note: I switched your query to fetch only keys, since all you want is to delete the records.
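Putting it together, a sketch of the full delete flow under the same assumptions (legacy bundled ndb on Python 2.7; the function name is illustrative):
    from google.appengine.ext import ndb

    def delete_namespace_entities(namespace_id):
        # Fetch only keys: this avoids instantiating models for kinds that
        # have no model class, which is what raises KindError for the
        # __Stat_* statistics kinds.
        keys = [key for key in ndb.Query(namespace=namespace_id).fetch(keys_only=True)
                if not key.kind().startswith('__')]
        ndb.delete_multi(keys)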

How do I give my JSON schema an absolute URL for its $id when I haven't published it yet because it hasn't been tested yet?

I'm putting together JSON schemas and I'd like to use $ref to DRY my schemas. I'll have many schemas that will each use common subschemas. I want to unit test my schemas before publishing them by writing unit tests that assert that, given certain input, the input is deemed valid or invalid, using a JSON schema library that I trust to be correct (so that I'm just testing my schemas, not the library).
Where I get confused is that in order to load my schemas before I've published them (which I want to do while running tests locally and during CI/CD), I need to use relative local paths like this:
"pet": { "$ref": "file://./schemas/components/pet.schema.json" }
That's because the pet schema hasn't been published to a URL yet -- it hasn't yet been verified correct by automated tests. This works well enough for running tests, and it also worked well for packaging inside a Docker image so that the schemas could be loaded from disk as the app starts up.
But then, if I were to give one of the top-level schemas (one that leverages $ref) to someone after publishing it to an absolute URL, it wouldn't load in their program, because of that path that only worked for my unit testing.
I found that I had to publish my schemas using absolute URLs in order for them to be used in consuming programs. I ended up publishing the schemas https://mattwelke.github.io/go-jsonschema-ref-docker-example/schemas/person.1-0-0.schema.json and https://mattwelke.github.io/go-jsonschema-ref-docker-example/schemas/components/pet.1-0-0.schema.json that way. I tested that they worked fine in a consuming program by writing the program:
package main

import (
    "fmt"

    "github.com/xeipuuv/gojsonschema"
)

func main() {
    schemaLoader := gojsonschema.NewReferenceLoader("https://mattwelke.github.io/go-jsonschema-ref-docker-example/schemas/person.1-0-0.schema.json")
    jsonStr := `
    {
        "name": "Matt",
        "pet": {
            "name": "Shady"
        }
    }
    `
    documentLoader := gojsonschema.NewStringLoader(jsonStr)
    result, err := gojsonschema.Validate(schemaLoader, documentLoader)
    if err != nil {
        panic(fmt.Errorf("could not validate: %w", err))
    }
    if result.Valid() {
        fmt.Printf("The document is valid.\n")
    } else {
        fmt.Printf("The document is not valid. See errors:\n")
        for _, desc := range result.Errors() {
            fmt.Printf("- %s\n", desc)
        }
    }
}
Which resulted in the following expected output:
The document is valid.
So I'm confused about this "chicken and egg" situation.
I was able to publish schemas that could be used, as long as I didn't unit test them before publishing them.
And I was able to unit test schemas as long as:
I didn't want to publish them in the form that was verified by the unit tests to be correct.
I was okay with my application loading them via HTTPS as it starts up instead of loading them from disk (which worries me, because I don't want a web server to be a point of failure for my app starting up).
I would appreciate some insight into how one might accomplish both goals.
Where I get confused is that in order to load my schemas before I've published them (which I want to do while running tests locally and during CI/CD), I need to use relative local paths
Your initial assumption is false. URIs used in the $id keyword can be arbitrary identifiers -- they do not need to be resolvable via the network or from disk at the stated location. In fact, it is an error for a JSON Schema implementation to assume it can find schema documents at the stated location: implementations MUST support loading documents locally and associating them with the stated identifier:
The "$id" keyword identifies a schema resource with its canonical URI.
Note that this URI is an identifier and not necessarily a network locator. In the case of a network-addressable URL, a schema need not be downloadable from its canonical URI.
source
A schema need not be downloadable from the address if it is a network-addressable URL, and implementations SHOULD NOT assume they should perform a network operation when they encounter a network-addressable URI.
source
Therefore, you can give your schema document any identifier you like, such as the URI you anticipate using when you eventually publish your schema for public consumption, and perform local testing using that identifier.
Any implementation that does not support doing this is in violation of the specification, and this should be reported to its maintainers as a bug.
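With the gojsonschema library from the question, for example, this might look like the following sketch (the file paths are hypothetical; it assumes each schema file declares its eventual published URL in its $id, so the $ref in the person schema resolves from the local registry without any network I/O):
    package main

    import (
        "fmt"

        "github.com/xeipuuv/gojsonschema"
    )

    func main() {
        sl := gojsonschema.NewSchemaLoader()

        // Register the subschema from local disk; it is indexed under the
        // $id declared inside the file (the eventual published URL).
        pet := gojsonschema.NewReferenceLoader("file:///app/schemas/components/pet.1-0-0.schema.json")
        if err := sl.AddSchemas(pet); err != nil {
            panic(err)
        }

        // Compile the root schema, also from disk; its $ref to the pet
        // schema's URL is satisfied from the registry above, not the network.
        person := gojsonschema.NewReferenceLoader("file:///app/schemas/person.1-0-0.schema.json")
        schema, err := sl.Compile(person)
        if err != nil {
            panic(err)
        }

        result, err := schema.Validate(gojsonschema.NewStringLoader(`{"name": "Matt", "pet": {"name": "Shady"}}`))
        if err != nil {
            panic(err)
        }
        fmt.Println("valid:", result.Valid())
    }
The same compiled schema can then be used in tests and at app startup, entirely from disk, while published consumers resolve the very same URIs over HTTPS.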

Doctrine: how to set referenceOne relationship without finding() the referenced document?

We need to create a document which references a document in another collection. We know the id of the document being referenced, and that's all we need to know.
our first approach is:
$referencedDocument=$repository->find($referencedId);
$newDocument->setUser($referencedDocument);
Now the question is whether we can do it somehow without the first line (and without hitting the database). In the DB (we use Mongo) the reference is just an integer field, and we know the target id, so find()ing the $referencedDocument seems redundant.
We tried to create a new User with just an id set, but that gets us an error during persisting.
Thanks!
In one of my projects I used something like this:
$categoryReference = $this->getEntityManager()->getReference(ProjectCategory::class, $category['id']);
Though, if you use Mongo, you probably need to use getDocumentManager() instead.
See the Doctrine docs (MongoDB ODM 1.0).
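A sketch of how that might look with the MongoDB ODM (assuming DocumentManager::getReference, which mirrors the ORM API; $dm and the class names are hypothetical):
    // getReference() returns an uninitialized proxy for the given id without
    // querying the database -- enough to persist the relationship.
    $userReference = $dm->getReference(User::class, $referencedId);
    $newDocument->setUser($userReference);
    $dm->persist($newDocument);
    $dm->flush();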

Entity framework using Data Repository pattern - DeepLoading

I have been implementing a new project in which I have decided to use the repository pattern and Entity Framework.
I have successfully implemented basic CRUD methods and have now moved on to my DeepLoads.
From all the examples and documentation I can find to do this I need to call something like this:
public Foo DeepLoadFoo()
{
    return (from foobah in Context.Items.Include("bah").Include("foo").Include("foofoo") select foobah).Single();
}
This doesn't work for me. Maybe I am trying to be too lazy, but what I would like to achieve is something along the lines of this:
public Foo DeepLoadFoo(Foo entity, Type[] childTypes)
{
    return (from foobah in Context.Items.Include(childTypes) select foobah).Single();
}
Is anything like this possible, or am I stuck with include.include.include.include?
Thanks
This blog post mentions that the Entity Framework ObjectContext has all the metadata about entities and their properties. So maybe you can use that metadata to walk the properties of your entity, and their child properties, etc.
In other words, I believe you should be able to use the metadata to automatically compose Include calls on your query.
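For example, a sketch (not tied to a specific EF version; the names are illustrative) of composing Include calls from a list of navigation property paths, which gets close to the DeepLoadFoo signature above:
    using System.Data.Objects; // legacy ObjectContext-era Entity Framework

    public static class ObjectQueryExtensions
    {
        // Chains one Include per navigation path; Include returns a new query,
        // so we just keep reassigning.
        public static ObjectQuery<T> IncludeAll<T>(this ObjectQuery<T> query, params string[] paths)
        {
            foreach (var path in paths)
            {
                query = query.Include(path);
            }
            return query;
        }
    }

    // Usage:
    // var foo = Context.Items.IncludeAll("bah", "foo", "foofoo").Single();
The paths could in turn be derived from the context's metadata rather than hard-coded, per the blog post's suggestion.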

Constructing a Domain Object from multiple DTOs

Suppose you have the canonical Customer domain object. You have three different screens on which Customer is displayed: External Admin, Internal Admin, and Update Account.
Suppose further that each screen displays only a subset of all of the data contained in the Customer object.
The problem is: when the UI passes data back from each screen (e.g. through a DTO), it contains only that subset of a full Customer domain object. So when you send that DTO to the Customer Factory to re-create the Customer object, you have only part of the Customer.
Then you send this Customer to your Customer Repository to save it, and a bunch of data will get wiped out because it isn't there. Tragedy ensues.
So the question is: how would you deal with this problem?
Some of my ideas:
include an argument to the Repository indicating which part of the Customer to update, and ignore the others
when you load the Customer, keep it in static memory, or in the session, or wherever, and then when you receive one of the DTOs from the UI, update only the parts relevant to the DTO
IMO, both of these are kludges. Are there any other better ideas?
#chadmyers: Here is the problem.
Entity has properties A, B, C, and D.
DTO #1 contains properties for B and C.
DTO #2 contains properties for C and D.
UI asks for DTO #1, you load entity from the repository, convert it into DTO #1, filling in only B and C, and give it to the UI.
Now UI updates B and sends the DTO back. You recreate the entity and it has only B and C filled in because that is all that is contained in the DTO.
Now you want to save the entity, which has only B and C filled in, with A and D null/blank. The repository has no way of knowing if it should update A and D in persistence as blanks, or whether it should ignore them.
I would use the factory to load a complete Customer object from the repository upon receipt of the DTO. After that you can update only the fields that were specified in the DTO.
That also allows you to apply some optimistic concurrency on your customer by checking last-updated timestamp, for example.
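A sketch of that timestamp check (C#, hypothetical names throughout):
    var customer = customerRepository.GetById(dto.Id);
    if (customer.LastUpdated != dto.LastUpdated)
    {
        // Someone else changed the Customer since this DTO was handed out.
        throw new InvalidOperationException("Customer was modified concurrently.");
    }
    // ...copy the DTO's fields onto customer and save...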
Is this a web app? Load the customer object from the repo, update it from the DTO, save it back. That doesn't seem like a kludge to me. :)
UPDATE: As per your updates (the A, B, C, D example)
So what I was thinking is that when you load the entity, it has A, B, C, and D filled in. If DTO#1 only updates B & C, that's OK. A and D are unaffected (which is the desired situation).
What the repository does with the B & C updates is up to it. If you're using Hibernate/NHibernate, for example, it will just figure it out and issue an update.
Just because DTO #1 only has B & C doesn't mean you have to also null out A & D. Just leave them alone.
I missed the point of this question at first because it is predicated on a few things that I don't think make sense from a design perspective.
Hydrating an entity from the repository and then converting it to a DTO is a waste of effort. I assume that your DAL passes a DTO to your repository which then converts it to a full entity object. So converting it back to a DTO seems wasteful.
Having multiple DTOs makes sense if you have a search results page that shows a high volume of records and only displays part of your entity data. In that case it's efficient to pass that page just the data it needs. It does not make sense to pass a DTO that contains partial data to a CRUD page. Just give it a full DTO or even a full entity object. If it doesn't use all of the data, fine, no harm done.
So the main problem is that I don't think you should pass data to these pages using partial DTOs. If you used a full DTO, I would do the following 3 steps whenever the save action is performed:
1. Pull the full DTO from the repository or db
2. Update the DTO with any changes made through the form
3. Save the full DTO back to the repository or db
This method requires an extra db hit but that's really not a significant issue on a CRUD form.
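In code, those three steps might look like this sketch (hypothetical names):
    public void SaveCustomer(CustomerEditForm form)
    {
        // 1. Pull the full DTO from the repository
        var dto = customerRepository.GetById(form.Id);

        // 2. Overwrite only the fields the form actually edits;
        //    everything else keeps its loaded value
        dto.B = form.B;
        dto.C = form.C;

        // 3. Save the full DTO back
        customerRepository.Save(dto);
    }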
If we have an understanding that a Repository handles (almost exclusively) a very rich domain Entity, then your numerous DTOs could simply map back.
i.e.
dtoUser.MapFrom<In,Out>(Entity)
or
dtoAdmin.MapFrom<In,Out>(Entity)
You would do the reverse to get the DTO information back into the Entity, and so on. So your repository only saves rich Entities, NOT numerous DTOs:
entity.Foo = dtoUser.Foo
or
entity.Bar = dtoAdmin.Bar
entityRepository.Save(entity) <-- do not pass a DTO
The whole point of DTOs is to keep things simple for the presentation layer, or say for WCF data transfer; they have nothing to do with the Repository or the Entity, for that matter.
Furthermore, you should never construct an Entity from DTOs... the only two ways to ever acquire an Entity are through a Factory (new) or a Repository (existing), respectively.
You mention storing the Entity somewhere; why would you do this? That is the job of your repository. It will decide where to get the Entity (db, cache, etc.) -- no need to store it somewhere else.
Hope that helps assign responsibility in your domain. It is always a challenge, and there are gray areas here and there, but in general these are the typical uses of Repository, DTO, etc.