How can I avoid duplicating data in a document database like RavenDB? - document-database

Given that document databases, such as RavenDB, are non-relational, how do you avoid duplicating data that multiple documents have in common? How do you maintain that data if it's okay to duplicate it?

With a document database you have to duplicate your data to some degree. What that degree is will depend on your system and use cases.
For example, if we have simple blog and user aggregates, we could set them up as:
public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Username { get; set; }
    public string Password { get; set; }
}
public class Blog
{
    public string Id { get; set; }
    public string Title { get; set; }
    public class BlogUser
    {
        public string Id { get; set; }
        public string Name { get; set; }
    }
}
In this example I have nested a BlogUser class inside the Blog class, carrying the Id and Name properties of the User aggregate associated with the Blog. I have included only these fields because they are the only ones the Blog class is interested in; it doesn't need to know the user's username or password when the blog is being displayed.
These nested classes depend on your system's use cases, so you have to design them carefully. The general idea is to design aggregates that can be loaded from the database with a single read and that contain all the data required to display or manipulate them.
This then leads to the question of what happens when the User.Name gets updated.
With most document databases you would have to load all the instances of Blog which belong to the updated User and update the Blog.BlogUser.Name field and save them all back to the database.
RavenDB is slightly different, as it supports set-based updates: you can run a single update against RavenDB that updates the BlogUser.Name property of the user's blogs without having to load and update them all individually.
The code for doing the update within RavenDB (the manual way) for all the blogs would be:
public void UpdateBlogUser(User user)
{
    // Load every blog that embeds a copy of this user's data.
    var blogs = session.Query<Blog>("blogsByUserId")
        .Where(b => b.BlogUser.Id == user.Id)
        .ToList();
    foreach (var blog in blogs)
        blog.BlogUser.Name = user.Name; // assignment, not comparison
    session.SaveChanges();
}
I've added in the SaveChanges just as an example. The RavenDB Client uses the Unit of Work pattern and so this should really happen somewhere outside of this method.

There's no one "right" answer to your question IMHO. It truly depends on how mutable the data you're duplicating is.
Take a look at the RavenDB documentation for lots of answers about document DB design vs. relational, but specifically check out the "Associations Management" section of the Document Structure Design Considerations document. In short, document DBs reference other documents by ID when they don't want to embed shared data in a document. These IDs are not like foreign keys: the database does not enforce them, so it is entirely up to the application to maintain their integrity and to resolve them.
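As a rough illustration of referencing by ID, the sketch below stores only the author's ID on the blog and lets the application resolve it. The Dictionary is a hypothetical stand-in for the document store, and BlogRef, ReferenceDemo and ResolveAuthorName are names made up for this example; none of this is RavenDB API.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical variant of Blog that references its author by ID
// instead of embedding a copy of the author's name.
public class BlogRef
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string UserId { get; set; }
}

public static class ReferenceDemo
{
    // The application, not the database, resolves the reference and
    // decides what to do when the referenced document is missing.
    public static string ResolveAuthorName(BlogRef blog, IDictionary<string, User> users)
    {
        User user;
        if (blog.UserId != null && users.TryGetValue(blog.UserId, out user))
            return user.Name;
        return "(unknown user)";
    }
}
```

With this shape a rename touches only the User document, at the cost of a second lookup whenever a blog is displayed.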

Related

Logging/Debugging Mapping Errors in Glass Mapper SC 4.0.2.10

Does anyone know of a way to force Glass Mapper SC to throw exceptions for mapping errors? It appears to swallow them, and I'm left with null properties and no easy way to diagnose the problem. The tutorials don't really dive deep into attribute configuration, so I'm forced to do a lot of TIAS which slows down development.
I'd also settle for any method that other users have found helpful for diagnosing mapping issues.
Example
Here is the template for the items I'm retrieving and attempting to map:
Here is one of the items that I am returning with my query:
Here is the model that matches the template:
[SitecoreType(AutoMap = true)]
public class UnitDetails
{
    //[SitecoreField("ID"), SitecoreId]
    public virtual Guid ID { get; set; }
    [SitecoreField("Pre-Recycled Percentage")]
    public virtual decimal PreConsumerRecycledPercentage { get; set; }
    [SitecoreField("Post-Recycled Percentage")]
    public virtual decimal PostConsumerRecycledPercentage { get; set; }
    public virtual Plant Plant { get; set; }
    [SitecoreField("Raw Material")]
    public virtual RawMaterial RawMaterial { get; set; }
    [SitecoreField("Raw Material Origin")]
    public virtual RawMaterialOrigin RawMaterialOrigin { get; set; }
}
Even if you forget the RawMaterial and RawMaterialOrigin properties for the moment (those don't map either), the decimal properties do not map. Also, the ID property will always be null unless I name it exactly (ID). I thought the [SitecoreField("ID"), SitecoreId] decorator was supposed to provide the hint to Glass. Here is an example of the mapped data. No exception is thrown:
I understand this is an old thread and might have been resolved already, but since I have now hit and resolved this a second time (I forgot to write it down last time :D), I thought I'd record it here.
I was upgrading to v5 of Glass Mapper and followed the attribute-based configuration, which is the default. It is documented here, but on top of that I added:
1) Templates on classes
[SitecoreType(AutoMap = true, TemplateId = "<Branch Id>")]
2) The Id field should be declared as follows in your code:
[SitecoreId]
public virtual Guid Id { get; set; }
3) The Sitecore service changes mentioned in the article: wherever the Sitecore Service is used (MVC / WebForms), I passed lazy load as false and infer type as true. This was a really important step.
I hope this will help me the next time I visit this issue. :D

Map two templates for different sites

We are developing a multisite Sitecore solution where each site can have its own News and can also display combined news from the other sites.
Problem:
Each site has its own News requirements: about 90% of the template fields match, but the remaining 10% differ.
For example, Site-A has a news template with an Authors drop-down list, where the author list is authored on a Configuration node, whereas Site-B has a news template where Author is a free-text field.
Therefore, when Glass Mapper automatically tries to map the Authors field, it fails for the free-text one.
Solution:
This could be resolved by making Author a drop-down on all sites, but the product owners don't want this.
The other solution is to map the news fields from both sources manually, or to use AutoMap etc.
Desired Solution:
Glass Mapper automatically resolves and populates the Author text field or drop-down field on the fly.
Is the above possible?
Thank you.
I would solve this by "fluent configuration", http://glass.lu/Mapper/Sc/Tutorials/Tutorial8.aspx.
Combined with the new Delegate feature added to the Glass Mapper recently.
The Delegate feature was originally introduced and described here: http://cardinalcore.co.uk/2014/07/02/controlling-glass-fields-from-your-own-code/
Nuget package for the Delegate feature: https://www.nuget.org/packages/Cardinal.Glass.Extensions.Mapping/
You can use Infer types as follows:
public interface IBaseNews
{
    string Author { get; set; }
    //List all other shared fields below
}
[SitecoreType(TemplateId="....", AutoMap = true)]
public class NewsSiteA : IBaseNews
{
    [SitecoreField]
    public string Author { get; set; }
    //List all fields which are unique for SiteA
}
[SitecoreType(TemplateId="....", AutoMap = true)]
public class NewsSiteB : IBaseNews
{
    [SitecoreField]
    public string Author { get; set; }
    //List all fields which are unique for SiteB
}
Now your code should be:
IBaseNews newsClass = NewsItem.GlassCast<IBaseNews>(true,true);
//You can use Author property now
Firstly, I would recommend updating to the latest version of Glass for many other reasons including the delegate feature.
From the infer type example in the comment: you shouldn't use GlassCast; use CreateType(Item item) from the Sitecore service / context. If you adopt the version that includes Delegate, there is now an official Cast(Item item) on the Sitecore service instead.
Also, the example there would not solve the difference in type. Delegate would make this very easy. Remember that with Delegate there is no lazy loading, but this shouldn't matter in this case.
public interface INews
{
    // All my other fields
    string Author { get; set; }
}
The fluent configuration would be something like this (to be done in GlassScCustom):
var sitecoreType = new SitecoreType<INews>();
sitecoreType.Delegate(y => y.Author).GetValue(GetAuthor);
fluentConfig.Add(sitecoreType);

private string GetAuthor(SitecoreDataMappingContext arg)
{
    Item item = arg.Item;
    if (item.TemplateID == <templateid>)
    {
        // return the value from the drop link
    }
    return item["Authors"];
}

MinLength constraint on ICollection fails, Entity Framework

This is the datamodel I have:
public class Team
{
    [Key]
    public int Id { get; set; }
    [Required]
    public string Name { get; set; }
    [MinLength(1)]
    public virtual ICollection<User> Users { get; set; }
}
My issue is that when I later try to create a new Team (that has one user) I get the following issue when the context is saving.
An unexpected exception was thrown during validation of 'Users' when invoking System.ComponentModel.DataAnnotations.MinLengthAttribute.IsValid. See the inner exception for details.
The inner exception is the following:
{"Unable to cast object of type 'System.Collections.Generic.List`1[MyNameSpace.Model.User]' to type 'System.Array'."}
Here is the code for the actual saving (which for now is in the controller):
if (ModelState.IsValid)
{
    team.Users = new List<User>();
    team.Users.Add(CurrentUser); // CurrentUser is a property that gives me the currently active User (MyNamespace.Model.User).
    DB.Teams.Add(team); // DB is a DbContext object that holds DbSets of all my models
    DB.SaveChanges();
    return RedirectToAction("Index");
}
So, what's going on here? Am I doing something wrong, or is there something else happening?
I do not believe that you will be able to use the MinLength attribute for what you are trying to achieve. Here is the MSDN page for the MinLength attribute. Based on its description: "Specifies the minimum length of array or string data allowed in a property." So it can only be used against strings and arrays, which matches the inner exception: the attribute tries to cast your List<User> to System.Array and fails. You may need to create your own custom ValidationAttribute to handle your scenario.
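A custom attribute along those lines might look like the sketch below. MinCountAttribute is a hypothetical name, not a framework type; it overrides ValidationAttribute.IsValid and counts the items of any collection. It falls back to plain enumeration because some collection types (HashSet<T>, for example) do not implement the non-generic ICollection.

```csharp
using System;
using System.Collections;
using System.ComponentModel.DataAnnotations;

// Hypothetical attribute for this example; not part of the framework.
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field)]
public sealed class MinCountAttribute : ValidationAttribute
{
    private readonly int _minCount;

    public MinCountAttribute(int minCount)
    {
        _minCount = minCount;
    }

    public override bool IsValid(object value)
    {
        // Treat null as valid and leave null checks to [Required].
        if (value == null) return true;

        // Fast path: List<T>, arrays and many others expose Count directly.
        var collection = value as ICollection;
        if (collection != null) return collection.Count >= _minCount;

        // Fallback: count by enumerating, stopping once the minimum is reached.
        var enumerable = value as IEnumerable;
        if (enumerable == null) return false;

        int count = 0;
        foreach (object item in enumerable)
        {
            if (++count >= _minCount) return true;
        }
        return count >= _minCount;
    }
}
```

On the Team class, [MinCount(1)] would then take the place of [MinLength(1)] on the Users property.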

How to model structures such as family trees in document databases

I have been looking into document databases, specifically RavenDb, and all the examples are clear and understandable. I just can't find any example where we do not know beforehand how many levels a given structure has. As an example how would you persist a family tree given the following class:
public class Person
{
    public string Name { get; set; }
    public Person Parent { get; set; }
    public Person[] Children { get; set; }
}
In most examples I have seen, we search for the aggregate root and make it into a document. It is just not so obvious here what the aggregate root and its boundary are.
Ayende has just posted a blog post that answers this.
I guess for RavenDb, you'd have to keep the Ids in your object:
public class Person
{
    public string Name { get; set; }
    public string ParentId { get; set; }
    public string[] ChildrenIds { get; set; }
}
Check this page, especially at the bottom, for more info: http://ravendb.net/documentation/docs-document-design
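To show how a tree of unknown depth can be walked when each person only stores IDs, here is a minimal sketch. It assumes each Person also carries its own Id (not shown in the class above), and the Dictionary is a hypothetical stand-in for loading a document by ID; FamilyTree and AncestorNames are made-up names.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public string Id { get; set; }          // assumed; the document's own ID
    public string Name { get; set; }
    public string ParentId { get; set; }
    public string[] ChildrenIds { get; set; }
}

public static class FamilyTree
{
    // Walks up the ParentId chain, one lookup per level, so the depth
    // of the tree does not need to be known in advance.
    public static List<string> AncestorNames(string personId, IDictionary<string, Person> people)
    {
        var names = new List<string>();
        Person current;
        if (personId == null || !people.TryGetValue(personId, out current))
            return names;

        // Remember visited IDs so corrupt data with a cycle cannot loop forever.
        var visited = new HashSet<string> { personId };
        while (current.ParentId != null
               && visited.Add(current.ParentId)
               && people.TryGetValue(current.ParentId, out current))
        {
            names.Add(current.Name);
        }
        return names;
    }
}
```

Each level costs one load, so deep trees mean several round trips unless the store can batch or include the referenced documents.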

How do I update with a newly-created detached entity using NHibernate?

Explanation:
Let's say I have an object graph that's nested several levels deep and each entity has a bi-directional relationship with each other.
A -> B -> C -> D -> E
Or in other words, A has a collection of B and B has a reference back to A, and B has a collection of C and C has a reference back to B, etc...
Now let's say I want to edit some data for an instance of C. In WinForms, I would use something like this:
C instanceOfC;
using (var session = SessionFactory.OpenSession())
{
    // get the instance of C with Id = 3
    instanceOfC = session.Linq<C>().Single(x => x.Id == 3);
}
SendToUIAndLetUserUpdateData(instanceOfC);
using (var session = SessionFactory.OpenSession())
{
    // re-attach the detached entity and update it
    session.Update(instanceOfC);
}
In plain English, we grab a persistent instance out of the database, detach it, give it to the UI layer for editing, then re-attach it and save it back to the database.
Problem:
This works fine for Winform applications because we're using the same entity all throughout, the only difference being that it goes from persistent to detached to persistent again.
The problem is that now I'm using a web service and a browser, sending over JSON data. The entity gets serialized into a string, and de-serialized into a new entity. It's no longer a detached entity, but rather a transient one that just happens to have the same ID as the persistent one (and updated fields). If I use this entity to update, it will wipe out the relationship to B and D because they don't exist in this new transient entity.
Question:
My question is, how do I serialize detached entities over the web to a client, receive them back, and save them, while preserving any relationships that I didn't explicitly change? I know about ISession.SaveOrUpdateCopy and ISession.Merge() (they seem to do the same thing?), but this will still wipe out the relationships if I don't explicitly set them. I could copy the fields from the transient entity to the persistent entity one by one, but this doesn't work too well when it comes to relationships and I'd have to handle version comparisons manually.
I solved this problem by using an intermediate class to hold data coming in from the web service, then copying its properties to the database entity. For example, let's say I have two entities like so:
Entity Classes
public class Album
{
    public virtual int Id { get; set; }
    public virtual ICollection<Photo> Photos { get; set; }
}
public class Photo
{
    public virtual int Id { get; set; }
    public virtual Album Album { get; set; }
    public virtual string Name { get; set; }
    public virtual string PathToFile { get; set; }
}
Album contains a collection of Photo objects, and Photo has a reference back to the Album it's in, so it's a bidirectional relationship. I then create a PhotoDTO class:
DTO Class
public class PhotoDTO
{
    public virtual int Id { get; set; }
    public virtual int AlbumId { get; set; }
    public virtual string Name { get; set; }
    // note that the DTO does not have a PathToFile property
}
Now let's say I have the following Photo stored in the database:
Server Data
new Photo
{
    Id = 15,
    Name = "Fluffy Kittens",
    Album = Session.Load<Album>(3)
};
The client now wants to update the photo's name. They send over the following JSON to the server:
Client Data
PUT http://server/photos/15
{
    "id": 15,
    "albumid": 3,
    "name": "Angry Kittens"
}
The server then deserializes the JSON into a PhotoDTO object. On the server side, we update the Photo like this:
Server Code
var photoDTO = DeserializeJson();
var photoDB = Session.Load<Photo>(photoDTO.Id); // or use the ID in the URL
// copy the properties from photoDTO to photoDB
photoDB.Name = photoDTO.Name;
photoDB.Album = Session.Load<Album>(photoDTO.AlbumId);
Session.Flush(); // save the changes to the DB
Explanation
This was the best solution I've found because:
You can choose which properties the client is allowed to modify. For example, PhotoDTO doesn't have a PathToFile property, so the client can never modify it.
You can also choose whether to update a property or not. For example, if the client didn't send over an AlbumId, it will be 0. You can check for that and not change the Album if the ID is 0. Likewise, if the user doesn't send over a Name, you can choose not to update that property.
You don't have to worry about the lifecycle of an entity because it will always be retrieved and updated within the scope of a single session.
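The selective update described above could be sketched as follows. The classes are trimmed copies of the ones in this answer, PhotoUpdater.Apply is a hypothetical helper, and loadAlbum is a stand-in for Session.Load<Album>(id) so the example runs without NHibernate.

```csharp
using System;

public class Album
{
    public int Id { get; set; }
}

public class Photo
{
    public int Id { get; set; }
    public Album Album { get; set; }
    public string Name { get; set; }
    public string PathToFile { get; set; }
}

public class PhotoDTO
{
    public int Id { get; set; }
    public int AlbumId { get; set; }
    public string Name { get; set; }
}

public static class PhotoUpdater
{
    // Copies only the values the client actually sent.
    // loadAlbum is a hypothetical stand-in for Session.Load<Album>(id).
    public static void Apply(PhotoDTO dto, Photo photo, Func<int, Album> loadAlbum)
    {
        // A missing name deserializes to null: keep the stored name.
        if (dto.Name != null)
            photo.Name = dto.Name;

        // A missing album ID deserializes to 0: keep the current album.
        if (dto.AlbumId != 0)
            photo.Album = loadAlbum(dto.AlbumId);

        // PathToFile is not on the DTO, so the client can never change it.
    }
}
```

Treating 0 and null as "not sent" only works when they are never legitimate values; nullable DTO properties would make that distinction explicit.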
AutoMapper
I recommend using AutoMapper to automatically copy the properties from the DTO to the entity, especially if your entities have a lot of properties. It saves you the trouble of writing every property assignment by hand, and it is highly configurable.