RavenDB: Why do I get null-values for fields in this multi-map/reduce index?

Inspired by Ayende's article https://ayende.com/blog/89089/ravendb-multi-maps-reduce-indexes, I have the following index, that works as such:
public class Posts_WithViewCountByUser : AbstractMultiMapIndexCreationTask<Posts_WithViewCountByUser.Result>
{
public Posts_WithViewCountByUser()
{
AddMap<Post>(posts => from p in posts
select new
{
ViewedByUserId = (string) null,
ViewCount = 0,
Id = p.Id,
PostTitle = p.PostTitle,
});
AddMap<PostView>(postViews => from postView in postViews
select new
{
ViewedByUserId = postView.ViewedByUserId,
ViewCount = 1,
Id = (string) postView.PostId,
PostTitle = (string) null,
});
Reduce = results => from result in results
group result by new
{
result.Id,
result.ViewedByUserId
}
into g
select new Result
{
ViewCount = g.Sum(x => x.ViewCount),
Id = g.Key.Id,
ViewedByUserId = g.Key.ViewedByUserId,
PostTitle = g.Select(x => x.PostTitle).Where(x => x != null).FirstOrDefault(),
};
Store(x => x.PostTitle, FieldStorage.Yes);
}
public class Result
{
public string Id { get; set; }
public string ViewedByUserId { get; set; }
public int ViewCount { get; set; }
public string PostTitle { get; set; }
}
}
I want to query this index like this:
Return all posts, including, for a given user, the number of times that user has viewed each post. The "views" are stored in a separate document type, PostView. Note that my real document types have been renamed here to match the example from the article (I certainly would not implement "most-viewed" this way).
The result I get from the query is correct, i.e. I always get all the Post documents with the correct view count for the user. My problem is that the PostTitle field is always null in the result set (every Post document in the dataset has a non-null value).
I'm grouping by the combination of userId and (post)Id as my "uniqueness". The way I understand it (and please correct me if I'm wrong), at this point in the reduce I have a bunch of pseudo-documents with an identical userId/postId combination, some of which come from the Post map and others from the PostView map. Now I simply find any one of those pseudo-documents that actually has a value for PostTitle, i.e. one that originates from the Post map. These should obviously all have the same value, as it's the same post, just "outer-joined". The .Select(....).Where(....).FirstOrDefault() chain is taken from the very example I used as a base. I then set this PostTitle value on my final document, which I project into the Result.
My question is: how do I get the non-null value for the PostTitle field in the results?

The problem is that you have:
ViewedByUserId = (string) null,
And:
group result by new
{
result.Id,
result.ViewedByUserId
}
into g
In other words, you are actually grouping by null, which I'm assuming isn't your intent.
It would be much simpler to have a map/reduce index just on PostView and get the PostTitle from an include or via a transformer.
Your understanding of what is going on is correct, in the sense that you are creating index results with userId/postId on them.
But what you are actually doing is creating results from PostView with userId/postId and from Post with null/postId.
And that is why you don't have the matches that you want.

The grouping in the index is incorrect. With the following sample data:
new Post { Id = "Post-1", PostTitle = "Post Title", AuthorId = "Author-1" }
new PostView { ViewedByUserId = "User-1", PostId = "Post-1" }
new PostView { ViewedByUserId = "User-1", PostId = "Post-1" }
new PostView { ViewedByUserId = "User-2", PostId = "Post-1" }
The index results are like this:
ViewCount | Id | ViewedByUserId | PostTitle
--------- | ------ | -------------- | ----------
0 | Post-1 | null | Post Title
2 | Post-1 | User-1 | null
1 | Post-1 | User-2 | null
The map operation in the index simply emits one row of a common shape for every source document. Thus, the Post-1 document produces one row, the two PostView documents for Post-1 and User-1 produce two rows (which are later reduced to the single row with ViewCount == 2), and the PostView document for Post-1 and User-2 produces the last row.
The reduce operation then groups all the mapped rows and produces the resulting documents in the index. In this case, the Post-sourced row is kept separate from the PostView-sourced rows because its null ViewedByUserId gives it a different group key from any row coming from the PostView collection.
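The same grouping can be reproduced with plain LINQ to Objects, outside RavenDB, to see the three separate groups appear. This is only a sketch: GroupingDemo, Row, SampleRows and Reduce are illustrative names that are not part of the index, and the rows are written by hand to mimic what the two maps would emit for the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Hypothetical stand-in for the index's Result class.
    public class Row
    {
        public string Id { get; set; }
        public string ViewedByUserId { get; set; }
        public int ViewCount { get; set; }
        public string PostTitle { get; set; }
    }

    // Hand-written rows mimicking the output of the two maps for the sample data.
    public static List<Row> SampleRows()
    {
        return new List<Row>
        {
            new Row { Id = "Post-1", ViewedByUserId = null, ViewCount = 0, PostTitle = "Post Title" },
            new Row { Id = "Post-1", ViewedByUserId = "User-1", ViewCount = 1, PostTitle = null },
            new Row { Id = "Post-1", ViewedByUserId = "User-1", ViewCount = 1, PostTitle = null },
            new Row { Id = "Post-1", ViewedByUserId = "User-2", ViewCount = 1, PostTitle = null },
        };
    }

    // Same grouping and projection as the Reduce in the question.
    public static List<Row> Reduce(IEnumerable<Row> mapped)
    {
        return (from r in mapped
                group r by new { r.Id, r.ViewedByUserId } into g
                select new Row
                {
                    Id = g.Key.Id,
                    ViewedByUserId = g.Key.ViewedByUserId,
                    ViewCount = g.Sum(x => x.ViewCount),
                    PostTitle = g.Select(x => x.PostTitle).Where(x => x != null).FirstOrDefault(),
                }).ToList();
    }

    public static void Main()
    {
        // The key { "Post-1", null } is its own group, so the title never
        // reaches the User-1 and User-2 groups.
        foreach (var row in Reduce(SampleRows()))
        {
            Console.WriteLine(row.ViewCount + " | " + row.Id + " | "
                + (row.ViewedByUserId ?? "null") + " | " + (row.PostTitle ?? "null"));
        }
    }
}
```

Only the row whose key contains a null user id carries the title; the per-user rows have no title-bearing entry in their group.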
If you can change your way of storing data, you can solve this issue by storing the number of views directly in the PostView. It would greatly reduce duplicate data in your database while having almost the same cost when updating the view count.
Complete test (requires the xunit and RavenDB.Tests.Helpers NuGet packages):
using Raven.Abstractions.Indexing;
using Raven.Client;
using Raven.Client.Indexes;
using Raven.Tests.Helpers;
using System.Linq;
using Xunit;
namespace SO41559770Answer
{
public class SO41559770 : RavenTestBase
{
[Fact]
public void SO41559770Test()
{
using (var server = GetNewServer())
using (var store = NewRemoteDocumentStore(ravenDbServer: server))
{
new PostViewsIndex().Execute(store);
using (IDocumentSession session = store.OpenSession())
{
session.Store(new Post { Id = "Post-1", PostTitle = "Post Title", AuthorId = "Author-1" });
session.Store(new PostView { Id = "Views-1-1", ViewedByUserId = "User-1", PostId = "Post-1", ViewCount = 2 });
session.Store(new PostView { Id = "Views-1-2", ViewedByUserId = "User-2", PostId = "Post-1", ViewCount = 1 });
session.SaveChanges();
}
WaitForAllRequestsToComplete(server);
WaitForIndexing(store);
using (IDocumentSession session = store.OpenSession())
{
var resultsForId1 = session
.Query<PostViewsIndex.Result, PostViewsIndex>()
.ProjectFromIndexFieldsInto<PostViewsIndex.Result>()
.Where(x => x.PostId == "Post-1" && x.UserId == "User-1");
Assert.Equal(2, resultsForId1.First().ViewCount);
Assert.Equal("Post Title", resultsForId1.First().PostTitle);
var resultsForId2 = session
.Query<PostViewsIndex.Result, PostViewsIndex>()
.ProjectFromIndexFieldsInto<PostViewsIndex.Result>()
.Where(x => x.PostId == "Post-1" && x.UserId == "User-2");
Assert.Equal(1, resultsForId2.First().ViewCount);
Assert.Equal("Post Title", resultsForId2.First().PostTitle);
}
}
}
}
public class PostViewsIndex : AbstractIndexCreationTask<PostView, PostViewsIndex.Result>
{
public PostViewsIndex()
{
Map = postViews => from postView in postViews
let post = LoadDocument<Post>(postView.PostId)
select new
{
Id = postView.Id,
PostId = post.Id,
PostTitle = post.PostTitle,
UserId = postView.ViewedByUserId,
ViewCount = postView.ViewCount,
};
StoreAllFields(FieldStorage.Yes);
}
public class Result
{
public string Id { get; set; }
public string PostId { get; set; }
public string PostTitle { get; set; }
public string UserId { get; set; }
public int ViewCount { get; set; }
}
}
public class Post
{
public string Id { get; set; }
public string PostTitle { get; set; }
public string AuthorId { get; set; }
}
public class PostView
{
public string Id { get; set; }
public string ViewedByUserId { get; set; }
public string PostId { get; set; }
public int ViewCount { get; set; }
}
}

Related

How to get distinct values from 2 different columns in the same list

As you can see from the code below, I have a list named Matches, from which I would like to get a single list of the distinct teams, taken from both HomeTeam and AwayTeam. I'm trying to use LINQ, and I can get a list of distinct teams if I use only the HomeTeam property or only the AwayTeam property, but not both together.
Thank you.
public class Match
{
public int ID { get; set; }
public string Country { get; set; }
public string Championship { get; set; }
public string Seasson { get; set; }
public DateTime MatchDate { get; set; }
public string HomeTeam { get; set; }
public int HomeScore { get; set; }
public int AwayScore { get; set; }
public string AwayTeam { get; set; }
}
private List<Match> Matches;
Matches = dataAccess.GetAllMatches();
I'm trying to do something like this:
result = Matches.Select(HomeTeam, AwayTeam).Distinct().ToList();
At the risk that this smells like homework, a hint rather than code. Get your Home teams, Union your Away teams and apply a Distinct to the result.
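For readers who want the hint spelled out, here is a minimal sketch. DistinctTeamsDemo, SampleMatches and DistinctTeams are made-up names, and Match is trimmed to the two properties that matter. Note that Enumerable.Union already removes duplicates, so a separate Distinct call is not needed after it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctTeamsDemo
{
    // Trimmed-down version of the Match class from the question.
    public class Match
    {
        public string HomeTeam { get; set; }
        public string AwayTeam { get; set; }
    }

    // Illustrative data, not taken from the question.
    public static List<Match> SampleMatches()
    {
        return new List<Match>
        {
            new Match { HomeTeam = "A", AwayTeam = "B" },
            new Match { HomeTeam = "A", AwayTeam = "C" },
            new Match { HomeTeam = "D", AwayTeam = "B" },
        };
    }

    // Home teams unioned with away teams; Union yields each value once.
    public static List<string> DistinctTeams(IEnumerable<Match> matches)
    {
        return matches.Select(m => m.HomeTeam)
                      .Union(matches.Select(m => m.AwayTeam))
                      .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", DistinctTeams(SampleMatches())));
    }
}
```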
So I finally came up with this solution.
Note that I now need to get not only the team but also the country the team belongs to.
public class Team
{
public string Name { get; set; }
public string Country { get; set; }
}
So Union really does the job here, but since I now need to go through an anonymous type, here is the code:
List<Team> teams = new List<Team>();
var result = Matches.Select(x => new { Name = x.HomeTeam, Country = x.Country }).Union(Matches.Select(x => new { Name = x.AwayTeam, Country = x.Country })).ToList();
foreach (var record in result)
{
teams.Add(new Team { Name = record.Name, Country = record.Country });
}
return teams;
I would prefer this way:
List<Team> teamsResult = Matches.Select(x => new Team { Name = x.HomeTeam, Country = x.Country }).Union(Matches.Select(x => new Team { Name = x.AwayTeam, Country = x.Country })).ToList();
But this way I get duplicates, so I will stick with the first example for now.
Do you think that is the more elegant way to go?
Thank you.
You can take advantage of GroupBy, like this:
IEnumerable<Team> teams = Matches.GroupBy(m => new { m.AwayTeam, m.HomeTeam, m.Country })
.Select(
g =>
new[]
{
new Team {Country = g.Key.Country, Name = g.Key.AwayTeam},
new Team {Country = g.Key.Country, Name = g.Key.HomeTeam}
})
.SelectMany(x => x)
.GroupBy(t => new { t.Name, t.Country })
.Select(g => new Team { Name = g.Key.Name, Country = g.Key.Country });
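The duplicates seen with the `new Team { ... }` version of Union are most likely an equality issue. Union compares elements with the default equality comparer: anonymous types compare by property values, while a plain class compares by reference unless it overrides Equals and GetHashCode. The sketch below gives Team value equality so that the direct Union works. TeamUnionDemo, SampleMatches and DistinctTeams are illustrative names, and Match is reduced to the properties used here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Team with value equality on Name and Country (an addition to the question's class).
public sealed class Team : IEquatable<Team>
{
    public string Name { get; set; }
    public string Country { get; set; }

    public bool Equals(Team other)
    {
        return other != null && Name == other.Name && Country == other.Country;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Team);
    }

    public override int GetHashCode()
    {
        // Tuple hashing copes with null members.
        return (Name, Country).GetHashCode();
    }
}

public class Match
{
    public string Country { get; set; }
    public string HomeTeam { get; set; }
    public string AwayTeam { get; set; }
}

public static class TeamUnionDemo
{
    // Illustrative data, not taken from the question.
    public static List<Match> SampleMatches()
    {
        return new List<Match>
        {
            new Match { Country = "X", HomeTeam = "A", AwayTeam = "B" },
            new Match { Country = "X", HomeTeam = "B", AwayTeam = "A" },
        };
    }

    public static List<Team> DistinctTeams(IEnumerable<Match> matches)
    {
        return matches.Select(m => new Team { Name = m.HomeTeam, Country = m.Country })
                      .Union(matches.Select(m => new Team { Name = m.AwayTeam, Country = m.Country }))
                      .ToList();
    }

    public static void Main()
    {
        foreach (var team in DistinctTeams(SampleMatches()))
        {
            Console.WriteLine(team.Name + " (" + team.Country + ")");
        }
    }
}
```

An alternative that leaves Team untouched would be to pass a custom IEqualityComparer<Team> as the second argument to Union.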

How to convert a dynamic list into list<Class>?

I'm trying to convert a dynamic list into a list of a model class (Products). This is what my method looks like:
public List<Products> ConvertToProducts(List<dynamic> data)
{
var sendModel = new List<Products>();
//Mapping List<dynamic> to List<Products>
sendModel = data.Select(x =>
new Products
{
Name = data.GetType().GetProperty("Name").ToString(),
Price = data.GetType().GetProperty("Price").GetValue(data, null).ToString()
}).ToList();
}
I have tried both of these ways to get the property values, but I get null errors saying that these properties don't exist or are null.
Name = data.GetType().GetProperty("Name").ToString(),
Price = data.GetType().GetProperty("Price").GetValue(data,
null).ToString()
This is what my model class looks like:
public class Products
{
public string ID { get; set; }
public string Name { get; set; }
public string Price { get; set; }
}
Can someone please let me know what I'm missing? thanks in advance.
You're currently trying to get properties from data, which is your list - and you're ignoring x, which is the item in the list. I suspect you want:
var sendModel = data
.Select(x => new Products { Name = x.Name, Price = x.Price })
.ToList();
You may want to call ToString() on the results of the properties, but it's not clear what's in the original data.
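A self-contained sketch of that mapping follows. Since the question does not show what the dynamic items actually are, the sample data here is an assumption: ExpandoObject instances with a string Name and a numeric Price. DynamicMappingDemo and SampleData are made-up names. The cast to object before Convert.ToString keeps the call statically bound and also handles non-string values.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Globalization;
using System.Linq;

public class Products
{
    public string ID { get; set; }
    public string Name { get; set; }
    public string Price { get; set; }
}

public static class DynamicMappingDemo
{
    // Assumed input shape: each item exposes Name and Price members.
    public static List<dynamic> SampleData()
    {
        dynamic pen = new ExpandoObject();
        pen.Name = "Pen";
        pen.Price = 9.99m;

        dynamic book = new ExpandoObject();
        book.Name = "Book";
        book.Price = 12;

        return new List<dynamic> { pen, book };
    }

    public static List<Products> ConvertToProducts(List<dynamic> data)
    {
        // Read the members from x (the current item), not from data (the list).
        return data
            .Select(x => new Products
            {
                Name = Convert.ToString((object)x.Name, CultureInfo.InvariantCulture),
                Price = Convert.ToString((object)x.Price, CultureInfo.InvariantCulture),
            })
            .ToList();
    }

    public static void Main()
    {
        foreach (var product in ConvertToProducts(SampleData()))
        {
            Console.WriteLine(product.Name + ": " + product.Price);
        }
    }
}
```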

Moq unit test to filter products by their categories

I am new to unit testing so I am sure this is a very basic question, but I couldn't find a solution when I searched for it.
I am trying to test to see if I can filter products by their categories. I can access all the properties in my Product class but not the ones in my Category class. For example, it doesn't find Category1.Name. Can anyone tell me what I'm doing wrong?
This is my product class;
public partial class Product
{
public int ProductID { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public decimal Price { get; set; }
public int CategoryID { get; set; }
public virtual Category Category1 { get; set; }
}
This is my test;
[TestMethod]
public void Can_Filter_Products()
{
//Arrange
Mock<IProductRepository> mock = new Mock<IProductRepository>();
mock.Setup(m => m.Products).Returns(new Product[]
{
new Product {ProductID=1,Name="P1", **Category1.Name** = "test1" },
new Product {ProductID=2,Name="P2", **Category1.Name** = "test2"},
new Product {ProductID=3,Name="P3", **Category1.Name** = "test1"},
new Product {ProductID=4,Name="P4", **Category1.Name** = "test2"},
new Product {ProductID=5,Name="P5", **Category1.Name** = "test3"},
}.AsQueryable());
//Arrange create a controller and make the page size 3 items
ProductController controller = new ProductController(mock.Object);
controller.PageSize = 3;
//Action
Product[] result = ((ProductsListViewModel)controller.List("test2", 1).Model).Products.ToArray();
//Assert - check that the results are the right objects and in the right order.
Assert.AreEqual(result.Length, 2);
Assert.IsTrue(result[0].Name == "P2" && result[0].Category1.Name == "test2");
Assert.IsTrue(result[1].Name == "P4" && result[1].Category1.Name == "test2");
}
In your mock setup, try this instead:
mock.Setup(m => m.Products).Returns(new[]
{
new Product {ProductID=1,Name="P1", Category1 = new Category { Name = "test1"} },
new Product {ProductID=2,Name="P2", Category1 = new Category { Name = "test1"} }
}.AsQueryable());

RavenDB MultiMapReduce Sum not returning the correct value

Sorry for this lengthy query, I decided to add the whole test so that it will be easier for even newbies to help me with this total brain-melt.
The using directives are:
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;
using Raven.Client;
using Raven.Client.Embedded;
using Raven.Client.Indexes;
Please leave feedback if I'm too lengthy, but what could possibly go wrong if I add a complete test?
[TestFixture]
public class ClicksByScoreAndCardTest
{
private IDocumentStore _documentStore;
[SetUp]
public void SetUp()
{
_documentStore = new EmbeddableDocumentStore {RunInMemory = true}.Initialize();
_documentStore.DatabaseCommands.DisableAllCaching();
IndexCreation.CreateIndexes(typeof (ClicksBySearchAndProductCode).Assembly, _documentStore);
}
[TearDown]
public void TearDown()
{
_documentStore.Dispose();
}
[Test]
public void ShouldCountTotalLeadsMatchingPreference()
{
var userFirst = new User {Id = "users/134"};
var userSecond = new User {Id = "users/135"};
var searchFirst = new Search(userFirst)
{
Id = "searches/24",
VisitId = "visits/63"
};
searchFirst.Result = new Result();
searchFirst.Result.Rows = new List<Row>(
new[]
{
new Row {ProductCode = "CreditCards/123", Score = 6},
new Row {ProductCode = "CreditCards/124", Score = 4}
});
var searchSecond = new Search(userSecond)
{
Id = "searches/25",
VisitId = "visits/64"
};
searchSecond.Result = new Result();
searchSecond.Result.Rows = new List<Row>(
new[]
{
new Row {ProductCode = "CreditCards/122", Score = 9},
new Row {ProductCode = "CreditCards/124", Score = 4}
});
var searches = new List<Search>
{
searchFirst,
searchSecond
};
var click = new Click
{
VisitId = "visits/64",
ProductCode = "CreditCards/122",
SearchId = "searches/25"
};
using (var session = _documentStore.OpenSession())
{
foreach (var search in searches)
{
session.Store(search);
}
session.Store(click);
session.SaveChanges();
}
IList<ClicksBySearchAndProductCode.MapReduceResult> clicksBySearchAndProductCode = null;
using (var session = _documentStore.OpenSession())
{
clicksBySearchAndProductCode = session.Query<ClicksBySearchAndProductCode.MapReduceResult>(ClicksBySearchAndProductCode.INDEX_NAME)
.Customize(x => x.WaitForNonStaleResults()).ToArray();
}
Assert.That(clicksBySearchAndProductCode.Count, Is.EqualTo(4));
var mapReduce = clicksBySearchAndProductCode
.First(x => x.SearchId.Equals("searches/25")
&& x.ProductCode.Equals("CreditCards/122"));
Assert.That(mapReduce.Clicks,
Is.EqualTo(1));
}
}
public class ClicksBySearchAndProductCode :
AbstractMultiMapIndexCreationTask
<ClicksBySearchAndProductCode.MapReduceResult>
{
public const string INDEX_NAME = "ClicksBySearchAndProductCode";
public override string IndexName
{
get { return INDEX_NAME; }
}
public class MapReduceResult
{
public string SearchId { get; set; }
public string ProductCode { get; set; }
public string Score { get; set; }
public int Clicks { get; set; }
}
public ClicksBySearchAndProductCode()
{
AddMap<Search>(
searches =>
from search in searches
from row in search.Result.Rows
select new
{
SearchId = search.Id,
ProductCode = row.ProductCode,
Score = row.Score.ToString(),
Clicks = 0
});
AddMap<Click>(
clicks =>
from click in clicks
select new
{
SearchId = click.SearchId,
ProductCode = click.ProductCode,
Score = (string)null,
Clicks = 1
});
Reduce =
results =>
from result in results
group result by
new { SearchId = result.SearchId, ProductCode = result.ProductCode }
into g
select
new
{
SearchId = g.Key.SearchId,
ProductCode = g.Key.ProductCode,
Score = g.First(x => x.Score != null).Score,
Clicks = g.Sum(x => x.Clicks)
};
}
}
public class User
{
public string Id { get; set; }
}
public class Search
{
public string Id { get; set; }
public string VisitId { get; set; }
public User User { get; set; }
private Result _result = new Result();
public Result Result
{
get { return _result; }
set { _result = value; }
}
public Search(User user)
{
User = user;
}
}
public class Result
{
private IList<Row> _rows = new List<Row>();
public IList<Row> Rows
{
get { return _rows; }
set { _rows = value; }
}
}
public class Row
{
public string ProductCode { get; set; }
public int Score { get; set; }
}
public class Click
{
public string VisitId { get; set; }
public string SearchId { get; set; }
public string ProductCode { get; set; }
}
My problem here is that I expect Clicks to be one in that specific test, but the index just doesn't seem to add up the Clicks from the Click map, and the result is 0 clicks. I'm totally confused, and I'm sure that there is a really simple solution to my problem, but I just can't find it.
I hope there is a weekend warrior out there who can take me under his wing.
Yes, it was a brain-melt, for me non-trivial, but still. The proper reduce should look like this:
Reduce =
results =>
from result in results
group result by
new { SearchId = result.SearchId, ProductCode = result.ProductCode }
into g
select
new
{
SearchId = g.Key.SearchId,
ProductCode = g.Key.ProductCode,
Score = g.Select(x=>x.Score).FirstOrDefault(),
Clicks = g.Sum(x => x.Clicks)
};
Not all Maps had the Score set to a non-null-value, and therefore my original version had a problem with:
Score = g.First(x => x.Score != null).Score
Mental note, use:
Score = g.Select(x=>x.Score).FirstOrDefault()
Don't use:
Score = g.First(x => x.Score != null).Score
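The difference between the two expressions can be checked with plain LINQ to Objects. The sketch below assumes, as the answer suggests, that the reduce can see a group in which no entry has a non-null Score; Entry, FirstVsFirstOrDefaultDemo and the helper method names are made up for illustration. It also includes a third variant that skips nulls before taking the first value, since an unfiltered FirstOrDefault returns whatever the first entry holds, even if that is null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVsFirstOrDefaultDemo
{
    // Hypothetical stand-in for the entries seen by the reduce.
    public class Entry
    {
        public string Score { get; set; }
        public int Clicks { get; set; }
    }

    // Original version: throws if no entry has a non-null Score.
    public static string ScoreWithFirst(IEnumerable<Entry> group)
    {
        return group.First(x => x.Score != null).Score;
    }

    // Version from the answer: never throws, returns the first Score as-is.
    public static string ScoreWithFirstOrDefault(IEnumerable<Entry> group)
    {
        return group.Select(x => x.Score).FirstOrDefault();
    }

    // Variant: first non-null Score, or null if there is none.
    public static string FirstNonNullScore(IEnumerable<Entry> group)
    {
        return group.Select(x => x.Score).FirstOrDefault(s => s != null);
    }

    public static bool FirstThrows(IEnumerable<Entry> group)
    {
        try
        {
            ScoreWithFirst(group);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var clicksOnly = new List<Entry> { new Entry { Score = null, Clicks = 1 } };
        var clickThenSearch = new List<Entry>
        {
            new Entry { Score = null, Clicks = 1 },
            new Entry { Score = "9", Clicks = 0 },
        };

        Console.WriteLine(FirstThrows(clicksOnly));                           // True
        Console.WriteLine(ScoreWithFirstOrDefault(clicksOnly) ?? "null");     // null
        Console.WriteLine(ScoreWithFirstOrDefault(clickThenSearch) ?? "null"); // null
        Console.WriteLine(FirstNonNullScore(clickThenSearch));                // 9
    }
}
```

If the Score should survive regardless of which entry happens to come first in a group, the null-skipping variant is the safer of the two non-throwing forms.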

MultiMap / Reduce - Counts = 0?

I want to create an index for a query that returns to my view a list of Audio items along with statistics for these items: TotalDownloads and TotalPlays.
Here are my relevant docs:
Audio
- Id
- ArtistName
- Name
AudioCounter
- AudioId
- Type
- DateTime
Here is my current Index:
public class AudioWithCounters : AbstractMultiMapIndexCreationTask<AudioWithCounters.AudioViewModel>
{
public class AudioViewModel
{
public string Id { get; set; }
public string ArtistName { get; set; }
public string Name { get; set; }
public int TotalDownloads { get; set; }
public int TotalPlays { get; set; }
}
public AudioWithCounters()
{
AddMap<Audio>(audios => from audio in audios
select new
{
Id = audio.Id,
ArtistName = audio.ArtistName,
Name = audio.Name,
TotalDownloads = 0,
TotalPlays = 0
});
AddMap<AudioCounter>(counters => from counter in counters
where counter.Type == Core.Enums.Audio.AudioCounterType.Download
select new
{
Id = counter.AudioId,
ArtistName = (string)null,
Name = (string)null,
TotalDownloads = 1,
TotalPlays = 0
});
AddMap<AudioCounter>(counters => from counter in counters
where counter.Type == Core.Enums.Audio.AudioCounterType.Download
select new
{
Id = counter.AudioId,
ArtistName = (string)null,
Name = (string)null,
TotalDownloads = 0,
TotalPlays = 1
});
Reduce = results => from result in results
group result by result.Id
into g
select new
{
Id = g.Key,
ArtistName = g.Select(x => x.ArtistName).Where(x => x != null).First(),
Name = g.Select(x => x.Name).Where(x => x != null).First(),
TotalDownloads = g.Sum(x => x.TotalDownloads),
TotalPlays = g.Sum(x => x.TotalPlays)
};
}
}
However, my TotalDownloads & TotalPlays are always 0 even though there should be data in there. What am I doing wrong?
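In the reduce function, replace .First() with .FirstOrDefault(); then it works.
Besides that, there is a typo in the second AudioCounter map function (the third map overall): it filters on AudioCounterType.Download again, so TotalPlays ends up counting downloads. Presumably it should filter on the counter type used for plays.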