RavenDB Map/Reduce/Transform on nested, variable-length arrays


I'm new to RavenDB, and am loving it so far. I have one remaining index to create for my project.
The Problem
I have thousands of responses to surveys (i.e. "Submissions"), and each submission has an array of answers to specific questions (i.e. "Answers"), and each answer has an array of options that were selected (i.e. "Values").
Here is what a single Submission basically looks like:
{
"SurveyId": 1,
"LocationId": 1,
"Answers": [
{
"QuestionId": 1,
"Values": [2,8,32],
"Comment": null
},
{
"QuestionId": 2,
"Values": [4],
"Comment": "Lorem ipsum"
},
...more answers...
]
}
More Problem: I have to be able to filter by SurveyId, LocationId, QuestionId, and Creation Date. As I understand it, that's done at query time... I just need to make sure that these properties are present in the transformation result (or is it the reduce result? or both?). If I'm right, then this is less of an issue.
The Required Result
We need one object per question per survey that gives the sum of each option. Hopefully it's self explanatory:
[
{
SurveyId: 1,
QuestionId: 1,
NumResponses: 976,
NumComments: 273,
Values: {
"1": 452, // option 1 selected 452 times
"2": 392, // option 2 selected 392 times
"4": 785 // option 4 selected 785 times
}
},
{
SurveyId: 1,
QuestionId: 2,
NumResponses: 921,
NumComments: 46,
Values: {
"1": 325,
"2": 843,
"4": 119,
"8": 346,
"32": 524
}
},
...
]
My Attempt
I didn't get very far. I think this post is heading me down the right path, but it doesn't help me with the list of Values. I've searched and searched but can't find any direction for what to do with a nested array like that. Here's what I have so far:
MAP:
from submission in docs.Submissions
from answer in submission.Answers
where answer.WasSkipped != true && answer.Value != null
select new {
SubmissionDate = submission["@metadata"]["Last-Modified"],
SurveyId = submission.SurveyId,
LocationId = submission.LocationId,
QuestionId = answer.QuestionId,
Value = answer.Value
}
REDUCE:
??
TRANSFORM:
from result in results
from answer in result.Answers
where answer.WasSkipped != true && answer.Value != null
select new {
SubmissionDate = result["@metadata"]["Last-Modified"],
SurveyId = result.SurveyId,
LocationId = result.LocationId,
QuestionId = answer.QuestionId,
Value = answer.Value
}
For what it's worth, this is hosted on RavenHQ.
It's been so long that I've been working on this and can't get it right. Any help in getting me to the required result is very appreciated!

Assuming your C# classes look like this:
public class Submission
{
public int SurveyId { get; set; }
public int LocationId { get; set; }
public IList<Answer> Answers { get; set; }
}
public class Answer
{
public int QuestionId { get; set; }
public int[] Values { get; set; }
public string Comment { get; set; }
}
If you are running RavenDB 2.5.2637 or higher, you can now use a dictionary result type:
public class Result
{
public int SurveyId { get; set; }
public int QuestionId { get; set; }
public int NumResponses { get; set; }
public int NumComments { get; set; }
public Dictionary<int, int> Values { get; set; }
}
If you are running anything earlier (including 2.0 releases), then you won't be able to use a dictionary, but you can use an IList<KeyValuePair<int,int>> instead.
Here is the index:
public class TestIndex : AbstractIndexCreationTask<Submission, Result>
{
public TestIndex()
{
Map = submissions =>
from submission in submissions
from answer in submission.Answers
select new
{
submission.SurveyId,
answer.QuestionId,
NumResponses = 1,
NumComments = answer.Comment == null ? 0 : 1,
Values = answer.Values.ToDictionary(x => x, x => 1)
//Values = answer.Values.Select(x => new KeyValuePair<int, int>(x, 1))
};
Reduce = results =>
from result in results
group result by new { result.SurveyId, result.QuestionId }
into g
select new
{
g.Key.SurveyId,
g.Key.QuestionId,
NumResponses = g.Sum(x => x.NumResponses),
NumComments = g.Sum(x => x.NumComments),
Values = g.SelectMany(x => x.Values)
.GroupBy(x => x.Key)
.ToDictionary(x => x.Key, x => x.Sum(y => y.Value))
//.Select(x => new KeyValuePair<int, int>(x.Key, x.Sum(y => y.Value)))
};
}
}
(No transform step is needed.)
If you can't use 2.5.2637 or higher, then replace the .ToDictionary lines with the commented lines just below them, and use an IList<KeyValuePair<int,int>> in the result class.
The fix to allow dictionaries in the map/reduce was based on this issue which your post helped to identify. Thank you!
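To see what the Values expression in the reduce does, here is a minimal in-memory sketch using plain LINQ to Objects rather than a RavenDB index. The MapRow class and the ReduceSketch helpers are hypothetical names introduced only for illustration; they mirror the shape of the map output above and are not part of the RavenDB API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one row emitted by the map (one row per answer).
public class MapRow
{
    public int SurveyId { get; set; }
    public int QuestionId { get; set; }
    public int NumResponses { get; set; }
    public int NumComments { get; set; }
    public Dictionary<int, int> Values { get; set; }
}

public static class ReduceSketch
{
    // Builds a row the way the Map above would for a single answer.
    public static MapRow FromAnswer(int surveyId, int questionId, int[] values, string comment)
    {
        return new MapRow
        {
            SurveyId = surveyId,
            QuestionId = questionId,
            NumResponses = 1,
            NumComments = comment == null ? 0 : 1,
            Values = values.ToDictionary(x => x, x => 1)
        };
    }

    // Same grouping and merging logic as the Reduce above, run over in-memory rows.
    public static List<MapRow> Reduce(IEnumerable<MapRow> rows)
    {
        return rows
            .GroupBy(r => new { r.SurveyId, r.QuestionId })
            .Select(g => new MapRow
            {
                SurveyId = g.Key.SurveyId,
                QuestionId = g.Key.QuestionId,
                NumResponses = g.Sum(x => x.NumResponses),
                NumComments = g.Sum(x => x.NumComments),
                // Flatten every per-answer dictionary, then sum the counts per option.
                Values = g.SelectMany(x => x.Values)
                          .GroupBy(kv => kv.Key)
                          .ToDictionary(kv => kv.Key, kv => kv.Sum(y => y.Value))
            })
            .ToList();
    }
}
```

For example, reducing two answers to question 1 of survey 1 with Values [2, 8, 32] and [2, 8] gives NumResponses = 2, a count of 2 for options 2 and 8, and a count of 1 for option 32. Note that ToDictionary in the map step assumes an answer never lists the same option twice; a duplicate key would throw.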

Related

Creating List<dynamic> using Dynamic variable in C#

JSON Output
{
"balance":100.0,
"trc10": [{
"TR7NHqjeK": "10000000",
"KQxGTCi8": "20000000"
}],
"trc20": [{
"TR7NHqjeKQxGTCi8q8ZY4pL8otSzgjLj6t": "10000000",
"TR7NHqjeKQxGTCi8q8ZY4pL8otSzgjL56": "40000000"
}]
}
public class Root
{
public double balance { get; set; }
public List<dynamic> trc10 { get; set; }
public List<dynamic> trc20 { get; set; }
}
The code to transform the JSON would look something like this:
c# code
Root myDeserializedClass = JsonConvert.DeserializeObject<Root>(myJsonResponse);
Response.Write((myDeserializedClass.balance).ToString());
The output of myDeserializedClass.balance is 100.0.
How do I get the values of the trc10 and trc20 list items with a Select and Where query, like this?
myDeserializedClass.trc20.Where(x => x.TR7NHqjeKQxGTCi8q8ZY4pL8otSsssj6t == "TR7NHqjeKQxGTCi8q8ZY4pL8otSsssj6t")
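One possible approach, offered as a sketch rather than a definitive answer: each element of trc10 and trc20 is a JSON object whose values are all strings, so the lists can be typed as List<Dictionary<string, string>> instead of List<dynamic>, which makes Select and Where usable on the keys. The example below uses System.Text.Json rather than the Json.NET call in the question (the same class shape should also work with JsonConvert.DeserializeObject, but that is an assumption and is not shown here). TokenRoot and TokenLookup are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical replacement for Root: the dynamic lists become dictionaries.
public class TokenRoot
{
    public double balance { get; set; }
    public List<Dictionary<string, string>> trc10 { get; set; }
    public List<Dictionary<string, string>> trc20 { get; set; }
}

public static class TokenLookup
{
    public static TokenRoot Parse(string json)
    {
        // Property names match the JSON keys exactly, so no naming options are needed.
        return JsonSerializer.Deserialize<TokenRoot>(json);
    }

    // Returns the value stored under the given key, or null when no entry has that key.
    public static string Find(List<Dictionary<string, string>> entries, string key)
    {
        return entries
            .SelectMany(entry => entry)      // flatten to key/value pairs
            .Where(pair => pair.Key == key)
            .Select(pair => pair.Value)
            .FirstOrDefault();
    }
}
```

With the JSON shown above, TokenLookup.Find(root.trc20, "TR7NHqjeKQxGTCi8q8ZY4pL8otSzgjL56") returns "40000000", and a key that is not present returns null.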

NUnit testing a controller method that returns a list

I am writing a test for a controller method. This method calls the GetPharmacySupply method in SupplyRepository, which returns a list of PharmacyMedicineSupply according to the project requirement:
[HttpPost("PharmacySupply")]
public Task<List<PharmacyMedicineSupply>> GetPharmacySupply([FromBody] List<MedicineDemand> medDemand)
{
_log4net.Info("Get Pharmacy Supply API Acessed");
return _supplyRepo.GetPharmacySupply(medDemand);
}
These are my models:
public class MedicineStock
{
public string Name { get; set; }
public string ChemicalComposition { get; set; }
public string TargetAilment { get; set; }
public DateTime DateOfExpiry { get; set; }
public int NumberOfTabletsInStock { get; set; }
}
public class PharmacyMedicineSupply
{
public string pharmacyName { get; set; }
public List<MedicineNameAndSupply> medicineAndSupply { get; set; }
}
public class MedicineNameAndSupply
{
public string medicineName { get; set; }
public int supplyCount { get; set; }
}
This is the test I am writing:
public static List<MedicineDemand> mockMedicineDemand= new List<MedicineDemand>()
{
new MedicineDemand()
{
Medicine = "Medcine1",
DemandCount = 20
},
new MedicineDemand()
{
Medicine = "Medcine2",
DemandCount = 25
},
new MedicineDemand()
{
Medicine = "Medcine3",
DemandCount = 30
},
new MedicineDemand()
{
Medicine = "Medcine4",
DemandCount = 35
},
new MedicineDemand()
{
Medicine = "Medcine5",
DemandCount = 40
}
};
[Test, TestCaseSource(nameof(mockMedicineDemand))]
public void Test_GetPharmacySupply(List<MedicineDemand> mockMedicineDemand)
{
Mock<ISupplyRepo> supplyMock = new Mock<ISupplyRepo>();
SupplyController sc = new SupplyController(supplyMock.Object);
var result = sc.GetPharmacySupply(mockMedicineDemand) as Task<List<PharmacyMedicineSupply>>;
var resultList = result.Result;
Assert.That(4,Is.EqualTo( resultList[0].medicineAndSupply[0].supplyCount));
}
But I am getting this error
Object of type 'MedicineSupplyMicroservice.Models.MedicineDemand' cannot be converted to type 'System.Collections.Generic.List`1[MedicineSupplyMicroservice.Models.MedicineDemand]'.
What am I doing wrong??

The method you are testing, GetPharmacySupply takes a single argument, which is a List of MedicineDemand objects.
NUnit, on the other hand, is calling it with a single MedicineDemand.
Take a look at your data source declaration...
public static List<MedicineDemand> mockMedicineDemand= new List<MedicineDemand>()
{
...
}
The TestCaseSourceAttribute designates a field, property or method that returns the necessary data for one or more test cases. Your source returns a List<MedicineDemand>, so NUnit takes that list and uses its contents to call your method five times, once for each MedicineDemand.
This is clearly an error and NUnit might be a bit smarter about it and mark your test as non-runnable. However, with the current versions of NUnit, it defers the check until your test is actually run. (Once you have used TestCaseSource a lot, the error message leads you right to the problem.)
So your data source, in this case, needs to be a list or other type of enumeration of List<MedicineDemand> - basically a list of lists.
One way to do that would be to use List<List<MedicineDemand>> but I think it would make the code more, rather than less, confusing!
There are a lot of options here - check the docs for TestCaseSourceAttribute. My personal preference would be to make the data source into a method, as follows:
public static IEnumerable<List<MedicineDemand>> MockMedicineDemands()
{
yield return new List<MedicineDemand>
{
new MedicineDemand() { Medicine = "Medicine1", DemandCount = 20 },
new MedicineDemand() { Medicine = "Medicine2", DemandCount = 25 },
new MedicineDemand() { Medicine = "Medicine3", DemandCount = 30 },
new MedicineDemand() { Medicine = "Medicine4", DemandCount = 35 },
new MedicineDemand() { Medicine = "Medicine5", DemandCount = 40 }
};
// Add more cases here if desired
}
Additionally, I believe the code would be clearer, both to others and to yourself when you return to it after a period of time, if you named the lists differently from the individual components, that is, something like mockMedicineDemandList or mockMedicineDemands.
Fair warning... it's not convenient for me to compile the above code right now, so it's just "forum code." Typos etc. are a strong possibility. Please post a comment if you find any errors.
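The shape requirement can be illustrated without NUnit. In this sketch, CountCases and FirstCaseIsList are hypothetical helpers that mimic only the one behaviour described above (each element of the source becomes one test case); they are not NUnit APIs. MedicineDemand is reduced to the two properties used in the question.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class MedicineDemand
{
    public string Medicine { get; set; }
    public int DemandCount { get; set; }
}

public static class SourceShapeSketch
{
    // Original shape: a flat list, so each MedicineDemand is treated as one test case.
    public static List<MedicineDemand> FlatSource() => new List<MedicineDemand>
    {
        new MedicineDemand { Medicine = "Medicine1", DemandCount = 20 },
        new MedicineDemand { Medicine = "Medicine2", DemandCount = 25 }
    };

    // Suggested shape: an enumeration of lists, so each whole list is one test case.
    public static IEnumerable<List<MedicineDemand>> NestedSource()
    {
        yield return FlatSource();
    }

    // Hypothetical stand-in for the runner: one case per element of the source.
    public static int CountCases(IEnumerable source) => source.Cast<object>().Count();

    // True when the first case has the type the test method's parameter expects.
    public static bool FirstCaseIsList(IEnumerable source) =>
        source.Cast<object>().First() is List<MedicineDemand>;
}
```

With the two-item data above, FlatSource() yields two cases of type MedicineDemand, which is the mismatch behind the conversion error, while NestedSource() yields one case of type List<MedicineDemand>. In the real test, the attribute would then name the method, for example [TestCaseSource(nameof(MockMedicineDemands))].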

Microsoft.SharePoint.Client C# getting only User created Lists (and not Document Libraries)

I am trying to retrieve a list of user generated Lists from a specified website. I do not want System generated lists (eg MicroFeed) nor Document Libraries. Using the Microsoft example I have this code:
public static void LoadLists(Microsoft.SharePoint.Client.Web web, List<String> foldersList)
{
var ctx = web.Context;
ListCollection collList = web.Lists;
IEnumerable<List> listInfo = ctx.LoadQuery(
collList.Include(
list => list.Title,
list => list.Fields.Include(
field => field.Title,
field => field.InternalName)));
ctx.ExecuteQuery();
foreach (List oList in listInfo)
{
FieldCollection collField = oList.Fields;
foreach (Microsoft.SharePoint.Client.Field oField in collField)
{
Regex regEx = new Regex("name", RegexOptions.IgnoreCase);
if (regEx.IsMatch(oField.InternalName))
{
Console.WriteLine("List: {0} \n\t Field Title: {1} \n\t Field Internal Name: {2}",
oList.Title, oField.Title, oField.InternalName);
}
}
}
}
However this returns all Lists and Document Libraries (and heaven knows what else). Is there an easy way to just get back the user defined lists? Here is an example of what I would like to get:
Looking at the documentation from Microsoft, they seem to use the term "list" to refer to both actual lists (tables) and document libraries (folders). What is the proper nomenclature for the kind of list that is really just like an Excel spreadsheet of data? Finally, is it possible for lists (tables) to be nested inside a Document Library? I can't seem to do this, but I wanted to check since I am new to SharePoint.
Thanks!

So after having to look up lots of examples (not from Microsoft, thank you) and stepping through actual responses, here is the code for loading only the Lists created by the user, along with their non-hidden field columns. I am sure that this could be optimized or cleaned up (for example, by not running the secondary queries to get the List attributes; requesting them in the original query gave me an access-denied error), but it is working for me. It also needs some try-catch handling in case things go south.
First a couple of classes to hold the data:
public class SharePointColumn
{
public string Title { get; set; }
public string InternalName { get; set; }
public string TypeAsString { get; set; }
}
public class SharePointLibrary
{
public SharePointLibrary()
{
Columns = new List<SharePointColumn>();
}
public string Title { get; set; }
public Boolean IsList { get; set; } // If true a list, else DocumentLibrary
public List<SharePointColumn> Columns { get; set; }
}
Then the real code.
public static void LoadLists(Microsoft.SharePoint.Client.Web web, List<SharePointLibrary> sharePointLibraries)
{
var ctx = web.Context;
ListCollection collList = web.Lists;
IEnumerable<List> listInfo = ctx.LoadQuery(
collList.Include(
list => list.Title,
list => list.Fields.Include(
field => field.Title,
field => field.InternalName,
field => field.Hidden,
field => field.TypeAsString)));
ctx.ExecuteQuery();
foreach (List oList in listInfo)
{
// Had to add these because trying to add in above query failed
ctx.Load(oList);
ctx.ExecuteQuery();
// 544 Base Template is MicroFeed
if (oList.Hidden == false && oList.IsCatalog == false && (!oList.IsObjectPropertyInstantiated("IsSiteAssetsLibrary") || oList.IsSiteAssetsLibrary == false) &&
oList.BaseType != BaseType.DocumentLibrary && oList.BaseTemplate != 544)
{
FieldCollection collField = oList.Fields;
SharePointLibrary lib = new SharePointLibrary
{
Title = oList.Title,
IsList = true,
Columns = new List<SharePointColumn>()
};
foreach (Microsoft.SharePoint.Client.Field oField in collField)
{
if (!oField.Hidden)
{
SharePointColumn col = new SharePointColumn();
col.Title = oField.Title;
col.InternalName = oField.InternalName;
col.TypeAsString = oField.TypeAsString;
lib.Columns.Add(col);
}
}
sharePointLibraries.Add(lib);
}
}
}

RavenDB: Why do I get null-values for fields in this multi-map/reduce index?

Inspired by Ayende's article https://ayende.com/blog/89089/ravendb-multi-maps-reduce-indexes, I have the following index, that works as such:
public class Posts_WithViewCountByUser : AbstractMultiMapIndexCreationTask<Posts_WithViewCountByUser.Result>
{
public Posts_WithViewCountByUser()
{
AddMap<Post>(posts => from p in posts
select new
{
ViewedByUserId = (string) null,
ViewCount = 0,
Id = p.Id,
PostTitle = p.PostTitle,
});
AddMap<PostView>(postViews => from postView in postViews
select new
{
ViewedByUserId = postView.ViewedByUserId,
ViewCount = 1,
Id = (string) postView.PostId,
PostTitle = (string) null,
});
Reduce = results => from result in results
group result by new
{
result.Id,
result.ViewedByUserId
}
into g
select new Result
{
ViewCount = g.Sum(x => x.ViewCount),
Id = g.Key.Id,
ViewedByUserId = g.Key.ViewedByUserId,
PostTitle = g.Select(x => x.PostTitle).Where(x => x != null).FirstOrDefault(),
};
Store(x => x.PostTitle, FieldStorage.Yes);
}
public class Result
{
public string Id { get; set; }
public string ViewedByUserId { get; set; }
public int ViewCount { get; set; }
public string PostTitle { get; set; }
}
}
I want to query this index like this:
Return all posts including - for a given user - the integer of how many times, the user has viewed the post. The "views" are stored in a separate document type, PostView. Note, that my real document types have been renamed here to match the example from the article (I certainly would not implement "most-viewed" this way).
The result from the query I get is correct - i.e. I always get all the Post documents with the correct view-count for the user. But my problem is, the PostTitle field always is null in the result set (all Post documents have a non-null value in the dataset).
I'm grouping by the combination of userId and (post)Id as my "uniqueness". The way I understand it (and please correct me if I'm wrong), is, that at this point in the reduce, I have a bunch of pseudo-documents with identical userId /postId combination, some of which come from the Post map, others from the PostView map. Now I simply find any single pseudo-document of the ones, that actually have a value for PostTitle - i.e. one that originates from the Post map. These should all obviously have the same value, as it's the same post, just "outer-joined". The .Select(....).Where(....).FirstOrDefault() chain is taken from the very example I used as a base. I then set this ViewCount value for my final document, which I project into the Result.
My question is: how do I get the non-null value for the PostTitle field in the results?

The problem is that you have:
ViewedByUserId = (string) null,
And:
group result by new
{
result.Id,
result.ViewedByUserId
}
into g
In other words, you are actually grouping by null, which I'm assuming isn't your intent.
It would be much simpler to have a map/reduce index just on PostView and get the PostTitle from an include or via a transformer.
Your understanding of what is going on is correct, in the sense that you are creating index results with userId / postId on them.
But what you are actually doing is creating results from PostView with userId / postId and from Post with null / postId.
And that is why you don't have the matches that you want.

The grouping in the index is incorrect. With the following sample data:
new Post { Id = "Post-1", PostTitle = "Post Title", AuthorId = "Author-1" }
new PostView { ViewedByUserId = "User-1", PostId = "Post-1" }
new PostView { ViewedByUserId = "User-1", PostId = "Post-1" }
new PostView { ViewedByUserId = "User-2", PostId = "Post-1" }
The index results are like this:
ViewCount | Id | ViewedByUserId | PostTitle
--------- | ------ | -------------- | ----------
0 | Post-1 | null | Post Title
2 | Post-1 | User-1 | null
1 | Post-1 | User-2 | null
The map operation in the index simply creates a common document for all source documents. Thus, the Post-1 document produces one row, the two documents for Post-1 and User-1 produce two rows (which are later reduced to the single row with ViewCount == 2) and the document for Post-1 and User-2 produces the last row.
The reduce operation then groups all the mapped rows and produces the resulting documents in the index. In this case, the Post-sourced document is stored separately from the PostView-sourced documents because the null value in ViewedByUserId is not grouped with any document from the PostView collection.
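The separation can be reproduced with plain LINQ to Objects, outside RavenDB. In this sketch, the Row class and GroupingSketch are hypothetical stand-ins for the mapped entries and the reduce step; the data mirrors the sample documents above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one mapped entry from either map.
public class Row
{
    public string Id { get; set; }
    public string ViewedByUserId { get; set; }
    public int ViewCount { get; set; }
    public string PostTitle { get; set; }
}

public static class GroupingSketch
{
    public static List<Row> Mapped() => new List<Row>
    {
        // From the Post map: no user, so the group key contains null.
        new Row { Id = "Post-1", ViewedByUserId = null, ViewCount = 0, PostTitle = "Post Title" },
        // From the PostView map: a user, but no title.
        new Row { Id = "Post-1", ViewedByUserId = "User-1", ViewCount = 1 },
        new Row { Id = "Post-1", ViewedByUserId = "User-1", ViewCount = 1 },
        new Row { Id = "Post-1", ViewedByUserId = "User-2", ViewCount = 1 }
    };

    // Same grouping as the index: (Id, ViewedByUserId), where null is its own key value.
    public static List<Row> Reduce(IEnumerable<Row> rows) =>
        rows.GroupBy(r => new { r.Id, r.ViewedByUserId })
            .Select(g => new Row
            {
                Id = g.Key.Id,
                ViewedByUserId = g.Key.ViewedByUserId,
                ViewCount = g.Sum(x => x.ViewCount),
                PostTitle = g.Select(x => x.PostTitle).FirstOrDefault(t => t != null)
            })
            .ToList();
}
```

Reducing Mapped() produces three rows, matching the table: the only row that carries the title is the one keyed by a null user, so no per-user row ever sees a non-null PostTitle.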
If you can change your way of storing data, you can solve this issue by storing the number of views directly in the PostView. It would greatly reduce duplicate data in your database while having almost the same cost when updating the view count.
Complete test (needs xunit and RavenDB.Tests.Helpers nugets):
using Raven.Abstractions.Indexing;
using Raven.Client;
using Raven.Client.Indexes;
using Raven.Tests.Helpers;
using System.Linq;
using Xunit;
namespace SO41559770Answer
{
public class SO41559770 : RavenTestBase
{
[Fact]
public void SO41559770Test()
{
using (var server = GetNewServer())
using (var store = NewRemoteDocumentStore(ravenDbServer: server))
{
new PostViewsIndex().Execute(store);
using (IDocumentSession session = store.OpenSession())
{
session.Store(new Post { Id = "Post-1", PostTitle = "Post Title", AuthorId = "Author-1" });
session.Store(new PostView { Id = "Views-1-1", ViewedByUserId = "User-1", PostId = "Post-1", ViewCount = 2 });
session.Store(new PostView { Id = "Views-1-2", ViewedByUserId = "User-2", PostId = "Post-1", ViewCount = 1 });
session.SaveChanges();
}
WaitForAllRequestsToComplete(server);
WaitForIndexing(store);
using (IDocumentSession session = store.OpenSession())
{
var resultsForId1 = session
.Query<PostViewsIndex.Result, PostViewsIndex>()
.ProjectFromIndexFieldsInto<PostViewsIndex.Result>()
.Where(x => x.PostId == "Post-1" && x.UserId == "User-1");
Assert.Equal(2, resultsForId1.First().ViewCount);
Assert.Equal("Post Title", resultsForId1.First().PostTitle);
var resultsForId2 = session
.Query<PostViewsIndex.Result, PostViewsIndex>()
.ProjectFromIndexFieldsInto<PostViewsIndex.Result>()
.Where(x => x.PostId == "Post-1" && x.UserId == "User-2");
Assert.Equal(1, resultsForId2.First().ViewCount);
Assert.Equal("Post Title", resultsForId2.First().PostTitle);
}
}
}
}
public class PostViewsIndex : AbstractIndexCreationTask<PostView, PostViewsIndex.Result>
{
public PostViewsIndex()
{
Map = postViews => from postView in postViews
let post = LoadDocument<Post>(postView.PostId)
select new
{
Id = postView.Id,
PostId = post.Id,
PostTitle = post.PostTitle,
UserId = postView.ViewedByUserId,
ViewCount = postView.ViewCount,
};
StoreAllFields(FieldStorage.Yes);
}
public class Result
{
public string Id { get; set; }
public string PostId { get; set; }
public string PostTitle { get; set; }
public string UserId { get; set; }
public int ViewCount { get; set; }
}
}
public class Post
{
public string Id { get; set; }
public string PostTitle { get; set; }
public string AuthorId { get; set; }
}
public class PostView
{
public string Id { get; set; }
public string ViewedByUserId { get; set; }
public string PostId { get; set; }
public int ViewCount { get; set; }
}
}

How to convert a dynamic list into list<Class>?

I'm trying to convert a dynamic list into a list of a model class (Products). This is what my method looks like:
public List<Products> ConvertToProducts(List<dynamic> data)
{
var sendModel = new List<Products>();
//Mapping List<dynamic> to List<Products>
sendModel = data.Select(x =>
new Products
{
Name = data.GetType().GetProperty("Name").ToString(),
Price = data.GetType().GetProperty("Price").GetValue(data, null).ToString()
}).ToList();
}
I have tried both of these ways to get the property values, but I get null errors saying that these properties don't exist or are null.
Name = data.GetType().GetProperty("Name").ToString(),
Price = data.GetType().GetProperty("Price").GetValue(data,
null).ToString()
This is what my model class looks like:
public class Products
{
public string ID { get; set; }
public string Name { get; set; }
public string Price { get; set; }
}
Can someone please let me know what I'm missing? thanks in advance.

You're currently trying to get properties from data, which is your list - and you're ignoring x, which is the item in the list. I suspect you want:
var sendModel = data
.Select(x => new Products { Name = x.Name, Price = x.Price })
.ToList();
You may want to call ToString() on the results of the properties, but it's not clear what's in the original data.
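A self-contained sketch of that mapping, assuming the dynamic items are ExpandoObject instances with Name and Price members (the question does not say what the list actually holds). ProductMapper and SampleData are hypothetical names, and ToString() is called here because the model's properties are strings.

```csharp
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;

public class Products
{
    public string ID { get; set; }
    public string Name { get; set; }
    public string Price { get; set; }
}

public static class ProductMapper
{
    public static List<Products> ConvertToProducts(List<dynamic> data)
    {
        // Read the members from each item x, not from the list itself.
        return data
            .Select(x => new Products
            {
                Name = x.Name.ToString(),
                Price = x.Price.ToString()
            })
            .ToList();
    }

    // Builds sample input; ExpandoObject is only an assumption about the real data.
    public static List<dynamic> SampleData()
    {
        dynamic item = new ExpandoObject();
        item.Name = "Widget";
        item.Price = 10;
        return new List<dynamic> { item };
    }
}
```

Because the members are resolved at run time, an item that lacks Name or Price (or holds null there) fails only when the query is enumerated, so it may be worth validating the input first.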
