Complex MapReduce Query with RavenDB [closed]

Closed 9 years ago.
Hope you can help me!
I am collecting tweets, which have a created_at date (DataPublicacao), and some Hashtags. Each tweet refers to a broadcaster (redeId), and a show (programaId).
I want to query the database for the 20 most used hashtags in a certain period.
I have to map each hashtag, when it was used, and which broadcaster and TV show it refers to.
Then I need to be able to count the occurrences of each hashtag in a certain period (I don't know how).
public class Tweet : IModelo
{
public string Id { get; set; }
public string RedeId { get; set; }
public string ProgramaId { get; set; }
public DateTime DataPublicacao { get; set; }
public string Conteudo { get; set; }
public string Aplicacao { get; set; }
public Autor Autor { get; set; }
public Twitter.Monitor.Dominio.Modelo.TweetJson.Geo LocalizacaoGeo { get; set; }
public Twitter.Monitor.Dominio.Modelo.TweetJson.Place Localizacao { get; set; }
public Twitter.Monitor.Dominio.Modelo.TweetJson.Entities Entidades { get; set; }
public string Imagem { get; set; }
public Autor Para_Usuario { get; set; }
public string Retweet_Para_Status_Id { get; set; }
}
And the "entities" are hashtags, usermentions, and urls.
I tried to group the hashtags by broadcaster, TV show, and text, and to list the dates of the occurrences. Then I have to transform the results so I can count the occurrences in that period.
public class EntityResult
{
public string hashtagText { get; set; }
public string progId { get; set; }
public string redeId { get; set; }
public int listCount { get; set; }
}
public class HashtagsIndex : AbstractIndexCreationTask<Tweet, HashtagsIndex.ReduceResults>
{
public class ReduceResults
{
public string hashtagText { get; set; }
public DateTime createdAt { get; set; }
public string progId { get; set; }
public string redeId { get; set; }
public List<DateTime> datesList { get; set; }
}
public HashtagsIndex()
{
Map = tweets => from tweet in tweets
where tweet.Entidades != null // check for null before enumerating the hashtags
from hts in tweet.Entidades.hashtags
select new
{
createdAt = tweet.DataPublicacao,
progId = tweet.ProgramaId,
redeId = tweet.RedeId,
hashtagText = hts.text,
datesList = new List<DateTime>(new DateTime[] { tweet.DataPublicacao })
};
Reduce = results => from result in results
group result by new { result.progId, result.redeId, result.hashtagText }
into g
select new
{
createdAt = DateTime.MinValue,
progId = g.Key.progId,
redeId = g.Key.redeId,
hashtagText = g.Key.hashtagText,
// merge the per-entry lists so the dates survive when already-reduced results are reduced again
datesList = g.SelectMany(t => t.datesList).ToList()
};
}
}
And the query I made so far is:
var hashtags2 = session.Query<dynamic, HashtagsIndex>().Customize(t => t.TransformResults((query, results) =>
results.Cast<dynamic>().Select(g =>
{
Expression<Func<DateTime, bool>> exp = o => o >= dtInit && o <= dtEnd;
int count = g.Where(exp);
return new EntityResult
{
redeId = g.redeId,
progId = g.progId,
hashtagText = g.hashtagText,
listCount = count
};
}))).Take(20).ToList();
Now I need to OrderByDescending(t => t.listCount), so that I can Take(20) the most used hashtags in that period.
How do I do that?

Is it possible to filter items before the mapreduce process?
A map/reduce index is just like any other index. All documents are processed through all indexes, always. So when phrased with "before" like you asked, the answer is clearly "no".
But I think you are just interested in filtering items during the indexing, and that is easily done in the map:
Map = items => from item in items
where item.foo == whatever // this is how you filter
select new
{
// whatever you want to map
}
This index will process all documents, but the resulting index will only contain items that match the filter you specified in the where clause.
Is it possible to subsequently group by features, like users by age, and then by region
Grouping is done in the reduce step. That is what map/reduce is all about.
My advice to you (and I mean no disrespect by this) is to walk before you try to run. Build a simple prototype or set of unit tests, and first try just basic storage and retrieval. Then try basic indexing and querying. Then try a simple map/reduce, such as counting all your tweets. Only then should you attempt an advanced map/reduce with other groupings. And if you run into trouble, you will have code you can post here for help.
Is it possible?
Of course. Anything is possible. :)
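To make the filter/group/count idea concrete, here is a minimal in-memory sketch using plain LINQ to Objects. It is not a RavenDB index: TweetSample, HashtagCount and HashtagStats are hypothetical names, and the tweet class is a simplified stand-in for the one in the question. It only illustrates the shape of the computation (filter by period, flatten the hashtags, group, count, order, take 20).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the Tweet document in the question.
public class TweetSample
{
    public string RedeId { get; set; }
    public string ProgramaId { get; set; }
    public DateTime DataPublicacao { get; set; }
    public List<string> Hashtags { get; set; }
}

// Hypothetical result type: one row per (broadcaster, show, hashtag).
public class HashtagCount
{
    public string RedeId { get; set; }
    public string ProgramaId { get; set; }
    public string HashtagText { get; set; }
    public int Count { get; set; }
}

public static class HashtagStats
{
    public static List<HashtagCount> TopHashtags(
        IEnumerable<TweetSample> tweets, DateTime start, DateTime end, int take = 20)
    {
        return tweets
            // "map" side: skip tweets without hashtags and keep only the period
            .Where(t => t.Hashtags != null)
            .Where(t => t.DataPublicacao >= start && t.DataPublicacao <= end)
            // one entry per hashtag occurrence
            .SelectMany(t => t.Hashtags,
                        (t, tag) => new { t.RedeId, t.ProgramaId, Tag = tag })
            // "reduce" side: group and count
            .GroupBy(x => new { x.RedeId, x.ProgramaId, x.Tag })
            .Select(g => new HashtagCount
            {
                RedeId = g.Key.RedeId,
                ProgramaId = g.Key.ProgramaId,
                HashtagText = g.Key.Tag,
                Count = g.Count()
            })
            .OrderByDescending(r => r.Count)
            .Take(take)
            .ToList();
    }
}
```

Because the date filter runs before the grouping here, the counts depend on the period. That is the part that is hard to express in a single precomputed map/reduce index.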

Related

How to retrieve list from DB to front-end with AutoMapper

After changing the mapping to AutoMapper, only an empty list is sent through the endpoint.
Initially I had an endpoint that retrieved all employees with their info, including a list of every course each employee had taken. This used manual mapping between entities and DTOs.
//From startup.cs in Configure
AutoMapper.Mapper.Initialize(cfg =>
{
cfg.CreateMap<Employee, Models.EmployeeCoursesDto>();
cfg.CreateMap<Employee, Models.EmployeeDto>();
cfg.CreateMap<EmployeeCourses, Models.EmployeeCoursesDto>();
});
//From Employee entity
public class Employee
{
[Key]
//Gen new Id key in DB when object created
[DatabaseGenerated(DatabaseGeneratedOption.Identity)]
public int Id { get; set; }
[Required]
[MaxLength(50)]
public string Name { get; set; }
[MaxLength(50)]
public string Title { get; set; }
public ICollection<EmployeeCourses> EmployeeCourses { get; set; }
= new List<EmployeeCourses>();
}
//From employee Dto
public class EmployeeDto
{
public int Id { get; set; }
public string Name { get; set; }
public string Title { get; set; }
public ICollection<EmployeeCoursesDto> EmployeeCourses { get; set; }
= new List<EmployeeCoursesDto>();
}
//Endpoint in controller
[HttpGet()]
public IActionResult GetAllEmployees()
{
var employeeEntities = _employeeInfoRepository.GetEmployees();
var results = Mapper.Map<IEnumerable<EmployeeDto>>(employeeEntities);
return Ok(results);
}
//From Irepository
IEnumerable<Employee> GetEmployees();
//From repository
public IEnumerable<Employee> GetEmployees()
{
return _context.Employees.OrderBy(c => c.Name).ToList();
}
I expected the output to contain all employees with all data fields, including their list of courses.
The output has data in all fields except the list of courses, whose count is 0 when inspected at a breakpoint. In Postman it shows only:
"id": 2,
"name": "Test Person",
"title": "Bus Driver",
"numberOfCourses": 0,
"employeeCourses": [],
"totalAchievedHoursAuditor": 0,
"totalAchievedHoursAccountant": 51,
"courseBalanceAccountant": null,
"courseBalanceAuditor": null
However, if I try another endpoint that retrieves only a specific course, or a list of courses, the data shows correctly. It seems there is an issue with mapping the employees and courses at the same time?
I found the error. It was not AutoMapper but my LINQ statement, which did not load the related courses. Adding Include fixes it:
return _context.Employees.Include(c => c.EmployeeCourses).ToList();
Please close this thread. Thanks for the reply Lucian Bargaoanu & have a great weekend.

How do I use C# List<string> to store multiple file paths in ASP.NET MVC 4

I am working on a small project with ASP.NET MVC 4 and .NET Framework 4.5. I have a class RecReservation to deal with user reservations. A user may make a reservation first and upload 2-3 files later, and the files have to be related to a certain reservation. At first I defined a string property in RecReservation to store the file path. It worked, but only one file can be related to a reservation this way, so I need a solution that handles multiple file paths.
So I defined a List<string> _recFilePath in my RecReservation class to store the file paths. Each time a file is uploaded, its path should be appended to _recFilePath. The class looks like this:
public class RecReservation
{
[HiddenInput(DisplayValue = false)]
public int RecReservationID { get; set; }
public String ProjectName { get; set; }
[Required]
public string ProjectManager { get; set; }
public String Department { get; set; }
[DataType(DataType.Date)]
public DateTime ReservationDate { get; set; }
public String ReservationTime { get; set; }
[Required]
public string CourseName { get; set; }
[Required]
public string InstructorName { get; set; }
[Required]
public string PhoneNumber { get; set; }
[DataType(DataType.MultilineText)]
public string Description { get; set; }
private List<string> _recFilePath = new List<string>();
public List<string> RecFilePath
{
get
{ return _recFilePath; }
}
public void AddRecFilePath(string newpath)
{
_recFilePath.Add(newpath);
}
}
In the database, I defined an nvarchar(500) column for _recFilePath.
In the controller, I defined a pair of RecUpload overloads to deal with file uploads:
public ViewResult RecUpload(int recReservationID)
{
RecReservation recReservation = repository.RecReservations.FirstOrDefault(p => p.RecReservationID == recReservationID);
return View(recReservation);
}
[HttpPost]
public ActionResult RecUpload(HttpPostedFileBase file, RecReservation recReservation)
{
string fileName = Path.GetFileName(file.FileName);
string path = Path.Combine(Server.MapPath("~/Files/RecUploads"), fileName);
file.SaveAs(path);
repository.AddFilePathToRec(recReservation, path);
TempData["message"] = string.Format("{0} has been uploaded", fileName);
return RedirectToAction("AdminRecList");
}
This is the implementation of AddFilePathToRec:
public void AddFilePathToRec(RecReservation recReservation, string newFilePath)
{
RecReservation dbEntry = context.RecReservations.Find(recReservation.RecReservationID);
if (dbEntry != null)
{
dbEntry.AddRecFilePath(newFilePath);
}
context.SaveChanges();
}
I have now tested the implementation: the file upload works fine, but no path is added to the _recFilePath field in my database. What is the reason for this? Help is appreciated! Thank you in advance.
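One likely cause, assuming context is an Entity Framework DbContext (the question does not say so explicitly): as far as I know, EF versions of that period map only scalar properties that have both a getter and a setter, so a get-only List<string> is never written to the nvarchar column. A minimal sketch of one possible workaround is to persist a single delimited string and expose the list through methods. RecReservationSketch, RecFilePathsRaw and the '|' separator are hypothetical choices for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down version of RecReservation.
public class RecReservationSketch
{
    public int RecReservationID { get; set; }

    // Scalar string with getter and setter: the kind of property
    // that can be mapped to a single nvarchar column.
    public string RecFilePathsRaw { get; set; }

    // Assumed separator; '|' is not allowed in Windows file names.
    private const char Separator = '|';

    // Exposed as a method (not a property) so a mapper does not try to persist it.
    public List<string> GetRecFilePaths()
    {
        if (string.IsNullOrEmpty(RecFilePathsRaw))
        {
            return new List<string>();
        }
        return new List<string>(RecFilePathsRaw.Split(Separator));
    }

    public void AddRecFilePath(string newPath)
    {
        RecFilePathsRaw = string.IsNullOrEmpty(RecFilePathsRaw)
            ? newPath
            : RecFilePathsRaw + Separator + newPath;
    }
}
```

With full server paths, a 500-character column fills up after a few files, so a separate table with one row per file may be the more robust design.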

How to get distinct values from 2 different columns in the same list

As you can see from the code below, I have a list named Matches, from which I would like to get a single list of the distinct teams, taken from both HomeTeam and AwayTeam. I'm trying to use LINQ, and I can get a list of distinct teams if I use only the HomeTeam property or only the AwayTeam property, but not both together.
Thank you.
public class Match
{
public int ID { get; set; }
public string Country { get; set; }
public string Championship { get; set; }
public string Seasson { get; set; }
public DateTime MatchDate { get; set; }
public string HomeTeam { get; set; }
public int HomeScore { get; set; }
public int AwayScore { get; set; }
public string AwayTeam { get; set; }
}
private List<Match> Matches;
Matches = dataAccess.GetAllMatches();
I'm trying to do something like this:
result = Matches.Select(HomeTeam, AwayTeam).Distinct().ToList();
At the risk that this smells like homework, a hint rather than code. Get your Home teams, Union your Away teams and apply a Distinct to the result.
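The hint above can be sketched as follows. MatchSample is a hypothetical, trimmed-down version of the Match class, and TeamQueries is an illustrative helper name. Note that Enumerable.Union already removes duplicates, so a separate Distinct call is not needed after it.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of the Match class from the question.
public class MatchSample
{
    public string HomeTeam { get; set; }
    public string AwayTeam { get; set; }
}

public static class TeamQueries
{
    public static List<string> DistinctTeams(IEnumerable<MatchSample> matches)
    {
        // Materialize once so the source is not enumerated twice.
        List<MatchSample> list = matches.ToList();

        // Union is a set operation: each team name appears once in the result.
        return list.Select(m => m.HomeTeam)
                   .Union(list.Select(m => m.AwayTeam))
                   .ToList();
    }
}
```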
So I finally came up with this solution.
Note that I now need to get not only the team but also the country the team belongs to.
public class Team
{
public string Name { get; set; }
public string Country { get; set; }
}
Union really does the job here, but for now I need to project to an anonymous type. Here is the code:
List<Team> teams = new List<Team>();
var result = Matches.Select(x => new { Name = x.HomeTeam, Country = x.Country }).Union(Matches.Select(x => new { Name = x.AwayTeam, Country = x.Country })).ToList();
foreach (var record in result)
{
teams.Add(new Team { Name = record.Name, Country = record.Country });
}
return teams;
I would prefer this way:
List<Team> teamsResult = Matches.Select(x => new Team { Name = x.HomeTeam, Country = x.Country }).Union(Matches.Select(x => new Team { Name = x.AwayTeam, Country = x.Country })).ToList();
But this way produces duplicates, so I will stick with the first example for now.
Do you think the first is the more elegant way to go?
Thank you.
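The duplicates in the second version come from equality semantics: anonymous types compare by value, while a plain class such as Team compares by reference unless it overrides Equals and GetHashCode. A minimal sketch, using a hypothetical EquatableTeam class in place of Team, shows how value equality lets Union work directly on the class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of Team with value equality on Name and Country.
public sealed class EquatableTeam : IEquatable<EquatableTeam>
{
    public string Name { get; set; }
    public string Country { get; set; }

    public bool Equals(EquatableTeam other)
    {
        return other != null && Name == other.Name && Country == other.Country;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as EquatableTeam);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            // Combine both fields; equal teams must produce equal hash codes.
            int hash = 17;
            hash = hash * 31 + (Name == null ? 0 : Name.GetHashCode());
            hash = hash * 31 + (Country == null ? 0 : Country.GetHashCode());
            return hash;
        }
    }
}

public static class EquatableTeamDemo
{
    // Union uses the default equality comparer, which picks up IEquatable<T>.
    public static List<EquatableTeam> Merge(
        IEnumerable<EquatableTeam> home, IEnumerable<EquatableTeam> away)
    {
        return home.Union(away).ToList();
    }
}
```

An alternative with the same effect is to leave Team untouched and pass a custom IEqualityComparer<Team> as the second argument to Union.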
You can take advantage of GroupBy, like this:
IEnumerable<Team> teams = Matches.GroupBy(m => new { m.AwayTeam, m.HomeTeam, m.Country })
.Select(
g =>
new[]
{
new Team {Country = g.Key.Country, Name = g.Key.AwayTeam},
new Team {Country = g.Key.Country, Name = g.Key.HomeTeam}
})
.SelectMany(x => x)
.GroupBy(t => new { t.Name, t.Country })
.Select(g => new Team { Name = g.Key.Name, Country = g.Key.Country });

How to convert List<KeyValuePair<string,object>> to a class/model in .NET Core

I have a response of type List<List<KeyValuePair<string, object>>>, but I want to convert it into List<ClassName>.
Each KeyValuePair key has the same name and type as the corresponding ClassName property.
What is the best way to convert the response programmatically?
My class structure:
public class TestModel
{
public string TaxablePersonCode { get; set; }
public string LegalNameAsPerPan { get; set; }
public string TradeName { get; set; }
public string ConstitutionName { get; set; }
public string ResidentialStatusName { get; set; }
public string PrimaryMobileNo { get; set; }
public string FlatOrOfficeNo { get; set; }
public string TownOrCityOrDist { get; set; }
public string Pincode { get; set; }
public string StateName { get; set; }
public string CountryName { get; set; }
public string ContactName { get; set; }
public string ContactDesignationName { get; set; }
public string ContactMobile { get; set; }
public string ContactEmail { get; set; }
}
var listKeyValue = response.Select(x => x.Value).ToList();
var data = JsonConvert.DeserializeObject<List<TestModel>>(listKeyValue);
The only part of this I'm unsure about is how to index the inner list for the correct property, but you can choose what works best for you. Using .Where() every time ensures you get the right result, but it searches the list on every lookup and is a lot slower. If you're 100% certain the order of the list will never change, you could gain some performance by indexing the list directly with [] or .ElementAt(). Anyway, here's what you're looking for.
// Value is typed as object, so convert it to string for the string properties
List<TestModel> myList = response.Select(x => new TestModel
{
// Using Where
TaxablePersonCode = x.Where(t => t.Key == "TaxablePersonCode").First().Value?.ToString(),
// Using direct index
LegalNameAsPerPan = x[1].Value?.ToString(),
// Using ElementAt
TradeName = x.ElementAt(2).Value?.ToString(),
...
}).ToList();
Hope that helps!
Did you get any result with the JsonConvert class? Did it work for you? If not, you can try something like this (if the JSON and the fields are properly named):
var result = new List<TestModel>();
// response is already a list of key/value lists, so iterate it directly
foreach (var keyValueList in response)
{
// convert the list of KeyValuePair to a dictionary
var dictionary = keyValueList.ToDictionary(kv => kv.Key, kv => kv.Value);
var tempModel = new TestModel();
// get the actual value by the name of the property
tempModel.TaxablePersonCode = dictionary[nameof(tempModel.TaxablePersonCode)]?.ToString();
// etc.
result.Add(tempModel);
}
Maybe this approach could be improved with reflection, but this will degrade the performance.
// get all properties to populate
var properties = typeof(TestModel).GetProperties(BindingFlags.Public | BindingFlags.Instance);
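A minimal sketch of how the reflection idea could be completed, under stated assumptions: KeyValueMapper and SampleModel are hypothetical names, keys that match no property are skipped, and Convert.ChangeType is only expected to handle simple conversions (it will throw for values it cannot convert, and it does not handle nullable target types). The property lookup is built once per call to limit the reflection cost mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical two-property model standing in for TestModel.
public class SampleModel
{
    public string TradeName { get; set; }
    public string Pincode { get; set; }
}

public static class KeyValueMapper
{
    public static List<T> MapTo<T>(
        IEnumerable<IEnumerable<KeyValuePair<string, object>>> rows) where T : new()
    {
        // Look up the writable public properties once, keyed by name.
        Dictionary<string, PropertyInfo> properties = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanWrite)
            .ToDictionary(p => p.Name);

        var result = new List<T>();
        foreach (var row in rows)
        {
            var item = new T();
            foreach (var pair in row)
            {
                PropertyInfo property;
                // Skip null values and keys that match no property.
                if (pair.Value != null && properties.TryGetValue(pair.Key, out property))
                {
                    object converted = Convert.ChangeType(pair.Value, property.PropertyType);
                    property.SetValue(item, converted);
                }
            }
            result.Add(item);
        }
        return result;
    }
}
```

Usage would look like `List<SampleModel> models = KeyValueMapper.MapTo<SampleModel>(response);` where response is the List<List<KeyValuePair<string, object>>> from the question.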
I see you're using JSON, so you should probably just deserialize the object properly, which I would expect to look something like this:
List<TestModel> models = JsonConvert.DeserializeObject<List<TestModel>>(response);
Otherwise, you could use reflection to bind the known KeyValuePair keys to the properties of the object; that being said, you will need to ensure that the return values are compatible with the values from the returned data, else this will fail.
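List<TestModel> models = outerList.Select(innerList => {
TestModel result = new TestModel();
innerList.ForEach(listItem => {
// set the property whose name matches the key, if there is one
result
.GetType()
.GetProperty(listItem.Key)
?.SetValue(result, listItem.Value);
});
return result; // collect each populated model
}).ToList();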

How to update the content inside a list in C#

I have a class as follows:
public class student
{
public string studentID { get; set; }
public string studentName { get; set; }
public string studentGender { get; set; }
public string studentCGP { get; set; }
}
List<student> students = new List<student>();
..... I have added some data to the students list mentioned above, except for the studentCGP values.
After my other calculation of the studentCGP data, how do I put each value back into the matching student? I'll have the studentID and studentCGP in hand.
Using List<T>.Find (LINQ's FirstOrDefault works the same way)...
var student = students.Find( x => x.studentID == idValue );
student.studentCGP = cgpValue;
Seems pretty trivial... am I missing something in the question?
students.Single(o => o.studentID == idValue).studentCGP = cgpValue;
There are a number of functions you can use; just choose the one that suits you best. For example, you can also use First, Last, etc.
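The choice between these functions matters mostly when the ID is missing: Find and FirstOrDefault return null, so the assignment would throw a NullReferenceException, while Single and First throw an InvalidOperationException (Single also throws when more than one element matches). A minimal sketch of a defensive variant follows. StudentRecord and StudentUpdates are hypothetical names used for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down version of the student class.
public class StudentRecord
{
    public string StudentID { get; set; }
    public string StudentCGP { get; set; }
}

public static class StudentUpdates
{
    // Returns false when no student has the given ID, instead of throwing.
    public static bool TrySetCgp(List<StudentRecord> students, string id, string cgp)
    {
        StudentRecord match = students.FirstOrDefault(s => s.StudentID == id);
        if (match == null)
        {
            return false;
        }

        // The list holds references, so this updates the object inside the list.
        match.StudentCGP = cgp;
        return true;
    }
}
```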