CouchDB: Group data by key - mapreduce

I have the following data in my database:
| value1 | value2 |
|--------+--------|
| 1      | a      |
| 1      | b      |
| 2      | a      |
| 3      | c      |
| 3      | d      |
|--------+--------|
What I want as output is {"key":1,"value":["a","b"]},{"key":2,"value":["a"]},{"key":3,"value":["c","d"]}
I wrote this map function (but I'm not quite sure whether it is correct):
function(doc) {
    emit(doc.value1, doc.value2);
}
...but I am missing the reduce-function. Thanks for your help!

I'm not sure whether this can or should be done with a reduce function.
However, you can reformat the view output with a list function. Try the following:
function (head, req) {
    var row,
        returnObj = {};
    while (row = getRow()) {
        if (returnObj[row.key]) {
            returnObj[row.key].push(row.value);
        } else {
            returnObj[row.key] = [row.value];
        }
    }
    send(JSON.stringify(returnObj));
};
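The output should look like this (pretty-printed here; JSON.stringify itself emits it on a single line, with the keys as strings):
{
    "1": [
        "a",
        "b"
    ],
    "2": [
        "a"
    ],
    "3": [
        "c",
        "d"
    ]
}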
Hope that helps.
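To check the grouping logic outside CouchDB, here is a rough Python sketch of what the list function above does. It is only an illustration: the rows list is a hypothetical stand-in for the rows the view would emit, and group_rows is a made-up helper name.

```python
import json

def group_rows(rows):
    """Collect the values of all rows that share a key, preserving row order."""
    grouped = {}
    for row in rows:
        # Start a new list for an unseen key, then append the value.
        grouped.setdefault(row["key"], []).append(row["value"])
    return grouped

# Hypothetical view output for the sample data in the question.
rows = [
    {"key": 1, "value": "a"},
    {"key": 1, "value": "b"},
    {"key": 2, "value": "a"},
    {"key": 3, "value": "c"},
    {"key": 3, "value": "d"},
]

grouped = group_rows(rows)
# Serializing to JSON turns the integer keys into strings, as in the list function.
print(json.dumps(grouped))
```

Like the list function, this builds the whole result in memory before sending it, which is fine for small views but may need rethinking for very large ones.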


Build CGAL Mesh_criteria one by one

I used to build my CGAL Mesh_criteria as follows, and that works well:
auto criteria = Mesh_criteria(
    CGAL::parameters::edge_size=edge_size,
    CGAL::parameters::facet_angle=facet_angle,
    CGAL::parameters::facet_size=facet_size,
    CGAL::parameters::facet_distance=facet_distance,
    CGAL::parameters::cell_radius_edge_ratio=cell_radius_edge_ratio,
    CGAL::parameters::cell_size=size
    );
Now I have a function in which only some of the criteria are constrained; the other values are invalid (e.g., negative). I would like to build Mesh_criteria as follows (pseudocode), but I don't know how to do it:
auto criteria = Mesh_criteria();
if edge_size > 0.0:
criteria.add(CGAL::parameters::edge_size=edge_size);
if facet_angle > 0.0:
criteria.add(CGAL::parameters::facet_angle=facet_angle);
// [...]
Any hints?
I don't see any solution other than knowing the default values and using the ternary operator ?:.
Here is a copy-paste from the CGAL source code that shows the default values:
template <class ArgumentPack>
Mesh_criteria_3_impl(const ArgumentPack& args)
  : edge_criteria_(args[parameters::edge_size
                        | args[parameters::edge_sizing_field
                               | args[parameters::sizing_field | FT(DBL_MAX)] ] ])
  , facet_criteria_(args[parameters::facet_angle | FT(0)],
                    args[parameters::facet_size
                         | args[parameters::facet_sizing_field
                                | args[parameters::sizing_field | FT(0)] ] ],
                    args[parameters::facet_distance | FT(0)],
                    args[parameters::facet_topology | CGAL::FACET_VERTICES_ON_SURFACE])
  , cell_criteria_(args[parameters::cell_radius_edge_ratio
                        | args[parameters::cell_radius_edge | FT(0)] ],
                   args[parameters::cell_size
                        | args[parameters::cell_sizing_field
                               | args[parameters::sizing_field | FT(0)] ] ])
{ }
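The selection logic behind the ternary approach can be sketched independently of CGAL. The Python snippet below is only an illustration, not CGAL code: the DEFAULTS table copies the numeric fallback values visible in the constructor above (DBL_MAX for edge_size, 0 for the others), and the helper name effective_criteria is hypothetical. In C++, each entry would correspond to an expression such as edge_size > 0.0 ? edge_size : DBL_MAX passed to Mesh_criteria.

```python
import sys

# DBL_MAX is modelled with the largest finite Python float.
DBL_MAX = sys.float_info.max

# Fallback values copied from the constructor shown above.
DEFAULTS = {
    "edge_size": DBL_MAX,
    "facet_angle": 0.0,
    "facet_size": 0.0,
    "facet_distance": 0.0,
    "cell_radius_edge_ratio": 0.0,
    "cell_size": 0.0,
}

def effective_criteria(**requested):
    """Return one value per criterion: the requested value if it is valid
    (positive), otherwise the fallback from DEFAULTS."""
    return {
        name: requested[name] if requested.get(name, -1.0) > 0.0 else default
        for name, default in DEFAULTS.items()
    }

# facet_angle is invalid (negative), so it falls back to its default.
criteria = effective_criteria(edge_size=0.5, facet_angle=-1.0, cell_size=2.0)
print(criteria)
```

The point of this design is that every parameter is always passed, so the call site keeps the single-constructor form from the question.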

Replacing regex pattern with another string works, but replacing with NONE replaces all values

I am trying to replace all strings in a column that start with 'DEL_' with a NULL value.
I have tried this:
customer_details = customer_details.withColumn("phone_number", F.regexp_replace("phone_number", "DEL_.*", ""))
Which works as expected and the new column now looks like this:
+--------------+
| phone_number|
+--------------+
|00971585059437|
|00971559274811|
|00971559274811|
| |
|00918472847271|
| |
+--------------+
However, if I change the code to:
customer_details = customer_details.withColumn("phone_number", F.regexp_replace("phone_number", "DEL_.*", None))
This now replaces all values in the column:
+------------+
|phone_number|
+------------+
| null|
| null|
| null|
| null|
| null|
| null|
+------------+
Try this:
Scala:
import org.apache.spark.sql.functions.{col, when}

df.withColumn("phone_number",
  when(col("phone_number").rlike("^DEL_.*"), null)
    .otherwise(col("phone_number")))
Python:
from pyspark.sql.functions import col, when

df.withColumn("phone_number",
    when(col("phone_number").rlike("^DEL_.*"), None)
    .otherwise(col("phone_number")))
Update
Question: Can you explain why my original solution doesn't work? customer_details.withColumn("phone_number", F.regexp_replace("phone_number", "DEL_.*", None))
Answer: All ternary expressions (functions taking 3 arguments) are null-safe. That means that if Spark finds any of the arguments to be null, it returns null without doing any actual processing (e.g., pattern matching for regexp_replace). Because the replacement argument is None for every row, every row becomes null.
You may want to look at this piece of the Spark repo:
override def eval(input: InternalRow): Any = {
  val exprs = children
  val value1 = exprs(0).eval(input)
  if (value1 != null) {
    val value2 = exprs(1).eval(input)
    if (value2 != null) {
      val value3 = exprs(2).eval(input)
      if (value3 != null) {
        return nullSafeEval(value1, value2, value3)
      }
    }
  }
  null
}
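The effect of this null-safe evaluation can be reproduced with a toy model in plain Python. This is a sketch of the behaviour described above, not Spark code: both function names are hypothetical, None stands in for SQL null, and Python's re module stands in for the Java regex engine (the two agree for this simple pattern).

```python
import re

def null_safe_regexp_replace(value, pattern, replacement):
    """Toy model of a null-safe 3-argument expression: if any argument is
    None, return None without attempting the pattern match."""
    if value is None or pattern is None or replacement is None:
        return None
    return re.sub(pattern, replacement, value)

def null_if_matches(value, pattern):
    """Toy model of when(col.rlike(pattern), None).otherwise(col)."""
    if value is not None and re.search(pattern, value):
        return None
    return value

phones = ["00971585059437", "DEL_00971559274811"]

# A None replacement nulls every row, matching row or not.
print([null_safe_regexp_replace(p, "DEL_.*", None) for p in phones])
# The when/otherwise form nulls only the rows that match.
print([null_if_matches(p, "^DEL_.*") for p in phones])
```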

Keep only newest records using DQL

I have a Symfony app with Doctrine. There is a table like:
+---------+---------------------+-------+
| user    | log_date            | foo   |
+---------+---------------------+-------+
| john    | 2018-03-20 22:59:18 | 58    |
| kyle    | 2018-04-11 13:45:02 | 22    |
| paul    | 2018-11-08 22:19:16 | 41    |
| kyle    | 2018-08-14 09:39:26 | 19    |
| fred    | 2018-03-28 06:08:31 | 24    |
| john    | 2018-01-21 11:52:17 | 81    |
| ...     | ...                 | ...   |
+---------+---------------------+-------+
A cron job should run a Symfony command that deletes all records except the latest 10 of every user. Can this be done using DQL, or do I have to use an SQL (sub-)query?
I think something like this in the entity repository can get all the entries for a user except the latest 10:
public function getAllExceptLatest($user)
{
    return $this
        ->createQueryBuilder('t')
        // Restrict to this user's rows; without this filter other users' rows are returned too.
        ->andWhere('t.user = :user')
        ->andWhere('t.logDate <= :logDate')
        ->orderBy('t.logDate', 'DESC')
        ->setParameter('user', $user)
        ->setParameter('logDate', $this->getLatestDate($user))
        // Skip the 10 newest rows so that only the older ones are returned.
        ->setFirstResult(10)
        ->getQuery()
        ->execute();
}

public function getLatestDate($user)
{
    return $this->createQueryBuilder('e')
        ->select('MAX(e.logDate)')
        ->andWhere('e.user = :user')
        ->setParameter('user', $user)
        ->getQuery()
        ->getSingleScalarResult();
}
And in the controller you can use:
public function keepLatest()
{
    $em = $this->getDoctrine()->getManager();
    // Assumes getAllExceptLatest() is defined on this repository.
    $userRepo = $em->getRepository(User::class);
    $users = $userRepo->findAll();
    foreach ($users as $u) {
        $records = $userRepo->getAllExceptLatest($u);
        foreach ($records as $r) {
            $em->remove($r);
        }
    }
    $em->flush();
}
I didn't test this, but similar methods work fine in my apps.

Mockito verifying method invocation without using equals method

While using Spock I can do something like this:
when:
12.times {mailSender.send("blabla", "subject", "content")}
then:
12 * javaMailSender.send(_)
When I tried to do the same in Mockito:
verify(javaMailSender,times(12)).send(any(SimpleMailMessage.class))
I got an error saying that SimpleMailMessage has null values, so I had to initialize it in the test:
SimpleMailMessage simpleMailMessage = new SimpleMailMessage()
simpleMailMessage.setTo("blablabla")
simpleMailMessage.subject = "subject"
simpleMailMessage.text = "content"
verify(javaMailSender,times(12)).send(simpleMailMessage)
Now it works, but it's a lot of work and I really don't care about equality. What if SimpleMailMessage had many more fields, or nested objects with fields of their own? Is there any way to check only that the send method was called X times?
EDIT: added implementation of send method.
private fun sendEmail(recipient: String, subject: String, content: String)
{
    val mailMessage = SimpleMailMessage()
    mailMessage.setTo(recipient)
    mailMessage.subject = subject
    mailMessage.text = content
    javaMailSender.send(mailMessage)
}
There are 2 senders: mailSender is my custom object and javaMailSender comes from another library.
Stacktrace:
Mockito.verify(javaMailSender,
Mockito.times(2)).send(Mockito.any(SimpleMailMessage.class))
| | | | |
| | | | null
| | | Wanted but not invoked:
| | | javaMailSender.send(
| | | <any org.springframework.mail.SimpleMailMessage>
| | | );
| | | -> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
| | |
| | | However, there were exactly 2 interactions with this mock:
| | | javaMailSender.send(
| | | SimpleMailMessage: from=null; replyTo=null; to=blabla; cc=; bcc=; sentDate=null; subject=subject; text=content
| | | );
| | | -> at MailSenderServiceImpl.sendEmail(MailSenderServiceImpl.kt:42)
| | |
| | | javaMailSender.send(
| | | SimpleMailMessage: from=null; replyTo=null; to=blabla; cc=; bcc=; sentDate=null; subject=subject; text=content
| | | );
If you don't care about the argument passed to send, leave any() empty:
verify(javaMailSender,times(12)).send(any())

Sorting Vector Alphabetically by Index Value

I have a vector that I want to sort alphabetically. I have been able to sort it alphabetically by the values of one index, but doing so only changes the order within that index and not the entire vector. How can I apply the order change to the entire vector?
This is the code I am currently running:
std::sort(myvector[2].begin(), myvector[2].end(), compare);

bool icompare_char(char c1, char c2)
{
    return std::toupper(c1) < std::toupper(c2);
}

bool compare(std::string const& s1, std::string const& s2)
{
    if (s1.length() > s2.length())
        return true;
    if (s1.length() < s2.length())
        return false;
    return std::lexicographical_compare(s1.begin(), s1.end(),
                                        s2.begin(), s2.end(),
                                        icompare_char);
}
My general structure for this vector is vector[row][column] where:
| One | Two | Three |
| 1 | 2 | 3 |
| b | a | c |
For example if I had a vector:
myvector[0][0] = 'One' AND myvector[2][0]='b'
myvector[0][1] = 'Two' AND myvector[2][1]='a'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| One | Two | Three |
| 1 | 2 | 3 |
| b | a | c |
And I sort it I get:
myvector[0][0] = 'One' AND myvector[2][0]='a'
myvector[0][1] = 'Two' AND myvector[2][1]='b'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| One | Two | Three |
| 1 | 2 | 3 |
| a | b | c |
and not what I want:
myvector[0][0] = 'Two' AND myvector[2][0]='a'
myvector[0][1] = 'One' AND myvector[2][1]='b'
myvector[0][2] = 'Three' AND myvector[2][2]='c'
| Two | One | Three |
| 2 | 1 | 3 |
| a | b | c |
I looked around for a good approach but could not find anything that worked... I was thinking something like:
std::sort (myvector.begin(), myvector.end(), compare);
Then I would handle the sorting of the third index within my compare function so that the whole vector gets reordered. But when I passed my data in, I either changed the order only inside the function without changing the top layer, or got errors. Any advice or help would be greatly appreciated. Thank you in advance.
Ideally, merge the 3 data fields into a struct, so that you have just 1 vector and can sort it directly.
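#include <algorithm>
#include <string>
#include <vector>

// One element per column of the original table.
struct DataElement {
    std::string str;
    char theChar;
    int num;
    // Order elements by their character field.
    bool operator<(const DataElement& other) const { return theChar < other.theChar; }
};

std::vector<DataElement> myvector;
// ... fill myvector, then:
std::sort(myvector.begin(), myvector.end());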
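The same idea (keep the fields of one column together as a record and sort the records) can be illustrated in a few lines of Python. This is only a sketch of the technique, not a replacement for the C++ above: the sample data comes from the question, sort_columns_by_row is a hypothetical helper name, and the case-insensitive key is an assumption based on the question's icompare_char.

```python
# Rows as in the question: names, numbers, letters.
rows = [
    ["One", "Two", "Three"],
    ["1", "2", "3"],
    ["b", "a", "c"],
]

def sort_columns_by_row(table, row_index):
    """Reorder whole columns of a row-major table by the values in one row."""
    # Transpose so that each tuple holds all the fields of one column.
    columns = list(zip(*table))
    # Sort the records by the chosen field, ignoring case.
    columns.sort(key=lambda column: column[row_index].upper())
    # Transpose back to the original row-major layout.
    return [list(row) for row in zip(*columns)]

sorted_rows = sort_columns_by_row(rows, 2)
for row in sorted_rows:
    print(row)
```

Because each record moves as a unit, the name and number stay attached to their letter, which is exactly what sorting myvector[2] on its own cannot do.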