ranges from associative arrays in D 2


I've just started implementing my first medium-scale program in D 2.0 after reading Andrei's book The D Programming Language. One of the first problems I ran into was using the std.algorithm library with a built-in associative array. For example:
#!/usr/bin/env rdmd
import std.stdio;
import std.algorithm;
void main()
{
alias int[string] StringHashmap;
StringHashmap map1;
map1["one"] = 1;
map1["two"] = 2;
writefln("map1: %s", map1);
StringHashmap map2;
map2["two"] = 2;
map2["three"] = 3;
writefln("map2: %s", map2);
auto inter = setIntersection(map1, map2);
}
It seemed simple enough to me: I expected that iterating over inter would produce the single "two" entry. However, I get this compiler error:
./test.d(20): Error: template std.algorithm.setIntersection(alias less = "a < b",Rs...) if (allSatisfy!(isInputRange,Rs)) does not match any function template declaration
./test.d(20): Error: template std.algorithm.setIntersection(alias less = "a < b",Rs...) if (allSatisfy!(isInputRange,Rs)) cannot deduce template function from argument types !()(int[string],int[string])
I can see that the built-in associative array doesn't seem to provide any version of the range to use with the std algorithms.
Am I missing something? Doing something wrong? If not, is this a glaring omission? Is there some reason why this is properly unavailable?

Use this, sorting the keys first because setIntersection assumes sorted input:
auto inter = setIntersection(sort(map1.keys), sort(map2.keys));

Note that std::map in C++ is a sorted data structure, while an associative array in D is unordered. std.algorithm.setIntersection assumes sorted ranges, so you can't use this function until you've converted the associative array into a sorted range, e.g.:
import std.typecons;
import std.array;
import std.algorithm;
import std.stdio;
auto byItemSorted(K,V)(V[K] dict) {
auto app = appender!(Tuple!(K,V)[])();
foreach (k, v; dict)
app.put(tuple(k, v));
auto res = app.data; // if there's byItem() we don't need this appender stuff.
sort(res);
return res;
}
auto dictIntersection(K,V)(V[K] map1, V[K] map2) {
return setIntersection(byItemSorted(map1), byItemSorted(map2));
}
void main () {
auto map1 = ["red":4, "blue":6],
map2 = ["blue":2, "green":1],
map3 = ["blue":6, "purple":8];
writeln("map1 & map2 = ", array(dictIntersection(map1, map2)));
writeln("map1 & map3 = ", array(dictIntersection(map1, map3)));
}
But this method is inefficient: sorting each range takes O(N log N).
A more efficient method is to write your own intersection routine, which takes only O(N) on average:
import std.stdio;
struct DictIntersection(K,V) {
V[K] m1, m2;
this(V[K] map1, V[K] map2) { m1 = map1; m2 = map2; }
int opApply(int delegate(ref K, ref V) dg) {
int res = 0;
foreach (k, v; m1) {
V* p = k in m2;
if (p && v == *p) {
res = dg(k, v);
if (res)
break;
}
}
return res;
}
}
DictIntersection!(K,V) dictIntersection(K,V)(V[K] map1, V[K] map2) {
return typeof(return)(map1, map2);
}
void main () {
auto map1 = ["red":4, "blue":6],
map2 = ["blue":2, "green":1],
map3 = ["blue":6, "purple":8];
write("map1 & map2 = ");
foreach (k, v; dictIntersection(map1, map2)) write(k, "->", v, " ");
write("\nmap1 & map3 = ");
foreach (k, v; dictIntersection(map1, map3)) write(k, "->", v, " ");
}
However, because opApply doesn't count as an input range, none of the range algorithms will work with this. (I don't know how this can be made into an input range.)

You can get either the keys or the values from an associative array.
To get the intersection of the values, sort them first (setIntersection assumes sorted input):
auto inter = setIntersection(sort(map1.values), sort(map2.values));
foreach (i; inter) {
writeln(i);
}
To get the intersection of the keys, sort them first as well:
auto inter = setIntersection(sort(map1.keys), sort(map2.keys));
foreach (i; inter) {
writeln(i);
}
I don't think you can get access to a range containing the key, value pairs like with a C++ std::map.
See http://www.digitalmars.com/d/2.0/hash-map.html

Related

Find neighbor of target in BTreeMap (or any other treemap) in rust

How can I efficiently implement the C++ function below in Rust? The data structure must be tree-based (BTree, RBTree, etc.).
Given a sorted map m, a key target, and a value val:
Find the lower_bound entry (the first key >= target). Return DEFAULT if there is no such entry.
If the value of the found entry <= val and it has previous entry, return value of previous entry.
If the value of the found entry > val and it has next entry, return value of the next entry.
Otherwise, return the found value.
template<class K, class V>
V find_neighbor(const std::map<K, V>& m, const K& target, const V& val) {
auto it = m.lower_bound(target);
if( it == m.end() ) return V{}; // DEFAULT value.
if( it->second <= val && it != m.begin() )
return (--it)->second; // return previous value
if( it->second > val && it != (--m.end()) )
return (++it)->second; // return next value
return it->second; // return target value
}

That's what I've got: a trait, FindNeighbor, that adds the function find_neighbor to all BTreeMaps.
I'm quite confused about what the algorithm does, though, tbh. But it should (tm) behave identically to the C++ version.
If you use this in an actual project, though, for the love of god, please write unit tests for it. 😄
use std::{borrow::Borrow, collections::BTreeMap};
trait FindNeighbor<K, V> {
type Output;
fn find_neighbor(&self, target: K, val: V) -> Self::Output;
}
impl<K, V, KI, VI> FindNeighbor<K, V> for BTreeMap<KI, VI>
where
K: Borrow<KI>,
V: Borrow<VI>,
KI: Ord,
VI: Default + PartialOrd + Clone,
{
type Output = VI;
fn find_neighbor(&self, target: K, val: V) -> VI {
let val: &VI = val.borrow();
let target: &KI = target.borrow();
let mut it = self.range(target..);
match it.next() {
None => VI::default(),
Some((_, it_value)) => {
if it_value <= val {
match self.range(..target).rev().next() {
Some((_, prev_val)) => prev_val.clone(),
None => it_value.clone(),
}
} else {
match it.next() {
Some((_, next_val)) => next_val.clone(),
None => it_value.clone(),
}
}
}
}
}
}
fn main() {
let map = BTreeMap::from([(1, 5), (2, 3), (3, 8)]);
println!("{:?}", map.find_neighbor(3, 10));
}
Output: 3
Note a couple of differences between C++ and Rust:
There are trait annotations on the generic parameters. Generic functions work a little differently from C++ templates: every capability used inside a generic method has to be declared as a trait bound. The advantage is that generics are then guaranteed to work with every type they accept, so no surprising compiler errors appear at the point of use. (C++ templates are more like duck typing, while Rust generics are strongly typed.)
We implement a trait that adds new functionality to an external struct. That is something that doesn't exist in C++ either, and tbh I really like this mechanic in Rust.

How to get the index of each entry when calling the map function on a Map object in Dart / Flutter?

How to get the index of each entry in a Map<K, V> in Dart?
Specifically, can the index of each entry be printed out, if a map function is run on the object as shown in the below example?
e.g. how could I print out:
MapEntry(Spring: 1) is index 0
MapEntry(Chair: 9) is index 1
MapEntry(Autumn: 3) is index 2
etc.
Map<String, int> exampleMap = {"Spring" : 1, "Chair" : 9, "Autumn" : 3};
void main() {
exampleMap.entries.map((e) { return print(e);}).toList(); ///print each index here
}
Note: I can get the index with a List object (e.g. by using exampleList.indexOf(e)), but I'm unsure how to do this when working with a Map object.

You can use forEach and a variable to track the current index:
Map<String, int> exampleMap = {"Spring": 1, "Chair": 9, "Autumn": 3};
void main() {
int i = 0;
exampleMap.forEach((key, value) {
print(i.toString());
print(key);
print(value);
i++;
});
}

partial lookup in key-value map where key itself is a key-value map

Suppose we have a data structure that is a key-value map, where the key itself is again a key-value map. For example:
map<map<string,string>, string>
Now, suppose that we want to query all top-level key/values in this map matching a certain subset of the key-values of the key. Example:
map = { { "k1" : "v1", "k2" : "v2" } : "value1",
{ "k1" : "v3", "k2" : "v4" } : "value2",
{ "k1" : "v1", "k2" : "v5" } : "value3"
}
And our query is "give me all key-values where the key contains { "k1" : "v1" }", and it would return the first and third value. Similarly, querying for { "k1" : "v3", "k2" : "v4" } would return all key-values that have both k1=v3 and k2=v4, yielding the second value. Obviously we could search through the full map on every query, but I'm looking for something more efficient than that.
I have looked around, but can't find an efficient, easy-to-use solution out there for C++. Boost multi_index does not seem to have this kind of flexibility in querying subsets of key-value pairs.
Some databases have ways to create indices that can answer exactly this kind of query. For example, Postgres has GIN indices (generalized inverted indices) that allow you to ask
SELECT * FROM table WHERE some_json_column @> '{"k1":"v1","k2":"v2"}'
-- returns all rows that have both k1=v1 and k2=v2
However, I'm looking for a solution without databases just in C++. Is there any library or data structure out there that can accomplish something like this? In case there is none, some pointers on a custom implementation?

I would stay with the database index analogy. In that analogy, the indexed search does not use a generic k=v type search, but just a tuple with the values for the elements (generally columns) that constitute the index. The database then reverts to scans for the other k=v parameters that are not in the index.
In that analogy, you would have a fixed number of keys that could be represented as a fixed-size array of strings. The good news is that it is then trivial to set a global order on the keys, and thanks to the std::map::upper_bound method, it is also trivial to find an iterator immediately after a partial key.
So getting a full key is immediate: just extract it with find, at or operator []. And getting all elements for a partial key is still simple:
find an iterator starting above the partial key with upper_bound
iterate forward while the element matches the partial key
But this requires that you change your initial type to std::map<std::array<string, N>, string>
You could build an API over this container using std::map<string, string> as input values, extract the actual full or partial key from that, and iterate as above, keeping only elements matching the k,v pairs not present in index.
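The partial-key scan described above could be sketched roughly as follows. This is only an illustration under some assumptions: the keys always have exactly two components in a fixed order, and the names Key, Table and find_by_first are made up for the example. It uses lower_bound rather than upper_bound, so that an entry whose remaining components are all empty strings is not skipped.

```cpp
#include <array>
#include <map>
#include <string>
#include <vector>

// Assumption: every key has exactly two components, (k1, k2), in that order.
using Key = std::array<std::string, 2>;
using Table = std::map<Key, std::string>;

// Hypothetical helper: returns the values of all entries whose first key
// component equals k1, in key order.
inline std::vector<std::string> find_by_first(const Table& table,
                                              const std::string& k1) {
    std::vector<std::string> out;
    // {k1, ""} is the smallest possible key with this first component,
    // so lower_bound lands on the first candidate (std::array compares
    // lexicographically).
    for (auto it = table.lower_bound(Key{k1, ""});
         it != table.end() && it->first[0] == k1; ++it) {
        out.push_back(it->second);
    }
    return out;
}
```

With the three entries from the question stored as {"v1","v2"}, {"v3","v4"} and {"v1","v5"}, calling find_by_first(table, "v1") would visit only the two entries that start with "v1". A query on the second component alone still needs a full scan, as the answer notes.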

You could use std::includes to check if key maps include another map of queried key-value pairs.
I am unsure how to avoid checking every key-map though. Maybe other answers have a better idea.
template <typename MapOfMapsIt, typename QueryMapIt>
std::vector<MapOfMapsIt> query_keymap_contains(
MapOfMapsIt mom_fst,
MapOfMapsIt mom_lst,
QueryMapIt q_fst,
QueryMapIt q_lst)
{
std::vector<MapOfMapsIt> out;
for(; mom_fst != mom_lst; ++mom_fst)
{
const auto& key_map = mom_fst->first; // bind by reference to avoid copying each key map
if(std::includes(key_map.begin(), key_map.end(), q_fst, q_lst))
out.push_back(mom_fst);
}
return out;
}
Usage:
typedef std::map<std::string, std::string> StrMap;
typedef std::map<StrMap, std::string> MapKeyMaps;
MapKeyMaps m = {{{{"k1", "v1"}, {"k2", "v2"}}, "value1"},
{{{"k1", "v3"}, {"k2", "v4"}}, "value2"},
{{{"k1", "v1"}, {"k2", "v5"}}, "value3"}};
StrMap q1 = {{"k1", "v1"}};
StrMap q2 = {{"k1", "v3"}, {"k2", "v4"}};
auto res1 = query_keymap_contains(m.begin(), m.end(), q1.begin(), q1.end());
auto res2 = query_keymap_contains(m.begin(), m.end(), q2.begin(), q2.end());
std::cout << "Query1: ";
for(auto i : res1) std::cout << i->second << " ";
std::cout << "\nQuery2: ";
for(auto i : res2) std::cout << i->second << " ";
Output:
Query1: value1 value3
Query2: value2

I believe the efficiency of different methods will depend on actual data. However, I would consider making a "cache" of iterators to outer map elements for particular "kX","vY" pairs as follows:
using M = std::map<std::map<std::string, std::string>, std::string>;
M m = {
{ { { "k1", "v1" }, { "k2", "v2" } }, "value1" },
{ { { "k1", "v3" }, { "k2", "v4" } }, "value2" },
{ { { "k1", "v1" }, { "k2", "v5" } }, "value3" }
};
std::map<M::key_type::value_type, std::vector<M::iterator>> cache;
for (auto it = m.begin(); it != m.end(); ++it)
for (const auto& kv : it->first)
cache[kv].push_back(it);
Now, you basically need to take all searched "kX","vY" pairs and find the intersection of cached iterators for them:
std::vector<M::key_type::value_type> find_list = { { "k1", "v1" }, { "k2", "v5" } };
std::vector<M::iterator> found;
if (find_list.size() > 0) {
auto it = find_list.begin();
std::copy(cache[*it].begin(), cache[*it].end(), std::back_inserter(found));
while (++it != find_list.end()) {
const auto& temp = cache[*it];
found.erase(std::remove_if(found.begin(), found.end(),
[&temp](const auto& e){ return std::find(temp.begin(), temp.end(), e) == temp.end(); } ),
found.end());
}
}
The final output:
for (const auto& it : found)
std::cout << it->second << std::endl;
gives value3 in this case.
A live demo: https://wandbox.org/permlink/S9Zp8yofSvjfLokc.
Note that the complexity of the intersection step is quite large, since cached iterators are unsorted. If you use pointers instead, you can sort the vectors or store the pointers in a map instead, which would allow you to find intersections much faster, e.g., by using std::set_intersection.
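A rough sketch of that faster variant, assuming the rows are kept in a std::vector so that the cache can store row positions instead of iterators (the names Row, Index, build_index and query are hypothetical, not from the code above). Because positions are appended in increasing order, every cached list is already sorted and std::set_intersection can be applied directly.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

using KeyMap = std::map<std::string, std::string>;
using Pair = std::pair<std::string, std::string>;
// Assumption: rows live in a vector and are identified by their position.
using Row = std::pair<KeyMap, std::string>;
using Index = std::map<Pair, std::vector<std::size_t>>;

// Build the inverted index: (key, value) pair -> positions of rows containing it.
inline Index build_index(const std::vector<Row>& rows) {
    Index index;
    for (std::size_t i = 0; i < rows.size(); ++i)
        for (const auto& kv : rows[i].first)
            index[Pair(kv.first, kv.second)].push_back(i);  // i only grows, so each list stays sorted
    return index;
}

// Positions of the rows that contain every pair in `wanted`.
// An empty `wanted` yields an empty result in this sketch.
inline std::vector<std::size_t> query(const Index& index,
                                      const std::vector<Pair>& wanted) {
    std::vector<std::size_t> found;
    bool first = true;
    for (const auto& kv : wanted) {
        auto it = index.find(kv);
        if (it == index.end()) return {};  // no row has this pair at all
        if (first) {
            found = it->second;
            first = false;
            continue;
        }
        std::vector<std::size_t> next;
        std::set_intersection(found.begin(), found.end(),
                              it->second.begin(), it->second.end(),
                              std::back_inserter(next));
        found = std::move(next);
    }
    return found;
}
```

Each intersection step is then linear in the lengths of the two lists, rather than quadratic as with std::find inside std::remove_if.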

You can do it with a single (partial) pass through each element with an ordered query, returning early as much as possible. Taking inspiration from std::set_difference, we want to know if query is a subset of data, which lets us select entries of the outer map.
// Is the sorted range [first1, last1) a subset of the sorted range [first2, last2)
template<class InputIt1, class InputIt2>
bool is_subset(InputIt1 first1, InputIt1 last1, InputIt2 first2, InputIt2 last2)
{
while (first1 != last1) {
if (first2 == last2) return false; // Reached the end of data with query still remaining
if (*first1 < *first2) {
return false; // didn't find this query element
} else {
if (! (*first2 < *first1)) {
++first1; // found this query element
}
++first2;
}
}
return true; // reached the end of query
}
// find every element of "map-of-maps" [first2, last2) for which the sorted range [first1, last1) is a subset of its key
template<class InputIt1, class InputIt2, class OutputIt>
OutputIt query_data(InputIt1 first1, InputIt1 last1, InputIt2 first2, InputIt2 last2, OutputIt d_first)
{
auto item_matches = [=](auto & inner){ return is_subset(first1, last1, inner.first.begin(), inner.first.end()); };
return std::copy_if(first2, last2, d_first, item_matches);
}

std::map is implemented as a balanced binary tree, which has O(log n) look-up. What you need instead is std::unordered_map, which is implemented as a hash table and has O(1) look-ups on average.
Now let me rephrase your wording, you want to:
And our query is "give me all key-values where key contains { "k1" : "v1" } and it would return the first and third value.
Which translates to:
If the key-value pair given is in the inner map, give me back its value.
Essentially what you need is a double look-up, which std::unordered_map excels at.
Here is a code snippet that solves your problem with the standard library (no fancy code required)
#include <iostream>
#include <unordered_map>
#include <string>
int main() {
using elemType = std::pair<std::string, std::string>;
using innerMap = std::unordered_map<std::string, std::string>;
using myMap = std::unordered_map<std::string, innerMap>;
auto table = myMap{ { "value1", { {"k1", "v1"}, {"k2", "v2"} } },
{ "value2", { {"k1", "v3"}, {"k2", "v4"} } },
{ "value3", { {"k1", "v1"}, {"k2", "v5"} } } };
//First we set-up a predicate lambda
auto printIfKeyValueFound = [](const myMap& tab, const elemType& query) {
// O(n) for the first table and O(1) lookup for each, O(n) total
for(const auto& el : tab) {
auto it = el.second.find(query.first);
if(it != el.second.end()) {
if(it->second == query.second) {
std::cout << "Element found: " << el.first << "\n";
}
}
}
};
auto query = elemType{"k1", "v1"};
printIfKeyValueFound(table, query);
}
Output (in unspecified order, since the map is unordered):
Element found: value3
Element found: value1
For queries of arbitrary size you can use the following (it also needs #include <vector>):
//First we set-up a predicate lambda
auto printIfKeyValueFound = [](const myMap& tab, const std::vector<elemType>& query) {
// O(n) over the table, O(q) over the query pairs, O(1) average per lookup
// O(n*q) total
for(const auto& el : tab) {
bool found = true;
for(const auto& queryEl : query) {
auto it = el.second.find(queryEl.first);
if(it == el.second.end() || it->second != queryEl.second) { // a missing key is also a mismatch
found = false;
break;
}
}
if(found)
std::cout << el.first << "\n";
}
};
auto query = std::vector<elemType>{ {"k1", "v1"}, {"k2", "v2"} };
Output: value1

How can I take ownership of a collection using range-v3?

I want to return a range from a function that represents a view on a STL collection, something like this:
auto createRange() {
std::unordered_set<int> is = {1, 2, 3, 4, 5, 6};
return is | view::transform([](auto&& i) {
return i;
});
}
However, view::transform does not take ownership of is, so when I run this, there is undefined behavior, because is is freed when createRange exits.
int main(int argc, char* argv[]) {
auto rng = createRange();
ranges::for_each(rng, [](auto&& i) {
std::cout << std::to_string(i) << std::endl;
});
}
If I try std::move(is) as the input, I get a static assert indicating that I can't use rvalue references as inputs to a view. Is there any way to ensure that the view takes ownership of the collection?
Edit: Some Additional Info
I want to add some clarifying info. I have a stream of data, data, and a view over it that transforms each item into a struct, Foo, that looks something like this:
struct Foo {
std::string name;
std::unordered_set<int> values;
};
// Take the input stream and turn it into a range of Foos
auto foos = data | asFoo();
What I want to do is create a range of std::pair<std::string, int> by distributing the name throughout the values. My naive attempt looks something like this:
auto result = data | asFoo() | view::transform([](auto&& foo) {
const auto& name = foo.name;
const auto& values = foo.values;
return values | view::transform([name](auto&& value) {
return std::make_pair(name, value);
});
}) | view::join;
However, this results in undefined behavior because values is freed. The only way that I have been able to get around this is to make values a std::shared_ptr and to capture it in the lambda passed to view::transform to preserve its lifetime. That seems like an inelegant solution.
I think what I am looking for is a view that will take ownership of the source collection, but it does not look like range-v3 has that.
Alternatively, I could just create the distributed version using a good old fashioned for-loop, but that does not appear to work with view::join:
auto result = data | asFoo() | view::transform([](auto&& foo) {
const auto& name = foo.name;
const auto& values = foo.values;
std::vector<std::pair<std::string, std::string>> distributedValues;
for (const auto& value : values) {
distributedValues.emplace_back(name, value);
}
return distributedValues;
}) | view::join;
Even if this did work with view::join, I think that the mixed metaphor of ranges and loops is inelegant too.

Views do not own the data they present. If you need to ensure the persistence of the data, then the data itself needs to be preserved.
auto createRange() {
//We're using a pointer to ensure that the contents don't get moved around, which might invalidate the view
auto is_ptr = std::make_unique<std::unordered_set<int>>(std::initializer_list<int>{1,2,3,4,5,6}); // a bare braced list cannot be forwarded through make_unique
auto & is = *is_ptr;
auto view = is | view::transform([](auto&& i) {return i;});
return std::make_pair(view, std::move(is_ptr));
}
int main() {
auto[rng, data_ptr] = createRange();
ranges::for_each(rng, [](auto&& i) {
std::cout << std::to_string(i) << std::endl;
});
}
An alternate method is to make sure the function is provided the data set from which the view will be created:
auto createRange(std::unordered_set<int> & is) {
return is | view::transform([](auto&& i) {return i;});
}
int main() {
std::unordered_set<int> is = {1,2,3,4,5,6};
auto rng = createRange(is);
ranges::for_each(rng, [](auto&& i) {
std::cout << std::to_string(i) << std::endl;
});
}
Either approach should broadly match what the solution for your project will need to do.
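If you would rather hand back a single object, one option is a small wrapper that stores the container by value and applies the function lazily during iteration. The sketch below is plain C++17 and does not use range-v3 at all; OwningTransform is a hypothetical helper written for this example, and it only supports what a range-based for loop needs (begin, end, !=, ++, *).

```cpp
#include <unordered_set>
#include <utility>

// Hypothetical helper (not part of range-v3): owns a container and applies
// fn to each element while iterating.
template <class Container, class Fn>
class OwningTransform {
public:
    OwningTransform(Container data, Fn fn)
        : data_(std::move(data)), fn_(std::move(fn)) {}

    class iterator {
    public:
        iterator(typename Container::const_iterator it, const Fn* fn)
            : it_(it), fn_(fn) {}
        decltype(auto) operator*() const { return (*fn_)(*it_); }
        iterator& operator++() { ++it_; return *this; }
        bool operator!=(const iterator& other) const { return it_ != other.it_; }
    private:
        typename Container::const_iterator it_;
        const Fn* fn_;  // points into the wrapper, which must outlive the iterator
    };

    iterator begin() const { return iterator(data_.begin(), &fn_); }
    iterator end() const { return iterator(data_.end(), &fn_); }

private:
    Container data_;  // the wrapper owns the data
    Fn fn_;
};

// The set is moved into the returned object, so nothing dangles.
inline auto createRange() {
    std::unordered_set<int> is = {1, 2, 3, 4, 5, 6};
    auto twice = [](int i) { return i * 2; };
    return OwningTransform<std::unordered_set<int>, decltype(twice)>(
        std::move(is), twice);
}
```

A caller can then write for (int v : createRange()) { ... } style loops over a stored result. Unlike a view, copying this object copies the underlying set, and its iterators are only valid while the wrapper itself is alive and not moved.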

A more elegant way for a template function grouping values?

The following code groups values for any container with a generic grouping lambda:
template<class Iterator, class GroupingFunc,
class T = remove_reference_t<decltype(*declval<Iterator>())>,
class GroupingType = decltype(declval<GroupingFunc>()(declval<T&>()))>
auto groupValues(Iterator begin, Iterator end, GroupingFunc groupingFunc) {
map<GroupingType, list<T>> groups;
for_each(begin, end,
[&groups, groupingFunc](const auto& val){
groups[groupingFunc(val)].push_back(val);
} );
return groups;
}
With the following usage:
int main() {
list<string> strs = {"hello", "world", "Hello", "World"};
auto groupOfStrings =
groupValues(strs.begin(), strs.end(),
[](auto& val) {
return (char)toupper(val.at(0));
});
print(groupOfStrings); // assume a print method
list<int> numbers = {1, 5, 10, 24, 13};
auto groupOfNumbers =
groupValues(numbers.begin(), numbers.end(),
[](int val) {
int decile = int(val / 10) * 10;
return to_string(decile) + '-' + to_string(decile + 9);
});
print(groupOfNumbers); // assume a print method
}
I am a bit reluctant regarding the (over?)-use of declval and decltype in groupValues function.
Do you see a better way for writing it?
(Question is mainly for better style and clarity unless of course you see any other issue).
Code: http://coliru.stacked-crooked.com/a/f65d4939b402a750

I would probably move the last two template parameters inside the function, and use std::result_of to give a slightly more tidy function:
template <typename T>
using deref_iter_t = std::remove_reference_t<decltype(*std::declval<T>())>;
template<class Iterator, class GroupingFunc>
auto groupValues(Iterator begin, Iterator end, GroupingFunc groupingFunc) {
using T = deref_iter_t<Iterator>;
using GroupingType = std::result_of_t<GroupingFunc(T&)>;
std::map<GroupingType, std::list<T>> groups;
std::for_each(begin, end, [&groups, groupingFunc](const auto& val){
groups[groupingFunc(val)].push_back(val);
});
return groups;
}
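Since std::result_of is deprecated in C++17 and removed in C++20, a variant built on std::invoke_result_t and std::iterator_traits may age better. This is a sketch under the assumption that C++17 is available; it swaps std::for_each for a plain loop and is not the code from the linked demo.

```cpp
#include <iterator>
#include <list>
#include <map>
#include <type_traits>

template <class Iterator, class GroupingFunc>
auto groupValues(Iterator begin, Iterator end, GroupingFunc groupingFunc) {
    // Element type as reported by the iterator itself, no declval needed.
    using T = typename std::iterator_traits<Iterator>::value_type;
    // Result of calling the grouping function on a const element,
    // with references and cv-qualifiers stripped so it can be a map key.
    using GroupingType =
        std::decay_t<std::invoke_result_t<GroupingFunc&, const T&>>;

    std::map<GroupingType, std::list<T>> groups;
    for (; begin != end; ++begin) {
        const T& val = *begin;
        groups[groupingFunc(val)].push_back(val);
    }
    return groups;
}
```

Usage is the same as before, e.g. groupValues(numbers.begin(), numbers.end(), [](int v) { return v / 10; }) groups a list of ints by their tens digit.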
