Why doesn't multiset act like set? - C++

Why is multiset called a set when a set can only contain distinct elements, while a multiset can contain equal elements? It could have just been called sortedArray or sortedList. Even if it is just meant to be a sorted "collection", why is it a set?

Why is multiset a set
In mathematics there are two distinct concepts: set and multiset. The standard library has two containers that model these concepts: std::set and std::multiset. Because the concepts are not the same, the containers that model them have different names.
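As a small illustration of the difference between the two containers, the sketch below inserts the same values into each and compares the resulting sizes. The helper names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Inserts the values 5, 5, 7 into a container and reports how many it kept.
template <typename Container>
std::size_t size_after_inserting_duplicates() {
    Container c;
    c.insert(5);
    c.insert(5);  // same key again
    c.insert(7);
    return c.size();
}

// Call this from main() to run the checks.
inline void check_set_vs_multiset() {
    // std::set keeps one element per key.
    assert(size_after_inserting_duplicates<std::set<int>>() == 2);
    // std::multiset keeps every inserted copy.
    assert(size_after_inserting_duplicates<std::multiset<int>>() == 3);
}
```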

Why is multiset a set [...]
It's not. The word "set" does appear in "multiset", but that does not make a multiset a set. A multiset is a generalization of a set, not necessarily itself a set. This linguistic setup is similar to a hypergraph, which is a generalization of a graph but not necessarily a graph, and to a hyperplane, which is a generalization of a plane but not necessarily a plane.
A less mathematical example would be penultimate, which is not "ultimate", or any other word with a prefix that changes the meaning of the root.
Perhaps "butterfly" and "dragonfly" would be apropos examples. Neither is a fly, despite the word "fly" appearing in both names. (For that matter, neither is buttery or draconic.) Sometimes a name is just a name.

Related

C++ container for storing sorted unique values with different predicates for sorting and uniqueness

I have a record with 2 fields (say, A and B). 2 instances of the record should be considered equal, if their As are equal. On the other hand, a collection of the record instances should be sorted by the B field.
Is there a container like std::set that can be defined with two different predicates, one for sorting and one for uniqueness, so that I could avoid explicit sorting and just append elements? If not, how can this be worked around?
Regards,
There is nothing in the standard library which would support your use case directly. You could use Boost.MultiIndexContainer for this purpose, though. Something like this:
typedef multi_index_container<
    Record,
    indexed_by<
        ordered_non_unique<member<Record, decltype(Record::B), &Record::B>>,
        hashed_unique<member<Record, decltype(Record::A), &Record::A>>
    >
> RecordContainer;
(Code assuming correct headers and using namespace directives for brevity).
The idea is to create a container with two indices, one which will guarantee the ordering based on B and the other which will guarantee uniqueness based on A. decltype() in the code can of course be replaced by the actual types of A and B which you know, but I don't.
The order of the indices matters slightly, since for convenience, the container itself offers the same interface as its first index. You can always access any index by using container.get<N>(), where N is the zero-based position of the index, though.
The code is not intended as a copy&paste solution, but as a starting point. You can add customisations, index tags etc. Refer to Boost documentation for details.
Is there a container like std::set, which can be defined with two different predicates, one for sorting and one for uniqueness
std::set decides whether a particular element is unique in terms of the sorting criterion you provide to it (by default it uses std::less<Key>). There is no need, and no separate template parameter, for passing another criterion to check equality of elements.
With that said, however, you can use a predicate with algorithms to check for equality of elements of std::set.
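If Boost is not an option, one possible fallback with only the standard library is to let a std::set enforce uniqueness on A and then sort a copy by B. This is a sketch rather than a drop-in answer: it does need an explicit sort, and the Record layout, the field types and the function name are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Record {     // hypothetical record; the real field types are unknown
    int A;          // two records are "equal" when their A matches
    std::string B;  // the collection should end up ordered by B
};

// Ordering on A only, so the set treats records with the same A as duplicates.
struct LessByA {
    bool operator()(const Record& x, const Record& y) const { return x.A < y.A; }
};

// Keeps the first record seen for each A, then returns the survivors sorted by B.
std::vector<Record> unique_by_a_sorted_by_b(const std::vector<Record>& input) {
    std::set<Record, LessByA> unique;
    for (const Record& r : input) {
        unique.insert(r);  // rejected if a record with an equivalent A is present
    }
    std::vector<Record> out(unique.begin(), unique.end());
    std::sort(out.begin(), out.end(),
              [](const Record& x, const Record& y) { return x.B < y.B; });
    return out;
}
```

The set decides "same record" through !comp(a, b) && !comp(b, a) on A, which is why no separate equality predicate is involved.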

Why is the C++ STL set container's count() method thus named?

What it really checks for is contains() and not the count of the number of occurrences, right? Duplicates are not permitted either so wouldn't contains() be a better name than count()?
It's to make it consistent with other container classes, given that one of the great aspects of polymorphism is to be able to treat different classes with the same API.
It does actually return the count. The fact that the count can only be zero or one for a set does not change that aspect.
It's not fundamentally different to a collection object that only allows two things of each "value" at the same time. In that case, it would return the count of zero, one or two, but it's still a count, the same as with a set.
The relevant part of the standard that requires this is C++11 23.2.4 which talks about the associative containers set, multiset, map and multimap. Table 102 contains the requirements for these associative containers over and above the requirements for "regular" containers, and the bit for count is paraphrased below:
size_type a.count(k) - returns the number of elements with key equivalent to k. Complexity is log(a.size()) + a.count(k).
All associative containers must meet the requirements listed in §23.2.4/8 Table 102 - Associative container requirements. One of these is that they implement a.count(k) which then
returns the number of elements with key equivalent to k
So the reason is to have a consistent interface between all associative containers. For instance, this uniformity will be very important when writing generic function templates that must work with any associative container.
It's a standard operation on containers that returns the number of matching elements. In things like lists, this makes perfect sense. It just so happens that on a set, there can only be one occurrence of an element and therefore count can never return a value greater than 1.
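To make the point about generic code concrete, here is a minimal sketch of two function templates that rely only on count(k). The names contains_key and occurrences are invented for this example; they are not standard library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

// Works for set, multiset, map and multimap alike, because each provides count(k).
template <typename AssocContainer, typename Key>
bool contains_key(const AssocContainer& c, const Key& k) {
    return c.count(k) != 0;
}

// For std::set the result is 0 or 1; for std::multiset it can be larger.
template <typename AssocContainer, typename Key>
std::size_t occurrences(const AssocContainer& c, const Key& k) {
    return c.count(k);
}
```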

What is the set-like data structure in c++

I need the advantages of Delphi sets, such as "in", in C++, but I don't know whether C++ has a data structure like sets.
I know that I could use an array instead, but as I said, I want set advantages like "in". Is there any built-in data structure like sets in C++?
If yes, please explain how to use it; I'm still a beginner in C++.
If no, is there any way to represent it (except an array, since I know that one)?
thanks in advance :)
There is a standard library container called std::set... I don't know delphi, but a simple element in set operation would be implemented by using the find method and comparing the result with end:
std::set<int> s;
s.insert( 5 );
if ( s.find( 5 ) != s.end() ) {
    // 5 is in the set
}
Other operations are available as algorithms in the standard library (std::set_union, std::set_difference, ...).
Use std::set. See http://www.cplusplus.com for reference.
In C++ there is nothing similarly integrated. Depending on your needs you might want to use bit flags and bitwise operations or the std::bitset standard container (besides std::set, of course). If you are using C++Builder there is also a class that simulates Delphi sets - search System.hpp for something like BaseSet or SetBase or similar - I don't recall the exact name.
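For small enumerations, the std::bitset approach mentioned above might look roughly like this. The Color enumeration and the helper names are hypothetical, and this is only an approximation of Delphi's set-of-enum, not an equivalent.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

enum Color { Red, Green, Blue, ColorCount };  // hypothetical enumeration

typedef std::bitset<ColorCount> ColorSet;     // one bit per enumerator

// Rough counterpart of Delphi's "c in s".
inline bool is_in(Color c, const ColorSet& s) {
    return s.test(static_cast<std::size_t>(c));
}

// Builds a set holding the two given members (they may be the same).
inline ColorSet make_set(Color a, Color b) {
    ColorSet s;
    s.set(static_cast<std::size_t>(a));
    s.set(static_cast<std::size_t>(b));
    return s;
}
```

With this representation, union and intersection are the bitwise operators | and & on the two bitsets.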
Yes, there is a C++ STL set container class described on p. 491 of Stroustrup's TC++PL (Special Ed.).
The STL <algorithm> header has the following:
From MSDN
set_difference
Unites all of the elements that belong to one sorted source range, but not to a second sorted source range, into a single, sorted destination range, where the ordering criterion may be specified by a binary predicate.
set_intersection
Unites all of the elements that belong to both sorted source ranges into a single, sorted destination range, where the ordering criterion may be specified by a binary predicate.
set_symmetric_difference
Unites all of the elements that belong to one, but not both, of the sorted source ranges into a single, sorted destination range, where the ordering criterion may be specified by a binary predicate.
set_union
Unites all of the elements that belong to at least one of two sorted source ranges into a single, sorted destination range, where the ordering criterion may be specified by a binary predicate.
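Since std::set iterates in sorted order, its ranges can be fed straight to these algorithms. A minimal sketch, with wrapper function names that are made up for the example:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>

// Elements that are in a, in b, or in both.
std::set<int> union_of(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                   std::inserter(out, out.end()));
    return out;
}

// Elements that are in both a and b.
std::set<int> intersection_of(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::inserter(out, out.end()));
    return out;
}

// Elements that are in a but not in b.
std::set<int> difference_of(const std::set<int>& a, const std::set<int>& b) {
    std::set<int> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(out, out.end()));
    return out;
}
```

std::inserter is used because the algorithms write through an output iterator and a std::set has no push_back.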

Multiset without Compare?

I want to use multiset to count some custom defined keys. The keys are not comparable numerically, comparing two keys does not mean anything, but their equality can be checked.
I see that multiset template wants a Compare to order the multiset. The order is not important to me, only the counts are important. If I omit Compare completely what happens? Does multiset work without any problems for my custom keys? If I cannot use std::multiset what are my alternatives?
If you can only compare keys for equality then you cannot use std::multiset. For associative containers your key type must have a strict weak ordering imposed by a comparison operation.
The strict weak ordering doesn't necessarily have to be numerical.
[For use in an associative container, you don't actually need an equality comparison. Key equivalence is determined by !compare(a, b) && !compare(b, a).]
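The equivalence rule can be seen with a comparator that ignores case: strings that differ only in case are counted as the same key. This is an illustrative sketch; CaseInsensitiveLess and equivalent are names invented here.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Strict weak ordering on strings that ignores letter case.
struct CaseInsensitiveLess {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Equivalence as the associative containers see it: neither key orders first.
template <typename Compare, typename Key>
bool equivalent(const Compare& comp, const Key& a, const Key& b) {
    return !comp(a, b) && !comp(b, a);
}

typedef std::multiset<std::string, CaseInsensitiveLess> WordCounter;
```

No operator== is consulted anywhere; the multiset uses only the comparator.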
If you really can't define an ordering for your keys then your only option is to use a sequence container of key-value pairs and a linear search for lookup. Needless to say this will be less efficient for set-like operations than a multiset, so you should probably try hard to create an ordering if at all possible.
You cannot use std::multiset if you don't have a strict weak ordering. Your options are:
Impose a strict-weak ordering on your data. If your key is a "linear" data structure, it is usually a good idea to compare it lexicographically.
Use an unordered container equivalent, e.g., boost::unordered_multiset. For that, you will need to make your custom data-type hash-able, which is often-times easier than imposing some kind of order.
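A sketch of the hashing option follows. It uses std::unordered_multiset from C++11 instead of the Boost container named above, on the assumption that a C++11 compiler is available; the Key type and its hash are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

struct Key {  // hypothetical key: equality makes sense, ordering does not
    std::string name;
    int variant;
    bool operator==(const Key& other) const {
        return name == other.name && variant == other.variant;
    }
};

// Keys that compare equal must hash equal; this simple combination ensures that.
struct KeyHash {
    std::size_t operator()(const Key& k) const {
        return std::hash<std::string>()(k.name) ^
               (std::hash<int>()(k.variant) << 1);
    }
};

// Uses KeyHash for bucketing and Key::operator== for equality; no Compare needed.
typedef std::unordered_multiset<Key, KeyHash> KeyCounter;
```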
If you omit the Compare completely, it will get the default value, which is less (which gives the result of the < operator applied to your key) - which may or may not even compile for your key.
The reason for having an ordering is that it allows the implementation to look up elements more quickly by their key (when inserting, deleting etc.). To understand why, imagine looking words up in a dictionary. Traditional dictionaries use alphabetical order, which makes words easy to look up. If you were preparing a dictionary for a language that isn't easily ordered, say a pictographic language, then either it would be very hard to find words in it at all (you'd have to search the whole dictionary), or you'd try to find a logical way to order them (e.g. by putting all the pictures that can be drawn with one pen stroke first, then two strokes, etc.), because even if this order was completely arbitrary, it would make finding entries in the dictionary far more efficient.
Similarly, even if your keys don't need to be ordered for your own purposes, and don't have any natural order, you can usually define an ordering that is good enough to address these concerns. The ordering must be transitive (if a<b and b<c then a<c), strict (never return true for a<a), and asymmetric (a<b and b<a are never both true). Ideally it should order all elements (if a and b are different then either a<b or b<a), though you can get away with that not being true (i.e. a strict weak ordering), but that's rather technical.
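One common way to get such an arbitrary but consistent ordering is to compare the key's fields lexicographically with std::tie. A minimal sketch, assuming a made-up Tag key with three fields:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>

struct Tag {  // hypothetical key with no natural "less than"
    std::string group;
    std::string label;
    int revision;
};

// Field-by-field comparison; tuple's operator< yields a strict weak ordering.
struct TagLess {
    bool operator()(const Tag& a, const Tag& b) const {
        return std::tie(a.group, a.label, a.revision) <
               std::tie(b.group, b.label, b.revision);
    }
};

typedef std::multiset<Tag, TagLess> TagCounter;
```

The resulting order has no meaning of its own; it only has to be consistent so that the tree can find equivalent keys.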
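Indeed, perhaps the most obvious use for this is the rare case where it is completely impossible to order the items, in which case you can supply a comparison operator which always returns false. This satisfies the container's requirements, but note that every key then compares equivalent to every other key, so count(k) returns the total number of elements whatever k is, and the container can no longer tell distinct keys apart.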
So you have two important criteria which you listed.
You don't care about order
comparison of keys does not mean anything
and one assumed,
the fact that you are using multiset implies that there are many instances
So, why not use std::vector, std::deque or std::list? Then you can take advantage of the various algorithms that rely only on an equality check (such as std::count, std::count_if etc.).
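A sketch of the sequence-container approach: the key type only needs operator==, and counting is a linear scan. Token and count_equal are hypothetical names used for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Token {  // hypothetical key that supports equality only
    int id;
    bool operator==(const Token& other) const { return id == other.id; }
};

// Counts the elements equal to key; O(n) per query, no ordering or hash required.
std::size_t count_equal(const std::vector<Token>& items, const Token& key) {
    return static_cast<std::size_t>(std::count(items.begin(), items.end(), key));
}
```

Each query walks the whole vector, so this suits small collections or infrequent lookups.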

index or position in std::set

I have a std::set of std::string. I need the "index" or "position" of each string in the set, is this a meaningful concept in the context?
I guess find() will return an iterator to the string, so my question might be better phrased as : "How do I convert an iterator to a number?".
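std::distance is what you need. You will want, I guess, std::distance(set.begin(), find_result). Note that this is linear in the position, because std::set iterators are bidirectional rather than random access.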
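Spelled out as a small helper (index_of is a name chosen for this sketch), the std::distance suggestion could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>

// Zero-based position of key in the set's sorted order.
// Returns s.size() when key is absent, because find() then yields end().
std::size_t index_of(const std::set<std::string>& s, const std::string& key) {
    return static_cast<std::size_t>(std::distance(s.begin(), s.find(key)));
}
```

The value is only meaningful until the set is next modified, since inserting or erasing an earlier element shifts every later position.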
I don't think it is meaningful: sets are 'self keyed' and sorted, so the 'index' would be invalidated when the set is modified.
Of course it depends upon how you intend to use the index and if the set is essentially static (say a dictionary).
Despite what others have written here, I don't think that "index" or "position" has meaning with respect to a set. In mathematical terms, a set exposes only its members and maybe its cardinality. The only meaningful operations involve testing whether an item is a member of the set, and combining or subtracting sets to yield new sets.
Some people talk about sets as data structures in looser terms, by facets of being "ordered" or "unordered", and whether they permit duplicates or enforce uniqueness. The former facet distinguishes an array with an O(n) insertion guard, where an attempt to insert an item first scans the existing members to see if the new item exists and, if not, inserts the new item at the end, from a hash table, which might retain such order only within a bucket's chain. A tree such as the Red-Black Tree used by std::set is somewhere in between; its traversal order is deterministic with respect to the strict weak order imposed by the comparator predicate, but, unlike the array sketched above, it doesn't retain insertion order.
The other facet — whether the set permits duplicate elements — is meaningless in mathematics, and is more accurately described as a bag. Such a structure acknowledges the difference between identity and value-based "sameness."
Your problem may involve caring about some position; it's not clear what that position means, but I expect you're going to need some data structure separate from std::set to model this properly. Perhaps a std::map mapping from your set of elements to each position would do. That would not guarantee that the positions are unique.
It may also help clarify the problem to think how you'd model it as relations, such as in a relational database. What comprises the key? What portions of the entities can vary independently?