What is the time complexity of inserting string to set container of c++ STL?
As I understand it, it should be O(x·log n), where x is the length of the string being inserted and n is the size of the set. Copying the string into the set should also be linear in the length of the string.
But this code of mine is running instantly.
#include<bits/stdc++.h>
using namespace std;
int main() {
    set<string> c;
    string s(100000, 'a');
    for (int i = 0; i < 100000; i++) {
        c.insert(s);
    }
}
Where am I wrong? Shouldn't the complexity be on the order of 10^10?
You should use the set in some way to reduce the risk of the loop getting optimized away, for example by adding return c.size();.
Also your choice of the number of iterations might be too low. Add a digit to the loop counter and you will see a noticeable run time.
A modern CPU can easily process >2*10^9 ops/s. Assuming your compiler uses memcmp, which is probably hand-vectorized, with a small working set such as yours you're working entirely from the cache and can reach a throughput of up to 512 bytes per comparison (with AVX2). Assuming a moderate rate of 10 cycles per iteration, we can still compare >10^10 bytes/s. So your program should run in <1 s on moderate hardware.
Try this updated code instead:
#include <string>
#include <set>
using namespace std;
int main() {
    set<string> c;
    string s(100000, 'a');
    for (int i = 0; i < 1000000; i++) { // Add a digit here
        c.insert(s);
    }
    return c.size(); // use something from the set
}
With optimization on (-O3) this takes ~5 seconds to run on my system.
In other words: yes, inserting into a binary tree is O(log n), but comparing a string is O(n). These n's aren't the same: for the set, n is the number of elements; for the string, it is the length of the string.
In your particular case the set holds just one element (every insertion after the first finds an equal key), so the tree part of the insertion is O(1). The cost you see comes purely from string comparisons, where the total work is string_length * number_of_iterations.
Related
I am trying to solve the programming problem firstDuplicate on codesignal. The problem is "Given an array a that contains only numbers in the range 1 to a.length, find the first duplicate number for which the second occurrence has minimal index".
Example: For a = [2, 1, 3, 5, 3, 2] the output should be firstDuplicate(a) = 3
There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than the second occurrence of 2 does, so the answer is 3.
With this code I pass 21/23 tests, but then it tells me that the program exceeded the execution time limit on test 22. How would I go about making it faster so that it passes the remaining two tests?
#include <algorithm>
#include <vector>
using std::vector;

int firstDuplicate(vector<int> a) {
    vector<int> seen;
    for (size_t i = 0; i < a.size(); ++i) {
        if (std::find(seen.begin(), seen.end(), a[i]) != seen.end()) {
            return a[i];
        } else {
            seen.push_back(a[i]);
        }
    }
    if (seen == a) {
        return -1;
    }
}
Anytime you get asked a question about "find the duplicate", "find the missing element", or "find the thing that should be there", your first instinct should be: use a hash table. In C++ there are the unordered_map and unordered_set classes for exactly these kinds of coding exercises. An unordered_set is effectively an unordered_map with keys only and no mapped values.
Also, pass your vector by reference, not by value. Passing by value incurs the overhead of copying the entire vector.
Also, that comparison seems costly and unnecessary at the end.
This is probably closer to what you want:
#include <unordered_set>
#include <vector>
using std::vector;

int firstDuplicate(const vector<int>& a) {
    std::unordered_set<int> seen;
    for (int i : a) {
        auto result_pair = seen.insert(i);
        bool duplicate = (result_pair.second == false);
        if (duplicate) {
            return i;
        }
    }
    return -1;
}
std::find is linear time complexity in terms of distance between first and last element (or until the number is found) in the container, thus having a worst-case complexity of O(N), so your algorithm would be O(N^2).
Instead of storing your numbers in a vector and searching it every time, you should store the numbers already encountered in a std::map and return a number as soon as you see it again while iterating:
std::map<int, int> hash;
for (const auto &i : a) {
    if (hash[i])
        return i;
    else
        hash[i] = 1;
}
Edit: std::unordered_map is even more efficient if the order of keys doesn't matter, since insertion time complexity is constant in average case as compared to logarithmic insertion complexity for std::map.
It's probably an unnecessary optimization, but I think I'd try to take slightly better advantage of the specification. A hash table is intended primarily for cases where you have a fairly sparse conversion from possible keys to actual keys--that is, only a small percentage of possible keys are ever used. For example, if your keys are strings of length up to 20 characters, the theoretical maximum number of keys is 256^20. With that many possible keys, it's clear no practical program is going to store any more than a minuscule percentage, so a hash table makes sense.
In this case, however, we're told that the input is: "an array a that contains only numbers in the range 1 to a.length". So, even if half the numbers are duplicates, we're using 50% of the possible keys.
Under the circumstances, instead of a hash table, even though it's often maligned, I'd use an std::vector<bool>, and expect to get considerably better performance in the vast majority of cases.
int firstDuplicate(std::vector<int> const &input) {
    std::vector<bool> seen(input.size() + 1);
    for (auto i : input) {
        if (seen[i])
            return i;
        seen[i] = true;
    }
    return -1;
}
The advantage here is fairly simple: at least in a typical case, std::vector<bool> uses a specialization to store bools in only one bit apiece. This way we're storing only one bit for each number of input, which increases storage density, so we can expect excellent use of the cache. In particular, as long as the number of bytes in the cache is at least a little more than 1/8th the number of elements in the input array, we can expect all of seen to be in the cache most of the time.
Now make no mistake: if you look around, you'll find quite a few articles pointing out that vector<bool> has problems--and for some cases, that's entirely true. There are places and times that vector<bool> should be avoided. But none of its limitations applies to the way we're using it here--and it really does give an advantage in storage density that can be quite useful, especially for cases like this one.
We could also write some custom code to implement a bitmap that would give still faster code than vector<bool>. But using vector<bool> is easy, and writing our own replacement that's more efficient is quite a bit of extra work...
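For illustration, here is a minimal sketch of such a hand-rolled bitmap; `Bitmap` and its methods are made-up names, not a standard facility, and this is roughly what `std::vector<bool>` already does for you internally:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal fixed-size bitmap: one bit per value, packed into 64-bit words.
class Bitmap {
    std::vector<std::uint64_t> words_;
public:
    explicit Bitmap(std::size_t bits) : words_((bits + 63) / 64) {}
    bool test(std::size_t i) const { return (words_[i >> 6] >> (i & 63)) & 1u; }
    void set(std::size_t i) { words_[i >> 6] |= std::uint64_t(1) << (i & 63); }
};

// Same algorithm as the vector<bool> version above, using the custom bitmap.
int firstDuplicate(std::vector<int> const &input) {
    Bitmap seen(input.size() + 1);
    for (auto i : input) {
        if (seen.test(i))
            return i;
        seen.set(i);
    }
    return -1;
}
```

In practice the gain over `std::vector<bool>` is likely small, which is the point made above: the standard specialization is the easy choice.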
I'm getting a memory limit exceeded error for this code, and I can't find a way to resolve it. Even if I use long long int, it gives the same error.
Why is this error happening?
#include<bits/stdc++.h>
#define ll long long int
using namespace std;
int main()
{
    /// 1000000000000 500000000001 getting memory limit exceeded for this test case.
    ll n, k;
    cin >> n >> k;
    vector<ll> v;
    vector<ll> arrange;
    for (ll i = 0; i < n; i++)
    {
        v.push_back(i + 1);
    }
    // Arranging vector like 1,3,5,...,2,4,6,...
    for (ll i = 0; i < v.size(); i++)
    {
        if (v[i] % 2 != 0)
        {
            arrange.push_back(v[i]);
        }
    }
    for (ll i = 0; i < v.size(); i++)
    {
        if (v[i] % 2 == 0)
        {
            arrange.push_back(v[i]);
        }
    }
    cout << arrange[k - 1] << endl; // Found the kth number.
    return 0;
}
The provided code solves a coding problem for small values of n and k. However as you noticed it does fail for large values of n. This is because you are trying to allocate a couple of vectors of 1000000000000 elements, which exceeds the amount of memory available in today's computers.
Hence I'd suggest returning to the original problem you're solving and trying an approach that doesn't need to store all the intermediate values in memory. Since the given code works for small values of n and k, you can use it to check whether the approach without vectors works.
I would suggest the following steps to redesign the approach to the coding problem:
- Write down the contents of arrange for a small value of n
- Write down the matching value of k for each element of arrange
- Derive the (mathematical) function that maps k to the matching element in arrange
  - For this problem this can be done in constant time, so there is no need for loops
  - Note that this function should work both for even and odd values of k
- Test whether your code works by comparing it with your current results
I would suggest trying the preceding steps first to come up with your own approach. If you can not find a working solution, please have a look at this approach on Wandbox.
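If you do get stuck, here is a minimal sketch of the kind of constant-time mapping those steps lead to (`kthArranged` is an illustrative name; deriving it yourself first is the better exercise):

```cpp
// k-th element of the sequence 1,3,5,...,2,4,6,... built from 1..n,
// computed in O(1): the first (n+1)/2 positions hold the odd numbers,
// the remaining positions hold the even numbers.
long long kthArranged(long long n, long long k) {
    long long odds = (n + 1) / 2;   // count of odd numbers in 1..n
    if (k <= odds)
        return 2 * k - 1;           // the k-th odd number
    return 2 * (k - odds);          // the (k - odds)-th even number
}
```

In main you would read n and k as long long and print `kthArranged(n, k)`; for the failing test case 1000000000000 500000000001 this yields 2, with no arrays at all.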
Assume long long int is an 8-byte type, which is a commonly valid assumption.
For every entry in the array, you are requesting 8 bytes.
If you request 1000000000000 items, you are asking for 8 terabytes of memory.
Moreover, you are using two such vectors, so you are requesting more than 8 terabytes in total.
Just use a lower number of items for your arrays and it will work.
I encountered a problem that requires counting the number of points within an interval. The input is a large amount of unsorted points plus lo and hi (with the restriction lo <= hi), and the task is to count the points within [lo, hi]. Although my code is correct, it is too slow to finish within the given time limit (2200 ms). My code runs in O(n). I would like to ask if there are any faster methods.
#include <iostream>
#include <vector>
using namespace std;
int main() {
    int n, m, c, lo, hi;
    cin >> n >> m;
    vector<int> arr(n);
    for (int i = 0; i < n; i++) {
        cin >> arr[i];
    }
    cin >> lo >> hi;
    c = 0;
    for (int j = 0; j < n; j++) {
        if (arr[j] <= hi && lo <= arr[j]) c++;
    }
    cout << c << endl;
    return 0;
}
It is impossible to solve this problem in less than O(n) time, because you must consider all inputs at least once.
However, you might be able to reduce the constant factor. Have you considered storing a set of (start, end) intervals rather than a simple array? What input size causes this to be slow?
Edit: upon further testing, it seems the bottleneck is actually the use of cin to read numbers.
Try replacing every instance of cin >> x; with scanf("%d", &x); — for me, this brings the runtime down to about 0.08 seconds.
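If you'd rather keep iostreams, an alternative worth trying is to disable the synchronization between cin and C stdio. A sketch with the counting logic pulled into a helper (`solveInterval` is a name I introduced for illustration):

```cpp
#include <istream>
#include <sstream>
#include <vector>

// Call once at the start of main, before any I/O; cin then performs
// close to scanf:
//   std::ios_base::sync_with_stdio(false);
//   std::cin.tie(nullptr);

// Reads the same input format as the original program from `in` and
// returns how many values fall inside [lo, hi].
int solveInterval(std::istream &in) {
    int n, m, lo, hi;
    in >> n >> m;
    std::vector<int> arr(n);
    for (int &x : arr) in >> x;
    in >> lo >> hi;
    int c = 0;
    for (int x : arr)
        if (lo <= x && x <= hi) ++c;
    return c;
}
```

In main you would apply the two setup calls and then print `solveInterval(std::cin)`.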
You can do it faster than O(N) only if you need to do lookups more than once on the same data set:
1. Sort the array (or a copy of it). Lookups can then use binary search, which is O(log N).
2. Instead of a flat array, use something like a binary search tree; lookup complexity is the same as in #1.
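A sketch of approach #1, assuming the array has been sorted once up front; `upper_bound` minus `lower_bound` counts the elements in [lo, hi] in O(log N) per query:

```cpp
#include <algorithm>
#include <vector>

// Count the elements of a sorted (ascending) vector that lie in [lo, hi].
// lower_bound finds the first element >= lo; upper_bound finds the first
// element > hi; the distance between them is the count.
long long countInInterval(const std::vector<int> &sorted, int lo, int hi) {
    auto first = std::lower_bound(sorted.begin(), sorted.end(), lo);
    auto last  = std::upper_bound(sorted.begin(), sorted.end(), hi);
    return last - first;
}
```

Sort once with `std::sort(arr.begin(), arr.end())` (O(N log N)), then each subsequent [lo, hi] query is logarithmic; for a single query the original linear scan is already optimal.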
What I have is two text files. One contains a list of roughly 70,000 names (~1.5MB). The other contains text which will be obtained from miscellaneous sources; that is, this file's contents will change each time the program is executed (~0.5MB). Essentially, I want to be able to paste some text into a text file and see which names from my list are found. Kind of like the find function (Ctrl + F), but with 70,000 keywords.
In any case, what I have thus far is:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdio>
using namespace std;

int main()
{
    ifstream namesfile("names.txt"); // names list
    ifstream miscfile("misc.txt");   // misc text
    vector<string> vecnames;         // vector to hold names
    vector<string> vecmisc;          // vector to hold misc text
    size_t found;
    string s;
    string t;
    while (getline(namesfile, s))
        vecnames.push_back(s);
    while (getline(miscfile, t))
        vecmisc.push_back(t);
    // outer loop iterates through the names list
    for (vector<string>::size_type i = 0; i != vecnames.size(); ++i) {
        // inner loop iterates through the lines of the misc text file
        for (vector<string>::size_type j = 0; j != vecmisc.size(); ++j) {
            found = vecmisc[j].find(vecnames[i]);
            if (found != string::npos) {
                cout << vecnames[i] << endl;
                break;
            }
        }
    }
    cout << "SEARCH COMPLETE";
    // to keep console application from exiting
    getchar();
    return 0;
}
Now this works great as far as extracting the data I need; however, it is terribly slow and obviously inefficient, since each name potentially requires searching the entire file again, which gives (70,000 x number of lines in the misc text file) iterations. If anyone could help, I would certainly appreciate it. Some sample code is most welcome. Additionally, I'm using Dev C++, if that makes any difference. Thanks.
Use a std::unordered_set. Insert all your keywords into the set, then traverse the large document and, each time you come to a word, test whether the set contains that word.
Using a vector, the best-case search time you're going to get is O(log N) with a binary search, and that only works on a sorted list. If you include the time it takes to make sorted insertions, the final amortized complexity for a sorted linear container (array, list), as well as for non-linear containers such as binary search trees, is O(N log N). That means the combined cost of adding elements and finding them later grows a little faster than linearly: if you double the size of the list, sorting takes a bit more than twice as long, while any individual search stays quick (to double the search time, the list would have to grow to the square of its current size, since log(N^2) = 2 log N).
A good hash-table implementation on the other hand (such as std::unordered_map), along with a good hash algorithm that avoids too many collisions, has an amortized complexity of O(1): overall there's a constant look-up time for any given element, no matter how many elements there are, making searches very fast. The main penalty over a linear list or binary search tree is the memory footprint of the hash table. A good hash table, in order to avoid too many collisions, will want a size equal to some large prime number at least greater than 2*N, where N is the total number of elements you plan on storing. But that "wasted space" is the trade-off for efficient and extremely fast look-ups.
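A minimal sketch of the hash-set lookup for this problem, assuming each name is a single whitespace-delimited word (multi-word names would need different tokenization); `findNames` is an illustrative helper, not part of any library:

```cpp
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Return every whitespace-delimited word of `text` that appears in `names`.
// Each lookup is O(1) on average, so the whole scan is linear in the text.
std::vector<std::string> findNames(const std::unordered_set<std::string> &names,
                                   const std::string &text) {
    std::vector<std::string> hits;
    std::istringstream in(text);
    std::string word;
    while (in >> word)
        if (names.count(word))
            hits.push_back(word);
    return hits;
}
```

You would fill the set with one `insert` per line of names.txt, then call `findNames` on the misc text; unlike the nested loops above, the cost no longer multiplies by the number of names.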
While a map of any kind is the simplest solution, Scott Myers makes a good case for sorted vector and binary_search from algorithm (in Effective STL).
Using a sorted vector, your code would look something like
#include <algorithm>
...
int vecsize = vecnames.size();
sort(vecnames.begin(), vecnames.begin() + vecsize);
for (vector<string>::size_type j = 0; j != vecmisc.size(); ++j)
{
    bool found = binary_search(vecnames.begin(), vecnames.begin() + vecsize, vecmisc[j]);
    if (found) std::cout << vecmisc[j] << std::endl;
}
The advantages of using a sorted vector and binary_search are
1) There is no tree to traverse, the binary_search begins at (end-start)/2, and keeps dividing by 2. It will take at most log(n) to search the range.
2) There is no key,value pair. You get the simplicity of a vector without the overhead of a map.
3) The vector's elements are in a contiguous range (which is why you should use reserve before populating the vector, inserts are faster), and so searching through the vector's elements rarely crosses page boundaries (slightly faster).
4) It's cool.
I'm intersecting some sets of numbers, and doing this by storing a count of each time I see a number in a map.
I'm finding the performance be very slow.
Details:
- One of the sets has 150,000 numbers in it
- The intersection of that set and another set takes about 300ms the first time, and about 5000ms the second time
- I haven't done any profiling yet, but every time I break into the debugger during the intersection, it's in malloc.c!
So, how can I improve this performance? Switch to a different data structure? Some how improve the memory allocation performance of map?
Update:
Is there any way to ask std::map or
boost::unordered_map to pre-allocate
some space?
Or, are there any tips for using these efficiently?
Update2:
See Fast C++ container like the C# HashSet<T> and Dictionary<K,V>?
Update3:
I benchmarked set_intersection and got horrible results:
(set_intersection) Found 313 values in the intersection, in 11345ms
(set_intersection) Found 309 values in the intersection, in 12332ms
Code:
int runIntersectionTestAlgo()
{
    set<int> set1;
    set<int> set2;
    set<int> intersection;
    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.insert(value);
    }
    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() % 200000 + 1;
        random *= 10;
        int value = 1000000000 + random;
        set2.insert(value);
    }
    set_intersection(set1.begin(), set1.end(), set2.begin(), set2.end(), inserter(intersection, intersection.end()));
    return intersection.size();
}
You should definitely be using preallocated vectors, which are way faster. The problem with doing set intersection with STL sets is that each time you move to the next element you're chasing a dynamically allocated pointer, which could easily not be in your CPU caches. With a vector the next element will often be in your cache because it's physically close to the previous element.
The trick with vectors is that if you don't preallocate the memory for a task like this, it'll perform EVEN WORSE, because it'll keep reallocating memory as it resizes itself during your initialization step.
Try something like this instead - it'll be WAY faster.
int runIntersectionTestAlgo() {
    vector<int> vector1; vector1.reserve(100000);
    vector<int> vector2; vector2.reserve(1000);
    // Create 100,000 values for vector1
    for ( int i = 0; i < 100000; i++ ) {
        int value = 1000000000 + i;
        vector1.push_back(value);
    }
    sort(vector1.begin(), vector1.end());
    // Create 1,000 values for vector2
    for ( int i = 0; i < 1000; i++ ) {
        int random = rand() % 200000 + 1;
        random *= 10;
        int value = 1000000000 + random;
        vector2.push_back(value);
    }
    sort(vector2.begin(), vector2.end());
    // Reserve at most 1,000 spots for the intersection
    vector<int> intersection; intersection.reserve(min(vector1.size(), vector2.size()));
    set_intersection(vector1.begin(), vector1.end(), vector2.begin(), vector2.end(), back_inserter(intersection));
    return intersection.size();
}
Without knowing any more about your problem, "check with a good profiler" is the best general advice I can give. Beyond that...
If memory allocation is your problem, switch to some sort of pooled allocator that reduces calls to malloc. Boost has a number of custom allocators that should be compatible with std::allocator<T>. In fact, you may even try this before profiling, if you've already noticed debug-break samples always ending up in malloc.
If your number-space is known to be dense, you can switch to using a vector- or bitset-based implementation, using your numbers as indexes in the vector.
If your number-space is mostly sparse but has some natural clustering (this is a big if), you may switch to a map-of-vectors. Use higher-order bits for map indexing, and lower-order bits for vector indexing. This is functionally very similar to simply using a pooled allocator, but it is likely to give you better caching behavior. This makes sense, since you are providing more information to the machine (clustering is explicit and cache-friendly, rather than a random distribution you'd expect from pool allocation).
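For the dense case, a sketch of a vector<bool>-based intersection; it assumes every value lies in [0, maxValue], and `intersect` is an illustrative name:

```cpp
#include <vector>

// Intersection via a membership bitmap. Assumes all values in both inputs
// lie in [0, maxValue] and the range is dense enough that maxValue+1 bits
// is an acceptable footprint.
std::vector<int> intersect(const std::vector<int> &a,
                           const std::vector<int> &b,
                           int maxValue) {
    std::vector<bool> inA(maxValue + 1, false);
    for (int x : a) inA[x] = true;

    std::vector<int> result;
    std::vector<bool> emitted(maxValue + 1, false);
    for (int x : b)
        if (inA[x] && !emitted[x]) {   // emitted guards against duplicates in b
            result.push_back(x);
            emitted[x] = true;
        }
    return result;
}
```

Each input is scanned once and each lookup is a single bit test, so this is linear with excellent cache behavior when the value range is dense.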
I would second the suggestion to sort them. There are already STL set algorithms that operate on sorted ranges, like set_intersection, set_union, etc.
I don't understand why you have to use a map to do intersection. Like people have said, you could put the sets in std::set's, and then use std::set_intersection().
Or you can put them into hash_set's. But then you would have to implement intersection manually: technically you only need to put one of the sets into a hash_set, and then loop through the other one, and test if each element is contained in the hash_set.
Intersection with maps is slow; try a hash_map (however, this is not provided in all STL implementations).
Alternatively, sort both maps and do it in a merge-sort-like way.
What is your intersection algorithm? Maybe there are some improvements to be made?
Here is an alternate method
I do not know it to be faster or slower, but it could be something to try. Before doing so, I also recommend using a profiler to ensure you really are working on the hotspot. Change the sets of numbers you are intersecting to use std::set<int> instead. Then iterate through the smallest one looking at each value you find. For each value in the smallest set, use the find method to see if the number is present in each of the other sets (for performance, search from smallest to largest).
This is optimised in the case that the number is not found in all of the sets, so if the intersection is relatively small, it may be fast.
Then, store the intersection in std::vector<int> instead - insertion using push_back is also very fast.
Here is another alternate method
Change the sets of numbers to std::vector<int> and use std::sort to sort from smallest to largest. Then use std::binary_search to find the values, using roughly the same method as above. This may be faster than searching a std::set since the array is more tightly packed in memory. Actually, never mind that, you can then just iterate through the values in lock-step, looking at the ones with the same value. Increment only the iterators which are less than the minimum value you saw at the previous step (if the values were different).
Might be your algorithm. As I understand it, you are spinning over each set (which I'm hoping is a standard set), and throwing them into yet another map. This is doing a lot of work you don't need to do, since the keys of a standard set are in sorted order already. Instead, take a "merge-sort"-like approach: spin over each iterator, dereferencing to find the min. Count the number of sets that have that min, and increment those iterators. If the count was N, add the value to the intersection. Repeat until the first set hits its end. (If you compare the sizes before starting, you won't have to check every set's end each time.)
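A sketch of that merge-style approach for the two-set case (`intersectSorted` is an illustrative name); generalizing to N sets means advancing whichever iterators hold the current minimum:

```cpp
#include <set>
#include <vector>

// Merge-style intersection of two already-sorted containers: advance the
// iterator pointing at the smaller value; on a tie, record it and advance
// both. Linear in the combined sizes, with no per-element lookups.
std::vector<int> intersectSorted(const std::set<int> &a, const std::set<int> &b) {
    std::vector<int> out;
    auto ia = a.begin(), ib = b.begin();
    while (ia != a.end() && ib != b.end()) {
        if (*ia < *ib)      ++ia;
        else if (*ib < *ia) ++ib;
        else { out.push_back(*ia); ++ia; ++ib; }
    }
    return out;
}
```

This is essentially what std::set_intersection does internally, written out to show where the work goes.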
Responding to the update: there do exist facilities to speed up memory allocation by pre-reserving space, like boost::pool_allocator. Something like:
std::map<int, int, std::less<int>, boost::pool_allocator< std::pair<int const, int> > > m;
But honestly, malloc is pretty good at what it does; I'd profile before doing anything too extreme.
Look at your algorithms, then choose the proper data type. If you're going to have set-like behaviour, and want to do intersections and the like, std::set is the container to use.
Since its elements are stored in sorted order, insertion may cost you O(log N), but intersection with another (sorted!) std::set can be done in linear time.
I figured something out: if I attach the debugger to either RELEASE or DEBUG builds (e.g. hit F5 in the IDE), then I get horrible times.