What's the complexity of this algorithm? - c++

Here is an algorithm counting occurrences of anagrams of one string (search_word) in the other (text):
#include<iostream>
#include<algorithm>
#include<string>
#include<deque>
using namespace std;
int main()
{
    string text = "forxxorfxdofr";
    string search_word = "for";
    deque<char> word;
    word.insert(word.begin(), text.begin(), text.begin() + search_word.size());
    int ana_cnt = 0;
    for (int ix = 3; ix <= text.size(); ++ix)
    {
        deque<char> temp = word;
        sort(word.begin(), word.end());
        if (string(word.begin(), word.end()) == search_word)
            ++ana_cnt;
        word = temp;
        word.pop_front();
        word.push_back(text[ix]);
    }
    cout << ana_cnt << endl;
}
What's the complexity of this algorithm?
I think it's an O(n) algorithm, where n is the length of text, because the time needed to execute the body of the for loop is independent of n. However, some think it is not O(n); they say the sorting algorithm also counts when computing the complexity.

It's O(n) if you only consider the string text with length n as input.
Proof: You're looping over ix from 3 (probably search_word.size(), isn't it?) to text.size(), so asymptotically you execute the loop body n times (since there is no break, continue or modification of ix in the loop body).
The loop body is independent of n. It copies and sorts a deque of fixed size, namely m = search_word.size(). The sort is O(m log(m)) (guaranteed for std::sort since C++11; earlier standards only required this on average, so an implementation could degrade to O(m^2) in the worst case). As this is independent of n, we're done with a total of O(n).
It's not O(n): If you want to be a little more precise, you'd count search_word with length m as input as well, and this comes to a total of O(n m log(m)) (O(n m^2) in the worst case only with a pre-C++11 std::sort).
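To make the dependence on both n and m visible, here is a minimal sketch of the same sort-each-window idea written as a function. The name count_anagrams is hypothetical, and unlike the question's code it also sorts the search word first, so it does not rely on search_word already being in sorted order.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: counts the windows of `text` that are anagrams of `word`.
// Cost: (n - m + 1) windows, each copied in O(m) and sorted in O(m log m).
std::size_t count_anagrams(const std::string& text, std::string word) {
    const std::size_t n = text.size();
    const std::size_t m = word.size();
    if (m == 0 || m > n) return 0;

    std::sort(word.begin(), word.end());  // O(m log m), done once

    std::size_t count = 0;
    for (std::size_t start = 0; start + m <= n; ++start) {
        std::string window = text.substr(start, m);  // O(m) copy
        std::sort(window.begin(), window.end());     // O(m log m)
        if (window == word) ++count;                 // O(m) comparison
    }
    return count;
}
```

With m treated as a constant the loop is O(n); with m as an input it is O(n m log(m)), matching the two readings above.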

Related

Time complexity for finding minimum and maximum value of vector's elements in C++

It's the code covering the above problem:
#include <iostream>
using namespace std;
#include <vector>
int main()
{
    unsigned int n;
    cin >> n;
    int elementsOfVector;
    vector <double> wektor;
    for(int i = 0; i<n; i++) {
        cin >> elementsOfVector;
        wektor.push_back(elementsOfVector);
    }
    double min = wektor[0];
    double max = wektor[1];
    if (min > max) {
        min = wektor[1];
        max = wektor[0];
    }
    for(int i = 2; i<n; i++) {
        if (max < wektor[i]) {
            max = wektor[i];
        }
        else if (min > wektor[i]) {
            min = wektor[i];
        }
    }
    cout << "Min " << min << " max " << max;
    return 0;
}
According to my analysis:
First, we have a for loop that fills the vector with values. It makes n iterations, so its time complexity is O(n). Then we have an if statement that compares one value to another. There are always just those two values no matter what the input n is, so I assume it is O(1), constant complexity in Big-O notation (I'm not sure this is correct, so I would be grateful if anyone could confirm). The second for loop makes n-2 iterations, and the operations inside it are simple comparisons and assignments that each cost O(1), so they only contribute a constant factor that Big-O notation ignores. To sum up: n + n = 2n, and O(2n) = O(n), so the total time complexity is O(n). Am I right?
First part for loop. O(n)
Second part if(min>max) O(1)
Third part for loop O(n)
Total O(n)+O(1)+O(n) = O(n)
The third part iterates n-2 times, so it's O(n).
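For comparison, the same O(n) scan can be sketched with the standard library's std::minmax_element, which finds both extremes in a single pass. The wrapper name min_max is hypothetical, and the sketch assumes the vector is non-empty.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical wrapper: returns {smallest, largest} of a non-empty vector.
// std::minmax_element makes one linear pass, so this is O(n) as well.
std::pair<double, double> min_max(const std::vector<double>& values) {
    // Precondition (assumed, not checked): values is not empty.
    auto result = std::minmax_element(values.begin(), values.end());
    return {*result.first, *result.second};
}
```

Unlike the code in the question, this also works for a vector with a single element.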

Find duplicate in unsorted array with best time Complexity

I know there were similar questions, but not of such specificity
Input: an n-element unsorted array with values from 1 to (n-1).
One of the values is duplicated (e.g. n=5, tab[n] = {3,4,2,4,1}).
Task: find the duplicate with the best complexity.
I wrote this algorithm:
int tab[] = { 1,6,7,8,9,4,2,2,3,5 };
int arrSize = sizeof(tab)/sizeof(tab[0]);
// Mark each value v by adding arrSize to the slot at index v
for (int i = 0; i < arrSize; i++) {
    tab[tab[i] % arrSize] = tab[tab[i] % arrSize] + arrSize;
}
// The slot that was marked twice belongs to the duplicate
for (int i = 0; i < arrSize; i++) {
    if (tab[i] >= arrSize * 2) {
        std::cout << i;
        break;
    }
}
but I don't think it has the best possible complexity.
Do you know a better method/algorithm? I can use any C++ library, but I don't have any ideas.
Is it possible to get better complexity than O(n)?
In terms of big-O notation, you cannot beat O(n) (same as your solution here). But you can get better constants and a simpler algorithm by using the fact that the sum of the elements 1,...,n-1 is well known: n*(n-1)/2.
int sum = 0;
for (int x : tab) {
    sum += x;
}
int duplicate = sum - n * (n - 1) / 2;  // n is the number of elements in tab
The constants here will be significantly better, as each array index is accessed exactly once, which is much more cache friendly and efficient on modern architectures.
(Note, this solution does ignore integer overflow, but it's easy to account for it by using 2x more bits in sum than there are in the array's elements).
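A minimal sketch of the sum approach with the overflow note applied, using a 64-bit accumulator for int elements. The function name find_duplicate_by_sum is hypothetical, and the input is assumed to satisfy the question's constraints (n elements, values 1 to n-1, exactly one duplicate).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper. Assumes `values` holds n elements drawn from 1..n-1
// with exactly one value appearing twice.
int find_duplicate_by_sum(const std::vector<int>& values) {
    const std::int64_t n = static_cast<std::int64_t>(values.size());
    std::int64_t sum = 0;  // wider than int, so the running sum cannot overflow
    for (int x : values) {
        sum += x;
    }
    // 1 + 2 + ... + (n-1) = n*(n-1)/2; whatever is left over is the duplicate.
    return static_cast<int>(sum - n * (n - 1) / 2);
}
```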
Adding the classic answer because it was requested. It is based on the idea that if you xor a number with itself you get 0. So if you xor all numbers from 1 to n - 1 and all numbers in the array you will end up with the duplicate.
// arr is assumed to be a std::vector<int> holding the n input values
int duplicate = arr[0];
for (std::size_t i = 1; i < arr.size(); i++) {
    duplicate = duplicate ^ arr[i] ^ static_cast<int>(i);
}
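The XOR idea can be sketched as a self-contained function. The name find_duplicate_by_xor is hypothetical, and the same input constraints are assumed (n elements, values 1 to n-1, exactly one duplicate). Because XOR never carries, there is no overflow to worry about.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper. XORs every element with every index 0..n-1.
// Each value 1..n-1 is cancelled by its matching index, index 0 contributes
// nothing, and the one value that occurs twice is left over.
int find_duplicate_by_xor(const std::vector<int>& values) {
    int acc = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        acc ^= values[i] ^ static_cast<int>(i);
    }
    return acc;
}
```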
Don't focus too much on asymptotic complexity. In practice the fastest algorithm is not necessarily the one with the lowest asymptotic complexity. That is because constants are not taken into account: O( huge_constant * N) == O(N) == O( tiny_constant * N).
You cannot inspect N values in less than O(N). Though you do not need a full pass through the array. You can stop once you found the duplicate:
#include <iostream>
#include <vector>
int main() {
    std::vector<int> vals{1,2,4,6,5,3,2};
    std::vector<bool> present(vals.size());
    for (const auto& e : vals) {
        if (present[e]) {
            std::cout << "duplicate is " << e << "\n";
            break;
        }
        present[e] = true;
    }
}
In the "lucky case" the duplicate is found after inspecting only the first two elements. In the worst case the whole vector has to be scanned. On average it is again O(N) time complexity. Further, it uses O(N) additional memory while yours uses no additional memory. Again: complexity alone cannot tell you which algorithm is faster (especially not for a fixed input size).
No matter how hard you try, you won't beat O(N), because no matter in what order you traverse the elements (and remember already found elements), the best and worst case are always the same: Either the duplicate is in the first two elements you inspect or it's the last, and on average it will be O(N).

Time complexity of this algorithm in Big(O)

I came up with the following algorithm to find the second most frequently occurring character in a string, and I am trying to work out its time complexity. The algorithm is divided into two parts. In the first part, characters are inserted into a map in O(n). I am having difficulty with the second part: iterating over the map is O(n), and push and pop are O(log(n)). What would the Big-O complexity of the second part be? And what would the overall complexity be? Any help understanding this would be great.
#include <functional>
#include <iostream>
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>
void findKthHighestChar(int k,std::string str)
{
    std::unordered_map<char, int> map;
    //Step 1: O(n)
    for (int i = 0; i < str.size(); i++)
    {
        map[str[i]] = map[str[i]] + 1;
    }
    //Step2: O(n*log())
    //Iterate through the map
    using mypair = std::pair<int, char>;
    std::priority_queue<mypair, std::vector<mypair>, std::greater<mypair>> pq;
    for (auto it = map.begin(); it != map.end(); it++) //This is O(n) .
    {
        pq.push(mypair(it->second, it->first)); //push is O(log(n))
        if (pq.size() > k) {
            pq.pop(); //pop() is O(log(n))
        }
    }
    std::cout << k << " highest is " << pq.top().second;
}
You have 2 input variables, k and n (with k < n).
And one hidden: the alphabet size A.
Step 1 has an average-case complexity of O(n).
Step 2: O(std::min(A, n)*log(k)).
Iterating the map is O(std::min(A, n)).
The queue size is bounded by k, so its operations are O(log(k)).
So the whole algorithm is O(n) + O(std::min(A, n)*log(k)).
If we simplify and get rid of some variables to keep only n:
(k->n, A->n): O(n) + O(n*log(n)), so O(n*log(n)).
(k->n, A constant, so std::min(A, n)->A): O(n) + O(A*log(n)) = O(n) + O(log(n)), so O(n).
Does it have to be this algorithm?
You can use an array (of the size of your alphabet) to hold the frequencies.
You can populate it in O(n), (one pass through your string). Then you can find the largest, or second largest, frequency in one pass. Still O(n).
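A minimal sketch of the array-based idea, assuming an alphabet of 256 byte values. The function name second_most_frequent is hypothetical, as are the choices to break ties in favor of the smaller character code and to return '\0' when the string has fewer than two distinct characters.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: character with the second-highest frequency in `str`.
// One O(n) pass to count, then one O(A) pass over the alphabet (A = 256 here).
char second_most_frequent(const std::string& str) {
    std::array<std::size_t, 256> freq{};  // zero-initialized counters
    for (unsigned char c : str) {
        ++freq[c];
    }

    int best = -1;    // index of the most frequent character seen so far
    int second = -1;  // index of the runner-up
    for (int c = 0; c < 256; ++c) {
        if (freq[c] == 0) continue;
        if (best < 0 || freq[c] > freq[best]) {
            second = best;  // the old leader becomes the runner-up
            best = c;
        } else if (second < 0 || freq[c] > freq[second]) {
            second = c;
        }
    }
    return second < 0 ? '\0' : static_cast<char>(second);
}
```

Since the alphabet size is a constant, the total stays O(n), with no hashing and no heap.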

Complexity of function with array having even and odds numbers separate

So I have an array which has even and odd numbers in it.
I have to sort it with the odd numbers first and then the even numbers.
Here is my approach:
int key,val;
int odd = 0;
int index = 0;
for(int i=0;i<max;i++)
{
    if(arr[i]%2!=0)
    {
        int temp = arr[index];
        arr[index] = arr[i];
        arr[i] = temp;
        index++;
        odd++;
    }
}
First I separate even and odd numbers then I apply sorting to it.
For sorting I have this code:
for (int i=1; i<max;i++)
{
    key=arr[i];
    if(i<odd)
    {
        val = 0;
    }
    if(i>=odd)
    {
        val = odd;
    }
    for(int j=i; j>val && key < arr[j-1]; j--)
    {
        arr[j] = arr[j-1];
        arr[j-1] = key;
    }
}
The problem I am facing is that I can't find the complexity of the above sorting code.
Insertion sort is applied to the odd numbers first.
When they are done, I skip that part and start sorting the even numbers.
Here is my analysis of the sorting for an already sorted array, e.g. 3 5 7 9 2 6 10 12:
[image: complexity table]
How does all this work?
The first for loop traverses the array and puts all the odd numbers before the even numbers, but it doesn't sort them.
The next for loop is an insertion sort. Using the if statements, it first sorts only the odd numbers. Then, once i >= odd, the nested for loop no longer goes through the odd numbers; it only looks at the even numbers and sorts them.
I'm assuming you know the complexity of your partitioning (let's say A) and sorting algorithms (let's call this one B).
You first partition your n-element array, then sort m elements, and finally sort n - m elements. So the total complexity would be:
A(n) + B(m) + B(n - m)
Depending on what A and B actually are you should probably be able to simplify that further.
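To make A and B concrete, here is a sketch of the question's approach split into its two phases: a linear partition (A) and an insertion sort restricted to a sub-range (B). The function names are hypothetical, and the sketch uses std::vector instead of the raw array from the question.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// B(k): insertion sort on arr[first, last), where k = last - first.
// O(k^2) in the worst case, O(k) if the range is already sorted.
void insertion_sort_range(std::vector<int>& arr, std::size_t first, std::size_t last) {
    for (std::size_t i = first + 1; i < last; ++i) {
        int key = arr[i];
        std::size_t j = i;
        while (j > first && key < arr[j - 1]) {
            arr[j] = arr[j - 1];  // shift larger elements one slot right
            --j;
        }
        arr[j] = key;
    }
}

// A(n) + B(m) + B(n - m), where m is the number of odd values.
void odds_then_evens_sorted(std::vector<int>& arr) {
    std::size_t odd = 0;
    for (std::size_t i = 0; i < arr.size(); ++i) {  // A(n) = O(n)
        if (arr[i] % 2 != 0) {
            std::swap(arr[odd], arr[i]);
            ++odd;
        }
    }
    insertion_sort_range(arr, 0, odd);           // B(m)
    insertion_sort_range(arr, odd, arr.size());  // B(n - m)
}
```

With insertion sort as B, the total is O(n + m^2 + (n - m)^2), which is O(n^2) in the worst case and O(n) when both halves are already sorted after partitioning.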
Edit: Btw, unless the goal of your code is to try and implement partitioning/sorting algorithms, I believe this is much clearer:
#include <algorithm>
#include <iterator>
template <class T>
void partition_and_sort (T & values) {
    // != 0 rather than == 1, so that negative odd numbers are handled too
    auto isOdd = [](auto const & e) { return e % 2 != 0; };
    auto middle = std::partition(std::begin(values), std::end(values), isOdd);
    std::sort(std::begin(values), middle);
    std::sort(middle, std::end(values));
}
Complexity in this case is O(n) + 2 * O(n * log(n)) = O(n * log(n)).
Edit 2: I wrongly assumed std::partition keeps the relative order of elements. That's not the case. Fixed the code example.

Complexity and Big - O of an algorithm

So I am preparing for an exam, and 25% of that exam is on Big-O. I'm kind of lost on how to get the complexity and Big-O of an algorithm. Below are examples with their answers. I just need an explanation of how the answers were reached and the reasoning behind each step. This is the best explanation I can give because, as mentioned above, I don't know this very well:
// Number 1
int i = n;        // this is 1 because it is an assignment (=)
while (i > 0) {   // this is log10(n)*(1 or 2), because while loops
                  // are log base (whatever is being /='d)
    i /= 10;      // 2 because of / and =
}                 // the answer to this one is 1+log10(n)*(1 or 2) or O(logn)
// so I know how to do this one, but I'm confused when while and for
// loops are nested in each other
// Number 2
int i = n; int s = 0;
while (i > 0) {
    for (j = 1; j <= i; j++) s++;
    i /= 2;
}                 // the answer to this one is 2n + log2(n) + 2 or O(n)
// also the i/=2 is outside the for loop for this and the next one
// Number 3
int i = n; int s = 0;
while (i > 0) {
    for (j = 1; j <= n; ++j) s++;
    i /= 2;
}                 // answer 1+nlogn or O(nlogn)
// Number 4
int i = n;
for (j = 1; j <= n; j++)
    while (i > 0) i /= 2;
// answer is 1+log2(n) or O(log(n))
// Number 5
for (j = 1; j <= n; ++j) {
    int i = n;
    while (i > 0) i /= 2;
}                 // answer O(nlog(n))
Number 4: the for loop counts from 1 to n, so it is at least O(n). The while loop takes O(log n) the first time, but since i doesn't get reset, on each successive pass through the for loop the while loop only checks its condition once and exits immediately. So it is O(n + log n), which simplifies to O(n).
Number 5: same as above, but now i does get reset each time, so you have O(log n) done N times: O(n log n).
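One way to check these answers is to count the steps directly. The sketch below instruments Numbers 2 to 5; the function names are hypothetical, and each function returns how many times the innermost statement runs (s++ for Numbers 2 and 3, i /= 2 for Numbers 4 and 5).

```cpp
#include <cstdint>

// Number 2: the inner bound shrinks with i, so s = n + n/2 + n/4 + ... < 2n.
std::uint64_t count_shrinking_inner(std::uint64_t n) {
    std::uint64_t s = 0;
    for (std::uint64_t i = n; i > 0; i /= 2)
        for (std::uint64_t j = 1; j <= i; ++j) ++s;
    return s;
}

// Number 3: the inner bound is always n, so s = n * (number of halvings).
std::uint64_t count_fixed_inner(std::uint64_t n) {
    std::uint64_t s = 0;
    for (std::uint64_t i = n; i > 0; i /= 2)
        for (std::uint64_t j = 1; j <= n; ++j) ++s;
    return s;
}

// Number 4: i is never reset, so the halving only happens on the first pass.
std::uint64_t halvings_without_reset(std::uint64_t n) {
    std::uint64_t steps = 0;
    std::uint64_t i = n;
    for (std::uint64_t j = 1; j <= n; ++j)
        while (i > 0) { i /= 2; ++steps; }
    return steps;
}

// Number 5: i is reset on every pass, so the halving repeats n times.
std::uint64_t halvings_with_reset(std::uint64_t n) {
    std::uint64_t steps = 0;
    for (std::uint64_t j = 1; j <= n; ++j) {
        std::uint64_t i = n;
        while (i > 0) { i /= 2; ++steps; }
    }
    return steps;
}
```

For n = 16 these give 31, 80, 5 and 80 respectively. Note that Number 4 still runs the outer for loop n times even though only 5 halvings happen, which is why its total is O(n) rather than O(log n).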