Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I'm trying to find the fastest way to get the sum of the positions (zero-based indices) at which a character occurs in a string.
For example, if the string contains:
ABAACA
then the sum for the character A will be:
0 + 2 + 3 + 5 = 10
A = 10
I know some ways to do that, but they take too long, so could you please tell me the fastest way to compute this sum?
The fastest way I can see to do this in C++ (since nothing else describes your problem) involves parallel processing:
Parallel scan and in-place indexing
Reduction
Although it might not be what you were looking for.
Fastest non-parallel solution:
- Go over every char.
- Add the index to the count on a match (if (ch == 'A') count += i;).
There is just no faster way, because you MUST visit each character.
Anyway, if you already have a working solution, it's probably the fastest already.
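The two steps above can be sketched in C++ as follows. This is a minimal sketch, not a tuned implementation; the function name sum_of_positions is a placeholder chosen for illustration, and positions are assumed to be zero-based as in the question.

```cpp
#include <cstddef>
#include <string_view>

// Sum the zero-based positions at which `target` occurs in `text`.
// Every character is visited exactly once, so the cost is linear in the length.
std::size_t sum_of_positions(std::string_view text, char target) {
    std::size_t sum = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == target) {
            sum += i;  // add the index, not 1, on a match
        }
    }
    return sum;
}
```

For the string ABAACA and the character A, this adds 0, 2, 3 and 5.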
If no parallel tools are available, the fastest solution is to just visit each char in a loop and add its index to the sum on a match.
int count_match(const char* str, int length, char digit)
{
    int output = 0;
    for (int index = 0; index < length; ++index)
        // Branchless: -(true) is all ones, so the mask keeps index; -(false) is 0.
        output += index & -(str[index] == digit);
    return output;
}
If the length of the string could be known at compile time, then you could conceivably template that value out and let the compiler vectorize the loop for you.
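One way to read "template that value out" is sketched below, assuming the string is held in a std::array so that its length N is part of the type. The name sum_positions_fixed is hypothetical, and whether the loop is actually vectorized depends on the compiler and its optimization flags.

```cpp
#include <array>
#include <cstddef>

// Length N is a compile-time constant, so the compiler sees a fixed trip count.
template <std::size_t N>
constexpr int sum_positions_fixed(const std::array<char, N>& str, char target) {
    int output = 0;
    for (std::size_t index = 0; index < N; ++index) {
        // Same branchless mask as in count_match: all ones on a match, 0 otherwise.
        output += static_cast<int>(index) & -static_cast<int>(str[index] == target);
    }
    return output;
}
```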
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
There have already been some questions on this topic (1, 2, 3). The thing is that there doesn't seem to be a clear-cut answer. Some answers suggest size_t (1, 2), some suggest ptrdiff_t (1, 2). Other options include int, uint32_t, auto or using decltype on .size() of a container or the member type size_type.
This question may seem unsuitable as being opinion-based, but I don't think that's the case. Just because there isn't already a consensus on which type to use doesn't mean that an objective answer cannot exist. The different choices aren't only aesthetic; they can actually influence the behavior of the code.
For example, using an index variable whose signedness does not match the other operand in the loop condition will cause compiler warnings. Also, using a type whose range is too small can cause an overflow, which in the case of signed types is UB. At the same time, in some cases changing the loop counter type can cause "crazy performance deviations".
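To make the signedness point concrete, here is a small sketch of two of the candidate index types applied to the same loop. The function names are hypothetical, and this illustrates the trade-off rather than recommending either type.

```cpp
#include <cstddef>
#include <vector>

// Unsigned index: same signedness as v.size(), so the comparison is clean.
long long sum_with_size_t(const std::vector<int>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) total += v[i];
    return total;
}

// Signed index: the size is converted once, explicitly, before the loop.
long long sum_with_ptrdiff_t(const std::vector<int>& v) {
    long long total = 0;
    const auto n = static_cast<std::ptrdiff_t>(v.size());
    for (std::ptrdiff_t i = 0; i < n; ++i) total += v[static_cast<std::size_t>(i)];
    return total;
}

// By contrast, `for (int i = 0; i < v.size(); ++i)` compares an int with an
// unsigned value, which GCC and Clang can report under -Wsign-compare.
```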
I also wanted to find out which is the most popular, though not necessarily the best, way to write a for loop, so I used GitHub* search. Here are the results:
Loop type | Code result count on GitHub (averaged; "manual" loop + range-based)
for (int | 15.8m
for (size_t | 11.6m
for (auto | 7.5m
for (uint32_t | 2.3m
std::for_each | 501k
for (ptrdiff_t | 98.7k
for (decltype | 77.5k
There are certainly large differences in occurrence count between the loop types; however, there doesn't seem to be a clear leader.
So my question is: what is the best type to use for the index variable in a for loop in C++, or what are the rules or conditions based on which this type should be chosen?
*: The GitHub search tool produces varying results for "code results" (count) each time, so I averaged 26 values. As the search is text-based it includes both results of the form for (int i = 0; i < n; ++i) and for (int i : vec).
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
I was just playing with the factorial problem using recursion when I wrote the following code.
I know that I could return the factorial directly. However, I created a variable result and wrote the code below.
What I want to know is: haven't I created n result variables in the process (where n is the number whose factorial I want to calculate)? Whenever my function factorial is called, a result variable is created, and each of those result variables holds some value.
long long factorial(long long param) {
    long long result;
    if (param == 1) return 1;
    else {
        result = param * factorial(param - 1);
    }
    return result;
}
I know this is not good code, and I didn't think it would give me the right answer. However, to my surprise, it does. I want to know what is going on in this program.
Your function is a recursive function. You can read about recursion, and about recursive debugging here:
https://www.programiz.com/cpp-programming/recursion
https://beginnersbook.com/2017/08/cpp-recursion/
First of all: your function is unable to compute 0!. The only base case is param == 1, so factorial(0) never reaches it.
Second, yes, without any optimization from the compiler your program will take up unnecessary resources. The function is called n times, so the stack grows by n frames. Within each stack frame a temporary result is stored on the stack.
However, with this program being so small, it is very likely that a minimal compiler effort will optimize that away in a release build.
It is also possible to write a recursion without the stack growing: define your factorial in such a way that there are never temporary values involved. If f(n, a) := n == 0 ? a : f(n-1, n*a) then factorial(n) := f(n, 1)
This recursion just keeps an accumulated result, which is a fine example of functional programming. The stack needn't grow.
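The accumulator definition above translates to C++ roughly as follows. This is a sketch: the names factorial_acc and factorial_tail are hypothetical, n is assumed to be non-negative, and C++ does not guarantee that the tail call is turned into a loop (optimizing compilers commonly do so, but it is not required).

```cpp
// f(n, a) := n == 0 ? a : f(n - 1, n * a)
// The recursive call is the last thing the function does (a tail call),
// so no pending multiplication has to be kept in the caller's frame.
long long factorial_acc(long long n, long long acc) {
    if (n == 0) return acc;
    return factorial_acc(n - 1, n * acc);
}

// factorial(n) := f(n, 1); assumes n >= 0 and that the result fits in long long.
long long factorial_tail(long long n) {
    return factorial_acc(n, 1);
}
```

Unlike the version in the question, this handles 0 as well, because the base case is n == 0.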
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I have recently got a basic idea of bit manipulation, and I was going through a problem where I found this C++ statement:
int popcount[1<<16];
I do have a basic idea of left/right bit shifts, but I would like to know why one is used here as the array size.
Unless you find a comment in the code, or find out what the intent of popcount is, one can only guess why someone writes 1 << 16 instead of, for example, 65536.
A common case could be that you want to count the number of occurrences of a particular id in, for example, a file. If the range of such an id were 16 bits, then such code could look as follows. The [1<<16] then expresses that you expect a range of not more than 16 bits:
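#include <cstdint>
#include <fstream>

int popcounts[1<<16] = { 0 };  // one counter per possible 16-bit id

int main() {
    std::ifstream myfile("ids.txt");  // hypothetical input file of ids
    std::uint16_t id;
    while (myfile >> id) {
        popcounts[id]++;
    }
}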
Note that this is more accurate than writing int popcounts[UINT_MAX]: UINT_MAX is only guaranteed to be at least 65535, not exactly that value (it is often much larger), and an array of 65535 elements would still be one short of the 65536 possible 16-bit values.
1<<16 is a common way to write 2 ** 16, which is easier to verify and modify than the "magic number" 65536. You may also encounter things like 1000 * 1000 instead of 1000000 for the same reason (although C++14 allows for 1000'000).
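As a small illustration of the readability argument, the size can be given a name and checked at compile time. The names kTableSize and table are hypothetical, and the sketch assumes int is at least 32 bits wide so that 1 << 16 does not overflow.

```cpp
#include <cstddef>

// The shift states the intent directly: one slot per 16-bit value.
constexpr std::size_t kTableSize = 1 << 16;

// Both spellings denote the same number; the compiler can confirm it.
static_assert(kTableSize == 65536, "1 << 16 is 2 to the 16th power");
static_assert(1000 * 1000 == 1'000'000, "digit separators need C++14");

int table[kTableSize] = {};  // zero-initialized, 65536 elements
```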
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
I am looking for a hash table implementation that I can use for CUDA coding. Are there any good ones out there? Something like the Python dictionary. I will use strings as my keys.
Alcantara et al. have demonstrated a data-parallel algorithm for building hash tables on the GPU. I believe the implementation was made available as part of CUDPP.
That said, you may want to reconsider your original choice of a hash table. Sorting your data by key and then performing lots of queries en masse should yield much better performance in a massively parallel setting. What problem are you trying to solve?
When I wrote an OpenCL kernel to create a simple hash table for strings, I used the hash algorithm from Java's String.hashCode(), and then just modded that over the number of rows in the table to get a row index.
Hashing function
uint getWordHash(__global char* str, uint len) {
    uint hash = 0, multiplier = 1;
    for(int i = len - 1; i >= 0; i--) {
        hash += str[i] * multiplier;
        int shifted = multiplier << 5;
        multiplier = shifted - multiplier;
    }
    return hash;
}
Indexing
uint hash = getWordHash(word, len);
uint row = hash % nRows;
I handled collisions manually of course, and this approach worked well when I knew the number of strings ahead of time.
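For checking the kernel's results on the host, the same hashing and indexing steps can be mirrored in plain C++. This is a sketch under assumptions: word_hash and row_for are hypothetical names, characters are treated as unsigned bytes (which matches the kernel for ASCII input), and collision handling is left out, as in the kernel code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>

// Same scheme as Java's String.hashCode(): s[0]*31^(n-1) + ... + s[n-1],
// computed from the last character backwards with wrapping 32-bit arithmetic.
std::uint32_t word_hash(std::string_view word) {
    std::uint32_t hash = 0, multiplier = 1;
    for (std::size_t i = word.size(); i-- > 0;) {
        hash += static_cast<unsigned char>(word[i]) * multiplier;
        multiplier = (multiplier << 5) - multiplier;  // multiplier *= 31
    }
    return hash;
}

// Row index: the hash reduced modulo the number of rows (n_rows must be > 0).
std::uint32_t row_for(std::string_view word, std::uint32_t n_rows) {
    return word_hash(word) % n_rows;
}
```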
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I'm a newbie in C++ and I'm writing a program that asks the user to input two integers and then raises the first integer to the power specified by the second. For example, if the user enters 5 and 8, the result will be 5 superscript 8, i.e., five raised to the eighth power. The program must not use any predefined C++ functions (like the pow function) for this task. The program should allow the user to perform another calculation if they so desire. Can anyone help?
I'm not going to give you any code, because that won't allow you to truly explore this concept. Rather, you should use this pseudo code to implement something on your own.
Create a function which accepts two inputs, the base and the exponent.
Now there are several ways to go about doing this. You can use efficient bit shifting, but let's start simple, shall we?
answer = 1
i = 1
while i is less than or equal to exponent
    answer = answer * base
    i = i + 1
return answer
Simply loop, multiplying the running answer by the base once per iteration.
There are other ways that focus on efficiency. Look here to see something that you may want to attempt: are 2^n exponent calculations really less efficient than bit-shifts?
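One common efficiency-focused technique is exponentiation by squaring, which uses the bits of the exponent and needs only about log2(exponent) multiplications. The sketch below is one way to write it without any library function; the name int_pow is hypothetical, and it assumes a non-negative exponent and a result that fits in long long.

```cpp
// Integer power by repeated squaring.
// Each loop iteration consumes one bit of the exponent.
long long int_pow(long long base, unsigned exponent) {
    long long result = 1;
    while (exponent > 0) {
        if (exponent & 1u) {
            result *= base;  // this bit contributes the current power of base
        }
        exponent >>= 1;      // move on to the next bit
        if (exponent > 0) {
            base *= base;    // square only if another bit remains
        }
    }
    return result;
}
```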
The program must not use any pre-defined C++ functions (like pow function) for this task
You can use a piece of C++ code like the following to compute x^y without using any predefined function:
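#include <iostream>

int main()
{
    int x = 5;  // base
    int y = 3;  // exponent (assumed non-negative)
    int result = 1;
    for(int i = 0; i < y; ++i)
    {
        result *= x;  // multiply by the base y times
    }
    std::cout << result << std::endl;
}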
Output:
125