Suppose that I have an array. I want to remove all the elements within the array that have a given value. Does anyone know how to do this? The value I am trying to remove may occur more than once and the array is not necessarily sorted. I would prefer to filter the array in-place instead of creating a new array. For example, removing the value 2 from the array [1, 2, 3, 2, 4] should produce the result [1, 3, 4].
This is the best thing I could come up with:
T[] without(T)(T[] stuff, T thingToExclude) {
    T[] result;
    foreach (thing; stuff) {
        if (thing != thingToExclude) {
            result ~= thing;
        }
    }
    return result;
}
stuff = stuff.without(thingToExclude);
writeln(stuff);
This seems unnecessarily complex and inefficient. Is there a simpler way? I looked at the std.algorithm module in the standard library hoping to find something helpful but everything that looked like it would do what I wanted was problematic. Here are some examples of things I tried that didn't work:
import std.stdio, std.algorithm, std.conv;
auto stuff = [1, 2, 3, 2, 4];
auto thingToExclude = 2;
/* Works fine with a hard-coded constant but compiler throws an error when
given a value unknowable by the compiler:
variable thingToExclude cannot be read at compile time */
stuff = filter!("a != " ~ to!string(thingToExclude))(stuff);
writeln(stuff);
/* Works fine if I pass the result directly to writeln but compiler throws
an error if I try assigning it to a variable such as stuff:
cannot implicitly convert expression (filter(stuff)) of type FilterResult!(__lambda2,int[]) to int[] */
stuff = filter!((a) { return a != thingToExclude; })(stuff);
writeln(stuff);
/* Mysterious error from compiler:
template to(A...) if (!isRawStaticArray!(A)) cannot be sliced with [] */
stuff = to!int[](filter!((a) { return a != thingToExclude; })(stuff));
writeln(stuff);
So, how can I remove all occurrences of a value from an array without knowing the indexes where they appear?
std.algorithm.filter is pretty close to what you want: your second try is good.
You'll want to either assign it to a new variable or use the array() function on it.
auto stuffWithoutThing = filter!((a) { return a != thingToExclude; })(stuff);
// use stuffWithoutThing
or
stuff = array(filter!((a) { return a != thingToExclude; })(stuff));
The first one does NOT create a new array. It just provides iteration over the thing with the given thing filtered out.
The second one will allocate memory for a new array to hold the content. You must import the std.array module for it to work.
Look up function remove in http://dlang.org/phobos/std_algorithm.html. There are two strategies - stable and unstable depending on whether you want the remaining elements to keep their relative positions. Both strategies operate in place and have O(n) complexity. The unstable version does fewer writes.
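If you want to remove the values, you can use remove:
auto stuffWithoutThing = remove!((a) { return a == thingToExclude; })(stuff);
This will not allocate a new array but works in place, so stuff needs to be mutable. Note that remove returns the shortened range; stuff itself keeps its original length, so use the returned value (or assign it back to stuff).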
I am trying to find the element of an array that has the minimum absolute value. For example, in the array [5.1, -2.2, 8.2, -1, 4, 3, -5, 6], I want to get the value -1. I use the following code (myarray is a 1D array and is not sorted):
for (int i = 1; i < 8; ++i)
{
if(fabsf(myarray[i])<fabsf(myarray[0])) myarray[0] = myarray[i];
}
Then, the target value is in myarray[0].
Because I have to repeat this procedure many times, this piece of code becomes the bottleneck in my program. Does anyone know how to improve this code? Thanks in advance!
BTW, the size of the array is always eight. Could this be used to optimize this code?
Update: so far, following code works slightly better on my machine:
float absMin = fabsf(myarray[0]); int index = 0;
for (int i = 1; i < 8; ++i)
{
if(fabsf(myarray[i])<absMin) {absMin = fabsf(myarray[i]); index=i;}
}
float result = myarray[index];
I am wondering how to avoid fabsf, because I just want to compare the absolute values instead of computing them. Does anyone have any idea?
There are some urban myths like inlining, loop unrolling by hand and similar which are supposed to make your code faster. Good news is you don't have to do it, at least if you use -O3 compiler optimization.
Bad news is, if you already use -O3 there is nothing you can do to speed up this function: the compiler will optimize the hell out of your code! For example it will surely do the caching of fabsf(myarray[0]) as some suggested. The only thing you can achieve with this "refactoring" is to build bugs into your program and make it less readable.
My advice is to look somewhere else for improvements:
try to reduce the number of invocations of this code
if this code is the bottleneck, then my guess would be that you recalculate the minimal value over and over again (otherwise filling the values into the array would take approximately the same time) - so cache the results of the search
shift costs to changing the elements of the array, for example by using some fancy data structures (heaps, priority_queue) or by tracking the minimum of the elements. Let's say your array has only two elements with values [1,2], so the minimum is 1. Now if you change
2 to 3, you don't have to do anything
2 to 0, you can easily update your minimum to 0
1 to 3, you have to loop through all elements. But maybe this case is not that often.
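The update rules above can be sketched roughly as follows. This is only an illustration under assumptions: the class name MinAbsTracker and its members set and minAbs are made up for this example, and whether it beats a plain scan over eight floats would have to be measured.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper: stores eight floats plus the index of the element
// with the smallest absolute value, so reading the minimum is O(1).
class MinAbsTracker {
public:
    explicit MinAbsTracker(const std::array<float, 8>& values)
        : data_(values) { rescan(); }

    // Change one element and keep the cached minimum up to date.
    void set(std::size_t i, float value) {
        const float oldMinAbs = std::fabs(data_[minIndex_]);
        data_[i] = value;
        if (std::fabs(value) <= oldMinAbs) {
            minIndex_ = i;   // the new value is now the minimum
        } else if (i == minIndex_) {
            rescan();        // the old minimum grew: full scan needed
        }
        // otherwise the cached minimum is still valid
    }

    // Element with the smallest absolute value (sign preserved).
    float minAbs() const { return data_[minIndex_]; }

private:
    void rescan() {
        minIndex_ = 0;
        for (std::size_t i = 1; i < data_.size(); ++i)
            if (std::fabs(data_[i]) < std::fabs(data_[minIndex_]))
                minIndex_ = i;
    }

    std::array<float, 8> data_;
    std::size_t minIndex_ = 0;
};
```

Only the third case from the list (the current minimum being replaced by a larger value) triggers the loop over all elements.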
Can you store the values with fabsf already applied?
Also, as @Gerstrong mentions, storing the minimum outside the loop and only recalculating it when the array changes will give you a boost.
Calling partial_sort or nth_element will sort the array only as far as needed to put the correct value in the right location.
std::nth_element(v.begin(), v.begin(), v.end(), [](float lhs, float rhs) {
    return fabsf(lhs) < fabsf(rhs);
});
Let me give some ideas that could help:
float minVal = fabsf(myarray[0]);
for (int i = 1; i < 8; ++i)
{
if(fabsf(myarray[i])<minVal) minVal = fabsf(myarray[i]);
}
myarray[0] = minVal;
But compilers nowadays are very smart and you might not get any more speed, as you already get optimized code. It depends on how your mentioned piece of code is called.
Another way to optimize this maybe is using C++ and STL, so you can do the following using the typical binary search tree std::set:
// Absolute-value comparator for std::set (needs <cmath> and <set>)
struct absless_compare
{
    bool operator()(float a, float b) const
    {
        return std::fabs(a) < std::fabs(b);
    }
};
std::set<float, absless_compare> mySet = {5.1f, -2.2f, 8.2f, -1.0f, 4.0f, 3.0f, -5.0f, 6.0f};
const float minVal = *(mySet.begin());
With this approach, the numbers are kept sorted in ascending order of absolute value as you insert them. The default comparator of std::set is std::less, but you can replace it with something different, as in this example. Note that the comparator has to be a type (here a function object), and that two values with the same absolute value, such as 5 and -5, count as duplicates, so only one of them is kept. This might help on larger datasets, but you mentioned you only have eight values to compare, so it really will not help.
Eight elements is a very small number, which might be kept on the stack, for example by declaring std::array<float,8> myarray close to your search function before filling it with data. You should test these variants on your full code base and observe what helps. Of course, whether you declare std::array<float,8> myarray or float myarray[8], you should get the same runtime results.
What you could also check is whether fabsf really takes a float parameter and does not convert your variable to double, which would degrade performance. There is also std::abs(), which to my understanding picks the right data type, because in C++ it is overloaded for the floating-point types.
If you don't want to use fabs, you could obviously write a function like this:
float myAbs(const float val)
{
return (val<0) ? -val : val;
}
or you could clear the sign bit, the bit that makes your number negative. Either way, I'm pretty sure that fabsf is fully aware of that, and I don't think code like that will make it faster.
So I would check if the argument is converted to double. If you have C99 Standard in your system though, you should not have that issue.
One thought would be to do your comparisons "tournament" style, instead of linearly. In other words, you first compare 1 with 2, 3 with 4, etc. Then you take those 4 elements and do the same thing, and then again, until you only have one element left.
This does not change the number of comparisons. Since each comparison eliminates one element from the running, you will have exactly 7 comparisons no matter what. So why do I suggest this? Because it removes data dependencies from your code. Modern processors have multiple pipelines and can retire multiple instructions simultaneously. However, when you do the comparisons in a loop, each loop iteration depends on the previous one. When you do it tournament style, the first four comparisons are completely independent, so the processor may be able to do them all at once.
In addition to doing that, you can compute all the fabs at once in a trivial loop and put it in a new array. Since the fabs computations are independent, this can get sped up pretty easily. You would do this first, and then the tournament style comparisons to get the index. It should be exactly the same number of operations, it's just changing the order around so that the compiler can more easily see larger blocks that lack data dependencies.
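A minimal sketch of the tournament idea for exactly eight floats might look like this. The function name minAbsTournament is an assumption made for illustration, and whether the compiler and CPU actually exploit the independent comparisons has to be checked by measuring.

```cpp
#include <cmath>
#include <cstddef>

// Returns the element of a[0..7] with the smallest absolute value.
float minAbsTournament(const float (&a)[8]) {
    // Round 0: all absolute values, each independent of the others.
    float m[8];
    for (std::size_t i = 0; i < 8; ++i) m[i] = std::fabs(a[i]);

    // Round 1: four independent comparisons; keep the winning indices.
    const std::size_t w0 = m[0] < m[1] ? 0 : 1;
    const std::size_t w1 = m[2] < m[3] ? 2 : 3;
    const std::size_t w2 = m[4] < m[5] ? 4 : 5;
    const std::size_t w3 = m[6] < m[7] ? 6 : 7;

    // Round 2: two independent comparisons.
    const std::size_t s0 = m[w0] < m[w1] ? w0 : w1;
    const std::size_t s1 = m[w2] < m[w3] ? w2 : w3;

    // Final: one comparison, seven in total.
    return a[m[s0] < m[s1] ? s0 : s1];
}
```

For the array from the question, this returns -1, the same result as the linear loop.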
The element of an array with minimal absolute value
Let A be the array (here an Eigen vector):
A = [5.1, -2.2, 8.2, -1, 4, 3, -5, 6]
The minimal absolute value of A is
double miniAbsValue = A.array().abs().minCoeff();
int i_minimum = 0; // to find the position of minimum absolute value
for(int i = 0; i < 8; i++)
{
double ftn = A(i);
if( fabs(ftn) == miniAbsValue )
{
i_minimum = i;
}
}
Now the element of A with minimal absolute value is
A(i_minimum)
To be practical in the future I'll use standard lib's vector, but right now I'm trying to create some of the basic data structures to better learn C++ (I'm migrating from Java).
I've gotten almost everything working, except for the remove method. I want to get the element I'm removing from the array
template <class generic_type> generic_type & ArrayList<generic_type>::remove(const unsigned int index)
{
check_range_get(index);
generic_type & temp = data_array[index];
for(int i=index;i<size()-1;++i)
{
data_array[i]=data_array[i+1];
}
--number_of_elements;
return temp;
}
The method removes the correct index, so if you have a collection of numbers 0 through 4.
0, 1, 2, 3, 4
If my remove method is called with index 0, you get:
1, 2, 3, 4
HOWEVER, it doesn't return the correct number. It returns 1 instead of returning 0. I believe that this is because my method overrides the reference to the number in the first index.
To fix this I can change generic_type & temp to generic_type temp, which will return the correct value, but to my understanding this means that the value is actually duplicated: a copy is made. For a simple primitive type, this isn't so bad; but for more complex objects and a larger number N of elements in our collection, duplication doesn't sound like the best thing that can be done.
Is there a way to fix this?
Thanks to all in advance.
To optimize this code, one possible solution is to use C++11's move semantics:
#include <utility>
generic_type temp{ std::move(data_array[index]) };
and return this by value, not reference.
Additionally, please note that you are already making A LOT of copies within your loop. You can apply the same technique there:
for(int i=index;i<size()-1;++i)
{
data_array[i] = std::move(data_array[i+1]);
}
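Put together, the two suggestions could look like the sketch below. The class is a cut-down stand-in for the question's ArrayList: the fixed capacity of 16, the add helper, and the member names data_ and count_ are assumptions made to keep the example short, and T is assumed to be default-constructible.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>

// Minimal stand-in for the question's ArrayList (fixed capacity assumed).
template <class T>
class ArrayList {
public:
    void add(T value) {
        if (count_ == 16) throw std::length_error("ArrayList is full");
        data_[count_++] = std::move(value);
    }

    // Removes the element at index and returns it by value.
    T remove(std::size_t index) {
        if (index >= count_) throw std::out_of_range("index out of range");
        T removed(std::move(data_[index]));      // move out, no deep copy
        for (std::size_t i = index; i + 1 < count_; ++i)
            data_[i] = std::move(data_[i + 1]);  // shift left by moving
        --count_;
        return removed;
    }

    std::size_t size() const { return count_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    T data_[16]{};
    std::size_t count_ = 0;
};
```

Because removed is a separate object, later overwriting data_[index] in the loop no longer changes the returned value, which was the problem with returning a reference.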
I just wonder if I can use a "complicated" map as the value of another map. I have self-defined several structs as follow:
typedef std::vector<std::string> pattern;
typedef std::map<int, std::vector<pattern>> dimPatternsMap;
typedef std::map<int, dimPatternsMap> supportDimMapMap;
OK let me explain these things...pattern is a vector of strings. For the "smaller" map dimPatternsMap, the key is an integer which is the dimension of pattern (the size of that vector containing strings) and the value is vector containing patterns (which is a vector of vectors...).
The "bigger" map supportDimMapMap also uses an integer as the key, but uses dimPatternsMap as its value. The key means "support count".
Now I begin to construct this "complicated" map:
supportDimMapMap currReverseMap;
pattern p = getItFromSomePlace(); //I just omit the process I got pattern and its support
int support = getItFromSomePlaceToo();
if(currReverseMap.find(support) == currReverseMap.end()) {
dimPatternsMap newDpm;
std::vector<pattern> newPatterns;
newPatterns.push_back(currPattern);
newDpm[dim] = newPatterns;
currReverseMap[support] = newDpm;
} else{
dimPatternsMap currDpm = currReverseMap[support];
if(currDpm.find(dim) == currDpm.end()) {
std::vector<pattern> currDimPatterns;
currDimPatterns.push_back(currPattern);
currDpm[dim] = currDimPatterns;
} else {
currDpm[dim].push_back(currPattern);
}
}
Forgive me, the code is really a mess...
But then as I want to traverse the map like:
for(supportDimMapMap::iterator iter = currReverseMap.begin(); iter != currReverseMap.end(); ++iter) {
int support = iter->first;
dimPatternsMap dpm = iter->second;
for(dimPatternsMap::iterator ittt = dpm.begin(); ittt != dpm.end(); ++ittt) {
int dim = ittt->first;
std::vector<pattern> patterns = ittt->second;
int s = patterns.size();
}
}
I found the value s is always 1, which means that for each unique support value and for each dimension of that support value, there is only one pattern! But as I debug my code in the map constructing process, I indeed found that the size is not 1 - I actually added the new patterns into the map successfully...But when it comes to traversing, all the sizes become 1 and I don't know why...
Any suggestions or explanations will be greatly appreciated! Thanks!!
dimPatternsMap currDpm = currReverseMap[support];
currDpm is a copy of currReverseMap[support]. It is not the same object. So then when you make changes to currDpm, nothing within currReverseMap changes.
On the other hand, if you use a reference:
dimPatternsMap& currDpm = currReverseMap[support];
then currDpm and currReverseMap[support] really are the same object, so later statements using currDpm will really be changing a value within currReverseMap.
There are a few other places where your code could benefit from references too.
My guess: you should use a reference in your else:
dimPatternsMap& currDpm = currReverseMap[support];
Your current code creates a copy instead of using the original map.
Your problem is this line:
dimPatternsMap currDpm = currReverseMap[support];
Based on the code following it, it wants to read like this:
dimPatternsMap& currDpm = currReverseMap[support];
Without the & you modify a copy of the entry rather than the existing entry.
Your code is making several copies of the objects underneath, try using more references and iterators (find() already gives you an element if it was found, for example).
For example, dimPatternsMap currDpm = currReverseMap[support]; actually makes a copy of a map in your structure and adds an element to it (not to the original). Try using a reference instead.
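As a sketch of what using references throughout could look like: since operator[] on std::map returns a reference and default-constructs missing entries, the whole insert logic can collapse to one line. The helper names addPattern and countPatterns are hypothetical, and the dimension is assumed to be the pattern's size, as described in the question.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> pattern;
typedef std::map<int, std::vector<pattern>> dimPatternsMap;
typedef std::map<int, dimPatternsMap> supportDimMapMap;

// Hypothetical helper: files a pattern under its support count and dimension.
void addPattern(supportDimMapMap& reverseMap, int support, const pattern& p) {
    const int dim = static_cast<int>(p.size());
    // Each operator[] returns a reference into the map itself, so the
    // push_back modifies the stored vector, not a temporary copy.
    reverseMap[support][dim].push_back(p);
}

// Counts all stored patterns; const references avoid copying inner maps.
std::size_t countPatterns(const supportDimMapMap& reverseMap) {
    std::size_t total = 0;
    for (const auto& supportEntry : reverseMap)
        for (const auto& dimEntry : supportEntry.second)
            total += dimEntry.second.size();
    return total;
}
```

The traversal loop in the question copies as well (dimPatternsMap dpm = iter->second), which is harmless for reading but costs time; binding to const references as in countPatterns avoids that.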
What's the real difference between a foreach and for loop if either can get the same job done? I'm learning C++ and apparently there is no foreach loop for its arrays :(
There is no "foreach" language construct in C++, at least not literally. C++11 introduces something that's "as good as" a foreach loop, though.
The traditional for loop has something to do with evaluating conditions and performing repeated operations. It's a very general control structure. Its most popular use is to iterate over container or array contents, but that's just a tiny fraction of what you can do with it.
A "foreach" loop, on the other hand, is explicitly designed to iterate over container elements.
Example:
int arr[5] = { 1, 3, 5, 2, 4 };
for (int & n : arr) { n *= 2; } // "for-each" loop, new in C++11
for (size_t i = 0; i != 5; ++i) { arr[i] *= 2; } // "classic" for loop
In the second for, we use a traditional for loop to increment an auxiliary variable i in order to access the container arr. The first, range-based loop does not expose any details of the iteration, but just says "do this and that to each element in the collection".
Since the traditional for loop is a very general control structure, it can also be used in unusual ways:
std::vector<std::string> all_lines;
for (std::string line; std::cin >> line; all_lines.push_back(line))
{
std::cout << "On line " << (all_lines.size() + 1) << " you said: " << line << std::endl;
}
You can trivially rewrite for(A; B; C) as a while loop:
{ // scope!
A;
while (true && B)
{
{ // more scope!
/* for loop body */
}
C;
}
}
Edit: I would probably be remiss not to mention the library function template std::for_each from <algorithm>, which in conjunction with lambdas is a very nice and self-descriptive way to iterate over arbitrary ranges (not just entire containers). It has existed since Day 1, but before lambdas it was a show-stopping pain to use.
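For illustration, here is a small sketch of std::for_each with a lambda applied to a sub-range rather than a whole container. The function name doubleInner is made up for this example.

```cpp
#include <algorithm>
#include <vector>

// Doubles every element except the first and the last one, to show that
// std::for_each works on any iterator range, not only whole containers.
void doubleInner(std::vector<int>& v) {
    if (v.size() < 3) return;  // nothing between the two ends
    std::for_each(v.begin() + 1, v.end() - 1, [](int& n) { n *= 2; });
}
```

Called on a vector holding 1, 3, 5, 2, 4, this leaves the ends alone and doubles the three middle elements.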
Update: I thought of something else that might be relevant here: A "foreach" loop generally assumes that you don't modify the container. A common type of looping that modifies the container requires the traditional for-loop; as for example in this typical erase pattern:
for(Container::const_iterator it = v.begin(); it != v.end() /* not hoisted! */; /* no increment */ )
{
// do something
if (suitable_condition)
{
v.erase(it++); // or it = v.erase(it), depending on container type
}
else
{
++it;
}
}
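A concrete instance of this erase pattern, as a sketch: the function name eraseEven and the "is even" condition are placeholders chosen for the example. It uses the it = c.erase(it) form, which works for the standard containers whose erase returns the next iterator (all of them since C++11).

```cpp
#include <vector>

// Erases all even elements while iterating over the container.
template <class Container>
void eraseEven(Container& c) {
    for (auto it = c.begin(); it != c.end(); /* no increment */) {
        if (*it % 2 == 0)
            it = c.erase(it);  // erase returns the next valid iterator
        else
            ++it;
    }
}
```

Note that c.end() is re-evaluated on every pass, because erasing from a vector invalidates the old end iterator.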
foreach generally has 1 parameter, for has 3. Anything foreach can do for can too. Part of the reason why foreach doesn't exist in C++ is because the number of iterations can't always be inferred from the type.
I believe the Boost library has a way of getting foreach to work, and C++11 has a range-based for:
int my_array[5] = {1, 2, 3, 4, 5};
for (int &x : my_array) {
x *= 2;
}
There is something like for each for arrays in C++: iterators. Both loops are essentially identical. The only difference is that an ordinary for loop gives you an index, which you might need depending on the type of data you are accessing and on whether you need to do calculations with the index, and it (probably) comes with an increased chance of off-by-one errors. A foreach loop just guarantees that the body will be executed as many times as there are elements in the array, without exposing an index (which you can mimic). So, as I said, they are essentially the same, but their usage largely depends on the way you manipulate your data.
"For Each" syntax is used to iterate through a collection of objects, while a for loop is a loop that will execute for a given range. C++ does have for_each in its STL and can be used to iterate through linear object containers such as a vector.
In other languages with a foreach construct, they're usually convenience for not having to index into the collection you're looping over. That is, you're given the next object in the collection without having access to (or need for) the index itself. If you need the index for some reason, you'll usually need the for loop, though in some languages you have access to the counter in their 'foreach'.
In my experience from testing, for is faster than foreach.
Is there any way to check if a given index of an array exists?
I am trying to set numerical indexes, but sparse ones such as 1, 5, 6, 10. So I want to see if these indexes already exist, and if they do, just increase another counter.
I normally work with PHP but I am trying to do this in C++, so basically I am asking whether there is something like isset() in C++.
PS: Would this be easier with vectors? If so, can anyone point me to a good vector tutorial? Thanks
In C++, the size of an array is fixed when it is declared, and while you can access off the end of the declared array size, this is very dangerous and the source of hard-to-track-down bugs:
int i[10];
i[10] = 2; // Compiles, but is undefined behavior! Writing to memory you don't know about
It seems that you want array-like behavior, but without all elements being filled. Traditionally, this is in the realms of hash-tables. Vectors are not such a good solution here as you will have empty elements taking up space, much better is something like a map, where you can test if an element exists by searching for it and interpreting the result:
#include <iostream>
#include <map>
#include <string>
// Declare the map - integer keys, string values
std::map<int, std::string> a;
// Add an item at an arbitrary location
a[2] = std::string("A string");
// Find a key that isn't present
if(a.find(1) == a.end())
{
// This code will be run in this example
std::cout << "Not found" << std::endl;
}
else
{
std::cout << "Found" << std::endl;
}
One word of warning: Use the above method to find if a key exists, rather than something like testing for a default value
if(a[2] == 0)
{
a[2] = myValueToPutIn;
}
as the behavior of a map is to insert a default constructed object on the first access of that key value, if nothing is currently present.
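The difference between the two checks can be sketched as follows. The function names hasKey and hasKeyAfterIndexing are made up for this illustration.

```cpp
#include <map>
#include <string>

// Returns true if key is present; never modifies the map.
bool hasKey(const std::map<int, std::string>& m, int key) {
    return m.find(key) != m.end();
}

// Shows the pitfall: reading through operator[] creates the entry first.
bool hasKeyAfterIndexing(std::map<int, std::string>& m, int key) {
    (void)m[key];           // inserts an empty string if key was missing
    return hasKey(m, key);  // therefore always true
}
```

After calling hasKeyAfterIndexing with a missing key, the map has grown by one element, which is exactly the side effect the warning above is about.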
My personal vote is for using a vector. They will resize dynamically, and as long as you don't do something stupid (like try and access an element that doesn't exist) they are quite friendly to use.
As for tutorials the best thing I could point you towards is a google search
To do this without vectors, you can simply check the index you are trying to access against the size of the array: if (index < array_size), it is a valid index.
In case the size is not known to you, you can find it using the sizeof operator (this only works on the array itself, not on a pointer to its first element).
For example:
int arr[] = {5, 6, 7, 8, 9, 10, 1, 2, 3};
int arr_size = sizeof(arr)/sizeof(arr[0]);
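As a sketch, the size check can be wrapped in a small template that deduces the element count from the array type instead of computing it with sizeof. The name isValidIndex is an assumption for this example.

```cpp
#include <cstddef>

// Deduces the element count N from the array type, so it cannot be called
// with a plain pointer by mistake (where sizeof would give a wrong count).
template <class T, std::size_t N>
bool isValidIndex(const T (&)[N], std::size_t index) {
    return index < N;
}
```

Note that this only tells you whether the index lies inside the array, not whether a meaningful value was ever stored there; for that "isset" meaning, a map is the better fit.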
It sounds to me as though a map is really closest to what you want. You can use the std::map class in the STL (Standard Template Library) (http://www.cppreference.com/wiki/stl/map/start).
Maps provide a container for objects which can be referenced by a key (your "index").