I'm currently writing a Polynomial class in C++, which should represent a polynomial of the following form:
p(x) = a_0 + a_1*x^1 + a_2*x^2 + ... + a_i*x^i
where a_0, ..., a_i are all ints.
The class internally uses a member variable a_ of type std::vector<int> to store the constant factors a_0, ..., a_i. To access the constant factors, operator[] is overloaded in the following way:
Read and write:
int &operator[](int i)
{
return a_.at(i);
}
This will fail when trying to change one of the factors a_i with:
i > degree of polynomial = a_.size() - 1
Read-only:
int operator[](int i) const
{
if (i > this->degree()) {
return 0;
}
return a_.at(i);
}
The slightly different implementation allows rather comfortable looping over the factors of two different sized polynomials (without worrying about the degree of the polynomial).
Sadly I seem to be missing something here, since the operator* overload (which makes use of this comfortable read-only operator[]) fails.
operator* overload:
Polynomial operator*(const Polynomial &other) {
Polynomial res(this->degree() + other.degree());
for (int i = 0; i <= res.degree(); ++i) {
for (int k = 0; k <= i; ++k) {
res[i] += (*this)[k] * other[i-k];
}
}
return res;
}
Don't mind the math involved. The important point is that i is always in the range
0 <= i < res.a_.size()
thus writing to res[i] is valid. However, (*this)[k] and other[i-k] try to read from indices which don't necessarily lie in the range [0, (*this).a_.size() - 1].
This should be fine with our read-only implementation of operator[], right? I still get an error when trying to access a_ at invalid indices. What could cause the compiler to use the read-write implementation in the line:
res[i] += (*this)[k] * other[i-k];
Especially the part on the right side of the assignment.
I'm certain the error is caused by the "wrong" use of the read-and-write operator[], because an additional check fixes the invalid access:
if (k <= this->degree() && i-k <= other.degree()) {
res[i] += (*this)[k] * other[i-k];
}
What am I missing with the use of the operator[]-overloading? Why isn't the read-only-operator[] used here?
(*this)[k] is using the non-const this as the function containing it is not const.
Hence the non-const overload of [] is preferred by the compiler.
You could get round this using an ugly const_cast, but really you ought to keep the behaviour of the two versions of the [] operator as similar as possible. Besides, the std::vector overload of [] doesn't insist on the index being bounds checked, as opposed to at which must be. Your code is a deviation from this and therefore could confuse readers of your code.
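A minimal sketch of the overload selection described above, assuming a constructor that takes the degree (as the question's `Polynomial res(...)` call suggests). Declaring `operator*` as a const member makes `*this` const inside it, so `(*this)[k]` resolves to the read-only overload without any cast. The negative-index check is an addition for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Polynomial {
public:
    // Assumed constructor: creates a zero polynomial of the given degree.
    explicit Polynomial(int degree) : a_(static_cast<std::size_t>(degree) + 1, 0) {
        assert(degree >= 0);
    }

    int degree() const { return static_cast<int>(a_.size()) - 1; }

    // Read/write access: at() throws std::out_of_range beyond the degree.
    int& operator[](int i) { return a_.at(static_cast<std::size_t>(i)); }

    // Read-only access: coefficients outside the stored range read as 0.
    int operator[](int i) const {
        if (i < 0 || i > degree()) {
            return 0;
        }
        return a_.at(static_cast<std::size_t>(i));
    }

    // const member function: *this and other are both const here, so both
    // reads use the read-only overload; res is non-const, so res[i] writes.
    Polynomial operator*(const Polynomial& other) const {
        Polynomial res(degree() + other.degree());
        for (int i = 0; i <= res.degree(); ++i) {
            for (int k = 0; k <= i; ++k) {
                res[i] += (*this)[k] * other[i - k];
            }
        }
        return res;
    }

private:
    std::vector<int> a_;
};
```

If the member function has to stay non-const for some reason, `std::as_const(*this)[k]` (C++17, from `<utility>`) selects the same read-only overload.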
I was trying to solve this problem on LeetCode. It works fine in VS Code with the gcc compiler, but on the website I'm getting this error:
Runtime error: AddressSanitizer heap-buffer-overflow, followed by a long list of addresses. Help me fix it. Here's the code:
class Solution
{
public:
char nextGreatestLetter(vector<char> v, char a)
{
int l=v.size()-1;
if (v[l] < a)
{
return v[0];
}
int i = 0;
while (v[i] <= a)
{
i++;
}
return v[i];
}
};
P.S. The array is sorted in ascending order.
This code snippet has a lot of problems:
The while loop isn't guaranteed to terminate. If the last character of v is == a, then the first v[l] < a test will be false, but v[i] <= a might be true all the way through the array (it looks like v is meant to be pre-sorted into ascending order), which will have you eventually accessing v[i] for a value of i >= v.size(). That is an illegal/undefined array access and might be the source of the error message you report, if the test platform had strict bounds-checking enabled.
The logic of returning v[0] if a is greater than any character in v (again, inferring from the loop that v is supposed to be pre-sorted into ascending order) also seems flawed. Why not return the value of a instead? The caller can easily see that if the return value was <= a, then there clearly was no element of v greater than a.
It's almost certainly worth your time to handle cases where the passed-in v array is empty (v.size() == 0) or not actually pre-sorted, e.g. by caching n = v.size() and changing the loop condition to while (i < n && v[i] <= a). Don't let fragile functions creep into your codebase!
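The bounded loop suggested above could look roughly like this. It is a sketch that keeps the question's wrap-around to v[0]; returning `a` for an empty input is a placeholder policy chosen here, not something the problem statement defines.

```cpp
#include <cstddef>
#include <vector>

// Returns the first character in v that is strictly greater than a.
// Assumes v is sorted in ascending order. Wraps around to v[0] when no
// such character exists, and returns a itself when v is empty.
char nextGreatestLetter(const std::vector<char>& v, char a) {
    const std::size_t n = v.size();
    if (n == 0) {
        return a;  // placeholder policy for empty input
    }
    std::size_t i = 0;
    while (i < n && v[i] <= a) {  // the i < n test keeps every read in bounds
        ++i;
    }
    return i < n ? v[i] : v[0];
}
```

Taking the vector by const reference also avoids copying it on every call.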
Say I have a vector containing only positive, real elements defined like this:
Eigen::VectorXd v(5);
v << 1.3876, 8.6983, 5.438, 3.9865, 4.5673;
I want to generate a new vector v2 that has repeated the elements in v some k times. Then I want to apply k different functions to each of the repeated elements in the vector.
For example, if v2 was v repeated 2 times and I applied floor() and ceil() as my two functions, the result based on the above vector would be a column vector with values: [1; 2; 8; 9; 5; 6; 3; 4; 4; 5]. Preserving the order of the original values is important here as well. These values are a simplified example; in practice, I'm generating vectors v with ~100,000 or more elements and would like to make my code as vectorizable as possible.
Since I'm coming to Eigen and C++ from Matlab, the simplest approach I first took was to just convert this Nx1 vector into an Nx2 matrix, apply floor to the first column and ceil to the second column, take the transpose to get a 2xN matrix and then exploit the column-major nature of the matrix and reshape the 2xN matrix into a 2Nx1 vector, yielding the result I want. However, for large vectors, this would be very slow and inefficient.
This response by ggael effectively addresses how I could repeat the elements in the input vector by generating a sequence of indices and indexing the input vector. I could then generate more sequences of indices to apply my functions to the relevant elements of v2 and copy the results back to their respective places. However, is this really the most efficient approach? I don't fully grasp copy-on-write and move semantics, but I think the second indexing expressions would be in a sense redundant?
If that is true, then my guess is that a solution here would be some sort of nullary or unary expression where I could define an expression that accepts the vector, some index k and k expressions/functions to apply to each element and spits out the vector I'm looking for. I've read the Eigen documentation on the subject, but I'm struggling to build a functional example. Any help would be appreciated!
So, if I understand you correctly, you don't want to replicate (in terms of Eigen methods) the vector, you want to apply different methods to the same elements and store the result for each, correct?
In this case, computing it sequentially once per function is the easiest route. Most CPUs can only do one (vector) memory store per clock cycle, anyway. So for simple unary or binary operations, your gains have an upper bound.
Still, you are correct that one load is technically always better than two and it is a limitation of Eigen that there is no good way of achieving this.
Know that even if you manually write a loop that would generate multiple outputs, you should limit yourself in the number of outputs. CPUs have a limited number of line-fill buffers. IIRC Intel recommended using less than 10 "output streams" in tight loops, otherwise you could stall the CPU on those.
Another aspect is that C++'s weak aliasing restrictions make it hard for compilers to vectorize code with multiple outputs. So it might even be detrimental.
How I would structure this code
Remember that Eigen is column-major, just like Matlab. Therefore use one column per output function. Or just use separate vectors to begin with.
Eigen::VectorXd v = ...;
Eigen::MatrixX2d out(v.size(), 2);
out.col(0) = v.array().floor();
out.col(1) = v.array().ceil();
Following the KISS principle, this is good enough. You will not gain much if anything by doing something more complicated. A bit of multithreading might gain you something (less than factor 2 I would guess) because a single CPU thread is not enough to max out memory bandwidth but that's about it.
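If the interleaved order from the question ([floor; ceil; floor; ceil; ...]) is really required rather than one column per function, a plain loop can write it directly. The following is a standard-library sketch, not Eigen code; `floorCeilInterleaved` is a hypothetical name, and its performance relative to the column version was not measured.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Writes floor(v[i]) and ceil(v[i]) next to each other, so the output
// has twice the length of the input and preserves the input order.
std::vector<double> floorCeilInterleaved(const std::vector<double>& v) {
    std::vector<double> out(2 * v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[2 * i] = std::floor(v[i]);
        out[2 * i + 1] = std::ceil(v[i]);
    }
    return out;
}
```

Each input element is loaded once and both results are stored to adjacent slots, which is the single-pass behaviour the question was after.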
Some benchmarking
This is my baseline:
int main()
{
int rows = 100013, repetitions = 100000;
Eigen::VectorXd v = Eigen::VectorXd::Random(rows);
Eigen::MatrixX2d out(rows, 2);
for(int i = 0; i < repetitions; ++i) {
out.col(0) = v.array().floor();
out.col(1) = v.array().ceil();
}
}
Compiled with gcc-11, -O3 -mavx2 -fno-math-errno I get ca. 5.7 seconds.
Inspecting the assembler code finds good vectorization.
Plain old C++ version:
double* outfloor = out.data();
double* outceil = outfloor + out.outerStride();
const double* inarr = v.data();
for(std::ptrdiff_t j = 0; j < rows; ++j) {
const double vj = inarr[j];
outfloor[j] = std::floor(vj);
outceil[j] = std::ceil(vj);
}
40 seconds instead of 5! This version actually does not vectorize because the compiler cannot prove that the arrays don't alias each other.
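One possible way to address the aliasing problem described above, sketched under the assumption of a GCC or Clang toolchain: the non-standard `__restrict__` qualifier promises the compiler that the three arrays do not overlap. `floorCeilNoAlias` is a hypothetical helper, and whether this loop then vectorizes or runs faster was not measured here.

```cpp
#include <cmath>
#include <cstddef>

// __restrict__ is a compiler extension (GCC/Clang), not standard C++.
// The caller must guarantee that the three ranges do not overlap.
void floorCeilNoAlias(const double* __restrict__ in,
                      double* __restrict__ outFloor,
                      double* __restrict__ outCeil,
                      std::ptrdiff_t rows) {
    for (std::ptrdiff_t j = 0; j < rows; ++j) {
        const double vj = in[j];
        outFloor[j] = std::floor(vj);
        outCeil[j] = std::ceil(vj);
    }
}
```

With the benchmark's variables it would be called as `floorCeilNoAlias(v.data(), out.data(), out.data() + out.outerStride(), rows)`.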
Next, let's use fixed size Eigen vectors to get the compiler to generate vectorized code:
double* outfloor = out.data();
double* outceil = outfloor + out.outerStride();
const double* inarr = v.data();
std::ptrdiff_t j;
for(j = 0; j + 4 <= rows; j += 4) {
const Eigen::Vector4d vj = Eigen::Vector4d::Map(inarr + j);
const auto floorval = vj.array().floor();
const auto ceilval = vj.array().ceil();
Eigen::Vector4d::Map(outfloor + j) = floorval;
Eigen::Vector4d::Map(outceil + j) = ceilval;
}
if(j + 2 <= rows) {
const Eigen::Vector2d vj = Eigen::Vector2d::MapAligned(inarr + j);
const auto floorval = vj.array().floor();
const auto ceilval = vj.array().ceil();
Eigen::Vector2d::Map(outfloor + j) = floorval;
Eigen::Vector2d::Map(outceil + j) = ceilval;
j += 2;
}
if(j < rows) {
const double vj = inarr[j];
outfloor[j] = std::floor(vj);
outceil[j] = std::ceil(vj);
}
7.5 seconds. The assembler looks fine, fully vectorized. I'm not sure why performance is lower. Maybe cache line aliasing?
Last attempt: We don't try to avoid re-reading the vector but we re-read it blockwise so that it will be in cache by the time we read it a second time.
const int blocksize = 64 * 1024 / sizeof(double);
std::ptrdiff_t j;
for(j = 0; j + blocksize <= rows; j += blocksize) {
const auto& vj = v.segment(j, blocksize);
auto outj = out.middleRows(j, blocksize);
outj.col(0) = vj.array().floor();
outj.col(1) = vj.array().ceil();
}
const auto& vj = v.tail(rows - j);
auto outj = out.bottomRows(rows - j);
outj.col(0) = vj.array().floor();
outj.col(1) = vj.array().ceil();
5.4 seconds. So there is some gain here but not nearly enough to justify the added complexity.
So I have this algorithm and I am trying to determine the basic operation for an algorithm analysis problem.
here is the code:
int median(int array[]){
int k = array.length();
int n = k/2;
for(int i = 0; i < k; i++){
int numsmaller = 0;
int numequal = 0;
for(int j = 0; j < k; j++){
if(array[j] < array[i]){
numsmaller++;
}else
if(array[j] == array[i]){
numequal++;
}
if(numsmaller < n && n <= (numsmaller + numequal)){
return array[i];
}
}//inner loop
}//outer loop
}//end of function
My current impression is that the basic operation of this algorithm is the two if statements within the inner loop of the function.
What confuses me is whether the basic operation is the boolean expression itself, which is evaluated on every iteration to check if array[j] < array[i] and if array[j] == array[i], or whether it is the code that runs when either of the if statements is true. Can someone please give me a solid explanation, in terms of algorithm analysis, of what the basic operation of this algorithm would be? Thanks very much.
Basic operations may be things like:
Array indexing
Conditionals, e.g. if (x == y)
Assignments, e.g. x = 10
And even basic math operations, e.g. y + 2
Note this is not an exhaustive list by any means. Also note that the worst case scenario of some code requires the maximum number of basic operations to be performed; so in the following code, you'll see three basic operations in the worst case.
if (variable == true) {
int x = y + 2;
}
...this is because we really just composed several of the above list items. We have to perform the first conditional no matter what (one basic op), but after that the "worst case scenario" is when variable == true, because we then continue to perform an assignment. Of course, in order to compute the non-obvious value that x will assume via the assignment, we have to perform another basic operation (arithmetic between y and 2), which gives us a total of three basic operations.
So in your case, the basic operations performed in the inner loop are the conditionals, the incrementing (basically assignment) of a variable when one of the conditions is met, and the two conditionals plus arithmetic done in the
if(numsmaller < n && n <= (numsmaller + numequal))
line.
Hopefully this helps.
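To make the counting concrete, here is a sketch that instruments the element comparisons, one common choice of basic operation for this kind of algorithm. It rests on two assumptions that differ from the question's code: the rank check runs once after the inner loop finishes, and the target rank is n = (k + 1) / 2 so that odd-length inputs return the middle element. `MedianResult` and `medianWithCount` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct MedianResult {
    int value;                // the element found (0 for empty input)
    std::size_t comparisons;  // number of element comparisons performed
};

MedianResult medianWithCount(const std::vector<int>& a) {
    const std::size_t k = a.size();
    const std::size_t n = (k + 1) / 2;  // assumed 1-based rank of the median
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < k; ++i) {
        std::size_t numSmaller = 0;
        std::size_t numEqual = 0;
        for (std::size_t j = 0; j < k; ++j) {
            ++comparisons;  // counts the a[j] < a[i] test
            if (a[j] < a[i]) {
                ++numSmaller;
            } else {
                ++comparisons;  // counts the a[j] == a[i] test
                if (a[j] == a[i]) {
                    ++numEqual;
                }
            }
        }
        if (numSmaller < n && n <= numSmaller + numEqual) {
            return {a[i], comparisons};
        }
    }
    return {0, comparisons};
}
```

Each inner iteration performs at most two element comparisons, so the count never exceeds 2 * k * k, which is where the quadratic bound comes from.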
I am new to C++ and attempting to create a "BigInt" class. I decided to base most of the implementation on reading the numbers into vectors.
So far I have only written the constructor that takes an input string.
Largenum::Largenum(std::string input)
{
for (std::string::const_iterator it = input.begin(); it!=input.end(); ++it)
{
number.push_back(*it- '0');
}
}
The problem I am having is with the addition function. I have created a function which seems to work after I tested it a few times, but as you can see it's highly inefficient. I have 2 different vectors such as:
std::vector<int> x = {1,3,4,5,9,1};
std::vector<int> y = {2,4,5,6};
The way I thought to solve this problem was to add 0s before the shorter, in this case y vector to make both vectors have the same size such as:
x = {1,3,4,5,9,1};
y = {0,0,2,4,5,6};
Then to add them using elementary style addition.
I don't want to add 0s in front of vector y, as that would be slow for a large number. My current solution is to reverse the vector, push_back the appropriate number of 0s, then reverse it back. This may be slower than simply inserting at the front; I have not tested it yet.
The problem is that after I do all of the addition on the vectors and push_back the results, I am left with a backward vector and need to use reverse yet again! There has got to be a much better way than my method, but I am stuck on finding it. Ideally I would make A const as well. Here is the code of the function:
Largenum Largenum::operator+(Largenum &A)
{
bool carry = 0;
Largenum sum;
std::vector<int>::size_type max = std::max(A.number.size(), this->number.size());
std::vector<int>::size_type diff = std::abs (A.number.size()-this->number.size());
if (A.number.size()>this->number.size())
{
std::reverse(this->number.begin(), this->number.end());
for (std::vector<int>::size_type i = 0; i<(max-diff); ++i) this->number.push_back(0);
std::reverse(this->number.begin(), this->number.end());
}
else if (this->number.size() > A.number.size())
{
std::reverse(A.number.begin(), A.number.end());
for (std::vector<int>::size_type i = 0; i<(max-diff); ++i) A.number.push_back(0);
std::reverse(A.number.begin(), A.number.end());
}
for (std::vector<int>::size_type i = max; i!=0; --i)
{
int num = (A.number[i-1] + this->number[i-1] + carry)%10;
sum.number.push_back(num);
(A.number[i-1] + this->number[i-1] + carry >= 10) ? carry = 1 : carry = 0;
}
if (carry) sum.number.push_back(1);
reverse(sum.number.begin(), sum.number.end());
return sum;
}
If anyone has any input, that would be great. This is my first program using classes in C++ and it's fairly overwhelming.
I think your function is quite close to the most efficient one I have seen. Still, here are a few suggestions for improving it:
The decimal numeral system is quite inefficient here: big numbers need a lot of digits. Better to use a higher base to reduce the number of digits you have to add. Reading and writing such numbers in human-readable form will be a bit harder, but the operations will be several times faster because there are fewer digits.
When implementing big integers, I store the digits in reverse order, so the least significant digit is at index 0 and the most significant one is at the end of the array. This way, when a carry forces you to add a new digit, you only perform a push_back, not a whole reverse.
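The reversed storage suggested above could be sketched as follows. It is a free function on plain digit vectors rather than the question's Largenum class, and `addDigits` is a hypothetical name. Neither operand is modified, so both can be const, and no padding or reversing is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Digits are base 10, least significant first: 2456 is stored as {6, 5, 4, 2}.
std::vector<int> addDigits(const std::vector<int>& x, const std::vector<int>& y) {
    std::vector<int> sum;
    sum.reserve(std::max(x.size(), y.size()) + 1);
    int carry = 0;
    for (std::size_t i = 0; i < x.size() || i < y.size(); ++i) {
        int digit = carry;
        if (i < x.size()) digit += x[i];  // the shorter number simply
        if (i < y.size()) digit += y[i];  // contributes nothing past its end
        carry = digit >= 10 ? 1 : 0;
        sum.push_back(carry ? digit - 10 : digit);
    }
    if (carry) {
        sum.push_back(1);  // a final carry just becomes a new top digit
    }
    return sum;
}
```

Only printing (and parsing) has to walk the digits in reverse, which happens far less often than arithmetic.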
One issue: integer modulus is pretty slow on modern processors, even compared to branch misprediction. Rather than doing an explicit %10, try this for your third for-loop:
int num = A.number[i-1] + this->number[i-1] + carry;
if(num >= 10)
{
carry = 1;
num -= 10;
}
else
{
carry = 0;
}
sum.number.push_back(num);
I'm a programming student, and for a project I'm working on, one of the things I have to do is compute the median value of a vector of int values. I'm to do this using only the sort function from the STL and vector member functions such as .begin(), .end(), and .size().
I'm also supposed to make sure I find the median whether the vector has an odd number of values or an even number of values.
And I'm stuck. Below I have included my attempt. So where am I going wrong? I would appreciate it if you would be willing to give me some pointers or resources to get going in the right direction.
Code:
int CalcMHWScore(const vector<int>& hWScores)
{
const int DIVISOR = 2;
double median;
sort(hWScores.begin(), hWScores.end());
if ((hWScores.size() % DIVISOR) == 0)
{
median = ((hWScores.begin() + hWScores.size()) + (hWScores.begin() + (hWScores.size() + 1))) / DIVISOR);
}
else
{
median = ((hWScores.begin() + hWScores.size()) / DIVISOR)
}
return median;
}
There is no need to completely sort the vector: std::nth_element can do enough work to put the median in the correct position. See my answer to this question for an example.
Of course, that doesn't help if your teacher forbids using the right tool for the job.
You are doing an extra division and overall making it a bit more complex than it needs to be. Also, there's no need to create a DIVISOR when 2 is actually more meaningful in context.
double CalcMHWScore(vector<int> scores)
{
size_t size = scores.size();
if (size == 0)
{
return 0; // Undefined, really.
}
else
{
sort(scores.begin(), scores.end());
if (size % 2 == 0)
{
return (scores[size / 2 - 1] + scores[size / 2]) / 2.0;
}
else
{
return scores[size / 2];
}
}
}
The accepted answer uses std::sort which does more work than we need it to. The answers that use std::nth_element don't handle the even size case correctly.
We can do a little better than just using std::sort. We don't need to sort the vector completely in order to find the median. We can use std::nth_element to find the middle element. Since the median of a vector with an even number of elements is the average of the middle two, we need to do a little more work to find the other middle element in that case. std::nth_element ensures that no element preceding the middle is greater than the middle. It doesn't guarantee their order beyond that, so we need to use std::max_element to find the largest element preceding the middle element.
int CalcMHWScore(std::vector<int> hWScores) {
assert(!hWScores.empty());
const auto middleItr = hWScores.begin() + hWScores.size() / 2;
std::nth_element(hWScores.begin(), middleItr, hWScores.end());
if (hWScores.size() % 2 == 0) {
const auto leftMiddleItr = std::max_element(hWScores.begin(), middleItr);
return (*leftMiddleItr + *middleItr) / 2;
} else {
return *middleItr;
}
}
You might want to consider returning a double because the median may be a fraction when the vector has an even size.
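A variant along those lines, returning a double, might look like this. It is a sketch with a hypothetical name (`medianNthElement`); like the version above, it takes the vector by value because std::nth_element reorders it.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

double medianNthElement(std::vector<int> values) {
    assert(!values.empty());
    const auto middle = values.begin() + values.size() / 2;
    std::nth_element(values.begin(), middle, values.end());
    if (values.size() % 2 == 1) {
        return *middle;
    }
    // Even size: the other middle element is the largest one before middle.
    const auto leftMiddle = std::max_element(values.begin(), middle);
    // Convert before adding so the sum cannot overflow int.
    return (static_cast<double>(*leftMiddle) + *middle) / 2.0;
}
```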
const int DIVISOR = 2;
Don't do this. It just makes your code more convoluted. You've probably read guidelines about not using magic numbers, but evenness vs. oddness of numbers is a fundamental property, so abstracting this out provides no benefit but hampers readability.
if ((hWScores.size() % DIVISOR) == 0)
{
median = ((hWScores.begin() + hWScores.size()) + (hWScores.begin() + (hWScores.size() + 1))) / DIVISOR);
You're taking an iterator to the end of the vector, taking another iterator that points one past the end of the vector, adding the iterators together (which isn't an operation that makes sense), and then dividing the resulting iterator (which also doesn't make sense). This is the more complicated case; I'll explain what to do for the odd-sized vector first and leave the even-sized case as an exercise for you.
}
else
{
median = ((hWScores.begin() + hWScores.size()) / DIVISOR)
Again, you're dividing an iterator. What you instead want to do is to increment an iterator to the beginning of the vector by hWScores.size() / 2 elements:
median = *(hWScores.begin() + hWScores.size() / 2);
And note that you have to dereference iterators to get values out of them. It'd be more straightforward if you used indices:
median = hWScores[hWScores.size() / 2];
I give below a sample program that is somewhat similar to the one in Max S.'s response. To help the OP advance his knowledge and understanding, I have made a number of changes. I have:
a) changed the call by const reference to call by value, since sort is going to want to change the order of the elements in your vector, (EDIT: I just saw that Rob Kennedy also said this while I was preparing my post)
b) replaced size_t with the more appropriate vector<int>::size_type (actually, a convenient synonym of the latter),
c) saved size/2 to an intermediate variable,
d) thrown an exception if the vector is empty, and
e) I have also introduced the conditional operator (? :).
Actually, all of these corrections are straight out of Chapter 4 of "Accelerated C++" by Koenig and Moo.
double median(vector<int> vec)
{
typedef vector<int>::size_type vec_sz;
vec_sz size = vec.size();
if (size == 0)
throw domain_error("median of an empty vector");
sort(vec.begin(), vec.end());
vec_sz mid = size/2;
return size % 2 == 0 ? (vec[mid] + vec[mid-1]) / 2.0 : vec[mid];
}
I'm not exactly sure what your restrictions on the use of member functions of vector are, but index access with [] or at() would make accessing elements simpler:
median = hWScores.at(hWScores.size() / 2);
You can also work with iterators like begin() + offset like you are currently doing, but then you need to first calculate the correct offset with size()/2 and add that to begin(), not the other way around. Also you need to dereference the resulting iterator to access the actual value at that point:
median = *(hWScores.begin() + hWScores.size()/2);