concatenate string_views in constexpr - c++

I'm trying to concatenate string_views in a constexpr.
The following is a simplified version of my code:
#include <iostream>
#include <string_view>
using namespace std::string_view_literals;
// concatenate two string_views by copying their bytes
// into a newly created buffer
constexpr const std::string_view operator+
(const std::string_view& sv1, const std::string_view& sv2)
{
char buffer[sv1.size()+sv2.size()] = {0};
for(size_t i = 0; i < sv1.size(); i++)
buffer[i] = sv1[i];
for(size_t i = sv1.size(); i < sv1.size()+sv2.size(); i++)
buffer[i] = sv2[i-sv1.size()];
return std::string_view(buffer, sv1.size()+sv2.size());
}
int main()
{
const std::string_view sv1("test1;");
const std::string_view sv2("test2;");
std::cout << sv1 << "|" << sv2 << ": " << (sv1+sv2+sv1) << std::endl;
std::cout << "test1;"sv << "|" << "test2;"sv << ": " <<
("test1;"sv+"test2;"sv) << std::endl;
return 0;
}
However, this code does not produce the result I expected. Instead of printing test1;test2;test1; and test1;test2; it prints some correct characters mixed with random ones, as if I were accessing uninitialized memory.
test1;|test2;: F��<��itest;
test1;|test2;: est1;te`�i
However if I remove the constexpr specifier and replace the string_views with strings the above code prints the expected output.
test1;|test2;: test1;test2;test1;
test1;|test2;: test1;test2;
Either I'm missing some obvious mistake in my code or there is something about constexpr that I don't understand (yet). Is it the way I'm creating the buffer for the new string_view? What else could I do? Or is what I'm trying to do impossible? Maybe someone can shed light on this for me.

Your task is fundamentally impossible with string_view alone: a string_view is, by definition, a non-owning view over one contiguous block of characters. It has nowhere to store the concatenated data, so it cannot manage the lifetime of that data.
You need to create some kind of concatenated_string<> custom range as your return type if you want to do something like this.
As to the specific reason your code is yielding weird results, it's simply because buffer does not exist anymore when the function exits.

In return std::string_view(buffer, sv1.size()+sv2.size()); the returned string_view views the buffer which goes out of scope, so you essentially have a dangling reference.
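One way around the lifetime problem, when both inputs are known at compile time, is to give the concatenated characters static storage. The following is only a sketch of that idea, assuming C++17; the names join_chars, concat, first and second are hypothetical and do not come from the question or the answers.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Copies the characters of two views into an array that owns them.
template <std::size_t N>
constexpr std::array<char, N> join_chars(std::string_view a, std::string_view b)
{
    std::array<char, N> out{};
    std::size_t i = 0;
    for (char c : a) out[i++] = c;
    for (char c : b) out[i++] = c;
    return out;
}

// The views are passed as template arguments, so their sizes are constant
// expressions and the result can live in a static constexpr member.
template <const std::string_view& A, const std::string_view& B>
struct concat
{
    static constexpr auto storage = join_chars<A.size() + B.size()>(A, B);
    // This view stays valid because storage has static storage duration.
    static constexpr std::string_view value{storage.data(), storage.size()};
};

// Hypothetical inputs; template arguments of reference type must refer to
// objects with static storage duration, hence namespace scope.
constexpr std::string_view first{"test1;"};
constexpr std::string_view second{"test2;"};

static_assert(concat<first, second>::value == "test1;test2;",
              "concatenation happens at compile time");
```

The returned view never dangles here because the characters are owned by concat<...>::storage rather than by a local buffer. The price is that the inputs have to be template arguments, so this does not give you a general operator+.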

Related

How do I handle varying array sizes as arguments for a function?

I manually converted a Fortran 77 program of over 5000 lines to C++, but the conversion didn't go according to plan, so I am trying to debug the C++ program against the Fortran 77 original. In Fortran I wrote a subroutine that takes an array and prints each index and value to a comma-delimited file. I am trying to do the same thing in C++, but I run afoul of the "double temp1[tempi]" declaration. The array does not have the same size in every call to the function, so I can't write, say, "double temp1[21]", because the next time the size is 25. Fortran passes arrays by reference. What do you propose I do?
I managed to do this for the Fortran program. The idea is to take the variable memory dump from the C++ program and compare the values in Excel using VBA to see which one changes the most, and then to focus on that variable as a starting point for debugging the C++ program.
c++ code logic:
void singlearrayd(double temp1[tempi], int tempi, string str1){
for (int md_i = 1; md_i <= tempi; md_i++){
cout << temp1[md_i] << "," << str1 << "(" << md_i << ")";
}
}
int main(){
double askin[22];
double fm[26];
singlearrayd(askin,22,"askin");
singlearrayd(fm,26,"fm");
return 0;
}
Fortran 77 code logic:
PROGRAM PRINT_MEMORY
real*8 :: ASKIN(21)
real*8 :: FM(25)
CALL SINGLEARRAYD(ASKIN,21,"ASKIN")
CALL SINGLEARRAYD(FM,25,"FM")
END PRINT_MEMORY
SUBROUTINE SINGLEARRAYD(TEMP1,TEMPI,STR1)
IMPLICIT NONE
CHARACTER(LEN=*) :: STR1
INTEGER*4 MD_I,TEMPI
REAL*8, DIMENSION(1:TEMPI) :: TEMP1
DO MD_I = 1, TEMPI
WRITE(51,'(ES25.16E3,A1,A25,A1,I5,A1)') TEMP1(MD_I),',',STR1,'(',
1 MD_I,')'
ENDDO
ENDSUBROUTINE SINGLEARRAYD
There are multiple problems in your code.
In C++, a native array (like your askin in main()) is converted into a pointer when passed to a function. So there is no need to declare a dimension on the array in the argument list, but it is still necessary to pass the size as a second argument, as you are already doing.
This means the C++ function should have the form
void singlearrayd(double temp1[], int tempi, std::string str1)
or (equivalently)
void singlearrayd(double *temp1, int tempi, std::string str1)
Note in the above that I have specified the type of the third argument by its full name as std::string. In a lot of cases, it is better to avoid using namespace std.
The second problem is that you are assuming Fortran array indexing and C++ array indexing are the same. In reality, Fortran array indexing is 1-based (the first element of an array has index one, by default) and C++ array indexing is 0-based (the first element on an array has index zero). Using Fortran array indexing in C++ causes undefined behaviour, because it will access elements outside the valid range.
The third (potential) problem is that your function defines two variables named md_i (one in the function, and one within the loop). It is better to avoid doing that.
Addressing all of the above will turn your function to (in full)
void singlearrayd(double temp1[], int tempi, std::string str1)
{
for (int md_i = 0; md_i < tempi; ++md_i) // note the differences here carefully
{
std::cout << temp1[md_i] << "," << str1 << "(" << md_i << ")";
}
}
The fourth problem is that main() in C++ returns int, not void.
The fifth problem is that main() does not initialize the arrays before singlearrayd() prints them. In Fortran, arrays that are local to a function are (often) zero-initialised. In C++, they are uninitialised by default, so accessing their values (e.g. to print them) gives undefined behaviour.
int main()
{
double askin[21] = {0.0}; // initialise the first element. Other elements are initialised to zero
double fm[25] = {0.0}; // must hold 25 elements, since 25 is the size passed below
singlearrayd(askin,21,"askin");
singlearrayd(fm,25,"fm");
}
That will get your code working. Practically, however, there are improvements possible. The first improvement is to use a standard container rather than an array. Standard containers know their size, so that allows simplifying your function. Second, pass non-trivial arguments (like containers or strings) by reference - and preferably const reference if no change is being made to the argument. Unlike Fortran, where function arguments are often passed by reference BY DEFAULT, it is necessary to DELIBERATELY introduce references in C++.
#include <iostream>
#include <string>
#include <vector>
void singlearrayd(const std::vector<double> &temp1, const std::string &str1)
{
for (std::size_t md_i = 0; md_i < temp1.size(); ++md_i)
{
std::cout << temp1[md_i] << "," << str1 << "(" << md_i << ")";
}
}
int main()
{
std::vector<double> askin(21); // askin has 21 elements, initialised to zero
std::vector<double> fm(25); // fm has 25 elements, initialised to zero
singlearrayd(askin, "askin");
singlearrayd(fm, "fm");
}
C++ containers also support iterators - which are safer in practice AND often more efficient - than using array indexing. I'll leave it as an exercise for you to learn how to use those.
A key message however: don't assume that a simple mechanical translation from Fortran to C++ will work. You have already demonstrated pitfalls of such an assumption. Take the time to learn C++ BEFORE trying to translate too much code from Fortran to C++. That is necessary both to get the C++ code working correctly and also to get it running efficiently.
A more modern implementation would be
#include <string>
#include <array>
#include <iostream>
template <std::size_t size, class U>
void singlearrayd(const std::array<U, size>& temp1, const std::string& str1){
int i = 0;
for (const auto& x : temp1)
std::cout << x << "," << str1 << "(" << (i++) << ")";
}
int main(){
std::array<double, 21> askin;
std::array<double, 25> fm;
singlearrayd(askin, "askin");
singlearrayd(fm, "fm");
return 0;
}
Please note that in the code above the two arrays askin and fm are not initialized. Presumably, in the real code you would have already initialized them before calling singlearrayd.
Also, remember that main must return an int.
Thank you for your valuable insight and comments. I think the best approach was to use
void singlearrayd(double *temp1, int tempi, std::string str1)
After some more research on Google, I was able to extend this idea to handle 2D and 3D arrays.
void doublearrayd(double *temp1, int tempi, int tempj, std::string str1){
for (int md_j = 1; md_j<tempj; md_j++){
for (int md_i = 1; md_i<tempi; md_i++){
std::cout << *(temp1 + md_i*tempj + md_j) << "," << str1 << "(" << md_i << ";" << md_j << ")" << std::endl;
}
}
}
void triplearrayd(double *temp1, int tempi, int tempj, int tempk, std::string str1){
for (int md_k = 1; md_k < tempk; md_k++){
for (int md_j = 1; md_j<tempj; md_j++){
for (int md_i = 1; md_i<tempi; md_i++){
std::cout << *(temp1 + md_i*tempj*tempk + md_j*tempk + md_k) << "," << str1 << "(" << md_i << ";" << md_j << ";" << md_k << ")" << std::endl;
}
}
}
}
https://en.wikipedia.org/wiki/Row-_and_column-major_order
How can I pass a dynamic multidimensional array to a function?

Passing string 'by value': change in local value is reflected in original value

Why is the change of my local variable's value getting reflected into original variable? I am passing it by value in C++.
#include <string>
#include <iostream>
void test(std::string a)
{
char *buff = (char *)a.c_str();
buff[2] = 'x';
std::cout << "In function: " << a;
}
int main()
{
std::string s = "Hello World";
std::cout << "Before : "<< s << "\n" ;
test(s);
std::cout << "\n" << "After : " << s << std::endl;
return 0;
}
Output:
Before : Hello World
In function: Hexlo World
After : Hexlo World
As soon as you wrote
buff[2] = 'x';
and compiled your code all bets were off. Per [string.accessors]
const charT* c_str() const noexcept;
Returns: A pointer p such that p + i == &operator[](i) for each i in [0,size()].
Complexity: constant time.
Requires: The program shall not alter any of the values stored in the character array.
emphasis mine
Since you are not allowed to modify the characters that the pointer points to but you do, you have undefined behavior. The compiler at this point is allowed to do pretty much whatever it wants. Trying to figure out why it did what it did is meaningless as any other compiler might not do this.
The moral of the story is: do not cast const away unless you are really sure that you know what you are doing, and if you do need to, then document the code to show you know what you are doing.
Your std::string implementation uses reference counting and makes a deep copy only if you modify the string via its operator[] (or some other method). Casting the const char* return value of c_str() to char* will lead to undefined behavior.
I believe since C++11 std::string must not do reference counting anymore, so switching to C++11 might be enough to make your code work (Edit: I did not actually check that before, and it seems my assumption was wrong).
To be on the safe side, consider looking for a string implementation that guarantees deep copying (or implement one yourself).
#include <cstring>
#include <string>
#include <iostream>
void test(std::string a)
{
// modification through valid std::string API
a[2] = 'x';
const char *buff = a.c_str(); // only const char* is available from API
std::cout << "In function: " << a << " | Trough pointer: " << buff;
// extraction to writeable char[] buffer
char writeableBuff[100];
// unsafe, possible attack through buffer overflow, don't use in real code
strcpy(writeableBuff, a.c_str());
writeableBuff[3] = 'y';
std::cout << "\n" << "In writeable buffer: " << writeableBuff;
}
int main()
{
std::string s = "Hello World";
std::cout << "Before : "<< s << "\n" ;
test(s);
std::cout << "\n" << "After : " << s << std::endl;
return 0;
}
Output:
Before : Hello World
In function: Hexlo World | Trough pointer: Hexlo World
In writeable buffer: Hexyo World
After : Hello World
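If a writable character buffer is really needed, one alternative to the fixed-size array and strcpy is to copy the characters into a std::vector<char>, which sizes itself to fit. This is only a sketch; the helper name writable_copy is hypothetical.

```cpp
#include <string>
#include <vector>

// Returns an independent, writable copy of the string's characters,
// including the terminating '\0' so it can be handed to C-style APIs.
std::vector<char> writable_copy(const std::string& text)
{
    // c_str() points at size() characters followed by a '\0'; copy all of them.
    return std::vector<char>(text.c_str(), text.c_str() + text.size() + 1);
}
```

Writing through the vector's data() pointer only changes the copy, so the original string is left alone regardless of how the std::string implementation shares its storage.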

C++ snippet OK with MSVC but not with g++

I'm new to C++ and I am trying to adapt a program snippet, found here on Stack Overflow, that generates "weak compositions" or multisets, but to be quite frank I have been running into problems for hours.
First of all, the program runs without any complaint under MSVC - but not on gcc.
The point is, that I have read many articles like this one here on stackoverflow, about the different behaviour of gcc and msvc and I have understood, that msvc is a bit more "liberal" in dealing with this situation and gcc is more "strict". I have also understood, that one should "not bind a non-const reference to a temporary (internal) variable."
But I am sorry, I cannot fix it and get this program to work under gcc, even after hours of trying.
And - if possible - a second question: I have to introduce a global variable total, which is said to be "evil", although it works well. I need this value of total, however I could not find a solution with a non-global scope.
Thank you all very much for your assistance.
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
int total = 0;
string & ListMultisets(unsigned au4Boxes, unsigned au4Balls, string & strOut = string(), string strBuild = string()) {
unsigned au4;
if (au4Boxes > 1) for (au4 = 0; au4 <= au4Balls; au4++)
{
stringstream ss;
ss << strBuild << (strBuild.size() == 0 ? "" : ",") << au4Balls - au4;
ListMultisets(au4Boxes - 1, au4, strOut, ss.str());
}
else
{
stringstream ss;
ss << total << ".\t" << "(" << strBuild << (strBuild.size() == 0 ? "" : ",") << au4Balls << ")\n";
strOut += ss.str();
total++;
}
return strOut;
}
int main() {
cout << endl << ListMultisets(5,3) << endl;
cout << "Total: " << total << " weak compositions." << endl;
return 0;
}
C++ demands that a reference parameter to an unnamed temporary (like string()) must either be a const reference or an r-value reference.
Either of those reference types will protect you from modifying a variable that you don't realize is going to be destroyed within the current expression.
Depending on your needs, it could work to make it a value parameter:
string ListMultisets( ... string strOut = string() ... ) {
or it could work to make it a function-local variable:
string ListMultisets(...) {
string strOut;
In your example program, either change would work.
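To make the binding rules concrete, here is a small sketch of the three parameter styles that accept a temporary such as string(). The function names are hypothetical and exist only to illustrate the rule.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// const lvalue reference: binds to a temporary, but only for reading.
std::size_t count_chars(const std::string& text = std::string())
{
    return text.size();
}

// rvalue reference: binds only to temporaries and may modify them.
std::string with_suffix(std::string&& text)
{
    text += "!";
    return std::move(text);
}

// by value: the function owns its own copy and may modify it freely.
std::string with_prefix(std::string text = std::string())
{
    text.insert(0, ">");
    return text;
}
```

A parameter declared as plain std::string& with a default of std::string() is the one form that standard C++ rejects, which is what gcc reports for the code in the question.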
Remove the default value for the strOut parameter.
Create a string in main and pass it to the function.
Change the return type of the function to be int.
Make total a local variable of ListMultisets(). Return total rather than strOut (the string is already handed back to the caller through the strOut reference parameter).
The signature of the new ListMultisets will look like:
int ListMultisets(unsigned au4Boxes, unsigned au4Balls, string & strOut)
I'll let you figure out the implementation. It will either be easy or educational.
Your new main function will look like:
int main() {
string result;
int total = ListMultisets(5,3, result);
cout << endl << result << endl;
cout << "Total: " << total << " weak compositions." << endl;
return 0;
}
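For readers who want to check their own attempt, one possible implementation is sketched below. It assumes a hypothetical recursive helper, list_recursive, that carries the running count by reference, and it numbers the output lines starting from 1, which may differ from the original program.

```cpp
#include <string>

namespace {
// Hypothetical helper: appends every composition below `prefix` to `out`
// and counts the completed lines in `count`.
void list_recursive(unsigned boxes, unsigned balls, const std::string& prefix,
                    std::string& out, int& count)
{
    const std::string sep = prefix.empty() ? "" : ",";
    if (boxes > 1)
    {
        // Put (balls - rest) balls in this box, distribute `rest` over the others.
        for (unsigned rest = 0; rest <= balls; ++rest)
        {
            list_recursive(boxes - 1, rest,
                           prefix + sep + std::to_string(balls - rest), out, count);
        }
    }
    else
    {
        ++count;
        out += std::to_string(count) + ".\t(" + prefix + sep
             + std::to_string(balls) + ")\n";
    }
}
} // namespace

// Appends all weak compositions to strOut and returns how many there are.
int ListMultisets(unsigned au4Boxes, unsigned au4Balls, std::string& strOut)
{
    int count = 0;
    list_recursive(au4Boxes, au4Balls, "", strOut, count);
    return count;
}
```

With this shape the count is a local variable of ListMultisets and the global total is no longer needed.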

Simulate std::vector with mixed const and non-const elements

I'd like to simulate a std::vector that has mixed const and non-const elements. More specifically, I want to have functions that operate on a vector and are allowed to see the entire vector but may only write to specific elements. The elements that can and cannot be written will be determined at runtime and may change during runtime.
One solution is to create a container that holds an array of elements and an equal sized array of booleans. All non-const access would be through a function that checks against the boolean array if the write is valid and throws an exception otherwise. This has the downside of adding a conditional to every write.
A second solution might be to have the same container but this time write access is done by passing an array editing function to a member function of the container. The container member function would let the array editing function go at the array and then check that it didn't write to the non-writable elements. This has the downside that the array editing function could be sneaky and pass around non-const pointers to the array elements, let the container function check that all is well, and then write to non-writable elements.
The last issue seems difficult to solve. It seems like offering direct writable access ever means we have to assume direct writable access always.
Are there better solutions?
EDIT: Ben's comment has a good point I should have addressed in the question: why not a vector of const and a vector of non-const?
The issue is that in the scenario I have in mind, the elements are conceptually part of one single array, and their placement in that array is meaningful. Using a vector of const and a vector of non-const requires mapping the single array that exists in concept onto the two vectors that would implement it. Also, if the list of writable elements changes, then the elements or pointers in the two vectors would need to be moved about.
I think you can accomplish what you wish with the following class, which is very simplified to illustrate the main concept.
template <typename T>
struct Container
{
void push_back(bool isconst, T const& item)
{
data.push_back(std::make_pair(isconst, item));
}
T& at(size_t index)
{
// Check whether the object at the index is const.
if ( data[index].first )
{
throw std::runtime_error("Trying to access a const-member");
}
return data[index].second;
}
T const& at(size_t index) const
{
return data[index].second;
}
T const& at(size_t index, int dummy) // Without dummy, can't differentiate
// between the two functions.
{
return data[index].second;
}
T const& at(size_t index, int dummy) const // Without dummy, can't differentiate
// between the two functions.
{
return data[index].second;
}
std::vector<std::pair<bool, T> > data;
};
Here's a test program and its output.
#include <stdio.h>
#include <iostream>
#include <utility>
#include <stdexcept>
#include <vector>
//--------------------------------
// Put the class definition here.
//--------------------------------
int main()
{
Container<int> c;
c.push_back(true, 10);
c.push_back(false, 20);
try
{
int value = c.at(0); // Should throw exception.
}
catch (...)
{
std::cout << "Expected to see this.\n";
}
int value = c.at(0, 1); // Should work.
std::cout << "Got c[0]: " << value << "\n";
value = c.at(1); // Should work.
std::cout << "Got c[1]: " << value << "\n";
value = c.at(1, 1); // Should work.
std::cout << "Got c[1]: " << value << "\n";
// Accessing the data through a const object.
// All functions should work since they are returning
// const&.
Container<int> const& cref = c;
value = cref.at(0); // Should work.
std::cout << "Got c[0]: " << value << "\n";
value = cref.at(0, 1); // Should work.
std::cout << "Got c[0]: " << value << "\n";
value = cref.at(1); // Should work.
std::cout << "Got c[1]: " << value << "\n";
value = cref.at(1, 1); // Should work.
std::cout << "Got c[1]: " << value << "\n";
// Changing values ... should only work for '1'
try
{
c.at(0) = 100; // Should throw exception.
}
catch (...)
{
std::cout << "Expected to see this.\n";
}
c.at(1) = 200; // Should work.
std::cout << "Got c[1]: " << c.at(1) << "\n";
}
Output from running the program:
Expected to see this.
Got c[0]: 10
Got c[1]: 20
Got c[1]: 20
Got c[0]: 10
Got c[0]: 10
Got c[1]: 20
Got c[1]: 20
Expected to see this.
Got c[1]: 200
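The question also asks for writability that can change at runtime, and worries about handing out non-const references. A variation on the same idea, sketched below under the assumption that all writes go through a member function, never exposes a mutable reference at all. The class name GuardedVector and its member names are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

template <typename T>
class GuardedVector
{
public:
    void push_back(const T& item, bool writable)
    {
        items_.push_back(item);
        writable_.push_back(writable);
    }

    // Reading is always allowed and only ever yields a const reference.
    const T& read(std::size_t index) const { return items_.at(index); }

    // Writing checks the flag on every call, so no caller can keep
    // a mutable reference around after the flag changes.
    void write(std::size_t index, const T& value)
    {
        if (!writable_.at(index))
        {
            throw std::runtime_error("element is read-only");
        }
        items_[index] = value;
    }

    // The set of writable elements may change while the program runs.
    void set_writable(std::size_t index, bool writable)
    {
        writable_.at(index) = writable;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
    std::vector<bool> writable_;
};
```

This still pays for one conditional per write, which is the cost the question wanted to avoid. What it buys is that the check cannot be bypassed by holding on to a reference.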

what does C++ string erase return *this mean?

So the C++ string function
string& erase ( size_t pos = 0, size_t n = npos )
returns *this. What does that mean? Why do I need it to return anything?
Example
string name = "jimmy";
name.erase(0,1);
will erase j and become immy, but why do I need it to return anything at all?
For method chaining. For example, after you erase, you can call == on it to check something:
string name = "jimmy";
bool b = name.erase(0,1) == "immy";
It is only for convenience, for example you can chain calls like this:
name.erase(0,1).erase(3,1);
In your example you don't need it to return anything, because the expression:
name.erase(0,1)
is equivalent to:
((void)name.erase(0,1), name)
So for example you could write:
while(name.erase(0,1).size()) {
std::cout << name << '\n';
}
or:
while((name.erase(0,1), name).size()) {
std::cout << name << '\n';
}
or:
while(name.erase(0,1), name.size()) {
std::cout << name << '\n';
}
or:
while(true) {
name.erase(0,1);
if (!name.size()) break;
std::cout << name << '\n';
}
The standard has decided to give you the choice, probably on the basis that it might as well use the return value for something rather than "waste" it.
Basically, it sometimes saves a little bit of code that repeats a variable name or takes a reference to an intermediate result.
Some people think that functions that modify the object they're called on should not return anything (the idea being to limit the use of functions with side-effects to one per statement). In C++ they just have to live with the fact that the designers of the standard library disagree.
You can do things like this:
void process(std::string const &s) {}
process(name.erase(0,1)); //self-explanatory?
std::cout << name.erase(0,1) << std::endl;
//etc
And things which the other answers have mentioned.
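The same convention is easy to follow in your own classes: a member function that modifies the object ends with return *this; and has a reference return type. A minimal sketch, using a hypothetical TextBuffer class that is not part of the standard library:

```cpp
#include <cstddef>
#include <string>
#include <utility>

class TextBuffer
{
public:
    explicit TextBuffer(std::string text) : text_(std::move(text)) {}

    // Each modifier returns a reference to the object it was called on,
    // which is what makes calls chainable.
    TextBuffer& drop_front(std::size_t n)
    {
        text_.erase(0, n);
        return *this;
    }

    TextBuffer& append(const std::string& suffix)
    {
        text_ += suffix;
        return *this;
    }

    const std::string& str() const { return text_; }

private:
    std::string text_;
};
```

With this, TextBuffer b("jimmy"); b.drop_front(1).append("!"); leaves b holding immy!. Because a reference is returned rather than a copy, every call in the chain acts on the same object.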