Converting std::vector container into an std::set using std::transform - c++

More specifically I have a vector of some struct
std::vector<SomeStruct> extensions = getThoseExtensions();
where someStructVariable.extensionName returns a string.
And I want to create a set of extensionName, something like this std::set<const char*>.
Process is fairly straightforward when done using some for loops but I want to use std::transform from <algorithm> instead.
The unary form of std::transform takes four arguments:
1, 2. The begin and end iterators of the input range
3. An output iterator (the start of the destination range, or an inserter)
4. A function applied to each element
This is what I have so far
auto lambdaFn =
[](SomeStruct x) -> const char* { return x.extensionName; };
std::transform(availableExtensions.begin(),
availableExtensions.end(),
std::inserter(xs, xs.begin()),
lambdaFn);
Because std::set has no push_back, std::back_inserter can't be used with it, so I'm using std::inserter(xs, xs.begin()).
The problem is I'm trying to return stack mem in my lambda function. So how do I get around this problem?
Oddly enough if I remove return from the function it works just like I expect it to! But I don't understand why and that strikes fear of future repercussion.
EDIT:
I'm using several structs in place of SomeStruct like VkExtensionProperties defined in vulkan_core
typedef struct VkExtensionProperties {
char extensionName[VK_MAX_EXTENSION_NAME_SIZE];
uint32_t specVersion;
} VkExtensionProperties;
From Khronos specs

A std::set<const char*> compares the pointers, not the strings they point to, so it only behaves as intended if all instances of extensionName with the same value point to the same char array. Otherwise it stores unique pointers instead of unique values. If you use std::set<std::string> instead, it stores only unique values and also solves your variable lifetime problem, because std::string takes care of copying (or moving) itself where necessary:
auto lambdaFn =
[](const SomeStruct& x) { return std::string(x.extensionName); };
std::set<std::string> xs;
std::transform(availableExtensions.begin(),
availableExtensions.end(),
std::inserter(xs, xs.begin()),
lambdaFn);
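For reference, here is a minimal self-contained sketch of that approach. ExtensionInfo is a hypothetical stand-in for SomeStruct / VkExtensionProperties (a fixed-size char array plus a version number), and collect_names is a helper name made up for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for VkExtensionProperties: the name lives in a
// fixed-size char array inside the struct.
struct ExtensionInfo {
    char extensionName[256];
    std::uint32_t specVersion;
};

// Copies every name into a std::string, so the set owns its own data and
// compares by value rather than by address.
std::set<std::string> collect_names(const std::vector<ExtensionInfo>& extensions) {
    std::set<std::string> names;
    std::transform(extensions.begin(), extensions.end(),
                   std::inserter(names, names.begin()),
                   [](const ExtensionInfo& e) { return std::string(e.extensionName); });
    return names;
}
```

Because the lambda takes its argument by const reference and returns a std::string by value, nothing in the set refers back to the vector or to a temporary.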

One way to do what you want is with the following lambda
auto lambda = [](const SomeStruct& x) -> const char* { return x.extensions.data();};
The problem with this is that you are storing pointers to memory owned by someone else (those strings). When the owners are destroyed (which seems to happen at the end of the function), the pointers will dangle. You can get around this by allocating memory in your lambda and copying the data:
auto lambda = [](const SomeStruct & x) -> const char* {
char* c = new char[x.extensions.length()+1];
std::strcpy(c, x.extensions.data());
return c;
};
But then you have to do the memory management yourself (i.e. remember to delete[] each of those const char*), and that is a bad idea. You should probably reconsider what you are doing. Why are you using const char* here and not std::string?
Please remember that the typical use of const char* is to refer to string literals in C code, i.e. the code
const char* str = "Hello World!";
creates a char array of sufficient size in the static section of memory, initializes it with the string (a compile-time constant) and then saves a pointer to it in str. This is also why it has to be a const char*: another pointer referring to an equal string literal may (or may not) point to exactly the same char array, and you don't want to allow modification there. So don't use const char* just because you see C code storing strings in const char* without anyone needing to free them later.

There are a couple of things you can do here.
If you own the definition of SomeStruct, it is best to change that member to std::string.
Short of that, see if your lambda can take its parameter by reference, as const auto& obj. This does not create a copy, so the returned pointer refers to the object held by the container. However, I am still wary of this solution, since it smells like bad class design where the ownership and lifetime of members are ambiguous.
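A rough sketch of the by-reference option, assuming the name is stored inline in the struct (as in VkExtensionProperties) and that the vector outlives the set and is not modified while the set is in use. Ext, CStrLess, NameSet and view_names are names invented for this illustration; the comparator is added so that the set orders by string contents rather than by address.

```cpp
#include <algorithm>
#include <cstring>
#include <iterator>
#include <set>
#include <vector>

// Hypothetical struct whose name is stored inline, like VkExtensionProperties.
struct Ext {
    char extensionName[64];
};

// Orders C strings by their contents instead of by their addresses.
struct CStrLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

using NameSet = std::set<const char*, CStrLess>;

// The returned pointers refer to the arrays inside `exts`; they dangle as
// soon as `exts` is destroyed or reallocates.
NameSet view_names(const std::vector<Ext>& exts) {
    NameSet names;
    std::transform(exts.begin(), exts.end(),
                   std::inserter(names, names.begin()),
                   [](const Ext& e) -> const char* { return e.extensionName; });
    return names;
}
```

This avoids copying the strings, but the lifetime caveat is exactly the ambiguity mentioned above, so std::set<std::string> is usually the safer choice.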

Related

How to insert structure pointer into stl vector and display the content

I am calling a function in a loop that takes a structure pointer (st *ptr) as its argument, and I need to push_back this data to an STL vector and display the contents in a loop. How can I do it? Please help.
struct st
{
int a;
char c;
};
typedef struct st st;
function(st *ptr)
{
vector<st*>myvector;
vector<st*>:: iterator it;
myvector.push_back(ptr);
it=myvector.begin();
cout<<(*it)->a<<(*it)->c<<endl;
}
Is this correct? I am not getting the expected output.
Code snippet-----
void Temperature_sensor::temp_notification()//calling thread in a class------
{
cout<<"Creating thread to read the temperature"<<endl;
pthread_create(&p1,NULL,notifyObserver_1,(void*)(this));
pthread_create(&p2,NULL,notifyObserver_2,(void*)(this));
pthread_join(p1,NULL);
pthread_join(p2,NULL);
}
void* Temperature_sensor::notifyObserver_1(void *data)
{
Temperature_sensor *temp_obj=static_cast<Temperature_sensor *>(data);
(temp_obj)->it=(temp_obj)->observers.begin();
ifstream inputfile("temp.txt");//Reading a text file
while(getline(inputfile,(temp_obj)->line))
{
stringstream linestream((temp_obj)->line);
getline(linestream,(temp_obj)->temperature,':');
getline(linestream,(temp_obj)->temp_type,':');
cout<<(temp_obj)->temperature<<"---"<<(temp_obj)->temp_type<<endl;
stringstream ss((temp_obj)->temperature);
stringstream sb((temp_obj)->temp_type);
sb>>(temp_obj)->c_type;
ss>>(temp_obj)->f_temp;
cout<<"____"<<(temp_obj)->f_temp<<endl;
(temp_obj)->a.temp=(temp_obj)->f_temp;
(temp_obj)->a.type=(temp_obj)->c_type;
cout<<"------------------q"<<(temp_obj)->a.type<<endl;
(*(temp_obj)->it)->update(&(temp_obj)->a);//Calling the function -------
}
input file temp.txt
20:F
30:C
40:c
etc
void Temperature_monitor::update(st *p) {}//need to store in a vector------
If you use a std::vector you should do something like this:
std::vector<st> v; //use st as type of v
//read
for(auto const& i : v) {
std::cout << i.param1 << ' ' << i.param2;
}
//push_back
v.push_back({param1, param2});
Of course, you could have more than 2 params.
Could you please share sample input data and expected output?
With your code, every call creates a new vector and puts a single structure object in it.
If you want a single vector to store all the structure objects, declare the vector in the function that calls "function".
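One way to sketch that idea is to make the vector a member of the object that receives the updates, so it survives across calls. Monitor and readings() are hypothetical names, not the asker's actual class. The sketch stores a copy of the struct instead of the pointer, on the assumption that the caller reuses the same st object for every reading (as the posted notifyObserver_1 appears to do).

```cpp
#include <vector>

struct st {
    int a;
    char c;
};

// Hypothetical observer that keeps every reading it is handed. The vector
// is a member, so it survives across calls to update().
class Monitor {
public:
    void update(const st* p) {
        if (p != nullptr) {
            readings_.push_back(*p);  // store a copy, not the pointer
        }
    }
    const std::vector<st>& readings() const { return readings_; }

private:
    std::vector<st> readings_;
};
```

Storing pointers instead would leave every element pointing at the same reused object, so all entries would show the most recent values.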
It looks like you're allocating a buffer data of type void* with malloc() or a similar function, then casting data to Temperature_sensor*. It also appears that Temperature_sensor is a class with std::string members, which you are attempting to assign to and print.
This will not work because malloc() never invokes constructors, and std::string is not a POD type, so it must be constructed before it is used (likewise, Temperature_sensor is not a POD type because it has non-POD members, and its constructor is therefore never invoked either).
To construct the objects correctly you need to use a new expression in place of malloc(), like so:
Temperature_sensor *tsensor = new Temperature_sensor;
Temperature_sensor *five_tsensors = new Temperature_sensor[5];
It would be more idiomatic to use a smart pointer like std::unique_ptr or std::shared_ptr instead of using new (and delete) directly, and best/most idiomatic to use a std::vector. Any of these methods will construct the allocated objects correctly.
You should also strongly consider dramatically simplifying the Temperature_sensor class. It appears to have numerous instance variables that redundantly store the same information in different formats, and which would make more sense as local variables inside your functions.
You also don't need to create all those std::stringstream objects; consider using std::stod() and std::stoi() to convert strings to floating-point numbers or integers, and std::to_string() to convert numbers to strings.
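As a sketch of what that simplification could look like for one line of temp.txt (e.g. "20:F"), the following uses local variables and std::stod() instead of stringstreams. Reading and parse_reading are hypothetical names introduced here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

struct Reading {
    double temperature;
    char unit;
};

// Parses one line of the form "<number>:<unit>", e.g. "20:F".
// Throws std::invalid_argument if the line does not have that shape.
Reading parse_reading(const std::string& line) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon + 1 >= line.size()) {
        throw std::invalid_argument("expected <number>:<unit>");
    }
    Reading r;
    r.temperature = std::stod(line.substr(0, colon));  // also throws on bad input
    r.unit = line[colon + 1];
    return r;
}
```

The parsed values live only in the returned Reading, so the sensor class no longer needs members such as line, temperature or temp_type just to hold intermediate results.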

What is the exact behaviour of delete and delete[]?

Why is this code wrong? Am I missing something regarding the behaviour of delete and delete[]?
void remove_stopwords(char** strings, int* length)
{
char** strings_new = new char*[*length];
int length_new = 0;
for(int i=0; i<*length; i++) {
if(is_common_keyword(strings[i]) == 0) {
strings_new[length_new] = strings[i];
length_new++;
}
else {
delete strings[i];
strings[i] = nullptr;
}
}
delete[] strings;
strings = new char*[length_new];
for(int i=0; i<length_new; i++) {
strings[i] = strings_new[i];
}
delete[] strings_new;
*length = length_new;
}
Explanation: this code should take an array of C-style strings and remove some particular strings from it; the array of C-style strings was created using new[] and every C-style string was created using new. The result of the code is that no word is removed; the array is only shortened.
I don't see any problem in the use of new[] or delete[] in the code shown.
No, wait.
I see a lot¹ of problems, but your intent is clear and the code seems to do what you want it to do.
The only logical problem I notice is that you're passing strings by value (it's a char** and reassigning it in the function will not affect the caller variable containing the pointer). Changing the signature to
void remove_stopwords(char**& strings, int* length)
so a reference is passed instead should fix it.
¹ Using std::vector<const char *> would be more logical, or even better a std::vector<std::string> if possible, which would take care of all allocations and deallocations.
"every C-style string was created using new."
I suspect this is your problem: C-style strings are char arrays, so you can't create them with new; you need to use new[], which means you need to release them with delete[].
As @6502 pointed out, your basic problem is fairly simple: you're passing a char **, and attempting to modify it (not what it points at) in the function.
You're using that as a dynamically allocated array of strings, so what you're modifying is just the copy of the pointer that was passed into the function. Since you (apparently) want the function to modify what was passed into it, you need to either pass a char *** (ugh!) or char **& (still quite awful).
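The difference between the two signatures can be shown in isolation with a pair of toy functions (replace_by_value and replace_by_reference are made-up names, and plain int* is used instead of char** to keep the sketch short):

```cpp
// Reassigning a pointer parameter taken by value changes only the local copy.
void replace_by_value(int* p, int* replacement) {
    p = replacement;
    (void)p;  // the caller's pointer is untouched
}

// Taking the pointer by reference lets the function rebind the caller's pointer.
void replace_by_reference(int*& p, int* replacement) {
    p = replacement;
}
```

In the original remove_stopwords, `strings = new char*[length_new];` behaves like the first function: the caller keeps a pointer to the array that was just deleted.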
You really should use a vector<std::string> for the data. At least in my opinion, the code to remove the stop words should be written as a generic algorithm, something on this general order:
template <typename InIt, typename OutIt>
void remove_stop_words(InIt b, InIt e, OutIt d) {
std::remove_copy_if(b, e, d,
[](std::string const &s) { return is_stop_word(s); });
}
With this, the calling code would look something like this:
// read input
std::vector<std::string> raw_input { std::istream_iterator<std::string>(infile),
std::istream_iterator<std::string>() };
// Filter out stop words:
std::vector<std::string> filtered_words;
remove_stop_words(raw_input.begin(), raw_input.end(),
std::back_inserter(filtered_words));
In a case like this, however, you don't really need to store the raw input words into a vector at all. You can pass an istream_iterator directly to remove_stop_words, and have it just produce the desired result:
std::ifstream in("raw_input.txt");
std::vector<std::string> filtered_words;
remove_stop_words(std::istream_iterator<std::string>(in),
std::istream_iterator<std::string>(),
std::back_inserter(filtered_words));
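Put together as a compilable sketch: is_stop_word is not shown in the question, so the version here is a hypothetical stand-in with a tiny hard-coded list, and filter_words is a helper name added for the example.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical predicate: a tiny hard-coded stop-word list.
bool is_stop_word(const std::string& s) {
    static const std::set<std::string> stop_words = {"a", "an", "the", "of"};
    return stop_words.count(s) != 0;
}

template <typename InIt, typename OutIt>
void remove_stop_words(InIt b, InIt e, OutIt d) {
    // Copies every element for which the predicate is false.
    std::remove_copy_if(b, e, d,
                        [](const std::string& s) { return is_stop_word(s); });
}

// Reads whitespace-separated words from `in` and keeps the non-stop words.
std::vector<std::string> filter_words(std::istream& in) {
    std::vector<std::string> filtered;
    remove_stop_words(std::istream_iterator<std::string>(in),
                      std::istream_iterator<std::string>(),
                      std::back_inserter(filtered));
    return filtered;
}
```

Since filter_words takes any std::istream, it works the same way with a std::ifstream or, for quick experiments, a std::istringstream.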
As an aside, you could also consider using a Boost filter_iterator instead. This would allow you to do the filtering in the iterator as you read the data rather than in an algorithm applied to the iterator.

constructing char*const* from string

I am trying to convert a vector of strings to a char*const* in order to be able to call a library function. My code is as follows:
// myVec is simply a vector<string>
vector<string> myVec;
/* stuff added to myVec
* it is a vector of words that were separated by whitespace
* for example myVec[0]=="Hey"; myVec[1]=="Buck"; myVec[2]=="Rogers"; etc...
*/
char*const* myT = new char*[500]; //I believe my problem stems from here
for(int z=0; z<myVec.size(); z++) {
string temp=myVec[z]+=" ";
myT[z]=temp.c_str();
}
//execv call here
I am constructing this for the second parameter of execv().
Compiler always throws various errors, and when I fix one another one pops up (seems rather circular from the solutions/google-fu I have employed).
The signature of execv expects the array of arguments to point to modifiable C-style strings. So contrary to what the other answers suggest, c_str() is not such a good idea.
While not guaranteed in C++03, all implementations of std::string that I know of store the data in a contiguous, null-terminated block of memory (this is guaranteed in C++11), so you can use that to your advantage: create a vector of pointers to modifiable character arrays, initialize it with the buffers of the strings in your input vector, and pass the address of that block of data to execv:
std::vector<char*> args;
args.reserve(myVec.size()+1);
for (std::vector<std::string>::iterator it=myVec.begin(); it != myVec.end(); ++it) {
args.push_back(&(*it)[0]);
}
args.push_back(0); // remember the terminating null pointer
execv("prog", &args[0]);
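The same idea wrapped in a small helper, as a sketch only: make_argv is a made-up name, and the returned pointers are valid only while the strings they point into are alive and unmodified.

```cpp
#include <string>
#include <vector>

// Builds a null-terminated argv-style array whose entries point into the
// buffers of `args`. The result is valid only while `args` is alive and
// unmodified.
std::vector<char*> make_argv(std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.reserve(args.size() + 1);
    for (std::string& arg : args) {
        argv.push_back(&arg[0]);  // contiguous and null-terminated since C++11
    }
    argv.push_back(nullptr);  // execv expects a terminating null pointer
    return argv;
}
```

On a POSIX system the result could then be passed along the lines of `execv(argv[0], argv.data());` after including <unistd.h>. Note that by convention the first entry is the program name.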
There are two fundamental problems which need addressing. The
first is a compiler error: the pointers in the array pointed to
by myT are const, so you cannot assign to them. Make myT
char const** myT;. The second problem is that what you are
assigning to them is a pointer into a local variable, which
will be destructed when it goes out of scope, so the pointers
will dangle.
Does the function you are calling really need the extra white
space at the end? (You mentioned execv somewhere, I think.
If that's the function, the extra whitespace will do more harm
than good.) If not, all you have to do is:
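std::vector<char const*> myT( myVec.size() + 1 ); // the extra last element stays null
std::transform( myVec.begin(), myVec.end(), myT.begin(),
[]( std::string const& arg ) { return arg.c_str(); } );
// execv takes char* const*, so the const on the characters has to be cast away
execv( programPath, const_cast<char* const*>( &myT[0] ) );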
If you can't count on C++11 (which is still usually the case),
you can probably do something similar with boost::bind;
otherwise, just write the loop yourself.
If you do need to transform the strings in myVec in some way,
the best solution is still to copy them into a second
std::vector<std::string>, with the transformation, and use
this.
(BTW: do you really want to modify the contents of myVec, by
using += on each element in the loop?)

Which container to use for String-Interning

My goal is to do string-interning. For this I am looking for a hashed
container class that can do the following:
allocate only one block of memory per node
different userdata size per node
The value type looks like this:
struct String
{
size_t refcnt;
size_t len;
char data[];
};
Every String object will have a different size. This will be accomplished with
operator new + placement new.
So basically I want to allocate the Node myself and push it in the container later.
Following containers are not suitable:
std::unordered_set
boost::multi_index::*
Cannot allocate different sized nodes
boost::intrusive::unordered_set
Seems to work at first. But has some drawbacks. First of all you have to allocate
the bucket array and maintain the load-factor yourself. This is just unnecessary
and error-prone.
But another problem is harder to solve: You can only search for objects that have the
type String. But it is inefficient to allocate a String every time you look for an entry
when all you have as input is, e.g., a std::string.
Are there any other hashed containers that can be used for this task?
I don't think you can do that with any of the standard containers.
What you can do is store the pointer to String and provide custom hash and cmp functors
struct StringHash
{
size_t operator() (String* str) const
{
// calc hash
}
};
struct StringCmp
{
bool operator() (String* str1, String* str2) const
{
// compare
}
};
std::unordered_set<String*, StringHash, StringCmp> my_set;
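To make the skeleton concrete, here is one possible way to fill in the functors. It is a sketch under assumptions: String is simplified to a length plus a non-owning pointer (not the asker's layout), the hash is FNV-1a written out by hand, and InternSet and StringEq are names chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_set>

// Simplified, hypothetical String that points at its characters instead of
// embedding them; it does not own the memory it points to.
struct String {
    std::size_t len;
    const char* data;
};

// FNV-1a over the bytes of the string.
struct StringHash {
    std::size_t operator()(const String* s) const {
        std::uint64_t h = 14695981039346656037ull;
        for (std::size_t i = 0; i < s->len; ++i) {
            h ^= static_cast<unsigned char>(s->data[i]);
            h *= 1099511628211ull;
        }
        return static_cast<std::size_t>(h);
    }
};

// Two entries are equal when their contents match, not their addresses.
struct StringEq {
    bool operator()(const String* a, const String* b) const {
        return a->len == b->len && std::memcmp(a->data, b->data, a->len) == 0;
    }
};

using InternSet = std::unordered_set<const String*, StringHash, StringEq>;
```

With this layout a lookup key can be a String on the stack that merely points at an existing buffer (for example a std::string's data()), so searching does not require allocating a node.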
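Your definition of String isn't standard C++ (flexible array
members such as char data[] come from C99, and C++ compilers
accept them only as an extension); the obvious
solution is to replace the data field with a pointer (in which
case, you can put the structures themselves in
std::unordered_set).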
It's possible to create an open ended struct in C++ with
something like the following:
struct String
{
int refcnt;
int len;
char* data()
{
return reinterpret_cast<char*>(this + 1);
}
};
You're skating on thin ice if you do, however; for types other
than char, there is a risk that this + 1 won't be
appropriately aligned.
If you do this, then your std::unordered_set will have to
contain pointers, rather than the elements, so I doubt you'll
gain anything for the effort.

C++: Return multiple NEW arrays

I think this is an easy issue, but it's driving me crazy: I want to return multiple arrays from one method, for which the calling method does not know their size in advance. So I have to create those Arrays inside the method (in contrast to just filling them) and I am not able to return them using return.
So what I would want is a method signature like this:
void giveMeArray(int[] *anArray)
Method signature has only one parameter to simplify the examples, please assume I could also have a signature like
void giveMeArrays(int[] *anArray, float[] *anotherArray)
Inside that method giveMeArray I would construct the array with
*anArray = new int[5];
and I would call that method using
int[] result;
giveMeArray(&result);
However, all this (starting with the method signature) is at least syntactically wrong. Please excuse that I don't have the compiler errors at hand by now, I'm pretty sure some of you will know what's wrong.
EDIT I know that std::vector would be the best (meaning cleanest) approach. However, folks, that wasn't the question.
Return a single vector (this is C++ after all):
void giveMeArray(std::vector<int>& anArray)
{
anArray = std::vector<int>(5);
}
Return a vector of vectors:
void giveMeArray(std::vector<std::vector<int> >& anArray)
void giveMeArray(int **anArray);
int *result;
giveMeArray(&result);
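A possible body for that pointer-to-pointer signature, sketched with an extra out-parameter for the element count (the count parameter is an addition for this example, since a raw pointer does not carry its size):

```cpp
#include <cstddef>

// Allocates the array inside the function and hands it back through the
// out-parameters. The caller owns the memory and must delete[] it.
void giveMeArray(int** anArray, std::size_t* count) {
    *count = 5;
    *anArray = new int[*count]();  // () zero-initializes the elements
}
```

The caller would write `int* result; std::size_t n; giveMeArray(&result, &n);` and later `delete[] result;`.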
std::vector<int> giveMeArray() {
std::vector<int> ret;
ret.resize(5);
return ret;
}
Nice resource cleanup, bounds checking in debug modes, etc. Good for all the family.
Consider wrapping the arrays in a class or struct.
struct Arrays {
int *ints;
int intCount;
double *doubles;
int doubleCount;
};
Arrays giveMeArrays() {
Arrays arrays;
arrays.ints = new int[10];
arrays.intCount = 10;
arrays.doubles = new double[20];
arrays.doubleCount = 20;
return arrays;
}
An alternative is to use a std::pair<> or a std::tuple<>, but in my experience any use of those eventually becomes a named type. The fact that they are all part of the result of your function suggests they may have enough coherence to be an object. Having a user-defined type makes it easier to pass the data around, and so to refactor code. You may even find that giveMeArrays() becomes a member function of this object.
Replacing ints/intCount with std::vector<int> would be better, if possible. If not, you may want to give Arrays more responsibility for memory management, disable copying while allowing moving, and so forth.
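If vectors are acceptable, the struct could be sketched like this (same Arrays / giveMeArrays names as above, with the raw pointers and counts replaced by std::vector; the sizes 10 and 20 are carried over from the example above):

```cpp
#include <vector>

// The sizes travel with the data, and the vectors release their memory
// automatically.
struct Arrays {
    std::vector<int> ints;
    std::vector<double> doubles;
};

Arrays giveMeArrays() {
    Arrays arrays;
    arrays.ints.assign(10, 0);       // sizes are chosen inside the function
    arrays.doubles.assign(20, 0.0);
    return arrays;
}
```

The caller does not need to know the sizes in advance and can query them with ints.size() and doubles.size().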