creating large number of object pointers - c++

I have defined a class like this:
class myClass {
private:
int count;
string name;
public:
myClass (int, string);
...
...
};
myClass::myClass(int c, string n)
{
count = c;
name = n;
}
...
...
I also have a *.txt file with one name on each line:
David
Jack
Peter
...
...
Now I read the file line by line and create a new object pointer for each line and store all objects in a vector. The function is like this:
vector<myClass*> myFunction (string fileName)
{
vector<myClass*> r;
myClass* obj;
ifstream infile(fileName);
string line;
int count = 0;
while (getline(infile, line))
{
obj = new myClass (count, line);
r.push_back(obj);
count++;
}
return r;
}
For small *.txt files I have no problem. However, sometimes my *.txt files contain more than 1 million lines. In these cases, the program is dramatically slow. Do you have any suggestion to make it faster?

First, find faster io than std streams.
Second, can you use string views instead of strings? They are C++17, but there are C++11 and earlier versions everywhere.
Third,
myClass::myClass(int c, string n) {
count = c;
name = n;
}
should read
myClass::myClass(int c, std::string n):
count(c),
name(std::move(n))
{}
which would make a difference for long names. None for short ones due to "small string optimization".
Fourth, stop making vectors of pointers. Create vectors of values.
Fifth, failing that, find a more efficient way to allocate/deallocate the objects.

One thing you can do is directly move the string you've read from the file into the objects you're creating:
myClass::myClass(int c, string n)
: count{c}, name{std::move(n)}
{ }
You could also benchmark:
myClass::myClass(int c, string&& n)
: count{c}, name{std::move(n)}
{ }
The first version above will make a copy of line as the function is called, then let the myClass object take over the dynamically allocated buffer used for that copy. The second version (with string&& n argument), will let the myClass object rip out line's buffer directly: that means less copying of textual data but also line's likely to be stripped of any buffer as each line of the file is read in. Hopefully your allocation will normally be able to see from the input buffer how large a capacity line needs to read in the next line, and avoid any extra allocations/copying. As always, measure when you've reason to care.
You'd likely get a small win by reserving space for your vector up front, though the fact that you're storing pointers in the vector instead of storing myClass objects by value makes any vector resizing relatively cheap. Countering that, storing pointers does mean you're doing an extra dynamic allocation.
Another thing you can do is increase the stream buffer size: see pubsetbuf and the example therein.
If speed is extremely important, you should memory map the file and store pointers into the memory mapped region, instead of copying from the file stream buffer into distinct dynamically-allocated memory regions inside distinct strings. This could easily make a dramatic difference - perhaps as much as an order of magnitude - but a lot depends on the speed of your disk etc. so benchmark both if you've reason to care.
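A portable approximation of the memory-mapping idea, without any platform-specific API, is to read the whole file into a single buffer and keep `std::string_view` slices into it rather than one `std::string` per line. This is not memory mapping itself, only a sketch of the same "point into one big region" technique, and it assumes C++17. `splitLines` is a hypothetical helper, and the views are valid only while the buffer they point into is alive and unmodified.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: returns one view per line of `buffer`.
// No character data is copied; every view points into `buffer`.
std::vector<std::string_view> splitLines(const std::string& buffer) {
    std::vector<std::string_view> lines;
    std::string_view rest(buffer);
    while (!rest.empty()) {
        const std::size_t pos = rest.find('\n');
        if (pos == std::string_view::npos) {  // last line, no trailing newline
            lines.push_back(rest);
            break;
        }
        lines.push_back(rest.substr(0, pos));
        rest.remove_prefix(pos + 1);          // skip past the newline
    }
    return lines;
}
```

The buffer could be filled, for instance, with `std::string buffer((std::istreambuf_iterator<char>(infile)), std::istreambuf_iterator<char>());`. Never pass a temporary string to `splitLines`, because the returned views would dangle.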

Related

Getting input from text file and storing into array but text file contains more than 20.000 strings

Getting inputs from a text file and storing it into an array but text file contains more than 20.000 strings. I'm trying to read strings from the text file and store them into a huge-sized array. How can I do that?
I can not use vectors.
Is it possible to do it without using a hash table?
Afterward, I will try to find the most frequently used words using sorting.
Your requirement is to NOT use any standard container such as std::vector or std::unordered_map.
In this case we need to create a dynamic container ourselves. That is not complicated. And we can use this even for storing strings. So, I will not even use std::string in my example.
I created some demo for you with ~700 lines of code.
We will first define the term "capacity". This is the number of elements that could be stored in the container. It is the currently available space. It has nothing to do with how many elements are actually stored in the container.
But there is one functionality of a dynamic container that is the most important: it must be able to grow. And this is always necessary if we want to add more elements to the container than its capacity allows.
So, if we want to add an additional element at the end of the container, and if the number of elements is >= its capacity, then we need to reallocate bigger memory and then copy all the old elements to the new memory space. For such events, we will usually double the capacity. This should prevent frequent reallocations and copying activities.
Let me show you one example for a push_back function, which could be implemented like this:
template <typename T>
void DynamicArray<T>::push_back(const T& d) { // Add a new element at the end
if (numberOfElements >= capacity) { // Check, if capacity of this dynamic array is big enough
capacity *= 2; // Obviously not, we will double the capacity
T* temp = new T[capacity]; // Allocate new and more memory
for (unsigned int k = 0; k < numberOfElements; ++k)
temp[k] = data[k]; // Copy data from old memory to new memory
delete[] data; // Release old memory
data = temp; // And assign newly allocated memory to old pointer
}
data[numberOfElements++] = d; // And finally, store the given data at the end of the container
}
This is a basic approach. I use templates in order to be able to store any type in the dynamic array.
You could get rid of the templates, by deleting all template stuff and replacing "T" with your intended data type.
But, I would not do that. See, how easy we can create a "String" class. We just typedef a dynamic array for chars as "String".
using String = DynamicArray<char>;
will give us basic string functionality. And if we want to have a dynamic array of strings later, we can write:
using StringArray = DynamicArray<String>;
and this gives us a DynamicArray<DynamicArray<char>>. Cool.
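To make the nesting concrete, here is a much smaller sketch than the answer's full code, written under a few assumptions: the element type is default-constructible and copy-assignable, an empty array starts with capacity 0 (so the first growth step picks a small fixed size instead of doubling zero), and exception safety is ignored. The copy constructor and copy assignment are what allow a `DynamicArray<DynamicArray<char>>` to copy its inner arrays correctly.

```cpp
#include <cstddef>
#include <utility>

template <typename T>
class DynamicArray {
    std::size_t capacity = 0;
    std::size_t numberOfElements = 0;
    T* data = nullptr;

public:
    DynamicArray() = default;

    // Deep copy: needed so that arrays of arrays copy their contents.
    DynamicArray(const DynamicArray& other)
        : capacity(other.numberOfElements),
          numberOfElements(other.numberOfElements),
          data(capacity ? new T[capacity] : nullptr) {
        for (std::size_t k = 0; k < numberOfElements; ++k)
            data[k] = other.data[k];
    }

    // Copy-and-swap assignment; `other` is already a private copy.
    DynamicArray& operator=(DynamicArray other) {
        std::swap(capacity, other.capacity);
        std::swap(numberOfElements, other.numberOfElements);
        std::swap(data, other.data);
        return *this;
    }

    ~DynamicArray() { delete[] data; }

    void push_back(const T& d) {
        if (numberOfElements >= capacity) {
            // Double the capacity, or start at 4 if the array is still empty.
            const std::size_t newCapacity = capacity ? capacity * 2 : 4;
            T* temp = new T[newCapacity];
            for (std::size_t k = 0; k < numberOfElements; ++k)
                temp[k] = data[k];
            temp[numberOfElements] = d;  // copy before freeing, in case d lives in data
            delete[] data;
            data = temp;
            capacity = newCapacity;
        } else {
            data[numberOfElements] = d;
        }
        ++numberOfElements;
    }

    std::size_t size() const { return numberOfElements; }
    T& operator[](std::size_t i) { return data[i]; }
    const T& operator[](std::size_t i) const { return data[i]; }
};

using String = DynamicArray<char>;
using StringArray = DynamicArray<String>;
```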
For these special classes we can overload some operators, which will make the handling and our life even simpler.
Please look in the provided code.
And, to be able to use the container in the typical C++ environment, we can add full iterator capability. That makes life even more simple.
This needs really some typing effort, but is not complicated. And, it will make life really simpler.
You also wanted to create a hash map. For counting words.
For that we will create a key/value pair. The key is the String that we defined above and the value will be the frequency counter.
We implement a hash function, which should be carefully selected to work with strings, have high entropy, and give good results for the bucket size of the hash map.
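The text above does not say which hash function the linked code uses. As one commonly used choice for byte strings, here is a sketch of 32-bit FNV-1a; `bucketFor` is a hypothetical helper that reduces the hash to a bucket index, and it uses `std::string` only to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
std::uint32_t fnv1a(const std::string& key) {
    std::uint32_t hash = 2166136261u;   // FNV offset basis
    for (unsigned char byte : key) {
        hash ^= byte;
        hash *= 16777619u;              // FNV prime; wraps modulo 2^32
    }
    return hash;
}

// Hypothetical helper: map a key to one of `bucketCount` buckets.
// `bucketCount` must be non-zero.
std::size_t bucketFor(const std::string& key, std::size_t bucketCount) {
    return fnv1a(key) % bucketCount;
}
```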
The hash map itself is a dynamic container. We will also add iterator functionality to it.
For all this I drafted some 700 lines of code for you. You can take this as an example for your further studies.
It can also be easily enhanced with additional functionality.
But caveat: I did only some basic tests and I even used raw pointers for owned memory. This can be done in a school project to learn some dynamic memory management, but not in reality.
Additionally, you can replace all this code by simply using std::string, std::vector and std::unordered_map. Nobody would use such code and reinvent the wheel.
But it may give you some ideas on how to implement similar things.
Because Stack Overflow limits the answer size to 32000 characters, I will put the main part on GitHub.
Please click here.
I will just show you main so that you can see how easy the solution can be used:
int main() {
// Open file and check, if it could be opened
std::ifstream ifs{ "r:\\test.txt" };
if (ifs) {
// Define a dynamic array for strings
StringArray stringArray{};
// Use overloaded extraction operator and read all strings from the file to the dynamic array
ifs >> stringArray;
// Create a dynamic hash map
HashMap hm{};
// Now count the frequency of words
for (const String& s : stringArray)
hm[s]++;
// Put the resulting key/value pairs into a dynamic array
DynamicArray<Item> items(hm.begin(), hm.end());
// Sort in descending order by the frequency
std::sort(items.begin(), items.end(), [](const Item& i1, const Item& i2) { return i1.count > i2.count; });
// Show result on screen
for (const auto& [string, count] : items)
std::cout << std::left << std::setw(20) << string << '\t' << count << '\n';
}
else std::cerr << "\n\nError: Could not open source file\n\n";
}
You do not need to keep the whole file in memory to count frequency of words. You only need to keep a single entry and some data structure to count the frequencies, for example a std::unordered_map<std::string,unsigned>.
Not tested:
std::unordered_map<std::string,unsigned> processFileEntries(std::ifstream& file) {
std::unordered_map<std::string,unsigned> freq;
std::string entry;
while ( file >> entry ) {
++freq[entry];
}
return freq;
}
For more efficient reading or more elaborated processing you could also read chunks of the file (eg 100 words), process chunks, and then continue with the next chunk.
Assuming you're using C-Style / raw arrays you could do something like:
const size_t number_of_entries = count_entries_in_file();
//Make sure we actually have entries
assert(number_of_entries > 0);
std::string* file_entries = new std::string[number_of_entries];
//fill file_entries with the files entries
//...
//release heap memory again, so we don't create a leak
delete[] file_entries;
file_entries = nullptr;
You can use a std::map to get the frequency of each word in your text file. One example for reference is given below:
#include <iostream>
#include <map>
#include <string>
#include <sstream>
#include <fstream>
int main()
{
std::ifstream inputFile("input.txt");
std::map<std::string, unsigned> freqMap;
std::string line, word;
if(inputFile)
{
while(std::getline(inputFile, line))//go line by line
{
std::istringstream ss(line);
while(ss >> word)//go word by word
{
++freqMap[word]; //increment the count value corresponding to the word
}
}
}
else
{
std::cout << "input file cannot be opened"<<std::endl;
}
//print the frequency of each word in the file
for(auto myPair: freqMap)
{
std::cout << myPair.first << ": " << myPair.second << std::endl;
}
return 0;
}
The output of the above program can be seen here.

How to insert structure pointer into stl vector and display the content

I am calling a function in a loop which takes a structure pointer (st *ptr) as its argument, and I need to push_back this data into an STL vector and display the content in a loop. How can I do it? Please help.
struct st
{
int a;
char c;
};
typedef struct st st;
function(st *ptr)
{
vector<st*>myvector;
vector<st*>:: iterator it;
myvector.push_back(ptr);
it=myvector.begin();
cout<<(*it)->a<<(*it)->c<<endl;
}
is this correct? i am not getting the actual output.
Code snippet-----
void Temperature_sensor::temp_notification()//calling thread in a class------
{
cout<<"Creating thread to read the temperature"<<endl;
pthread_create(&p1,NULL,notifyObserver_1,(void*)(this));
pthread_create(&p2,NULL,notifyObserver_2,(void*)(this));
pthread_join(p1,NULL);
pthread_join(p2,NULL);
}
void* Temperature_sensor::notifyObserver_1(void *data)
{
Temperature_sensor *temp_obj=static_cast<Temperature_sensor *>(data);
(temp_obj)->it=(temp_obj)->observers.begin();
ifstream inputfile("temp.txt");//Reading a text file
while(getline(inputfile,(temp_obj)->line))
{
stringstream linestream((temp_obj)->line);
getline(linestream,(temp_obj)->temperature,':');
getline(linestream,(temp_obj)->temp_type,':');
cout<<(temp_obj)->temperature<<"---"<<(temp_obj)->temp_type<<endl;
stringstream ss((temp_obj)->temperature);
stringstream sb((temp_obj)->temp_type);
sb>>(temp_obj)->c_type;
ss>>(temp_obj)->f_temp;
cout<<"____"<<(temp_obj)->f_temp<<endl;
(temp_obj)->a.temp=(temp_obj)->f_temp;
(temp_obj)->a.type=(temp_obj)->c_type;
cout<<"------------------q"<<(temp_obj)->a.type<<endl;
(*(temp_obj)->it)->update(&(temp_obj)->a);//Calling the function -------
}
input file temp.txt
20:F
30:C
40:c
etc
void Temperature_monitor::update(st *p) {}//need to store in a vector------
If you use a std::vector you should do something like this:
std::vector<st> v; //use st as type of v
//read
for(auto const& i : v) {
std::cout << i.param1 << ' ' << i.param2;
}
//push_back
v.push_back({param1, param2});
Of course, you could have more than 2 params.
Could you please share sample input data and expected output?
With your code, it will always create a new vector and put one structure object in it.
If you want a single vector to store all the structure objects, declare the vector in the function that calls "function".
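One way to give the vector a longer lifetime than a single call is to make it a member of the receiving class. The sketch below uses a hypothetical `Monitor` class standing in for the question's `Temperature_monitor`, and it stores copies rather than pointers: the caller in the question appears to pass the address of the same member on every iteration, so a vector of pointers would end up holding many pointers to one object that always shows the latest reading.

```cpp
#include <cstddef>
#include <vector>

struct st {
    int a;
    char c;
};

// Hypothetical observer: the vector is a member, so it outlives each call.
class Monitor {
    std::vector<st> readings;
public:
    void update(const st* p) {
        if (p != nullptr)
            readings.push_back(*p);   // copy the value, not the pointer
    }
    std::size_t size() const { return readings.size(); }
    const st& at(std::size_t i) const { return readings.at(i); }
};
```

Displaying the content is then a plain loop over the stored values, for example printing `m.at(i).a` and `m.at(i).c` for each index below `m.size()`.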
It looks like you're allocating a buffer data of type void* with malloc() or a similar function, then casting data to Temperature_sensor*. It also appears that Temperature_sensor is a class with std::string members, which you are attempting to assign to and print.
This will not work because std::string is not a POD type, and so the std::string constructor is never actually invoked (likewise, Temperature_sensor is not a POD type because it has non-POD members, and its constructor is therefore never invoked).
To construct the objects correctly you need to use operator new() in place of malloc() like so
Temperature_sensor *tsensor = new Temperature_sensor;
Temperature_sensor *five_tsensors = new Temperature_sensor[5];
It would be more idiomatic to use a smart pointer like std::unique_ptr or std::shared_ptr instead of using operator new() (and operator delete()) directly, and best/most idiomatic to use a std::vector. Any of these methods will construct the allocated objects correctly.
You should also strongly consider dramatically simplifying the Temperature_sensor class. It appears to have numerous instance variables that redundantly store the same information in different formats, and which would make more sense as local variables inside your functions.
You also don't need to be creating all those std::stringstream's; consider using std::stod() and std::stoi() to convert strings to floating-point or integers, and std::to_string() to convert numbers to strings.
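As a sketch of that last suggestion, a line such as `20:F` from temp.txt can be split at the colon and converted with `std::stod`, without any string streams. `Reading` and `parseReading` are hypothetical names, and the format is assumed to be a number, a colon, and a single unit character.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

struct Reading {
    double temperature;
    char type;
};

// Hypothetical helper: parse one "value:unit" line.
// Returns false and leaves `out` untouched if the line is malformed.
bool parseReading(const std::string& line, Reading& out) {
    const std::size_t colon = line.find(':');
    if (colon == std::string::npos || colon + 1 >= line.size())
        return false;                       // no colon, or nothing after it
    double value = 0.0;
    try {
        value = std::stod(line.substr(0, colon));
    } catch (const std::invalid_argument&) {
        return false;                       // not a number
    } catch (const std::out_of_range&) {
        return false;                       // number does not fit in a double
    }
    out.temperature = value;
    out.type = line[colon + 1];
    return true;
}
```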

Why is the dynamically allocated array attribute of my class template only able to store one item?

I am trying to expand the functionality of a class template I created. Previously it allowed you to use key-value pairs of any type but only if you knew the size of the arrays at compile time. It looked like this:
template <typename K, typename V, int N>
class KVList {
size_t arraySize;
size_t numberOfElements;
K keys[N];
V values[N];
public:
KVList() : arraySize(N), numberOfElements(0) { }
// More member functions
}
I wanted to be able to use this for a dynamic number of elements decided at run-time, so I changed the code to this:
template <typename K, typename V>
class KVList {
size_t arraySize;
size_t numberOfElements;
K* keys;
V* values;
public:
KVList(size_t size) : numberOfElements(0) {
arraySize = size;
keys = new K[size];
values = new V[size];
}
~KVList() {
delete[] keys;
keys = nullptr;
delete[] values;
values = nullptr;
}
// More member functions
}
The new constructor has one parameter which is the size that will be used for the KVList. It still starts the numberOfElements at 0 because both of these uses would start the KVList empty, but it does set arraySize to the value of the size parameter. Then it dynamically allocated memory for the arrays of keys and values. An added destructor deallocates the memory for these arrays and then sets them to nullptr.
This compiles and runs, but it only stores the first key and first value I try to add to it. There is a member function in both that adds a key-value pair to the arrays. I tested this with the Visual Studio 2015 debugger and noticed it storing the first key-value pair fine, and then it attempts to store the next key-value pair in the next index, but the data goes nowhere. And the debugger only shows one slot in each array. When I attempt to cout the data I thought I stored at that second index, I get a very small number (float data type was trying to be stored), not the data I was trying to store.
I understand it might be worth using the vectors to accomplish this. However, this is an expansion on an assignment I completed in my C++ class in school and my goal with doing this was to try to get it done, and understand what might cause issues doing it this way, since this is the obvious way to me with the knowledge I have so far.
EDIT: Code used to add a key-value pair:
// Adds a new element to the list if room exists and returns a reference to the current object, does nothing if no room exists
KVList& add(const K& key, const V& value) {
if (numberOfElements < arraySize) {
keys[numberOfElements] = key;
values[numberOfElements] = value;
numberOfElements++;
}
return *this;
}
EDIT: Code that calls add():
// Temp strings for parts of a grade record
string studentNumber, grade;
// Get each part of the grade record
getline(fin, studentNumber, subGradeDelim); // subGradeDelim is a char whose value is ' '
getline(fin, grade, gradeDelim); // gradeDelim is a char whose value is '\n'
// Attempt to parse and store the data from the temp strings
try {
data.add(stoi(studentNumber), stof(grade)); // data is a KVList<size_t, float> attribute
}
catch (...) {
// Temporary safeguard, will implement throwing later
data.add(0u, -1);
}
Code used to test displaying the info:
void Grades::displayGrades(ostream& os) const {
// Just doing first two as test
os << data.value(0) << std::endl;
os << data.value(1);
}
Code in main cpp file used for testing:
Grades grades("w6.dat");
grades.displayGrades(cout);
Contents of w6.dat:
1022342 67.4
1024567 73.5
2031456 79.3
6032144 53.5
1053250 92.1
3026721 86.5
7420134 62.3
9762314 58.7
6521045 34.6
Output:
67.4
-1.9984e+18
The problem (or at least one of them) is with this line from your pastebin:
data = KVList<size_t, float>(records);
This seemingly innocent line is doing a lot. Because data already exists, having been default-constructed the instant you entered the body of the Grades constructor, this will do three things:
1. It will construct a KVList on the right hand side, using its constructor.
2. It will call the copy assignment operator and assign what we constructed in step 1 to data.
3. The object on the right hand side gets destructed.
You may be thinking: what copy assignment operator, I never wrote one. Well, the compiler generates it for you automatically. Actually, in C++11, generating a copy assignment operator automatically with an explicit destructor (as you have) is deprecated; but it's still there.
The problem is that the compiler-generated copy assignment operator does not work well for you. All your member variables are trivial types: integers and pointers. So they are just copied over. This means that after step 2, the class has just been copied over in the most obvious way. That, in turn, means that for a brief instant, there are objects on the left and right that both have pointers pointing to the same place in memory. When step 3 fires, the right hand object actually goes ahead and deletes the memory. So data is left with pointers pointing to random junk memory. Writing to this random memory is undefined behavior, so your program may do (not necessarily deterministic) strange things.
There are (to be honest) many issues with how your explicit resource managing class is written, too many to be covered here. I think that Accelerated C++, a really excellent book, will walk you through these issues; there is an entire chapter covering every single detail of how to properly write such a class.
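For the specific double-delete described above, the usual remedy is to give the class its own copy constructor and copy assignment operator (the "rule of three"). The sketch below applies the copy-and-swap idiom to the class from the question. The default argument on the constructor and the `size()` accessor are additions made for illustration, and the real Grades code is not shown here, so treat this as an outline rather than a drop-in fix.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

template <typename K, typename V>
class KVList {
    std::size_t arraySize;
    std::size_t numberOfElements;
    K* keys;
    V* values;
public:
    explicit KVList(std::size_t size = 0)
        : arraySize(size), numberOfElements(0),
          keys(new K[size]), values(new V[size]) {}

    // Deep copy: the new object gets its own arrays.
    KVList(const KVList& other)
        : arraySize(other.arraySize), numberOfElements(other.numberOfElements),
          keys(new K[other.arraySize]), values(new V[other.arraySize]) {
        std::copy(other.keys, other.keys + numberOfElements, keys);
        std::copy(other.values, other.values + numberOfElements, values);
    }

    // Copy-and-swap: `other` is a private copy whose destructor
    // releases the arrays this object used to own.
    KVList& operator=(KVList other) {
        std::swap(arraySize, other.arraySize);
        std::swap(numberOfElements, other.numberOfElements);
        std::swap(keys, other.keys);
        std::swap(values, other.values);
        return *this;
    }

    ~KVList() {
        delete[] keys;
        delete[] values;
    }

    KVList& add(const K& key, const V& value) {
        if (numberOfElements < arraySize) {
            keys[numberOfElements] = key;
            values[numberOfElements] = value;
            ++numberOfElements;
        }
        return *this;
    }
    const V& value(std::size_t i) const { return values[i]; }
    std::size_t size() const { return numberOfElements; }
};
```

With these members in place, an assignment like `data = KVList<size_t, float>(records);` leaves `data` owning its own arrays after the temporary is destroyed.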

C++ Allocate Memory Without Activating Constructors

I'm reading in values from a file which I will store in memory as I read them in. I've read on here that the correct way to handle memory location in C++ is to always use new/delete, but if I do:
DataType* foo = new DataType[sizeof(DataType) * numDataTypes];
Then that's going to call the default constructor for each instance created, and I don't want that. I was going to do this:
DataType* foo;
char* tempBuffer=new char[sizeof(DataType) * numDataTypes];
foo=(DataType*) tempBuffer;
But I figured that would be something poo-poo'd for some kind of type-unsafeness. So what should I do?
And in researching for this question now I've seen that some people are saying arrays are bad and vectors are good. I was trying to use arrays more because I thought I was being a bad boy by filling my programs with (what I thought were) slower vectors. What should I be using???
Use vectors!!! Since you know the number of elements, make sure that you reserve the memory first (by calling myVector.reserve(numObjects)) before you insert the elements.
By doing this, you will not call the default constructors of your class.
So use
std::vector<DataType> myVector; // does not reserve anything
...
myVector.reserve(numObjects); // tells vector to reserve memory
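The claim that reserve() constructs nothing can be checked with a type that counts its own constructor calls. In this sketch the `DataType` shown is a stand-in for the asker's type; it deliberately has no default constructor, and `load` is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

struct DataType {
    static int constructed;   // counts DataType(int) calls, for illustration only
    int value;
    explicit DataType(int v) : value(v) { ++constructed; }
};
int DataType::constructed = 0;

// Hypothetical loader: reserve first, then construct elements one at a time.
std::vector<DataType> load(std::size_t numObjects) {
    std::vector<DataType> v;
    v.reserve(numObjects);                    // raw storage only, nothing constructed
    for (std::size_t i = 0; i < numObjects; ++i)
        v.emplace_back(static_cast<int>(i));  // constructed in place in that storage
    return v;
}
```

Because the storage was reserved up front, the loop never triggers a reallocation, so each element is constructed exactly once.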
You can use ::operator new to allocate an arbitrarily sized hunk of memory.
DataType* foo = static_cast<DataType*>(::operator new(sizeof(DataType) * numDataTypes));
The main advantage of using ::operator new over malloc here is that it throws on failure and will integrate with any new_handlers etc. You'll need to clean up the memory with ::operator delete
::operator delete(foo);
Regular new Something will of course invoke the constructor, that's the point of new after all.
It is one thing to avoid extra constructions (e.g. default constructor) or to defer them for performance reasons, it is another to skip any constructor altogether. I get the impression you have code like
DataType dt;
read(fd, &dt, sizeof(dt));
If you're doing that, you're already throwing type safety out the window anyway.
What are you trying to accomplish by not invoking the constructor?
You can allocate memory with new char[], call the constructor you want for each element in the array, and then everything will be type-safe. Read What are uses of the C++ construct "placement new"?
That's how std::vector works underneath, since it allocates a little extra memory for efficiency, but doesn't construct any objects in the extra memory until they're actually needed.
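The pattern described here, raw storage first and construction later, can be sketched as follows. It uses `::operator new` for the storage rather than `new char[]`, and a hypothetical `Item` type whose constructor cannot throw, which keeps cleanup simple. In real code a `std::vector` does all of this for you.

```cpp
#include <cstddef>
#include <new>

struct Item {
    int id;
    explicit Item(int i) : id(i) {}   // no default constructor, on purpose
};

// Sketch: allocate raw storage, construct objects with placement new,
// use them, then destroy them and release the storage.
int sumOfIds(std::size_t n) {
    // Raw, suitably aligned memory; no Item constructor runs here.
    void* raw = ::operator new(sizeof(Item) * n);
    Item* items = static_cast<Item*>(raw);

    for (std::size_t i = 0; i < n; ++i)
        new (items + i) Item(static_cast<int>(i));   // construct in place

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += items[i].id;

    for (std::size_t i = n; i > 0; --i)
        items[i - 1].~Item();                        // destroy explicitly
    ::operator delete(raw);                          // then free the storage
    return sum;
}
```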
You should be using a vector. It will allow you to construct its contents one-by-one (via push_back or the like), which sounds like what you're wanting to do.
I think you shouldn't care about efficiency using vector if you will not insert new elements anywhere but at the end of the vector (since elements of vector are stored in a contiguous memory block).
vector<DataType> dataTypeVec(numDataTypes);
And as you've been told, your first line there contains a bug (no need to multiply by sizeof).
Building on what others have said, if you ran this program while piping in a text file of integers that would fill the data field of the below class, like:
./allocate < ints.txt
Then you can do:
#include <vector>
#include <iostream>
using namespace std;
class MyDataType {
public:
int dataField;
};
int main() {
const int TO_RESERVE = 10;
vector<MyDataType> everything;
everything.reserve( TO_RESERVE );
MyDataType temp;
while( cin >> temp.dataField ) {
everything.push_back( temp );
}
for( unsigned i = 0; i < everything.size(); i++ ) {
cout << everything[i].dataField;
if( i < everything.size() - 1 ) {
cout << ", ";
}
}
}
Which, for me with a list of 4 integers, gives:
5, 6, 2, 6

How can I make my char buffer more performant?

I have to read a lot of data into:
vector<char>
A 3rd party library reads this data in many turns. In each turn it calls my callback function whose signature is like this:
CallbackFun ( int CBMsgFileItemID,
unsigned long CBtag,
void* CBuserInfo,
int CBdataSize,
void* CBdataBuffer,
int CBisFirst,
int CBisLast )
{
...
}
Currently I have implemented a buffer container using an STL Container where my method insert() and getBuff are provided to insert a new buffer and getting stored buffer. But still I want better performing code, so that I can minimize allocations and de-allocations:
template<typename T1>
class buffContainer
{
private:
class atomBuff
{
private:
atomBuff(const atomBuff& arObj);
atomBuff operator=(const atomBuff& arObj);
public:
int len;
char *buffPtr;
atomBuff():len(0),buffPtr(NULL)
{}
~atomBuff()
{
if(buffPtr!=NULL)
delete []buffPtr;
}
};
public :
buffContainer():_totalLen(0){}
void insert(const char const *aptr,const unsigned long &alen);
unsigned long getBuff(T1 &arOutObj);
private:
std::vector<atomBuff*> moleculeBuff;
int _totalLen;
};
template<typename T1>
void buffContainer< T1>::insert(const char const *aPtr,const unsigned long &aLen)
{
if(aPtr==NULL,aLen<=0)
return;
atomBuff *obj=new atomBuff();
obj->len=aLen;
obj->buffPtr=new char[aLen];
memcpy(obj->buffPtr,aPtr,aLen);
_totalLen+=aLen;
moleculeBuff.push_back(obj);
}
template<typename T1>
unsigned long buffContainer<T1>::getBuff(T1 &arOutObj)
{
std::cout<<"Total Lenght of Data is: "<<_totalLen<<std::endl;
if(_totalLen==0)
return _totalLen;
// Note : Logic pending for case size(T1) > T2::Value_Type
int noOfObjRqd=_totalLen/sizeof(T1::value_type);
arOutObj.resize(noOfObjRqd);
char *ptr=(char*)(&arOutObj[0]);
for(std::vector<atomBuff*>::const_iterator itr=moleculeBuff.begin();itr!=moleculeBuff.end();itr++)
{
memcpy(ptr,(*itr)->buffPtr,(*itr)->len);
ptr+= (*itr)->len;
}
std::cout<<arOutObj.size()<<std::endl;
return _totalLen;
}
How can I make this more performant?
If my wild guess about your callback function makes sense, you don't need anything more than a vector:
std::vector<char> foo;
foo.reserve(MAGIC); // this is the important part. Reserve the right amount here.
// and you don't have any reallocs.
setup_callback_fun(CallbackFun, &foo);
CallbackFun ( int CBMsgFileItemID,
unsigned long CBtag,
void* CBuserInfo,
int CBdataSize,
void* CBdataBuffer,
int CBisFirst,
int CBisLast )
{
std::vector<char>* pFoo = static_cast<std::vector<char>*>(CBuserInfo);
char* data = static_cast<char*>(CBdataBuffer);
pFoo->insert(pFoo->end(), data, data+CBdataSize);
}
Depending on how you plan to use the result, you might try putting the incoming data into a rope datastructure instead of vector, especially if the strings you expect to come in are very large. Appending to the rope is very fast, but subsequent char-by-char traversal is slower by a constant factor. The tradeoff might work out for you or not, I don't know what you need to do with the result.
EDIT: I see from your comment this is no option, then. I don't think you can do this much more efficiently in the general case, when the size of the incoming data is totally arbitrary. Otherwise, you could try to initially reserve enough space in the vector so that the data fits with no reallocation, or at most one, in the average case.
One thing I noticed about your code:
if(aPtr==NULL,aLen<=0)
I think you mean
if(aPtr==NULL || aLen<=0)
The main thing you can do is avoid doing quite so much copying of the data. Right now, when insert() is called, you're copying the data into your buffer. Then, when getbuff() is called, you're copying the data out to a buffer they've (hopefully) specified. So, to get data from outside to them, you're copying each byte twice.
This part:
arOutObj.resize(noOfObjRqd);
char *ptr=(char*)(&arOutObj[0]);
Seems to assume that arOutObj is really a vector. If so, it would be a whole lot better to rewrite getbuff as a normal function taking a (reference to a) vector instead of being a template that really only works for one type of parameter.
From there, it becomes a fairly simple matter to completely eliminate one copy of the data. In insert(), instead of manually allocating memory and tracking the size, put the data directly into a vector. Then, when getbuff() is called, instead of copying the data into their buffer, just give them a reference to your existing vector.
class buffContainer {
std::vector<char> moleculeBuff;
public:
void insert(char const *p, unsigned long len) {
Edit: Here you really want to add:
moleculeBuff.reserve(moleculeBuff.size()+len);
End of edit.
std::copy(p, p+len, std::back_inserter(moleculeBuff));
}
void getbuff(vector<char> &output) {
output = moleculeBuff;
}
};
Note that I've changed the result of getbuff to void -- since you're giving them a vector, its size is known, and there's no point in returning the size. In reality, you might want to actually change the signature a bit, to just return the buffer:
vector<char> getbuff() {
vector<char> temp;
temp.swap(moleculeBuff);
return temp;
}
Since it's returning a (potentially large) vector by value, this depends heavily on your compiler implementing the named return value optimization (NRVO), but 1) the worst case is that it does about what you were doing before anyway, and 2) virtually all reasonably current compilers DO implement NRVO.
This also addresses one other detail your original code didn't (seem to). As it was, getbuff returns some data, but if you call it again, it (apparently) doesn't keep track of what data has already been returned, so it returns it all again. It keeps allocating data, but never deletes any of it. That's what the swap is for: it creates an empty vector, and then swaps that with the one that's being maintained by buffContainer, so buffContainer now has an empty vector, and the filled one is handed over to whatever called getbuff().
Another way to do things would be to take the swap a step further: basically, you have two buffers:
one owned by buffContainer
one owned by whatever calls getbuff()
In the normal course of things, we can probably expect that the buffer sizes will quickly reach some maximum size. From there on, we'd really like to simply re-cycle that space: read some data into one, pass it to be processed, and while that's happening, read data into the other.
As it happens, that's pretty easy to do too. Change getbuff() to look something like this:
void getbuff(vector<char> &output) {
swap(moleculeBuff, output);
moleculeBuff.clear();
}
This should improve speed quite a bit -- instead of copying data back and forth, it just swaps one vector's pointer to the data with the other's (along with a couple other details like the current allocation size, and used size of the vector). The clear is normally really fast -- for a vector<char> (or any type without a dtor) it'll just set the number of items in the vector to zero (if the items have dtors, it has to destroy them, of course). From there, the next time insert() is called, the new data will just be copied into the memory the vector already owns (until/unless it needs more space than the vector had allocated).
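Put together, the two-buffer scheme might look like the sketch below. The `size()` and `capacity()` accessors are hypothetical additions that make the storage reuse observable, and `insert()` here uses the vector's range insert instead of std::copy with a back_inserter.

```cpp
#include <cstddef>
#include <vector>

class buffContainer {
    std::vector<char> moleculeBuff;
public:
    void insert(const char* p, std::size_t len) {
        if (p == nullptr || len == 0)
            return;
        moleculeBuff.insert(moleculeBuff.end(), p, p + len);
    }

    // Hands the accumulated data to `output` and keeps output's old
    // storage, emptied, for the next round of insert() calls.
    void getbuff(std::vector<char>& output) {
        output.swap(moleculeBuff);
        moleculeBuff.clear();   // drops the elements, keeps the capacity
    }

    std::size_t size() const { return moleculeBuff.size(); }         // hypothetical accessor
    std::size_t capacity() const { return moleculeBuff.capacity(); } // hypothetical accessor
};
```

A caller that passes the same vector to getbuff() on every round ends up ping-ponging two allocations between itself and the container, which is the recycling described above.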