Inserting an item at the beginning of a protobuf list - c++

I'm trying to insert an item at the beginning of a protobuf list of messages. add_foo appends the item to end. Is there an easy way to insert it at the beginning?

There's no built-in way to do this with protocol buffers AFAIK. Certainly the docs don't seem to indicate any such option.
A reasonably efficient way might be to add the new element at the end as normal, then reverse iterate through the elements, swapping the new element in front of the previous one until it's at the front of the list. So e.g. for a protobuf message like:
message Bar {
repeated bytes foo = 1;
}
you could do:
Bar bar;
bar.add_foo("two");
bar.add_foo("three");
// Push back new element
bar.add_foo("one");
// Get mutable pointer to repeated field
google::protobuf::RepeatedPtrField<std::string> *foo_field(bar.mutable_foo());
// Reverse iterate, swapping new element in front each time
for (int i(bar.foo_size() - 1); i > 0; --i)
foo_field->SwapElements(i, i - 1);
std::cout << bar.DebugString() << '\n';
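The append-then-swap idea can be tried without any generated protobuf code. The sketch below uses a std::vector<std::string> as a stand-in for the repeated field, and std::swap in place of SwapElements; the helper name push_front_by_swapping is made up for illustration and is not part of the protobuf API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a protobuf API): appends `value`, then swaps it
// toward index 0, mirroring add_foo() followed by SwapElements(i, i - 1).
inline void push_front_by_swapping(std::vector<std::string>& items, std::string value) {
    items.push_back(std::move(value));
    assert(!items.empty());  // push_back guarantees at least one element
    // Walk from the last index down to 1, moving the new element one step
    // forward on each iteration. The relative order of the others is kept.
    for (std::size_t i = items.size() - 1; i > 0; --i) {
        std::swap(items[i], items[i - 1]);
    }
}
```

Each call costs one swap per existing element, so building a long list this way is quadratic overall; for a single insertion it is usually fine.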

Related

Reading from unordered_multiset results in crash

While refactoring some old code a cumbersome multilevel-map developed in-house was replaced by an std::unordered_multiset.
The multilevel-map was something like [string_key1,string_val] . A complex algorithm was applied to derive the keys from string_val and resulted in duplicate string_val being stored in the map but with different keys.
Eventually at some point of the application the multilevel-map was iterated to get the string_val and its number of occurrences.
It was replaced by an std::unordered_multiset, and the string_val values are simply inserted into it. This seems much simpler than having an std::map<std::string,int> and checking, retrieving and updating the counter for every insertion.
What I want to do is retrieve the number of occurrences of each inserted element, but I do not have the keys beforehand. So I iterate over the buckets, but my program crashes upon creation of the string.
// hash map declaration
std::unordered_multiset<std::string> clevel;
// get element and occurences
for (size_t cbucket = clevel->bucket_count() - 1; cbucket != 0; --cbucket)
{
std::string cmsg(*clevel->begin(cbucket));
cmsg += t_str("times=") + \
std::to_string(clevel->bucket_size(cbucket));
}
I do not understand what is going on here, tried to debug it but I am somehow stack( overflown ?) :) . Program crashes in std::string cmsg(*it);
You should consider how an unordered multiset actually works as a hash table. For example, reading this introduction you should notice that hash tables preallocate their internal buckets, and that the number of buckets is chosen by the implementation.
Therefore, if you insert the element "hello", a number of buckets will probably already exist, but only the one corresponding to hash("hello") will actually hold an element that you may dereference. The rest are empty, so for them begin(n) is equal to end(n).
Dereferencing the begin iterator of an empty bucket is undefined behavior, which shows up as a SEGV in your case.
To remedy this, check every time that begin(n) is not equal to end(n) before dereferencing:
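// Visit every bucket, including bucket 0, and skip the empty ones.
for (size_t cbucket = 0; cbucket < clevel->bucket_count(); ++cbucket)
{
    auto it = clevel->begin(cbucket);
    if (it != clevel->end(cbucket))
    {
        std::string cmsg(*it);
        // Note: bucket_size() counts every element in the bucket, so distinct
        // strings that hash to the same bucket are counted together.
        cmsg += t_str("times=") + \
            std::to_string(clevel->bucket_size(cbucket));
    }
}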
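If the goal is the number of occurrences per distinct string, one alternative that avoids buckets entirely is to walk the container and use equal_range, since equal elements are stored next to each other in iteration order. This is a minimal sketch, assuming a plain std::unordered_multiset<std::string>; the helper name count_occurrences is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: returns one (value, occurrences) pair per distinct value.
inline std::vector<std::pair<std::string, std::size_t>>
count_occurrences(const std::unordered_multiset<std::string>& values) {
    std::vector<std::pair<std::string, std::size_t>> counts;
    for (auto it = values.begin(); it != values.end();) {
        // equal_range returns the whole group of elements equal to *it.
        auto range = values.equal_range(*it);
        assert(range.first != range.second);  // *it itself is in the group
        counts.emplace_back(
            *it, static_cast<std::size_t>(std::distance(range.first, range.second)));
        it = range.second;  // jump to the first element of the next group
    }
    return counts;
}
```

Unlike bucket_size(), the count here is not affected by different strings sharing a bucket.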

How to iterate through a list while adding items to it

I have a list of line segments (a std::vector<std::pair<int, int> >) that I'd like to iterate through and subdivide. The algorithm would be, in pseudocode:
for segment in vectorOfSegments:
firstPoint = segment.first;
secondPoint = segment.second;
newMidPoint = (firstPoint + secondPoint) / 2.0
vectorOfSegments.remove(segment);
vectorOfSegments.push_back(std::make_pair(firstPoint, newMidPoint));
vectorOfSegments.push_back(std::make_pair(newMidPoint, secondPoint));
The issue that I'm running into is how I can push_back new elements (and remove the old elements) without iterating over this list forever.
It seems like the best approach may be to make a copy of this vector first, and use the copy as a reference, clear() the original vector, and then push_back the new elements to the recently emptied vector.
Is there a better approach to this?
It seems like the best approach may be to make a copy of this vector first, and use the copy as a reference, clear() the original vector, and then push_back the new elements to the recently emptied vector.
Almost. You don't need to copy-and-clear; move instead!
// Move data from `vectorOfSegments` into new vector `original`.
// This is an O(1) operation that more than likely just swaps
// two pointers.
std::vector<std::pair<int, int>> original{std::move(vectorOfSegments)};
// Original vector is now in "a valid but unspecified state".
// Let's run `clear()` to get it into a specified state, BUT
// all its elements have already been moved! So this should be
// extremely cheap if not a no-op.
vectorOfSegments.clear();
// We expect twice as many elements to be added to `vectorOfSegments`
// as it had before. Let's reserve some space for them to get
// optimal behaviour.
vectorOfSegments.reserve(original.size() * 2);
// Now iterate over `original`, adding to `vectorOfSegments`...
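The final step that the comments above leave open could look roughly like this. It is a sketch only: the Segment alias and the function name subdivide_all are assumptions made for illustration, and the midpoint uses the same integer arithmetic as the question.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Segment = std::pair<int, int>;  // alias introduced for this sketch

// Hypothetical helper: replaces every segment with its two halves.
inline void subdivide_all(std::vector<Segment>& segments) {
    // Steal the existing elements, then put the source vector back into a
    // known (empty) state before refilling it.
    std::vector<Segment> original(std::move(segments));
    segments.clear();
    segments.reserve(original.size() * 2);
    for (const Segment& s : original) {
        const int mid = (s.first + s.second) / 2;  // integer midpoint
        segments.emplace_back(s.first, mid);
        segments.emplace_back(mid, s.second);
    }
    assert(segments.size() == original.size() * 2);
}
```

Because the loop reads from original and writes to segments, nothing being iterated is modified, so there are no iterator invalidation concerns.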
Don't remove elements while you insert new segments. Then, when finished with inserting you could remove the originals:
int len=vectorOfSegments.size();
for (int i=0; i<len;i++)
{
std::pair<int,int>& segment = vectorOfSegments[i];
int firstPoint = segment.first;
int secondPoint = segment.second;
int newMidPoint = (firstPoint + secondPoint) / 2;
vectorOfSegments.push_back(std::make_pair(firstPoint, newMidPoint));
vectorOfSegments.push_back(std::make_pair(newMidPoint, secondPoint));
}
vectorOfSegments.erase(vectorOfSegments.begin(),vectorOfSegments.begin()+len);
Or, if you want to replace one segment by two new segments in one pass, you could use iterators like here:
for (auto it=vectorOfSegments.begin(); it != vectorOfSegments.end(); ++it)
{
std::pair<int,int>& segment = *it;
int firstPoint = segment.first;
int secondPoint = segment.second;
int newMidPoint = (firstPoint + secondPoint) / 2;
it = vectorOfSegments.erase(it);
it = vectorOfSegments.insert(it, std::make_pair(firstPoint, newMidPoint));
it = vectorOfSegments.insert(it+1, std::make_pair(newMidPoint, secondPoint));
}
As Lightning Racis in Orbit pointed out, you should do a reserve before either of these approaches. In the first case do reserve(vectorOfSegments.size()*3), in the latter reserve(vectorOfSegments.size()*2+1).
This is easiest solved by using an explicit index variable like this:
for(size_t i = 0; i < segments.size(); i++) {
... //other code
if(/*condition when to split segments*/) {
Point midpoint = ...;
segments[i] = Segment(..., midpoint); //replace the segment by the first subsegment
segments.emplace_back(Segment(midpoint, ...)); //add the second subsegment to the end of the vector
i--; //reconsider the first subsegment
}
}
Notes:
segments.size() is called in each iteration of the loop, so we really reconsider all appended segments.
The explicit index means that the std::vector<> is free to reallocate in the emplace_back() call, there are no iterators/pointers/references that can become invalid.
I assumed that you don't care about the order of your vector because you add the new segments to the end of the vector. If you do care, you might want to use a linked list to avoid quadratic complexity of your algorithm as insertion/deletion to/from an std::vector<> has linear complexity. In my code I avoid insertion/deletion by replacing the old segment.
Another approach to retain order would be to ignore order at first and then reestablish order via sorting. Assuming a good sorting algorithm, that is O(n*log(n)) which is still better than the naive O(n^2) but worse than the O(n) of the linked list approach.
If you don't want to reconsider the new segments, just use a constant size and omit the counter decrement:
size_t count = segments.size();
for(size_t i = 0; i < count; i++) {
... //other code
if(/*condition when to split segments*/) {
Point midpoint = ...;
segments[i] = Segment(..., midpoint); //replace the segment by the first subsegment
segments.emplace_back(Segment(midpoint, ...)); //add the second subsegment to the end of the vector
}
}
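To make the index-based idea concrete, here is a small sketch with the placeholders filled in. The split condition (a segment longer than max_len), the Segment alias, and the function name are all assumptions chosen for illustration, and an inner while loop takes the place of the i-- trick to re-check the shortened segment. It also assumes first <= second for every segment.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Segment = std::pair<int, int>;  // alias introduced for this sketch

// Hypothetical helper: splits segments in half until none is longer than
// max_len. New halves are appended and visited later by the same loop.
inline void split_until_short(std::vector<Segment>& segments, int max_len) {
    assert(max_len >= 1);  // guarantees that every split makes progress
    for (std::size_t i = 0; i < segments.size(); ++i) {
        while (segments[i].second - segments[i].first > max_len) {
            const Segment s = segments[i];  // copy: emplace_back may reallocate
            const int mid = s.first + (s.second - s.first) / 2;
            segments[i] = Segment(s.first, mid);   // keep the first half in place
            segments.emplace_back(mid, s.second);  // append the second half
        }
    }
}
```

Only indices are held across the emplace_back call, so a reallocation of the vector is harmless.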

How to add random strings alphabetically to an array using a Bi-Sectional Search

This is my add function. I haven't finished it yet, I have to add strings into the array using sequential search to find the insertion point to add in alphabetical order. I just included this b/c we use it when we add random strings.
void StringList::add(string s)
{
str[numberOfStrings++]=s;
}
This is my bisectional search function
int StringList::bsearch(string key, int start, int end)
{
int middle = (end + start)/2;
if(key>str[middle])
{
return bsearch(key, middle+1, end);
}
else if (key<str[middle])
{
return bsearch(key, start, middle);
}
else if(start==end)
{
return -1;
}
}
Here is my code to add a random number of strings to the array. (In a separate cpp file that uses a transducer)
if((token[0]=="ADDRAND")||(token[0]=="AR"))
{
int count = stringToInt(token[1]);
for(int i=0;i<count;i++)
{
stringList.add(randString(20));
}
result = "Random Strings added.\n";
}
How do I use the Bi-sectional search to add the random strings to the array in alphabetical order using this?
The first thing to do would be to modify your bsearch() so that it may return the index where to insert the string if not found (a good start may be returning start instead of -1, but I don't know if it's enough).
Even if you are able to find the index at which to insert, you have to shift all the elements after that index by one position to make room for the new string, which takes linear time for each insertion. So the bisectional search only speeds up finding the position; as long as the sorted strings are kept at contiguous array indices, the insertion itself stays linear.
If insertion must be fast as well, you need a tree structure such as an AVL tree or a red-black tree. std::set (or std::multiset, if duplicates must be kept) serves the purpose if you want a ready-made structure.
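As a rough sketch of the two steps discussed above (bisect to find the insertion index, then shift and insert), the following uses a std::vector<std::string> in place of the StringList array. The function names are made up for illustration; std::lower_bound from <algorithm> would give the same index as the hand-written search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the first index whose element is not less
// than `key`, i.e. where `key` can be inserted to keep `sorted` ordered.
inline std::size_t find_insert_index(const std::vector<std::string>& sorted,
                                     const std::string& key) {
    std::size_t lo = 0;
    std::size_t hi = sorted.size();  // search the half-open range [lo, hi)
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key) {
            lo = mid + 1;  // insertion point is to the right of mid
        } else {
            hi = mid;      // mid itself may be the insertion point
        }
    }
    return lo;
}

// Hypothetical helper: inserts `s` at its sorted position. The search is
// O(log n); vector::insert still shifts the tail, which is O(n).
inline void insert_sorted(std::vector<std::string>& sorted, const std::string& s) {
    const std::size_t index = find_insert_index(sorted, s);
    sorted.insert(sorted.begin() + static_cast<std::ptrdiff_t>(index), s);
    assert(std::is_sorted(sorted.begin(), sorted.end()));
}
```

The search never returns -1: when the key is absent it returns the position where the key belongs, which is what the add function needs.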

How to extract an element from a deque?

Given the following code :
void World::extractStates(deque<string> myDeque)
{
unsigned int i = 0;
string current; // current extracted string
while (i < myDeque.size()) // run on the entire vector and extract all the elements
{
current = myDeque.pop_front(); // doesn't work
// do more stuff
}
}
I want to extract the element at the front on each iteration, but pop_front() is a void
method. How can I get the element (at the front) then?
Regards
Use front to read the item and pop_front to remove it.
current = myDeque.front();
myDeque.pop_front();
This way of doing things may seem counter-productive, but it is necessary in order for deque to provide adequate exception-safety guarantees.
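Put together as a loop, the front/pop_front pair might be used like this. It is a sketch under the assumption that the goal is to consume the whole deque; the helper name drain_front and the returned vector are made up for illustration.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: removes every element from the front of `states`
// and returns the removed strings in the order they were extracted.
inline std::vector<std::string> drain_front(std::deque<std::string>& states) {
    std::vector<std::string> extracted;
    while (!states.empty()) {
        // Read (here: move out) the front element first, then remove it.
        std::string current = std::move(states.front());
        states.pop_front();
        extracted.push_back(std::move(current));
    }
    assert(states.empty());
    return extracted;
}
```

Testing empty() on each iteration avoids the separate index counter from the question, which never needs to advance because the deque shrinks by one element per pass.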

How to delete arbitrary objects in repeated field? (protobuf)

I have some entries in a repeated field in my proto. Now I want to delete some of them. How can I accomplish this? There is a function to delete the last element, but I want to delete arbitrary elements. I can't just swap them because the order is important.
I could swap with next until end, but isn't there a nicer solution?
For Protobuf v3
iterator RepeatedField::erase(const_iterator position) can delete at arbitrary position.
For Protobuf v2
You can use the DeleteSubrange(int start, int num) in RepeatedPtrField class.
If you want to delete a single element then you have to call this method as DeleteSubrange(index_to_be_del, 1). It will remove the element at that index.
According to the API docs, there isn't a way to arbitrarily remove an element from within a repeated field, just a way to remove the last one.
...
We don't provide a way to remove any element other than the last
because it invites inefficient use, such as O(n^2) filtering loops
that should have been O(n). If you want to remove an element other
than the last, the best way to do it is to re-arrange the elements so
that the one you want removed is at the end, then call RemoveLast()
...
What I usually do in these cases is create a new Protobuf (PB) message. I iterate over the repeated field of the existing message and add its elements (except the ones I don't want anymore) to the new PB message.
Here is an example:
message GuiChild
{
    optional string widgetName = 1;
    //..
}
message GuiLayout
{
    repeated GuiChild children = 1;
    //..
}
typedef google::protobuf::RepeatedPtrField<GuiChild> RepeatedField;
typedef google::protobuf::Message Msg;
// Erases the first element of repeatedField that compares equal to msg.
void DeleteElementsFromRepeatedField(const Msg& msg, RepeatedField* repeatedField)
{
    for (RepeatedField::iterator it = repeatedField->begin(); it != repeatedField->end(); ++it)
    {
        if (google::protobuf::util::MessageDifferencer::Equals(*it, msg))
        {
            repeatedField->erase(it);
            break;
        }
    }
}
GuiLayout guiLayout;
//Init children as necessary..
GuiChild child;
//Set child fields..
DeleteElementsFromRepeatedField(child, guiLayout.mutable_children());
Although there's no straight-forward method you still can do this (for custom message using reflection). Code below removes count repeated field items starting from row index.
void RemoveFromRepeatedField(
const google::protobuf::Reflection *reflection,
const google::protobuf::FieldDescriptor *field,
google::protobuf::Message *message,
int row,
int count)
{
int size = reflection->FieldSize(*message, field);
// shift all remaining elements
for (int i = row; i < size - count; ++i)
reflection->SwapElements(message, field, i, i + count);
// delete elements from reflection
for (int i = 0; i < count; ++i)
reflection->RemoveLast(message, field);
}