Stack overflow or tail recursion? - c++

I'm trying to find out whether the following piece of code is good or bad practice. It's about a URL query string that should be parsed by my API. It's very convenient to use recursion to trim an arbitrary number of '?' characters off the query string.
However, I'm wondering whether this could result in a stack overflow due to the uncontrollable recursion depth. My hope is that such cases are guaranteed to be tail-call optimized, but I'm not sure about it. Is there such a guarantee?
Demo
#include <string_view>
#include <cstdio>
static auto digest_query(std::string_view query) -> void
{
if (query.front() == '?') {
// printf("%.*s\n", (int)query.size(), query.data());
return digest_query(query.substr(1));
}
// Do other stuff...
}
int main()
{
digest_query("???????key=value");
}

Is there such a guarantee?
No, there isn't.

The recursion depth is not "uncontrolled"; it is exactly equal to the number of leading '?'. Every level will host a string variable on the stack, but the string data itself is allocated on the heap.
So there is absolutely no risk of overflow, but this code is so inefficient! It involves recursive calls, string allocations/deallocations, and string copies, all of them perfectly useless. I call it a disaster.
Should I add that I find it pretty unreadable (so unexpected) compared to a regex query or a straightforward loop?
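The "straightforward loop" mentioned above could look like the following sketch. The function name strip_leading_question_marks is hypothetical and does not appear in the question. Because std::string_view is a non-owning view, remove_prefix only adjusts a pointer and a length, so the stack depth stays constant no matter how many '?' characters the input carries.

```cpp
#include <algorithm>
#include <string_view>

// Hypothetical helper: returns the query without its leading '?' characters.
// Iterative, so stack usage does not depend on the number of '?' characters.
std::string_view strip_leading_question_marks(std::string_view query)
{
    // find_first_not_of returns npos when the view is empty or consists only
    // of '?'; clamping to size() turns that case into an empty result.
    const auto first = std::min(query.find_first_not_of('?'), query.size());
    query.remove_prefix(first);
    return query;
}
```

Calling strip_leading_question_marks("???????key=value") yields a view of "key=value". An empty input is handled as well, which the recursive version does not do because it calls front() without checking for emptiness.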

Related

How to omit the empty arguments in fmt::format?

Is there a way to omit the empty string literals ("") in the argument list of the fmt::format function?
I have the below snippet which gives the desired output:
#include <string>
#include <fmt/core.h>
int main( )
{
const std::string application_layer_text_head { fmt::format( "{:*<5}[Application Layer]{:*<51}\n\n", "", "" ) };
fmt::print( "{}", application_layer_text_head );
}
Output:
*****[Application Layer]***************************************************
So instead of writing this: fmt::format( "{:*<5}[Application Layer]{:*<51}\n\n", "", "" ) can we remove the empty literals and write this: fmt::format( "{:*<5}[Application Layer]{:*<51}\n\n" )? I tried it but it failed to compile. Those two literals don't really serve any purpose so I want to find a way to not write them.
Just to clarify, I only want to have 5 * in the beginning and then [Application Layer] and then 51 * and then 2 \n.
Formatting markup is meant for formatting a string with some piece of user-provided data. The particulars of the specialized syntax within formatting can adjust how this formatting works, even inserting characters and the like. But this functionality is meant to be a supplement to the basic act: taking some user-provided object and injecting it into a string.
So no, format doesn't have a mechanism to allow you to avoid providing the user-provided data that is the entire reason format exists in the first place.
It should also be noted that the very meaning of the text after the : in a format specifier is defined based on the type of the object being formatted. The "*<5" means "align to 5 characters using '*' characters to fill in the blanks" only because you provided a string for that particular format parameter. So not only does format not provide a way to do this, it functionally cannot. You have to tell it what type is being used because this is an integral part of processing what "*<5" means.
As noted already, format can't do this. But your worries about string concatenation being expensive are misplaced; repeated application of operator+ is expensive (performs new allocations, copies all existing data and new data, discards old data, over and over), but in-place concatenation with operator+= and append is cheap, especially if you pre-reserve (so you're allocating once up-front and populating, not relying on amortized growth patterns in reallocation to save you). Even without pre-reserve, std::string follows amortized growth patterns, so repeated in-place concatenation is amortized O(1) (per character added), not O(n) in the size of the data so far.
The following should be, essentially by definition, at least as fast as formatting, though at the expense of a larger number of lines of code to perform the pre-reserve to prevent reallocation:
#include <string>
#include <string_view>
#include <fmt/core.h>
using namespace std::literals;
int main( )
{
// Make empty string, and reserve enough space for final form
auto application_layer_text_head = std::string();
application_layer_text_head.reserve(5 + "[Application Layer]"sv.size() + 51 + "\n\n"sv.size());
// append is in-place, returning reference to original string, so it can be chained
// Using string_view literals to avoid need for any temporary runtime allocated strings,
// while still allowing append to use known length concatenation to save scanning for NUL
application_layer_text_head.append(5, '*').append("[Application Layer]"sv).append(51, '*').append("\n\n"sv);
fmt::print("{}", application_layer_text_head);
}
If you were okay with some of the concatenations potentially performing reallocation, and a final move construction to move the resources from the temporary reference to a real string, it simplifies to a one-liner:
const auto application_layer_text_head = std::move(std::string(5, '*').append("[Application Layer]"sv).append(51, '*').append("\n\n"sv));
or, given that 5 asterisks is short enough to type, the even shorter/simpler version:
const auto application_layer_text_head = std::move("*****[Application Layer]"s.append(51, '*').append("\n\n"sv));
But keeping it to a two-liner avoids the move construction and is a little safer:
auto application_layer_text_head = "*****[Application Layer]"s;
application_layer_text_head.append(51, '*').append("\n\n"sv);
Yeah, none of those are quite as pretty as a single format literal, even with "ugly" empty placeholders. If you prefer the look of the format string, just pass along the empty placeholders the way you're already doing.

How to fill a string variable with a string in c++ [duplicate]

I want to create a function that will take a string and an integer as parameters and return a string that contains the string parameter repeated the given number of times.
For example:
std::string MakeDuplicate( const std::string& str, int x )
{
...
}
Calling MakeDuplicate( "abc", 3 ); would return "abcabcabc".
I know I can do this just by looping x number of times but I'm sure there must be a better way.
I don't see a problem with looping, just make sure you do a reserve first:
std::string MakeDuplicate( const std::string& str, int x )
{
std::string newstr;
newstr.reserve(str.length()*x); // prevents multiple reallocations
// loop...
return newstr;
}
At some point it will have to be a loop. You may be able to hide the looping in some fancy language idiom, but ultimately you're going to have to loop.
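For illustration, here is one way the reserve-then-loop idea could be completed. The loop body and the handling of non-positive counts are assumptions, since the answer above leaves the loop elided.

```cpp
#include <cstddef>
#include <string>

// Sketch of the reserve-then-loop approach. The loop body is an assumption;
// the answer above does not spell it out.
std::string MakeDuplicate(const std::string& str, int x)
{
    std::string newstr;
    if (x <= 0) {
        return newstr;  // assumption: a non-positive count gives an empty string
    }
    const auto count = static_cast<std::size_t>(x);
    newstr.reserve(str.length() * count);  // single allocation up front
    for (std::size_t i = 0; i < count; ++i) {
        newstr += str;  // in-place append; capacity is already sufficient
    }
    return newstr;
}
```

With this sketch, MakeDuplicate("abc", 3) returns "abcabcabc".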
For small 'x', a simple loop is your friend. For large 'x' and relatively short 'str', we can think of a "smarter" solution that reuses the already concatenated string.
std::string MakeDuplicate( const std::string& str, unsigned int x ) {
std::string newstr;
if (x>0) {
unsigned int y = 2;
newstr.reserve(str.length()*x);
newstr.append(str);
while (y<x) {
newstr.append(newstr);
y*=2;
}
newstr.append(newstr.c_str(), (x-y/2)*str.length());
}
return newstr;
}
Or something like that :o) (I think it can be written in a nicer way, but the idea is there).
EDIT: I was interested myself and ran some tests comparing three solutions on my notebook with Visual Studio (the reuse version, a simple loop with preallocation, and a simple copy & loop-1 without preallocation). Results as expected: for small x (<10) the preallocation version is generally fastest and the version without preallocation is a tiny bit slower; for larger x the speedup of the 'reuse' version is really significant (log n vs. n append calls). Nice, I just can't think of any real problem that could use it :o)
There is an alternative to a loop: it's called recursion, and of recursion, tail recursion is the nicest variety since you can theoretically do it till the end of time -- just like a loop :D
p.s., tail recursion is often syntactic sugar for a loop -- however, in the case of procedural languages (C++), the compiler is generally at a loss, so the tail recursion is not optimised and you might run out of memory (but if you wrote a recursion that runs out of memory then you have bigger problems) :D
more downvotes please !!
recursion is obviously not a construct used in computer science for the same job as looping

Reversing input, output string is unexpectedly empty

#include <iostream>
#include <string>
using namespace std;
int main()
{
string s1;
cin>>s1;
int n=s1.size();
string s2;
for(int a=0;a<n;a++)
{
s2[a]=s1[n-1-a];
}
cout<<s2;
}
However, I am not getting any output, although I can print individual elements of the reversed string. Can anybody please help?
string s2; // no size allocated
for(int a=0;a<n;a++)
{
s2[a]=s1[n-1-a]; // write to somewhere in s2 that doesn't exist yet
}
You are writing into elements of s2 that you never created. This is Undefined Behaviour, and anything may happen, including appearing to work normally. In this case, you are probably overwriting some random place in memory. It might crash, or that memory might really exist but not break anything else right away. You might even be able to read that back later, but it would only seem to work by pure accident.
You could catch this problem by always using s2.at(a) to access the data, as it will do a range check for you ([] does not do a range check). There's a cost to this of course, and sometimes people will skip it in circumstances where they are certain the index cannot be out of bounds. That's debatable. In this case, even though you were probably sure you got the maths right, it still would have helped catch this bug.
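To make the range check concrete, the sketch below wraps a write through at() and reports whether it was rejected. The helper name write_is_out_of_range is hypothetical and only serves the illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: tries to write one character through at() and reports
// whether the range check rejected the index.
bool write_is_out_of_range(std::string& s, std::size_t index)
{
    try {
        s.at(index) = 'x';  // at() validates the index before giving access
        return false;
    } catch (const std::out_of_range&) {
        return true;        // thrown when index >= s.size()
    }
}
```

On an empty string, index 0 is already out of range, which is exactly the situation in the question: the loop there writes to s2[0], s2[1], ... while s2.size() is still 0.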
You need to either create that space up front, i.e. by creating a string full of the right number of dummy values, or create the space for each element on demand with push_back. I'd probably go with the first option:
string s2(s1.size(), '\0'); // fill with the right number of NULs
for(int a=0;a<n;a++)
{
s2.at(a)=s1.at(n-1-a); // overwrite the NULs
}
You might want to choose a printable dummy character that doesn't appear in your test data, for example '#', since then it becomes very visible when you print it out if you have failed to correctly overwrite some element. E.g. if you try to reverse "abc" but when you print it out you get "cb#" it would be obvious you have some off-by-one error.
The second option is a bit more expensive since it might have to do several allocations and copies as the string grows, but here's how it would look:
string s2; // no space allocated yet
for(int a=0;a<n;a++)
{
s2.push_back(s1.at(n-1-a)); // create space on demand
}
I assume you are doing this as a learning exercise, but I would recommend against writing your own reverse, since the language provides it in the library.
Your code invokes what is called Undefined Behaviour. You try to access an element of s2 at a given position, but your string does not have that many chars (it's empty).
You can use the std::string::push_back function to add a character at the last position, so your code would look like this:
for(int a=0;a<n;a++)
{
s2.push_back(s1[n-1-a]);
}
EDIT, to address the other question, your console window probably closes before you can notice. That's why "you don't get any output".
Try this: How to stop C++ console application from exiting immediately?
You can use the built-in std::reverse function from <algorithm>:
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s1 = "string1";
reverse(s1.begin(),s1.end());
cout << s1;
return 0;
}
Hope that helps :)
You can just construct the string with reverse iterators:
std::string reverse ( std::string const& in )
{
return std::string { in.crbegin(), in.crend() };
}

Why is this program giving runtime error?

I am trying to write a function that reads a character string from stdin and stores it in a character vector, and also stores the positions of the special character '#' in an integer vector. It is given that the input will consist only of lowercase letters and the special character '#'. Both the character and integer vectors are global. I can't figure out why I am getting a runtime error. Here is my code:
vector<int> v;
vector<char> s;
inline int input() //called in main when we have to read input
{
char p=getchar();
register int i=0;
while((p>='a'&&p<='z')||(p=='#'))
{
s.push_back(p);
if (p=='#')
{
v.push_back(i);
}
p=getchar();
i++;
}
return 0;
}
while((p>='a'&&p<='z')||(p=='#))
The closing quote is missing: you have '# where it should be '#'.
It would help if you described the nature of the error that you get at run-time. It would also help if you gave examples of inputs that cause this error.
A few observations, which may give you some pointers to the cause of the error:
Your vectors are global variables. It would be much better if you passed them into the function and did not store them at global scope. This will let you track much better where they are accessed and changed, which will make your code much more maintainable.
Neither vector gets cleared at the start of the function so will continue to build up through subsequent calls. This may or may not be what you want to do.
The function will terminate early if you do not either type a lowercase letter or '#'. This looks deliberate, but of course any punctuation, capitals, numbers, or spaces will cause early termination.
You always return 0 from the function. If the function is not written to return a value it should be declared as void.
I would also remove your use of inline and register which are unlikely to give you anything of an appreciable speed increase.

C++ faster way to do string addition?

I'm finding standard string addition to be very slow so I'm looking for some tips/hacks that can speed up some code I have.
My code is basically structured as follows:
inline void add_to_string(string data, string &added_data) {
if(added_data.length()<1) added_data = added_data + "{";
added_data = added_data+data;
}
int main()
{
int some_int = 100;
float some_float = 100.0;
string some_string = "test";
string added_data;
added_data.reserve(1000*64);
for(int ii=0;ii<1000;ii++)
{
//variables manipulated here
some_int = ii;
some_float += ii;
some_string.assign(ii%20,'A');
//then we concatenate the strings!
stringstream fragment;
fragment<<some_int <<","<<some_float<<","<<some_string;
add_to_string(fragment.str(),added_data);
}
return 0;
}
Doing some basic profiling, I'm finding that a ton of time is being used in the for loop. Are there some things I can do that will significantly speed this up? Will it help to use c strings instead of c++ strings?
String addition is not the problem you are facing. std::stringstream is known to be slow due to its design. On every iteration of your for-loop the stringstream is responsible for at least 2 allocations and 2 deletions. The cost of each of these 4 operations is likely more than that of the string addition.
Profile the following and measure the difference:
std::string stringBuffer;
for(int ii=0;ii<1000;ii++)
{
//variables manipulated here
some_int = ii;
some_float += ii;
some_string.assign(ii%20,'A');
//then we concatenate the strings!
char buffer[128];
sprintf(buffer, "%i,%f,%s",some_int,some_float,some_string.c_str());
stringBuffer = buffer;
add_to_string(stringBuffer ,added_data);
}
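Ideally, replace sprintf with snprintf (standard since C++11; older MSVC versions only offer _snprintf) or the equivalent supported by your compiler, so that the buffer size is checked.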
As a rule of thumb, use stringstream for formatting by default and switch to the faster and less safe functions like sprintf, itoa, etc. whenever performance matters.
Edit: that, and what didierc said: added_data += data;
You can save lots of string operations if you do not call add_to_string in your loop.
I believe this does the same (although I am not a C++ expert and do not know exactly what stringstream does):
stringstream fragment;
for(int ii=0;ii<1000;ii++)
{
//variables manipulated here
some_int = ii;
some_float += ii;
some_string.assign(ii%20,'A');
//then we concatenate the strings!
fragment<<some_int<<","<<some_float<<","<<some_string;
}
// inlined add_to_string call without the if-statement ;)
added_data = "{" + fragment.str();
I see you used the reserve method on added_data, which should help by avoiding multiple reallocations of the string as it grows.
You should also use the += string operator where possible:
added_data += data;
I think that the above should save some significant time by avoiding unnecessary copies back and forth of added_data into a temporary string when doing the concatenation.
This += operator is a simpler version of the string::append method; it just copies data directly to the end of added_data. Since you made the reserve, that operation alone should be very fast (almost equivalent to a strcpy).
But why go through all this, when you are already using a stringstream to handle input? Keep it all in there to begin with!
The stringstream class is indeed not very efficient.
You may want to look at the documentation of the stringstream class for more information on how to use it, if necessary, but your solution of using a string as a buffer seems to avoid that class's speed issue.
At any rate, stay away from any attempt at reimplementing the speed-critical code in pure C unless you really know what you are doing. Some other SO posts support the idea of doing it, but I think it's best (read: safer) to rely as much as possible on the standard library, which will be enhanced over time and takes care of many corner cases you (or I) wouldn't think of. If your input data format is set in stone, then you might start thinking about taking that road, but otherwise it's premature optimization.
If you start added_data with a "{", you would be able to remove the if from your add_to_string method: the if gets executed exactly once, when the string is empty, so you might as well make it non-empty right away.
In addition, your add_to_string makes a copy of the data; this is not necessary, because it does not get modified. Accepting the data by const reference should speed things up for you.
Finally, changing your added_data from a string to a stringstream should let you append to it in a loop, without the intermediate stringstream that gets created, copied, and thrown away on each iteration of the loop.
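A minimal sketch of these suggestions, assuming the loop from the question: the output stream starts with "{", so no emptiness check is needed, and a single std::ostringstream collects everything. The function name build_data and its iterations parameter are hypothetical. Whether this is actually faster than the other variants would have to be measured.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical function: builds the whole result in one stream instead of
// creating and discarding a stringstream on every iteration.
std::string build_data(int iterations)
{
    std::ostringstream added_data;
    added_data << '{';  // start non-empty, so no if is needed later
    float some_float = 100.0f;
    std::string some_string;
    for (int ii = 0; ii < iterations; ++ii) {
        // variables manipulated as in the question
        some_float += static_cast<float>(ii);
        some_string.assign(static_cast<std::size_t>(ii % 20), 'A');
        // append directly to the one long-lived stream
        added_data << ii << ',' << some_float << ',' << some_string;
    }
    return added_data.str();
}
```

As in the question's code, no separator is written between consecutive fragments; build_data(1000) corresponds to the original loop.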
Please have a look at Twine used in LLVM.
A Twine is a kind of rope, it represents a concatenated string using a
binary-tree, where the string is the preorder of the nodes. Since the
Twine can be efficiently rendered into a buffer when its result is used,
it avoids the cost of generating temporary values for intermediate string
results -- particularly in cases when the Twine result is never
required. By explicitly tracking the type of leaf nodes, we can also avoid
the creation of temporary strings for conversions operations (such as
appending an integer to a string).
It may be helpful in solving your problem.
How about this approach?
This is a DevPartner for MSVC 2010 report.
string newstring = stringA & stringB;
I don't think strings are slow; it's the conversions that can make it slow,
and maybe your compiler, which might check variable types for mismatches.