Boost property tree issue when converting to Unicode

Ok, first off I'm not a C++ developer by nature; I've managed to put some stuff together and it works fine, but I'm sure through the eyes of an expert it looks like garbage =)
So I have a freeware app that I've made which uses Property Tree from the Boost libraries. I developed the entire app (in VS2010) with the Use Multi-Byte Character Set setting. I decided it was time to go through and update the app to support Unicode as there are some folks with complex character sets that I'd like to better support.
I went through the tedious process of changing all the references and calls to use wide strings, along with all the necessary conversions. However, I'm completely stumped by the only two compiler errors I have left.
They both come from stream_translator.hpp (/boost/property_tree/), lines 33 and 36 (as noted below):
template <typename Ch, typename Traits, typename E, typename Enabler = void>
struct customize_stream
{
static void insert(std::basic_ostream<Ch, Traits>& s, const E& e) {
s << e; //line 33
}
static void extract(std::basic_istream<Ch, Traits>& s, E& e) {
s >> e; //line 36
if(!s.eof()) {
s >> std::ws;
}
}
};
The error at line 33 is:
Error 347 error C2679: binary '<<' : no operator found which takes a right-hand operand of type 'const std::wstring' (or there is no acceptable conversion) {...}\boost_1_49_0\boost\property_tree\stream_translator.hpp 33 1
..and the error at line 36 is:
Error 233 error C2678: binary '>>' : no operator found which takes a left-hand operand of type 'std::basic_istream<_Elem,_Traits>' (or there is no acceptable conversion) {...}\boost_1_49_0\boost\property_tree\stream_translator.hpp 36 1
From what I've been able to walk backwards through, it's coming from within stream_translator.hpp ultimately beginning as a call to get a value [e.g. ptree.get("some.path", "default value here")]
I really have absolutely no idea how to resolve this issue and cannot seem to find anything online to help me understand what exactly the problem is. Any tips or info would be greatly appreciated.
EDIT
So I commented out everything relating to ptree until it would compile, then began adding things back in. It turns out I can call .get fine; it's get_child where the error at line 36 pops up (I haven't done the other project yet, where the wstring issue is).
To simplify things, here is the effective sequence of the calls, which are fine until get_child is called:
boost::property_tree::ptree pt;
boost::property_tree::read_xml("Config.xml", pt);
int iAppSetting = pt.get("config.settings.AppSetting",1); //<- works fine
ptree ptt;
ptt = pt.get_child("config.Applications"); //<- adding this line causes the line 36 error

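I'm guessing your problem is the same one I ran into. Boost.PropertyTree has wide-character versions (wptree and friends) for Unicode support. With the narrow ptree, a wstring value most likely ends up being streamed through a narrow stream, which has no << or >> for it.
For a Config.xml that is set up like this: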
<?xml version="1.0"?>
<Zoo>
<Monkey>
<Food>Bananas</Food>
</Monkey>
</Zoo>
Use code similar to this to parse it:
// Load up the property tree for wide characters
boost::property_tree::wptree pt;
boost::property_tree::read_xml("Config.xml", pt);
// Iterate over the children of <Zoo>
BOOST_FOREACH(boost::property_tree::wptree::value_type const& v, pt.get_child(L"Zoo"))
{
    if( v.first == L"Monkey" )
    {
        std::wstring foodType = v.second.get<std::wstring>(L"Food");
    }
}
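To see why the character types have to match, here is a minimal sketch that uses only the standard library. ToText and FromText are hypothetical helpers (not part of Boost) that mirror the insert/extract pattern from stream_translator.hpp quoted in the question:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper mirroring customize_stream::insert: write a value
// into a stream whose character type is Ch and return the resulting text.
template <typename Ch, typename E>
std::basic_string<Ch> ToText(const E& e) {
    std::basic_ostringstream<Ch> s;
    s << e;  // needs an operator<< for basic_ostream<Ch> and E
    return s.str();
}

// Hypothetical helper mirroring customize_stream::extract.
template <typename Ch, typename E>
E FromText(const std::basic_string<Ch>& text) {
    std::basic_istringstream<Ch> s(text);
    E e{};
    s >> e;  // needs an operator>> for basic_istream<Ch> and E
    return e;
}

// ToText<wchar_t>(std::wstring(L"Bananas"));  // compiles: wide stream, wide string
// ToText<char>(std::wstring(L"Bananas"));     // does not compile: narrow stream, wide string
```

The last commented-out line fails for the same reason as the errors in the question: there is no operator<< that writes a std::wstring to a char-based stream.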

Related

Getting Error Debug Assertion Failed: Expression c >= -1 && c <= 255

I have the following code, which works well on my Ubuntu system:
#include <algorithm>
// ... other functions
bool IsHexPrefixed(const std::string& input) {
return input.substr(0, 2) == "0x";
}
std::string StripHexPrefix(const std::string& input) {
return IsHexPrefixed(input) ? input.substr(2, input.length()) : input;
}
bool IsHexString(const std::string& input) {
std::string stripped_string_ = StripHexPrefix(input);
return std::all_of(stripped_string_.begin(), stripped_string_.end(), ::isxdigit);
}
// ... some other functions
On Windows 10, whether I run it via cmd, VSCode, or Visual Studio 2019, I get a pop-up reporting the Debug Assertion Failed error.
The error comes from the std::all_of() call in the IsHexString() function.
I tried using exceptions to find out where the error comes from, but found no solution. I also tried using a breakpoint, but that did not help me find the cause either.
What could be the reason for this error?
EDIT:
The string that I passed to IsHexString() function is 000002C479F17CC0.

The reason is just what the assertion says. The behavior of isxdigit is undefined if its argument is neither representable as unsigned char nor equal to EOF (see notes here).
Since it takes an int argument and char is signed on MSVC, it's highly likely your string contains chars with values in the range 128-255 (probably from non-ASCII text), which get promoted to negative integers.
The linked cppreference page also has a workaround to avoid promotion issues that you could apply to your case:
std::all_of(stripped_string_.begin(), stripped_string_.end(),
[](unsigned char c){ return std::isxdigit(c); });
Another possibility is that StripHexPrefix function corrupts your string causing the problem above.
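Putting the workaround together, a self-contained version of the check might look like the sketch below. IsHexStringSafe is a hypothetical name, and folding the prefix handling into one function is my own choice rather than anything from the question:

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if every character after an optional "0x" prefix is a hex digit.
bool IsHexStringSafe(const std::string& input) {
    // Skip the prefix only when the string really starts with "0x".
    const std::size_t start = (input.compare(0, 2, "0x") == 0) ? 2 : 0;
    return std::all_of(input.begin() + start, input.end(),
                       // Taking the parameter as unsigned char keeps the value
                       // passed to std::isxdigit non-negative.
                       [](unsigned char c) { return std::isxdigit(c) != 0; });
}
```

With this version, a byte such as 0xC3 reaches std::isxdigit as 195 instead of a negative number, so the debug assertion has nothing to complain about and the function simply returns false.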

Compiler error for using operator '>>' with ifstream

I am trying to use a custom template for IO, and I am getting an error:
"error C2678: binary '>>' : no operator found which takes a left-hand operand of
type 'std::ifstream' (or there is no acceptable conversion)"
I have searched and found only suggestions to try including more headers, and have tried including: string, fstream, iostream, istream, vector. I can use an fstream.get(), but I am trying to get space delimited strings. (The format of my file is lines like this: "String1 = String2")
Here is my code:
template <typename OutType>
OutType Read(std::ifstream& in)
{
OutType out;
in >> out;
return out;
}
Any suggestions are very much appreciated! Thanks!
(P.S. Not sure if it will matter for compiler considerations, but I am using Visual Studio 2013.)

The problem is your OutType (which you have not shown us) has no operator>>(istream&, OutType&). You need to define one for every possible OutType.

How do you expect the >> operator to know about OutType? It understands built-in types like int and char, but if you want OutType to work with >> you have to overload the operator for it.
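As a rough illustration, here is what such an overload could look like for the "String1 = String2" line format mentioned in the question. KeyValue is a hypothetical type I made up for the example, and Read takes std::istream& instead of std::ifstream& so it can be tried with a string stream:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical type holding one "key = value" line.
struct KeyValue {
    std::string key;
    std::string value;
};

// The overload that Read<KeyValue> needs in order to compile.
std::istream& operator>>(std::istream& in, KeyValue& kv) {
    std::string key, equals, value;
    if (in >> key >> equals >> value) {
        if (equals == "=") {
            kv.key = key;
            kv.value = value;
        } else {
            in.setstate(std::ios_base::failbit);  // report malformed input
        }
    }
    return in;
}

// Same shape as the template from the question.
template <typename OutType>
OutType Read(std::istream& in) {
    OutType out{};
    in >> out;
    return out;
}
```

Read<std::string> and Read<int> already work because the standard library provides those overloads; only user-defined types need one of their own.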

How to find in my program a "const char* + int" expression

I'm in the middle of a source code migration, and the converter program did not convert concatenations of embedded strings with integers. Now I have lots of code with this kind of expression:
f("some text" + i);
Since C/C++ interprets this as pointer arithmetic on the literal (in effect an array subscript), f will receive "some text", or "ome text", or "me text"...
My source language treats the concatenation of a string with an int as string concatenation. Now I need to go line by line through the source code and change, by hand, the previous expression to:
f("some text" + std::to_string(i));
The conversion program managed to convert local "String" variables to "std::string", resulting in expressions:
std::string some_str = ...;
int i = ...;
f(some_str + i);
Those were easy to fix because with such expressions the C++ compiler outputs an error.
Is there any tool that can automatically find such expressions in source code?
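To make the difference concrete, here is a small sketch of the two behaviours. Shifted and Concatenated are hypothetical names used only for this illustration:

```cpp
#include <string>

// What the converted code does: the literal decays to const char*, and
// adding i moves the pointer forward by i characters.
std::string Shifted(int i) {
    const char* text = "some text";
    return std::string(text + i);  // only valid for 0 <= i <= 9
}

// What the source language meant: append the decimal digits of i.
std::string Concatenated(int i) {
    return "some text" + std::to_string(i);
}
```

For i = 2 the first function yields "me text" and the second yields "some text2". Both compile without an error, which is why the broken lines are hard to spot.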

Easy! Just replace all the + with -&:
find . -name '*.cpp' -print0 | xargs -0 sed -i '' 's/+/-\&/g'
When you try to compile your project you will see, among other errors, something like this:
foo.cpp:9:16: error: 'const char *' and 'int *' are not pointers to compatible types
return f(s -& i);
~ ^~~~
(I'm using clang, but other compilers should issue similar errors)
So you just have to filter the compiler output to keep only those errors:
clang++ foo.cpp 2>&1 | grep -F "error: 'const char *' and 'int *' are not pointers to compatible types"
And you get:
foo.cpp:9:16: error: 'const char *' and 'int *' are not pointers to compatible types
foo.cpp:18:10: error: 'const char *' and 'int *' are not pointers to compatible types

You can try flint, an open-source lint program for C++ developed and used at Facebook. It has a blacklisted token sequences feature (checkBlacklistedSequences). You can add your token sequence to the checkBlacklistedSequences function and flint will report every occurrence.
In the checkBlacklistedSequences function, I added the sequence string_literal + number:
BlacklistEntry([tk!"string_literal", tk!"+", tk!"number"],
"string_literal + number problem!\n",
true),
then compile and test
$ cat -n test.cpp
1 #include <iostream>
2 #include <string>
3
4 using namespace std;
5
6 void f(string str)
7 {
8 cout << str << endl;
9 }
10
11 int main(int argc, char *argv[])
12 {
13 f("Hello World" + 2);
14
15 f("Hello World" + std::to_string(2));
16
17 f("Hello World" + 2);
18
19 return 0;
20 }
$ ./flint test.cpp
test.cpp(13): Warning: string_literal + number problem!
test.cpp(17): Warning: string_literal + number problem!
flint has two versions (an old one written in C++ and a new one written in D); I made my changes in the D version.

I'm not familiar with a lot of tools which can do that, but I think grep can be helpful in some measure.
In the root directory of your source code, try:
grep -rn '".\+"\s*+\s*' .
This finds every line that contains something like "xxxxx" +, which should help you locate the lines you need.
If all the integers are constants, you can alter the grep expression to:
grep -rn '".\+"\s*+\s*[0-9]\+' .
And you can also include the ( before the string constant:
grep -rn '(".\+"\s*+\s*[0-9]\+' .
This may be not the "correct" answer, but I hope this can help you.

You may not need an external tool. Instead, you can take advantage of the C++ rule that an implicit conversion sequence may use at most one user-defined conversion. Basically, you change the parameter of your f function from const char*/std::string to a type that is implicitly convertible only from either a string literal (const char[size]) or a std::string instance (what you get when you add std::to_string to the expression).
#include <cstddef>
#include <iostream>
#include <string>
#include <utility>
struct string_proxy
{
std::string value;
string_proxy(const std::string& value) : value(value) {}
string_proxy(std::string&& value) : value(std::move(value)) {}
template <size_t size>
string_proxy(const char (&str)[size]) : value(str) {}
};
void f(string_proxy proxy)
{
std::cout << proxy.value << std::endl;
}
int main()
{
f("this works"); // const char[size]
f("this works too: " + std::to_string(10)); // std::string
f("compile error!" + 10); // const char*
return 0;
}
Note that this is not going to work on MSVC, at least not in the 2012 version; it's likely a bug, since no warnings are emitted either. It works perfectly fine in g++ and clang (you can quickly check it here).

I've found a very simple way to detect this issue. Neither a regular expression nor a lint will match more complex expressions like the following:
f("Hello " + g(i));
What I need is some form of type inference, so I'm letting the compiler do it. Using a std::string instead of a string literal raises an error, so I wrote a simple source code converter that wraps every string literal in std::string, like this:
f(std::string("Hello ") + g(i));
Then, after recompiling the project, I'd see all the errors. The source code is on GitHub, in 48 lines of Python code:
https://gist.github.com/alejolp/3a700e1730e0328c68de
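For readers who would rather stay in C++, the same idea can be roughed out with std::regex_replace. This is not a port of the linked script, just a minimal sketch under simplifying assumptions: WrapStringLiterals is a hypothetical name, and the pattern ignores comments, character literals, raw strings, and #include "..." lines.

```cpp
#include <regex>
#include <string>

// Wraps every double-quoted literal in std::string(...), so that
// `literal + int` stops compiling instead of doing pointer arithmetic.
std::string WrapStringLiterals(const std::string& source) {
    // A quote, then any run of non-quote/non-backslash characters or
    // backslash escapes, then the closing quote.
    static const std::regex literal(R"lit("(?:[^"\\]|\\.)*")lit");
    return std::regex_replace(source, literal, "std::string($&)");
}
```

Applied to a line such as f("Hello " + g(i)); this produces f(std::string("Hello ") + g(i)); and the compiler then reports an error wherever g(i) is not something a std::string can be added to.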

If your case is exactly as
"some text in quotations" + a_numeric_variable_or_constant
then Powergrep or similar programs will let you scan all files for
("[^"]+")\s*\+\s*(\w+)
and replace with
\1 + std::to_string(\2)
This will bring up the possible matches, but I strongly recommend previewing what you are replacing first, because this will also replace string variables.
Regular expressions cannot understand the semantics of your code, so they cannot tell whether the operands are integers. For that you need a program with a parser, like CDT or a static code analyzer. Unfortunately, I do not know of any that can do that. So, to sum up, I hope the regex helps :)
PS: In the worst case, if the variables are not numeric, the compiler will give you an error, because to_string doesn't accept anything other than numeric values. You can then fix those by hand; hopefully there won't be many.
PS 2: Some may think that Powergrep is expensive. You can use the trial for 15 days with full functionality.
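If you don't have Powergrep at hand, the same search-and-replace can be sketched with std::regex. AddToString is a hypothetical helper name; the pattern is the one above, with $1 and $2 as the ECMAScript spelling of \1 and \2:

```cpp
#include <regex>
#include <string>

// Rewrites `"literal" + word` into `"literal" + std::to_string(word)`.
std::string AddToString(const std::string& line) {
    static const std::regex pattern(R"re(("[^"]+")\s*\+\s*(\w+))re");
    return std::regex_replace(line, pattern, "$1 + std::to_string($2)");
}
```

The same caveats apply: the word after the + may be a string variable, and a line that already calls std::to_string is matched again (the \w+ picks up std), so run it once and review the diff.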

You can try the Map-Reduce Clang plugin.
The tool was developed at Google to do just this kind of refactoring, mixing strong type-checking and regexps.
(See the video presentation here.)

You can use a C++ conversion operator and create a new class that overloads operator + to suit your needs. Replace int with the new class "Integer" and perform the required overloading. This requires no changes or word replacement at the call sites.
#include <string>
class Integer {
    long i;
    std::string formatted;
public:
    Integer(int value) : i(value) {}  // store the wrapped number
    operator const char*() const {
        return formatted.c_str();
    }
    friend Integer operator+(const char* input, Integer t);
};
Integer operator+(const char* input, Integer integer) {
    integer.formatted = input + std::to_string(integer.i);
    return integer;
}
Integer i = ....
f("test" + i); // executes the overloaded operator; f must accept const char*

I'm assuming that, for the call f(some_str + i);, your function is defined like this:
void f(std::string value)
{
// do something.
}
If you declare another class, such as AdvString, that implements operator + for integers, and declare your function as in the code below, then the call f(some_str + i); will work:
void f(AdvString value)
{
// do something.
}
A sample implementation is here: https://github.com/prasaathviki/advstring

g++ regex crash on (possibly unsyntactic) expression

I figure the following program should either complain it can't compile the regular expression or else treat it as legal and compile it fine (I don't have the standard so I can't say for sure whether the expression is strictly legal; certainly reasonable interpretations are possible). Anyway, what happens with g++ (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1 is that, when run, it crashes hard
*** Error in `./a.out': free(): invalid next size (fast): 0x08b51248 ***
in the guts of the library.
Questions are:
a) It's a bug, right? I assume (perhaps incorrectly) that the standard doesn't say std::regex can crash if it doesn't like the syntax. (msvc eats it fine, fwiw)
b) if it's a bug, is there some easy way to see whether it's been reported or not (my first time poking around gnu-land bug systems was intimidating)?
#include <iostream>
#include <regex>
int main(void)
{
const char* Pattern = "^(%%)|";
std::regex Machine;
try {
Machine = Pattern;
}
catch(const std::regex_error& e)
{
std::cerr << "regex could not compile pattern: "
<< Pattern << "\n"
<< e.what() << std::endl;
throw;
}
return 0;
}

I would put this in a comment, but I can't, so...
I don't know if you already know, but it seems to be the pipe | character at the end that's causing your problems. It seems like the character representation of | as a last character (since "^(%%)|a" works fine for me) given by g++ is making a mess when regex tries to call free();
The standard (or at least the online draft I'm reading) claims that:
28.8 Class template basic_regex [re.regex]
1 For a char-like type charT, specializations of class template basic_regex represent regular expressions constructed from character sequences of charT characters. In the rest of 28.8, charT denotes a given char-like type. Storage for a regular expression is allocated and freed as necessary by the member functions of class basic_regex.
2 Objects of type specialization of basic_regex are responsible for converting the sequence of charT objects to an internal representation. It is not specified what form this representation takes, nor how it is accessed by algorithms that operate on regular expressions.
[ Note: Implementations will typically declare some function
templates as friends of basic_regex to achieve this — end note ]
and later,
basic_regex& operator=(const charT* ptr);
3 Requires: ptr shall not be a null pointer.
4 Effects: returns assign(ptr).
So unless g++ thinks const char* Pattern ="|"; is a null ptr (I would imagine not...),
I guess it's a bug?
EDIT: Incidentally, consecutive || (even when not at the end) seem to cause a segmentation fault for me also.
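For comparison, this is roughly what a conforming library should let you do: either the pattern compiles, or you get a std::regex_error that you can catch. CompilesAsRegex is a hypothetical helper, and I make no claim about what it returns for "^(%%)|" on any particular library version:

```cpp
#include <regex>
#include <string>

// Returns true if the pattern compiles and false if the library rejects it
// with std::regex_error. A crash inside free() is neither of those.
bool CompilesAsRegex(const std::string& pattern) {
    try {
        std::regex machine(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}
```

An obviously malformed pattern such as "(" should come back as false rather than bring the process down.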

Example of overloading C++ extraction operator >> to parse data

I am looking for a good example of how to overload the stream input operator (operator>>) to parse some data with simple text formatting. I have read this tutorial but I would like to do something a bit more advanced. In my case I have fixed strings that I would like to check for (and ignore). Supposing the 2D point format from the link were more like
Point{0.3 =>
0.4 }
where the intended effect is to parse out the numbers 0.3 and 0.4. (Yes, this is an awfully silly syntax, but it incorporates several ideas I need). Mostly I just want to see how to properly check for the presence of fixed strings, ignore whitespace, etc.
Update:
Oops, the comment I made below has no formatting (this is my first time using this site).
I found that whitespace can be skipped with something like
std::cin >> std::ws;
And for eating up strings I have
static bool match_string(std::istream &is, const char *str){
size_t nstr = strlen(str);
while(nstr){
if(is.peek() == *str){
is.ignore(1);
++str;
--nstr;
}else{
is.setstate(is.rdstate() | std::ios_base::failbit);
return false;
}
}
return true;
}
Now it would be nice to be able to get the position (line number) of a parsing error.
Update 2:
Got line numbers and comment parsing working, using just 1 character look-ahead. The final result can be seen here in AArray.cpp, in the function parse(). The project is a (de)serializable C++ PHP-like array class.

Your operator>>(istream &, object &) should get data from the input stream, using its formatted and/or unformatted extraction functions, and put it into your object.
If you want to be more safe (after a fashion), construct and test an istream::sentry object before you start. If you encounter a syntax error, you may call setstate( ios_base::failbit ) to prevent any other processing until you call my_stream.clear().
See <istream> (and istream.tcc if you're using SGI STL) for examples.
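To make that more concrete, here is a minimal sketch for the Point{0.3 => 0.4 } format from the question. The Point struct and the Expect helper are my own hypothetical names, and the error handling is deliberately simple (no line numbers):

```cpp
#include <istream>
#include <sstream>
#include <string>

struct Point {
    double x = 0;
    double y = 0;
};

// Skips leading whitespace, then consumes `text` exactly.
// Sets failbit and returns false on the first mismatch.
static bool Expect(std::istream& is, const std::string& text) {
    is >> std::ws;
    for (char expected : text) {
        if (is.peek() != std::char_traits<char>::to_int_type(expected)) {
            is.setstate(std::ios_base::failbit);
            return false;
        }
        is.ignore(1);
    }
    return true;
}

std::istream& operator>>(std::istream& is, Point& p) {
    std::istream::sentry guard(is);  // checks stream state, skips whitespace
    if (!guard) return is;
    Point tmp;  // parse into a temporary so p is untouched on failure
    if (Expect(is, "Point") && Expect(is, "{") && (is >> tmp.x) &&
        Expect(is, "=>") && (is >> tmp.y) && Expect(is, "}")) {
        p = tmp;
    }
    return is;
}
```

Parsing into a temporary means the caller's Point is only modified when the whole pattern matched; after a failure the stream has failbit set until the caller calls clear().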
