How to print the content of a string as "string literal source" - c++

Suppose s is
a
b
c
const std::string s =
std::cout << R"( s )" << std::endl;
How can I print the content of the string with std::cout as it would appear in a string literal? That is, cout should output the value in this format: "a\nb\nc".
I need to turn a very large text into a std::string.
I can't read it from a file, as I need to define its value inside the source.

What you would need to do is scan the string, replace all occurrences of the characters you are interested in (such as carriage return, tab, etc.) with a printable escape sequence, and then print this new text.
Here is somewhat crude proof of concept:
#include <algorithm>
#include <array>
#include <string>
#include <string_view>
#include <utility>
std::string escape(std::string_view src) {
std::string ret;
ret.reserve(src.size() * 2); // at worst, the string consists solely of escapable symbols
static constexpr std::array escapable = {std::make_pair('\t', 't'),
std::make_pair('\n', 'n')}; // add more chars as needed, note that the array must stay sorted
for (const char ch: src) {
std::pair search_pair{ch, ' '};
auto esc_char = std::equal_range(escapable.begin(), escapable.end(), search_pair, [](auto& a, auto& b) { return a.first < b.first; });
if (esc_char.first != esc_char.second) { // non-empty range: ch is in the table
ret.push_back('\\');
ret.push_back(esc_char.first->second);
} else {
ret.push_back(ch);
}
}
return ret;
}
Now, you can use it:
const std::string str = "A\nbub\tfuf\n";
std::cout << escape(str) << "\n";
Above snippet prints A\nbub\tfuf\n
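If the goal is text that can be pasted back into source code, the quotes and backslashes need escaping too. The following is only a sketch under that assumption: to_literal is a hypothetical name, not part of the answer above, and it handles just the characters listed in the switch (other control characters pass through unchanged).

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: renders `src` as a double-quoted C++ string literal.
// Only the characters in the switch are escaped; everything else is copied.
std::string to_literal(std::string_view src) {
    std::string out;
    out.reserve(src.size() + 2);
    out.push_back('"');
    for (const char ch : src) {
        switch (ch) {
            case '\n': out += "\\n";  break;
            case '\t': out += "\\t";  break;
            case '\r': out += "\\r";  break;
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            default:   out.push_back(ch);
        }
    }
    out.push_back('"');
    return out;
}
```

Calling std::cout << to_literal(s) for a string holding a, b and c on separate lines would then print "a\nb\nc" including the surrounding quotes.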

You could be interested in the JSON specification.
You could consider serializing your data in JSON format using open source C++ libraries like jsoncpp
You could also consider using some YAML format with the yaml-cpp library
You could be interested in the SWIG tool which generates C++ glue code.
You could consider using binary data formats like XDR.
You should specify (on paper, with a pencil) your data format in EBNF notation and use ANTLR or GNU bison to generate the parser (the printer is easier to code)
The RefPerSys project (an open source symbolic artificial intelligence system, GPLv3+ licensed) is persisting data in textual format. You may borrow some code and reuse it in your application, if you comply with that GPL license.
Look also into Qt or POCO frameworks, but notice that DWORD64 is not a standard C++ type. See this C++ reference and read a recent C++ standard (like n3337 or better).
Consider generating your C++ serializing code with tools like GNU m4 or GPP (or one of your own).
Pitrat's book Artificial Beings: the Conscience of a Conscious Machine (ISBN-13: 978-1848211018) should give you valuable insight and intuitions.

You can load this text file into a std::string like this:
Store the text in your file, e.g. mystring.txt, as a raw string literal in the format R"(raw_characters)":
R"(Run.M128A XmmRegisters[16];
BYTE Reserved4[96];", Run.CONTEXT64 := " DWORD64 P1Home;
DWORD64 P2Home;
...
)"
#include the file into a string:
namespace
{
const std::string mystring =
#include "mystring.txt"
;
}
Your IDE might flag this up as a syntax error, but it isn't. What you're doing is loading the contents of file directly into the string at compile time.
Finally print the string:
std::cout << mystring << std::endl;
Why not just save the escaped version of the string in the file?
Anyway, here's a function to 'escape' characters:
#include <iostream>
#include <string>
#include <unordered_map>
std::string replace_all(const std::string &mystring)
{
const std::unordered_map<char, std::string> lookup =
{ {'\n', "\\n"}, {'\t', "\\t"}, {'"', "\\\""} };
std::string new_string;
new_string.reserve(mystring.length() * 2);
for (auto c : mystring)
{
auto it = lookup.find(c);
if (it != lookup.end())
new_string += it->second;
else
new_string += c;
}
return new_string;
}
int main() {
std::string mystring = R"(Run.M128A XmmRegisters[16];
BYTE Reserved4[96];", Run.CONTEXT64 := " DWORD64 P1Home;
DWORD64 P2Home;
DWORD64 P3Home;
DWORD64 P4Home;
DWORD64 P5Home;
DWORD64 P6Home;)";
auto new_string = replace_all(mystring);
std::cout << new_string << std::endl;
return 0;
}
Here's a demo.

Related

How do I read a Windows-1252 file using Rcpp?

I want to force the input encoding to Windows-1252 when reading a file with Rcpp. I need this since I switch between Linux and Windows environments, while the files are consistently in 1252 encoding.
How do I adapt this to work:
String readFile(std::string path) {
std::ifstream t(path.c_str());
if (!t.good()){
std::string error_msg = "Failed to open file ";
error_msg += "'" + path + "'";
::Rf_error(error_msg.c_str());
}
const std::locale& locale = std::locale("sv_SE.1252");
t.imbue(locale);
std::stringstream ss;
ss << t.rdbuf();
return ss.str();
}
The above fails with:
Error in eval(expr, envir, enclos) :
locale::facet::_S_create_c_locale name not valid
I've also tried with "Swedish_Sweden.1252" that is the default for my system to no avail. I've tried #include <boost/locale.hpp> but that seems to be unavailable in Rcpp (v 0.12.0)/BH boost (v. 1.58.0-1).
Update:
After digging a little deeper into this, I'm not sure whether the gcc (v. 4.6.3) in RTools (v. 3.3) is built with locale support; this SO question points to that possibility. If any argument other than "" or "C" works with std::locale(), it would be interesting to know. I've tried a few more alternatives, but nothing seems to work.
Fallback solution
I'm not entirely satisfied, but it seems that using base::iconv() fixes any issues with characters regardless of the original format, thanks to the from = "WINDOWS-1252" argument forcing the chars to be interpreted in the correct form. If we want to stay in Rcpp, we can simply do:
String readFile(std::string path) {
std::ifstream t(path.c_str());
if (!t.good()){
std::string error_msg = "Failed to open file ";
error_msg += "'" + path + "'";
::Rf_error(error_msg.c_str());
}
const std::locale& locale = std::locale("sv_SE.1252");
t.imbue(locale);
std::stringstream ss;
ss << t.rdbuf();
Rcpp::StringVector ret = ss.str();
Environment base("package:base");
Function iconv = base["iconv"];
ret = iconv(ret, Named("from","WINDOWS-1252"),Named("to","UTF8"));
return ret;
}
Note that it is preferable to wrap the function in R rather than getting the function from C++ and then calling it from there; it is both less code and improves performance by a factor of 2 (checked with microbenchmark):
readFileWrapper <- function(path){
ret <- readFile(path)
iconv(ret, from = "WINDOWS-1252", to = "UTF8")
}
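If calling back into R is not wanted, the conversion can also be done by hand, since Windows-1252 is a single-byte encoding: bytes below 0x80 are ASCII, bytes 0xA0 to 0xFF match the Unicode code points of the same value, and only 0x80 to 0x9F need a lookup table. The following is a sketch, not something from the posts above; cp1252_to_utf8 and append_utf8 are hypothetical names, and mapping the five undefined bytes to U+FFFD is an assumption. How the result gets tagged as UTF-8 on the R side is not covered here.

```cpp
#include <cstdint>
#include <string>

// Appends the UTF-8 encoding of a code point below U+10000 to `out`.
static void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Hypothetical helper: converts Windows-1252 bytes to a UTF-8 string.
std::string cp1252_to_utf8(const std::string& in) {
    // Code points for bytes 0x80..0x9F; 0 marks bytes Windows-1252 leaves undefined.
    static const std::uint16_t high[32] = {
        0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
        0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178};
    std::string out;
    out.reserve(in.size() * 2);
    for (const char c : in) {
        const unsigned char b = static_cast<unsigned char>(c);
        std::uint32_t cp = b;  // ASCII and 0xA0..0xFF map to themselves
        if (b >= 0x80 && b <= 0x9F) {
            cp = high[b - 0x80];
            if (cp == 0) cp = 0xFFFD;  // assumption: undefined bytes become U+FFFD
        }
        append_utf8(out, cp);
    }
    return out;
}
```

The string returned by ss.str() in readFile could be passed through such a function before being handed back, which avoids std::locale entirely.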

iterate over ini file on c++, probably using boost::property_tree::ptree?

My task is trivial - i just need to parse such file:
Apple = 1
Orange = 2
XYZ = 3950
But I do not know the set of available keys in advance. I parsed this file relatively easily using C#; here is the source code:
public static Dictionary<string, string> ReadParametersFromFile(string path)
{
string[] linesDirty = File.ReadAllLines(path);
string[] lines = linesDirty.Where(
str => !String.IsNullOrWhiteSpace(str) && !str.StartsWith("//")).ToArray();
var dict = lines.Select(s => s.Split(new char[] { '=' }))
.ToDictionary(s => s[0].Trim(), s => s[1].Trim());
return dict;
}
Now I just need to do the same thing in C++. I was thinking of using boost::property_tree::ptree, but it seems I cannot iterate over the INI file. It's easy to read an INI file:
boost::property_tree::ptree pt;
boost::property_tree::ini_parser::read_ini(path, pt);
But it does not seem possible to iterate over it; refer to this question: Boost program options - get all entries in section
The question is: what is the easiest way to write an analog of the C# code above in C++?
To answer your question directly: of course iterating a property tree is possible. In fact it's trivial:
#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/ini_parser.hpp>
#include <iostream>
#include <string>
int main()
{
using boost::property_tree::ptree;
ptree pt;
read_ini("input.txt", pt);
for (auto& section : pt)
{
std::cout << '[' << section.first << "]\n";
for (auto& key : section.second)
std::cout << key.first << "=" << key.second.get_value<std::string>() << "\n";
}
}
This results in output like:
[Cat1]
name1=100 #skipped
name2=200 \#not \\skipped
name3=dhfj dhjgfd
[Cat_2]
UsagePage=9
Usage=19
Offset=0x1204
[Cat_3]
UsagePage=12
Usage=39
Offset=0x12304
I've written a very full-featured Inifile parser using boost-spirit before:
Cross-platform way to get line number of an INI file where given option was found
It supports comments (single line and block), quotes, escapes etc.
(as a bonus, it optionally records the exact source locations of all the parsed elements, which was the subject of that question).
For your purpose, though, I think I'd recommend Boost Property Tree.
For the moment, I've simplified the problem a bit, leaving out the logic for comments (which looks broken to me anyway).
#include <map>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <utility>
typedef std::pair<std::string, std::string> entry;
// This isn't officially allowed (it's an overload, not a specialization) but is
// fine with every compiler of which I'm aware.
namespace std {
std::istream &operator>>(std::istream &is, entry &d) {
std::getline(is, d.first, '=');
std::getline(is, d.second);
return is;
}
}
int main() {
// open an input file.
std::ifstream in("myfile.ini");
// read the file into our map:
std::map<std::string, std::string> dict((std::istream_iterator<entry>(in)),
std::istream_iterator<entry>());
// Show what we read:
for (entry const &e : dict)
std::cout << "Key: " << e.first << "\tvalue: " << e.second << "\n";
}
Personally, I think I'd write the comment skipping as a filtering stream buffer, but for those unfamiliar with the C++ standard library, it's open to argument that this would be a somewhat roundabout solution. Another possibility would be a comment_iterator that skips the remainder of a line, starting from a designated comment delimiter. I don't like that as much, but it's probably simpler in some ways.
Note that the only code we really write here is to read one, single entry from the file into a pair. The istream_iterator handles pretty much everything from there. As such, there's little real point in writing a direct analog of your function -- we just initialize the map from the iterators, and we're done.
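For readers who do want a closer analog of the C# function, including the blank-line and // comment filtering and the trimming, a plain getline loop may be enough. This is a sketch and not from any of the answers above; read_parameters and trim are hypothetical names. Unlike the C# version, a repeated key overwrites the earlier value instead of throwing, the value is everything after the first '=', and lines without '=' are skipped (an assumption).

```cpp
#include <istream>
#include <map>
#include <sstream>  // std::istringstream, handy for trying this out without a file
#include <string>

// Hypothetical helper: strips leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Reads "key = value" lines, skipping blank lines and lines starting with "//".
std::map<std::string, std::string> read_parameters(std::istream& in) {
    std::map<std::string, std::string> dict;
    std::string line;
    while (std::getline(in, line)) {
        const std::string t = trim(line);
        if (t.empty() || t.compare(0, 2, "//") == 0) continue;  // blank or comment
        const auto eq = t.find('=');
        if (eq == std::string::npos) continue;  // assumption: ignore malformed lines
        dict[trim(t.substr(0, eq))] = trim(t.substr(eq + 1));
    }
    return dict;
}
```

It takes any std::istream, so it can be fed a std::ifstream opened on the real file or a std::istringstream in a test.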

Reading a string from a file in C++

I'm trying to store strings directly into a file to be read later in C++ (basically for the full scope I'm trying to store an object array with string variables in a file, and those string variables will be read through something like object[0].string). However, every time I try to read the string variables the system gives me a jumbled up error. The following code is a basic part of what I'm trying.
#include <iostream>
#include <fstream>
using namespace std;
/*
//this is run first to create the file and store the string
int main(){
string reed;
reed = "sees";
ofstream ofs("filrsee.txt", ios::out|ios::binary);
ofs.write(reinterpret_cast<char*>(&reed), sizeof(reed));
ofs.close();
}*/
//this is run after that to open the file and read the string
int main(){
string ghhh;
ifstream ifs("filrsee.txt", ios::in|ios::binary);
ifs.read(reinterpret_cast<char*>(&ghhh), sizeof(ghhh));
cout<<ghhh;
ifs.close();
return 0;
}
The second part is where things go haywire when I try to read it.
Sorry if it's been asked before, I've taken a look around for similar questions but most of them are a bit different from what I'm trying to do or I don't really understand what they're trying to do (still quite new to this).
What am I doing wrong?
You are reading from a file and trying to put the data in the string structure itself, overwriting it, which is plain wrong.
As can be verified at http://www.cplusplus.com/reference/iostream/istream/read/ , the types you used were wrong, and you know it because you had to force the std::string into a char * using a reinterpret_cast.
C++ Hint: using a reinterpret_cast in C++ is (almost) always a sign you did something wrong.
Why is it so complicated to read a file?
A long time ago, reading a file was easy. In some Basic-like language, you used the function LOAD, and voilà!, you had your file.
So why can't we do it now?
Because you don't know what's in a file.
It could be a string.
It could be a serialized array of structs with raw data dumped from memory.
It could even be a live stream, that is, a file which is appended continuously (a log file, the stdin, whatever).
You could want to read the data word by word
... or line by line...
Or the file is so large it doesn't fit in a string, so you want to read it by parts.
etc..
The most generic solution is to read the file (thus, in C++, an fstream) byte by byte using the function get (see http://www.cplusplus.com/reference/iostream/istream/get/), convert the bytes yourself into the type you expect, and stop at EOF.
The std::istream interface has all the functions you need to read the file in different ways (see http://www.cplusplus.com/reference/iostream/istream/), and even then, there is an additional non-member function for std::string to read a file until a delimiter is found (usually "\n", but it could be anything, see http://www.cplusplus.com/reference/string/getline/)
But I want a "load" function for a std::string!!!
Ok, I get it.
We assume that what you put in the file is the content of a std::string, but keeping it compatible with a C-style string, that is, the \0 character marks the end of the string (if not, we would need to load the file until reaching the EOF).
And we assume you want the whole file content fully loaded once the function loadFile returns.
So, here's the loadFile function:
#include <iostream>
#include <fstream>
#include <string>
bool loadFile(const std::string & p_name, std::string & p_content)
{
// We create the file object, saying I want to read it
std::fstream file(p_name.c_str(), std::fstream::in) ;
// We verify if the file was successfully opened
if(file.is_open())
{
// We use the standard getline function to read the file into
// a std::string, stopping only at "\0"
std::getline(file, p_content, '\0') ;
// We return the success of the operation
return ! file.bad() ;
}
// The file was not successfully opened, so returning false
return false ;
}
If you are using a C++11 enabled compiler, you can add this overloaded function, which will cost you nothing (while in C++03, barring optimizations, it could have cost you a temporary object):
std::string loadFile(const std::string & p_name)
{
std::string content ;
loadFile(p_name, content) ;
return content ;
}
Now, for completeness' sake, I wrote the corresponding saveFile function:
bool saveFile(const std::string & p_name, const std::string & p_content)
{
std::fstream file(p_name.c_str(), std::fstream::out) ;
if(file.is_open())
{
file.write(p_content.c_str(), p_content.length()) ;
return ! file.bad() ;
}
return false ;
}
And here, the "main" I used to test those functions:
int main()
{
const std::string name(".//myFile.txt") ;
const std::string content("AAA BBB CCC\nDDD EEE FFF\n\n") ;
{
const bool success = saveFile(name, content) ;
std::cout << "saveFile(\"" << name << "\", \"" << content << "\")\n\n"
<< "result is: " << success << "\n" ;
}
{
std::string myContent ;
const bool success = loadFile(name, myContent) ;
std::cout << "loadFile(\"" << name << "\", \"" << content << "\")\n\n"
<< "result is: " << success << "\n"
<< "content is: [" << myContent << "]\n"
<< "content ok is: " << (myContent == content)<< "\n" ;
}
}
More?
If you want to do more than that, then you will need to explore the C++ IOStreams library API, at http://www.cplusplus.com/reference/iostream/
You can't use std::istream::read() to read into a std::string object. What you could do is to determine the size of the file, create a string of suitable size, and read the data into the string's character array:
std::string str;
std::ifstream file("whatever");
std::string::size_type size = determine_size_of(file);
str.resize(size);
file.read(&str[0], size);
The tricky bit is determining the size the string should have. Given that the character sequence may get translated while reading, e.g., because line end sequences are transformed, this pretty much amounts to reading the string in the general case. Thus, I would recommend against doing it this way. Instead, I would read the string using something like this:
std::string str;
std::ifstream file("whatever");
if (std::getline(file, str, '\0')) {
...
}
This works OK for text strings and is about as fast as it gets on most systems. If the file can contain null characters, e.g., because it contains binary data, this doesn't quite work. If this is the case, I'd use an intermediate std::ostringstream:
std::ostringstream out;
std::ifstream file("whatever");
out << file.rdbuf();
std::string str = out.str();
A string object is not a mere char array, the line
ifs.read(reinterpret_cast<char*>(&ghhh), sizeof(ghhh));
is probably the root of your problems.
Try applying the following changes:
char ghhh[BUFF_LEN];
....
ifs.read(ghhh, BUFF_LEN);
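Since the question mentions an array of objects with string members, one common way to store several strings in one binary file is to write each string's length first and then its characters. The following is a sketch under assumptions, not something the answers above propose: write_string and read_string are hypothetical names, the length is stored as a 32-bit integer in the host's byte order, and strings longer than that are not handled. The reinterpret_cast here is applied to a plain integer, which is the legitimate use, unlike casting a std::string object.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>  // std::stringstream, handy for trying this out in memory
#include <string>

// Writes the length as a fixed-width integer, then the raw characters.
void write_string(std::ostream& os, const std::string& s) {
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    os.write(reinterpret_cast<const char*>(&len), sizeof len);
    os.write(s.data(), len);
}

// Reads one string written by write_string; returns false on EOF or error.
bool read_string(std::istream& is, std::string& s) {
    std::uint32_t len = 0;
    if (!is.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    s.resize(len);
    if (len == 0) return true;
    return static_cast<bool>(is.read(&s[0], len));
}
```

With files, both streams would be opened with ios::binary as in the question, and each object's string members written and read back in the same order.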

Declare static variable inside a function call in C++

I have a C++ program with many thousands of string literals in the code which need to be translated, for example:
statusBar->Print( "My Message" );
I wrapped the string literals with a function which looks up the value in a dictionary and returns the translated version:
statusBar->Print( Translated( "My Message" ) );
The problem is that after profiling I've discovered that doing this look up all over the code is a performance problem. What I'd like to do is change lines like that to:
static const char * translatedMessage5 = Translated( "My Message" );
statusBar->Print( translatedMessage5 );
But due to the many thousands of instances of this in the code, it's going to be error prone (and a bit of a maintenance nightmare). I was hoping that I could turn Translated into a macro which declared the static variable in-line. This obviously doesn't work. Anyone have a better idea?
I/O time needed to print your message should be several orders of magnitude more than any dictionary lookup time. If this is not the case, you are doing something wrong.
Tried and tested software is available that does what you need. I suggest you either study GNU Gettext, which is used by every other FOSS project out there, or just use it in your program instead of a homebrew solution.
EDIT: With C++0x it is possible to do what you want, but still consider using GNU Gettext as your real l10n engine. Here's some proof-of-concept little code:
#include <iostream>
const char* realTranslate(const char* txt)
{
std::cout << "*** translated " << txt << std::endl;
return txt; // use a real translation here such as gnu gettext
}
#define Translate(txt) \
(([]()->const char* \
{static const char* out = realTranslate(txt); return out;})())
int main ()
{
for (int i = 0; i < 10; ++i)
{
std::cout << Translate("This is a message") << std::endl;
std::cout << Translate("This is a message") << std::endl;
std::cout << Translate("This is another message") << std::endl;
}
}
I'm not sure what the real C++ standard is going to specify, but under gcc-4.6 the realTranslate() function is called 3 times.
Can you change to unique error codes and index them into vector? This simplifies the code and the lookup, and adding additional error messages becomes trivial. Also, ensures error messages added in this manner are more visible (externally to this application, for example -- could easily be published to a "User Guide" or similar).
#include <string>
#include <vector>
enum ErrorMessages
{
my_msg,
my_other_msg,
...
msg_high
};
std::vector<std::string> error_messages;
void init()
{
error_messages.resize(msg_high);
error_messages[my_msg] = "My Message";
error_messages[my_other_msg] = "My Other Message";
...
}
const char* Translated(const ErrorMessages msg)
{
return error_messages[msg].c_str();
}
void f()
{
statusBar->Print(Translated(my_msg));
}
This might not help you here, but what you could do is declare a std::map that would hold a map of hash -> text pairs. The question here is if calculating hash code on a string will be same level of effort as translating it, and this I don't know.
char * Translate(char *source)
{
static std::map<int, char*> results;
int hashcode = CalculateHashCode(source);
std::map<int, char*>::const_iterator it = results.find( hashcode );
if ( it != results.end() )
{
return it->second;
}
... code to translate ...
results[ hashcode ] = translated;
return translated;
}
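A variation on the caching idea that avoids hashing the characters at all is to key the cache on the address of the string literal. This is only a sketch and an assumption-laden one: slow_translate and g_lookups are hypothetical stand-ins for the real dictionary lookup, the function is not thread-safe, and it is only valid when every argument is a string literal (or otherwise lives for the whole program), since a reused address would return a stale entry.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the real, slow dictionary lookup.
int g_lookups = 0;
std::string slow_translate(const char* text) {
    ++g_lookups;
    return std::string("[") + text + "]";
}

// Caches translations keyed by the literal's address, not its contents.
const char* Translated(const char* literal) {
    static std::unordered_map<const char*, std::string> cache;
    auto it = cache.find(literal);
    if (it == cache.end())
        it = cache.emplace(literal, slow_translate(literal)).first;
    // Elements of an unordered_map stay at the same address when the table
    // rehashes, so the returned pointer remains usable.
    return it->second.c_str();
}
```

Each call still pays for one pointer-keyed lookup, so whether this is fast enough compared with the static-variable approach would need profiling.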

C++: How to extract a string from RapidXml

In my C++ program I want to parse a small piece of XML, insert some nodes, then extract the new XML (preferably as a std::string).
RapidXml has been recommended to me, but I can't see how to retrieve the XML back as a text string.
(I could iterate over the nodes and attributes and build it myself, but surely there's a build in function that I am missing.)
Thank you.
Although the documentation is poor on this topic, I managed to get some working code by looking at the source. Note that the output is missing the XML header, which normally contains important information. Here is a small example program that does what you are looking for using rapidxml:
#include <iostream>
#include <sstream>
#include "rapidxml/rapidxml.hpp"
#include "rapidxml/rapidxml_print.hpp"
int main(int argc, char* argv[]) {
char xml[] = "<?xml version=\"1.0\" encoding=\"latin-1\"?>"
"<book>"
"</book>";
//Parse the original document
rapidxml::xml_document<> doc;
doc.parse<0>(xml);
std::cout << "Name of my first node is: " << doc.first_node()->name() << "\n";
//Insert something
rapidxml::xml_node<> *node = doc.allocate_node(rapidxml::node_element, "author", "John Doe");
doc.first_node()->append_node(node);
std::stringstream ss;
ss <<*doc.first_node();
std::string result_xml = ss.str();
std::cout <<result_xml<<std::endl;
return 0;
}
Use print function (found in rapidxml_print.hpp utility header) to print the XML node contents to a stringstream.
rapidxml::print requires an output iterator to generate the output, so a character array works with it. But this is risky because I cannot know whether a fixed-length array (like 2048 bytes) is long enough to hold all the content of the XML.
The right way to do this is to pass in an output iterator of a string stream, which allows the buffer to expand as the XML is dumped into it.
My code is like below:
std::stringstream stream;
std::ostream_iterator<char> iter(stream);
rapidxml::print(iter, doc, rapidxml::print_no_indenting);
printf("%s\n", stream.str().c_str());
printf("len = %zu\n", stream.str().size());
If you do build XML yourself, don't forget to escape the special characters. This tends to be overlooked, but can cause some serious headaches if it is not implemented:
< &lt;
> &gt;
& &amp;
" &quot;
' &apos;
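The five replacements above can be applied in a single pass over the text. A minimal sketch, assuming the text is plain char data and that nothing in it is already escaped; escape_xml is a hypothetical name and is not part of RapidXml:

```cpp
#include <string>

// Hypothetical helper: replaces the five XML special characters with entities.
std::string escape_xml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (const char ch : text) {
        switch (ch) {
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '&':  out += "&amp;";  break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out.push_back(ch);
        }
    }
    return out;
}
```

Working character by character means an ampersand is only ever escaped once, which is the usual pitfall with chained search-and-replace calls.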
Here's how to print a node to a string straight from the RapidXML Manual:
xml_document<> doc; // character type defaults to char
// ... some code to fill the document
// Print to stream using operator <<
std::cout << doc;
// Print to stream using print function, specifying printing flags
print(std::cout, doc, 0); // 0 means default printing flags
// Print to string using output iterator
std::string s;
print(std::back_inserter(s), doc, 0);
// Print to memory buffer using output iterator
char buffer[4096]; // You are responsible for making the buffer large enough!
char *end = print(buffer, doc, 0); // end contains pointer to character after last printed character
*end = 0; // Add string terminator after XML
If you aren't yet committed to Rapid XML, I can recommend some alternative libraries:
Xerces - This is probably the de facto C++ implementation.
XMLite - I've had some luck with this minimal XML implementation. See the article at http://www.codeproject.com/KB/recipes/xmlite.aspx
Use static_cast<>
Ex:
rapidxml::xml_document<> doc;
std::string strBuff;
doc.parse<0>(xml); // parse first; before parsing, first_node() returns a null pointer
rapidxml::xml_node <> * root_node = doc.first_node();
.
.
.
strBuff = static_cast<std::string>(root_node->first_attribute("attribute_name")->value());
The following is very easy:
std::string s;
print(back_inserter(s), doc, 0);
cout << s;
You only need to include "rapidxml_print.hpp" header in your source code.