string conversion c++ - c++

I have a string whose first element is, for example, 'a'. I have already declared a variable called a (so int a = 1, for example). My question is: how can I convert the whole string to numbers (a=1, b=2, c=3, ..., z=26)? Example:
string str="hello"; this has to be changed to "85121215" and then converted to the number 85121215.

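#include <algorithm>
#include <functional>
#include <iterator>
#include <sstream>
#include <stdexcept>
#include <string>
// transformation itself doesn't care what encoding we use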
std::string transform_string(std::string const &in, std::function<int(char)> op)
{
std::ostringstream out;
std::transform(in.begin(), in.end(),
std::ostream_iterator<int>(out),
op);
return out.str();
}
// the per-character mapping is easy to isolate
int ascii_az_map(char ch)
{
if (ch < 'a' || ch > 'z') {
std::ostringstream error;
error << "character '" << ch << "'=" << (int)ch
<< " not in range a-z";
throw std::out_of_range(error.str());
}
return 1 + ch - 'a';
}
// so we can support other encodings if necessary
// NB. ebcdic_to_ascii isn't actually implemented here
int ebcdic_az_map(char ch)
{
return ascii_az_map(ebcdic_to_ascii(ch));
}
// and even detect the platform encoding automatically (w/ thanks to Phresnel)
// (you can still explicitly select a non-native encoding if you want)
int default_az_map(char ch)
{
#if ('b'-'a' == 1) && ('j' - 'i' == 1)
return ascii_az_map(ch);
#elif ('j'-'i' == 8)
return ebcdic_az_map(ch);
#else
#error "unknown character encoding"
#endif
}
// use as:
std::string str = "hello";
std::string trans = transform_string(str, ascii_az_map);
// OR ... transform_string(str, ebcdic_az_map);
Note that since the per-character mapping is completely isolated, it's really easy to change the mapping to a lookup table, support different encodings etc.
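The question also asks for the final numeric value, not only the digit string. A minimal sketch of that last step, assuming an ASCII-compatible character set; letters_to_digits and letters_to_number are hypothetical names, not part of the answer above:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: maps each letter a-z (either case) to its 1-based
// alphabet position and concatenates the decimal digits.
// Assumes an ASCII-compatible execution character set.
std::string letters_to_digits(const std::string& in)
{
    std::string out;
    for (char ch : in) {
        if (ch >= 'A' && ch <= 'Z') ch = static_cast<char>(ch - 'A' + 'a');
        if (ch < 'a' || ch > 'z')
            throw std::out_of_range("character not in range a-z");
        out += std::to_string(ch - 'a' + 1);
    }
    return out;
}

// Converts the digit string to a number. std::stoull throws
// std::out_of_range once the value no longer fits in unsigned long long.
unsigned long long letters_to_number(const std::string& in)
{
    return std::stoull(letters_to_digits(in));
}
```

Keep in mind that the digit string grows by one or two digits per letter, so longer words will not fit into any built-in integer type.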

Your definition is a bit small:
"hello" = "85121215"
h = 8
e = 5
l = 12
o = 15
I assume you mean that
a = 1
b = 2
...
z = 26
in which case it is not that hard:
std::string meh_conv(char c) {
switch(c) { // (or `switch(tolower(c))` and save some typing)
case 'a': case 'A': return "1";
case 'b': case 'B': return "2";
....
case 'z': case 'Z': return "26";
....
// insert other special characters here
}
throw std::range_error("meh");
}
std::string meh_conv(std::string const &src) {
std::string dest;
for (const auto c : src)
dest += meh_conv(c);
return dest;
}
or use std::transform():
#include <algorithm>
std::string dest;
std::transform(src.begin(), src.end(), std::back_inserter(dest),
meh_conv);
(this doesn't work as is, because the incoming and outgoing element types differ: meh_conv returns a std::string, while back_inserter(dest) appends single chars)
Addendum.
You possibly want to parametrize the replacement map:
std::map<char, std::string> repl;
repl['a'] = repl['A'] = "0";
repl[' '] = " ";
std::string src = "hello";
std::string dest;
for (const auto c : src) dest += repl[c];
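One thing to watch with the map version: operator[] inserts an empty entry for every character that has no replacement, so such characters silently disappear from the output. A small sketch that looks entries up with find() instead and copies unknown characters through unchanged; apply_replacements is a hypothetical name, and passing unknown characters through is my assumption about the desired behaviour:

```cpp
#include <map>
#include <string>

// Hypothetical helper: applies a replacement table without modifying it.
// Characters that have no entry are copied through unchanged.
std::string apply_replacements(const std::map<char, std::string>& repl,
                               const std::string& src)
{
    std::string dest;
    for (char c : src) {
        auto it = repl.find(c);
        if (it != repl.end())
            dest += it->second;
        else
            dest += c;
    }
    return dest;
}
```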

I wrote you a simple example. It creates a map that contains the a-1, b-2, c-3 ... pairs, then concatenates the values using a stringstream:
#include <iostream>
#include <map>
#include <sstream>
#include <string>
int main()
{
std::string str = "abc";
std::map<char,int> dictionary;
int n = 1;
for(char c='a'; c<='z'; c++)
dictionary.insert(std::pair<char,int>(c,n++));
//EDIT if you want uppercase characters too:
n=1;
for(char c='A'; c<='Z'; c++)
dictionary.insert(std::pair<char,int>(c,n++));
std::stringstream strstream;
for(int i=0; i<str.size(); i++)
strstream<<dictionary[str[i]];
std::string numbers = strstream.str();
std::cout<<numbers;
return 0;
}
C++ experts are probably going to kill me for this solution, but it works ;)

Easy approach:
take each character modulo 96 (the ASCII value just before 'a'). For lowercase ASCII letters this always gives values in the range 1-26.
int value;
string s;
cin>>s;
for(int i=0; i<s.size();i++){
value = s[i]%96;
cout<<value<<endl;
}
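The modulo trick only holds for lowercase ASCII letters ('A' % 96 is 65, for instance). A hedged sketch that handles both cases explicitly and returns 0 for anything that is not a letter; letter_index is a hypothetical name and the 0 sentinel is my own choice:

```cpp
// Hypothetical helper: 1-26 for letters of either case, 0 for anything else.
// Assumes the letters a-z and A-Z are contiguous, as they are in ASCII.
int letter_index(char ch)
{
    if (ch >= 'a' && ch <= 'z') return ch - 'a' + 1;
    if (ch >= 'A' && ch <= 'Z') return ch - 'A' + 1;
    return 0;
}
```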

Related

How to replace "pi" by "3.14"?

How to replace all "pi" from a string by "3.14"? Example: INPUT = "xpix" ___ OUTPUT = "x3.14x" for a string, not character array.
This doesn't work:
#include<iostream>
using namespace std;
void replacePi(string str)
{
if(str.size() <=1)
return ;
replacePi(str.substr(1));
int l = str.length();
if(str[0]=='p' && str[1]=='i')
{
for(int i=l;i>1;i--)
str[i+2] = str[i];
str[0] = '3';
str[1] = '.';
str[2] = '1';
str[3] = '4';
}
}
int main()
{
string s;
cin>>s;
replacePi(s);
cout << s << endl;
}
There is a ready-to-use function in the C++ library. It is called std::regex_replace. You can read its documentation in the CPP Reference.
Since it uses regexes it is very powerful. The disadvantage is that it may be a little too slow at runtime for some use cases. But for your example, this does not matter.
So, a common C++ solution would be:
#include <iostream>
#include <string>
#include <regex>
int main() {
// The test string
std::string input{ "Pi is a magical number. Pi is used in many places. Go for Pi" };
// Use simply the replace function
std::string output = std::regex_replace(input, std::regex("Pi"), "3.14");
// Show the output
std::cout << output << "\n";
}
But my guess is that you are learning C++ and the teacher gave you a task and expects a solution without using elements from the std C++ library. So, a hands-on solution.
This can be implemented best with a temporary string. You check the original string character by character. If the characters do not belong to Pi, copy them as they are to the new string. Otherwise, copy 3.14 to the new string.
At the end, overwrite the original string with the temp string.
Example:
#include <iostream>
#include <string>
using namespace std;
void replacePi(string& str) {
// Our temporary
string temp;
// Iterate over all characters in the source string, including the last one
for (size_t i = 0; i < str.length(); ++i) {
// Check for Pi in source string (only if a next character exists)
if (i + 1 < str.length() and str[i] == 'P' and str[i + 1] == 'i') {
// Add replacement string to temp
temp += "3.14";
// We consumed two characters, P and i, so increase index one more time
++i;
}
else {
// Take over normal character
temp += str[i];
}
}
str = temp;
}
// Test code
int main() {
// The test string
std::string str{ "Pi is a magical number. Pi is used in many places. Go for Pi" };
// Do the replacement
replacePi(str);
// Show result
std::cout << str << '\n';
}
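What you need is string::find and string::replace. Here is an example:
size_t replace_all(std::string& str, const std::string& from, const std::string& to)
{
size_t count = 0;
if (from.empty()) return count; // an empty pattern would never stop matching
std::string::size_type pos = 0;
// resume after each replacement so text inside `to` is never matched again
while((pos=str.find(from, pos)) != str.npos)
{
str.replace(pos, from.length(), to);
pos += to.length();
count++;
}
return count;
}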
void replacePi(std::string& str)
{
replace_all(str, "pi", "3.14");
}

Convert vector<string> to unsigned char array in C++

I have a string vector that holds some values. These values are supposed to be hex bytes but are being stored as strings inside this vector.
The bytes were read from inside a text file actually, something like this:
(contents of the text file)
<jpeg1>
0xFF,0xD8,0xFF,0xE0,0x00,0x10,0x4A,0x46,0x49,0x46,0x00,0x01,0x01,0x01,0x00,0x60
</jpeg1>
So far, my code starts reading at the line after the <jpeg1> tag and continues until the </jpeg1> tag; then, using the comma ',' as a delimiter, it stores the bytes in the string vector.
After splitting the string, the vector currently stores the values like this:
vector<string> myString = {"0xFF", "0xD8", "0xFF", "0xE0", "0x00", "0x10", "0x4A", "0x46", "0x49", "0x46", "0x00", "0x01", "0x01", "0x01", "0x00", "0x60"};
and if i print this i get the following:
0: 0xFF
1: 0xD8
2: 0xFF
3: 0xE0
4: 0x00
5: 0x10
6: 0x4A
7: 0x46
8: 0x49
9: 0x46
What I want is to store these bytes in an unsigned char array, such that each element is treated as a hex byte and not as a string value.
Preferably something like this :
unsigned char myHexArray[] = {0xFF,0xD8,0xFF,0xE0,0x00,0x10,0x4A,0x46,0x49,0x46,0x00,0x01,0x01,0x01,0x00,0x60};
if i print this i get:
0:  
1: ╪
2:  
3: α
4:
5:
6: J
7: F
8: I
9: F
Solved!
Thanks for your help guys, so far "ranban282" solution has worked for me, I'll try solutions provided by other users as well.
I wouldn't even go through the std::vector<std::string> stage, you don't need it and it wastes a lot of allocations for no good reason; just parse the string to bytes "online".
If you already have an istream for your data, you can parse straight from it, although I have had terrible performance with that approach.
// is is some derived class of std::istream
std::vector<unsigned char> ret;
while(is) {
int val = 0;
is>>std::hex>>val;
if(!is) {
break; // failed conversion; remember to clean up the stream
// if you need it later!
}
ret.push_back(val);
if(is.get()!=',') break;
}
If instead you have it in a string - as often happens when extracting data from an XML file, you can parse it either using istringstream and the code above (one extra string copy + generally quite slow), or parse it straight from the string using e.g. sscanf with %i; say that your string is in a const char *sz:
std::vector<unsigned char> ret;
for(; *sz; ++sz) {
int read = 0;
int val = 0;
if(sscanf(sz, " %i %n", &val, &read)!=1) break; // format error or end of input
ret.push_back(val);
sz += read;
if(*sz != ',') break; // end of input, or format error
}
// now ret contains the decoded string
If you are sure that the strings are always hexadecimal, with or without the 0x prefix, and that no whitespace is present, strtol is a bit more efficient and IMO nicer to use:
std::vector<unsigned char> ret;
for( ;*sz;++sz) {
char *endp;
long val = strtol(sz, &endp, 16);
if(endp==sz) break; // format error
sz = endp;
ret.push_back(val);
if(*sz != ',') break; // end of input, or format error
}
If C++17 is available, you can use std::from_chars instead of strtol to cut out the locale bullshit, which can break your parsing function (although that's more typical for floating point parsing) and slow it down for no good reason.
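A rough sketch of what the std::from_chars route could look like, assuming C++17 and comma-separated input without whitespace. parse_hex_bytes is a hypothetical name, and rejecting values above 0xFF is my own addition:

```cpp
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper (requires C++17): parses "0xFF,0xD8,..." into bytes.
// Returns false on a format error or a value above 0xFF.
bool parse_hex_bytes(const std::string& text, std::vector<unsigned char>& out)
{
    const char* p = text.data();
    const char* const end = p + text.size();
    while (p != end) {
        // std::from_chars does not accept a "0x" prefix, so skip it by hand.
        if (end - p >= 2 && p[0] == '0' && (p[1] == 'x' || p[1] == 'X'))
            p += 2;
        unsigned value = 0;
        auto res = std::from_chars(p, end, value, 16);
        if (res.ec != std::errc() || value > 0xFF)
            return false;
        out.push_back(static_cast<unsigned char>(value));
        p = res.ptr;
        if (p == end) break;         // consumed everything
        if (*p != ',') return false; // anything but a separator is an error
        ++p;
    }
    return true;
}
```

Unlike strtol, from_chars works on a [first, last) range, so it does not need a null-terminated string and never looks at the locale.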
OTOH, if the performance is critical but from_chars is not available (or if it's available but you measured that it's slow), it may be advantageous to hand roll the whole parser.
auto conv_digit = [](char c) -> int {
if(c>='0' && c<='9') return c-'0';
// notice: technically not guaranteed to work;
// in practice it'll work on anything that doesn't use EBCDIC
if(c>='A' && c<='F') return c-'A'+10;
if(c>='a' && c<='f') return c-'a'+10;
return -1;
};
std::vector<unsigned char> ret;
for(; *sz; ++sz) {
while(*sz == ' ') ++sz;
if(*sz!='0' || (sz[1]!='x' && sz[1]!='X')) break; // format error
sz+=2;
int val = 0;
int digit = -1;
const char *sz_before = sz;
while((digit = conv_digit(*sz)) >= 0) {
val=val*16+digit; // or, if you prefer: val = val<<4 | digit;
++sz;
}
if(sz==sz_before) break; // format error
ret.push_back(val);
while(*sz == ' ') ++sz;
if(*sz != ',') break; // end of input, or format error
}
If you're using C++11, you can use the stoi function.
vector<string> myString = {"0xFF", "0xD8", "0xFF", "0xE0", "0x00", "0x10", "0x4A", "0x46", "0x49", "0x46", "0x00", "0x01", "0x01", "0x01", "0x00", "0x60"};
unsigned char* myHexArray=new unsigned char[myString.size()];
for (unsigned i=0;i<myString.size();i++)
{
myHexArray[i]=stoi(myString[i],NULL,0);
}
for (unsigned i=0;i<myString.size();i++)
{
cout<<myHexArray[i]<<endl;
}
The function stoi() was introduced in C++11. To compile with gcc, pass the flag -std=c++11.
If you're using an older version of C++, you can use strtol instead of stoi. Note that you need to pass the string as a C string (via c_str()).
myHexArray[i]=strtol(myString[i].c_str(),NULL,0);
You can use std::stoul on each of your values and build your array using another std::vector like this:
std::vector<std::string> vs {"0xFF", "0xD8", "0xFF" ...};
std::vector<unsigned char> vc;
vc.reserve(vs.size());
for(auto const& s: vs)
vc.push_back((unsigned char) std::stoul(s, 0, 0));
Now you can access your array with:
vc.data(); // <-- pointer to unsigned char array
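The cast to unsigned char silently truncates anything larger than 0xFF, and std::stoul stops at the first character it cannot parse. If the input is not fully trusted, a stricter per-token conversion might look like this (parse_byte is a hypothetical helper, not part of the answer above):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts one token such as "0xFF" to a byte.
// Base 0 lets std::stoul detect the "0x" prefix on its own.
unsigned char parse_byte(const std::string& token)
{
    std::size_t used = 0;
    unsigned long value = std::stoul(token, &used, 0); // throws if no digits
    if (used != token.size())
        throw std::invalid_argument("trailing characters in: " + token);
    if (value > 0xFF)
        throw std::out_of_range("value does not fit in a byte: " + token);
    return static_cast<unsigned char>(value);
}
```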
Here's a complete solution including a test and a rudimentary parser (for simplicity, it assumes that the xml tags are on their own lines).
#include <string>
#include <sstream>
#include <regex>
#include <iostream>
#include <iomanip>
#include <iterator>
#include <vector>
#include <cstdint>
#include <stdexcept>
const char test_data[] =
R"__(<jpeg1>
0xFF,0xD8,0xFF,0xE0,0x00,0x10,0x4A,0x46,0x49,0x46,0x00,0x01,0x01,0x01,0x00,0x60,
0x12,0x34,0x56,0x78,0x9a,0xbc,0xde,0xf0
</jpeg1>)__";
struct Jpeg
{
std::string name;
std::vector<std::uint8_t> data;
};
std::ostream& operator<<(std::ostream& os, const Jpeg& j)
{
os << j.name << " : ";
const char* sep = " ";
os << '[';
for (auto b : j.data) {
os << sep << std::hex << std::setfill('0') << std::setw(2) << std::uint32_t(b);
sep = ", ";
}
return os << " ]";
}
template<class OutIter>
OutIter read_bytes(OutIter dest, std::istream& source)
{
std::string buffer;
while (std::getline(source, buffer, ','))
{
*dest++ = static_cast<std::uint8_t>(std::stoul(buffer, 0, 16));
}
return dest;
}
Jpeg read_jpeg(std::istream& is)
{
auto result = Jpeg {};
static const auto begin_tag = std::regex("<jpeg(.*)>");
static const auto end_tag = std::regex("</jpeg(.*)>");
std::string line, hex_buffer;
if(not std::getline(is, line)) throw std::runtime_error("end of file");
std::smatch match;
if (not std::regex_match(line, match, begin_tag)) throw std::runtime_error("not a <jpeg_>");
result.name = match[1];
while (std::getline(is, line))
{
if (std::regex_match(line, match, end_tag)) { break; }
std::istringstream hexes { line };
read_bytes(std::back_inserter(result.data), hexes);
}
return result;
}
int main()
{
std::istringstream input_stream(test_data);
auto jpeg = read_jpeg(input_stream);
std::cout << jpeg << std::endl;
}
expected output:
1 : [ ff, d8, ff, e0, 00, 10, 4a, 46, 49, 46, 00, 01, 01, 01, 00, 60, 12, 34, 56, 78, 9a, bc, de, f0 ]

Convert escape sequences the way a compiler would [duplicate]

What's the easiest way to convert a C++ std::string to another std::string, which has all the unprintable characters escaped?
For example, for the string of two characters [0x61,0x01], the result string might be "a\x01" or "a%01".
Take a look at the Boost's String Algorithm Library. You can use its is_print classifier (together with its operator! overload) to pick out nonprintable characters, and its find_format() functions can replace those with whatever formatting you wish.
#include <iostream>
#include <boost/format.hpp>
#include <boost/algorithm/string.hpp>
struct character_escaper
{
template<typename FindResultT>
std::string operator()(const FindResultT& Match) const
{
std::string s;
for (typename FindResultT::const_iterator i = Match.begin();
i != Match.end();
i++) {
s += str(boost::format("\\x%02x") % static_cast<int>(*i));
}
return s;
}
};
int main (int argc, char **argv)
{
std::string s("a\x01");
boost::find_format_all(s, boost::token_finder(!boost::is_print()), character_escaper());
std::cout << s << std::endl;
return 0;
}
Assumes the execution character set is a superset of ASCII and CHAR_BIT is 8. For the OutIter pass a back_inserter (e.g. to a vector<char> or another string), ostream_iterator, or any other suitable output iterator.
template<class OutIter>
OutIter write_escaped(std::string const& s, OutIter out) {
*out++ = '"';
for (std::string::const_iterator i = s.begin(), end = s.end(); i != end; ++i) {
unsigned char c = *i;
if (' ' <= c and c <= '~' and c != '\\' and c != '"') {
*out++ = c;
}
else {
*out++ = '\\';
switch(c) {
case '"': *out++ = '"'; break;
case '\\': *out++ = '\\'; break;
case '\t': *out++ = 't'; break;
case '\r': *out++ = 'r'; break;
case '\n': *out++ = 'n'; break;
default:
char const* const hexdig = "0123456789ABCDEF";
*out++ = 'x';
*out++ = hexdig[c >> 4];
*out++ = hexdig[c & 0xF];
}
}
}
*out++ = '"';
return out;
}
Assuming that "easiest way" means short and yet easily understandable while not depending on any other resources (like libs) I would go this way:
#include <cctype>
#include <sstream>
// s is our escaped output string
std::string s = "";
// loop through all characters
for(char c : your_string)
{
// check if a given character is printable
// the cast is necessary to avoid undefined behaviour
if(isprint((unsigned char)c))
s += c;
else
{
std::stringstream stream;
// if the character is not printable
// we'll convert it to a hex string using a stringstream
// note that since char may be signed we have to cast it to unsigned char first
stream << std::hex << (unsigned int)(unsigned char)(c);
std::string code = stream.str();
s += std::string("\\x")+(code.size()<2?"0":"")+code;
// alternatively for URL encodings:
//s += std::string("%")+(code.size()<2?"0":"")+code;
}
}
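The same idea wrapped into a function, using snprintf instead of a stringstream so the zero padding comes from the format string. This is only a sketch; escape_unprintable is a hypothetical name, and what counts as printable follows the current C locale:

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: returns a copy of `in` with every non-printable
// byte written as \xNN. Printability follows the current C locale.
std::string escape_unprintable(const std::string& in)
{
    std::string out;
    for (char c : in) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (std::isprint(uc)) {
            out += c;
        } else {
            char buf[5]; // backslash, 'x', two hex digits, terminator
            std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(uc));
            out += buf;
        }
    }
    return out;
}
```

Note that this does not escape the backslash itself, so the output cannot be decoded unambiguously; escape that as well if you need a round trip.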
One person's unprintable character is another's multi-byte character. So you'll have to define the encoding before you can work out what bytes map to what characters, and which of those is unprintable.
Have you seen the article about how to Generate Escaped String Output Using Spirit.Karma?

Remove extra white spaces in C++

I tried to write a script that removes extra white spaces but I didn't manage to finish it.
Basically I want to transform "abc    sssd g g sdg    gg  gf" into "abc sssd g g sdg gg gf".
In languages like PHP or C#, it would be very easy, but not in C++, I see. This is my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <cstring>
#include <unistd.h>
#include <string.h>
char* trim3(char* s) {
int l = strlen(s);
while(isspace(s[l - 1])) --l;
while(* s && isspace(* s)) ++s, --l;
return strndup(s, l);
}
char *str_replace(char * t1, char * t2, char * t6)
{
char*t4;
char*t5=(char *)malloc(10);
memset(t5, 0, 10);
while(strstr(t6,t1))
{
t4=strstr(t6,t1);
strncpy(t5+strlen(t5),t6,t4-t6);
strcat(t5,t2);
t4+=strlen(t1);
t6=t4;
}
return strcat(t5,t4);
}
void remove_extra_whitespaces(char* input,char* output)
{
char* inputPtr = input; // init inputPtr always at the last moment.
int spacecount = 0;
while(*inputPtr != '\0')
{
char* substr;
strncpy(substr, inputPtr+0, 1);
if(substr == " ")
{
spacecount++;
}
else
{
spacecount = 0;
}
printf("[%p] -> %d\n",*substr,spacecount);
// Assume the string last with \0
// some code
inputPtr++; // After "some code" (instead of what you wrote).
}
}
int main(int argc, char **argv)
{
printf("testing 2 ..\n");
char input[0x255] = "asfa sas f f dgdgd dg ggg";
char output[0x255] = "NO_OUTPUT_YET";
remove_extra_whitespaces(input,output);
return 1;
}
It doesn't work. I tried several methods. What I am trying to do is to iterate the string letter by letter and dump it in another string as long as there is only one space in a row; if there are two spaces, don't write the second character to the new string.
How can I solve this?
There are already plenty of nice solutions. I propose an alternative based on a dedicated <algorithm> function meant to avoid consecutive duplicates: unique_copy():
void remove_extra_whitespaces(const string &input, string &output)
{
output.clear(); // unless you want to add at the end of existing string...
unique_copy (input.begin(), input.end(), back_insert_iterator<string>(output),
[](char a,char b){ return isspace(a) && isspace(b);});
cout << output<<endl;
}
Here is a live demo. Note that I changed from c style strings to the safer and more powerful C++ strings.
Edit: if keeping c-style strings is required in your code, you could use almost the same code but with pointers instead of iterators. That's the magic of C++. Here is another live demo.
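Since the linked demo is not reproduced here, this is roughly what the pointer-based version could look like. It is a sketch under the assumption that the output buffer is at least as large as the input; collapse_whitespace_c is a hypothetical name:

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// Hypothetical helper: `output` must have room for at least
// strlen(input) + 1 characters.
void collapse_whitespace_c(const char* input, char* output)
{
    const char* const last = input + std::strlen(input);
    // Pointers are valid iterators, so std::unique_copy works on them directly.
    char* const end = std::unique_copy(input, last, output,
        [](unsigned char a, unsigned char b) {
            return std::isspace(a) && std::isspace(b);
        });
    *end = '\0'; // unique_copy does not write a terminator
}
```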
Here's a simple, non-C++11 solution, using the same remove_extra_whitespaces() signature as in the question:
#include <cstdio>
void remove_extra_whitespaces(char* input, char* output)
{
int inputIndex = 0;
int outputIndex = 0;
while(input[inputIndex] != '\0')
{
output[outputIndex] = input[inputIndex];
if(input[inputIndex] == ' ')
{
while(input[inputIndex + 1] == ' ')
{
// skip over any extra spaces
inputIndex++;
}
}
outputIndex++;
inputIndex++;
}
// null-terminate output
output[outputIndex] = '\0';
}
int main(int argc, char **argv)
{
char input[0x255] = "asfa sas f f dgdgd dg ggg";
char output[0x255] = "NO_OUTPUT_YET";
remove_extra_whitespaces(input,output);
printf("input: %s\noutput: %s\n", input, output);
return 1;
}
Output:
input: asfa sas f f dgdgd dg ggg
output: asfa sas f f dgdgd dg ggg
Since you use C++, you can take advantage of standard-library features designed for that sort of work. You could use std::string (instead of char[0x255]) and std::istringstream, which will replace most of the pointer arithmetic.
First, make a string stream:
std::istringstream stream(input);
Then, read strings from it. It will remove the whitespace delimiters automatically:
std::string word;
while (stream >> word)
{
...
}
Inside the loop, build your output string:
if (!output.empty()) // special case: no space before first word
output += ' ';
output += word;
A disadvantage of this method is that it allocates memory dynamically (including several reallocations, performed when the output string grows).
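Put together, the three pieces above could form a function like the following. This is a sketch; collapse_spaces is a hypothetical name:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: joins the whitespace-separated words of `input`
// with single spaces.
std::string collapse_spaces(const std::string& input)
{
    std::istringstream stream(input);
    std::string output, word;
    while (stream >> word) {
        if (!output.empty()) // special case: no space before first word
            output += ' ';
        output += word;
    }
    return output;
}
```

Be aware that this also drops leading and trailing whitespace and turns tabs and newlines into spaces, which may or may not be what you want.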
There are plenty of ways of doing this (e.g., using regular expressions), but one way you could do this is using std::copy_if with a stateful functor remembering whether the last character was a space:
#include <algorithm>
#include <string>
#include <iostream>
struct if_not_prev_space
{
// Is last encountered character space.
bool m_is = false;
bool operator()(const char c)
{
// Copy if last was not space, or current is not space.
const bool ret = !m_is || c != ' ';
m_is = c == ' ';
return ret;
}
};
int main()
{
const std::string s("abc sssd g g sdg gg gf into abc sssd g g sdg gg gf");
std::string o;
std::copy_if(std::begin(s), std::end(s), std::back_inserter(o), if_not_prev_space());
std::cout << o << std::endl;
}
You can use std::unique, which reduces adjacent duplicates to a single instance according to how you define what makes two elements equal.
Here I have defined elements as equal if they are both whitespace characters:
inline std::string& remove_extra_ws_mute(std::string& s)
{
s.erase(std::unique(std::begin(s), std::end(s), [](unsigned char a, unsigned char b){
return std::isspace(a) && std::isspace(b);
}), std::end(s));
return s;
}
inline std::string remove_extra_ws_copy(std::string s)
{
return remove_extra_ws_mute(s);
}
std::unique shifts the kept characters to the front of the string and returns an iterator to the new logical end; everything from there on is leftover and can be erased.
Additionally, if you must work with low level strings then you can still use std::unique on the pointers:
char* remove_extra_ws(char const* s)
{
std::size_t len = std::strlen(s);
char* buf = new char[len + 1];
std::strcpy(buf, s);
// Note that std::unique will also retain the null terminator
// in its correct position at the end of the valid portion
// of the string
std::unique(buf, buf + len + 1, [](unsigned char a, unsigned char b){
return (a && std::isspace(a)) && (b && std::isspace(b));
});
return buf;
}
For in-place modification you can apply the erase-remove technique:
#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>
int main()
{
std::string input {"asfa sas f f dgdgd dg ggg"};
bool prev_is_space = true;
input.erase(std::remove_if(input.begin(), input.end(), [&prev_is_space](unsigned char curr) {
bool r = std::isspace(curr) && prev_is_space;
prev_is_space = std::isspace(curr);
return r;
}), input.end());
std::cout << input << "\n";
}
So you first shift the characters you want to keep to the front of the string and then truncate the leftover tail.
The great advantage of C++ is that it is universal enough to port your code to plain C static strings with only a few modifications:
void erase(char * p) {
// note that this only works well when the string lives in a fixed-size array,
// so we do not need to reallocate memory
*p = 0;
}
int main()
{
char input [] {"asfa sas f f dgdgd dg ggg"};
bool prev_is_space = true;
erase(std::remove_if(std::begin(input), std::end(input), [&prev_is_space](unsigned char curr) {
bool r = std::isspace(curr) && prev_is_space;
prev_is_space = std::isspace(curr);
return r;
}));
std::cout << input << "\n";
}
Interestingly enough, the remove step here is independent of the string representation. It works with std::string without any modification.
I have the sinking feeling that good ol' scanf will do (in fact, this is the C school equivalent to Anatoly's C++ solution):
void remove_extra_whitespaces(char* input, char* output)
{
int srcOffs = 0, destOffs = 0, numRead = 0;
while(sscanf(input + srcOffs, "%s%n", output + destOffs, &numRead) > 0)
{
srcOffs += numRead;
destOffs += strlen(output + destOffs);
output[destOffs++] = ' '; // overwrite 0, advance past that
}
output[destOffs > 0 ? destOffs-1 : 0] = '\0';
}
We exploit the fact that scanf has magical built-in space skipping capabilities. We then use the perhaps less known %n "conversion" specification which gives us the amount of chars consumed by scanf. This feature frequently comes in handy when reading from strings, like here. The bitter drop which makes this solution less-than-perfect is the strlen call on the output (there is no "how many bytes have I actually just written" conversion specifier, unfortunately).
Last but not least, using scanf is easy here because sufficient memory is guaranteed to exist at output; if that were not the case, the code would become more complex due to buffering and overflow handling.
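To see %n in isolation, here is a small sketch that walks a string word by word. split_words and the 63-character limit are my own illustrative choices; longer words would simply be split across several entries:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into whitespace-separated words,
// using %n to learn how far each sscanf call advanced.
std::vector<std::string> split_words(const char* text)
{
    std::vector<std::string> words;
    char word[64];    // %63s below keeps every write inside this buffer
    int consumed = 0;
    while (std::sscanf(text, "%63s%n", word, &consumed) == 1) {
        words.push_back(word);
        text += consumed; // %n counted the skipped whitespace as well
    }
    return words;
}
```

The field width in %63s is what the original function gets for free from the "output is big enough" guarantee; without such a guarantee a bare %s is an overflow waiting to happen.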
Since you are writing c-style, here's a way to do what you want.
Note that you can remove '\r' and '\n' which are line breaks (but of course that's up to you if you consider those whitespaces or not).
This function should be as fast or faster than any other alternative and no memory allocation takes place even when it's called with std::strings (I've overloaded it).
char temp[] = " alsdasdl gasdasd ee";
remove_whitesaces(temp);
printf("%s\n", temp);
int remove_whitesaces(char *p)
{
int len = strlen(p);
int new_len = 0;
bool space = false;
for (int i = 0; i < len; i++)
{
switch (p[i])
{
case ' ': space = true; break;
case '\t': space = true; break;
case '\n': break; // you could set space true for \r and \n
case '\r': break; // if you consider them spaces, I just ignore them.
default:
if (space && new_len > 0)
p[new_len++] = ' ';
p[new_len++] = p[i];
space = false;
}
}
p[new_len] = '\0';
return new_len;
}
// and you can use it with strings too,
inline int remove_whitesaces(std::string &str)
{
int len = remove_whitesaces(&str[0]);
str.resize(len);
return len; // returning len for consistency with the primary function
// but you can return std::string instead.
}
// again, no memory allocation is going to take place,
// since resize does not free memory when the new length is equal or lower
If you take a brief look at the C++ Standard library, you will notice that a lot of C++ functions that return std::string or other std:: objects are basically wrappers around a well-written extern "C" function. So don't be afraid to use C functions in C++ applications if they are well written, and you can overload them to support std::strings and such.
For example, in Visual Studio 2015, std::to_string is written exactly like this:
inline string to_string(int _Val)
{ // convert int to string
return (_Integral_to_string("%d", _Val));
}
inline string to_string(unsigned int _Val)
{ // convert unsigned int to string
return (_Integral_to_string("%u", _Val));
}
and _Integral_to_string is a wrapper to a C function sprintf_s
template<class _Ty> inline
string _Integral_to_string(const char *_Fmt, _Ty _Val)
{ // convert _Ty to string
static_assert(is_integral<_Ty>::value,
"_Ty must be integral");
char _Buf[_TO_STRING_BUF_SIZE];
int _Len = _CSTD sprintf_s(_Buf, _TO_STRING_BUF_SIZE, _Fmt, _Val);
return (string(_Buf, _Len));
}
Well here is a longish(but easy) solution that does not use pointers.
It can be optimized further but hey it works.
#include <iostream>
#include <string>
using namespace std;
void removeExtraSpace(string str);
int main(){
string s;
cout << "Enter a string with extra spaces: ";
getline(cin, s);
removeExtraSpace(s);
return 0;
}
void removeExtraSpace(string str){
int len = str.size();
if(len==0){
cout << "Simplified String: " << endl;
cout << "I would appreciate it if you could enter more than 0 characters. " << endl;
return;
}
string ch1(len, ' '); // std::string instead of a variable-length array, which is not standard C++
string ch2(len, ' ');
//Placing characters of str in ch1[]
for(int i=0; i<len; i++){
ch1[i]=str[i];
}
//Computing index of 1st non-space character
int pos=0;
for(int i=0; i<len; i++){
if(ch1[i] != ' '){
pos = i;
break;
}
}
int cons_arr = 1;
ch2[0] = ch1[pos];
for(int i=(pos+1); i<len; i++){
char x = ch1[i];
if(x==char(32)){
//Checking whether character at ch2[i]==' '
if(ch2[cons_arr-1] == ' '){
continue;
}
else{
ch2[cons_arr] = ' ';
cons_arr++;
continue;
}
}
ch2[cons_arr] = x;
cons_arr++;
}
//Printing the char array
cout << "Simplified string: " << endl;
for(int i=0; i<cons_arr; i++){
cout << ch2[i];
}
cout << endl;
}
I don't know if this helps, but this is how I did it for my homework. The only case where it might break a bit is when there are spaces at the beginning of the string, e.g. " wor ds ". In that case, it will change it to " wor ds".
void ShortenSpace(string &usrStr){
// i + 1 < size() also covers the empty string, where size() - 1 would wrap around
for (size_t i = 0; i + 1 < usrStr.size(); ) {
if ((usrStr.at(i) == ' ') && (usrStr.at(i + 1) == ' ')) {
usrStr.erase(usrStr.begin() + 1 + i); // drop the second space, then re-check position i
}
else {
++i;
}
}
}
I ended up here for a slightly different problem. Since I don't know where else to put it, and I found out what was wrong, I share it here. Don't be cross with me, please.
I had some strings that would print additional spaces at their ends, while showing up without spaces in debugging. The strings were formed in Windows calls like VerQueryValue(), which besides other stuff outputs a string length, e.g. iProductNameLen in the following line converting the result to a string named strProductName:
strProductName = string((LPCSTR)pvProductName, iProductNameLen)
then produced a string with a \0 byte at the end, which did not show easily in the debugger, but printed on screen as a space. I'll leave the solution of this as an exercise, since it is not hard at all once you are aware of this.
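For anyone who would rather not do the exercise, one possible fix is to drop the trailing terminators after constructing the string. This is a sketch; strip_trailing_nuls is a hypothetical helper and not something the Windows API provides:

```cpp
#include <string>

// Hypothetical helper: removes '\0' bytes from the end of a std::string
// that was built from a (pointer, length) pair whose length counted them.
void strip_trailing_nuls(std::string& s)
{
    while (!s.empty() && s.back() == '\0')
        s.pop_back();
}
```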
