How to convert hex representation from URL (%) to std::string (chinese text)? - c++

Intro
I have some input that I need to convert to the correct Chinese characters, but I think I'm stuck at the final number-to-string conversion. Using an online hex-to-text converter, I checked that e6b9af corresponds to the text 湯.
MWE
Here is a minimal example that I made to illustrate the problem. The input is "%e6%b9%af" (obtained from a URL somewhere else).
#include <iostream>
#include <string>
std::string attempt(std::string path)
{
std::size_t i = path.find("%");
while (i != std::string::npos)
{
std::string sub = path.substr(i, 9);
sub.erase(i + 6, 1);
sub.erase(i + 3, 1);
sub.erase(i, 1);
std::size_t s = std::stoul(sub, nullptr, 16);
path.replace(i, 9, std::to_string(s));
i = path.find("%");
}
return path;
}
int main()
{
std::string input = "%E6%B9%AF";
std::string goal = "湯";
// convert input to goal
input = attempt(input);
std::cout << goal << " and " << input << (input == goal ? " are the same" : " are not the same") << std::endl;
return 0;
}
Output
湯 and 15120815 are not the same
Expected output
湯 and 湯 are the same
Additional question
Are all characters in foreign languages represented in 3 bytes or is that just for Chinese? Since my attempt assumes blocks of 3 bytes, is that a good assumption?

Based on your suggestions, and adapting an example from another post, this is what I came up with.
#include <cstdio> // for sscanf
#include <iostream>
#include <string>
#include <sstream>
std::string decode_url(const std::string& path)
{
std::stringstream decoded;
for (std::size_t i = 0; i < path.size(); i++)
{
if (path[i] != '%')
{
if (path[i] == '+')
decoded << ' ';
else
decoded << path[i];
}
else
{
unsigned int j;
sscanf(path.substr(i + 1, 2).c_str(), "%x", &j);
decoded << static_cast<char>(j);
i += 2;
}
}
return decoded.str();
}
int main()
{
std::string input = "%E6%B9%AF";
std::string goal = "湯";
// convert input to goal
input = decode_url(input);
std::cout << goal << " and " << input << (input == goal ? " are the same" : " are not the same") << std::endl;
return 0;
}
Output
湯 and 湯 are the same
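On the additional question: UTF-8 encodes a code point in one to four bytes, so three bytes is typical for many Chinese characters but not a safe general assumption (é, for instance, is the two bytes C3 A9). Decoding one %XX escape at a time, as decode_url does, avoids the assumption entirely. Below is a minimal sketch of the same idea with input validation added. The names decode_url_checked and hex_value are made up for illustration, and copying malformed escapes through unchanged is an assumption, not a rule taken from any specification.

```cpp
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes %XX escapes one byte at a time and turns '+' into a space.
// Assumption: a malformed or truncated escape is copied through unchanged.
std::string decode_url_checked(const std::string& path)
{
    std::string decoded;
    decoded.reserve(path.size());
    for (std::size_t i = 0; i < path.size(); ++i)
    {
        if (path[i] == '%' && i + 2 < path.size())
        {
            const int hi = hex_value(path[i + 1]);
            const int lo = hex_value(path[i + 2]);
            if (hi >= 0 && lo >= 0)
            {
                decoded += static_cast<char>(hi * 16 + lo);
                i += 2; // skip the two hex digits just consumed
                continue;
            }
        }
        decoded += (path[i] == '+') ? ' ' : path[i];
    }
    return decoded;
}
```

Calling decode_url_checked("%E6%B9%AF") yields the three bytes E6 B9 AF, which a UTF-8 terminal displays as 湯; a two-byte or four-byte character decodes the same way because no block size is hard-coded.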

Related

std::stof() rounds numbers, how to avoid

I'm trying to read a float value from file.txt into a string. When I output that value with std::stof(str), it gets rounded. For example, the text file contains "101471.71", but when I use std::stof(str) it returns "101472". How do I avoid this?
Here's a part of that code (some parts are in Spanish, sorry :p):
double CaptureLine(std::string filepath, int fileline, int linesize)
{
std::fstream file;
std::string n_str, num_n;
int current_line = 0, n_size, filesize = FileSize(filepath);
char ch_n;
double n_float = 0.0;
int n_line = filesize - fileline;
file.open("registros.txt");
if (file.is_open()) {
while (!file.eof()) {
current_line++;
std::getline(file, n_str);
if (current_line == n_line) break;
}
if (current_line < n_line) {
std::cout << "Error" << std::endl;
return 1;
}
file.close();
}
n_size = n_str.length();
for (int i = linesize; i < n_size; i++) {
ch_n = n_str.at(i);
num_n.push_back(ch_n);
}
std::cout << ">>" << num_n << "<<\n";
n_float = std::stof(num_n); //Here's the error
return n_float;
}
The issue probably isn't with std::stof, but is probably with the default precision of 6 in std::cout. You can use std::setprecision to increase that precision and capture all of your digits.
Here's a program that demonstrates:
#include <iostream>
#include <iomanip>
#include <string>
int main() {
std::cout << 101471.71f << "\n";
std::cout << std::stof("101471.71") << "\n";
std::cout << std::setprecision(8) << 101471.71f << "\n";
std::cout << std::stof("101471.71") << "\n";
return 0;
}
Outputs:
101472
101472
101471.71
101471.71
Be aware that std::setprecision sticks to the std::cout stream after it's called. Notice how the above example calls it exactly once but its effect sticks around.
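Two related points may be worth sketching. First, std::stof produces a float, which holds only about 7 significant decimal digits; since CaptureLine returns a double, std::stod would keep more of them. Second, formatting through a separate std::ostringstream leaves the precision of std::cout untouched. The function names parse_value and format_value below are hypothetical, chosen only for this illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Parses with double precision instead of rounding to float via std::stof.
double parse_value(const std::string& text)
{
    return std::stod(text);
}

// Formats into a private stream, so the precision of std::cout is not changed.
std::string format_value(double value, int significant_digits)
{
    std::ostringstream out;
    out << std::setprecision(significant_digits) << value;
    return out.str();
}
```

With these helpers, format_value(parse_value("101471.71"), 8) gives "101471.71", while asking for 6 significant digits reproduces the rounded "101472" from the question.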

How to add commas to a string using recursion

I'm a beginner at programming. I'm coding a school assignment, and it's asking me to add commas to a string using recursion. I have most of it done, but when I input a number greater than a million, it doesn't add a comma before the first digit. This is what I have so far:
// commas - Convert a number (n) into a string, with commas
string commas(int n) {
ostringstream converted;
converted << n;
string number = converted.str();
int size = number.length();
if (size < 4 )
{
return number;
}
if (size >= 4 )
{
return number.substr(0, number.size() - 3) + "," + number.substr(number.size() - 3, number.length());
}
}
Any help would be greatly appreciated!
The algorithm is fairly simple. It is very similar to your solution except I added the part necessary for recursion. To understand how it works, remove tack_on. Here is example output:
1
10
100
These are the first groups that are returned when the terminating condition is reached (s.size() < 4). Then the rest of the groups are prefixed with a comma and "tacked on". The entire string is built using recursion. This is important because if you left number.substr(0, number.size() - 3) in, your output would look like this:
11,000
1010,000
100100,000
11,0001000,000
I use std::to_string which is C++11:
#include <iostream>
#include <string> // for std::to_string
std::string addCommas(int n)
{
std::string s = std::to_string(n);
if (s.size() < 4) return s;
else
{
std::string tack_on = "," + s.substr(s.size() - 3, s.size());
return addCommas(n / 1000) + tack_on;
}
}
You only need to make minimal changes for the C++03/stringstream version:
#include <sstream>
std::ostringstream oss;
std::string addCommas(int n)
{
oss.str(""); // to avoid std::bad_alloc
oss << n;
std::string s = oss.str();
// etc
}
Testing:
int main()
{
std::cout << addCommas(1) << "\n";
std::cout << addCommas(10) << "\n";
std::cout << addCommas(100) << "\n";
std::cout << addCommas(1000) << "\n";
std::cout << addCommas(10000) << "\n";
std::cout << addCommas(100000) << "\n";
std::cout << addCommas(1000000) << "\n";
return 0;
}
Output:
1
10
100
1,000
10,000
100,000
1,000,000
I think this one is a bit simpler and easier to follow:
std::string commas(int n)
{
std::string s = std::to_string(n%1000);
if ((n/1000) == 0) return s;
else
{
// Add zeros if required
while(s.size() < 3)
{
s = "0" + s;
}
return commas(n / 1000) + "," + s;
}
}
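Both recursive versions above are only exercised with non-negative input. Traced by hand, addCommas(-100) returns "0,100" and commas(-1234) returns "-1,-234", because the minus sign gets mixed into the digit groups. A minimal sketch that recurses on the magnitude and adds the sign once at the end follows; the names add_commas_unsigned and add_commas_signed are hypothetical, not part of either answer.

```cpp
#include <string>

// Recursive core: expects a non-negative magnitude.
std::string add_commas_unsigned(unsigned long long n)
{
    if (n < 1000) return std::to_string(n); // terminating condition: leading group
    std::string group = std::to_string(n % 1000);
    group = std::string(3 - group.size(), '0') + group; // left-pad to three digits
    return add_commas_unsigned(n / 1000) + "," + group;
}

// Handles the sign once, then delegates to the recursive core.
std::string add_commas_signed(long long n)
{
    if (n < 0)
    {
        // Negate in unsigned arithmetic so the most negative value does not overflow.
        return "-" + add_commas_unsigned(0ULL - static_cast<unsigned long long>(n));
    }
    return add_commas_unsigned(static_cast<unsigned long long>(n));
}
```

The design choice here is to keep the recursion free of sign handling, so each call only ever deals with one three-digit group.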
An alternative approach without recursion:
class Grouping3 : public std::numpunct< char >
{
protected:
std::string do_grouping() const { return "\003"; }
};
std::string commas( int n )
{
std::ostringstream converted;
converted.imbue( std::locale( converted.getloc(), new Grouping3 ) );
converted << n;
return converted.str();
}
You will need #include <locale> in some environments.
A possible solution for the assignment could be:
std::string commas( std::string&& str )
{
return str.length() > 3?
commas( str.substr( 0, str.length()-3 ) ) + "," + str.substr( str.length()-3 ):
str;
}
std::string commas( int n )
{
std::ostringstream converted;
converted << n;
return commas( converted.str() );
}

string parsing for C++

I have a text file that has #'s in it...It looks something like this.
#Stuff
1
2
3
#MoreStuff
a
b
c
I am trying to use std::string::find() function to get the positions of the # and then go from there, but I'm not sure how to actually code this.
This is my attempt:
int pos1=0;
while(i<string.size()){
int next=string.find('#', pos1);
i++;}
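Here's one I made a while ago (in C):
#include <string.h>
/* Returns the 1-based position of the first c in str, or -1 if c is absent. */
int char_pos(char c, const char *str) {
const char *pch = strchr(str, c);
if (pch == NULL) return -1; /* subtracting from a null pointer is undefined */
return (int)(pch - str) + 1;
}
Port it to C++ and there you go! ;)
If not found: returns a negative value (-1).
Otherwise: returns the 1-based position of the first match.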
It's hard to tell from your question what you mean by "position", but it looks like you are trying to do something like this:
#include <fstream>
#include <iostream>
#include <string> // for std::string and std::getline
int main()
{
std::ifstream incoming{"string-parsing-for-c.txt"};
std::string const hash{"#"};
std::string line;
for (auto line_number = 0U; std::getline(incoming, line); ++line_number)
{
auto const column = line.find(hash);
if (std::string::npos != column)
{
std::cout << hash << " found on line " << line_number
<< " in column " << column << ".\n";
}
}
}
...or possibly this:
#include <fstream>
#include <iostream>
int main()
{
std::ifstream incoming{"string-parsing-for-c.txt"};
char const hash{'#'};
char byte{};
for (auto offset = 0U; incoming.read(&byte, 1); ++offset)
{
if (hash == byte)
{
std::cout << hash << " found at offset " << offset << ".\n";
}
}
}
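If the whole file is already held in one std::string, as the attempt in the question suggests, the std::string::find loop can be sketched like this. The helper name find_all is hypothetical; the key step is restarting each search one character past the previous hit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects the zero-based offset of every occurrence of needle in text.
std::vector<std::size_t> find_all(const std::string& text, char needle)
{
    std::vector<std::size_t> positions;
    std::size_t pos = text.find(needle);
    while (pos != std::string::npos)
    {
        positions.push_back(pos);
        pos = text.find(needle, pos + 1); // resume just past the last hit
    }
    return positions;
}
```

For the sample file contents "#Stuff\n1\n2\n3\n#MoreStuff\na\nb\nc\n", find_all(text, '#') returns the offsets 0 and 13, which can then be used with substr to cut out each section.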

Return fixed length std::string from integer value

Problem: return a fixed-length string through a std::string*.
Target machine: Fedora 11.
I have to write a function that accepts an integer value and returns a fixed-length string through a string pointer.
For example, the int values are in the range 0 to -127:
for the value 0, it should return 000
for the value -9, it should return -009
for the value -50, it should return -050
for the value -110, it should return -110
In short, the length should be the same in all cases.
What I have done: I have defined the function according to the requirement, as shown below.
Where I need help: I have written the function, but I am not sure if this is the correct approach. When I test it standalone on Windows, the exe stops working after some time, but when I include this function in the overall project on a Linux machine, it works flawlessly.
/* function(s)to implement fixed Length Rssi */
std::string convertString( const int numberRssi, std::string addedPrecison="" )
{
const std::string delimiter = "-";
stringstream ss;
ss << numberRssi ;
std::string tempString = ss.str();
std::string::size_type found = tempString.find( delimiter );
if( found == std::string::npos )// not found
{
tempString = "000";
}
else
{
tempString = tempString.substr( found+1 );
tempString = "-" +addedPrecison+tempString ;
}
return tempString;
}
std::string stringFixedLenght( const int number )
{
std::string str;
if( (number <= 0) && (number >= -9) )
{
str = convertString( number, "00");
}
else if( (number <= -10) && (number >= -99) )
{
str = convertString( number, "0");
}
else
{
str= convertString(number, "");
}
return str;
}
// somewhere in the project calling the function
ErrorCode A::GetNowString( std::string macAddress, std::string *pString )
{
ErrorCode result = ok;
int lvalue;
//some more code like iopening file and reading file
//..bla
// ..bla
// already got the value in lvalue ;
if( result == ok )
{
*pString = stringFixedLenght( lValue );
}
// some more code
return result;
}
You can use I/O manipulators to set the width that you need, and fill with zeros. For example, this program prints 00123:
#include <iostream>
#include <iomanip>
using namespace std;
int main() {
cout << setfill('0') << setw(5) << 123 << endl;
return 0;
}
You have to take care of the negative values yourself, though: cout << setfill('0') << setw(5) << -123 << endl prints 0-123, not -0123. Check if the value is negative, set the width to N-1, and add a minus in front.
How about using std::ostringstream and the standard output formatting manipulators?
std::string makeFixedLength(const int i, const int length)
{
std::ostringstream ostr;
if (i < 0)
ostr << '-';
ostr << std::setfill('0') << std::setw(length) << (i < 0 ? -i : i);
return ostr.str();
}
Note that your examples contradict your description: if the value is -9,
and the fixed length is 3, should the output be "-009" (as in your
example), or "-09" (as you describe)? If the former, the obvious
solution is to just use the formatting flags on std::ostringstream:
std::string
fixedWidth( int value, int width )
{
std::ostringstream results;
results.fill( '0' );
results.setf( std::ios_base::internal, std::ios_base::adjustfield );
results << std::setw( value < 0 ? width + 1 : width ) << value;
return results.str();
}
For the latter, just drop the conditional in the std::setw, and pass
width.
For the record, although I would avoid it, this is one of the rare cases
where printf does something better than ostream. Using snprintf:
std::string
fixedWidth( int value, int width )
{
char buffer[100];
snprintf( buffer, sizeof(buffer), "%.*d", width, value );
return buffer;
}
You'd probably want to capture the return value of snprintf and add
some error handling after it, just in case (but 100 chars is
sufficient for most current machines).
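The return-value check mentioned above could look roughly like the sketch below. std::snprintf returns a negative value on an encoding error, and otherwise the number of characters the full output would have needed, so a result at or beyond the buffer size means truncation. The name fixed_width_checked and the choice to return an empty string on failure are assumptions made for illustration.

```cpp
#include <cstdio>
#include <string>

// Zero-pads value to at least width digits; a minus sign, if any, is extra.
// Assumption: an empty string signals an encoding error or a truncated result.
std::string fixed_width_checked(int value, int width)
{
    char buffer[32];
    const int written = std::snprintf(buffer, sizeof buffer, "%.*d", width, value);
    if (written < 0 || static_cast<std::size_t>(written) >= sizeof buffer)
    {
        return std::string();
    }
    return std::string(buffer, static_cast<std::size_t>(written));
}
```

For example, fixed_width_checked(-9, 3) gives "-009", while a width of 100 does not fit in the 32-byte buffer and yields the empty string instead of a silently truncated number.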
I have nothing against the versions that use streams, but you can do it all yourself more simply than your code:
std::string fixedLength(int value, int digits = 3) {
unsigned int uvalue = value;
if (value < 0) {
uvalue = -uvalue;
}
std::string result;
while (digits-- > 0) {
result += ('0' + uvalue % 10);
uvalue /= 10;
}
if (value < 0) {
result += '-';
}
std::reverse(result.begin(), result.end());
return result;
}
Like this?
#include <cstdlib>
#include <string>
template <typename T>
std::string meh (T x)
{
const char* sign = x < 0 ? "-" : "";
const auto mag = std::abs (x);
if (mag < 10) return sign + std::string ("00" + std::to_string(mag));
if (mag < 100) return sign + std::string ("0" + std::to_string(mag));
return std::to_string(x);
}
#include <iostream>
int main () {
std::cout << meh(4) << ' '
<< meh(40) << ' '
<< meh(400) << ' '
<< meh(4000) << '\n';
std::cout << meh(-4) << ' '
<< meh(-40) << ' '
<< meh(-400) << ' '
<< meh(-4000) << '\n';
}
004 040 400 4000
-004 -040 -400 -4000

Formatting an integer in C++

I have an 8 digit integer which I would like to print formatted like this:
XXX-XX-XXX
I would like to use a function that takes an int and returns a string.
What's a good way to do this?
This is how I'd do it, personally. Might not be the fastest way of solving the problem, and definitely not as reusable as egrunin's function, but it strikes me as both clean and easy to understand. I'll throw it in the ring as an alternative to the mathier and loopier solutions.
#include <sstream>
#include <string>
#include <iomanip>
std::string format(long num) {
std::ostringstream oss;
oss << std::setfill('0') << std::setw(8) << num;
return oss.str().insert(3, "-").insert(6, "-");
}
Tested this, it works.
The format parameter here is "XXX-XX-XXX", but it only looks at (and skips over) the dashes.
std::string foo(const char *format, long num) // const: callers pass a string literal
{
std::string s(format);
if (num < 0) { return "Input must be positive"; }
for (int nPos = s.length() - 1; nPos >= 0; --nPos)
{
if (s.at(nPos) == '-') continue;
s.at(nPos) = '0' + (num % 10);
num = num / 10;
}
if (num > 0) { return "Input too large for format string"; }
return s;
}
Usage:
int main()
{
printf("%s\n", foo("###-##-###", 12345678).c_str()); // never pass data as the format string
return 0;
}
Here's a bit different way that tries to work with the standard library and get it to do most of the real work:
#include <locale>
template <class T>
struct formatter : std::numpunct<T> {
protected:
T do_thousands_sep() const { return T('-'); }
std::basic_string<T> do_grouping() const {
return std::basic_string<T>("\3\2\3");
}
};
#ifdef TEST
#include <iostream>
int main() {
std::locale fmt(std::locale::classic(), new formatter<char>);
std::cout.imbue(fmt);
std::cout << 12345678 << std::endl;
return 0;
}
#endif
To return a string, just write to a stringstream, and return its .str().
This may be overkill if you only want to print out one number that way, but if you want to do this sort of thing in more than one place (or, especially, if you want to format all numbers going to a particular stream that way) it becomes more reasonable.
Here's a complete program that shows how I'd do it:
#include <cstdlib> // for atoi
#include <iostream>
#include <iomanip>
#include <sstream>
#include <string>
std::string formatInt (unsigned int i) {
std::stringstream s;
s << std::setfill('0') << std::setw(3) << ((i % 100000000) / 100000) << '-'
<< std::setfill('0') << std::setw(2) << ((i % 100000) / 1000) << '-'
<< std::setfill('0') << std::setw(3) << (i % 1000);
return s.str();
}
int main (int argc, char *argv[]) {
if (argc > 1)
std::cout << formatInt (atoi (argv[1])) << std::endl;
else
std::cout << "Provide an argument, ya goose!" << std::endl;
return 0;
}
Running this with certain inputs gives:
Input Output
-------- ----------
12345678 123-45-678
0 000-00-000
7012 000-07-012
10101010 101-01-010
123456789 234-56-789
-7 949-67-289
Those last two show the importance of testing. If you want different behaviour, you'll need to modify the code. I generally opt for silent enforcement of rules if the caller can't be bothered (or is too stupid) to follow them but apparently some people like to use the principle of least astonishment and raise an exception :-)
You can use the std::ostringstream class to convert the number to a string. Then you can use the string of digits and print them using whatever formatting you want, as in the following code:
std::ostringstream oss;
oss << std::setfill('0') << std::setw(8) << number;
std::string str = oss.str();
if ( str.length() != 8 ){
// some form of handling
}else{
// print digits formatted as desired
}
int your_number = 12345678;
std::cout << (your_number/10000000) % 10 << (your_number/1000000) % 10 << (your_number/100000) %10 << "-" << (your_number/10000) %10 << (your_number/1000) %10 << "-" << (your_number/100) %10 << (your_number/10) %10 << (your_number) %10;
http://www.ideone.com/17eRv
It's not a function, but it's a general method for parsing an int digit by digit.
#include <iostream>
#include <string>
using namespace std;
template<class Int, class Bi>
void format(Int n, Bi first, Bi last)
{
if( first == last ) return;
while( n != 0 ) {
Int t(n % 10);
n /= 10;
while( *--last != 'X' && last != first);
*last = t + '0';
}
}
int main(int argc, char* argv[])
{
int i = 23462345;
string s("XXX-XX-XXX");
format(i, s.begin(), s.end());
cout << s << endl;
return 0;
}
How's this?
std::string format(int x)
{
std::stringstream ss;
ss.fill('0');
ss.width(3);
ss << (x / 100000); // leading three digits of an 8-digit number
ss.width(1);
ss << "-";
ss.width(2);
ss << (x / 1000) % 100;
ss.width(1);
ss << "-";
ss.width(3);
ss << x % 1000;
return ss.str();
}
Edit 1: I see strstream is deprecated and replaced with stringstream.
Edit 2: Fixed issue of missing leading 0's. I know, it's ugly.
Obviously a char * and not a string, but you get the idea. You'll need to free the output once you're done, and you should probably add error checking, but this should do it:
char * formatter(int i)
{
char *buf = static_cast<char *>(malloc(11*sizeof(char))); // 10 characters plus the terminating NUL; the cast is required in C++
sprintf(buf, "%03d-%02d-%03d", i/100000, (i/1000)%100, i%1000);
return buf;
}
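You don't need malloc or new if the caller owns the buffer: define char buf[11]; at the call site and pass it in. A buffer declared locally inside formatter could not be returned, because it no longer exists once the function returns.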
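Since the question asks for a function that returns a string, one way to avoid both malloc and the lifetime problem is to format into a small local buffer and copy it into a std::string before returning. This is only a sketch: the name format_id and the decision to return an empty string for values outside 0 to 99999999 are assumptions for illustration.

```cpp
#include <cstdio>
#include <string>

// Formats an 8-digit value as XXX-XX-XXX.
// Assumption: out-of-range input is reported with an empty string.
std::string format_id(int i)
{
    if (i < 0 || i > 99999999) return std::string();
    char buf[11]; // 10 visible characters plus the terminating NUL
    std::snprintf(buf, sizeof buf, "%03d-%02d-%03d",
                  i / 100000, (i / 1000) % 100, i % 1000);
    return std::string(buf); // the copy outlives the local buffer
}
```

For example, format_id(12345678) gives "123-45-678" and format_id(7012) gives "000-07-012"; the range check keeps every field within its declared width, so the 11-byte buffer is always large enough.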