String replacement and strange characters - c++

I have HTML data in a char* and I would like to read it line by line, do some replacements, and then join the lines back into a single string. This is the code that I use:
std::string to, finalData;
finalData = "";
char* char_array = strtok(data, "\n");
while(char_array){
finalData += std::string(char_array);
char_array = strtok(NULL, "\n");
}
The problem is the data that I get at the end of this (finalData) has a lot of ^M characters and I am unable to search for it as it has a special character. Is there any way to completely eliminate the character?
I am guessing that it has something to do with conversion from c array to c++ string and to do with \n as tab is represented by ^I and cntrl is represented as ^

It seems that you are on a Windows system, or that the data originated on a Windows system. On a Windows system, newline is actually two characters: "\r\n". What you are seeing as ^M is the carriage-return character ('\r') of that newline sequence.
One way to remove those extra characters would be to use std::string::find and std::string::erase in a loop.
Another way would be to manually copy, character by character, to a new std::string, except if the character is '\r'.

Related

entering newline character into a string

In these two cases I am entering \n as user input (in a string) in one and I am making \n as a part of string in the program itself (no user input):
string str1;
cin>>str1; //case 1 - \n entered as the part of the input
string str="hello\n"; //case 2
in case 1 \n is considered as a part of the input string whereas in case 2 it is considered as newline - why?
Escape sequences are a compile-time feature of literals. When your compiler comes across a \ in a string literal, it looks at the characters that follow to determine the value.
When you read in from the console, the input is read one character at a time.
Most debuggers will show the inputted string as "hello\\n", or broken up into individual characters:
'h','e','l','l','o','\\','n'
When you manually set the string in the code, such as string str = "hello\n", the compiler recognizes the escape sequence and treats it as the single character '\n'. This allows programmers to have shorthand for printing characters without going and printing their ASCII values.
Users, on the other hand, have an enter button to conveniently add a newline, and programs are generally oriented to have a human-readable interface (i.e., if I type a '\' character I expect it to be a '\' if I have no experience with computers).
Another note about cin is that operator>> uses newline characters and other whitespace to separate inputs. The function getline is meant for reading whole lines of string input and gets around this, but stream extraction goes from whitespace to whitespace so that it is consistent across all data types (int, float, char, etc.).

c++: How to insert Line Feed into sprintf concatenation?

I am trying to send two commands at once with sprintf. Commands should be separated with 0x0A (LF). I thought I could enter special characters using two slashes, so I am writing:
sprintf(tmpstr,"VSET1:%ld.%3.3d\\x0AVSET2:%ld.%3.3d",mv/1000, AbsVal((int)mv%1000), mv / 1000, AbsVal((int)mv % 1000));
and it seems only the second command (VSET2) is recognized.
What am I doing wrong?
Use \n in the format string. Also, use a single backslash not \\.
If you are writing your buffer to a file, open the file in binary mode.
Whether you use \n or \x0A, you have to open the file in binary mode to avoid non-portable translations.
See Escape sequences.
When you use \\x0A in a string literal, the first backslash escapes the second backslash. As a result, the string contains a backslash character, '\\', followed by characters 'x', '0', and 'A'.
To use the character represented by 0x0A, you need to use \x0A.
You should be using a single backslash instead of two backslashes. Try the statement given below:
sprintf(tmpstr,"VSET1:%ld.%3.3d\x0AVSET2:%ld.%3.3d",mv/1000, AbsVal((int)mv%1000), mv / 1000, AbsVal((int)mv % 1000));
However, what you have done in your program will print the four-character string "\x0A", rather than the ASCII character 0x0A (line feed).
In C, all escape sequences consist of two or more characters, the first of which is the backslash, \ (called the "Escape character"); the remaining characters determine the interpretation of the escape sequence.
C treats a backslash in a string literal as the start of an escape sequence by default. However, in your program, you have told the C compiler not to use your backslash as an escape sequence by adding an extra backslash to your string.
This works perfectly. You not only get to insert \n, but it also looks correct in the code. There is no need for a \ at the end of lines either. I use this for big paragraphs. Personal data has been obfuscated.
wchar_t msg[200];
swprintf(msg, 200, L"XYZ%d: ABCD Limit set to %d%%. %d times it has abcd and xyz rstu %d%%\n"
"Do you want to fix it?\n"
"An Yes will fix it\n"
"No will ignore it and continue\n"
"Cancel will abort the run\n", xxx, yyy, zzz, aaa);

Trim leading and trailing spaces after "=" symbols in string c++

I have a line whose data contains "=" symbols. I need to ignore all white space before and after each "=" symbol in my string.
example:
input i have: "this is test = test1 and test1= test2"
output I am looking for:
"this is test=test1 and test1=test2"
I have tried the istream ignore function and the std::find function for strings, but I am not sure how I can remove the trailing spaces up to the next non-whitespace character in the string.
I found a similar question here, but it is not answered:
https://stackoverflow.com/questions/24265598/delimiter-is-getting-added-at-the-beginning-of-each-line-of-a-delimited-file-whi
Thanks
Ruchi
If the other whitespace may be replaced by a single space, then you can read all words from the string with stream extraction (for example, a std::istringstream and operator>>), write them to a new string separated by a space, and handle the needed tokens, like "=" in this case, by putting them into the string without surrounding spaces. You will probably need a "spaceNeeded" flag to suppress the space before and after the token.

Cocos2D: CCLabelTTF won't preserve ending whitespace.

I need to have a CCLabelTTF print spaces at the end of a string, but they won't. I can log the string and clearly see that the spaces at the end are preserved by highlighting the log.
I've tried appending a decimal ascii non-breaking space, but it shows up as a different character. The font I'm using is Monaco.
I figured it out. Instead of appending @" " I appended the Unicode value U+00A0 like this:
labelPieceObj = [labelPieceObj stringByAppendingString:@"\u00A0"];

Removing whitespaces inside a string

I have a string lots\t of\nwhitespace\r\n which I have simplified but I still need to get rid of the other spaces in the string.
QString str = " lots\t of\nwhitespace\r\n ";
str = str.simplified();
I can do this erase_all(str, " "); in boost but I want to remain in qt.
str = str.simplified();
str.replace( " ", "" );
The first changes all of your whitespace characters to a single instance of ASCII 32, the second removes that.
Try this:
str.replace(" ","");
Option 1:
Simplify the white space, then remove it
Per the docs
[QString::simplified] Returns a string that has whitespace removed from the start and the end, and that has each sequence of internal whitespace replaced with a single space.
Once the string is simplified, the white spaces can easily be removed.
str.simplified().remove(' ')
Option 2:
Use a QRegExp to capture all types of white space in remove.
QRegExp space("\\s");
str.remove(space);
Notes
The OP's string has white space of different types (tab, carriage return, new line), all of which need to be removed. This is the tricky part.
QString::remove is available in Qt 4 and later; the same removal can also be achieved using QString::replace, replacing the white space with an empty string "".
You can omit the call to simplified() with a regex:
str.replace(QRegularExpression("\\s+"), QString());
I have not measured which method is faster. My guess is that the regex would perform worse.