C++ efficient parse

I am programming some automated test equipment (ATE) and I'm trying to extract the following values out of an example response from the ATE:
DCRE? 1,
DCRE P, 10.3, (pin1)
DCRE F, 200.1, (pin2)
DCRE P, 20.4, (pin3)
From each line, I only care about the pin and the measured result value. So for the case above, I want to store the following pieces of information in a map<std::string, double> results;
results["pin1"] = 10.3;
results["pin2"] = 200.1;
results["pin3"] = 20.4;
I made the following code to parse the response:
void parseResultData(map<Pin*, double> &pinnametoresult, string &datatoparse) {
char *p = strtok((char*) datatoparse.c_str(), " \n");
string lastread;
string current;
while (p) {
current = p;
if(current.find('(') != string::npos) {
string substring = lastread.substr(1);
const char* last = substring.c_str();
double value = strtod(last, NULL);
unsigned short number = atoi(current.substr(4, current.size()-2).c_str());
pinnametoresult[&pinlookupmap[number]] = value;
}
lastread = p;
p = strtok(NULL, " \n");
}
}
It works, but it's not very efficient. Is there a way to make the function more efficient for this specific case? I don't care about the DCRE or P/F value on each line. I thought about using Boost regex library, but not sure if that would be more efficient.

To make this more efficient, try to avoid copying. In particular, calls to substr(), string assignments and the like can wreak havoc on performance. If you look at your code, you will see that the contents of datatoparse are repeatedly assigned to lastread and current, each time with one line less at the beginning. So, on average, you copy half of the original string times the number of lines, which makes just that part an O(n^2) algorithm. This isn't relevant if you have three or four lines (not even with 100 lines!), but with many more, performance degrades rapidly.
Try this approach instead:
string::size_type p0 = 0;
string::size_type p1 = input.find('\n', p0);
while (p1 != string::npos) {
// extract the line
string line = input.substr(p0, p1 - p0);
// move to the next line
p0 = p1 + 1;
p1 = input.find('\n', p0);
}
Notes:
The algorithm still copies all of the input, but each line only once, making it O(n).
Since you have a copy of the line, you can insert '\0' as an artificial separator in order to hand a substring to e.g. atoi() or strtod().
string::find() takes the character (or substring) to look for first and the position to start searching from second. Look at the various overloads of the find-like functions.
When handling a line, search the indices of the parts you need and then extract and parse them.
If you have line fragments (i.e. a partial line without a newline) at the end, you will have to modify the loop slightly. Create tests!
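Putting the notes above together, a minimal sketch could look like the following. The function name parseResponse and the choice of a map keyed by the pin name are assumptions made for illustration; the line format is taken from the example response in the question. The loop also handles a trailing fragment without a newline.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical helper: parses lines shaped like "DCRE P, 10.3, (pin1)".
// Lines without a '(' (such as the "DCRE? 1," header) are skipped.
std::map<std::string, double> parseResponse(const std::string& input) {
    std::map<std::string, double> results;
    std::string::size_type p0 = 0;
    while (p0 < input.size()) {
        // find(what, start): the character first, the start position second
        std::string::size_type p1 = input.find('\n', p0);
        if (p1 == std::string::npos) p1 = input.size();  // last line without '\n'
        const std::string line = input.substr(p0, p1 - p0);
        p0 = p1 + 1;

        const std::string::size_type open = line.find('(');
        if (open == std::string::npos) continue;          // not a result line
        const std::string::size_type close = line.find(')', open);
        const std::string::size_type comma = line.find(',');
        if (close == std::string::npos || comma == std::string::npos || comma > open)
            continue;                                     // malformed line
        assert(close > open);

        // strtod skips leading whitespace and stops at the first character
        // that cannot belong to the number (here, the second comma).
        const double value = std::strtod(line.c_str() + comma + 1, nullptr);
        results[line.substr(open + 1, close - open - 1)] = value;
    }
    return results;
}
```

Each line is copied exactly once, and only the pin name is copied a second time when it becomes the map key.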

This is what I did:
#include <cstdlib>
#include <string>
#include <vector>
#include <unordered_map>
#include <sstream>
#include <iostream>
using namespace std;
struct Pin {
string something;
Pin() {}
};
vector<Pin*> pins = { new Pin(), new Pin(), new Pin() };
typedef unordered_map<Pin*, double> CONT_T;
inline bool OfInterest(const string& line) {
return line.find("(") != string::npos;
}
void parseResultData(CONT_T& pinnametoresult, const string& datatoparse)
{
istringstream is(datatoparse);
string line;
while (getline(is, line)) {
if (OfInterest(line)) {
double d = 0.0;
unsigned int pinid;
size_t firstComma = line.find(",")+2; // skip space
size_t secondComma = line.find(",", firstComma);
istringstream is2(line.substr(firstComma, secondComma-firstComma));
is2 >> d;
size_t paren = line.find("(")+4; // skip pin
istringstream is3(line.substr(paren, (line.length()-paren)-1));
is3 >> pinid;
--pinid;
Pin* pin = pins[pinid];
pinnametoresult[pin] = d;
}
}
}
/*
*
*/
int main(int argc, char** argv) {
string datatoparse = "DCRE? 1, \n"
"DCRE P, 10.3, (pin1)\n"
"DCRE F, 200.1, (pin2)\n"
"DCRE P, 20.4, (pin3)\n";
CONT_T results;
parseResultData(results, datatoparse);
return 0;
}

Here's my final result. Does not involve any copying, but it will destroy the string.
void parseResultData3(map<std::string, double> &pinnametoresult, std::string &datatoparse) {
char* str = &datatoparse[0]; // writable buffer; writing through c_str() is undefined behavior
int length = datatoparse.size();
double lastdouble = 0.0;
char* startmarker = NULL; //beginning of next pin to parse
for(int pos = 0; pos < length; pos++, str++) {
if(str[0] == '(') {
startmarker = str + 1;
//get previous value
bool triggered = false;
for(char* lookback = str - 1; ; lookback--) {
if(!triggered && (isdigit(lookback[0]) || lookback[0] == '.')) {
triggered = true;
*(lookback + 1) = '\0';
}
else if(triggered && (!isdigit(lookback[0]) && lookback[0] != '.')) {
lastdouble = strtod(lookback, NULL);
break;
}
}
}
else if(startmarker != NULL) {
if(str[0] == ')') {
str[0] = '\0';
pinnametoresult[startmarker] = lastdouble;
startmarker = NULL;
}
if(str[0] == ',') {
str[0] = '\0';
pinnametoresult[startmarker] = lastdouble;
startmarker = str + 1;
}
}
}
}

Related

How to replace "pi" by "3.14"?

How to replace all "pi" from a string by "3.14"? Example: INPUT = "xpix" ___ OUTPUT = "x3.14x" for a string, not character array.
This doesn't work:
#include<iostream>
using namespace std;
void replacePi(string str)
{
if(str.size() <=1)
return ;
replacePi(str.substr(1));
int l = str.length();
if(str[0]=='p' && str[1]=='i')
{
for(int i=l;i>1;i--)
str[i+2] = str[i];
str[0] = '3';
str[1] = '.';
str[2] = '1';
str[3] = '4';
}
}
int main()
{
string s;
cin>>s;
replacePi(s);
cout << s << endl;
}

There is a ready-to-use function in the C++ standard library: std::regex_replace. It is documented on cppreference.
Since it uses regexes, it is very powerful. The disadvantage is that it may be a bit too slow at runtime for some use cases. But for your example, this does not matter.
So, a common C++ solution would be:
#include <iostream>
#include <string>
#include <regex>
int main() {
// The test string
std::string input{ "Pi is a magical number. Pi is used in many places. Go for Pi" };
// Use simply the replace function
std::string output = std::regex_replace(input, std::regex("Pi"), "3.14");
// Show the output
std::cout << output << "\n";
}
But my guess is that you are learning C++ and your teacher gave you this task expecting a solution without ready-made functions from the C++ standard library. So, a hands-on solution.
This is best implemented with a temporary string. You check the original string character by character. If the characters do not belong to Pi, copy them as they are to the new string. Otherwise, copy 3.14 to the new string.
At the end, overwrite the original string with the temporary string.
Example:
#include <iostream>
#include <string>
using namespace std;
void replacePi(string& str) {
// Our temporary
string temp = "";
// Iterate over all characters in the source string
for (size_t i = 0; i < str.length(); ++i) {
// Check for Pi in source string (only if there is a next character)
if (i + 1 < str.length() and str[i] == 'P' and str[i + 1] == 'i') {
// Add replacement string to temp
temp += "3.14";
// We consumed two characters, P and i, so increase index one more time
++i;
}
else {
// Take over normal character (including the very last one)
temp += str[i];
}
}
str = temp;
}
// Test code
int main() {
// The test string
std::string str{ "Pi is a magical number. Pi is used in many places. Go for Pi" };
// Do the replacement
replacePi(str);
// Show result
std::cout << str << '\n';
}

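What you need is string::find and string::replace. Here is an example:
size_t replace_all(std::string& str, const std::string& from, const std::string& to)
{
size_t count = 0;
if (from.empty())
return count; // an empty pattern would match at every position
std::string::size_type pos = 0;
// Resume searching after each replacement, so the inserted text is never
// searched again (this matters when "to" contains "from")
while((pos=str.find(from, pos)) != str.npos)
{
str.replace(pos, from.length(), to);
pos += to.length();
count++;
}
return count;
}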
void replacePi(std::string& str)
{
replace_all(str, "pi", "3.14");
}

How to split a string by another string in Arduino?

I have a character array like below:
char array[] = "AAAA... A1... 3. B1.";
How can I split this array by the string "..." in Arduino? I have tried:
ptr = strtok(array, "...");
and the output is the following:
AAAA,
A1,
3,
B1
But I actually want output to be
AAAA,
A1,
3.B1.
How to get this output?
edit:
My full code is this:
char array[] = "AAAA... A1... 3. B1.";
char *strings[10];
char *ptr = NULL;
void setup()
{
Serial.begin(9600);
byte index = 0;
ptr = strtok(array, "..."); // takes a list of delimiters
while(ptr != NULL)
{
strings[index] = ptr;
index++;
ptr = strtok(NULL, "..."); // takes a list of delimiters
}
for(int n = 0; n < index; n++)
{
Serial.println(strings[n]);
}
}

The main problem is that strtok does not find a string inside another string. strtok looks for a character in a string. When you give multiple characters to strtok it looks for any of these. Consequently, writing strtok(array, "..."); is exactly the same as writing strtok(array, ".");. That is why you get a split after "3."
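The claim that the delimiter argument is a set of characters can be checked with a small sketch. The helper name tokens is made up for this illustration; it simply collects whatever strtok returns for a copy of the text.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: collects every token strtok produces for a copy of text.
std::vector<std::string> tokens(const char* text, const char* delims) {
    assert(text != nullptr && delims != nullptr);
    std::string buffer(text);  // strtok writes into its argument, so work on a copy
    std::vector<std::string> out;
    for (char* p = std::strtok(&buffer[0], delims); p != nullptr;
         p = std::strtok(nullptr, delims)) {
        out.push_back(p);
    }
    return out;
}
```

Calling tokens("AAAA... A1... 3. B1.", "...") and tokens("AAAA... A1... 3. B1.", ".") gives the same four tokens, because repeating a character in the delimiter set changes nothing.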
There are multiple ways of doing what you want. Below I'll show you an example using strstr. Unlike strtok, the strstr function does find a substring inside a string, which is just what you are looking for. But strstr is not a tokenizer, so some extra code is required to print the substrings.
Something like this should do:
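#include <stdio.h>
#include <string.h>
int main()
{
char array[] = "AAAA... A1... 3. B1.";
char* ps = array;
char* pf = strstr(ps, "..."); // Find first substring
while(pf)
{
int len = (int)(pf - ps); // Number of chars to print
printf("%.*s\n", len, ps);
ps = pf + 3; // Skip the delimiter
pf = strstr(ps, "..."); // Find next substring
}
if (*ps != '\0')
printf("%s\n", ps); // Print what is left after the last delimiter
return 0;
}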

You can implement your own split that works like strtok, except that the second argument is treated as one whole delimiter string rather than a set of characters:
#include <stdio.h>
#include <string.h>
char * split(char *str, const char * delim)
{
static char * s;
char * p, * r;
if (str != NULL)
s = str;
p = strstr(s, delim);
if (p == NULL) {
if (*s == 0)
return NULL;
r = s;
s += strlen(s);
return r;
}
r = s;
*p = 0;
s = p + strlen(delim);
return r;
}
int main()
{
char s[] = "AAAA... A1... 3. B1.";
char * p = s;
char * t;
while ((t = split(p, "...")) != NULL) {
printf("'%s'\n", t);
p = NULL;
}
return 0;
}
Compilation and execution:
/tmp % gcc -g -pedantic -Wextra s.c
/tmp % ./a.out
'AAAA'
' A1'
' 3. B1.'
/tmp %
I print each token between quotes to show the leading spaces that are returned. I am not sure whether you want them; if you don't, the delimiter is not just "..." but "... ".

Because you tagged this as c++, here is a c++ 'version' of your code:
#include <iostream>
using std::cout;
using std::endl;
#include <vector>
using std::vector;
#include <string>
using std::string;
class T965_t
{
string array;
vector<string> strings;
public:
T965_t() : array("AAAA... A1... 3. B1.")
{
strings.reserve(10);
}
~T965_t() = default;
int operator()() { return setup(); } // functor entry
private: // methods
int setup()
{
cout << endl;
const string pat1 ("... ");
string s1 = array; // working copy
size_t indx = s1.find(pat1, 0); // find first ... pattern
// start search at ---------^
do
{
if (string::npos == indx) // pattern not found
{
strings.push_back (s1); // capture 'remainder' of s1
break; // not found, kick out
}
// else
// extract --------vvvvvvvvvvvvvvvvv
strings.push_back (s1.substr(0, indx)); // capture
// capture to vector
indx += pat1.size(); // i.e. 4
s1.erase(0, indx); // erase previous capture
indx = s1.find(pat1, 0); // find next
} while(true);
for(size_t n = 0; n < strings.size(); n++)
cout << strings[n] << "\n";
cout << endl;
return 0;
}
}; // class T965_t
int main(int , char**) { return T965_t()(); } // call functor
With output:
AAAA
A1
3. B1.
Note: I leave changing "3. B1." to "3.B1.", and adding commas at end of each line (except the last) as an exercise for the OP if required.

I looked for a split function and didn't find one that meets my requirements, so I wrote one. It works for me so far; I will probably improve it in the future, but it got me out of trouble.
There is also the strtok function, which is probably the better choice:
https://www.delftstack.com/es/howto/arduino/arduino-strtok/
Here is my split function.
Arduino code:
void split(String * vecSplit, int dimArray,String content,char separator){
if(content.length()==0)
return;
content = content + separator;
int countVec = 0;
int posSep = 0;
int posInit = 0;
while(countVec<dimArray){
posSep = content.indexOf(separator,posSep);
if(posSep<0){
return;
}
String splitStr = content.substring(posInit,posSep);
posSep = posSep+1;
posInit = posSep;
vecSplit[countVec] = splitStr; // store in the current slot, then advance once
countVec++;
}
}
Function call:
smsContent = "APN:4g.entel;DOMAIN:domolin.com;DELAY_GPS:60";
String vecSplit[10];
split(vecSplit,10,smsContent,';');
for(int i = 0;i<10;i++){
Serial.println(vecSplit[i]);
}
Input string:
APN:4g.entel;DOMAIN:domolin.com;DELAY_GPS:60
Output:
APN:4g.entel
DOMAIN:domolin.com
DELAY_GPS:60
RESET:true

How to "Fold a word" from a string. EX. "STACK" becomes "SKTCA". C++

I'm trying to figure out how I can fold a word from a string. For example, "code" after the folding would become "ceod". Basically, start from the first character, then take the last one, then the second character, and so on. I know the first step is to start with a loop, but I have no idea how to get the last character after that. Any help would be great. Here's my code.
#include <iostream>
using namespace std;
int main () {
string fold;
cout << "Enter a word: ";
cin >> fold;
string temp;
string backwards;
string wrap;
for (unsigned int i = 0; i < fold.length(); i++){
temp = temp + fold[i];
}
backwards= string(temp.rbegin(),temp.rend());
for(unsigned int i = 0; i < temp.length(); i++) {
wrap = fold.replace(backwards[i]);
}
cout << wrap;
}
Thanks

@Supreme, there are a number of ways to do this task, and I'm going to post one of them. But as @John pointed out, you should try to get it done on your own, because real programming is all about practice. Use this solution only as a reference for one possibility, and find others.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string in;
cout <<"enter: "; cin >> in;
string fold;
// i walks forward from the start, j walks backward from the end
for (size_t i=0, j=in.length(); i<in.length()/2; i++)
{
fold += in[i];
fold += in[--j];
}
if( in.length()%2 != 0) // if string length is odd, pick the middle
fold += in[in.length()/2];
cout << endl << fold ;
return 0;
}
good luck !

There are two approaches to this kind of problem. A mathematically exact method is to create a generator function which returns the indices in the correct order.
An easier plan is to modify the string step by step, which solves the problem in a practical way.
Mathematical solution
We want a function which returns the index of the character to add next. We have two sequences, one increasing and one decreasing, and they are interleaved.
sequence 1:
0, 1, 2, 3, ...
sequence 2:
len-1, len-2, len-3, len-4, ...
Given they are interleaved, we want even positions to come from sequence 1 and odd positions from sequence 2.
So our solution is, for a given new index, to choose which sequence to use, and then return the next value from that sequence.
#include <cassert>
int generator( int idx, int len )
{
assert( idx < len );
if( idx % 2 == 0 ) { // even - first sequence
return idx/2;
} else { // odd - second sequence
return len - (1 + idx/2);
}
}
This can then be called from a function fold...
std::string fold(const char * src)
{
std::string result;
std::string source(src);
for (size_t i = 0; i < source.length(); i++) {
result += source.at(generator(i, source.length()));
}
return result;
}
Practical solution
Although less efficient, this can be easier to think about. We are taking either the first or the last character of a string. This we will do using string manipulation to get the right result.
std::string fold2(const char * src)
{
std::string source = src;
enum whereToTake { fromStart, fromEnd };
std::string result;
enum whereToTake next = fromStart;
while (source.length() > 0) {
if (next == fromStart) {
result += source.at(0);
source = source.substr(1);
next = fromEnd;
}
else {
result += source.at(source.length() - 1); // last char
source = source.substr(0, source.length() - 1); // eat last char
next = fromStart;
}
}
return result;
}

You can take advantage of the concept of reverse iterators to write a generic algorithm based on the solution presented in Usman Riaz's answer.
Compose your string picking chars from both the ends of the original string. When you reach the center, add the char in the middle if the number of chars is odd.
Here is a possible implementation:
#include <iostream>
#include <string>
#include <vector>
#include <utility>
#include <algorithm>
#include <iterator>
template <class BidirIt, class OutputIt>
OutputIt fold(BidirIt source, BidirIt end, OutputIt output)
{
// std::reverse_iterator needs at least a bidirectional iterator
auto reverse_source = std::reverse_iterator<BidirIt>(end);
auto source_end = std::next(source, std::distance(source, end) / 2);
while ( source != source_end )
{
*output++ = *source++;
*output++ = *reverse_source++;
}
if ( source != reverse_source.base() )
{
*output++ = *source;
}
return output;
}
int main() {
std::vector<std::pair<std::string, std::string>> tests {
{"", ""}, {"a", "a"}, {"stack", "sktca"}, {"steack", "sktcea"}
};
for ( auto const &test : tests )
{
std::string result;
fold(
std::begin(test.first), std::end(test.first),
std::back_inserter(result)
);
std::cout << (result == test.second ? " OK " : "FAILED: ")
<< '\"' << test.first << "\" --> \"" << result << "\"\n";
}
}

Remove extra white spaces in C++

I tried to write a script that removes extra white spaces but I didn't manage to finish it.
Basically I want to collapse every run of consecutive spaces into a single space, so that the result reads abc sssd g g sdg gg gf.
In languages like PHP or C#, it would be very easy, but not in C++, I see. This is my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <cstring>
#include <unistd.h>
#include <string.h>
char* trim3(char* s) {
int l = strlen(s);
while(isspace(s[l - 1])) --l;
while(* s && isspace(* s)) ++s, --l;
return strndup(s, l);
}
char *str_replace(char * t1, char * t2, char * t6)
{
char*t4;
char*t5=(char *)malloc(10);
memset(t5, 0, 10);
while(strstr(t6,t1))
{
t4=strstr(t6,t1);
strncpy(t5+strlen(t5),t6,t4-t6);
strcat(t5,t2);
t4+=strlen(t1);
t6=t4;
}
return strcat(t5,t4);
}
void remove_extra_whitespaces(char* input,char* output)
{
char* inputPtr = input; // init inputPtr always at the last moment.
int spacecount = 0;
while(*inputPtr != '\0')
{
char* substr;
strncpy(substr, inputPtr+0, 1);
if(substr == " ")
{
spacecount++;
}
else
{
spacecount = 0;
}
printf("[%p] -> %d\n",*substr,spacecount);
// Assume the string last with \0
// some code
inputPtr++; // After "some code" (instead of what you wrote).
}
}
int main(int argc, char **argv)
{
printf("testing 2 ..\n");
char input[0x255] = "asfa sas f f dgdgd dg ggg";
char output[0x255] = "NO_OUTPUT_YET";
remove_extra_whitespaces(input,output);
return 1;
}
It doesn't work. I tried several methods. What I am trying to do is to iterate the string letter by letter and dump it in another string as long as there is only one space in a row; if there are two spaces, don't write the second character to the new string.
How can I solve this?

There are already plenty of nice solutions. I propose an alternative based on a dedicated <algorithm> function meant to skip consecutive duplicates: unique_copy():
void remove_extra_whitespaces(const string &input, string &output)
{
output.clear(); // unless you want to add at the end of existing string...
unique_copy (input.begin(), input.end(), back_insert_iterator<string>(output),
[](unsigned char a, unsigned char b){ return isspace(a) && isspace(b);});
cout << output<<endl;
}
Note that I changed from C-style strings to the safer and more powerful C++ strings.
Edit: if you need to keep C-style strings in your code, you can use almost the same code with pointers instead of iterators. That's the magic of C++.
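The pointer-based variant mentioned in the edit might look like the sketch below. It keeps the signature from the question (with a const input) and assumes that output has room for at least strlen(input) + 1 characters; that size requirement is an assumption of this sketch, not something the original answer states.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstring>

// Sketch: copies input to output, collapsing each run of whitespace
// into its first character.
void remove_extra_whitespaces(const char* input, char* output) {
    assert(input != nullptr && output != nullptr);
    const char* last = input + std::strlen(input);
    // Plain pointers are valid iterators, so the call is the same as for std::string.
    char* end = std::unique_copy(input, last, output,
        [](unsigned char a, unsigned char b) {
            return std::isspace(a) && std::isspace(b);
        });
    *end = '\0';  // unique_copy does not write a terminator
}
```

Note that a run of mixed whitespace, such as a tab followed by spaces, is reduced to whichever whitespace character comes first in the run.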

Here's a simple, non-C++11 solution, using the same remove_extra_whitespaces() signature as in the question:
#include <cstdio>
void remove_extra_whitespaces(char* input, char* output)
{
int inputIndex = 0;
int outputIndex = 0;
while(input[inputIndex] != '\0')
{
output[outputIndex] = input[inputIndex];
if(input[inputIndex] == ' ')
{
while(input[inputIndex + 1] == ' ')
{
// skip over any extra spaces
inputIndex++;
}
}
outputIndex++;
inputIndex++;
}
// null-terminate output
output[outputIndex] = '\0';
}
int main(int argc, char **argv)
{
char input[0x255] = "asfa sas f f dgdgd dg ggg";
char output[0x255] = "NO_OUTPUT_YET";
remove_extra_whitespaces(input,output);
printf("input: %s\noutput: %s\n", input, output);
return 1;
}
Output:
input: asfa sas f f dgdgd dg ggg
output: asfa sas f f dgdgd dg ggg

Since you use C++, you can take advantage of standard-library features designed for that sort of work. You could use std::string (instead of char[0x255]) and std::istringstream, which will replace most of the pointer arithmetic.
First, make a string stream:
std::istringstream stream(input);
Then, read strings from it. It will remove the whitespace delimiters automatically:
std::string word;
while (stream >> word)
{
...
}
Inside the loop, build your output string:
if (!output.empty()) // special case: no space before first word
output += ' ';
output += word;
A disadvantage of this method is that it allocates memory dynamically (including several reallocations, performed when the output string grows).
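Assembled into one function, the fragments above could look like this sketch. Returning the result by value and reserving the output up front are choices made for this illustration; the reserve call is one way to reduce the reallocations mentioned above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Sketch: rebuilds the text word by word with single spaces in between.
std::string remove_extra_whitespaces(const std::string& input) {
    std::istringstream stream(input);
    std::string output;
    output.reserve(input.size());  // the result can never be longer than the input
    std::string word;
    while (stream >> word) {       // operator>> skips any amount of whitespace
        if (!output.empty())       // special case: no space before first word
            output += ' ';
        output += word;
    }
    assert(output.size() <= input.size());
    return output;
}
```

A side effect of this approach is that leading and trailing whitespace disappears entirely, and tabs or newlines between words become single spaces.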

There are plenty of ways of doing this (e.g., using regular expressions), but one way you could do this is using std::copy_if with a stateful functor remembering whether the last character was a space:
#include <algorithm>
#include <iterator>
#include <string>
#include <iostream>
struct if_not_prev_space
{
// Is last encountered character space.
bool m_is = false;
bool operator()(const char c)
{
// Copy if last was not space, or current is not space.
const bool ret = !m_is || c != ' ';
m_is = c == ' ';
return ret;
}
};
int main()
{
const std::string s("abc sssd g g sdg gg gf into abc sssd g g sdg gg gf");
std::string o;
std::copy_if(std::begin(s), std::end(s), std::back_inserter(o), if_not_prev_space());
std::cout << o << std::endl;
}

You can use std::unique, which reduces adjacent duplicates to a single instance according to how you define what makes two elements equal.
Here I have defined elements as equal if they are both whitespace characters:
inline std::string& remove_extra_ws_mute(std::string& s)
{
s.erase(std::unique(std::begin(s), std::end(s), [](unsigned char a, unsigned char b){
return std::isspace(a) && std::isspace(b);
}), std::end(s));
return s;
}
inline std::string remove_extra_ws_copy(std::string s)
{
return remove_extra_ws_mute(s);
}
std::unique shifts the elements to keep towards the front and returns an iterator to the new logical end; everything from there on is left in an unspecified state and can be erased.
Additionally, if you must work with low-level strings, you can still use std::unique on the pointers:
char* remove_extra_ws(char const* s)
{
std::size_t len = std::strlen(s);
char* buf = new char[len + 1];
std::strcpy(buf, s);
// Note that std::unique will also retain the null terminator
// in its correct position at the end of the valid portion
// of the string
std::unique(buf, buf + len + 1, [](unsigned char a, unsigned char b){
return (a && std::isspace(a)) && (b && std::isspace(b));
});
return buf;
}

For in-place modification you can apply the erase-remove idiom:
#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>
int main()
{
std::string input {"asfa sas f f dgdgd dg ggg"};
bool prev_is_space = true;
input.erase(std::remove_if(input.begin(), input.end(), [&prev_is_space](unsigned char curr) {
bool r = std::isspace(curr) && prev_is_space;
prev_is_space = std::isspace(curr);
return r;
}), input.end());
std::cout << input << "\n";
}
So you first move all the characters you want to keep to the front of the string and then truncate it.
The great advantage of C++ is that it is universal enough to port your code to plain C static strings with only a few modifications:
void erase(char * p) {
// note that this only works when the string lives in a fixed-size array,
// so we do not need to rearrange memory
*p = 0;
}
int main()
{
char input [] {"asfa sas f f dgdgd dg ggg"};
bool prev_is_space = true;
// stop before the terminating '\0' so that erase() always writes inside the array
erase(std::remove_if(std::begin(input), std::end(input) - 1, [&prev_is_space](unsigned char curr) {
bool r = std::isspace(curr) && prev_is_space;
prev_is_space = std::isspace(curr);
return r;
}));
std::cout << input << "\n";
}
Interestingly enough, the remove step here is independent of the string representation. It works with std::string without any modification.

I have the sinking feeling that good ol' scanf will do (in fact, this is the C school equivalent to Anatoly's C++ solution):
void remove_extra_whitespaces(char* input, char* output)
{
int srcOffs = 0, destOffs = 0, numRead = 0;
while(sscanf(input + srcOffs, "%s%n", output + destOffs, &numRead) > 0)
{
srcOffs += numRead;
destOffs += strlen(output + destOffs);
output[destOffs++] = ' '; // overwrite 0, advance past that
}
output[destOffs > 0 ? destOffs-1 : 0] = '\0';
}
We exploit the fact that scanf has magical built-in space-skipping capabilities. We then use the perhaps lesser-known %n "conversion" specification, which gives us the number of chars consumed by scanf. This feature frequently comes in handy when reading from strings, as here. The one drawback that makes this solution less than perfect is the strlen call on the output (unfortunately, there is no "how many bytes have I actually just written" conversion specifier).
Last but not least, using scanf is easy here because sufficient memory is guaranteed to exist at output; if that were not the case, the code would become more complex due to buffering and overflow handling.

Since you are writing c-style, here's a way to do what you want.
Note that you can remove '\r' and '\n' which are line breaks (but of course that's up to you if you consider those whitespaces or not).
This function should be as fast or faster than any other alternative and no memory allocation takes place even when it's called with std::strings (I've overloaded it).
char temp[] = " alsdasdl gasdasd ee";
remove_whitesaces(temp);
printf("%s\n", temp);
int remove_whitesaces(char *p)
{
int len = strlen(p);
int new_len = 0;
bool space = false;
for (int i = 0; i < len; i++)
{
switch (p[i])
{
case ' ': space = true; break;
case '\t': space = true; break;
case '\n': break; // you could set space true for \r and \n
case '\r': break; // if you consider them spaces, I just ignore them.
default:
if (space && new_len > 0)
p[new_len++] = ' ';
p[new_len++] = p[i];
space = false;
}
}
p[new_len] = '\0';
return new_len;
}
// and you can use it with strings too,
inline int remove_whitesaces(std::string &str)
{
int len = remove_whitesaces(&str[0]);
str.resize(len);
return len; // returning len for consistency with the primary function
// but u can return std::string instead.
}
// again no memory allocation is going to take place,
// since resize does not free memory when the new length is equal or lower
If you take a brief look at the C++ Standard Library, you will notice that a lot of C++ functions that return std::string or other std:: objects are basically wrappers around a well-written extern "C" function. So don't be afraid to use C functions in C++ applications if they are well written; you can overload them to support std::string and the like.
For example, in Visual Studio 2015, std::to_string is written exactly like this:
inline string to_string(int _Val)
{ // convert int to string
return (_Integral_to_string("%d", _Val));
}
inline string to_string(unsigned int _Val)
{ // convert unsigned int to string
return (_Integral_to_string("%u", _Val));
}
and _Integral_to_string is a wrapper to a C function sprintf_s
template<class _Ty> inline
string _Integral_to_string(const char *_Fmt, _Ty _Val)
{ // convert _Ty to string
static_assert(is_integral<_Ty>::value,
"_Ty must be integral");
char _Buf[_TO_STRING_BUF_SIZE];
int _Len = _CSTD sprintf_s(_Buf, _TO_STRING_BUF_SIZE, _Fmt, _Val);
return (string(_Buf, _Len));
}

Well here is a longish(but easy) solution that does not use pointers.
It can be optimized further but hey it works.
#include <iostream>
#include <string>
using namespace std;
void removeExtraSpace(string str);
int main(){
string s;
cout << "Enter a string with extra spaces: ";
getline(cin, s);
removeExtraSpace(s);
return 0;
}
void removeExtraSpace(string str){
int len = str.size();
if(len==0){
cout << "Simplified String: " << endl;
cout << "I would appreciate it if you could enter more than 0 characters. " << endl;
return;
}
string ch1(len, ' '); // std::string instead of a variable-length array, which is not standard C++
string ch2(len, ' ');
//Placing characters of str in ch1[]
for(int i=0; i<len; i++){
ch1[i]=str[i];
}
//Computing index of 1st non-space character
int pos=0;
for(int i=0; i<len; i++){
if(ch1[i] != ' '){
pos = i;
break;
}
}
int cons_arr = 1;
ch2[0] = ch1[pos];
for(int i=(pos+1); i<len; i++){
char x = ch1[i];
if(x==char(32)){
//Checking whether character at ch2[i]==' '
if(ch2[cons_arr-1] == ' '){
continue;
}
else{
ch2[cons_arr] = ' ';
cons_arr++;
continue;
}
}
ch2[cons_arr] = x;
cons_arr++;
}
//Printing the char array
cout << "Simplified string: " << endl;
for(int i=0; i<cons_arr; i++){
cout << ch2[i];
}
cout << endl;
}

I don't know if this helps, but this is how I did it for my homework. The only case where it might break a bit is when there are spaces at the beginning of the string, e.g. " wor ds ". In that case, it will change it to " wor ds".
void ShortenSpace(string &usrStr){
char cha1;
char cha2;
for (int i = 0; i + 1 < static_cast<int>(usrStr.size()); ++i) { // also safe for an empty string
cha1 = usrStr.at(i);
cha2 = usrStr.at(i + 1);
if ((cha1 == ' ') && (cha2 == ' ')) {
usrStr.erase(usrStr.begin() + 1 + i);
--i;//edit: was ++i instead of --i, made code not work properly
}
}
}

I ended up here with a slightly different problem. Since I don't know where else to put it, and I found out what was wrong, I'll share it here. Don't be cross with me, please.
I had some strings that would print additional spaces at their ends, while showing up without spaces in the debugger. The strings were formed from Windows calls like VerQueryValue(), which, besides other things, outputs a string length, e.g. iProductNameLen. The following line, which converts the result to a string named strProductName:
strProductName = string((LPCSTR)pvProductName, iProductNameLen)
then produced a string with a \0 byte at the end, which did not show up easily in the debugger, but printed on screen as a space. I'll leave the solution as an exercise, since it is not hard at all once you are aware of this.
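For readers who want a starting point, one possible approach is sketched below. It assumes, as described above, that the reported length counts the terminating \0; the helper name from_counted_buffer is made up for this illustration and no Windows API is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a buffer and a reported length,
// dropping any '\0' bytes that were counted at the end.
std::string from_counted_buffer(const char* data, std::size_t len) {
    assert(data != nullptr);
    while (len > 0 && data[len - 1] == '\0')
        --len;  // ignore terminators included in the length
    return std::string(data, len);
}
```

Constructing std::string(data, len) directly keeps every byte, including an embedded \0, which is why the extra character only shows up when the string is printed.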

Complex algorithm to extract numbers/number range from a string

I am working on an algorithm that should produce the following output:
Given values/Inputs:
char *Var = "1-5,10,12,15-16,25-35,67,69,99-105";
int size = 29;
Here "1-5" depicts a range value, i.e. it will be understood as "1,2,3,4,5" while the values with just "," are individual values.
I was writing an algorithm whose final output is the complete expanded list:
int list[] = {1,2,3,4,5,10,12,15,16,25,26,27,28,29,30,31,32,33,34,35,67,69,99,100,101,102,103,104,105};
If anyone is familiar with this problem, help would be really appreciated.
Thanks in advance!
My initial code approach was as:
if(NULL != strchr((char *)grp_range, '-'))
{
int_u8 delims[] = "-";
result = (int_u8 *)strtok((char *)grp_range, (char *)delims);
if(NULL != result)
{
start_index = strtol((char*)result, (char **)&end_ptr, 10);
result = (int_u8 *)strtok(NULL, (char *)delims);
}
while(NULL != result)
{
end_index = strtol((char*)result, (char**)&end_ptr, 10);
result = (int_u8 *)strtok(NULL, (char *)delims);
}
while(start_index <= end_index)
{
grp_list[i++] = start_index;
start_index++;
}
}
else if(NULL != strchr((char *)grp_range, ','))
{
int_u8 delims[] = ",";
result = (int_u8 *)strtok((char *)grp_range, (char *)delims);
while(result != NULL)
{
grp_list[i++] = strtol((char*)result, (char**)&end_ptr, 10);
result = (int_u8 *)strtok(NULL, (char *)delims);
}
}
But it only works if I have either "0-5" or "0,10,15". I would like to make it more versatile, so that it also handles input that mixes ranges and single values.

Here is a C++ solution for you to study.
#include <cstdlib>
#include <vector>
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int ConvertString2Int(const string& str)
{
stringstream ss(str);
int x;
if (! (ss >> x))
{
cerr << "Error converting " << str << " to integer" << endl;
abort();
}
return x;
}
vector<string> SplitStringToArray(const string& str, char splitter)
{
vector<string> tokens;
stringstream ss(str);
string temp;
while (getline(ss, temp, splitter)) // split into new "lines" based on character
{
tokens.push_back(temp);
}
return tokens;
}
vector<int> ParseData(const string& data)
{
vector<string> tokens = SplitStringToArray(data, ',');
vector<int> result;
for (vector<string>::const_iterator it = tokens.begin(), end_it = tokens.end(); it != end_it; ++it)
{
const string& token = *it;
vector<string> range = SplitStringToArray(token, '-');
if (range.size() == 1)
{
result.push_back(ConvertString2Int(range[0]));
}
else if (range.size() == 2)
{
int start = ConvertString2Int(range[0]);
int stop = ConvertString2Int(range[1]);
for (int i = start; i <= stop; i++)
{
result.push_back(i);
}
}
else
{
cerr << "Error parsing token " << token << endl;
abort();
}
}
return result;
}
int main()
{
vector<int> result = ParseData("1-5,10,12,15-16,25-35,67,69,99-105");
for (vector<int>::const_iterator it = result.begin(), end_it = result.end(); it != end_it; ++it)
{
cout << *it << " ";
}
cout << endl;
}
Live example
http://ideone.com/2W99Tt

This is my Boost approach.
It gives you a vector of ints rather than an array of ints.
Algorithm used (nothing new):
Split the string on ,
Split each piece on -
Make a range from low to high
Push the values in that range into the vector
Code:
#include <iostream>
#include <string>
#include <vector>
#include <boost/algorithm/string.hpp>
#include <boost/lexical_cast.hpp>
int main(){
std::string line("1-5,10,12,15-16,25-35,67,69,99-105");
std::vector<std::string> strs,r;
std::vector<int> v;
int low,high,i;
boost::split(strs,line,boost::is_any_of(","));
for (auto it:strs)
{
boost::split(r,it,boost::is_any_of("-"));
auto x = r.begin();
low = high =boost::lexical_cast<int>(r[0]);
x++;
if(x!=r.end())
high = boost::lexical_cast<int>(r[1]);
for(i=low;i<=high;++i)
v.push_back(i);
}
for(auto x:v)
std::cout<<x<<" ";
return 0;
}

Your issue seems to be a misunderstanding of how strtok works. Have a look at this.
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
int j;
char delims[] = " ,";
char str[] = "1-5,6,7";
char *tok;
char *dash;
int rstart, rend;
tok = strtok(str, delims);
while(tok != NULL) {
// look for a '-' after the first character of the token
dash = strchr(tok + 1, '-');
if(dash != NULL) {
//// range: atoi stops at the '-', so no temporary copy is needed
rstart = atoi(tok);
rend = atoi(dash + 1);
for(j = rstart; j <= rend; ++j)
printf("%d\n", j);
}
else {
//// single value: print it once
printf("%s\n", tok);
}
tok = strtok(NULL, delims);
}
return 0;
}

Don't search. Just go through the text one character at a time. As long as you're seeing digits, accumulate them into a value. If the digits are followed by a - then you're looking at a range, and need to parse the next set of digits to get the upper bound of the range and put all the values into your list. If the value is not followed by a - then you've got a single value; put it into your list.
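A minimal sketch of that single-pass idea might look like the following. The function name expandRanges is hypothetical, and the code assumes non-negative numbers and well-formed input with no overflow checks.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Sketch of the character-by-character approach; expandRanges is a
// hypothetical name. Assumes non-negative numbers and well-formed input
// such as "1-5,10"; there is no overflow or error handling.
std::vector<int> expandRanges(const std::string& text)
{
    std::vector<int> out;
    std::size_t i = 0;
    const std::size_t n = text.size();

    // Accumulate consecutive digits starting at position i into a value.
    auto readNumber = [&]() -> int {
        int value = 0;
        while (i < n && std::isdigit(static_cast<unsigned char>(text[i]))) {
            value = value * 10 + (text[i] - '0');
            ++i;
        }
        return value;
    };

    while (i < n) {
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) {
            ++i;                  // skip ',' and any other separator
            continue;
        }
        int low = readNumber();
        int high = low;           // a single value is a range of length one
        if (i < n && text[i] == '-') {
            ++i;                  // consume the '-'
            high = readNumber();  // upper bound of the range
        }
        for (int v = low; v <= high; ++v) {
            out.push_back(v);
        }
    }
    return out;
}
```

Calling expandRanges("1-5,10,12,15-16,25-35,67,69,99-105") on the input from the question should yield the 29 values listed there.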

Stop and think about it: what you actually have is a comma
separated list of ranges, where a range can be either a single
number, or a pair of numbers separated by a '-'. So you
probably want to loop over the ranges, using recursive descent
for the parsing. (This sort of thing is best handled by an
istream, so that's what I'll use.)
std::vector<int> results;
std::istringstream parser( ( std::string( var ) ) ); // extra parentheses avoid the most vexing parse
processRange( results, parser );
while ( isSeparator( parser, ',' ) ) {
processRange( results, parser );
}
with:
bool
isSeparator( std::istream& source, char separ )
{
char next;
source >> next;
if ( source && next != separ ) {
source.putback( next );
}
return source && next == separ;
}
and
void
processRange( std::vector<int>& results, std::istream& source )
{
int first = 0;
source >> first;
if ( !source ) {
return;
}
int last = first;
if ( isSeparator( source, '-' ) ) {
source >> last;
if ( !source || last < first ) {
source.setstate( std::ios_base::failbit );
return;
}
}
// The range is valid here, even if isSeparator hit end of input
// (and set failbit) while looking for a '-' after the last number.
while ( first != last ) {
results.push_back( first );
++ first;
}
results.push_back( first );
}
The isSeparator function will, in fact, probably be useful in
other projects in the future, and should be kept in your
toolbox.

First split the whole string into numbers and ranges (using strtok() with "," as the delimiter) and save the pieces in an array. Then go through the array looking for "-": if it is present, use sscanf() with the "%d-%d" format; otherwise use sscanf() with a single "%d" format.
The usage of these functions is easy to look up.
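As a rough sketch of this idea (with a hypothetical function name, parseWithSscanf, and skipping the intermediate array by handling each piece as soon as strtok() returns it):

```cpp
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Sketch of the strtok()/sscanf() idea; parseWithSscanf is a hypothetical
// name. The input is taken by value because strtok() modifies the buffer
// it scans. Pieces that do not convert are silently skipped.
std::vector<int> parseWithSscanf(std::string text)
{
    std::vector<int> out;
    if (text.empty()) {
        return out;
    }
    for (char* tok = std::strtok(&text[0], ","); tok != nullptr;
         tok = std::strtok(nullptr, ",")) {
        int low = 0;
        int high = 0;
        if (std::strchr(tok, '-') != nullptr) {
            // Range: both numbers must convert.
            if (std::sscanf(tok, "%d-%d", &low, &high) != 2) {
                continue;
            }
        } else {
            // Single value: treat it as a range of length one.
            if (std::sscanf(tok, "%d", &low) != 1) {
                continue;
            }
            high = low;
        }
        for (int v = low; v <= high; ++v) {
            out.push_back(v);
        }
    }
    return out;
}
```

Checking the return value of sscanf() is what tells a complete range apart from a malformed piece such as "5-".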

One approach:
You need a parser that identifies 3 kinds of tokens: ',', '-', and numbers. That raises the level of abstraction so that you are operating at a level above characters.
Then you can parse your token stream to create a list of ranges and constants.
Then you can parse that list to convert the ranges into constants.
Some code that does part of the job:
#include <stdio.h>
// Prints a comma after the last digit. You will need to fix that up.
void print(int a, int b) {
for (int i = a; i <= b; ++i) {
printf("%d, ", i);
}
}
int main() {
enum { DASH, COMMA, NUMBER };
struct token {
int type;
int value;
};
// Sample input stream. Notice the sentinel comma at the end.
// 1-5,10,
struct token tokStream[] = {
{ NUMBER, 1 },
{ DASH, 0 },
{ NUMBER, 5 },
{ COMMA, 0 },
{ NUMBER, 10 },
{ COMMA, 0 } };
// This parser assumes well formed input. You have to add all the error
// checking yourself.
size_t i = 0;
while (i < sizeof(tokStream)/sizeof(struct token)) {
if (tokStream[i+1].type == COMMA) {
print(tokStream[i].value, tokStream[i].value);
i += 2; // skip to next number
}
else { // DASH
print(tokStream[i].value, tokStream[i+2].value);
i += 4; // skip to next number
}
}
return 0;
}
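The code above starts from a hand-written token stream. A possible sketch of the missing tokenizer step is shown below; the names TokenType, Token and tokenize are hypothetical (they mirror the struct above but are not from the original answer), and the code assumes non-negative numbers and skips unknown characters.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tokenizer sketch. Assumes non-negative numbers; characters
// other than digits, '-' and ',' are skipped.
enum TokenType { DASH, COMMA, NUMBER };

struct Token {
    TokenType type;
    int value;  // only meaningful for NUMBER tokens
};

std::vector<Token> tokenize(const std::string& text)
{
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isdigit(c)) {
            // Accumulate a run of digits into one NUMBER token.
            int value = 0;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                value = value * 10 + (text[i] - '0');
                ++i;
            }
            tokens.push_back(Token{NUMBER, value});
        } else {
            if (c == '-') {
                tokens.push_back(Token{DASH, 0});
            } else if (c == ',') {
                tokens.push_back(Token{COMMA, 0});
            }
            ++i;
        }
    }
    // Append the sentinel comma that the parser above relies on.
    if (!tokens.empty() && tokens.back().type != COMMA) {
        tokens.push_back(Token{COMMA, 0});
    }
    return tokens;
}
```

For the input "1-5,10" this produces the same six tokens as the sample tokStream, including the sentinel comma at the end.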
