How do I get part of a char*? - c++

I have the following code that solves a small image using Tesseract.
char *answer = tess_api.GetUTF8Text();
I know beforehand that the result will always start with the character '+' and it's one word so I want to get rid of any junk it finds.
I get the result as "G+ABC S\n\n" and I need only +ABC. So basically I need to ignore anything before + and everything after the first space. I was thinking I should use rindex to find the position of + and spaces.

std::string ParseString(const std::string& s)
{
size_t plus = s.find_first_of('+');
size_t space = s.find_first_of(" \n", plus);
return s.substr(plus, space-plus);
}
int main()
{
std::cout << ParseString("G+ABC S\n\n").c_str() << std::endl;
std::cout << ParseString("G +ABC\ne\n").c_str() << std::endl;
return 0;
}
Gives
+ABC
+ABC
If you really can't use strings then something like this might do
char *ParseString2(char *s)
{
int plus,end;
for (plus = 0 ; s[plus] != '+' ; ++plus){}
for (end = plus ; s[end] != ' ' && s[end] != '\n' ; ++end){}
char *result = new char[end - plus + 1];
memcpy(result, s + plus, end - plus);
result[end - plus] = 0;
return result;
}

You can use:
// just scan "answer" to find out where to start and where to end
int indexStart = // find the index of '+'
int indexEnd = // find the index before space
int length = indexEnd-indexStart+1;
char *dataYouWant = (char *) malloc(length+1); // result will be stored here
memcpy( dataYouWant, &answer[indexStart], length );
// for example answer = "G+ABC S\n\n"
dataYouWant[length] = '\0'; // dataYouWant will be "+ABC"
You can check out Strings in c, how to get subString for other alternatives.
P.S. suggestion: use string instead in C++, it will be much easier (check out #DavidSykes's answer).

Related

Recognize string formatting Debug Assertion

I have a runtime problem with code below.
The purpose is to "recognize" the formats (%s %d etc) within the input string.
To do this, it returns an integer that matches the data type.
Then the extracted types are manipulated/handled in other functions.
I want to clarify that my purpose isn't to write formatted types in a string (snprintf etc.) but only to recognize/extract them.
The problem is the crash of my application with error:
Debug Assertion Failed!
Program:
...ers\Alex\source\repos\TestProgram\Debug\test.exe
File: minkernel\crts\ucrt\appcrt\convert\isctype.cpp
Line: 36
Expression: c >= -1 && c <= 255
My code:
#include <iostream>
#include <cstring>
enum Formats
{
TYPE_INT,
TYPE_FLOAT,
TYPE_STRING,
TYPE_NUM
};
typedef struct Format
{
Formats Type;
char Name[5 + 1];
} SFormat;
SFormat FormatsInfo[TYPE_NUM] =
{
{TYPE_INT, "d"},
{TYPE_FLOAT, "f"},
{TYPE_STRING, "s"},
};
int GetFormatType(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return format.Type;
}
return -1;
}
bool isValidFormat(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return true;
}
return false;
}
bool isFindFormat(const char* strBufFormat, size_t stringSize, int& typeFormat)
{
bool foundFormat = false;
std::string stringFormat = "";
for (size_t pos = 0; pos < stringSize; pos++)
{
if (!isalpha(strBufFormat[pos]))
continue;
if (!isdigit(strBufFormat[pos]))
{
stringFormat += strBufFormat[pos];
if (isValidFormat(stringFormat.c_str()))
{
typeFormat = GetFormatType(stringFormat.c_str());
foundFormat = true;
}
}
}
return foundFormat;
}
int main()
{
std::string testString = "some test string with %d arguments"; // crash application
// std::string testString = "%d some test string with arguments"; // not crash application
size_t stringSize = testString.size();
char buf[1024 + 1];
memcpy(buf, testString.c_str(), stringSize);
buf[stringSize] = '\0';
for (size_t pos = 0; pos < stringSize; pos++)
{
if (buf[pos] == '%')
{
if (buf[pos + 1] == '%')
{
pos++;
continue;
}
else
{
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
int typeFormat;
if (isFindFormat(bufFormat, stringSize, typeFormat))
{
std::cout << "type = " << typeFormat << "\n";
// ...
}
}
}
}
}
As I commented in the code, with the first string everything works. While with the second, the application crashes.
I also wanted to ask you is there a better/more performing way to recognize types "%d %s etc" within a string? (even not necessarily returning an int to recognize it).
Thanks.
Let's take a look at this else clause:
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
The variable stringSize was initialized with the size of the original format string. Let's say it's 30 in this case.
Let's say you found the %d code at offset 20. You're going to copy 30 characters, starting at offset 20, into bufFormat. That means you're copying 20 characters past the end of the original string. You could possibly read off the end of the original buf, but that doesn't happen here because buf is large. The third line sets a NUL into the buffer at position 30, again past the end of the data, but your memcpy copied the NUL from buf into bufFormat, so that's where the string in bufFormat will end.
Now bufFormat contains the string "%d arguments." Inside isFindFormat you search for the first isalpha character. Possibly you meant isalnum here? Because we can only get to the isdigit line if the isalpha check passes, and if it's isalpha, it's not isdigit.
In any case, after isalpha passes, isdigit will definitely return false so we enter that if block. Your code will find the right type here. But, the loop doesn't terminate. Instead, it continues scanning up to stringSize characters, which is the stringSize from main, that is, the size of the original format string. But the string you're passing to isFindFormat only contains the part starting at '%'. So you're going to scan past the end of the string and read whatever's in the buffer, which will probably trigger the assertion error you're seeing.
Theres a lot more going on here. You're mixing and matching std::string and C strings; see if you can use std::string::substr instead of copying. You can use std::string::find to find characters in a string. If you have to use C strings, use strcpy instead of memcpy followed by the addition of a NUL.
You could just demand it to a regexp engine which bourned to search through strings
Since C++11 there's direct support, what you have to do is
#include <regex>
then you can match against strings using various methods, for instance regex_match which gives you the possibility, together with an smatch to find out your target with just few lines of codes using standard library
std::smatch sm;
std::regex_match ( testString.cbegin(), testString.cend(), sm, str_expr);
where str_exp is your regex to find what you want specifically
in the sm you have now every matched string against your regexp, which you can print in this way
for (int i = 0; i < sm.size(); ++i)
{
std::cout << "Match:" << sm[i] << std::endl;
}
EDIT:
to better express the result you would achieve i'll include a simple sample below
// target string to be searched against
string target_string = "simple example no.%d is: %s";
// pattern to look for
regex str_exp("(%[sd])");
// match object
smatch sm;
// iteratively search your pattern on the string, excluding parts of the string already matched
cout << "My format strings extracted:" << endl;
while (regex_search(target_string, sm, str_exp))
{
std::cout << sm[0] << std::endl;
target_string = sm.suffix();
}
you can easily add any format string you want modifying the str_exp regex expression.

Remove extra white spaces in C++

I tried to write a script that removes extra white spaces but I didn't manage to finish it.
Basically I want to transform abc sssd g g sdg gg gf into abc sssd g g sdg gg gf.
In languages like PHP or C#, it would be very easy, but not in C++, I see. This is my code:
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <cstring>
#include <unistd.h>
#include <string.h>
char* trim3(char* s) {
int l = strlen(s);
while(isspace(s[l - 1])) --l;
while(* s && isspace(* s)) ++s, --l;
return strndup(s, l);
}
char *str_replace(char * t1, char * t2, char * t6)
{
char*t4;
char*t5=(char *)malloc(10);
memset(t5, 0, 10);
while(strstr(t6,t1))
{
t4=strstr(t6,t1);
strncpy(t5+strlen(t5),t6,t4-t6);
strcat(t5,t2);
t4+=strlen(t1);
t6=t4;
}
return strcat(t5,t4);
}
void remove_extra_whitespaces(char* input,char* output)
{
char* inputPtr = input; // init inputPtr always at the last moment.
int spacecount = 0;
while(*inputPtr != '\0')
{
char* substr;
strncpy(substr, inputPtr+0, 1);
if(substr == " ")
{
spacecount++;
}
else
{
spacecount = 0;
}
printf("[%p] -> %d\n",*substr,spacecount);
// Assume the string last with \0
// some code
inputPtr++; // After "some code" (instead of what you wrote).
}
}
int main(int argc, char **argv)
{
printf("testing 2 ..\n");
char input[0x255] = "asfa sas f f dgdgd dg ggg";
char output[0x255] = "NO_OUTPUT_YET";
remove_extra_whitespaces(input,output);
return 1;
}
It doesn't work. I tried several methods. What I am trying to do is to iterate the string letter by letter and dump it in another string as long as there is only one space in a row; if there are two spaces, don't write the second character to the new string.
How can I solve this?
There are already plenty of nice solutions. I propose you an alternative based on a dedicated <algorithm> meant to avoid consecutive duplicates: unique_copy():
void remove_extra_whitespaces(const string &input, string &output)
{
output.clear(); // unless you want to add at the end of existing sring...
unique_copy (input.begin(), input.end(), back_insert_iterator<string>(output),
[](char a,char b){ return isspace(a) && isspace(b);});
cout << output<<endl;
}
Here is a live demo. Note that I changed from c style strings to the safer and more powerful C++ strings.
Edit: if keeping c-style strings is required in your code, you could use almost the same code but with pointers instead of iterators. That's the magic of C++. Here is another live demo.
Here's a simple, non-C++11 solution, using the same remove_extra_whitespace() signature as in the question:
#include <cstdio>
void remove_extra_whitespaces(char* input, char* output)
{
int inputIndex = 0;
int outputIndex = 0;
while(input[inputIndex] != '\0')
{
output[outputIndex] = input[inputIndex];
if(input[inputIndex] == ' ')
{
while(input[inputIndex + 1] == ' ')
{
// skip over any extra spaces
inputIndex++;
}
}
outputIndex++;
inputIndex++;
}
// null-terminate output
output[outputIndex] = '\0';
}
int main(int argc, char **argv)
{
char input[0x255] = "asfa sas f f dgdgd dg ggg";
char output[0x255] = "NO_OUTPUT_YET";
remove_extra_whitespaces(input,output);
printf("input: %s\noutput: %s\n", input, output);
return 1;
}
Output:
input: asfa sas f f dgdgd dg ggg
output: asfa sas f f dgdgd dg ggg
Since you use C++, you can take advantage of standard-library features designed for that sort of work. You could use std::string (instead of char[0x255]) and std::istringstream, which will replace most of the pointer arithmetic.
First, make a string stream:
std::istringstream stream(input);
Then, read strings from it. It will remove the whitespace delimiters automatically:
std::string word;
while (stream >> word)
{
...
}
Inside the loop, build your output string:
if (!output.empty()) // special case: no space before first word
output += ' ';
output += word;
A disadvantage of this method is that it allocates memory dynamically (including several reallocations, performed when the output string grows).
There are plenty of ways of doing this (e.g., using regular expressions), but one way you could do this is using std::copy_if with a stateful functor remembering whether the last character was a space:
#include <algorithm>
#include <string>
#include <iostream>
struct if_not_prev_space
{
// Is last encountered character space.
bool m_is = false;
bool operator()(const char c)
{
// Copy if last was not space, or current is not space.
const bool ret = !m_is || c != ' ';
m_is = c == ' ';
return ret;
}
};
int main()
{
const std::string s("abc sssd g g sdg gg gf into abc sssd g g sdg gg gf");
std::string o;
std::copy_if(std::begin(s), std::end(s), std::back_inserter(o), if_not_prev_space());
std::cout << o << std::endl;
}
You can use std::unique which reduces adjacent duplicates to a single instance according to how you define what makes two elements equal is.
Here I have defined elements as equal if they are both whitespace characters:
inline std::string& remove_extra_ws_mute(std::string& s)
{
s.erase(std::unique(std::begin(s), std::end(s), [](unsigned char a, unsigned char b){
return std::isspace(a) && std::isspace(b);
}), std::end(s));
return s;
}
inline std::string remove_extra_ws_copy(std::string s)
{
return remove_extra_ws_mute(s);
}
std::unique moves the duplicates to the end of the string and returns an iterator to the beginning of them so they can be erased.
Additionally, if you must work with low level strings then you can still use std::unique on the pointers:
char* remove_extra_ws(char const* s)
{
std::size_t len = std::strlen(s);
char* buf = new char[len + 1];
std::strcpy(buf, s);
// Note that std::unique will also retain the null terminator
// in its correct position at the end of the valid portion
// of the string
std::unique(buf, buf + len + 1, [](unsigned char a, unsigned char b){
return (a && std::isspace(a)) && (b && std::isspace(b));
});
return buf;
}
for in-place modification you can apply erase-remove technic:
#include <string>
#include <iostream>
#include <algorithm>
#include <cctype>
int main()
{
std::string input {"asfa sas f f dgdgd dg ggg"};
bool prev_is_space = true;
input.erase(std::remove_if(input.begin(), input.end(), [&prev_is_space](unsigned char curr) {
bool r = std::isspace(curr) && prev_is_space;
prev_is_space = std::isspace(curr);
return r;
}), input.end());
std::cout << input << "\n";
}
So you first move all extra spaces to the end of the string and then truncate it.
The great advantage of C++ is that is universal enough to port your code to plain-c-static strings with only few modifications:
void erase(char * p) {
// note that this ony works good when initial array is allocated in the static array
// so we do not need to rearrange memory
*p = 0;
}
int main()
{
char input [] {"asfa sas f f dgdgd dg ggg"};
bool prev_is_space = true;
erase(std::remove_if(std::begin(input), std::end(input), [&prev_is_space](unsigned char curr) {
bool r = std::isspace(curr) && prev_is_space;
prev_is_space = std::isspace(curr);
return r;
}));
std::cout << input << "\n";
}
Interesting enough remove step here is string-representation independent. It will work with std::string without modifications at all.
I have the sinking feeling that good ol' scanf will do (in fact, this is the C school equivalent to Anatoly's C++ solution):
void remove_extra_whitespaces(char* input, char* output)
{
int srcOffs = 0, destOffs = 0, numRead = 0;
while(sscanf(input + srcOffs, "%s%n", output + destOffs, &numRead) > 0)
{
srcOffs += numRead;
destOffs += strlen(output + destOffs);
output[destOffs++] = ' '; // overwrite 0, advance past that
}
output[destOffs > 0 ? destOffs-1 : 0] = '\0';
}
We exploit the fact that scanf has magical built-in space skipping capabilities. We then use the perhaps less known %n "conversion" specification which gives us the amount of chars consumed by scanf. This feature frequently comes in handy when reading from strings, like here. The bitter drop which makes this solution less-than-perfect is the strlen call on the output (there is no "how many bytes have I actually just written" conversion specifier, unfortunately).
Last not least use of scanf is easy here because sufficient memory is guaranteed to exist at output; if that were not the case, the code would become more complex due to buffering and overflow handling.
Since you are writing c-style, here's a way to do what you want.
Note that you can remove '\r' and '\n' which are line breaks (but of course that's up to you if you consider those whitespaces or not).
This function should be as fast or faster than any other alternative and no memory allocation takes place even when it's called with std::strings (I've overloaded it).
char temp[] = " alsdasdl gasdasd ee";
remove_whitesaces(temp);
printf("%s\n", temp);
int remove_whitesaces(char *p)
{
int len = strlen(p);
int new_len = 0;
bool space = false;
for (int i = 0; i < len; i++)
{
switch (p[i])
{
case ' ': space = true; break;
case '\t': space = true; break;
case '\n': break; // you could set space true for \r and \n
case '\r': break; // if you consider them spaces, I just ignore them.
default:
if (space && new_len > 0)
p[new_len++] = ' ';
p[new_len++] = p[i];
space = false;
}
}
p[new_len] = '\0';
return new_len;
}
// and you can use it with strings too,
inline int remove_whitesaces(std::string &str)
{
int len = remove_whitesaces(&str[0]);
str.resize(len);
return len; // returning len for consistency with the primary function
// but u can return std::string instead.
}
// again no memory allocation is gonna take place,
// since resize does not not free memory because the length is either equal or lower
If you take a brief look at the C++ Standard library, you will notice that a lot C++ functions that return std::string, or other std::objects are basically a wrapper to a well written extern "C" function. So don't be afraid to use C functions in C++ applications, if they are well written and you can overload them to support std::strings and such.
For example, in Visual Studio 2015, std::to_string is written exactly like this:
inline string to_string(int _Val)
{ // convert int to string
return (_Integral_to_string("%d", _Val));
}
inline string to_string(unsigned int _Val)
{ // convert unsigned int to string
return (_Integral_to_string("%u", _Val));
}
and _Integral_to_string is a wrapper to a C function sprintf_s
template<class _Ty> inline
string _Integral_to_string(const char *_Fmt, _Ty _Val)
{ // convert _Ty to string
static_assert(is_integral<_Ty>::value,
"_Ty must be integral");
char _Buf[_TO_STRING_BUF_SIZE];
int _Len = _CSTD sprintf_s(_Buf, _TO_STRING_BUF_SIZE, _Fmt, _Val);
return (string(_Buf, _Len));
}
Well here is a longish(but easy) solution that does not use pointers.
It can be optimized further but hey it works.
#include <iostream>
#include <string>
using namespace std;
void removeExtraSpace(string str);
int main(){
string s;
cout << "Enter a string with extra spaces: ";
getline(cin, s);
removeExtraSpace(s);
return 0;
}
void removeExtraSpace(string str){
int len = str.size();
if(len==0){
cout << "Simplified String: " << endl;
cout << "I would appreciate it if you could enter more than 0 characters. " << endl;
return;
}
char ch1[len];
char ch2[len];
//Placing characters of str in ch1[]
for(int i=0; i<len; i++){
ch1[i]=str[i];
}
//Computing index of 1st non-space character
int pos=0;
for(int i=0; i<len; i++){
if(ch1[i] != ' '){
pos = i;
break;
}
}
int cons_arr = 1;
ch2[0] = ch1[pos];
for(int i=(pos+1); i<len; i++){
char x = ch1[i];
if(x==char(32)){
//Checking whether character at ch2[i]==' '
if(ch2[cons_arr-1] == ' '){
continue;
}
else{
ch2[cons_arr] = ' ';
cons_arr++;
continue;
}
}
ch2[cons_arr] = x;
cons_arr++;
}
//Printing the char array
cout << "Simplified string: " << endl;
for(int i=0; i<cons_arr; i++){
cout << ch2[i];
}
cout << endl;
}
I don't know if this helps but this is how I did it on my homework. The only case where it might break a bit is when there is spaces at the beginning of the string EX " wor ds " In that case, it will change it to " wor ds"
void ShortenSpace(string &usrStr){
char cha1;
char cha2;
for (int i = 0; i < usrStr.size() - 1; ++i) {
cha1 = usrStr.at(i);
cha2 = usrStr.at(i + 1);
if ((cha1 == ' ') && (cha2 == ' ')) {
usrStr.erase(usrStr.begin() + 1 + i);
--i;//edit: was ++i instead of --i, made code not work properly
}
}
}
I ended up here for a slighly different problem. Since I don't know where else to put it, and I found out what was wrong, I share it here. Don't be cross with me, please.
I had some strings that would print additional spaces at their ends, while showing up without spaces in debugging. The strings where formed in windows calls like VerQueryValue(), which besides other stuff outputs a string length, as e.g. iProductNameLen in the following line converting the result to a string named strProductName:
strProductName = string((LPCSTR)pvProductName, iProductNameLen)
then produced a string with a \0 byte at the end, which did not show easily in de debugger, but printed on screen as a space. I'll leave the solution of this as an excercise, since it is not hard at all, once you are aware of this.

Remove Duplicate words (only if followed) from char array

I am a little bit stuck and cant find out what is wrong here.
I have an assignment to enter a sentence into char array and if there are duplicate and followed words(example : same same , diff diff. but not : same word same.) they should be removed.
here is the function I wrote:
void Same(char arr[], char temp[]){
int i = 0, j = 0, f = 0, *p, k = 0, counter = 0;
for (i = 0; i < strlen(arr); i++){
while (arr[i] != ' ' && i < strlen(arr)){
temp[k] = arr[i];
i++;
k++;
counter++;
}
temp[k] = '\0';
k = 0;
p = strstr((arr + i), (temp + j));
if (p != NULL && (*p == arr[i])){
for (f = 0; f < strlen(p); f++){
*p = '*';
p++;
}
f = 0;
}
j = counter;
}
}
strtok is a handy function to grab the next word from a list (strsep is a better one, but is less likely to be available on your system). Using strtok, an approach like the following might work, at least for simple examples...
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAXPHRASELEN 1000
#define MAXTOKLEN 100
int main(int argc, char ** argv)
{
// Here is the sentence we are looking at
char * tmp = "This is a test and and another test";
// We will copy it to this variable
char phrase[MAXPHRASELEN+1];
strcpy(phrase, tmp);
// And will put the altered text in this variable
char new_phrase[MAXPHRASELEN+1];
// This will be the last word we looked at
char * lasttok = malloc(MAXTOKLEN+1);
// This will be the current word
char * tok = malloc(MAXTOKLEN+1);
// Both words are initially empty
new_phrase[0] = '\0';
lasttok[0] = '\0';
// Get the first word
lasttok = strtok(phrase, " ");
// If there is a word...
if (lasttok != NULL) {
// Put it in the altered text and add a space
strcat(new_phrase, lasttok);
strcat(new_phrase, " ");
// As long as there is a next word
while ( (tok = strtok(NULL, " ")) != NULL ) {
// See if it is the same as the last word
if (strcmp(tok,lasttok) != 0) {
// If it isn't, copy it to the altered text
strcat(new_phrase, tok);
// and add a space
strcat(new_phrase, " ");
// The current word becomes the last word
lasttok = tok;
}
}
}
// Print the lot
printf("%s\n", new_phrase);
}
If you really must write your own routine for grabbing the individual words, you could do worse than emulate strtok. It maintains a pointer to the beginning of current word in the string and puts a null character at the next separator (space character). When called again, it just moves the pointer to the character past the null, and puts another null after the next separator. Most string functions, when passed the pointer, will see the null as the end of the string and so just deal with the current word.
Minus comments, headers, and initialisation, it looks less threatening...
lasttok = strtok(phrase, " ");
if (lasttok != NULL) {
strcat(new_phrase, lasttok);
strcat(new_phrase, " ");
while ( (tok = strtok(NULL, " ")) != NULL ) {
if (strcmp(tok,lasttok) != 0) {
strcat(new_phrase, tok);
strcat(new_phrase, " ");
lasttok = tok;
}
}
}
printf("%s\n", new_phrase);

How to declare an empty char* and increase the size dynamically?

Let's say I am trying to do the following (this is a sub problem of what I am trying to achieve):
int compareFirstWord(char* sentence, char* compareWord){
char* temp; int i=-1;
while(*(sentence+(++i))!=' ') { *(temp+i) = *(sentence+i); }
return strcmp(temp, compareWord); }
When I ran compareFirstWord("Hi There", "Hi");, I got error at the copy line. It said I was using temp uninitialized. Then I used char* temp = new char[]; In this case the function returned 1 and not 0. When I debugged, I saw temp starting with some random characters of length 16 and strcmp fails because of this.
Is there a way to declare an empty char* and increase the size dynamically only to length and contents of what I need ? Any way to make the function work ? I don't want to use std::string.
In C, you may do:
int compareFirstWord(const char* sentence, const char* compareWord)
{
while (*compareWord != '\0' && *sentence == *compareWord) {
++sentence;
++compareWord;
}
if (*compareWord == '\0' && (*sentence == '\0' || *sentence == ' ')) {
return 0;
}
return *sentence < *compareWord ? -1 : 1;
}
With std::string, you just have:
int compareFirstWord(const std::string& sentence, const std::string& compareWord)
{
return sentence.compare(0, sentence.find(" "), compareWord);
}
temp is an uninitialized variable.
It looks like you are attempting to extract the first word out of the sentence in your loop.
In order to do it this way, you would first have to initialize temp to be at least as long as your sentence.
Also, your sentence may not have a space in it. (What about period, \t, \r, \n? Do these matter?)
In addition, you must terminate temp with a null character.
You could try:
int len = strlen(sentence);
char* temp = new char[len + 1];
int i = 0;
while(i < len && *(sentence+(i))!=' ') {
*(temp+i) = *(sentence+i);
i++;
}
*(temp+i) = '\0';
int comparable = strcmp(temp, compareWord);
delete temp;
return comparable;
Also consider using isspace(*(sentence+(i))), which will at least catch all whitespace.
In general, however, I'd use a library, or STL... Why reinvent the wheel...

Are there any functions in the stl which do string conversion, but with a length parameter?

Let say you have a string:
std::string s = "ABCD\t1234";
I can use std::string::find to get an offset to the '\t' character, but as far as I know there are no functions with the following signature:
int atoi_n(char *, int len);
An I missing anything? strtok replaces the \t with \0, and I don't want to touch the original buffer. I find it hard to believe that there aren't instances of the atoi, atof, etc which take a length parameter, but I can't find anything.
Anyone know if there is something I'm missing? I know boost has some tokenizers but I'd like avoid having to add the dependency of boost.
Looking the comments so far I'd like to clarify. Let's change the scenario:
char buffer[1024];
char *pStartPos;
char *pEndPost;
pStartPos = buffer + 5;
pEndPos = buffer + 10;
Let's also say you can't make any assumptions about the memory outside pStartPos and pEndPos. How do you convert the charaters between pStartPos and pEndPos to an int without adding a '\0' to buffer or copying using substr?
If you want to parse only the end of the string (from the character after \t to the end) you just need to pass a pointer to the first character to parse to atoi...
int n = atoi(s.c_str()+s.find('\t')+1);
(error checking omitted for brevity - in particular, we are always assuming that a \t is actually present)
If, instead, you want to parse from the beginning of the string up to \t you can just do
int n = atoi(s.c_str());
since atoi stops at the first non-numeric character anyway.
By the way, you should consider using more robust solutions for parsing the number, like strtol, sscanf or the C++ streams - they all can report a parsing error in some way, while atoi just returns 0 (which isn't distinguishable from a 0 that comes from parsing the string).
Incidentally, atoi is not in the "STL" by any means - it's just part of the C standard library.
I know that atoi is not in the STL. I was wondering if there was anything in STL like it where you can specify the last character which you want to include in the conversion. Basically I have a buffer which may be partially filled with garbage. I know the start of possible valid data and the end of possible valid data. I don't want to depend on whitespace to end the conversion, I want to be explicit about the length of the "field" because it also may not be /0 terminated.
If you are sure that the garbage doesn't start with digits you can use atoi/strtol/istringstream as is - they automatically stop just when they see the garbage. Otherwise, use the substr method to extract the exact substring you need:
std::string mayContainGarbage="alcak123456amaclmò";
std::string onlyTheDigits=mayContainGarbage.substr(5, 6);
// now parse onlyTheDigits as you prefer
To my knowledge, there is no such function out-of-box, but it shouldn't be difficult to implement.
For example:
template <typename ForwardIterator>
int range_to_int(ForwardIterator begin, ForwardIterator past_end) {
if (begin != past_end) {
bool negative = false;
auto ch = *begin;
if (ch == '-') {
negative = true;
++begin;
}
else if (ch == '+')
++begin;
if (begin != past_end) {
int result = 0;
do {
auto ch = *begin;
if (ch < '0' || ch > '9')
throw std::invalid_argument("Invalid digit.");
result = result * 10 + (ch - '0');
++begin;
} while (begin != past_end);
if (negative)
result = -result;
return result;
}
throw std::invalid_argument("+ or - must be followed by at least one digit.");
}
throw std::invalid_argument("Empty range.");
}
And you can use it like this:
int main() {
const char* buffer = "abc-123def";
int i = range_to_int(buffer + 4, buffer + 7);
assert(i == 123);
i = range_to_int(buffer + 3, buffer + 7);
assert(i == -123);
try {
i = range_to_int(buffer + 3, buffer + 8);
assert(false);
}
catch (const std::exception& ex) {
std::cout << ex.what() << std::endl;
}
try {
i = range_to_int(buffer + 3, buffer + 4);
assert(false);
}
catch (const std::exception& ex) {
std::cout << ex.what() << std::endl;
}
try {
i = range_to_int(buffer + 4, buffer + 4);
assert(false);
}
catch (const std::exception& ex) {
std::cout << ex.what() << std::endl;
}
// You can use it on std::string as well....
const std::string str = buffer;
i = range_to_int(str.begin() + 4, str.begin() + 7);
assert(i == 123);
// Etc...
return EXIT_SUCCESS;
}