I have a string like this: 1-2,4^,14-56
I am expecting the output 2-3,5^,15-57
char input[48];
int temp;
char *pch;
pch = strtok(input, "-,^");
while(pch != NULL)
{
char tempch[10];
temp = atoi(pch);
temp++;
itoa(temp, tempch, 10);
memcpy(pch, tempch, strlen(tempch));
pch = strtok(NULL, "-,^");
}
After running through this if I print input it prints only 2 which is first character of the updated string. It does not print all characters in the string. What is the problem with my code?
For plain C, use the library function strtol. Unlike atoi, this can update a pointer to the next unparsed character:
long strtol (const char *restrict str, char **restrict endptr, int base);
...
The strtol() function converts the string in str to a long value. [...] If endptr is not NULL, strtol() stores the address of the first invalid character in *endptr.
Since there may be more than one 'not-a-digit' character between the numbers, skip them with the library function isdigit. I placed this at the start of the loop so it would not accidentally convert a string such as -2,3 to -1,4 -- the initial -2 would be picked up first! (And if that is a problem elsewhere, there is also a strtoul.)
Since it appears you want the result in a char string, I use sprintf to copy the output into a buffer, which must be large enough for your possible input plus any extra characters produced when an increment adds a digit (for example, 99 becoming 100).
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
int main (void)
{
char *inputString = "1-2,4^,14-56";
char *next_code_at = inputString;
long result;
char dest[100], *dest_ptr;
printf ("%s\n", inputString);
dest[0] = 0;
dest_ptr = dest;
while (next_code_at && *next_code_at)
{
while (*next_code_at && !(isdigit(*next_code_at)))
{
dest_ptr += sprintf (dest_ptr, "%c", *next_code_at);
next_code_at++;
}
if (*next_code_at)
{
errno = 0; /* reset first: strtol only sets errno on failure */
result = strtol (next_code_at, &next_code_at, 10);
if (errno)
{
perror ("strtol failed");
return EXIT_FAILURE;
} else
{
if (result < LONG_MAX)
dest_ptr += sprintf (dest_ptr, "%ld", result+1);
else
{
fprintf (stderr, "number too large!\n");
return EXIT_FAILURE;
}
}
}
}
printf ("%s\n", dest);
return EXIT_SUCCESS;
}
Sample run:
Input: 1-2,4^,14-56
Output: 2-3,5^,15-57
There are two major problems with this code:
First of all,
pch = strtok(input, ",");
When applied to the string 1-2,4^,14-56 will return the token 1-2.
When you call atoi("1-2") you'll get 1, which gets converted to 2.
You can fix this by changing the first strtok to pch = strtok(input, "-,^");
Second of all, strtok modifies the string, which means that you lose the original delimiter found. As this looks like a homework exercise, I'll leave you to figure out how to get around this.
I think this could be easier using regular expressions (and C++ instead of C, of course):
Complete example:
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
int main()
{
// Your test string.
std::string input("1-2,4^,14-56");
// Regex representing a number.
std::regex number("\\d+");
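// Iterators for traversing the test string using the number regex.
auto ri_begin = std::sregex_iterator(input.begin(), input.end(), number);
auto ri_end = std::sregex_iterator();
// Build the result separately: modifying input while iterating would invalidate the regex iterators.
std::string output;
std::size_t last = 0; // End of the previous match in input.
for (auto i = ri_begin; i != ri_end; ++i)
{
const std::smatch& match = *i; // Match a number.
int value = std::stoi(match.str()); // Convert that number to integer.
std::size_t pos = static_cast<std::size_t>(match.position());
output.append(input, last, pos - last); // Copy the text before the number.
output += std::to_string(value + 1); // Increment and convert to string again.
last = pos + static_cast<std::size_t>(match.length());
}
output.append(input, last, std::string::npos); // Copy whatever follows the last number.
std::cout << output << std::endl;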
return 0;
}
Output:
2-3,5^,15-57
strtok modifies the string you pass to it. Either use strchr or something like that to find the delimiters or make a copy of the string to work on.
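One way to keep the delimiters, sketched here under some assumptions, is to scan the text character by character instead of calling strtok. The helper name increment_numbers is made up for this illustration, and the sketch assumes every number fits in a long with room to add one.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of text in which every run of digits
// is replaced by that number plus one; all other characters are copied as-is.
// Assumes each number (plus one) fits in a long.
std::string increment_numbers(const std::string& text)
{
    std::string result;
    std::size_t i = 0;
    while (i < text.size()) {
        if (std::isdigit(static_cast<unsigned char>(text[i]))) {
            // Collect the whole run of digits, then convert and increment it.
            std::size_t start = i;
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                ++i;
            }
            long value = std::stol(text.substr(start, i - start));
            result += std::to_string(value + 1);
        } else {
            // Delimiters are never overwritten, so they survive unchanged.
            result += text[i];
            ++i;
        }
    }
    return result;
}
```

Calling increment_numbers("1-2,4^,14-56") should give back 2-3,5^,15-57, and the original string is left untouched because the result is built in a separate string.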
I've saved in a pointer an address of a word saved in a char list. I'm using strtok function to get the words delimited by the keywords I've set.
#include <iostream>
#include <cstring>
#include <cctype>
using namespace std;
int main() {
unsigned int i, z;
char sir[256];
cin.getline(sir, 256);
for(i = 0; i < strlen(sir); ++i){
char * p = strtok(sir, " ,.");
while(p != nullptr) {
// here i want to process the word.
p = strtok(sir, " ,.");
}
}
return 0;
}
What I want to do is to process the words like that:
Let's assume the word "StackOverflow", i want to go from the first letter which is "S" to the last letter which is "w". How can i do that?
Thank you very much, I hope it's clear what I'm asking.
Changed the input to a static value; but just switch that out.
#include <iostream>
#include <cstdio>
#include <cstring>
#include <cctype>
#include <string>
using namespace std;
int main()
{
unsigned int i, z;
char sir[256] = "Thisissome ,.StackOverflow ,.sampletext";
//cin.getline(sir, 256);
char * p = strtok(sir, " ,.");
std::string word("StackOverflow");
do
{
printf ("%s\n",p);
std::string test(p);
if (word.compare(test) == 0)
{
printf("--> Found it!\n");
}
p = strtok (NULL, " ,.-");
} while(p != nullptr);
return 0;
}
strtok splits a string into a series of tokens. It should not be called repeatedly with the original character array as the first argument (i.e. the for(...) strtok(...) in the OP's code).
The initial call expects a C string as its first argument; subsequent calls expect a nullptr. On each successive call, the next token after the delimiter is returned from the C string.
Once the terminating null character is reached, the function will always return nullptr (when nullptr is the first argument).
Since the solution is already using C++, it uses a std::string to show each word as it appears and to compare it against a target word.
The value returned to p is still just a char*, so the OP can decide how to use the value returned by strtok. The example is intended simply to show how to get delimited words using strtok, and a simple way of checking whether a word is "StackOverflow".
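To go from the first letter of a word to the last one, the char* returned by strtok can be walked until the terminating null character. The sketch below is only an illustration: the helper name spell_words and the choice to join the letters with '-' are assumptions, made so that the per-letter walk is visible in the result.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits text on " ,." and returns each word with its
// letters joined by '-', to show a walk from the first letter to the last.
// Note: strtok modifies text in place.
std::vector<std::string> spell_words(char* text)
{
    std::vector<std::string> spelled;
    for (char* p = std::strtok(text, " ,."); p != nullptr;
         p = std::strtok(nullptr, " ,.")) {
        std::string word;
        // p points at the first letter; the token ends at the '\0' strtok wrote.
        for (const char* c = p; *c != '\0'; ++c) {
            if (c != p) {
                word += '-';
            }
            word += *c;
        }
        spelled.push_back(word);
    }
    return spelled;
}
```

The inner loop is the part that answers the question: replace its body with whatever per-letter processing is needed.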
I have the following piece of C++ code:
string dots="...";
char *points=(char *)malloc(sizeof(char)*20);
strcpy(points,dots.c_str());
points=strtok(points,".");
while(points!=NULL)
{
cout<<points<<endl;
points=strtok(NULL,".");
}
The cout statement prints nothing. What is this character that cout returns for 0 length token match? I have tried to check for '\0' but does not work. Please Help.
EDIT: Complete Program to Validate IP Addresses
#include<iostream>
#include<cstring>
#include<stdlib.h>
using namespace std;
int validateIP(string);
int main()
{
string IP;
cin>>IP;
int result=validateIP(IP);
if(result==0)
cout<<"Invalid IP"<<endl;
if(result==1)
cout<<"Valid IP"<<endl;
return 0;
}
//function definition validateIP(string)
int validateIP(string IP)
{
char ip[16];
int dotCount=0;
strcpy(ip,IP.c_str());
//check number of dots
for(int i=0;i<strlen(ip);++i)
{
if(ip[i]=='.')
{
dotCount++;
}
}
if(dotCount!=3)
return 0;
//check range
char *numToken;
numToken = strtok (ip,".");
while (numToken!= NULL)
{
int number;
if(numToken!=NULL) //check for token of length 0(e.g. case: ...)
number=atoi(numToken); //i also checked for (numToken[0]!='\O')
else return 0;
if(number<0 or number>255)
return 0;
numToken=strtok (NULL,".");
}
return 1;
}
The program prints ValidIP for input: ...
Your code has undefined behavior: you haven't allocated memory for points, and accessing it invokes UB.
Update: if I could, I would write validateIP using only std::string and STL functions. Mixing C and C++ is not good for maintenance.
#include <algorithm>
#include <sstream>
#include <string>
int to_int(const std::string& s)
{
int i(0);
std::stringstream ss(s);
ss >> i;
return i;
}
bool isValidIp(const std::string& IP)
{
if (std::count(IP.begin(), IP.end(), '.') != 3)
{
return false;
}
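std::stringstream ss(IP);
std::string s;
int tokenCount = 0;
while(std::getline(ss, s, '.'))
{
// An empty token (as in "...") is not a number, so reject it.
if (s.empty())
{
return false;
}
int token = to_int(s);
if (token < 0 || token > 255)
{
return false;
}
++tokenCount;
}
// A trailing dot (as in "1.2.3.") yields only three tokens.
return tokenCount == 4;
}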
Then you call it:
if (isValidIp(IP))
{
std::cout << "Valid IP" << std::endl;
}
else
{
std::cout << "Invalid IP" << std::endl;
}
The strtok function returns a sub-string of the given string delimited by the given character. IMO (to be tested), if your string contains only delimiting characters, strtok will return NULL (no more tokens) on the first call.
Moreover, in your code snippet, you copy the string to an uninitialized pointer. Replace your call to strcpy with a call to strdup so that the underlying memory is allocated before copying. Edit: you modified your question as I was answering.
strtok is used to tokenize a string. Say I have the string "abc.def.ghi.jkl"; then we can use strtok to get the tokens based on the delimiter.
char a[]="abc.def.ghi.jkl";
char *tmp=strtok(a, ".");
if (tmp != NULL) //Required because strtok returns NULL if it finds no token
printf("\n value is [%s]", tmp); //output is abc
So, in your case "..." is the string and '.' is the delimiter. strtok skips leading delimiters and never returns an empty token, so there are no characters left to form a token.
Your code will therefore get NULL from the very first strtok call.
Second you have to allocate memory to the points variable like
char points[dots.length()+1];
If the string contains only delimiting characters, strtok returns NULL.
You probably want this:
int main()
{
string dots=". . ."; //Notice space
char *points=(char *)malloc(sizeof(char)*20);
char *p; // Use a char pointer
strcpy(points,dots.c_str());
p=strtok(points,".");
while(p!=NULL)
{
cout<<p<<endl; // print the current token, not the start of the buffer
p=strtok(NULL,".");
}
/* Free Memory */
free(points);
}
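How strtok treats input made only of delimiters can be checked with a small sketch. The helper name count_tokens is an assumption introduced for this illustration; it tokenizes a copy, so the caller's string is not modified.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: counts the tokens strtok finds in a copy of text.
int count_tokens(std::string text, const char* delims)
{
    int count = 0;
    // text is a by-value copy, so strtok may overwrite it freely.
    // &text[0] is a writable, null-terminated buffer since C++11.
    for (char* p = std::strtok(&text[0], delims); p != nullptr;
         p = std::strtok(nullptr, delims)) {
        ++count;
    }
    return count;
}
```

For "..." with "." as the delimiter the count is 0, because the first call already returns NULL. An IP check built on strtok could therefore require exactly four tokens, in addition to the three-dot check, to reject inputs such as "..." or "1..2".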
Okay, so I'm trying to reverse a C style string in C++ , and I'm coming upon some weird output. Perhaps someone can shed some light?
Here is my code:
int main(){
char str[] = "string";
int strSize = sizeof(str)/sizeof(char);
char str2[strSize];
int n = strSize-1;
int i =0;
while (&str+n >= &str){
str2[i] = *(str+n);
n--;
i++;
}
int str2size = sizeof(str)/sizeof(char);
int x;
for(x=0;x<str2size;x++){
cout << str2[x];
}
}
The basic idea here is just making a pointer point to the end of the string, and then reading it in backwards into a new array using pointer arithmetic.
In this particular case, I get an output of: " gnirts"
There is an annoying space at the beginning of any output which I'm assuming is the null character? But when I try to get rid of it by decrementing the strSize variable to exclude it, I end up with some other character on the opposite end of the string probably from another memory block.
Any ideas on how to avoid this? PS: (would you guys consider this a good idea of reversing a string?)
A valid string should be terminated by a null character. So you need to keep the null character in its original position (at the end of the string) and only reverse the non-null characters. So you would have something like this:
str2[strSize - 1] = str[strSize - 1]; // Copy the null at the end of the string
int n = strSize - 2; // Start from the penultimate character
There is an algorithm in the Standard Library to reverse a sequence. Why reinvent the wheel?
#include <algorithm>
#include <cstring>
#include <iostream>
int main()
{
char str[] = "string";
std::reverse(str, str + strlen(str)); // use the Standard Library
std::cout << str << '\n';
}
@ildjarn and @Blastfurnace have already given good ideas, but I think I'd take it a step further and use the iterators to construct the reversed string:
std::string input("string");
std::string reversed(input.rbegin(), input.rend());
std::cout << reversed;
I would let the C++ standard library do more of the work...
#include <cstddef>
#include <algorithm>
#include <iterator>
#include <iostream>
int main()
{
typedef std::reverse_iterator<char const*> riter_t;
char const str[] = "string";
std::size_t const strSize = sizeof(str);
char str2[strSize] = { };
std::copy(riter_t(str + strSize - 1), riter_t(str), str2);
std::cout << str2 << '\n';
}
while (&str+n >= &str){
This is nonsense, you want simply
while (n >= 0) {
and
str2[i] = *(str+n);
should be the much more readable
str2[i] = str[n];
Your while loop condition (&str+n >= &str) is equivalent to (n >= 0).
Your *(str+n) is equivalent to str[n] and I prefer the latter.
As HappyPixel said, you should start n at strSize-2, so the first character copied will be the last actual character of str, not the null termination character of str.
Then after you have copied all the regular characters in the loop, you need to add a null termination character at the end of the str2 using str2[strSize-1] = 0;.
Here is fixed, working code that outputs "gnirts":
#include <iostream>
using namespace std;
int main(int argc, char **argv){
char str[] = "string";
const int strSize = sizeof(str)/sizeof(char); // const, so it is a valid array size in standard C++
char str2[strSize];
int n = strSize-2; // Start at last non-null character
int i = 0;
while (n >= 0){
str2[i] = str[n];
n--;
i++;
}
str2[strSize-1] = 0; // Add the null terminator.
int str2size = sizeof(str)/sizeof(char);
int x;
cout << str2;
}
Here is a problem from spoj. nothing related to algorithms, but just c
Sample Input
2
a aa bb cc def ghi
a a a a a bb bb bb bb c c
Sample Output
3
5
it counts the longest sequence of same words
http://www.spoj.pl/problems/WORDCNT/
The word is less than 20 characters
But when i run it, it's giving segmentation fault. I debugged it using eclipse. Here's where it crashes
if (strcmp(previous, current) == 0)
currentLength++;
with the following message
No source available for "strcmp() at 0x2d0100"
What's the problem?
#include <iostream>
#include <cstring>
#include <string>
#include <cstdio>
using namespace std;
int main(int argc, const char *argv[])
{
int t;
cin >> t;
while (t--) {
char line[20000], previous[21], current[21], *p;
int currentLength = 1, maxLength = 1;
if (cin.peek() == '\n') cin.get();
cin.getline(line, 20000);
p = strtok(line, " '\t''\r'");
strcpy(previous, p);
while (p != NULL) {
p = strtok(NULL, " '\t''\r'");
strcpy(current, p);
if (strcmp(previous, current) == 0)
currentLength++;
else
currentLength = 1;
if (currentLength > maxLength)
maxLength = currentLength;
}
cout << maxLength << endl;
}
return 0;
}
The problem is probably here:
while (p != NULL) {
p = strtok(NULL, " '\t''\r'");
strcpy(current, p);
p may not be NULL when the loop is entered, but it may be NULL by the time strcpy is used on it.
A more correct form of the loop would be:
while ((p != NULL) && ((p = strtok(NULL, " \t\r")) != NULL))
{
strcpy(current, p);
Note. Tokenizing a stream in C++ is a lot easier.
std::string token;
std::cin >> token; // Reads 1 white space separated word
If you want to tokenize a line
// Step 1: read a single line in a safe way.
std::string line;
std::getline(std::cin, line);
// Turn that line into a stream.
std::stringstream linestream(line);
// Get 1 word at a time from the stream.
std::string token;
while(linestream >> token)
{
// Do STUFF
}
You forgot to check for NULL from strtok: it returns NULL when done, and you cannot pass that NULL to strcpy, strcmp, etc.
Note that you do a strcpy right after the strtok; you should check p for NULL before using it as a source.
The strtok man page says:
Each call to strtok() returns a pointer to a null-terminated string containing the next
token. This string does not include the delimiting character. If no more tokens are found,
strtok() returns NULL.
And in your code,
while (p != NULL) {
p = strtok(NULL, " '\t''\r'");
strcpy(current, p);
you are not checking for NULL (for p) once the whole string has been parsed. After that you are trying to copy p (which is NULL now) in current and so getting the crash.
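A loop that tests the result of strtok before every use might look like the sketch below. It keeps the strcmp comparison from the question; the helper name longest_equal_run is an assumption, and the previous word is tracked with a pointer into the line rather than copied into a fixed-size buffer.

```cpp
#include <cstring>

// Hypothetical helper: length of the longest run of identical consecutive
// words in line. strtok modifies line in place.
int longest_equal_run(char* line)
{
    const char* delims = " \t\r\n";
    char* previous = std::strtok(line, delims);
    if (previous == nullptr) {
        return 0; // no words at all
    }
    int current_length = 1;
    int max_length = 1;
    // The loop condition checks p before it is ever dereferenced.
    for (char* p = std::strtok(nullptr, delims); p != nullptr;
         p = std::strtok(nullptr, delims)) {
        if (std::strcmp(previous, p) == 0) {
            ++current_length;
        } else {
            current_length = 1;
        }
        if (current_length > max_length) {
            max_length = current_length;
        }
        previous = p; // tokens point into line, so no copy is needed
    }
    return max_length;
}
```

Because each token stays valid for as long as line does, no strcpy is needed, which also removes the risk of overrunning a 21-byte buffer.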
You will find that one of previous or current does not point to a null-terminated string at that point, so strcmp doesn't know when to stop.
Use proper C++ strings and string functions instead, rather than mixing C and C++. The Boost libraries can provide a much safer tokeniser than strtok.
You probably undersized current and previous. You should really use std::string for this kind of thing- that's what it's for.
You are doing nothing to check your string lengths before copying them into buffers of size 21. I bet that you are somehow overwriting the end of the buffer.
If you insist on using C strings, I'd suggest using strncmp instead of strcmp. That way, in case you are ending up with a non-null terminated string (which is what I suspect is the case), you can restrict your compare to the max length of the string (in this case 21).
Try this one...
#include <cstdio>
#include <cstring>
#define T(x) strtok(x, " \n\r\t")
char line[44444];
int main( )
{
int t; scanf("%d\n", &t);
while(t--)
{
fgets(line, 44444, stdin);
int cnt = 1, len, maxcnt = 0, plen = -1;
for(char *p = T(line); p != NULL; p = T(NULL))
{
len = strlen(p);
if(len == plen) ++cnt;
else cnt = 1;
if(cnt > maxcnt)
maxcnt = cnt;
plen = len;
}
printf("%d\n", maxcnt);
}
return 0;
}
I have a string that I would like to tokenize.
But the C strtok() function requires my string to be a char*.
How can I do this simply?
I tried:
token = strtok(str.c_str(), " ");
which fails because it turns it into a const char*, not a char*
#include <iostream>
#include <string>
#include <sstream>
int main(){
std::string myText("some-text-to-tokenize");
std::istringstream iss(myText);
std::string token;
while (std::getline(iss, token, '-'))
{
std::cout << token << std::endl;
}
return 0;
}
Or, as mentioned, use boost for more flexibility.
Duplicate the string, tokenize it, then free it.
char *dup = strdup(str.c_str());
token = strtok(dup, " ");
free(dup);
If boost is available on your system (I think it's standard on most Linux distros these days), it has a Tokenizer class you can use.
If not, then a quick Google turns up a hand-rolled tokenizer for std::string that you can probably just copy and paste. It's very short.
And, if you don't like either of those, then here's a split() function I wrote to make my life easier. It'll break a string into pieces using any of the chars in "delim" as separators. Pieces are appended to the "parts" vector:
void split(const string& str, const string& delim, vector<string>& parts) {
size_t start, end = 0;
while (end < str.size()) {
start = end;
while (start < str.size() && (delim.find(str[start]) != string::npos)) {
start++; // skip initial whitespace
}
end = start;
while (end < str.size() && (delim.find(str[end]) == string::npos)) {
end++; // skip to end of word
}
if (end-start != 0) { // just ignore zero-length strings.
parts.push_back(string(str, start, end-start));
}
}
}
There is a more elegant solution.
With std::string you can use resize() to allocate a suitably large buffer, and &s[0] to get a pointer to the internal buffer.
At this point many fine folks will jump and yell at the screen. But this is the fact. About 2 years ago the library working group decided (meeting at Lillehammer) that just like for std::vector, std::string should also formally, not just in practice, have a guaranteed contiguous buffer.
The other concern is whether strtok() increases the size of the string. The MSDN documentation says:
Each call to strtok modifies strToken by inserting a null character after the token returned by that call.
But this is not correct. Actually the function replaces the first occurrence of a separator character with \0. No change in the size of the string. If we have this string:
one-two---three--four
we will end up with
one\0two\0--three\0-four
So my solution is very simple:
std::string str("some-text-to-split");
char seps[] = "-";
char *token;
token = strtok( &str[0], seps );
while( token != NULL )
{
/* Do your thing */
token = strtok( NULL, seps );
}
Read the discussion on http://www.archivum.info/comp.lang.c++/2008-05/02889/does_std::string_have_something_like_CString::GetBuffer
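The in-place behaviour described above can be checked with a short sketch. The helper name tokenize_in_place is an assumption made for this illustration; it relies on &s[0] pointing to a contiguous, null-terminated buffer, which C++11 guarantees.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes s in place with strtok and returns copies
// of the tokens. s keeps its size, but separators become '\0'.
std::vector<std::string> tokenize_in_place(std::string& s, const char* seps)
{
    std::vector<std::string> tokens;
    for (char* p = std::strtok(&s[0], seps); p != nullptr;
         p = std::strtok(nullptr, seps)) {
        tokens.push_back(p); // copy the token out of the buffer
    }
    return tokens;
}
```

Run on one-two---three--four with "-" as the separator, this yields the four tokens one, two, three and four, while s.size() stays at 21 and only the first separator after each token is overwritten.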
With C++17, std::string receives a data() overload that returns a pointer to a modifiable buffer, so the string can be used in strtok directly without any hacks:
#include <string>
#include <iostream>
#include <cstring>
#include <cstdlib>
int main()
{
::std::string text{"pop dop rop"};
char const * const psz_delimiter{" "};
char * psz_token{::std::strtok(text.data(), psz_delimiter)};
while(nullptr != psz_token)
{
::std::cout << psz_token << ::std::endl;
psz_token = std::strtok(nullptr, psz_delimiter);
}
return EXIT_SUCCESS;
}
output
pop
dop
rop
EDIT: the const cast is used only to demonstrate the effect of strtok() when applied to a pointer returned by string::c_str().
You should not use strtok() since it modifies the tokenized string, which may lead to undesired, if not undefined, behaviour as the C string "belongs" to the string instance.
#include <cstring>
#include <string>
#include <iostream>
int main(int ac, char **av)
{
std::string theString("hello world");
std::cout << theString << " - " << theString.size() << std::endl;
//--- this cast *only* to illustrate the effect of strtok() on std::string
char *token = strtok(const_cast<char *>(theString.c_str()), " ");
std::cout << theString << " - " << theString.size() << std::endl;
return 0;
}
After the call to strtok(), the space was "removed" from the string, or turned down to a non-printable character, but the length remains unchanged.
>./a.out
hello world - 11
helloworld - 11
Therefore you have to resort to a native mechanism, duplication of the string, or a third-party library as previously mentioned.
I suppose the language is C, or C++...
strtok, IIRC, replaces separators with \0. That's why it cannot be used on a const string.
To work around that "quickly", if the string isn't huge, you can just strdup() it, which is wise if you need to keep the string unaltered (as the const suggests...).
On the other hand, you might want to use another tokenizer, perhaps hand rolled, less violent on the given argument.
Assuming that by "string" you're talking about std::string in C++, you might have a look at the Tokenizer package in Boost.
First off I would say use boost tokenizer.
Alternatively if your data is space separated then the string stream library is very useful.
But both the above have already been covered.
So as a third C-Like alternative I propose copying the std::string into a buffer for modification.
std::string data("The data I want to tokenize");
// Create a buffer of the correct length:
std::vector<char> buffer(data.size()+1);
// copy the string into the buffer
strcpy(&buffer[0],data.c_str());
// Tokenize
strtok(&buffer[0]," ");
If you don't mind open source, you could use the subbuffer and subparser classes from https://github.com/EdgeCast/json_parser. The original string is left intact, there is no allocation and no copying of data. I have not compiled the following so there may be errors.
std::string input_string("hello world");
subbuffer input(input_string);
subparser flds(input, ' ', subparser::SKIP_EMPTY);
while (!flds.empty())
{
subbuffer fld = flds.next();
// do something with fld
}
// or if you know it is only two fields
subbuffer fld1 = input.before(' ');
subbuffer fld2 = input.sub(fld1.length() + 1).ltrim(' ');
Typecasting to (char*) got it working for me!
token = strtok((char *)str.c_str(), " ");
Chris's answer is probably fine when using std::string; however in case you want to use std::basic_string<char16_t>, std::getline can't be used. Here is a possible other implementation:
template <class CharT> bool tokenizestring(const std::basic_string<CharT> &input, CharT separator, typename std::basic_string<CharT>::size_type &pos, std::basic_string<CharT> &token) {
if (pos >= input.length()) {
// if input is empty, or ends with a separator, return an empty token when the end has been reached (and return an out-of-bound position so subsequent call won't do it again)
if ((pos == 0) || ((pos > 0) && (pos == input.length()) && (input[pos-1] == separator))) {
token.clear();
pos=input.length()+1;
return true;
}
return false;
}
typename std::basic_string<CharT>::size_type separatorPos=input.find(separator, pos);
if (separatorPos == std::basic_string<CharT>::npos) {
token=input.substr(pos, input.length()-pos);
pos=input.length();
} else {
token=input.substr(pos, separatorPos-pos);
pos=separatorPos+1;
}
return true;
}
Then use it like this:
std::basic_string<char16_t> s;
std::basic_string<char16_t> token;
std::basic_string<char16_t>::size_type tokenPos=0;
while (tokenizestring(s, (char16_t)' ', tokenPos, token)) {
...
}
It fails because str.c_str() returns a pointer to a constant string, but char * strtok (char * str, const char * delimiters ) requires a modifiable (non-const) string. So you need to use const_cast<char *> in order to remove the const.
I am giving you a complete but small program to tokenize the string using C strtok() function.
#include <iostream>
#include <string>
#include <string.h>
using namespace std;
int main() {
string s="20#6 5, 3";
// strtok requires a modifiable string as it modifies the supplied string in order to tokenize it
char *str=const_cast< char *>(s.c_str());
char *tok;
tok=strtok(str, "#, " );
int arr[4], i=0;
while(tok!=NULL){
arr[i++]=stoi(tok);
tok=strtok(NULL, "#, " );
}
for(int i=0; i<4; i++) cout<<arr[i]<<endl;
return 0;
}
NOTE: strtok may not be suitable in all situations, as the string passed to the function gets modified by being broken into smaller strings. Please refer to the following for a better understanding of how strtok works.
How strtok works
Added a few print statements to better understand the changes happening to the string on each call to strtok and how it returns a token.
#include <iostream>
#include <string>
#include <string.h>
using namespace std;
int main() {
string s="20#6 5, 3";
char *str=const_cast< char *>(s.c_str());
char *tok;
cout<<"string: "<<s<<endl;
tok=strtok(str, "#, " );
cout<<"String: "<<s<<"\tToken: "<<tok<<endl;
while(tok!=NULL){
tok=strtok(NULL, "#, " );
cout<<"String: "<<s<<"\t\tToken: "<<(tok ? tok : "")<<endl; // never stream a null char*
}
return 0;
}
Output:
string: 20#6 5, 3
String: 206 5, 3 Token: 20
String: 2065, 3 Token: 6
String: 2065 3 Token: 5
String: 2065 3 Token: 3
String: 2065 3 Token:
strtok iterates over the string. The first call finds the first non-delimiter character (2 in this case) and marks it as the token start, then continues scanning for a delimiter, replaces it with a null character (# gets replaced in the actual string) and returns the start pointer, which points to the first character of the token (i.e., it returns the token 20, which is terminated by null). Each subsequent call starts scanning from the next character and returns a token if one is found, else null. So the subsequent calls return the tokens 6, 5 and 3.
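The steps described above can be sketched as a small function built on strspn and strcspn. This is a hypothetical re-implementation for illustration only, not the actual source of any C library; the name sketch_strtok is made up.

```cpp
#include <cstring>

// Hypothetical re-implementation of the strtok idea, for illustration only.
char* sketch_strtok(char* str, const char* delims)
{
    static char* next = nullptr;         // where the next scan starts
    if (str != nullptr) {
        next = str;                      // first call: start at the given string
    }
    if (next == nullptr) {
        return nullptr;                  // nothing left to scan
    }
    next += std::strspn(next, delims);   // skip leading delimiters
    if (*next == '\0') {
        next = nullptr;                  // only delimiters were left
        return nullptr;
    }
    char* token = next;                  // token start
    next += std::strcspn(next, delims);  // move to the end of the token
    if (*next != '\0') {
        *next = '\0';                    // overwrite the delimiter
        ++next;                          // resume after it on the next call
    } else {
        next = nullptr;                  // reached the end of the string
    }
    return token;
}
```

The static pointer is what makes the NULL first argument work on later calls, and it is also why this style of tokenizer cannot handle two strings at once or be used safely from several threads.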