Using strtok tokens in if statements - c++

I'm trying to get my head around how to split arrays and use the tokens in an if statement, but I'm not having much luck.
The code below is for an Arduino. I pass the function receivedChars, which will contain something like:
token0,token1,token2
When I print out func, it reads out c, so I figured that if I compared func to c it should evaluate to true. Unfortunately, this doesn't seem to happen.
I'm quite new to C++ and Arduino, and mainly have a web development background, so I might be misinterpreting something.
const byte numChars = 32;
char receivedChars[numChars];
char *chars_array = strtok(receivedChars, ",");
char *func = chars_array;
Serial.println(func);
if(func == 'c') {
Serial.println("It works");
}
Could someone help me with where I am going wrong please?

First of all, strtok works iteratively. This means that to split a string into tokens you have to call it until it returns NULL:
char* token = strtok(input, ",");
while (token)
{
...
token = strtok(NULL, ",");
}
And the second thing to know is that char * is just a pointer to a block of memory treated as a string. So when you write something like:
char* str = ...;
if (str == 'c')
{
...
}
This actually means "compare the address held in the variable str with the ASCII code of the character 'c' (0x63 in hex)". Your condition will therefore be true if and only if the pointer returned by strtok equals 0x63, which is definitely not what you want.
What you really need is the strcmp function, which compares two null-terminated strings character by character and returns 0 when they are equal:
char* chars_array = strtok(receivedChars, ",");
if (strcmp(chars_array, "bla") == 0)
{
// a first token is "bla"
}
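To tie the two points above together, here is a minimal desktop C++ sketch (not Arduino code; the helper names split_commas and first_token_is are made up for illustration). It assumes a writable buffer, because strtok overwrites each delimiter with '\0'.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits `input` in place on commas and returns the tokens.
// strtok writes '\0' over each delimiter, so `input` must be writable.
std::vector<std::string> split_commas(char* input) {
    std::vector<std::string> tokens;
    for (char* token = std::strtok(input, ","); token != nullptr;
         token = std::strtok(nullptr, ",")) {
        tokens.push_back(token);
    }
    return tokens;
}

// True when the first comma-separated token of `input` equals `expected`.
// The null check covers inputs that contain no tokens at all.
bool first_token_is(char* input, const char* expected) {
    char* first = std::strtok(input, ",");
    return first != nullptr && std::strcmp(first, expected) == 0;
}
```

On an Arduino the same strtok and strcmp calls apply; only the std::vector and std::string containers are a desktop convenience here.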

Swap
if(func == 'c') {
to
if(func[0] == 'c') {
if you want to check whether the first char is 'c'.

'func' is a pointer to the start of an array of characters; comparing it to a character value will almost never yield true. Perhaps you want to compare the character in that array instead.

The main issue is that you should use if(*func == 'c') {, i.e. dereference pointer func, instead of if(func == 'c') {.
Note that you should additionally consider that receivedChars might be an empty string or might consist only of ',' characters; in that case strtok will yield NULL, and dereferencing it will probably crash your app. Hence, the code should look as follows:
if (func != nullptr) {
Serial.println(func);
if(*func == 'c') {
Serial.println("It works");
}
}

Related

Split string with strtok (nested)

I use an Arduino MEGA to parse the parameter part of a URL.
The order of the parameters should not matter. I have the following input and tried to parse it with strtok.
char text[] = "ssid=SSID&pwd=PASSWORD&userId=1234";
It should first be split into
ssid=SSID
pwd=PASSWORD
userId=1234
and then split again into keys and values
ssid
SSID
pwd
PASSWORD
userId
1234
I tried to use strtok for the first split.
char *ptr;
ptr = strtok(params, "&");
while (ptr != NULL) {
Serial.println(urlParam);
ptr = strtok(NULL, "&");
}
and the output is as expected:
Output:
ssid=SSID
pwd=PASSWORD
userId=1234
then the next split:
char *ptr;
ptr = strtok(params, "&");
while (ptr != NULL) {
char *paramKey;
char *paramValue;
paramKey = strtok(ptr, "=");
Serial.println(paramKey);
if (paramKey == 'ssid'){
paramValue = strtok(NULL, "=");
Serial.println(paramValue);
ssidName = paramValue;
}
if (paramKey == 'pwd'){
...
}
if (paramKey == 'userId'){
...
}
ptr = strtok(NULL, "&");
}
But the output is just
ssid
SSID
Looks like the loop is not working properly.
Where do I make the mistake?
Is there any other way to resolve this string?
The strtok function uses an internal static variable to keep track of its current state. When you use the function for multiple different substrings interleaved like you're doing, you step on the internal state.
You need to instead use strtok_r, which uses an external variable to keep track of state.
char *ptr, *sav1 = NULL;
ptr = strtok_r(params, "&", &sav1); // outer strtok_r, use sav1
while (ptr != NULL) {
char *paramKey;
char *paramValue;
char *sav2 = NULL;
paramKey = strtok_r(ptr, "=", &sav2); // inner strtok_r, use sav2
Serial.println(paramKey);
if (!strcmp(paramKey, "ssid")) {
paramValue = strtok_r(NULL, "=", &sav2); // inner strtok_r, use sav2
Serial.println(paramValue);
ssidName = paramValue;
}
if (!strcmp(paramKey, "pwd")) {
...
}
if (!strcmp(paramKey, "userId")) {
...
}
ptr = strtok_r(NULL, "&", &sav1); // outer strtok_r, use sav1
}
Unrelated to the parsing issue, you also can't compare strings with ==. You need to use strcmp instead, and string constants are surrounded by double quotes, not single quotes.
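For a version that can be compiled and run on its own, the nested strtok_r idea might be sketched as follows. This assumes a POSIX system where strtok_r is declared in <string.h>; the function name parse_params and the std::map return type are illustrative choices, not part of the original code.

```cpp
#include <string.h>  // strtok_r (POSIX)
#include <map>
#include <string>

// Parses "key=value&key=value" in place; `params` must be writable.
// Each strtok_r call chain keeps its position in its own state variable,
// so the inner split does not disturb the outer one.
std::map<std::string, std::string> parse_params(char* params) {
    std::map<std::string, std::string> result;
    char* outer_state = nullptr;
    for (char* pair = strtok_r(params, "&", &outer_state); pair != nullptr;
         pair = strtok_r(nullptr, "&", &outer_state)) {
        char* inner_state = nullptr;
        char* key = strtok_r(pair, "=", &inner_state);
        char* value = strtok_r(nullptr, "=", &inner_state);
        if (key != nullptr) {
            // A missing value is stored as an empty string.
            result[key] = (value != nullptr) ? value : "";
        }
    }
    return result;
}
```

Because the result is keyed by parameter name, the order of the parameters in the input does not matter.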
You have three problems:
The first is that you use implementation-specific multi-character literals like e.g. 'ssid', when you should be using strings like "ssid".
The second problem is that you use == to compare strings. That's almost impossible to get to work, as then you compare the pointers instead of the string contents. To compare strings you need to use strcmp.
The third problem is that the strtok function is not reentrant. You can't have multiple strtok parsings going on simultaneously. Either separate the steps, or use strtok_s (or strtok_r if such a function is available).
#include <stdio.h>
#include <string.h>
struct split_result {
char first_part[64];
char second_part[64];
};
split_result return_left(char splitter, const char *str);
int main() {
/*
ssid=SSID
pwd=PASSWORD
userId=1234
*/
const char *str = "ssid=109304905995";
split_result result = return_left('=', str);
printf("first part: %s \n second part: %s\n", result.first_part, result.second_part);
return 0;
}
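// Note: strncpy_s and strcpy_s as used here are the Visual C++ template overloads.
split_result return_left(char splitter, const char *str) {
int i = 0;
split_result split_ptr = {}; // both parts start out as empty strings
while (str[i] != 0) {
//we found splitter
if (str[i] == splitter) {
//copy everything before the splitter (strncpy_s null-terminates)
strncpy_s(split_ptr.first_part, str, i);
//copy everything after the splitter
strcpy_s(split_ptr.second_part, &str[i+1]);
break;
}
i++;
}
//if no splitter was found, both parts stay empty
return split_ptr;
}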
I hope this will be helpful to anyone.

C++ tolower/toupper char pointer

Do you guys know why the following code crashes at runtime?
char* word;
word = new char[20];
word = "HeLlo";
for (auto it = word; it != NULL; it++){
*it = (char) tolower(*it);
I'm trying to lowercase a char* (string). I'm using Visual Studio.
Thanks
You cannot compare it to NULL. Instead you should be comparing *it to '\0'. Or better yet, use std::string and never worry about it :-)
In summary, when looping over a C-style string, you should loop until the character you see is '\0'. The iterator itself will never be NULL, since it simply points to a place in the string. The fact that the iterator has a type which can be compared to NULL is an implementation detail that you shouldn't touch directly.
Additionally, you are trying to write to a string literal, which is a no-no :-).
EDIT:
As noted by @Cheers and hth. - Alf, tolower can break if given negative values. So sadly, we need to add a cast to make sure this won't break if you feed it Latin-1 encoded data or similar.
This should work:
char word[] = "HeLlo";
for (auto it = word; *it != '\0'; ++it) {
*it = tolower(static_cast<unsigned char>(*it));
}
You're setting word to point to the string literal, but literals are read-only, so this results in undefined behavior when you assign to *it. You need to make a copy of it in the dynamically-allocated memory.
char *word = new char[20];
strcpy(word, "HeLlo");
Also in your loop you should compare *it != '\0'. The end of a string is indicated by the character being the null byte, not the pointer being null.
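Putting the copy and the corrected loop condition together, one possible sketch looks like this. The helper names lower_in_place and lowered_copy are hypothetical, and std::unique_ptr is used here only so that the buffer is released automatically instead of leaking.

```cpp
#include <cctype>
#include <cstring>
#include <memory>

// Lowercases a writable, null-terminated string in place.
void lower_in_place(char* s) {
    for (char* it = s; *it != '\0'; ++it) {
        // Go through unsigned char so tolower never receives a negative value.
        *it = static_cast<char>(std::tolower(static_cast<unsigned char>(*it)));
    }
}

// Copies `text` (which may be a read-only literal) into heap memory and
// lowercases the copy; the unique_ptr calls delete[] when it goes out of scope.
std::unique_ptr<char[]> lowered_copy(const char* text) {
    std::unique_ptr<char[]> copy(new char[std::strlen(text) + 1]);
    std::strcpy(copy.get(), text);
    lower_in_place(copy.get());
    return copy;
}
```

The literal itself is never written to; only the heap copy is modified.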
Given code (as I'm writing this):
char* word;
word = new char[20];
word = "HeLlo";
for (auto it = word; it != NULL; it++){
*it = (char) tolower(*it);
This code has Undefined Behavior in 2 distinct ways, and would have UB also in a third way if only the text data was slightly different:
Buffer overrun.
The continuation condition it != NULL will not be false until the pointer it has wrapped around at the end of the address range, if it does.
Modifying read only memory.
The pointer word is set to point to the first char of a string literal, and then the loop iterates over that string and assigns to each char.
Passing possible negative value to tolower.
The char classification functions require a non-negative argument, or else the special value EOF. This works fine with the string "HeLlo" under an assumption of ASCII or unsigned char type. But in general, e.g. with the string "Blåbærsyltetøy", directly passing each char value to tolower will result in negative values being passed; a correct invocation with ch of type char is (char) tolower( (unsigned char)ch ).
Additionally the code has a memory leak, by allocating some memory with new and then just forgetting about it.
A correct way to code the apparent intent:
using Byte = unsigned char;
auto to_lower( char const c )
-> char
{ return Byte( tolower( Byte( c ) ) ); }
// ...
string word = "Hello";
for( char& ch : word ) { ch = to_lower( ch ); }
There are already two nice answers on how to solve your issues using null-terminated C strings and pointers. For the sake of completeness, here is an approach using C++ strings:
string word; // instead of char*
//word = new char[20]; // no longer needed: strings take care of themselves
word = "HeLlo"; // no need to worry about deallocating previous values: strings take care of themselves
for (auto &it : word) // range-based for, to iterate through all the string elements
it = (char) tolower((unsigned char) it); // cast so that tolower never sees a negative value
It's crashing because you are modifying a string literal.
There are dedicated functions for this, although they are not part of standard C or C++ and are only available on some compilers:
use strupr to make a string uppercase and strlwr to make it lowercase.
Here is a usage example:
char upper[ ] = "make me upper";
printf("%s\n",strupr(upper));
char lower[ ] = "make me lower";
printf("%s\n",strlwr (lower));
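Since strupr and strlwr are not available everywhere, a portable alternative can be sketched with the standard library only. The function names to_upper_copy and to_lower_copy are made up for this example; they work on std::string copies rather than modifying a char array in place.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns an uppercased copy of `text`.
std::string to_upper_copy(std::string text) {
    // The lambda takes unsigned char so toupper never receives a negative value.
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::toupper(ch)); });
    return text;
}

// Returns a lowercased copy of `text`.
std::string to_lower_copy(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
    return text;
}
```

Both functions accept string literals, because the literal is copied into the std::string parameter before anything is modified.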

strchr, memchr fails to locate the character

I am trying to parse a string in a format like 1-3,5-7. I need to read 1,3 and 5,7.
What I am doing:
char *dup_string;
dup_string = strdup(data);
tok = strtok(dup_string, ",");
while (tok != NULL)
{
char *rangeTok;
rangeTok = (char *)memchr(tok, "-", strlen(tok));
startpage = atoi(tok);
if(rangeTok != NULL)
{
*rangeTok++;
endpage = atoi(rangeTok);
}
else
endpage = startpage;
tok = strtok(NULL,",");
}
Here memchr is returning a badptr. I have tried using strchr, which also returns a badptr. Any ideas why it is returning badptr?
FYI, earlier I tried:
tok = strchr(dupstring, ",");
which worked fine for some time, and then started returning badptr. I am not sure why it is doing that.
You're passing the wrong argument to both strchr and memchr, as has already been pointed out. The second argument is an integer holding the value of a character, not a const char *.
This line
rangeTok = (char *)memchr(tok, "-", strlen(tok));
should be either
rangeTok = (char *)memchr(tok, '-', strlen(tok));
or preferably
rangeTok = strchr(tok, '-');
As an aside, what is this badptr? Do you just mean NULL?
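With the character literal in place, the whole range parser could look roughly like the sketch below. The function name parse_ranges and the vector-of-pairs return type are illustrative assumptions; the original code stores its results in startpage and endpage instead.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Parses "1-3,5-7" into {start, end} pairs; a lone page such as "5" becomes {5, 5}.
std::vector<std::pair<int, int>> parse_ranges(const std::string& data) {
    std::vector<std::pair<int, int>> ranges;
    std::string buffer = data;  // strtok modifies its input, so work on a copy
    for (char* tok = std::strtok(&buffer[0], ","); tok != nullptr;
         tok = std::strtok(nullptr, ",")) {
        int start = std::atoi(tok);  // atoi stops at the '-'
        int end = start;
        // The second argument of strchr is a character ('-'), not a string ("-").
        const char* dash = std::strchr(tok, '-');
        if (dash != nullptr) {
            end = std::atoi(dash + 1);
        }
        ranges.emplace_back(start, end);
    }
    return ranges;
}
```

Note that atoi does no error reporting, so malformed input simply yields 0 for the affected number.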
The prototype of memchr() is void * memchr(const void * ptr, int value, size_t num);, but you are passing a string in memchr(tok, "-", strlen(tok)); instead of an integer. As for strtok(), it should be used as follows:
tok = strtok(dup_string, ",");
while (tok != NULL)
{
/* Body of Loop */
tok = strtok(NULL,",");
}
On a first call, strtok() expects a string as the first argument, whose first character is used as the starting location to scan for tokens. In subsequent calls, the function expects a null pointer and uses the position right after the end of last token as the new starting location for scanning.
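Try using sscanf() this way:
#include <stdio.h>
int main(void)
{
const char *data = "1-3,5,8-9";
const char *ptr = data;
int pos, startpage, endpage;
/* read the start page; %n records how many characters were consumed */
while (sscanf(ptr, "%d%n", &startpage, &pos) == 1)
{
ptr += pos;
endpage = startpage;
/* an optional "-N" supplies the end page */
if (sscanf(ptr, "-%d%n", &endpage, &pos) == 1)
ptr += pos;
printf("start page %d ** end page %d\n", startpage, endpage);
/* skip the separating comma, if there is one */
pos = 0;
sscanf(ptr, " ,%n", &pos);
ptr += pos;
}
return 0;
}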

Can you change the size of what a pointer point to

For example, say a pointer points to an array of chars that reads "Hello how are you?" and you only want the pointer to point to Hello. I am passing in a char pointer, and when I cout it, it prints the entire array. I tried to cut down the size using a for loop that breaks when it hits a ' ', but I am not having any luck figuring it out. Any ideas?
const char *infile(char * file )
{
cout<<file<<endl; //this prints out the entire array
int j;
for(j=0;j<500; j++)
{
if(file[j]==' ')
break;
}
strncpy(file, file, j);
cout<<file<<endl; //how to get this to print out only the first word
}
strncpy() does not append a null terminator if there isn't one in the first j bytes of your source string. And in your case, there isn't.
I think what you want to do is manually change the first space to a \0:
for (j = 0; j < 500; j++) {
if (file[j] == ' ') {
file[j] = '\0';
break;
}
}
First, avoid strtok (like the plague that it mostly is). It's unpleasant but sometimes justifiable in C. I've yet to see what I'd call justification for using it in C++ though.
Second, probably the easiest way to handle this (given that you're using C++) is to use a stringstream:
void infile(char const *file)
{
std::stringstream buffer(file); // requires #include <sstream>
std::string word;
buffer >> word;
std::cout << word;
}
Another possibility would be to use some of the functions built into std::string:
void infile(char const *file) {
std::string f(file);
std::cout << std::string(f, 0, f.find(" "));
}
...which, now that I think about it, is probably a bit simpler than the stringstream version of things.
A char* pointer actually just points to a single char object. If that object happens to be the first (or any) element of a string, you can use pointer arithmetic to access the other elements of that string -- which is how strings (C-style strings, not C++-style std::string objects) are generally accessed.
A (C-style) string is simply a sequence of characters terminated by a null character ('\0'). (Anything after the '\0' terminator isn't part of the string.) So a string "foo bar" consists of this sequence of characters:
{ 'f', 'o', 'o', ' ', 'b', 'a', 'r', '\0' }
If you want to change the string from "foo bar" to just "foo", one way to do it is simply to replace the space character with a null character:
{ 'f', 'o', 'o', '\0', ... }
The ... is not part of the syntax; it represents characters that are still there ('b', 'a', 'r', '\0'), but are no longer part of the string.
Since you're using C++, you'd probably be much better off using std::string; it's much more powerful and flexible, and frees you from having to worry about terminators, memory allocation, and other details. (Unless the point of this exercise is to learn how C-style strings work, of course.)
Note that this modifies the string pointed to by file, and that change will be visible to the caller. You can avoid that by making a local copy of the string (which requires allocating space for it, and later freeing that space). Again, std::string makes this kind of thing much easier.
Finally, this:
strncpy(file, file, j);
is bad on several levels. Calling strncpy() with an overlapping source and destination like this has undefined behavior; literally anything can happen. And strncpy() doesn't necessarily provide a proper NUL terminator in the destination. In a sense, strncpy() isn't really a string function. You're probably better off pretending it doesn't exist.
See my rant on the topic.
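The two options described in this answer, cutting the string in place versus working on a local copy, can be sketched as follows. Both function names (cut_at_first_space and first_word) are invented for this illustration.

```cpp
#include <cstring>
#include <string>

// Cuts `text` at its first space by writing '\0' there (visible to the caller).
// Returns a pointer to the remainder, which is still in the buffer but no longer
// part of the string, or nullptr if there was no space.
char* cut_at_first_space(char* text) {
    char* space = std::strchr(text, ' ');
    if (space == nullptr) {
        return nullptr;
    }
    *space = '\0';
    return space + 1;
}

// Returns the first word as a separate std::string and leaves `text` untouched.
std::string first_word(const char* text) {
    const char* space = std::strchr(text, ' ');
    // Without a space the whole string is the first word.
    return space != nullptr ? std::string(text, space) : std::string(text);
}
```

The first function is the cheap in-place variant; the second trades a small allocation for not changing the caller's data.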
Doing this would be much easier
if(file[j]==' ') {
file[j] = 0;
break;
..
// strncpy(file, file, j);
Using strtok might make your life much easier.
Split up the string with ' ' as a delimiter, then print the first element you get from strtok.
Use 'strtok', see e.g. http://www.cplusplus.com/reference/clibrary/cstring/strtok/
If what you're asking is "can I dynamically resize the memory block pointed to by this pointer" then... not really, no. (You have to create a new block of the desired size, then copy the bytes over, delete the first block, etc.)
If you're trying to just "print the first word" then set the character at the position of the space to 0. Then, when you output the file* pointer you'll just get the first word (everything up to the \0.) (Read null terminated strings for more information on why that works that way.)
But this all depends on how much of what you're doing is an example to demonstrate the problem you're trying to solve. If you're really 'splitting up strings' then you'll at least want to look in to using strtok.
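If the goal really is a smaller memory block, the "create a new block, copy, delete the old one" steps mentioned above might be sketched like this. The helper name shrink_to is hypothetical, and the sketch assumes the block was allocated with new[] and that the requested length does not exceed the current string length.

```cpp
#include <cstddef>
#include <cstring>

// Replaces `block` with a new heap block that holds only the first `length`
// characters plus a terminating '\0'.
// Assumes `block` came from new[] and length <= strlen(block).
void shrink_to(char*& block, std::size_t length) {
    char* smaller = new char[length + 1];
    std::memcpy(smaller, block, length);  // copy the bytes to keep
    smaller[length] = '\0';               // terminate the shortened string
    delete[] block;                       // release the old block
    block = smaller;                      // the caller's pointer now refers to the new block
}
```

The pointer is taken by reference because its value changes: the shortened string lives at a new address.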
Why not just output one character at a time and break once you hit a space?
void infile(const char * file )
{
cout<<file<<endl; //this prints out the entire array
int j;
for(j=0;j<500 && file[j]!='\0'; j++) //also stop at the end of the string
{
if(file[j]==' ')
break;
cout<<file[j];
}
cout<<endl;
}
This has nothing to do with the size of the pointer. A pointer always has the same size for a particular type.
strtok might be the best solution. The code below uses strtok to break the string into substrings every time it meets a space, a comma, a dot or a hyphen:
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
Source : CPP STRTOK
std::copy(file, std::find(file, file+500, ' '),
std::ostream_iterator<char>(std::cout, ""));
If you allocated the space that a char * points to using malloc, you can change the size using realloc.
char * pzFile = (char *)malloc(sizeof("Hello how are you?")); // sizeof a string literal already counts the '\0'
strcpy(pzFile, "Hello how are you?");
char * pzShort = (char *)realloc(pzFile, 6); // room for "Hello" plus '\0'
if (pzShort != NULL) pzFile = pzShort; // realloc may move the block; on failure the old block is still valid
pzFile[5] = '\0';
Note that if you do not set the null terminator, using the string can cause a problem.
If you were just trying to shorten the string, all you had to do is set the null terminator at index 5, right after "Hello". The space allocated is larger than needed, but that's OK as long as it's not shorter.
In most cases, though, I strongly advise that you COPY the string up to the space.
char * pzInput = (char *)malloc(sizeof("Hello how are you?")); // sizeof a string literal already counts the '\0'
strcpy(pzInput, "Hello how are you?");
char * pzBuffer = (char *)malloc(BUFFER_SIZE); // BUFFER_SIZE must be large enough for the first word plus '\0'
char * pzSrc = pzInput;
char * pzDst = pzBuffer;
while (*pzSrc && ' ' != *pzSrc)
*(pzDst++) = *(pzSrc++);
*pzDst = '\0';
This also ends up with pzSrc pointing at the rest of the string for later use!

D2: empty string in a conditional statement

In the following code, why does 2 give output but not 3? The removechars statement returns a string with length 0
import std.stdio, std.string;
void main() {
string str = null;
if (str) writeln(1); // no
str = "";
if (str) writeln(2); // yes
if (",&%$".removechars(r"^a-z")) writeln(3); // no
}
Edit: Ok, it may return null, but I'm still a bit puzzled because all of these print true
writeln(",&%$".removechars(r"^a-z") == "");
writeln(",&%$".removechars(r"^a-z") == null);
writeln(",&%$".removechars(r"^a-z").length == 0);
Edit 2: This also prints true, but put either of them in a conditional and you get a different result
writeln("" == null);
Edit 3: Alright, I understand that I cannot test for an empty string the way I did. What led to this question is the following situation. I want to remove chars from a word, but don't want to store an empty string:
if (auto w = word.removechars(r"^a-z"))
wordcount[w]++;
This works when I try it, but that must be because removechars is returning null rather than ""
Because removechars will return null when no characters match.
(This happens because .dup of an empty string will always be null.)
D arrays, or slices if you prefer, are interesting beasts.
In D an empty array is equal to null, or more appropriately a null array is equal to an empty array, this is why assert("" == null) or assert([] == null). However when using just if(str) you're asking if there is a string here, and for null there isn't an array. It is equivalent to an empty array, but one does not exist.
The proper way to check if something is null: assert(str is null). I'm not sure which is best for converting a string to a bool, but really there can't be a perfect solution because string isn't a boolean.
Always use is and !is (is not) to compare with null. If you want to check if a string is empty check against its length property:
string str;
assert(str is null); // str is null
assert(!str); // str is null
str = "";
assert(str !is null); // no longer null
assert(str); // no longer null
assert(!str.length); // but it's zero length
if(!str.length) {
//dosomething ...
}