I'm new to C and C++, and can't seem to work out how I need to compare these values:
Variable I'm being passed:
typedef struct {
uint8_t ssid[33];
String I want to match. I've tried both of these:
uint8_t AP_Match = "MatchString";
unsigned char* AP_Match = "MatchString";
How I've attempted to match:
if (strncmp(list[i].ssid, "MatchString")) {
if (list[i].ssid == AP_Match) {
if (list[i].ssid == "MatchString") {
// This one fails because String is undeclared, despite having
// an include line for string.h
if (String(reinterpret_cast<const char*>(conf.sta.ssid)) == 'MatchString') {
I've noodled around with this a few different ways, and done some searching. I know one or both of these may be the wrong type, but I'm not sure how to get from where I am to working code.
There is no type named "String" in any C standard (standard C++ has std::string, but it is declared in <string>, not <string.h>). A C string is just an array of char terminated by a null character. <string.h> provides various functions for comparison, concatenation, etc., but they can only work if the values you pass to them are null-terminated character arrays.
The operator "==" does not compare string contents either. Applied to two arrays or pointers it compares addresses. A content comparison would have to compare each character at each index, for two arrays that may not be the same size and may even use different encodings despite the same underlying integer representation (raising the prospect of false positive comparisons). You can define your own function to do it (note C doesn't allow overloading operators), but otherwise you're stuck with what the standard libraries provide.
Note that strncmp() takes a size parameter for the number of characters to compare (your code is missing this). https://www.tutorialspoint.com/c_standard_library/c_function_strncmp.htm
Otherwise you would be looking at the function strcmp(), which requires the input strings to be null-terminated (last character equal to '\0'). Ultimately it's up to you to consider what the possible combinations of inputs could be and how they are stored and to use a comparison function that is robust to all possibilities.
As a final side note
if (list[i].ssid == "MatchString") {
Since ssid is an array, you should know that when you do this comparison, you are not actually accessing the contents of ssid, but rather the address of the first element of ssid. When you pass list[i].ssid into strcmp (or strncmp), you are passing a pointer to the first element of the array in memory. The function then iterates over the entire array until it reaches the null character (in the case of strcmp) or until it has compared the specified number of elements (in the case of strncmp).
To match two strings use strcmp:
if (0==strcmp(str1, str2))
str1 and str2 are addresses to memory holding a null terminated string. Return value zero means the strings are equal.
In your case one of:
if (0==strcmp(list[i].ssid, AP_Match))
if (0==strcmp(list[i].ssid, "MatchString"))
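One caveat: because ssid is declared as uint8_t[33] rather than char[33], a C++ compiler will want a cast to const char* before it accepts the call. A minimal sketch of that, where ApRecord and ssid_matches are hypothetical names (only the ssid member is known from the question) and strncmp is bounded by the buffer size as a precaution:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the struct in the question; only ssid is known.
struct ApRecord {
    std::uint8_t ssid[33];
};

// Returns true when the SSID bytes equal the given null-terminated string.
// strncmp is bounded by the buffer size, so a buffer that lacks a
// terminator is never read past its end.
bool ssid_matches(const ApRecord& ap, const char* wanted) {
    const char* ssid = reinterpret_cast<const char*>(ap.ssid);
    return std::strncmp(ssid, wanted, sizeof ap.ssid) == 0;
}
```

With something like this in place, the check in the loop would read if (ssid_matches(list[i], "MatchString")).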
I was trying to learn about the "<" operator on C++ strings and tried some test cases. I realized that two snippets which I thought should behave the same were giving different results. The code is below; what is the reason for this?
string s="test";
string k="tes";
cout<<(s<k)<<endl; //returns false so it prints 0
cout<<("test"<"tes")<<endl; // returns true so it prints 1
(s < k) compares the values of the strings as you would expect.
("test" < "tes") compares the pointers to the beginning of the string literals as the compiler decides to arrange them in memory. Therefore, this comparison may return 0 or 1 depending on the compiler and settings in use, and both results are correct. The comparison is effectively meaningless.
The "C way" to compare these string literals would be strcmp("test", "tes").
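To make the difference concrete, here is a small sketch that compares contents both ways. The function names less_c and less_cpp are made up for illustration:

```cpp
#include <cstring>
#include <string>

// C style: strcmp compares characters and returns <0, 0, or >0.
bool less_c(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}

// C++ style: std::string's operator< compares contents lexicographically.
bool less_cpp(const std::string& a, const std::string& b) {
    return a < b;
}
```

Both agree that "tes" sorts before "test", whatever addresses the compiler assigns to the literals.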
s and k are string objects for which a comparison operator has been defined and performs what you expect.
"test" and "tes" are pointers to char that hold the address of the locations where these characters are stored. Thus the comparison is on the addresses.
I'm trying to instantiate and easily access an array of names in C++ using basic types in contiguous memory. I'm astounded that this is extremely difficult or complicated to do in C++ WITH ONLY basic types.
For some background, I am programming a microcontroller with limited memory, modest processing power, and it is handling serial communication over a network to 36 other microcontrollers sending continuous sensor data which is uploaded to a webserver. The shorter the refresh rate of the data, the better, so I prefer basic program features.
Not that the more complicated approaches I've found on other forums, like an array of strings, have worked for me either.
In my desperation, I was able to get this to work.
char names_array[] = "Bob\0\0Carl";
printf("%s",names_array); //outputs Bob
printf("%s",names_array + 5); //outputs Carl
This is a horrible solution though. My indexing is dependent on the longest name in the array, so if I added "Megan" to my list of names, I'd have to add a bunch of null characters throughout the entire array.
What I want to do is something like this:
char names_array[2] = {"Bob","Carl"}; //does not compile
printf("%s",names_array[0]);
printf("%s",names_array[1]);
//Error: A value of type "const char *" cannot be used to
//initialize an entity of type "char" in "main.cpp"
but that didn't work.
I want to loop through the names in my list and do something with each name, so at this point, this is my best solution.
char name0[] = "Bob";
loop_code(name0);
char name1[] = "Carl";
loop_code(name1);
.
.
.
I expect there's a reasonable way to make an array of pointers, each to an array of char terminated by null(s). I must be doing something wrong. I refuse to believe that a language like C++ is incapable of such a basic memory allocation.
You can, e.g., get an array of pointers to null-terminated strings:
const char* names_array[] = { "Bob", "Carl" };
and then
std::printf("%s", names_array[0]);
std::printf("%s", names_array[1]);
The problem with your attempt
char names_array[2] = {"Bob","Carl"};
is that you declare names_array to be an array of characters. This should never compile because what the = {"Bob","Carl"} essentially attempts to do is initialize each character in that array of characters with an entire array of characters of its own. A character is just a character, you cannot assign an entire array of characters to just an individual character. More precisely, initialization of a character array from a string literal is a special case of initialization [dcl.init.string] that allows a single string literal to be used to initialize an entire character array (because anything else doesn't make sense). What you actually want would be something more like an array of character arrays. However, the problem there is that you'd have to effectively pick a fixed maximum length for all strings in the array:
char names_array[][5] = { "Bob", "Carl" }; // each subarray is 5 characters in length
which would be potentially wasteful. You can flatten a series of multiple strings into one long array and then index into that, like you did with your first approach. The downside of that, as you've found out, is that you then need to know where each string starts in that array…
If you just want an array of string constants, a more modern C++ approach would be something like this:
#include <string_view>
using namespace std::literals;
constexpr std::string_view names[] = {
"Bob"sv,
"Carl"sv
};
The advantage of std::string_view is that it also has information about the length of the string. However, while std::string_view is compatible with most of the C++ standard library facilities that handle strings, it's not so simple to use it together with functions that expect C-style null-terminated strings. If you need null-terminated strings, I'd suggest to simply use an array of pointers to strings as shown at the very beginning of this answer…
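As a rough sketch of how the pointer array can be looped over, assuming the list is fixed at compile time (the third name and the helper functions are illustrative additions, not part of the question):

```cpp
#include <cstddef>
#include <cstdio>
#include <string_view>

// Pointers to string literals: each entry is null-terminated, so it can be
// passed straight to printf("%s", ...).
const char* const names_array[] = { "Bob", "Carl", "Megan" };

// Number of entries, computed at compile time from the array type.
constexpr std::size_t name_count = sizeof names_array / sizeof names_array[0];

// Prints every name on its own line.
void print_names() {
    for (const char* name : names_array) {
        std::printf("%s\n", name);
    }
}

// Hypothetical helper: total characters across all names, using
// std::string_view only to measure each null-terminated entry.
std::size_t total_length() {
    std::size_t total = 0;
    for (const char* name : names_array) {
        total += std::string_view(name).size();
    }
    return total;
}
```

Adding a name only means adding one more literal to the initializer; no padding and no manual offsets are needed.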
A char can hold only one character.
If you want to use char, you can do it like this (each array needs room for the terminating null):
char name0[4] = "Bob";
char name1[5] = "Carl";
char *nameptr[2] = {&name0[0], &name1[0]};
Actually, this is pretty awkward.
I suggest you use std::string instead:
std::string name[2] = {"Bob","Carl"};
This code is valid as written.
From what I understand, character arrays in C/C++ have a null-terminating character for the purpose of denoting an off-the-end element of that array, while integer arrays don't; they have some internal mechanism that is hidden from the user, but they obviously know their own size since the user can do sizeof(myArray)/sizeof(int) (Is that technically a hack?). Wouldn't it make sense for an integer array to have some null-terminating int -- call it i or something?
Why is this? It has never made any sense to me.
Because, in C, strings are not the same as character arrays, they exist at a level above arrays in much the same way as a linked list exists at a level above structures.
This is an example of a string:
"pax is great"
This is an example of a character array:
{ 'p', 'a', 'x' }
This is an example of a character array that just happens to be equivalent to a string:
{ 'p', 'a', 'x', '\0' }
In other words, C strings are built on top of character arrays.
If you look at it another way, neither integer arrays nor "real" character arrays (like {'a', 'b', 'c'} for example) have a terminating character.
You can quite easily do the same thing (have a terminator) with an integer array of people's ages, using -1 (or any negative number) as the terminator.
The only difference is that you'll write your own code to handle it rather than using code helpfully provided in the C standard library, things like:
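size_t agelen (int *ages) {
    size_t len = 0;
    while (*ages++ >= 0)   // count entries until the negative terminator
        len++;
    return len;
}
int *agecpy (int *src, int *dst) {
    int *d = dst;
    while (*src >= 0)      // copy entries up to, not including, the terminator
        *d++ = *src++;
    *d = -1;               // terminate the copy
    return dst;
}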
Because a string type does not exist in C.
Because the null terminator is there to mark the end of the input and it doesn't have to be the length of the given array.
This is by convention, treating null as a non-character. Unlike other major systems programming languages of the time, e.g. PL/I, which had a leading integer to denote the length of a variable-length character string, C was designed to treat strings as simply character arrays. It did not want the overhead, nor any portability issues (such as the size of int), nor any limitations (what about very long strings?). The convention has stuck because it worked out rather well.
To denote the end of an int array as you have suggested would require a non-int marker. That could be rather difficult to arrange, since any value an int can hold might be legitimate data. And sizeof on an int array works only because the compiler knows the declared size of the array at compile time; nothing is stored alongside the data at run time, and sizeof applied to a pointer into memory obtained from *alloc gives the size of the pointer, not of the allocation. There is absolutely nothing in C to prevent you from cobbling together an "array" by clever management of allocated memory, but then you have to track its length yourself. Modern compilers of course contain many convenience checks on wayward code and someone with better knowledge of compilers could clarify/rectify my comments here. C++'s std::vector, for example, explicitly stores its size and capacity.
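For what it is worth, the sizeof idiom from the question is evaluated from the array's type at compile time, and that information is gone once the array decays to a pointer. A small sketch, in which array_count and the sample ages values are illustrative:

```cpp
#include <cstddef>

// Deduces the element count from the array type at compile time.
// Passing a plain pointer here fails to compile, which is the point.
template <typename T, std::size_t N>
constexpr std::size_t array_count(const T (&)[N]) {
    return N;
}

int ages[] = { 31, 42, 57, 64 };

// Same result as the sizeof idiom from the question.
constexpr std::size_t count_by_sizeof = sizeof ages / sizeof(int);
```

Neither form works on an int* received as a function parameter, which is why C functions taking arrays usually take a separate length argument or rely on a terminator.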
In a lot of places you can see a different field separator (FS) character used to separate strings, e.g. CSV. But if you were to do that, you would need to write your own standard libraries: thousands and thousands of lines of good, tested code.
A C-Style string is a collection of characters terminated by '\0'. It is not an array.
The collection can be indexed like an array.
Because the length of the collection can vary, the length must be determined by counting the number of characters in the collection.
A convenient representation is an array because an array is also a collection.
One difference is that an array is a fixed sized data structure. The collection of characters may not be a fixed size; for example, it can be concatenated.
If you think about the problem of how to represent strings, you have two choices: 1) store a count of letters followed by the letters or 2) store the letters followed by some unique special character used as an end of string marker.
End of string marker is more flexible - longer strings possible, easier to use, etc.
BTW you can have a terminator on an int array if you want... Nothing is stopping you from saying that a -1, for example, means the end of the list, as long as you are sure that -1 can never appear as real data.
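The two representations described above can be sketched side by side. This is only an illustration: CountedString, its one-byte count, and the helper names are assumptions made for the example, not anything from a standard library.

```cpp
#include <cstddef>
#include <cstring>

// Choice 1: a count followed by the characters.
// The one-byte count caps the length at 255 in this sketch.
struct CountedString {
    unsigned char length;
    char data[255];
};

// Choice 2: characters followed by a terminator (C style).
// Finding the length means scanning for the '\0'.
std::size_t terminated_length(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

// The length of a counted string is a single read; no scan is needed.
std::size_t counted_length(const CountedString& s) {
    return s.length;
}

// Builds a counted string from a C string, truncating at 255 characters.
CountedString make_counted(const char* s) {
    CountedString out{};
    std::size_t n = terminated_length(s);
    if (n > 255) {
        n = 255;
    }
    out.length = static_cast<unsigned char>(n);
    std::memcpy(out.data, s, n);
    return out;
}
```

The trade-off is visible in the code: the counted form answers "how long?" immediately but has a hard length cap set by the width of the count, while the terminated form has no cap but must scan.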
Just had an interesting argument in the comment to one of my questions. My opponent claims that the statement "" does not contain "" is wrong.
My reasoning is that if "" contained another "", that one would also contain "" and so on.
Who is wrong?
P.S.
I am talking about a std::string
P.S. P.S
I was not talking about substrings, but even if I add to my question " as a substring", it still makes no sense. An empty substring is nonsense. If you allow empty substrings to be contained in strings, that means you have an infinity of empty substrings. What is the point of that?
Edit:
Am I the only one that thinks there's something wrong with the function std::string::find?
C++ reference clearly says
Return Value: The position of the first character of the first match.
Ok, let's assume it makes sense for a minute and run this code:
string empty1 = "";
string empty2 = "";
int position = empty1.find(empty2);
cout << "found \"\" at index " << position << endl;
The output is: found "" at index 0
Nonsense part: how can there be index 0 in a string of length 0? It is nonsense.
To be able to even have a 0th position, the string must be at least 1 character long.
And C++ throws an exception in this case, which proves my point:
cout << empty2.at( empty1.find(empty2) ) << endl;
If it really contained an empty string it would have no problem printing it out.
It depends on what you mean by "contains".
The empty string is a substring of the empty string, and so is contained in that sense.
On the other hand, if you consider a string as a collection of characters, the empty string can't contain the empty string, because its elements are characters, not strings.
Relating to sets, the set
{2}
is a subset of the set
A = {1, 2, 3}
but {2} is not a member of A - all A's members are numbers, not sets.
In the same way, {} is a subset of {}, but {} is not an element in {} (it can't be because it's empty).
So you're both right.
C++ agrees with your "opponent":
#include <iostream>
#include <string>
using namespace std;
int main()
{
bool contains = string("").find(string("")) != string::npos;
cout << "\"\" contains \"\": "
<< boolalpha << contains;
}
Output: "" contains "": true
It's easy. String A contains sub-string B if there is an argument offset such that A.substr(offset, B.size()) == B. No special cases for empty strings needed.
So, let's see. std::string("").substr(0,0) turns out to be std::string(""). And we can even check your "counter-example". std::string("").substr(0,0).substr(0,0) is also well-defined and empty. Turtles all the way down.
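That definition translates almost word for word into code. A minimal sketch (the function name contains is mine; in practice std::string::find does the same job):

```cpp
#include <cstddef>
#include <string>

// True if some offset exists with a.substr(offset, b.size()) == b.
// Offsets run from 0 to a.size() inclusive; substr accepts pos == size()
// and returns an empty string there.
bool contains(const std::string& a, const std::string& b) {
    for (std::size_t offset = 0; offset <= a.size(); ++offset) {
        if (a.substr(offset, b.size()) == b) {
            return true;
        }
    }
    return false;
}
```

With no special case anywhere, contains("", "") comes out true at offset 0.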
The first thing that is unclear is whether you are talking about std::string or null-terminated C strings; the second is why it should matter. I will assume std::string.
The requirements on std::string determine how the component must behave, not what its internal representation must be (although some of the requirements affect the internal representation). As long as the requirements for the component are met, whether it holds something internally is an implementation detail that you might not even be able to test.
In the particular case of an empty string, there is nothing that mandates that it holds anything. It could just hold a size member set to 0 and a pointer (for the dynamically allocated memory if/when not empty) also set to 0. The requirement in operator[] requires that it returns a reference to a character with value 0, but since that character cannot be modified without causing undefined behavior, and since strict aliasing rules allow reading from an lvalue of char type, the implementation could just return a reference to one of the bytes in the size member (all set to 0) in the case of an empty string.
Some implementations of std::string use small-object optimizations; in those implementations there will be memory reserved for small strings, including an empty string. While the std::string will obviously not contain a std::string internally, it might contain the sequence of characters that compose an empty string (i.e. a terminating null character).
empty string doesn't contain anything - it's EMPTY. :)
Of course an empty string does not contain an empty string. It'll be turtles all the way down if it did.
Take String empty = ""; that is declaring a string literal that is empty, if you want a string literal to represent a string literal that is empty you would need String representsEMpty = """"; but of course, you need to escape it, giving you string actuallyRepresentsEmpty = "\"\"";
ps, I am taking a pragmatic approach to this. Leave the maths nonsense at the door.
Thinking about your amendment, it is possible that what your 'opponent' meant was that an 'empty' std::string still has internal storage for characters which is itself empty of characters. That would be an implementation detail, I am sure; it could perhaps just keep an array of characters of a certain size (say 10) 'just in case', so it would technically not be empty.
Of course, there is the trick question answer that 'nothing' fits into anything infinite times, a sort of 'divide by zero' situation.
Today I had the same question since I'm currently bound to a lousy STL implementation (dating back to the pre-C++98 era) that differs from C++98 and all following standards:
TEST_ASSERT(std::string().find(std::string()) == std::string::npos); // WRONG!!! (non-standard)
This is especially bad if you try to write portable code because it's so hard to prove that no feature depends on that behaviour. Sadly in my case that's actually true: it does string processing to shorten phone numbers input depending on a subscriber line spec.
On Cppreference, I see in std::basic_string::find an explicit description about empty strings that I think matches exactly the case in question:
an empty substring is found at pos if and only if pos <= size()
The referred pos defines the position where to start the search, it defaults to 0 (the beginning).
A standard-compliant C++ Standard Library will pass the following tests:
TEST_ASSERT(std::string().find(std::string()) == 0);
TEST_ASSERT(std::string().substr(0, 0).empty());
TEST_ASSERT(std::string().substr().empty());
This interpretation of "contain" answers the question with yes.
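The "pos <= size()" rule quoted above can be checked with a starting position as well. A short sketch, with find_empty_from being a hypothetical wrapper written for this example:

```cpp
#include <cstddef>
#include <string>

// Position at which the empty string is found when the search starts at pos,
// or std::string::npos when pos is greater than s.size().
std::size_t find_empty_from(const std::string& s, std::size_t pos) {
    return s.find(std::string(), pos);
}
```

On a conforming library, find_empty_from("abc", 3) gives 3 (one past the last character is still a valid place for an empty match), while find_empty_from("abc", 4) gives std::string::npos.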
Why is it that you can insert a '\0' char in a std::basic_string and the .length() method is unaffected but if you call char_traits<char>::length(str.c_str()) you get the length of the string up until the first '\0' character?
e.g.
string str("abcdefgh");
cout << str.length(); // 8
str[4] = '\0';
cout << str.length(); // 8
cout << char_traits<char>::length(str.c_str()); // 4
Great question!
The reason is that a C-style string is defined as a sequence of bytes that ends with a null byte. When you use .c_str() to get a C-style string out of a C++ std::string, then you're getting back the sequence the C++ string stores with a null byte after it. When you pass this into strlen, it will scan across the bytes until it hits a null byte, then report how many characters it found before that. If the string contains a null byte, then strlen will report a value that's smaller than the whole length of the string, since it will stop before hitting the real end of the string.
An important detail is that strlen and char_traits<char>::length are NOT the same function. However, the C++ ISO spec for char_traits<charT>::length (§21.1.1) says that char_traits<charT>::length(s) returns the smallest i such that char_traits<charT>::eq(s[i], charT()) is true. For char_traits<char>, the eq function just returns if the two characters are equal by doing a == comparison, and constructing a character by writing char() produces a null byte, and so this is equal to saying "where is the first null byte in the string?" It's essentially how strlen works, though the two are technically different functions.
A C++ std::string, however, is a more general notion of "an arbitrary sequence of characters." The particulars of its implementation are hidden from the outside world, though it's probably represented either by a start and stop pointer or by a pointer and a length. Because this representation does not depend on what characters are being stored, asking the std::string for its length tells you how many characters are there, regardless of what those characters actually are.
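Here is the example from the question wrapped into small functions so the two notions of length sit side by side. The function names are made up for this sketch:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Builds the string from the question with a null byte written at index 4.
std::string make_with_embedded_null() {
    std::string str("abcdefgh");
    str[4] = '\0';
    return str;
}

// Length as std::string tracks it: independent of the characters stored.
std::size_t stored_length(const std::string& s) {
    return s.length();
}

// Length as a C-style scan sees it: stops at the first null byte.
std::size_t c_style_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For the string built above, stored_length reports 8 and c_style_length reports 4, matching the output in the question.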
Hope this helps!