I am attempting to write a Vulkan application from scratch, and I have been having issues with attempting to count the total number of characters in a
const char**
I was wondering how I could iterate through each character of each "c string".
This is for the purpose of comparing existing instance extensions for creating a Vulkan instance.
This is the function I'm having issues with and every problem seems localized to the fact that sizeof gives back the size in bytes, not the number of elements.
void Extensions_Manager::GetInstanceExtensionNames(const char** extNames, const char** glfwNames, bool validation)
{
std::string extension = "";
if (validation)
instanceExtensionNames.push_back(VK_EXT_DEBUG_UTILS_EXTENSION_NAME);
for (int i = 0; i < sizeof(glfwNames); ++i)
{
if (glfwNames[i] == '\0')
{
instanceExtensionNames.push_back(extension.c_str());
slog("Instance extension to use %s", extension.c_str());
extension = "";
}
else
{
extension += glfwNames[i];
}
}
for (const char* ext : instanceExtensionNames)
{
slog("Available Extension: %s", ext);
}
extNames = instanceExtensionNames.data();
}
instanceExtensionNames
is a static vector of const char*
How to count the number of total chars in const char**
Loop over each string in the pointed array. For each string, count the number of chars. Add the lengths together.
Note that you may want to consider whether you want to count the number of char objects (i.e. code-units), or the number of character symbols which are not necessarily the same number depending on the character encoding.
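A minimal sketch of that loop, assuming the caller already knows how many strings the array holds. The function name total_chars and the count parameter are illustrative and do not come from the question's code. It counts char objects (code units) with std::strlen:

```cpp
#include <cstddef>
#include <cstring>

// Sums the lengths of `count` null-terminated strings.
// Counts char objects (code units), not user-perceived characters.
std::size_t total_chars(const char* const* strings, std::size_t count)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += std::strlen(strings[i]);  // excludes each string's '\0'
    return total;
}
```

For an array holding "ab", "" and "cde", a call with count 3 yields 5.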
sizeof(glfwNames)
sizeof has nothing to do with "number of elements". glfwNames is a pointer. sizeof(glfwNames) is the size of a pointer object. All pointers of particular type are always the same size (within a particular system) regardless of the object they point at.
There is no way of finding out the number of elements if given only a pointer to an element of the array and no other information. It is simply not possible because a pointer doesn't contain the necessary information. Without knowing when to stop iterating the array, there is no way to do the iteration correctly.
These are common ways of iterating a range (the range being an array in this case) of iterators (pointers being iterators for arrays):
Pass the length of the range as an argument, and use an indexed loop. This works with randomly accessible iterators such as pointers.
Pass an iterator (pointer in this case) to the end (one past the last element) of the range. This approach is more generic as it works with non-random access iterators as well. Increment the iterator in the loop, and once it equals the end iterator, you know you've reached the end of the range.
Designate certain value as a sentinel element (also known as a terminator). When a sentinel value is encountered, you know that iteration has reached the end of the range. This approach is used with null-terminated strings.
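The three approaches above can be sketched side by side. This is only an illustration: the function names are made up, and each one finds the length of the longest string so that the loops have something to do.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// 1. Length passed as an argument, indexed loop.
std::size_t longest_by_length(const char* const* names, std::size_t length)
{
    std::size_t longest = 0;
    for (std::size_t i = 0; i < length; ++i)
        longest = std::max(longest, std::strlen(names[i]));
    return longest;
}

// 2. Begin/end pointer pair; `end` points one past the last element.
std::size_t longest_by_end(const char* const* begin, const char* const* end)
{
    std::size_t longest = 0;
    for (const char* const* it = begin; it != end; ++it)
        longest = std::max(longest, std::strlen(*it));
    return longest;
}

// 3. Sentinel: the array itself must end with a null pointer.
std::size_t longest_by_sentinel(const char* const* names)
{
    std::size_t longest = 0;
    for (; *names != nullptr; ++names)
        longest = std::max(longest, std::strlen(*names));
    return longest;
}
```

The third variant only works if whoever built the array actually appended the null pointer; calling it on an unterminated array reads past the end.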
A side note: Interestingly, the command line arguments (argc + argv) provide both ways: argc is the length, but the char** argv is also terminated by a null pointer.
If you receive glfwNames from a library, then consult the documentation of the library for how to find out the length. If you're in control of both calling and defining the function, then make sure that you pass sufficient information into the function.
A C string ends with a '\0'.
Let's say you have const char* ptr = "string";
What you can do here is:
int i = 0;
while (ptr[i] != '\0')
i++;
cout << i;
return 0;
That will give you the number of characters in the string, not counting the terminating '\0'. Alternatively, you can use strlen(), which is declared in <string.h> (<cstring> in C++).
I'm working on an exercise to calculate the length of a string using pointers.
Here's the code I've written below:
int main() {
std::string text = "Hello World";
std::string *string_ptr = &text;
int size = 0;
//Error below: ISO C++ forbids comparison between pointer and integer [-fpermissive]
while (string_ptr != '\0') {
size++;
string_ptr++;
}
std::cout << size;
}
In a lot of examples that I've seen, the string is often a char array, which I also understand is a string. However, I want to try to calculate it as a string object, but I'm getting the error shown in the comment above.
Is it possible to calculate it where the string is an object, or does it need to be a char array?
If you just want the size of the string, well, use std::string::size():
auto size = text.size();
Alternatively, you can use length(), which does the same thing.
But I'm guessing you're trying to reimplement strlen for learning purposes. In that case, there are three problems with your code.
First, you're trying to count the number of characters in the string, and that means you need a pointer to char, not a pointer to std::string. That pointer should also point to constant characters, because you're not trying to modify those characters.
Second, to get a pointer to the string's characters, use its method c_str(). Getting the address of the string just gets you a pointer to the string itself, not its contents. Most importantly, the characters pointed to by c_str() are null terminated, so it is safe to use for your purposes here. Alternatively, use data(), which has been behaving identically to c_str() since C++11.
Finally, counting those characters involves checking if the value pointed to by the pointer is '\0', so you'll need to dereference it in your loop.
Putting all of this together:
const char* string_ptr = text.c_str(); // get the characters
int size = 0;
while (*string_ptr != '\0') { // make sure you dereference the pointer
size++;
string_ptr++;
}
Of course, this assumes the string does not contain what are known as "embedded nulls", which is when there are '\0' characters before the end. std::string can contain such characters and will work correctly. In that case, your function will return a different value from what the string's size() method would, but there's no way around it.
For that reason, you should really just call size().
First things first, the problem is irrelevant. std::string::size() is an O(1) (constant time) operation, as std::strings typically store their size. Even if you need to know the length of a C-style string (aka char*), you can use strlen. (I get that this is an exercise, but I still wanted to warn you.)
Anyway, here you go:
size_t cstrSize(const char* cstr)
{
size_t size(0);
while (*cstr != '\0')
{
++size;
++cstr;
}
return size;
}
You can get the underlying C-style string (which is a pointer to the first character) of a std::string by calling std::string::c_str(). What you did was getting a pointer to the std::string object itself, and dereferencing it would just give you that object back. And yes, you need to dereference it (using the * unary operator). That is why you got an error (which was on the (string_ptr != '\0') btw).
You are totally confused here.
“text” is a std::string, that is an object with a size() method returning the length of the string.
“string_ptr” is a pointer to a std::string, that is a pointer to an object. Since it is a pointer to an object, you don’t use text.size() to get the length, but string_ptr->size().
So first, no, you can’t compare a pointer with an integer constant, only with NULL or another pointer.
The first time you increase string_ptr it points to the memory after the variable text. At that point using *string_ptr for anything is undefined behavior and will most likely crash.
Remember: std::string is an object.
I wrote a simple function that can get the size of a std::string class object, and I know that size() function in std::string does the same job, So I wanted to know if the size() function really works like my function or if it is more complicated? If it's more complicated, then how?
int sizeOfString(const string str) {
int i=0;
while (str[i] != '\0') {
++i;
}
return i;
}
An std::string can contain null bytes, so your sizeOfString() function will produce a different result on the following input:
std::string evil("abc\0def", 7);
As for your other question: the size() method simply reads out an internal size field, so it is always constant time, while yours is linear in the size of the string.
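To make the difference concrete, here is a small sketch of the scanning approach. The name size_by_scanning is illustrative; it is the question's loop, except that it takes the string by const reference and uses std::size_t.

```cpp
#include <cstddef>
#include <string>

// Scans for the first '\0', like the function in the question.
std::size_t size_by_scanning(const std::string& str)
{
    std::size_t i = 0;
    while (str[i] != '\0')  // str[str.size()] is '\0', so this terminates
        ++i;
    return i;
}
```

For std::string evil("abc\0def", 7), evil.size() is 7 while size_by_scanning(evil) stops at the embedded null and returns 3.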
You can peek at the implementation of std::string::size for various implementations for yourself: libc++, MSVC, libstdc++.
No.
Firstly, a std::string can contain NUL characters that count as part of the length, so you can't use '\0' as a sentinel, in the way you would for C-strings.
Secondly, The Standard guarantees that std::string::size has constant complexity.
In practice there are a few slightly different ways to represent a std::string:
pointer to start of buffer, buffer size, length of current data - size() just has to return the length member.
pointer to start of buffer, pointer to end of current data, pointer to end of buffer - size() has to return a simple calculation.
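The two representations can be sketched as toy structs. The struct and member names are hypothetical, and real standard library implementations are more involved (for example, they add small-string optimisations), so treat this only as an illustration of why size() is cheap in both cases.

```cpp
#include <cstddef>

// Layout 1: pointer, buffer size, and length stored directly.
struct LengthLayout {
    char*       data;
    std::size_t capacity;
    std::size_t length;
    std::size_t size() const { return length; }  // returns a member
};

// Layout 2: three pointers; the size is a pointer difference.
struct PointerLayout {
    char* begin;
    char* end_of_data;
    char* end_of_buffer;
    std::size_t size() const
    {
        return static_cast<std::size_t>(end_of_data - begin);
    }
};
```

Neither size() looks at the characters themselves, which is why the cost does not depend on the length of the string.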
It is different than your implementation.
Your function iterates over the string until it finds a null byte. Null-terminated strings are how strings are handled in C through char*. In C++ a string is a full object with member variables.
Specifically for C++, the size of the string is stored as part of the object, making the size() function simply read out the value of a variable.
For an interesting talk about how a string works in C++, check out this video from CppCon: https://www.youtube.com/watch?v=kPR8h4-qZdk
No. Not at all like that.
std::string actually maintains the size as one of its data members. Think of std::string as a container that keeps a pointer to the actual data (a char*) and the length of that data separately.
When you call size(), it actually just returns this size, hence it's O(1).
One example to highlight its effect in practice:
// WRONG IMPLEMENTATION
int wrongChangeLengthToZero(std::string& s)
{
assert(s.size() != 0);
s[0]='\0';
return s.size(); // Won't return 0
}
// CORRECT
int correctChangeLengthToZero(std::string& s)
{
assert(s.size() != 0);
s.resize(0);
return s.size(); // Will return 0
}
I was working with a program that uses a function to set a new value in the registry, I used a const char * to get the value. However, the size of the value is only four bytes. I've tried to use std::string as a parameter instead, it didn't work.
I have a small example to show you what I'm talking about, and rather than solving my problem with the function I'd like to know the reason it does this.
#include <iostream>
void test(const char * input)
{
std::cout << input;
std::cout << "\n" << sizeof("THIS IS A TEST") << "\n" << sizeof(input) << "\n";
/* The code above prints out the size of an explicit string (THIS IS A TEST), which is 15. */
/* It then prints out the size of input, which is 4.*/
int sum = 0;
for(int i = 0; i < 15; i++) //Printed out each character, added the size of each to sum and printed it out.
//The result was 15.
{
sum += sizeof(input[i]);
std::cout << input[i];
}
std::cout << "\n" << sum;
}
int main(int argc, char * argv[])
{
test("THIS IS A TEST");
std::cin.get();
return 0;
}
Output:
THIS IS A TEST
15
4
THIS IS A TEST
15
What's the correct way to get string parameters? Do I have to loop through the whole array of characters and print each to a string (the value in the registry was only the first four bytes of the char)? Or can I use std::string as a parameter instead?
I wasn't sure if this was SO material, but I decided to post here as I consider this to be one of my best sources for programming related information.
sizeof(input) is the size of a const char*. What you want is strlen(input) + 1.
sizeof("THIS IS A TEST") is size of a const char[]. sizeof gives the size of the array when passed an array type which is why it is 15 .
For std::string use length()
sizeof gives a size based on the type you give it as a parameter. If you use the name of a variable, sizeof still only bases its result on the type of that variable. In the case of char *whatever, it's telling you the size of a pointer to char, not the size of the zero-terminated buffer it points at. If you want the latter, you can use strlen instead. Note that strlen tells you the length of the content of the string, not including the terminating '\0'. As such, if (for example) you want to allocate space to duplicate a string, you need to add 1 to the result to tell you the total space occupied by the string.
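A short sketch of the strlen-based size computation described here; the helper name bytes_with_terminator is made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

// Bytes needed to store a C string including its terminating '\0'.
std::size_t bytes_with_terminator(const char* input)
{
    // sizeof(input) would be the size of the pointer here,
    // not the length of the string it points at.
    return std::strlen(input) + 1;
}
```

For "THIS IS A TEST", strlen reports 14 and the helper returns 15, which matches sizeof applied to the literal itself.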
Yes, as a rule in C++ you normally want to use std::string instead of pointers to char. In this case, you can use your_string.size() (or, equivalently, your_string.length()).
std::string is a C++ object, which cannot be passed to most APIs. Most APIs take char*, as you noticed, which is very different from a std::string. However, since this is a common need, std::string has a function for that: c_str.
std::string input;
const char* ptr = input.c_str(); //note, is const
In C++11, it is now also safe-ish to do this:
char* ptr = &input[0]; //nonconst
and you can alter the characters, but the size is fixed, and the pointer is invalidated if you call any mutating member of the std::string.
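As a rough sketch of the &input[0] technique, the following lets a C-style function write directly into a std::string's storage. Both function names are hypothetical, and c_style_fill merely stands in for whatever real API expects a char* buffer and a capacity.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style API: writes at most `capacity - 1` chars plus '\0'
// into `buffer` and returns the number of chars written.
std::size_t c_style_fill(char* buffer, std::size_t capacity)
{
    const char* text = "hello";
    std::size_t n = std::strlen(text);
    if (capacity == 0) return 0;
    if (n > capacity - 1) n = capacity - 1;
    std::memcpy(buffer, text, n);
    buffer[n] = '\0';
    return n;
}

// Lets the C-style function write straight into a std::string's storage.
std::string read_via_c_api()
{
    std::string result(16, '\0');  // fixed-size scratch space
    std::size_t n = c_style_fill(&result[0], result.size());
    result.resize(n);              // trim to what was actually written
    return result;
}
```

The string is sized before the call and trimmed afterwards, because the callee cannot grow it through the raw pointer.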
As for the code you posted, "THIS IS A TEST" has the type of const char[15], which has a size of 15 bytes. The char* input however, has a type char* (obviously), which has a size of 4 on your system. (Might be other sizes on other systems)
To find the size of a C string pointed at by a char* pointer, you can call strlen(...) if it is null-terminated. It will return the number of characters before the first null character.
If the registry you speak of is the Windows registry, it may be an issue of Unicode vs. ASCII.
Modern Windows stores almost all strings as UTF-16 Unicode, which uses 2 bytes per code unit.
If you try to put a Unicode string into an std::string, it may be getting a 0 (null), which some implementations of string classes treat as "end of string."
You may try using a std::wstring (wide string) or vector< wchar_t > (wide character type). These can store strings of two-byte characters.
sizeof() is also not giving you the value you may think it is giving you. Your system probably runs 32-bit Windows -- that "4" value is the size of the pointer to the first character of that string.
If this doesn't help, please post the specific results that occur when you use std::string or std::wstring (more than saying that it doesn't work).
To put it simply, the size of a const char * != the size of a const char[] (if they are equal, it's by coincidence). The former is a pointer. A pointer, in the case of your system, is 4 bytes REGARDLESS of the datatype it points to. It could be int, char, float, whatever. This is because a pointer always holds a memory address, which is numeric. Print out sizeof for any of your pointers and you'll see it's 4 bytes. const char[], on the other hand, is the array itself, and sizeof yields the size of the whole array.
C++ novice here. I have some basic questions. In int main( int argc, char *argv[] )
How is char *argv[] supposed to be read (or spoken out to humans)?
Is it possible to clear/erase specific content(s), character(s) in this case, of such array? If yes, how?
Can arrays be resized? If yes, how?
How can I copy the entire content of argv[] to a single std::string variable?
Are there other ways of determining the number of words / parameters in argv[] without argc? If yes, how? (*)
I'd appreciate explanations (not code) for numbers 2-5. I'll figure out the code myself (I learn faster this way).
Thanks in advance.
(*) I know that main(char *argv[]) is illegal. What I mean is whether there's at least a way that does not involve argc at all, like in the following expressions:
for( int i = 0; i < argc; ++i ) {
std::cout << argv[i] << std::endl;
}
and
int i = 0;
while( i < argc ) {
std::cout << argv[i] << std::endl;
++i;
}
Or
int i = 0;
do {
std::cout << argv[i] << std::endl;
++i; } while( i < argc );
It's an array of pointers to char.
Sort of - you can overwrite them.
Only by copying to a new array.
Write a loop and append each argv[i] to a C++ string.
Implementations terminate the array with a null pointer. This is standard: both C and C++ require argv[argc] to be a null pointer.
char **argv[]
Is wrong. It should be either char **argv or char *argv[], not a mixture of both. And then it becomes a pointer-to-pointer to characters, or rather a pointer to c-strings, i.e., an array of c-strings. :) cdecl.org is also quite helpful at things like this.
Then, for the access, sure. Just, well, access it. :) argv[0] would be the first string, argv[3] would be the 4th string. But I totally wouldn't recommend replacing stuff in an array that isn't yours or that you know the internals of.
On array resize, since you're writing C++, use std::vector, which does all the complicated allocation stuff for you and is really safe. Generally, it depends on the array type. Dynamically allocated arrays (int* int_arr = new int[20]) can, static arrays (int int_arr[20]) can't.
To copy everything in argv into a single std::string, loop through the array and append every c-string to your std::string. I wouldn't recommend that though, rather have a std::vector<std::string>, i.e., an array of std::strings, each holding one of the arguments.
std::vector<std::string> args;
for(int i=0; i < argc; ++i){
args.push_back(argv[i]);
}
On your last point, since the standard demands argv to be terminated by a NULL pointer, it's quite easy.
int myargc = 0;
char** argv_copy = argv;
while(*argv_copy++)
++myargc;
The while(*argv_copy++) reads the element the pointer currently refers to and then advances the pointer to the next element (e.g., the first iteration tests argv[0] and leaves the pointer at argv[1]). If the element that was read evaluates to false (if it is NULL), the loop breaks and you have your myargc. :)
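Combining the two ideas from this answer, a sketch that fills a std::vector<std::string> by walking argv up to its terminating null pointer, without looking at argc. The function name collect_args is made up for illustration.

```cpp
#include <string>
#include <vector>

// Collects the arguments by walking argv until its terminating
// null pointer, so argc is not needed.
std::vector<std::string> collect_args(const char* const* argv)
{
    std::vector<std::string> args;
    for (; *argv != nullptr; ++argv)
        args.emplace_back(*argv);  // copies each C string
    return args;
}
```

Inside main this could be called as collect_args(argv); the char** converts implicitly to const char* const*.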
Several options: array of pointer to char OR array of C-string.
You can assign to particular characters to clear them, or you can shift the rest of the array forwards to "erase" characters/elements.
Normal C-style arrays cannot be resized. If you need a resizable array in C++ you should use std::vector.
You'll have to iterate over each of the items and append them to a string. This can be accomplished with C++ algorithms such as copy in conjunction with an ostream_iterator used on an ostringstream.
No. If there was such a way, there wouldn't be any need for argc. EDIT: Apparently for argv only the final element of the array is a null pointer.
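The copy plus ostream_iterator approach mentioned in this answer might look like the sketch below. The function name join_args and the single-space separator are assumptions made for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Concatenates argv[0..argc) into one string,
// writing a space after every argument.
std::string join_args(int argc, const char* const* argv)
{
    std::ostringstream out;
    std::copy(argv, argv + argc,
              std::ostream_iterator<const char*>(out, " "));
    return out.str();
}
```

Because the delimiter is written after every element, the result ends with a trailing space, which may need trimming depending on the use.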
1) It is supposed to be char **argv or char *argv[] which is a pointer to an array of characters more commonly known as an array of strings
2) <cstring> is the standard library header for manipulating C strings (arrays of characters). You cannot resize an array without reallocating, but you can change the contents of elements by referencing them by index:
for(int i = 0; i < argc; ++i)
{
//set all the strings to have a null character in the
//first slot so all <cstring> operations on this array
//will consider each one an empty string
argv[i][0] = '\0';
}
3) Technically no, however they can be deleted then reallocated:
int *array = new int[15]; //array of size 15
delete[] array;
array = new int[50]; //array of size 50
4) This is one way:
std::string myString;
for(int i = 0; i < argc; ++i)
myString.append(argv[i]); //concatenates the arguments with no separator
5) Yes, according to Cubbi:
POSIX specifies the final null pointer for argv, see for example "The application shall ensure that the last member of this array is a null pointer." at pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
Which means you can do:
for(int i = 0; argv[i] != NULL; ++i) //loop until the null pointer that ends argv
{
char *val = argv[i]; //access argv[i] and store it in val
//do something with val
}
It is spoken as "array of pointers to pointers to character" (note that this is not the signature of the main function, which is either int argc, char **argv or int argc, char *argv[] -- which is equivalent).
The argv array is modifiable (lack of const). It is illegal to write beyond the end of one of the strings though or extend the array; if you need to extend a string, create a copy of it and store a pointer in the array; if you need to extend the array, create a copy of the array.
They cannot be resized per se, but reinterpreted as a smaller array (which sort of explains the answer to the last question).
You will be losing information this way -- argv is an array of arrays, because the individual arguments have already been separated for you. You could create a list of strings using std::list<std::string> args(&argv[1], &argv[argc]);.
Yes. Both the C and C++ standards guarantee that argv[argc] is a null pointer, so you can walk the array until you reach it.
char *argv[] can be read as: "an array of pointers to char"
char **argv can be read as: "a pointer to a pointer to char"
Yes, you may modify the argv array. For example, argv[0][0] = 'H' will modify the first character of the first parameter. If by "erase/clear" you mean removing a character from the array and having everything automatically shift over: there is no automatic way to do that. You will need to copy all the following characters one by one over to the left (including the null terminator).
No, arrays cannot be resized. You will need to create a new one and copy the contents
How do you want to represent ALL the parameter strings as 1 std::string? It would make more sense to copy it to an array of std::strings
Yes, the entry after the last argument, argv[argc], is a null pointer, so you can loop until you reach it instead of using argc.
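The manual shift to the left described in this answer can be sketched with std::memmove. The function name erase_char_at is illustrative, and the sketch assumes the index is inside the string.

```cpp
#include <cstddef>
#include <cstring>

// Removes the character at index `pos` from a null-terminated string
// in place by shifting the tail (including the terminating '\0')
// one slot to the left. Assumes pos < strlen(s).
void erase_char_at(char* s, std::size_t pos)
{
    std::size_t len = std::strlen(s);
    std::memmove(s + pos, s + pos + 1, len - pos);  // len - pos includes '\0'
}
```

std::memmove is used rather than std::memcpy because the source and destination ranges overlap.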
An array in C/C++ is not a container object with insert or erase operations; in most expressions it simply decays to a pointer to its first element, so you cannot simply delete or insert values.
Answering your questions:
char *argv[] can be read as 'array of pointers to char'
It's possible, but involves direct manipulations with data in memory, such as copying and/or moving bytes around.
No. But you may allocate a new array and copy the necessary data
By manually copying each element into std::string object
Yes: argv[argc] is a null pointer, so you can stop there.
As a summary: C++ is a much lower-level language than you think.
I like "reinventing the wheel" for learning purposes, so I'm working on a container class for strings. Will using the NULL character as an array terminator (i.e., the last value in the array will be NULL) cause interference with the null-terminated strings?
I think it would only be an issue if an empty string is added, but I might be missing something.
EDIT: This is in C++.
"" is the empty string in C and C++, not NULL. Note that "" has exactly one element (instead of zero), meaning it is equivalent to {'\0'} as an array of char.
char const *notastring = NULL;
char const *emptystring = "";
emptystring[0] == '\0'; // true
notastring[0] == '\0'; // crashes
No, it won't, because you won't be storing in an array of char, you'll be storing in an array of char*.
char const* strings[] = {
"WTF"
, "Am"
, "I"
, "Using"
, "Char"
, "Arrays?!"
, 0
};
It depends on what kind of string you're storing.
If you're storing C-style strings, which are basically just pointers to character arrays (char*), there's a difference between a NULL pointer value, and an empty string. The former means the pointer is ‘empty’, the latter means the pointer points to an array that contains a single item with character value 0 ('\0'). So the pointer still has a value, and testing it (if (foo[3])) will work as expected.
If what you're storing are C++ standard library strings of type string, then there is no NULL value. That's because there is no pointer, and the string type is treated as a single value. (Whereas a pointer is technically not, but can be seen as a reference.)
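For the C-style case, a small sketch shows that an empty string does not collide with a null-pointer terminator. The function name count_strings is made up for illustration.

```cpp
#include <cstddef>

// Counts the entries of an array of C strings that ends with a null pointer.
// An empty string "" is a valid, non-null pointer, so it does not end the scan.
std::size_t count_strings(const char* const* strings)
{
    std::size_t n = 0;
    while (strings[n] != nullptr)
        ++n;
    return n;
}
```

For the array {"Hello", "", "World", nullptr} the count is 3: the empty string is a pointer to a single '\0' character, which is different from the null pointer that ends the array.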
I think you are confused. While C-strings are "null terminated", there is no "NULL" character. NULL is a name for a null pointer. The terminator for a C-string is a null character, i.e. a byte with a value of zero. In ASCII, this byte is (somewhat confusingly) named NUL.
Suppose your class contains an array of char that is used to store the string data. You do not need to "mark the end of the array"; the array has a specific size that is set at compile-time. You do need to know how much of that space is actually being used; the null-terminator on the string data accomplishes that for you - but you can get better performance by actually remembering the length. Also, a "string" class with a statically-sized char buffer is not very useful at all, because that buffer size is an upper limit on the length of strings you can have.
So a better string class would contain a pointer of type char*, which points to a dynamically allocated (via new[]) array of chars. Again, it makes no sense to "mark the end of the array", but you will want to remember both the length of the string (i.e. the amount of space being used) and the size of the allocation (i.e. the amount of space that may be used before you have to re-allocate).
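A bare-bones sketch of such a class, assuming a doubling growth policy and leaving out copying for brevity. The class name TinyString and all member names are hypothetical; this is a learning aid, not a replacement for std::string.

```cpp
#include <cstddef>
#include <cstring>

// Illustrative only: tracks length and capacity separately from the buffer.
class TinyString {
public:
    TinyString() : data_(new char[1]), length_(0), capacity_(0) { data_[0] = '\0'; }
    ~TinyString() { delete[] data_; }
    TinyString(const TinyString&) = delete;             // copying omitted for brevity
    TinyString& operator=(const TinyString&) = delete;

    void append(const char* text)
    {
        std::size_t extra = std::strlen(text);
        if (length_ + extra > capacity_)
            grow(2 * (length_ + extra));
        std::memcpy(data_ + length_, text, extra + 1);  // copies the '\0' too
        length_ += extra;
    }

    std::size_t size() const { return length_; }        // O(1): just reads a member
    std::size_t capacity() const { return capacity_; }
    const char* c_str() const { return data_; }

private:
    void grow(std::size_t new_capacity)
    {
        char* bigger = new char[new_capacity + 1];      // +1 for the terminator
        std::memcpy(bigger, data_, length_ + 1);
        delete[] data_;
        data_ = bigger;
        capacity_ = new_capacity;
    }

    char*       data_;
    std::size_t length_;
    std::size_t capacity_;
};
```

The buffer is still kept null-terminated so that c_str() can hand it to C functions, but size() never has to scan for the terminator.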
When you are copying from std::string, use the iterators begin() and end(), and you don't have to worry about the null terminator, because it is not part of the range [begin(), end()). If you need a null-terminated block of memory, call c_str(). If you want to memcpy, use the data() method.
Why don't you follow the pattern used by vector - store the number of elements within your container class, then you know always how many values there are in it:
vector<string> myVector;
size_t elements(myVector.size());
Instantiating a string with x where const char* x = 0; can be problematic. See this code in Visual C++ STL that gets called when you do this:
_Myt& assign(const _Elem *_Ptr)
{ // assign [_Ptr, <null>)
_DEBUG_POINTER(_Ptr);
return (assign(_Ptr, _Traits::length(_Ptr)));
}
static size_t __CLRCALL_OR_CDECL length(const _Elem *_First)
{ // find length of null-terminated string
return (_CSTD strlen(_First));
}
#include "Maxmp_crafts_fine_wheels.h"
MaxpmContainer maxpm;
maxpm.add("Hello");
maxpm.add(""); // uh oh, adding an empty string; should I worry?
maxpm.add(0);
At this point, as a user of MaxpmContainer who had not read your documentation, I would expect the following:
strcmp(maxpm[0],"Hello") == 0;
*maxpm[1] == 0;
maxpm[2] == 0;
Interference between the zero terminator at position two and the empty string at position one is avoided by means of the "interpret this as a memory address" operator *. Position one will not be zero; it will be an integer, which if you interpret it as a memory address, will turn out to be zero. Position two will be zero, which, if you interpret it as a memory address, will turn out to be an abrupt disorderly exit from your program.