What does ++array[str[i]]; do?

I found a program to print out the most frequent character in an array of char.
Here is the code.
void main()
{
int array[255] = {0}; // initialize all elements to 0
char str[] = "thequickbrownfoxjumpedoverthelazydog";
int i, max, index;
for(i = 0; str[i] != 0; i++)
{
++array[str[i]];
}
// then find the most used character ...
}
I don't really understand what ++array[str[i]]; does.
We declared the array as int array[255], but it still accepts str[i] as an index, which I believe is of type char.
Is it because str[i] is automatically converted to its ASCII code? And what does the ++ in front of the expression do?

In this code
++array[str[i]];
i walks the length of str (because of the setup of the loop we are inside...).
For each character inside str, the expression str[i] gets the value of that character. I use "value" instead of "character", because it later is treated as an integer index.
With that value the expression array[str[i]] accesses one of the entries in the array. Each entry in that array corresponds to one possible ASCII "character".
The ++ increments the value in the array. I.e. it counts the number of occurrences of e.g. 'a'.
In total, the code makes a histogram of ASCII character frequency inside str.
Note however the important warning by WhozCraig, in case you intend to use this. You have to match the assumptions the code makes (copied with permission, for completeness):
Just fyi, not casting that index to unsigned char is a recipe for disaster. Further, this is not using a table guaranteed to hold enough slots to cover the domain. i.e. 1 << CHAR_BIT in width. It will "work" (term used loosely) for your input string presented here. It is not an end-all general solution to char counting.
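The warning above can be addressed with two small changes: size the table so that it covers every possible unsigned char value, and cast each character before using it as an index. The following is only a minimal sketch of that idea; the names Histogram, count_chars and most_frequent and the use of std::array are my own choices and do not come from the original program.

```cpp
#include <array>
#include <climits>
#include <cstddef>

// One counter per possible unsigned char value (256 on common platforms).
using Histogram = std::array<int, UCHAR_MAX + 1>;

// Hypothetical helper: counts how often each character occurs in a C string.
Histogram count_chars(const char* str)
{
    Histogram counts{};  // value-initialized: every counter starts at 0
    for (std::size_t i = 0; str[i] != '\0'; ++i)
    {
        // Cast first: a plain char may be signed, and a negative index
        // would access memory outside the table.
        ++counts[static_cast<unsigned char>(str[i])];
    }
    return counts;
}

// Returns the most frequent character; ties go to the lowest character code.
char most_frequent(const Histogram& counts)
{
    std::size_t best = 0;
    for (std::size_t c = 1; c < counts.size(); ++c)
    {
        if (counts[c] > counts[best]) best = c;
    }
    return static_cast<char>(best);
}
```

Calling count_chars on the string from the question and passing the result to most_frequent yields the most used character without relying on the input being plain ASCII.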

First, the array has 255 elements because the character codes used here fit in that range: ASCII codes run from 0 to 127, and the array covers indices 0 to 254. For example, when str[i] is 'a', it is converted to the value 97, which is a valid index into the array. You can look up the values in an ASCII table such as http://www.asciitable.com
Second, the ++ in ++array[str[i]]; is the pre-increment operator, which simply adds 1 to the value stored in the array. In this case post-increment gives the same result: array[str[i]]++;
For more on pre- and post-increment, see:
https://www.geeksforgeeks.org/pre-increment-and-post-increment-in-c/

Related

Why is strlen(s) different from the size of s, and why does cout char display a character not a number?

I wrote a piece of code to count how many 'e' characters are in a bunch of words.
For example, if I type "I read the news", the counter for how many e's are present should be 3.
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
char s[255],n,i,nr=0;
cin.getline(s,255);
for(i=1; i<=strlen(s); i++)
{
if(s[i-1]=='e') nr++;
}
cout<<nr;
return 0;
}
I have 2 unclear things about characters in C++:
In the code above, if I replace strlen(s) with 255, my code just doesn't work. I can only type a word and the program stops. I have been taught at school that strlen(s) is the length for the string s, which in this case, as I declared it, is 255. So, why can't I just type 255, instead of strlen(s)?
If I run the program above normally, it doesn't show me a number, like it is supposed to do. It shows me a character (I believe it is from the ASCII table, but I'm not sure), like a heart or a diamond. It is supposed to print the number of e's from the words.
Can anybody please explain these to me?

strlen(s) gives you the length of the string held in the s variable, up to the first null character ('\0'). So if you input "hello", the length will be 5, even though s has a capacity of 255....
nr is displayed as a character because it's declared as a char. Either declare it as int, for example, or cast it to int when cout'ing, and you'll see a number.

strlen() counts the actual length of strings - the number of real characters up to the first \0 character (marking end of string).
So, if you input "Hello":
sizeof(s) == 255
strlen(s) == 5
For the second question: you declared nr as type char. std::cout treats a char as a single character and tries to print it as such. Declare your variable as int, or cast it before printing, to avoid this.
int nr = 42;
std::cout << nr;
//or
char charNr = 42;
std::cout << static_cast<int>(charNr);

Additional mistakes not mentioned by others, and notes:
You should always check whether the stream operation was successful before trying to use the result.
i is declared as char and cannot hold values greater than 127 on common platforms. In general, the maximum value for char can be obtained as either CHAR_MAX or std::numeric_limits<char>::max(). So, on common platforms, i <= 255 will always be true because 255 is greater than CHAR_MAX. Incrementing i once it has reached CHAR_MAX does not help either: the result does not fit into a char, so where char is signed it typically wraps around to a negative value (implementation-defined before C++20), and s[i-1] then reads outside the array, which is undefined behavior. I recommend declaring i at least as int (which is guaranteed to have sufficient range for this particular use case). If you want to be on the safe side, use something like std::ptrdiff_t (add #include <cstddef> at the start of your program), which is guaranteed to be large enough to hold any valid array size.
n is declared but never used. This by itself is harmless but may indicate a design issue. It can also lead to mistakes such as trying to use n instead of nr.
You probably want to output a newline ('\n') at the end, as your program's output may look odd otherwise.
Also note that calling a potentially expensive function such as strlen repeatedly (as in the loop condition) can have negative performance implications (strlen is typically an intrinsic function, though, and the compiler may be able to optimize most calls away).
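You do not need strlen anyway, and can use cin.gcount() instead. Note that gcount() also counts the newline that getline extracts and discards, so it is usually one more than strlen(s); the extra position holds the terminating '\0', which is harmless when counting 'e'.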
Nothing wrong with return 0; except that it is redundant – this is a special case that only applies to the main function.
Here's an improved version of your program, without trying to change your code style overly much:
#include <iostream>
#include <cstring>
#include <cstddef>
using namespace std;
int main()
{
char s[255];
int nr=0;
if ( cin.getline(s,255) )
{ // only if reading was successful
for(int i=0; i<cin.gcount(); i++)
{
if(s[i]=='e') nr++;
}
cout<<nr<<'\n';
}
return 0;
}
For exposition, the following is a more concise and expressive version using std::string (for arbitrary length input), and a standard algorithm. (As an interviewer, I would set this, modulo minor stylistic differences, as the canonical answer i.e. worth full credit.)
#include <algorithm>
#include <iostream>
#include <string>
using namespace std;
int main()
{
string s;
if ( getline(cin, s) )
{
cout << std::count(begin(s), end(s), 'e') << '\n';
}
}

I have 2 unclear things about characters in C++: 1) In the code above,
if I replace the "strlen(s)" with 255, my code just doesn't work, I
can only type a word and the program stops, and I have been taught at
school that "strlen(s)" is the length for the string s, wich in this
case, as I declared it, is 255. So, why can't I just type 255, instead
of strlen(s);
The buffer is 255 characters long, but a string only goes up to the null terminator, even if more space is allocated. Consider this, for example:
char buf[32];
strcpy(buf, "Hello World!");
There's 32 chars worth of space, but my string is only 12 characters long. That's why strlen returns 12 in this example. It's because it doesn't know how long the buffer is, it only knows the address of the string and parses it until it finds the null terminator.
So if you enter 255, you're going past what was set by cin and you'll read the rest of the buffer. Which, in this case, is uninitialized. That's undefined behavior - in this case it will most likely read some rubbish values, and those might coincidentally have the 'e' value and thus give you a wrong result.
2) If you run the program above normaly, it doesn't show you a number,
like it's supposed to do, it shows me a character(I believe it's from
the ASCII table but I'm not sure), like a heart or a diamond, but it
is supposed to print the number of e's from the words. So can anybody
please explain these to me?
You declared nr as char. While that can indeed hold an integer value, if you print it like this, it will be printed as a character. Declare it as int instead or cast it when you print it.

Finding location of character in input2 from input1

Important note: string (C++ object) and any other library such as array or vectors that could store unlimited characters cannot be used.
For my question:
We are given input 1, which is a sentence of unlimited characters. eg. Life is Beautiful.
Input 2: characters whose locations we have to find, using the reference point (the middle character of input 1 after it is sorted and repeated characters are deleted) as zero. E.g. fee.
An example:
Input 1: Life is beautiful
Input 2: see
Output: 2, -2, -2
Explanation: First, we remove any spaces from input 1 and make it all lowercase, then sort it in ascending order and remove the repeated characters. The middle character of the result is the reference letter (for the example above, it's 'i'). Finally, we assign positions to the remaining characters of input 1 relative to that reference letter.
Example 2
Input 1: abcde
Input 2: aad
Output: -2, -2, 1
If the input 2 contains reference point, then the code returns zero.
Eg.
An example:
Input 1: abcde
Input 2: cab
Output: 0
The input1 is always odd and input2 is always 10 character max.
The problem I have is that I am not sure how to store these inputs without using strings, array etc. And even if I know how to store them, I cannot compare the inputs like input1[1] = input2[1] because we cannot use arrays/strings.
Is a list a useful option, given the important note above?
I have mostly done it with the use of array but not sure how to approach it without the array. I tried to loop a character but it only stores the first character.
My practice code:
#include <iostream>
using namespace std;
int main() {
char input1;
for(int i =0; i < 3; i++ ) //for checking whether the loop works or not.
{
cin >> input1;
}
cout<< input1;
char input2;
}
Please add any relevant tags.
I hope all the edits help.

KushanMehta proposed a C-ish solution. A more C++ one would be to implement a class wrapping a dynamic array of elements. In C++ it could be:
template <class T>
class MyArr {
protected:
T *arr; // a pointer to the dynamic array
size_t len; // the used length
size_t capacity; // the allocated capacity
...
As it contains a pointer to a dynamic array, you cannot rely on the default members, and should implement copy and move constructors, assignment operators and a destructor.
In order to use all the goodies of the C++ algorithm library, you should declare [c]begin() and [c]end() functions pointing to the beginning of the array and one past the last element.
const T* cbegin() const {
return arr;
}
const T* cend() const {
return arr + len;
}
Then you need a subclass for characters implementing methods to convert all characters to lower case, remove spaces, sort the array and remove duplicates. You should also write overloads of operator << and operator >> to be able to read strings from stdin and write them out.
The MyArr class can be used directly to store the result values: just derive a specialization for int elements and implement the required specifications.
That may not be really easy, but you will learn C++ that way (not C)
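To make the outline above more concrete, here is one way the class could be filled in. This is only an illustrative sketch: apart from the three data members and cbegin()/cend(), every member function shown (push_back, size, the copy-and-swap assignment) is my own assumption about a reasonable design, and it requires T to be default-constructible and assignable.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Illustrative sketch only: a minimal growable array in the spirit of the
// MyArr outline. Member functions beyond cbegin()/cend() are assumptions.
template <class T>
class MyArr {
protected:
    T* arr = nullptr;          // pointer to the dynamic array
    std::size_t len = 0;       // the used length
    std::size_t capacity = 0;  // the allocated capacity

public:
    MyArr() = default;

    // Copy constructor: allocate a separate buffer and copy the elements.
    MyArr(const MyArr& other)
        : arr(other.len ? new T[other.len] : nullptr),
          len(other.len), capacity(other.len)
    {
        std::copy(other.arr, other.arr + other.len, arr);
    }

    // Move constructor: take over the buffer and leave the source empty.
    MyArr(MyArr&& other) noexcept
        : arr(other.arr), len(other.len), capacity(other.capacity)
    {
        other.arr = nullptr;
        other.len = other.capacity = 0;
    }

    // Copy-and-swap: the by-value parameter handles copy and move assignment.
    MyArr& operator=(MyArr other) noexcept
    {
        std::swap(arr, other.arr);
        std::swap(len, other.len);
        std::swap(capacity, other.capacity);
        return *this;
    }

    ~MyArr() { delete[] arr; }

    // Append one element, doubling the capacity when the buffer is full.
    // Taking the value by copy keeps it valid even if it came from this array.
    void push_back(T value)
    {
        if (len == capacity) {
            const std::size_t new_capacity = capacity ? capacity * 2 : 4;
            T* bigger = new T[new_capacity];
            std::copy(arr, arr + len, bigger);
            delete[] arr;
            arr = bigger;
            capacity = new_capacity;
        }
        arr[len++] = std::move(value);
    }

    std::size_t size() const { return len; }

    T* begin() { return arr; }
    T* end() { return arr + len; }
    const T* cbegin() const { return arr; }
    const T* cend() const { return arr + len; }
};
```

Because begin() and end() return plain pointers, standard algorithms such as std::sort and std::unique can be applied to a MyArr<char> directly.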

You could store the sentence in dynamic memory, growing it for each character (it sounds absurd, but it is the only way to avoid worrying about the size of the input).
That is, you keep reading input for as long as the user provides it, and use malloc() and realloc() to enlarge the char buffer for every new character.
(std::vector works on a similar principle, although it grows its capacity geometrically rather than by one element at a time.)
Code snippet for the same:
#include <iostream>
#include<cstdlib>
#include<cstring>
using namespace std;
int main() {
char temp;
char *sentence = (char*) malloc(2*sizeof(char));
int counter = 0;
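while( cin>>temp ){ // operator>> skips whitespace, so spaces are dropped
sentence[counter++] = temp;
// grow the buffer: room for the next character plus the terminator
sentence = (char*) realloc(sentence, (counter+2)*sizeof(char));
}
sentence[counter] = '\0';
cout<<"The sentence is"<<endl<<strlen(sentence)<<endl<<sentence;
free(sentence); // release the buffer obtained from malloc/realloc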
}
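A different route, which is not taken by the answers above, avoids storing input 1 at all: after lower-casing, sorting and removing duplicates, the only thing that matters is which letters occur, and that fits into the 26 low bits of a single integer. The sketch below is my own suggestion under some assumptions: the names add_letter, rank_of and relative_position are hypothetical, the letters 'a' to 'z' are assumed to have consecutive codes (true for ASCII), and every character looked up is assumed to occur in input 1.

```cpp
#include <cctype>
#include <cstdint>

// Hypothetical helper: records a letter in a 26-bit set (bit 0 = 'a').
// Anything that is not a letter a-z, such as a space, is ignored.
std::uint32_t add_letter(std::uint32_t seen, char ch)
{
    const int lower = std::tolower(static_cast<unsigned char>(ch));
    if (lower >= 'a' && lower <= 'z')
        seen |= std::uint32_t{1} << (lower - 'a');
    return seen;
}

// Number of letters in the set that sort before bit position `bit`.
int rank_of(std::uint32_t seen, int bit)
{
    int rank = 0;
    for (int b = 0; b < bit; ++b)
        if (seen & (std::uint32_t{1} << b)) ++rank;
    return rank;
}

// Position of `ch` relative to the middle distinct letter (which is 0).
// Assumes `ch` is a lowercase letter that occurs in the set.
int relative_position(std::uint32_t seen, char ch)
{
    const int distinct = rank_of(seen, 26);  // total number of distinct letters
    return rank_of(seen, ch - 'a') - distinct / 2;
}
```

Each character of input 1 can be read with cin.get and passed to add_letter immediately, so the sentence never needs to be kept in memory. The rule that the output is a single 0 when input 2 contains the reference letter is left to the caller.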

C++ toupper Syntax

I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.
for (int i = 0; i < string.length(); i++)
{
if (isalpha(string[i]))
{
if (islower(string[i]))
{
string[i] = toupper(string[i]);
}
}
}
Why do you have to list string[i] twice? Shouldn't this work?
toupper(string[i]); (I tried it, so I know it doesn't.)

toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:
// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
...
In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.
To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
char& c = string[i]; // reference to character inside string
c = std::toupper(c);
}
Using range-based for, the above can be written more briefly (and executed more efficiently) as:
for (auto& c: string)
c = std::toupper(c);

As the documentation says, the character is passed by value.
Because of that, the answer is no, it shouldn't work.
The prototype of toupper is:
int toupper( int ch );
As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it is simply lost.
That's why your example assigns the result back to string[i], replacing the original character.

As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.
What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.
#include <cctype>
#include <iostream>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
for (auto s = text; *s != '\0'; ++s)
*s = std::toupper(*s);
std::cout << text << '\n';
}
You could further eliminate the raw loop by using an algorithm, if you find this prettier.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
int
main()
{
char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
std::transform(std::cbegin(text), std::cend(text), std::begin(text),
[](auto c){ return std::toupper(c); });
std::cout << text << '\n';
}

toupper takes an int by value and returns the uppercase version of that character, again as an int. Whenever a function does not take a pointer or a reference as a parameter, the argument is passed by value: the parameter is a copy of the variable passed in, so changes made inside the function are not visible from outside. The way to get the result is to save what the function returns, which in this case is the upper-cased character.

Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.
So what, you think.
Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).
In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.
The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):
int x = toupper ('é');
So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.
See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."
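The usual way to sidestep this gotcha is to convert the character to unsigned char before passing it to any of the <cctype> functions. A minimal sketch of such a wrapper follows; the names to_upper_safe and upper_in_place are hypothetical and not part of any library.

```cpp
#include <cctype>
#include <string>

// Hypothetical wrapper: converts to unsigned char before calling
// std::toupper, so a negative char value is never passed.
char to_upper_safe(char c)
{
    return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
}

// Upper-cases every character of a string in place.
void upper_in_place(std::string& text)
{
    for (char& c : text)
        c = to_upper_safe(c);
}
```

With the cast in place, characters above 127 are simply passed through unchanged in the default "C" locale instead of triggering undefined behavior.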

How to determine the end of an integer array when manipulating with integer pointer?

Here is the code:
int myInt[] ={ 1, 2, 3, 4, 5 };
int *myIntPtr = &myInt[0];
while( *myIntPtr != NULL )
{
cout<<*myIntPtr<<endl;
myIntPtr++;
}
Output: 12345....<junks>..........
For Character array: (Since we have a NULL character at the end, no problem while iterating)
char myChar[] ={ 'A', 'B', 'C', 'D', 'E', '\0' };
char *myCharPtr = &myChar[0];
while( *myCharPtr != NULL )
{
cout<<*myCharPtr<<endl;
myCharPtr++;
}
Output: ABCDE
My question is: since we add a NULL character at the end of strings, we rule out such issues.
If it were also the rule to add 0 at the end of an integer array, we could have avoided this problem. What do you say?

The C-string convention is that a char* ends with a '\0' character. For arrays or any other C++ container there are other idioms that can be applied. My preferences follow.
The best way to iterate over sequences is the range-based for loop introduced in C++11 (formerly C++0x):
int my_array[] = {1, 2, 3, 4, 5};
for(int& x : my_array)
{
cout<<x<<endl;
}
If your compiler doesn't provide this yet, use iterators:
for(int* it = std::begin(my_array); it!=std::end(my_array); ++it)
{
cout<<*it<<endl;
}
And if you can use neither std::begin nor std::end:
for(int* it = &my_array[0]; it!=my_array + sizeof(my_array)/sizeof(my_array[0]); ++it)
{
cout<<*it<<endl;
}
P.S Boost.Foreach emulates the Range-based for-loop on C++98 compilers

In C++ the best solution is to use a std::vector, not an array. vectors carry their size around with them. The problem with using zero (or any other value) as an end marker is that of course it can't appear elsewhere in the array. This is not so much of an issue for strings, as we rarely want to print the character with code zero, but it is an issue when using arrays of ints.

What about using sizeof? http://www.cppreference.com/wiki/keywords/sizeof

You can certainly decide on your own "sentinel" value to store at the end of your array of integers. If your integers are always expected to be nonnegative, for example, you can use -1 as the sentinel value that marks the end of the array.
int myInt[] ={ 1, 2, 3, 4, 5, -1 };
int *myIntPtr = &myInt[0];
while( *myIntPtr >= 0 )
{
cout<<*myIntPtr<<endl;
myIntPtr++;
}

The char value 0 has special meaning, standardized by convention and practice. The int value 0 does not, so this can't be a general rule. If it works in your specific case, you can go with it. However, in general it is better to just keep track of the length of integer arrays separately, since this works universally. Or use std::vector or a similar container which handles that job for you.

Both the ASCII and Unicode standards define the character with value 0 as the null character (NUL), not as an end-of-array or end-of-string marker. It is only a C/C++ convention that strings are terminated with this character. Pascal uses a different notation. Also, the null character does not necessarily indicate the end of the array that contains the string. There are several Win32 API functions that use double-null-terminated strings (the open file dialog for one), like this:
"one\0two\0three\0" // there's an implicit '\0' appended in C/C++
This is valid C/C++ code; the null character does not mean the end of the array.
To adapt this idea of a null value to integer arrays means you have to sacrifice one of your integer values. If your data consists of a subset of the set of integers then this isn't a problem, but if your data can consist of any integer value, then there is no way to determine whether a given integer is the end-of-array marker or a valid value. In this latter case, you need additional information about the number of elements in the array, tracked either manually or automatically via a std::vector.
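To illustrate how such a double-null-terminated list is walked, here is a small sketch. The function name count_strings is hypothetical; real Win32 code would of course do something more useful with each entry than count it.

```cpp
#include <cstddef>
#include <cstring>

// Counts the strings in a double-null-terminated list such as
// "one\0two\0three\0" (the literal's implicit '\0' is the second terminator).
std::size_t count_strings(const char* list)
{
    std::size_t count = 0;
    // An empty string marks the end of the list.
    while (*list != '\0')
    {
        ++count;
        list += std::strlen(list) + 1;  // skip this string and its '\0'
    }
    return count;
}
```

Note that strlen applied to the whole literal stops at the first '\0' and reports 3, while sizeof reports all 15 bytes of the array, which shows that the end of a string and the end of its array are different things.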

Firstly, we don't "add NULL character" at the end of the string. There's no such thing as "NULL character". We add zero character, which is sometimes called "NUL character". But NULL has absolutely nothing to do with it. NULL is normally used in pointer context and not in character or integer context. Your comparisons like *myCharPtr != NULL or *myIntPtr != NULL will compile (due to the way NULL is defined in C++), but make virtually no sense. If you are looking for a zero character in an array, you can check for it as *myCharPtr != '\0' or as *myCharPtr != 0 or as simply *myCharPtr, but never as *myCharPtr != NULL.
Secondly, the zero character is called zero character for a reason: it is equal to integer zero. Character type in C++ is just a plain integer type after all. The only reason we can use zero character as something special in string context is because its meaning is reserved for that specific purpose. In the general case in integer context reserving zero for that purpose is plainly impossible for obvious reasons: zero is as useful as any other integer value. Yet, if in your specific application integer zero can be used as a reserved value, feel free to use it that way. Or you can use any other integer value for that purpose. But in the general case, referring to the question you ask in the title, there's no way to determine the end of an array. It is your responsibility to know where the end is (by knowing the total number of elements or by marking the end with a reserved value of your choice or in some other way). There's no way to determine the end of an array even with strings, because all you can hope for is to find the end of the string, which is not necessarily the end of the array that stores that string.
If you explicitly added a zero to the end of your integer array, your first cycle would happily stop at it. For some reason you explicitly added \0 at the end of your character array (and the second cycle stops), but you didn't add a zero at the end of your integer array (and the first cycle doesn't stop). You are wondering why your first cycle didn't stop at zero? Because you didn't put that zero in there. It is that simple.

Use std::vector, like Neil says.
Or do it the iterator way:
int myInt[] ={ 100, 200, 300, 400, 500 };
int *myIntPtr = &myInt[0];
int *myIntPtr_end = myIntPtr + 5;
while(myIntPtr != myIntPtr_end)
{
cout<<*myIntPtr<<endl;
++myIntPtr;
}

for(size_t i=0; i < sizeof(myInt)/sizeof(myInt[0]); i++ )
{
cout<<*myIntPtr<<endl;
myIntPtr++;
}
If you're suggesting that the code which manipulates myIntPtr has no idea of the size of the chunk it points to, you either have to decide on a magic value in your int array, or restructure your code so that the element count, sizeof(myInt)/sizeof(myInt[0]), is also available.
Standard C library functions use the latter approach: whenever you need to pass a buffer area through a pointer, you have to pass them its size in the same call.
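A function following that convention might look like the sketch below. The name sum_ints is made up for illustration; the point is only that the element count travels with the pointer, so no value has to be reserved as an end marker.

```cpp
#include <cstddef>

// Hypothetical helper: the caller passes the element count with the pointer,
// because the pointer alone carries no size information.
long sum_ints(const int* data, std::size_t count)
{
    long total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}
```

At the call site, where the array type is still visible, the count can be computed as sizeof(myInt) / sizeof(myInt[0]) and passed along. Zeros inside the array are ordinary values here and do not end the loop early.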

The generic way of creating the end pointer for any array is as follows: First determine the number of elements in the array using sizeof(array)/sizeof(array[0]). Note that sizeof appears twice because it returns the size of an item in bytes. So for a static array, this is the size of the array divided by the size of an element in the array. Then the end-pointer to an array is array+number_of_elements. So this should work:
int myInt[]={1, 2, 3, 4, 5};
int myIntNumElements = sizeof(myInt) / sizeof(myInt[0]);
int *myIntEnd = myInt + myIntNumElements;
for (int *myIntPtr = myInt; myIntPtr != myIntEnd; myIntPtr++)
{
cout << *myIntPtr << endl;
}
And now for some caveats:
The end pointer points to a location just past the end of the array! So *myIntEnd yields junk (dereferencing it is undefined behavior), not the value of the last element in the array.
This is only good for regular, static arrays! For containers, use the begin and end member functions and iterators.
This approach will work with any version of C++. However, if you are using C++11 or later, it is advisable to use the std::begin and std::end functions in the for statement as follows:
for (int *myIntPtr = std::begin(myInt); myIntPtr != std::end(myInt); myIntPtr++)
This method is intended to be considered in addition to the other answers. Which one is best is a matter of context.
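One weakness of the sizeof idiom is that it silently gives a wrong answer when it is applied to a pointer instead of an array. A function template that takes the array by reference lets the compiler deduce the element count and refuses to compile for pointers. The names element_count and sum_array below are hypothetical; this is a sketch of the technique, not code from any of the answers.

```cpp
#include <cstddef>
#include <iterator>

// The array is passed by reference, so the element count N is deduced
// by the compiler. Passing a plain pointer does not compile.
template <class T, std::size_t N>
std::size_t element_count(const T (&)[N])
{
    return N;
}

// Sums an int array by walking from std::begin to std::end.
template <std::size_t N>
int sum_array(const int (&values)[N])
{
    int total = 0;
    for (const int* p = std::begin(values); p != std::end(values); ++p)
        total += *p;
    return total;
}
```

Since C++17 the standard library offers std::size in <iterator>, which does the same job as element_count.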

Using sizeof() can solve the problem:
int arr[] = {10, 20};
int *p = arr;
int loop = sizeof(arr);
while (loop) {
cout<<*p++<<endl;
loop-=sizeof(int);
}

Assigning char value in one array to char value in another array

Sounds easy, but I've got a bug and I'm not sure what's causing it?
nopunccount = 0;
char *ra = new char[sizeof(npa)];
while (nopunccount <= strlen(npa)) {
ra[nopunccount] = npa[strlen(npa) - nopunccount];
nopunccount++;
}
ra never gets a value into it and I have verified that npa has char values to provide within the nopunccount range.
Any help is appreciated // :)

nopunccount starts as 0, so in the first iteration of the loop the character assigned to ra[0] is npa[strlen(npa)]. This is the terminating '\0' of that string. So the resulting string in ra starts with a '\0' and is therefore considered to end at that first byte by the usual string functions.

What does the declaration of npa look like? If it is a pointer, sizeof(npa) will be the size of a pointer, rather than the allocated size. If these are zero-terminated strings (also known as "C strings"), then use strlen, not sizeof. If these aren't strings, you need to track how much you allocated in a separate variable.
I have some other critiques of this code, possibly unrelated to your problem.
while (nopunccount <= strlen(npa)) {
strlen is an O(n) operation. This code will traverse the string npa in every loop iteration. It's best to only compute the length once.
ra[nopunccount] = npa[strlen(npa) - nopunccount];
Same problem here.
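Putting the points from both answers together, and assuming the goal is to store the characters of npa in reverse order in ra, a corrected version could look like the following sketch. The function name reversed_copy is hypothetical, and npa is assumed to be a zero-terminated string.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns a newly allocated reversed copy of a C string.
// The caller owns the result and must release it with delete[].
char* reversed_copy(const char* npa)
{
    const std::size_t len = std::strlen(npa);  // computed once, not per iteration
    char* ra = new char[len + 1];              // +1 for the terminating '\0'
    for (std::size_t i = 0; i < len; ++i)
    {
        ra[i] = npa[len - 1 - i];  // start at the last character, not the '\0'
    }
    ra[len] = '\0';
    return ra;
}
```

The buffer is sized with strlen rather than sizeof, the length is computed only once, and the terminator is written at the end of ra instead of at its start.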
