Input C-style string and get the length

The input format is:
str1 str2
I don't know the number of characters in advance, so I need to store both strings and then get their lengths.
I tried C-style strings with the scanf library function but could not get the lengths. This is what I have:
// M W are arrays of char with size 25000
while (T--)
{
memset(M,'0',25000);memset(W,'0',25000);
scanf("%s",M);
scanf("%s",W);
i = 0;m = 0;w = 0;
while (M[i] != '0')
{
++m; ++i; // incrementing till array reaches '0'
}
i = 0;
while (W[i] != '0')
{
++w; ++i;
}
cout << m << w;
}
This is not efficient, mainly because of the memset calls.
Note:
I would prefer std::string, but because of the 25000-character input and the memory constraints of cin I switched to this. If there is an efficient way to read a string, that would be good.

Aside from the answers already given, I think your code is slightly wrong:
memset(M,'0',25000);memset(W,'0',25000);
Do you really mean to fill the string with the character zero (value 48 or 0x30, assuming ASCII, before some pedant downvotes my answer and points out that there are other encodings), or with a NUL (the character with value zero)? The latter is 0 (or '\0'), not '0'.
scanf("%s",M);
scanf("%s",W);
i = 0;m = 0;w = 0;
while (M[i] != '0')
{
++m; ++i; // incrementing till array reaches '0'
}
If you are looking for the end of the string, you should be comparing against 0, not '0' (as per above).
Of course, scanf will put a 0 at the end of the string for you, so there's no need to fill the whole array with 0 (or '0').
And strlen is an existing function that gives the length of a C-style string. It will most likely use a cleverer algorithm than checking each character while incrementing two variables, which makes it faster, for long strings at least.

You do not need memset when using scanf; scanf adds the terminating '\0' to the string.
Also, strlen is a simpler way to determine a string's length:
scanf("%s %s", M, W); // provided that M and W contain enough space to store the string
m = strlen(M); // don't forget #include <string.h>
w = strlen(W);

A hand-written C-style strlen that needs no memset may look like this:
#include <iostream>
using namespace std;
// Named myStrlen so that it cannot clash with the standard strlen.
unsigned myStrlen(const char *str) {
const char *p = str;
unsigned len = 0;
while (*p != '\0') {
len++;
++p; // advance to the next character
}
return len;
}
int main() {
cout << myStrlen("C-style string");
return 0;
}
It prints 14.

Related

How to replace a char in a string with another char fast (I think the test didn't want the common way)

I was asked this question in a tech test.
They asked how to change ' ' to '_' in a string.
I think they didn't want the common answer, like this (I am fairly sure of that):
void replaceChar(char originalStr[], size_t strLength, char originalChar, char newChar)
{
for(size_t i = 0 ; i < strLength ; i++)
{
if(originalStr[i] == originalChar)
{
originalStr[i] = newChar ;
}
}
}
So I answered like this: use a WORD. (Actually I didn't write code; they just wanted an explanation of how to do it.)
My idea is to compare each 8 bytes (on a 64-bit OS) of the string with an 8-byte mask.
If they are equal, replace 8 bytes at a time.
When the CPU reads data smaller than a WORD, it has to clear the remaining bits.
That is slow, so I tried to use a WORD when comparing chars.
void replaceChar(char originalStr[], size_t strLength, char originalChar, char newChar)
{
size_t mask = 0;
size_t replaced = 0;
for(size_t i = 0 ; i < sizeof(size_t) ; i++)
{
mask |= originalChar << i;
replaced |= newChar << i;
}
for(size_t i = 0 ; i < strLength ; i++)
{
// if 8 byte data equal with 8 byte data filled with originalChar
// replace 8 byte data with 8 byte data filled with newChar
if(i % sizeof(size_t) == 0 &&
strLength - i > sizeof(size_t) &&
*(size_t*)(originalStr + i) == mask)
{
*(size_t*)(originalStr + i) = replaced;
i += sizeof(size_t);
continue;
}
if(originalStr[i] == originalChar)
{
originalStr[i] = newChar ;
}
}
}
Is there any faster way?

Do not try to optimize code when you do not know where its bottleneck is. Try to write clear, readable code.
This function declaration and definition
void replaceChar(char originalStr[], size_t strLength, char originalChar, char newChar)
{
for(size_t i = 0 ; i < strLength ; i++)
{
if(originalStr[i] == originalChar)
{
originalStr[i] = newChar ;
}
}
}
does not make sense because it duplicates the behavior of the standard algorithm std::replace.
Moreover, for such a simple, general-purpose function the identifier names are too long.
If you need to write a similar function specifically for C strings, it can look, for example, like the one in the demonstrative program below
#include <iostream>
#include <cstring>
char * replaceChar( char s[], char from, char to )
{
for ( char *p = s; ( p = strchr( p, from ) ) != nullptr; ++p )
{
*p = to;
}
return s;
}
int main()
{
char s[] = "Hello C strings!";
std::cout << replaceChar( s, ' ', '_' ) << '\n';
return 0;
}
The program output is
Hello_C_strings!
As for your second function, it is unreadable. Using the continue statement in the body of a for loop makes its logic difficult to follow.
Since a character array is not necessarily aligned to the size of size_t, the function is not as fast as you think.
If you need a very optimized function, you should write it directly in assembler.

The first thing in the road to being fast is being correct. The problem with the original proposal is that sizeof(s) should be a cached value of strlen(s). Then the obvious problem is that this approach scans the string twice -- first to find the terminating character and then the character to be replaced.
This should be addressed by a data structure with a known length, or by a data structure with enough guaranteed excess data that multiple bytes can be processed at once without undefined behaviour.
Once this is solved (the OP has been edited to fix this), the problem with the proposed approach of scanning 8 bytes of data for ALL the bytes being the same is that the generic case does not have 8 successive matching characters, but maybe only 7. In all those cases one would need to scan the same area twice (on top of scanning for the string terminator).
If the string length is not known, the best thing is to use a low level method:
while (*ptr != 0) {
if (*ptr == search_char) {
*ptr = replace_char;
}
++ptr;
}
If the string length is known, it's best to use the library algorithm std::replace, or its low-level counterpart
for (auto i = 0; i < size; ++i) {
if (str[i] == search_char) {
str[i] = replace_char;
}
}
Any decent compiler is able to autovectorize this, although it might generate a larger variety of kernels than intended (one for small sizes, one for intermediate sizes, and one that processes chunks of 32 or 64 bytes).

Parsing string to get comma-separated integer character pairs

I'm working on a project where I'm given a file that begins with a header in this format: a1,b3,t11, 2,,5,\3,*4,344,00,. It is always going to be a sequence of pairs, each a single ASCII character followed by an integer, with the pairs separated by commas and the sequence always ending with 00,.
Basically what I have to do is go through this and put each character/integer pair into a data type I have that takes both of these as parameters, and make a vector of these. For example, the header I gave above would be a vector with ('a',1), ('b',3), ('t',11), (' ',2), (',',5), ('\',3), ('*',4), ('3',44) as elements.
I'm just having trouble parsing it. So far I've:
Extracted the header from my text file from the first character up until before the ',00,' where the header ends. I can get the header string in string format or as a vector of characters (whichever is easier to parse)
Tried using sscanf to parse the next character and the next int then adding those into my vector before using substrings to remove the part of the string I've already analyzed (this was messy and did not get me the right result)
Tried going through the string as a vector and checking each element to see if it is an integer, a character, or a comma and acting accordingly but this doesn't work for multiple-digit integers or when the character itself is an int
I know I can fairly easily split my string based on the commas but I'm not sure how to do this and still split the integers from the characters while retaining both and accounting for integers that I need to treat as characters.
Any advice or useful standard library or string functions would be greatly appreciated.

One possibility, of many, would be to store the data in a structure. This uses an array of structures but the structure could be allocated as needed with malloc and realloc.
Parsing the string can be accomplished using pointers and strtol which will parse the integer and give a pointer to the character following the integer. That pointer can be advanced to use in the next iteration to get the ASCII character and integer.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define SIZE 100
struct pair {
char ascii;
int integer;
};
int main( void) {
char input[] = "a1,b3,!0,t11, 2,,5,\\3,*4,34400,";
char *pt = input;//start with pt pointing to first character of input
char *end = input;
int each = 0;
int loop = 0;
int length = 0;
struct pair pairs[SIZE] = { { '\0', 0}};
//assuming input will always end in 00, ( or ,00,)
//remove those three ( or 4 ??) characters
length = strlen ( input);
if ( length > 3) {
input[length - 3] = '\0';
}
for ( each = 0; each < SIZE; each++) {
//get the ASCII character and advance one character
pairs[each].ascii = *pt;
pt++;
//get the integer
pairs[each].integer = strtol ( pt, &end, 10);
//end==pt indicates the expected integer is missing
if ( end == pt) {
printf ( "expected an integer\n");
break;
}
//at the end of the string?
if ( *end == '\0') {
//if there are elements remaining, add one to each as one more was used
if ( each < SIZE - 1) {
each++;
}
break;
}
//the character following the integer should be a comma
if ( *end != ',') {
//if there are elements remaining, add one to each as one more was used
if ( each < SIZE - 1) {
each++;
}
printf ( "format problem\n");
break;
}
//for the next iteration, advance pt by one character past end
pt = end + 1;
}
//loop through and print the used structures
for ( loop = 0; loop < each; loop++) {
printf ( "ascii[%d] = %c ", loop, pairs[loop].ascii);
printf ( "integer[%d] = %d\n", loop, pairs[loop].integer);
}
return 0;
}
Another option is to use dynamic allocation.
This also uses sscanf to parse the input. The %n will capture the number of characters processed by the scan. The offset and add variables can then be used to iterate through the input. The last scan will only capture the ascii character and the integer and the return from sscanf will be 2.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct pair {
char ascii;
int integer;
};
int main( void) {
char input[] = "a1,b3,!0,t11, 2,,5,\\3,*4,34400,";
char comma = '\0';
char ascii = '\0';
int integer = 0;
int result = 0;
int loop = 0;
int length = 0;
int used = 0;
int add = 0;
int offset = 0;
struct pair *pairs = NULL;//so realloc will work on first call
struct pair *temp = NULL;
//assuming input will always end in 00, ( or ,00,)
//remove those three ( or 4 ??) characters
length = strlen ( input);
if ( length > 3) {
input[length - 3] = '\0';
}
while ( ( result = sscanf ( &input[offset], "%c%d%c%n"
, &ascii, &integer, &comma, &add)) >= 2) {//the last scan will only get two items
if ( ( temp = realloc ( pairs, ( used + 1) * sizeof ( *pairs))) == NULL) {
fprintf ( stderr, "problem allocating\n");
break;
}
pairs = temp;
pairs[used].ascii = ascii;
pairs[used].integer = integer;
//one more element was used
used++;
//the last scan gets only two items; %n was not reached, so add is stale
if ( result < 3) {
break;
}
//the character following the integer should be a comma
if ( comma != ',') {
printf ( "format problem\n");
break;
}
//for the next iteration, add to offset
offset += add;
}
for ( loop = 0; loop < used; loop++) {
printf ( "ascii[%d] = %c ", loop, pairs[loop].ascii);
printf ( "value[%d] = %d\n", loop, pairs[loop].integer);
}
free ( pairs);
return 0;
}

Since you have figured out that you can just ignore the last 3 characters, using sscanf will be sufficient.
You can use sscanf to read one character (or getch functions), use sscanf to read an integer and finally even ignore one character.
Comment if you are having problems understanding how to do so.

Comparing a char

So, I am trying to figure out the best/simplest way to do this. For my algorithms class we are supposed to read in a string (containing up to 40 characters) from a file and use the first character of the string (data[1]; we are starting the array at 1 and want to use data[0] for something else later) as the number of rotations (up to 26) to apply to the letters that follow (it's a Caesar cipher, basically).
An example of what we are trying to do is read in from a file something like : 2ABCD and output CDEF.
I've definitely made attempts, but I am just not sure how to compare the first letter in the array char[] to see which number, up to 26, it is. This is how I had it implemented (not the entire code, just the part that I'm having issues with):
int rotation = 0;
char data[41];
for(int i = 0; i < 41; i++)
{
data[i] = 0;
}
int j = 0;
while(!infile.eof())
{
infile >> data[j+1];
j++;
}
for(int i = 1; i < 27; i++)
{
if( i == data[1])
{
rotation = i;
cout << rotation;
}
}
My output is always 0 for rotation.
I'm sure the problem lies in the fact that I am trying to compare a char to a number and will probably have to convert to ascii? But I just wanted to ask and see if there was a better approach and get some pointers in the right direction, as I am pretty new to C++ syntax.
Thanks, as always.

Instead of formatted input, use unformatted input. Use
data[j+1] = infile.get();
instead of
infile >> data[j+1];
Also, the comparison of i to data[1] needs to be different.
for(int i = 1; i < 27; i++)
{
if( i == data[1]-'0')
// ^^^ need this to get the number 2 from the character '2'.
{
rotation = i;
std::cout << "Rotation: " << rotation << std::endl;
}
}

You can do this using modulo math, since characters can be treated as numbers.
Let's assume only uppercase letters (which makes the concept easier to understand).
Given:
static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const std::string original_text = "MY DOG EATS HOMEWORK";
std::string encrypted_text;
The loop:
for (unsigned int i = 0; i < original_text.size(); ++i)
{
Let's convert the character in the string to a number:
char c = original_text[i];
unsigned int cypher_index = c - 'A';
The cypher_index now contains the alphabetic offset of the letter, e.g. 'A' has index of 0.
Next, we rotate the cypher_index by adding an offset and using modulo arithmetic to "circle around":
cypher_index += (rotation_character - 'A'); // Add in the offset.
cypher_index = cypher_index % (sizeof(letters) - 1); // Wrap around; sizeof also counts the terminating '\0'.
Finally, the new, shifted, letter is created by looking up in the letters array and append to the encrypted string:
encrypted_text += letters[cypher_index];
} // End of for loop.
The modulo operation, using the % operator, is great for when a "wrap around" of indices is needed.
With some more arithmetic and arrays, the process can be expanded to handle all letters and also some symbols.

First of all, you have to convert the data char to the number it represents before comparing. A plain (int) cast only gives you the character code, so subtract '0' from the digit character instead.
Second, keep in mind that the ASCII table doesn't start with digits or letters. The digits '0' to '9' have the codes 48 to 57 and the uppercase letters start at 65, so data[1] holds a value higher than 26 and the comparison with i never succeeds.

The ASCII integer value of uppercase letters ranges from 65 to 90. In C and its descendants, you can just use 'A' through 'Z' in your for loop:
change
for(int i = 1; i < 27; i++)
to
for(int i = 'A'; i <= 'Z'; i++)
and you'll be comparing uppercase values. The statement
cout << rotation;
will print the ASCII values read from infile.

How much of the standard library are you permitted to use? Something like this would likely work better:
#include <iostream>
#include <string>
#include <sstream>
int main()
{
int rotation = 0;
std::string data;
std::stringstream ss( "2ABCD" );
ss >> rotation;
ss >> data;
for ( int i = 0; i < data.length(); i++ ) {
data[i] += rotation;
}
// C++11
// for ( auto& c : data ) {
// c += rotation;
// }
std::cout << data;
}
I used a stringstream instead of a file stream for this example, so just replace ss with your infile. Also note that I didn't handle the wrap-around case (i.e., Z += 1 isn't going to give you A; you'll need to do some extra handling here), because I wanted to leave that to you :)
The reason your rotation is always 0 is that i is never == data[1]. ASCII digit characters do not have the same underlying numeric value as the integers they represent. For example, if data[1] is '5', its integer value is actually 53. Hint: you'll need to know these values when you handle the wrap-around case. Do a quick search for "ASCII table" and you'll see all the different values.
Your determination of the rotation is also flawed in that you're only checking data[1]. What happens if you have a two-digit number, like 10?

C++ Vowels in string, comparison forbidden

I'm trying to count the total number of vowels in a string. I'm using strlen to get the total length of the string but then when I try and count through the string by each letter it says C++ forbids comparison. So I assume something is wrong in my if statement.
#include <iostream>
#include <cstring>
using namespace std;
int main() {
char sentence[] = "";
int count;
int total;
int length;
int lengthcount;
int output;
output = 0;
length = 0;
count = 0;
total = 0;
cin >> total;
while (total != count){
cin >> sentence;
length = strlen(sentence);
while (length != lengthcount)
if (sentence[length] == "a" ||sentence[length] == "e"||sentence[length] == "i"||sentence[length] == "o"||sentence[length] == "u"||sentence[length] == "y"){
++ output;
++ lengthcount;
else{
++lengthcount;
}
}
++count;
}
return 0;
}

sentence[length] is a single character. It should be compared to 'a' and not "a".
"a" is a character array, and direct comparison of a char with it using the built-in operator== is not supported.
sentence[index] == 'a'; // where index is probably lengthcount in your example
Should do the trick. If use of std::string is an option, you should favour that over char arrays.
In addition, your char sentence[] = ""; will need some more space than just the '\0' character. Some alternatives include the use of std::string and std::getline or char[nnn] with cin.get(...) to make sure that you don't overrun the buffer you allocate.

See Niall's answer for one of the main problems.
The algorithmic problem with your code is again in the if statement.
sentence[length] is the element just past the last character of your C string (in this case, the null character '\0' that terminates the string).
Your if statement should look more like:
if (sentence[lengthcount] == 'a'
|| sentence[lengthcount] == 'e'
|| sentence[lengthcount] == 'i'
|| sentence[lengthcount] == 'o'
|| sentence[lengthcount] == 'u'
|| sentence[lengthcount] == 'y')
{
// do something
}
}
Please remember to pre-allocate space for the string too, i.e.
char sentence[50];
which would give you space for 49 chars + terminator.
Alternatively, use a std::string

If you wish to count the total number of vowels in the given string, you need to use sentence[lengthcount]. Say the sentence is abc: strlen(sentence) returns 3, and since indexing in C++ begins at 0 and not 1, sentence[length] refers to the terminating '\0'. So throughout the loop you compare against '\0', which is meaningless. Also, don't forget to initialize lengthcount. The rest has already been mentioned.

char sentence [] = "" produces an array sentence with a length of 1.
cin >> sentence isn't going to work very well, is it, if sentence cannot hold more than one character and one character is already needed for the trailing nul byte?
lengthcount is an uninitialised variable, and the rest of the code just makes my head hurt.

intToStr recursively

This is a task from school: I am supposed to write a recursive function that converts a given int to a string. I know I'm close, but I can't pinpoint what is missing in my code. Hints are welcome.
void intToStr(unsigned int num, char s[])
{
if (num < 10)
{
s[0] = '0' + num;
}
else
{
intToStr(num/10, s);
s[strlen(s)] = '0' + num%10;
}
}
Edit: my problem is that the function only works for pre-initialized arrays; if I let the function work on an uninitialized array, it does not work.

Unless your array is zero-initialized, you are forgetting to append a null terminator when you modify it.
Just add it right after the last character:
void intToStr(unsigned int num, char s[])
{
if (num < 10)
{
s[0] = '0' + num;
s[1] = 0;
}
else
{
intToStr(num/10, s);
s[strlen(s)+1] = 0; //you have to do this operation here, before you overwrite the null terminator
s[strlen(s)] = '0' + num%10;
}
}
Also, your function assumes that s has enough space to hold all the digits, so you had better make sure it does (with a 32-bit unsigned int the largest value has 10 digits, so you need at least 11 characters).

Andrei Tita already showed you the problem you had with the null terminators. I will show you an alternative, so you can compare and contrast different approaches:
int intToStr(unsigned int num, char *s)
{
// We use this index to keep track of where, in the buffer, we
// need to output the current character. By default, we write
// at the first character.
int idx = 0;
// If the number we're printing is larger than 9 we recurse
// and use the returned index when we continue.
if(num > 9)
idx = intToStr(num / 10, s);
// Write our digit at the right position, and increment the
// position by one.
s[idx++] = '0' + (num %10);
// Write a terminating null character at the current position
// to ensure the string is always null-terminated.
s[idx] = 0;
// And return the current position in the string to whomever
// called us.
return idx;
}
You will notice that my alternative also returns the final length of the string that it output into the buffer.
Good luck with your coursework going forward!
