Counting the frequency of a word in a text [duplicate] - c++

This question already has answers here:
Counting the Frequency of Specific Words in Text File
(4 answers)
Closed 9 years ago.
I wrote a function for counting the frequency of a specific word in a text. The program returns zero every time. How can I improve it?
while (fgets(sentence, sizeof sentence, cfPtr))
{
for(j=0;j<total4;j++)
{
frequency[j] = comparision(sentence,&w);
all_frequency+=frequency[j];
}}
.
.
.
int comparision(const char sentence[ ],char *w)
{
int length=0,count=0,l=0,i;
length= strlen(sentence);
l= strlen(w);
while(sentence[i]!= '\n')
if(strncmp(sentence,w,l))
count++;
i++;
return count;
}

I have proofread your code and commented on the coding style and variable names. I left
one flaw in the conditional: it never iterates through the sentence, because strncmp
always compares from the start of sentence rather than from position i.
Here is your code marked up:
while(fgets(sentence, sizeof sentence, cfPtr)) {
for(j=0;j<total4;j++){
frequency[j] = comparision(sentence,&w);
all_frequency+=frequency[j];
}
}
// int comparision(const char sentence[ ],char *w) w is a poor variable name in this case.
int comparison(const char sentence[ ], char *word) //word is a better name.
{
//int length=0,count=0,l=0,i;
//Each variable should get its own line.
//Also, i should be initialized and l is redundant.
//Here are properly initialized variables:
int length = 0;
int count = 0;
int i = 0;
//length= strlen(sentence); This is redundant, as you know that the line ends at '\n'
length = strlen(word); //l is replaced with length.
//while(sentence[i]!= '\n')
//The increment and the if statement should be inside a block
//(the formal name for code enclosed in curly braces).
while(sentence[i] != '\n'){
if(strncmp(sentence, word, length) == 0) //strncmp returns 0 if equal, so you
count++; //should compare to 0 for equality
i++;
}
return count;
}
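The remaining flaw mentioned above (the comparison never moves through the sentence) can be addressed by comparing at each offset. This is a minimal sketch, not the original poster's code: the name countOccurrences is hypothetical, and it assumes that substring matches are acceptable, so "the" is also counted inside "other".

```cpp
#include <cstddef>
#include <cstring>

// Counts how many times `word` occurs in `sentence`, checking every
// starting position. Overlapping matches are counted separately.
// Hypothetical helper; stops at '\n' or at the end of the string.
int countOccurrences(const char *sentence, const char *word)
{
    const std::size_t length = std::strlen(word);
    if (length == 0)
        return 0;                       // an empty word matches nothing here
    int count = 0;
    for (std::size_t i = 0; sentence[i] != '\0' && sentence[i] != '\n'; ++i) {
        // Compare starting at offset i, not at the start of the sentence.
        if (std::strncmp(sentence + i, word, length) == 0)
            ++count;
    }
    return count;
}
```

Calling countOccurrences(sentence, w) once per line read by fgets and adding up the results gives the total for the file.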

Related

How to parse out integers from a line with characters and integers

For a C/C++ assignment, I need to take an input line, starting with the character 's', followed by UP TO 3 separate integers. My issue is that, without vectors, I don't know how to account for an unknown number of integers (1-20).
For example, a test input would look like:
s 1 12 20
It was suggested to me to use cin.getline and take the whole line as a string, but how would I know where each integer would lie in a character array because of the possibility of single or double digits, let alone the number of integers in said string?
Construct a std::istringstream from the contents of the line, then keep using operator>> into an int, until it fail()s, stuffing each integer into a std::vector (after using the operator>> initially, once, to take care of the leading character).
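A minimal sketch of that suggestion follows. The function name parseLine is hypothetical, and it assumes the line must start with the character 's'; anything after the first token that is not an integer simply ends the loop.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Parses a line such as "s 1 12 20" and returns the integers after the 's'.
// Returns an empty vector if the line does not start with 's'.
std::vector<int> parseLine(const std::string &line)
{
    std::istringstream in(line);
    std::vector<int> values;

    char tag = '\0';
    if (!(in >> tag) || tag != 's')     // consume the leading character once
        return values;

    int n = 0;
    while (in >> n)                     // stops as soon as extraction fails
        values.push_back(n);
    return values;
}
```

Because operator>> skips whitespace and reads whole numbers, single- and double-digit values need no special handling.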
You can mimic vectors using dynamic memory allocation. Initially create an array of size 2, using int *a = new int[2];
When this array fills up, make a new array of double the size, copy the old array in the new one and reassign a to the new array. Keep doing this until you have met the requirement.
EDIT
So getting the numbers through the string stream, if the array fills up, you could do:
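#include <sstream>

// Doubles the capacity of the array. The pointer is taken by reference so
// that the caller's pointer is updated; the old block is freed.
int changeArr(int *&a, int size){
    int *b = new int[size*2];
    for(int i=0;i<size;i++){
        b[i] = a[i];
    }
    delete[] a;
    a = b;
    return size*2;
}
// Reads every integer from ss into a growing array. The array is handed
// back through 'out' (the caller must delete[] it); returns the count read.
int getNos(std::istringstream &ss, int *&out){
    int *a = new int[2];
    int cap = 2, i=0, number;
    while(ss >> number){
        if(i>=cap){
            cap = changeArr(a, cap);
        }
        a[i] = number;
        i++;
    }
    out = a;
    return i;
}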
I have skipped the part about the first character, but I guess you can handle that.
Without vectors, you have a couple of approaches. (1) read an entire line at a time and tokenize the line with strtok or strsep, or (2) use the standard features built into strtol to walk down the string separating values with the pointer and end-pointer parameters to the function.
Since you know the format, you can easily use either. Both 1 & 2 above do the same thing, you are just using the tools in strtol to both tokenize and convert to a number in a single step. Here is a short example for handling a string followed by an unknown number of digits on each line:
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <errno.h>
enum { BASE = 10, MAXC = 512 };
long xstrtol (char *p, char **ep, int base);
int main (void) {
char buf[MAXC] = "";
while (fgets (buf, MAXC, stdin)) { /* for each line of input */
char *p, *ep; /* declare pointers */
p = buf; /* reset values */
errno = 0;
printf ("\n%s\n", p); /* print the original full buffer */
/* locate 1st digit in string */
for (; *p && (*p < '0' || '9' < *p); p++) {}
if (!*p) { /* validate digit found */
fprintf (stderr, "warning: no digits in '%s'\n", buf);
continue;
}
/* separate integer values */
int idx = 0; /* index of the value within this line */
while (errno == 0)
{
long val;
/* parse/convert each number in line into long value */
val = xstrtol (p, &ep, BASE);
if (val < INT_MIN || INT_MAX < val) /* validate int value */
fprintf (stderr, "warning: value exceeds range of integer.\n");
else /* out-of-range values are skipped, but parsing still advances */
printf (" int[%2d]: %d\n", idx++, (int) val); /* output int */
/* skip delimiters/move pointer to next digit */
while (*ep && *ep != '-' && (*ep < '0' || *ep > '9')) ep++;
if (*ep)
p = ep;
else
break;
}
}
return 0;
}
/** a simple strtol implementation with error checking.
* any failed conversion will cause program exit. Adjust
* response to failed conversion as required.
*/
long xstrtol (char *p, char **ep, int base)
{
errno = 0;
long val = strtol (p, ep, base);
/* Check for various possible errors */
if ((errno == ERANGE && (val == LONG_MIN || val == LONG_MAX)) ||
(errno != 0 && val == 0)) {
perror ("strtol");
exit (EXIT_FAILURE);
}
if (*ep == p) {
fprintf (stderr, "No digits were found\n");
exit (EXIT_FAILURE);
}
return val;
}
(the xstrtol function just moves the normal error checking to a function to unclutter the main body of the code)
Example Input
$ cat dat/varyint.txt
some string 1, 2, 3
another 4 5
one more string 6 7 8 9
finally 10
Example Use/Output
$ ./bin/strtolex <dat/varyint.txt
some string 1, 2, 3
int[ 0]: 1
int[ 1]: 2
int[ 2]: 3
another 4 5
int[ 0]: 4
int[ 1]: 5
one more string 6 7 8 9
int[ 0]: 6
int[ 1]: 7
int[ 2]: 8
int[ 3]: 9
finally 10
int[ 0]: 10
You can provide a bit of tidying up, but this method can be used to parse an unknown number of values reliably. Look it over and let me know if you have any questions.
Since vectors aren't allowed, you'll need to find out how many numbers are in the line before you can make an array to hold them.
I won't just give you the entire code, since this is homework, but I'll show you what I would do to solve your problem.
If your lines will always look like this: "s number" or "s number number" or "s number number number", then you can easily find the number of numbers in the line by counting the spaces!
There will be one space in any string with one number (between the s and that number), and one more space for each number that follows the first.
So let's count the spaces!
int countSpaces(string s) {
int count = 0;
for (int i = 0; i < s.size(); i++) {
if (s[i] == ' ') {
count++;
}
}
return count;
}
Passing these strings:
string test1 = "s 123 4 99999";
string test2 = "s 1";
string test3 = "s 555 1337";
to the countSpaces function will give us:
3
1
2
And with that information, we can make an array with the correct size to hold each value!
EDIT
Now I realize that you're having trouble grabbing the numbers from the string.
What I would do is use the above method to find the number of numbers in the line. Then, I would use the std::string::find() function to determine where, and if, any spaces are in the string.
So let's say we had the line: s 123 45 678
countSpaces would tell us we have 3 numbers.
Then we make an array to hold our three numbers. I would also cut off the s part so you don't have to worry about it anymore. Note that you can use std::stoi to turn a string into a number!
Now we can loop while find(' ') doesn't return std::string::npos.
In our loop, I would take the substring from 0 to the first space, like so:
num = std::stoi( myLine.substr(0, myLine.find(' ')) );
Then you can cut off the part you just used, including the space after it:
myLine = myLine.substr( myLine.find(' ') + 1 );
This will grab a number off the front of your string, then chop that number off the string, and repeat the process while there is still a space in the string. When no space is left, whatever remains is the last number.
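Put together, the loop might look like the sketch below. The name splitNumbers is hypothetical; it assumes the leading "s " has already been cut off, that exactly one space separates the numbers, and that the caller supplies an array sized from the space count (std::stoi throws on an empty or non-numeric piece).

```cpp
#include <cstddef>
#include <string>

// Fills out[0..maxCount) with the integers found in `line` and returns how
// many were stored. `line` is taken by value because it is consumed.
std::size_t splitNumbers(std::string line, int out[], std::size_t maxCount)
{
    std::size_t count = 0;
    while (!line.empty() && count < maxCount) {
        const std::size_t space = line.find(' ');
        // substr(0, npos) yields the whole remaining string (the last number).
        out[count++] = std::stoi(line.substr(0, space));
        if (space == std::string::npos)
            break;                      // nothing left after the last number
        line = line.substr(space + 1);  // drop the number and its space
    }
    return count;
}
```

For the line "123 45 678" this stores 123, 45 and 678 and returns 3.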
EDIT:
If you aren't guaranteed to have one space between each number, then you can delete excess spaces before doing this method or you can do it during the countSpaces loop. At that point, it would make more sense to call the function countNums or such.
An example function to remove stretches of spaces and replace them with one space:
void removeExtraSpaces(string &s) { // by reference, so the caller's string changes
    bool inSpaces = false;
    for (size_t i = 0; i < s.size(); ) {
        if (s[i] == ' ' && inSpaces) {
            s.erase(i, 1); // erase only this character; erase(i) would cut to the end
        } else {
            inSpaces = (s[i] == ' ');
            i++;
        }
    }
}

Find the minimum number of moves to get a "Good" string

A string is called good if and only if all the distinct characters in the string are repeated the same number of times.
Given a string of length n, what is the minimum number of changes we have to make so that the string becomes good?
Note: We are only allowed to use lowercase English letters, and we can change any letter to any other letter.
Example: Let the string be yyxzzxxx.
Then the answer is 2.
Explanation: One possible solution is yyxyyxxx. We have changed 2 'z' to 2 'y'. Now both 'x' and 'y' are repeated 4 times.
My Approach:
Make a hash of the occurrences of all 26 lowercase letters.
Also find the number of distinct letters in the string.
Sort this hash array and check whether the length of the string is divisible by the number of distinct characters. If yes, then we have the answer.
Else reduce the number of distinct characters by 1.
But it gives wrong answers in some cases, as there may be cases where removing a character that does not occur the minimum number of times yields a good string in fewer moves.
So how should I approach this question? Please help.
Constraints: Length of string is up to 2000.
My Approach :
string s;
cin>>s;
int hash[26]={0};
int total=s.length();
for(int i=0;i<26;i++){
hash[s[i]-'a']++;
}
sort(hash,hash+total);
int ans=0;
for(int i=26;i>=1;i--){
int moves=0;
if(total%i==0){
int eachshouldhave=total/i;
int position=26;
for(int j=1;j<26;j++){
if(hash[j]>eachshouldhave && hash[j-1]<eachshouldhave){
position=j;
break;
}
}
int extrasymbols=0;
//THE ONES THAT ARE BELOW OBVIOUSLY NEED TO BE CHANGED TO SOME OTHER SYMBOL
for(int j=position;j<26;j++){
extrasymbols+=hash[j]-eachshouldhave;
}
//THE ONES ABOVE THIS POSITION NEED TO GET SOME SYMBOLS FROM OTHERS
for(int j=0;j<position;j++){
moves+=(eachshouldhave-hash[j]);
}
if(moves<ans)
ans=moves;
}
else
continue;
}
Following should fix your implementation:
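#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <string>

std::size_t compute_change_needed(const std::string& s)
{
    int count[26] = { 0 };
    for(char c : s) {
        // Assuming only valid char : a-z
        count[c - 'a']++;
    }
    std::sort(std::begin(count), std::end(count), std::greater<int>{});
    std::size_t ans = s.length();
    for(std::size_t i = 1; i != 27; ++i) {
        if(s.length() % i != 0) {
            continue;
        }
        const int expected_count = static_cast<int>(s.length() / i);
        std::size_t moves = 0;
        // Keep the i most frequent letters. Each change moves one character
        // from a surplus to a deficit, so only the deficits are summed;
        // summing |count - expected| would count a change between two kept
        // letters twice (e.g. "aaaabb" needs 1 change, not 2).
        for(std::size_t j = 0; j != i; j++) {
            if(count[j] < expected_count) {
                moves += expected_count - count[j];
            }
        }
        ans = std::min(ans, moves);
    }
    return ans;
}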

Can anyone please explain this output with a slight modification in the usual getchar_unlocked function()? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 8 years ago.
I wanted to change the general getchar_unlocked program to print the position of the space when the second number is entered. So I introduced two more variables, r and a, such that r is incremented every time a non-space character is input and a gets r's value when a space character is input. However, with the change I made, only the first number is getting stored in s and a garbage value is getting stored in a. Why?
#include <cstdio>
#define getcx getchar_unlocked
using namespace std;
inline void scan( int &n, int &a) //extra parameter a added
{
n=0; int r=0;
char ch;
ch=getcx();
while (ch<'0' || ch>'9') //Inclusion of a !=' ' condition
{
if (ch==' ')
a=r;
else
r++;
ch=getcx();
}
while (ch>='0' && ch<='9')
{
r++;
n=10*n + ch -'0';
ch=getcx();
}
}
int main()
{
int t,s,a;
scan(t,a);
printf("%d\n",t);
scan(s,a);
printf("%d",s);
printf("%d",a);
return 0;
}
The scan function skips non-digits, then reads 1 or more digits as an integer into n, and finally skips a single non-digit character. In the first loop to skip non-digits, it counts the number of non-spaces read and stores that number into a whenever it sees a space. If it never sees a space, a will remain unmodified.
So if you give your program an input like 10 20 (with just a single space), that space will be skipped by the first call to scan (the non-digit after the number), and a will never be initialized.
If you want your scan routine to not skip the character after the number it reads, you need to call ungetc to put it back after reading it and discovering that it is not a digit.
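A minimal sketch of the ungetc idea follows. It is not the original poster's routine: the name scanInt is hypothetical, it reads from a FILE* with std::getc instead of getchar_unlocked so that it works on any stream, and it counts the spaces skipped before the number rather than tracking r. The same pattern applies with getchar_unlocked and ungetc(ch, stdin).

```cpp
#include <cstdio>

// Reads the next unsigned integer from `in` into n and reports how many
// spaces were skipped before its first digit. The character that ends the
// number is pushed back, so the next call still sees it.
// Returns false if the input ends before any digit is found.
bool scanInt(std::FILE *in, int &n, int &spacesBefore)
{
    n = 0;
    spacesBefore = 0;                    // always initialized, even with no space
    int ch = std::getc(in);              // int, so EOF can be detected
    while (ch != EOF && (ch < '0' || ch > '9')) {
        if (ch == ' ')
            ++spacesBefore;
        ch = std::getc(in);
    }
    if (ch == EOF)
        return false;
    while (ch >= '0' && ch <= '9') {
        n = 10 * n + (ch - '0');
        ch = std::getc(in);
    }
    if (ch != EOF)
        std::ungetc(ch, in);             // put the delimiter back
    return true;
}
```

With the input 10 20, the first call reads 10 and pushes the space back, so the second call sees that space, reports one space skipped, and reads 20.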
In C++, the parameters (int &n, int &a) are references, so a=r; and n=10*n + ch -'0'; already assign to the caller's variables and need no dereferencing. C has no references, though: there the parameters would have to be pointers, (int *n, int *a), and the assignments would become *a=r; and *n=10*(*n) + ch -'0';. In the original thread I took (int &n, int &a) to be pseudo-code; in an actual C implementation I would use (int *n, int *a) and adjust the code accordingly.
Here is how I updated the original; it illustrates passing the result back through a pointer:
#include <stdio.h>
#define getcx getchar_unlocked
/* input parser reads - sign, and any digits that follow */
static inline void inp ( int *num ) /* static: plain inline in C may emit no definition to link against */
{
int n = 0;
int sign = 1;
char ch = getcx();
/* get the sign */
while ( ch < '0' || ch > '9' )
{ if (ch == '-') sign = -1; ch = getcx();}
/* add each char read. n is accumulator, (n << 3) + (n << 1) is just n * 10 */
while ( ch >= '0' && ch <= '9' )
n = (n << 3) + (n << 1) + ch - '0', ch = getcx(); /* = n*10 + ch - 48 */
n = n * sign;
*num = n;
}
int main (void) {
int number = 0;
printf ("enter something containing numbers\n");
inp (&number);
printf ("number: %d\n", number);
return 0;
}

intToStr recursively

This is a task from school, I am supposed to write a recursive function that will convert a given int to a string, I know I'm close but I can't point the missing thing in my code, hints are welcome.
void intToStr(unsigned int num, char s[])
{
if (num < 10)
{
s[0] = '0' + num;
}
else
{
intToStr(num/10, s);
s[strlen(s)] = '0' + num%10;
}
}
Edit: my problem is that the function only works for pre-initialized arrays; if I let the function work on an uninitialized array, it does not work.
Unless your array is zero-initialized, you are forgetting to append a null terminator when you modify it.
Just add it right after the last character:
void intToStr(unsigned int num, char s[])
{
if (num < 10)
{
s[0] = '0' + num;
s[1] = 0;
}
else
{
intToStr(num/10, s);
s[strlen(s)+1] = 0; //you have to do this operation here, before you overwrite the null terminator
s[strlen(s)] = '0' + num%10;
}
}
Also, your function assumes that s has enough space to hold all the digits, so you had better make sure it does (with a 32-bit unsigned int, the largest value, 4294967295, is 10 digits long, so you need at least 11 characters).
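One way to avoid hard-coding the buffer size is to derive it from std::numeric_limits. This is a sketch under assumptions: the constant kIntToStrBufSize is a hypothetical name, and the function is a slightly restructured variant of the one above that writes the terminator in both branches.

```cpp
#include <cstddef>
#include <cstring>
#include <limits>

// digits10 is the number of decimal digits that always fit in the type; the
// largest value can need one more, and the terminating '\0' needs another.
constexpr std::size_t kIntToStrBufSize =
    std::numeric_limits<unsigned int>::digits10 + 2;

// Writes the decimal form of num into s, which must hold at least
// kIntToStrBufSize characters. s does not need to be initialized.
void intToStr(unsigned int num, char s[])
{
    if (num >= 10)
        intToStr(num / 10, s);          // leading digits first
    else
        s[0] = '\0';                    // start from an empty string
    const std::size_t len = std::strlen(s);
    s[len] = static_cast<char>('0' + num % 10);
    s[len + 1] = '\0';                  // keep the string terminated
}
```

A caller would declare char buf[kIntToStrBufSize]; and pass it in without initializing it first.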
Andrei Tita already showed you the problem you had with the null terminator. I will show you an alternative, so you can compare and contrast different approaches:
int intToStr(unsigned int num, char *s)
{
// We use this index to keep track of where, in the buffer, we
// need to output the current character. By default, we write
// at the first character.
int idx = 0;
// If the number we're printing is larger than 9 we recurse
// and use the returned index when we continue.
if(num > 9)
idx = intToStr(num / 10, s);
// Write our digit at the right position, and increment the
// position by one.
s[idx++] = '0' + (num %10);
// Write a terminating NULL character at the current position
// to ensure the string is always NULL-terminated.
s[idx] = 0;
// And return the current position in the string to whomever
// called us.
return idx;
}
You will notice that my alternative also returns the final length of the string that it output into the buffer.
Good luck with your coursework going forward!

loop logic, encrypting array C++

I am trying to perform some operations on an array, with the final goal of doing a simple encryption. My array is 458 characters long and consists mostly of letters plus some commas, periods, etc. I am trying to start from the last character of the array, go to the first character, and uppercase all the letters in the array. It reads the last character "" correctly, but the next step in the for loop is about 4 characters over and skips a few letters. Is something wrong with my control logic?
void EncryptMessage (ofstream& outFile, char charArray[], int length)
{
int index;
char upperCased;
char current;
for (index = length-1; index <= length; --index)
{
if (charArray[index] >= 'A' && charArray[index] <= 'Z')
{
upperCased = static_cast<char>(charArray[index]);
current = upperCased;
outFile << current;
}
else
{
charArray[index]++;
current = charArray[index];
}
}
}
Change:
for (index = length-1; index <= length; --index)
to:
for (index = length-1; index >= 0; --index)
In the else leg of your if statement, you're setting the value of current, but never writing it out, so all that gets written out are what start as capital letters (and, as others have pointed out, your loop condition isn't correct).
If I were doing this, I'd structure it a bit differently. I'd write a small functor to encrypt a single letter:
struct encrypt {
char operator()(char input) {
if (isupper(input))
return input;
else
return input+1;
}
};
Then I'd put the input into an std::string, and operate on it using std::transform:
std::string msg("content of string goes here.");
std::transform(msg.rbegin(), msg.rend(),
std::ostream_iterator<char>(outFile, ""),
encrypt());