Counting the frequency of characters in a file. C++

The task is to print a frequency table of the Latin characters encountered in a given text file (without distinguishing between uppercase and lowercase letters) to file f1. The table must be sorted alphabetically.
So far my program only counts the letter A. I'm having problems creating the loops that go through the whole alphabet and print the table to another file. Could you help me with those?
#include <stdio.h>
const char FILE_NAME[] = "yo.txt";
#include <stdlib.h>
#include <iostream>
using namespace std;
int main() {
int count = 0; /* number of characters seen */
FILE *in_file; /* input file */
/* character or EOF flag from input */
int ch;
in_file = fopen(FILE_NAME, "r");
if (in_file == NULL) {
printf("Cannot open %s\n", FILE_NAME);
system("Pause");
exit(8);
}
while (1) {
char cMyCharacter = 'A';
int value = (int)cMyCharacter;
ch = fgetc(in_file);
if (ch == EOF){
break;
}
int file_character = (int) ch;
if (file_character == value || file_character == value+ 32) {
count++;
}
}
printf("Number of characters in %s is %d\n", FILE_NAME, count);
char cMyCharacter = 'A';
int iMyAsciiValue = (int)cMyCharacter;
cout << iMyAsciiValue;
system("Pause");
fclose(in_file);
return 1;
}

First, create an array of size 26 to hold the frequencies of a to z:
int freq[26] = {0};
freq[0] is for 'a', freq[1] is for 'b', etc.
Second, change
if (file_character == value || file_character == value+ 32)
to
if (file_character >= 'a' && file_character <= 'z')
so that it matches every lower-case letter (i.e. 'a' to 'z').
Third, compute the index and count with
freq[file_character - 'a']++;
Here file_character - 'a' calculates the index, and the ++ does the counting.
Fourth, print the freq array.
Fifth, add
else if (file_character >= 'A' && file_character <= 'Z')
for upper-case characters, and count them with freq[file_character - 'A']++ so that both cases share the same cells.
This is your homework, so you should try to work out the whole program yourself. I hope this answer provides enough hints.

Related

C++ How to output the letters or numbers from input of letters or numbers

So let's say we have the following case: for "12323465723", possible answers would be "abcbcdfegbc" (1 2 3 2 3 4 6 5 7 2 3), "awwdfegw" (1 23 23 4 6 5 7 23), "lcwdefgw" (12 3 23 4 6 5 7 23). The user inputs numbers from 1 to 26, not separated by spaces, and the program suggests 3 ways of interpreting them, using as many combinations from 1 to 26 as possible, these being the values from a to z.
As you can see this is edited, as this is the last part of the problem. Thank you to all who have helped me this far; I've managed to solve half of my problem, and only the part mentioned above is left.
SOLVED -> Thank you
This involves a decision between 0 to 2 outcomes at each step. The base cases are there are no more characters or none of them can be used. In the latter case, we backtrack to output the entire tree. We store the word in memory like dynamic programming. This naturally leads to a recursive algorithm.
#include <stdlib.h> /* EXIT */
#include <stdio.h> /* (f)printf */
#include <errno.h> /* errno */
#include <string.h> /* strlen */
static char word[2000];
static size_t count;
static void recurse(const char *const str) {
/* Base case when it hits the end of the string. */
if(*str == '\0') { printf("%.*s\n", (int)count, word); return; }
/* Bad input. */
if(*str < '0' || *str > '9') { errno = ERANGE; return; }
/* Zero is not a valid start; backtrack without output. */
if(*str == '0') return;
/* Recurse with one digit. */
word[count++] = *str - '0' + 'a' - 1;
recurse(str + 1);
count--;
/* Maybe recurse with two digits. */
if((*str != '1' && *str != '2')
|| (*str == '1' && (str[1] < '0' || str[1] > '9'))
|| (*str == '2' && (str[1] < '0' || str[1] > '6'))) return;
word[count++] = (str[0] - '0') * 10 + str[1] - '0' + 'a' - 1;
recurse(str + 2);
count--;
}
int main(int argc, char **argv) {
if(argc != 2)
return fprintf(stderr, "Usage: a.out <number>\n"), EXIT_FAILURE;
if(strlen(argv[1]) > sizeof word)
return fprintf(stderr, "Too long.\n"), EXIT_FAILURE;
recurse(argv[1]);
return errno ? (perror("numbers"), EXIT_FAILURE) : EXIT_SUCCESS;
}
When run on your original input, ./a.out 12323465723, it gives,
abcbcdfegbc
abcbcdfegw
abcwdfegbc
abcwdfegw
awbcdfegbc
awbcdfegw
awwdfegbc
awwdfegw
lcbcdfegbc
lcbcdfegw
lcwdfegbc
lcwdfegw
(I think you have made a transposition in lcwdefgw.)
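One way to sanity-check the twelve lines above is to count the decodings without printing them. The following is a rough sketch using a bottom-up table rather than the recursion in the answer; countDecodings is an illustrative name, and the count can overflow for very long inputs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of ways to split a digit string into values 1..26 (a..z).
// Returns 0 if the string contains a non-digit or cannot be decoded.
unsigned long long countDecodings(const std::string& digits) {
    const std::size_t n = digits.size();
    // ways[i] = number of decodings of the suffix starting at index i.
    std::vector<unsigned long long> ways(n + 2, 0);
    ways[n] = 1;  // the empty suffix has exactly one (empty) decoding
    for (std::size_t i = n; i-- > 0;) {
        const char c = digits[i];
        if (c < '0' || c > '9') return 0;  // bad input
        if (c == '0') continue;            // zero cannot start a value
        ways[i] = ways[i + 1];             // take one digit
        if (i + 1 < n) {
            // digits[i + 1] was already validated on the previous iteration
            const int two = (c - '0') * 10 + (digits[i + 1] - '0');
            if (two <= 26) ways[i] += ways[i + 2];  // take two digits
        }
    }
    return ways[0];
}
```

For the input 12323465723 this yields 12, which agrees with the number of lines printed by the recursive program.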
According to the ASCII table, the codes 65 to 90 are A to Z.
So below is some simple logic to achieve what you're trying to do.
#include <iostream>
using namespace std;
int main(){
int n;
cin>>n;
// only 1..26 map to a letter; 64 + n gives 'A'..'Z'
if (n>=1 && n<=26)
cout<<(char)(n+64);
else cout<<"Error";
return 0;
}
If you want to count the occurrences of "ab" then this will do it:
int main()
{
char line[150];
int grup = 0;
cout << "Enter a line of string: ";
cin.getline(line, 150);
for (int i = 0; line[i] != '\0'; ++i)
{
if (line[i] == 'a' && line[i+1] == 'b')
{
++grup;
}
}
cout << "Occurrences of ab: " << grup << endl;
return 0;
}
If you want to convert an int to the corresponding ASCII character, you can do that using this code:
// Output the letter for each number from 1 to 26
int nr;
while (true) {
cout << "\nEnter a number: ";
// stop on bad input or a number outside 1..26
if (!(cin >> nr) || nr < 1 || nr > 26)
break;
nr += 96; // + 96 because the ASCII values of lower-case letters start after 96
cout << (char) nr;
}
Here I use the C style of casting values to keep things simple.
Also bear in mind ASCII-values: ASCII Table
Hope this helps.
This could be an interesting problem, but you probably tagged it wrong: there's nothing specific to C++ here; it's more about the algorithm.
First of all, the "decode" method that you described, from numerical to alphabetical strings, is ambiguous. E.g., 135 could be interpreted as either "ace" or "me". Is this simply an oversight, or is it the intended question?
Suppose the ambiguity is just an oversight, and the user will enter numbers properly separated by, say, white space (e.g., either "1 3 5" or "13 5"). Let nstr be the numerical string and astr be the alphabetical string to count; then you would:
1. Set i=0, cnt=0.
2. Read the next integer k from nstr (like in this answer).
3. Decode k into character ch.
4. If ch == astr[i], increment i.
5. If i == astr.length(), set i=0 and increment cnt.
6. Repeat from 2 until reaching the end of nstr.
On the other hand, suppose the ambiguous decode is intended (the numerical string is supposed to have multiple ways to be decoded), further clarification is needed in order to write a solution. For example, how many k's are there in "1111"? Is it 1 or 2, given "1111" can be decoded either as aka or kk, or maybe even 3, if the counting of k doesn't care about how the entire "1111" is decoded?
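For the unambiguous case, where the numbers are separated by white space, the counting can be sketched as follows. This is only an illustration: it decodes the whole input first and then uses std::string::find to count non-overlapping occurrences, which is a swapped-in technique rather than the index-by-index loop listed above. The names decodeSeparated and countOccurrences are made up for this example, and ASCII is assumed.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Decode whitespace-separated numbers 1..26 into letters a..z.
// Returns an empty string if any token is out of range or not a number.
std::string decodeSeparated(const std::string& nstr) {
    std::istringstream in(nstr);
    std::string out;
    int k;
    while (in >> k) {
        if (k < 1 || k > 26) return "";
        out += static_cast<char>('a' + k - 1);
    }
    if (!in.eof()) return "";  // stopped on something that is not a number
    return out;
}

// Count non-overlapping occurrences of astr in the decoded text.
std::size_t countOccurrences(const std::string& nstr, const std::string& astr) {
    if (astr.empty()) return 0;
    const std::string text = decodeSeparated(nstr);
    std::size_t cnt = 0;
    for (std::size_t pos = text.find(astr); pos != std::string::npos;
         pos = text.find(astr, pos + astr.size())) {
        ++cnt;
    }
    return cnt;
}
```

With this sketch, "1 3 5" decodes to "ace" and "13 5" decodes to "me", which is exactly the distinction the separators are meant to preserve.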

Finding if two strings are anagrams in O(n) - solution using XOR

I'm working on a problem from hackerearth
The goal is to find if the input strings are anagrams in O(n) time.
Input format:
First line, contains an integer 'T' denoting no. of test cases.
Each test consists of a single line, containing two space separated
strings S1 and S2 of equal length.
My code:
#include <iostream>
#include <string>
int main()
{
int T;
std::cin >> T;
std::cin.ignore();
for(int i = 0; i < T; ++i)
{
std::string testString;
std::getline(std::cin, testString);
char test = ' ';
for (auto& token : testString)
{
if(token != ' ')
test ^= token;
}
if (test == ' ')
std::cout << "YES\n";
else
std::cout << "NO\n";
}
}
The code above fails 5/6 hackerearth tests.
Where is my mistake? Is this a good approach to the problem?
Note: Your question title says that the second word must be an anagram of the first. But the linked problem on hackerearth uses the term rearranged, which is more restrictive than an anagram, and also says:
Two strings S1 and S2 are said to be identical, if any of the permutation of string S1 is equal to the string S2
One algorithm is to maintain a histogram of the incoming chars.
This is done with two loops, one for the first word and another for the second word.
For the first word, proceed char-by-char and increment the histogram value. Calculate the length of the first word by maintaining a running count.
When the space is reached, do the other loop which decrements the histogram. Maintain a count of the number of histogram cells that reach zero. In the end, this must match the length of the first word (i.e. success).
In the second loop, if a histogram cell goes negative, this is a mismatch because either the second word has a char not in the first word or has too many of a char in the first word.
Caveat: I apologize for this being a C-like solution, but it can easily be adapted to use more STL components
Also, char-at-a-time input may be faster than reading in the entire line into a buffer string
Edit: I've added annotation/comments to the code example to try to make things more clear
#include <stdio.h>
#include <stdlib.h>
char buf[(200 * 1024) + 100];
void
dotest(FILE *xf)
{
int histo[26] = { 0 };
int len = 0;
int chr;
int match = 0;
int fail = 0;
int cnt;
// scan first word
while (1) {
chr = fgetc(xf);
// stop on delimiter between first and second words
if (chr == ' ')
break;
// convert char to histogram index
chr -= 'a';
// increment the histogram cell
cnt = ++histo[chr];
// calculate number of non-zero histogram cells
if (cnt == 1)
++len;
}
// scan second word
while (1) {
chr = fgetc(xf);
// stop on end-of-line or EOF
if (chr == '\n')
break;
if (chr == EOF)
break;
// convert char to histogram index
chr -= 'a';
// decrement the histogram cell
cnt = --histo[chr];
// if the cell reaches zero, we [seemingly] have a match (i.e. the
// number of instances of this char in the second word match the
// number of instances in the first word)
if (cnt == 0)
match += 1;
// however, if we go negative, the second word has too many instances
// of this char to match the first word
if (cnt < 0)
fail = 1;
}
do {
// too many letters in second word that are _not_ in the first word
if (fail)
break;
// the number of times the second word had an exact histogram count
// against the first word must match the number of chars in the first
// [and second] word (i.e. all scrambled chars in the second word had
// a place in the first word)
fail = (match != len);
} while (0);
if (fail)
printf("NO\n");
else
printf("YES\n");
}
// main -- main program
int
main(int argc,char **argv)
{
char *file;
FILE *xf;
--argc;
++argv;
file = *argv;
if (file != NULL)
xf = fopen(file,"r");
else
xf = stdin;
fgets(buf,sizeof(buf),xf);
int tstcnt = atoi(buf);
for (int tstno = 1; tstno <= tstcnt; ++tstno)
dotest(xf);
if (file != NULL)
fclose(xf);
return 0;
}
UPDATE:
I've only had a glance at the code, but it seems that len goes up for every char found (string length), and match goes up only when a unique char (histogram element) is exhausted, so the check match == len will not be good?
len is only incremented in the first loop. (i.e.) It is the length of the first word only (as mentioned in the algorithm description above).
In the first loop, there is a check for the char being a space [which is guaranteed by the problem definition of the input to delimit the end of the first word] and the loop is broken out of at that point [before len is incremented], so len is correct.
The use of len, match, and fail speed things up. Otherwise, at the end, we'd have to scan the entire histogram and ensure all elements are zero to determine success/failure (i.e. any non-zero element means mismatch/failure).
Note: When doing such timed coding challenges before, I've noted that they can be pretty strict on elapsed time/speed and space. It's best to try to optimize as much as possible because, even if the algorithm is technically correct, it can fail the test for using too much memory or taking too much time.
That's why I suggested not using a string buffer because the maximum size as defined by the problem can be 100,000 bytes. Also, doing the [unnecessary] scan of the histogram at the end would also add time.
UPDATE #2:
It may be faster to read a full line at a time and then use a char pointer to traverse the buffer. Here's a version that does that. Which method is faster would need to be tried/benchmarked to see.
#include <stdio.h>
#include <stdlib.h>
char buf[(200 * 1024) + 100];
void
dotest(FILE *xf)
{
char *cp;
int histo[26] = { 0 };
int len = 0;
int chr;
int match = 0;
int fail = 0;
int cnt;
cp = buf;
fgets(cp,sizeof(buf),xf);
// scan first word
for (chr = *cp++; chr != 0; chr = *cp++) {
// stop on delimiter between first and second words
if (chr == ' ')
break;
// convert char to histogram index
chr -= 'a';
// increment the histogram cell
cnt = ++histo[chr];
// calculate number of non-zero histogram cells
if (cnt == 1)
++len;
}
// scan second word
for (chr = *cp++; chr != 0; chr = *cp++) {
// stop on end-of-line
if (chr == '\n')
break;
// convert char to histogram index
chr -= 'a';
// decrement the histogram cell
cnt = --histo[chr];
// if the cell reaches zero, we [seemingly] have a match (i.e. the
// number of instances of this char in the second word match the
// number of instances in the first word)
if (cnt == 0)
match += 1;
// however, if we go negative, the second word has too many instances
// of this char to match the first word
if (cnt < 0) {
fail = 1;
break;
}
}
do {
// too many letters in second word that are _not_ in the first word
if (fail)
break;
// the number of times the second word had an exact histogram count
// against the first word must match the number of chars in the first
// [and second] word (i.e. all scrambled chars in the second word had
// a place in the first word)
fail = (match != len);
} while (0);
if (fail)
printf("NO\n");
else
printf("YES\n");
}
// main -- main program
int
main(int argc,char **argv)
{
char *file;
FILE *xf;
--argc;
++argv;
file = *argv;
if (file != NULL)
xf = fopen(file,"r");
else
xf = stdin;
fgets(buf,sizeof(buf),xf);
int tstcnt = atoi(buf);
for (int tstno = 1; tstno <= tstcnt; ++tstno)
dotest(xf);
if (file != NULL)
fclose(xf);
return 0;
}
UPDATE #3:
The above two examples had a slight bug. It would report a false negative on an input line of (e.g.) aaa aaa.
The increment of len was always done in the first loop. This was incorrect. I've edited the above two examples to do the increment of len conditionally (i.e. only if the histogram cell was zero before the increment). Now, len is "the number of non-zero histogram cells in the first string". This takes into account duplicates in the string (e.g. aa).
As I mentioned, the use of len, match, and fail was to obviate the need to scan all histogram cells at the end, looking for a non-zero cell which means mismatch/failure.
This would [possibly] run faster for short input lines, where the post scan of the histogram took longer than the input line loops.
However, given that input lines can be 200k in length, the probability is that [almost] all of the histogram cells will be incremented/decremented. Also, the post scan of the histogram (e.g. check 26 integer array values for non-zero) is now a negligible part of the overall time.
Thus, the simple implementation [below] that eliminates len/match calculations inside the first two loops may be the fastest/best choice. This is because the two loops are slightly faster.
#include <stdio.h>
#include <stdlib.h>
char buf[(200 * 1024) + 100];
void
dotest(FILE *xf)
{
char *cp;
char buf[(200 * 1024) + 100];
int histo[26] = { 0 };
int chr;
int fail = 0;
cp = buf;
fgets(cp,sizeof(buf),xf);
// scan first word
for (chr = *cp++; chr != 0; chr = *cp++) {
// stop on delimiter between first and second words
if (chr == ' ')
break;
// convert char to histogram index
chr -= 'a';
// increment the histogram cell
++histo[chr];
}
// scan second word
for (chr = *cp++; chr != 0; chr = *cp++) {
// stop on end-of-line
if (chr == '\n')
break;
// convert char to histogram index
chr -= 'a';
// decrement the histogram cell
--histo[chr];
}
// scan histogram
for (int idx = 0; idx < 26; ++idx) {
if (histo[idx]) {
fail = 1;
break;
}
}
if (fail)
printf("NO\n");
else
printf("YES\n");
}
// main -- main program
int
main(int argc,char **argv)
{
char *file;
FILE *xf;
--argc;
++argv;
file = *argv;
if (file != NULL)
xf = fopen(file,"r");
else
xf = stdin;
fgets(buf,sizeof(buf),xf);
int tstcnt = atoi(buf);
for (int tstno = 1; tstno <= tstcnt; ++tstno)
dotest(xf);
if (file != NULL)
fclose(xf);
return 0;
}
The downside is that there is no "early escape" from the second loop. We would have to finish the scan of the second string even though we might be able to tell early that the second string can't match (e.g.):
aaaaaaaaaa baaaaaaaaa
baaaaaaaaa bbaaaaaaaa
With the simple version we couldn't terminate the second loop early even though we know the second string can never match when we see the b (i.e. the histogram cell goes negative) and skip over the scan of the multiple a in the second word.
So, here's a version that has a simple first loop as above. It adds back the on-the-fly check for a cell going negative in the second loop.
Once again, which version [of the four I've presented] is the best needs some experimentation/benchmarking.
#include <stdio.h>
#include <stdlib.h>
char buf[(200 * 1024) + 100];
void
dotest(FILE *xf)
{
char *cp;
int histo[26] = { 0 };
int chr;
int fail = 0;
int cnt;
cp = buf;
fgets(cp,sizeof(buf),xf);
// scan first word
for (chr = *cp++; chr != 0; chr = *cp++) {
// stop on delimiter between first and second words
if (chr == ' ')
break;
// convert char to histogram index
chr -= 'a';
// increment the histogram cell
++histo[chr];
}
// scan second word
for (chr = *cp++; chr != 0; chr = *cp++) {
// stop on end-of-line
if (chr == '\n')
break;
// convert char to histogram index
chr -= 'a';
// decrement the histogram cell
cnt = --histo[chr];
// however, if we go negative, the second word has too many instances
// of this char to match the first word
if (cnt < 0) {
fail = 1;
break;
}
}
do {
// too many letters in second word that are _not_ in the first word
if (fail)
break;
// scan histogram
for (int idx = 0; idx < 26; ++idx) {
if (histo[idx]) {
fail = 1;
break;
}
}
} while (0);
if (fail)
printf("NO\n");
else
printf("YES\n");
}
// main -- main program
int
main(int argc,char **argv)
{
char *file;
FILE *xf;
char buf[100];
--argc;
++argv;
file = *argv;
if (file != NULL)
xf = fopen(file,"r");
else
xf = stdin;
fgets(buf,sizeof(buf),xf);
int tstcnt = atoi(buf);
for (int tstno = 1; tstno <= tstcnt; ++tstno)
dotest(xf);
if (file != NULL)
fclose(xf);
return 0;
}
public static final int ASC = 97;
static boolean isAnagram(String a, String b) {
boolean res = false;
int len = a.length();
if (len != b.length()) {
return res;
}
a = a.toLowerCase();
b = b.toLowerCase();
int[] a_ascii = new int[26];
int aval = 0;
for (int i = 0; i < 2 * len; i++) {
if (i < len) {
aval = a.charAt(i) - ASC;
a_ascii[aval] = (a_ascii[aval] == 0) ? (aval * len + 1) : (a_ascii[aval] + 1);
} else {
aval = b.charAt(i - len) - ASC;
if (a_ascii[aval] == 0) {
return false;
}
a_ascii[aval] = a_ascii[aval] - 1;
res = (a_ascii[aval] == aval * len) ? true : false;
}
}
return res;
}
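For comparison, here is a compact C++ sketch of the histogram idea discussed in this thread, next to an XOR fold like the one in the question. It assumes both words contain only lowercase 'a' to 'z', which matches the C examples above; the function names are illustrative. The XOR fold shows why the original approach fails: equal characters cancel in pairs, so "aa bb" folds to zero even though the words are not rearrangements of each other.

```cpp
#include <array>
#include <string>

// XOR fold of both words, mirroring the idea in the question.
// A result of 0 is necessary for an anagram but not sufficient.
bool xorSaysAnagram(const std::string& s1, const std::string& s2) {
    unsigned char acc = 0;
    for (unsigned char c : s1) acc ^= c;
    for (unsigned char c : s2) acc ^= c;
    return acc == 0;
}

// Histogram check: increment for s1, decrement for s2.
// Returns false for any character outside 'a'..'z'.
bool isAnagram(const std::string& s1, const std::string& s2) {
    if (s1.size() != s2.size()) return false;
    std::array<int, 26> histo{};
    for (char c : s1) {
        if (c < 'a' || c > 'z') return false;
        ++histo[c - 'a'];
    }
    for (char c : s2) {
        if (c < 'a' || c > 'z') return false;
        if (--histo[c - 'a'] < 0) return false;  // early exit: too many of c
    }
    // Equal lengths and no negative cell imply every cell is back at zero.
    return true;
}
```

Both functions run in O(n), but only the histogram version distinguishes "aa" from "bb".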

How to sort a string in ASCII order

I just made a program to sort a string in alphabetical order, but I have a problem: if I input a number, it is not shown in the output. How do I sort in ASCII order? Can anyone help?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void sort_string(char*);
int main()
{
char string[100];
printf("Enter some text\n");
gets(string);
sort_string(string);
printf("%s\n", string);
return 0;
}
void sort_string(char *s)
{
int c, d = 0, length;
char *pointer, *result, ch;
length = strlen(s);
result = (char*)malloc(length+1);
pointer = s;
for ( ch = 'A' ; ch <= 'z' ; ch++ ) // i don't know how add range
{
for ( c = 0 ; c < length ; c++ )
{
if ( *pointer == ch )
{
*(result+d) = *pointer;
d++;
}
pointer++;
}
pointer = s;
}
*(result+d) = '\0';
strcpy(s, result);
free(result);
}
Sorry if my code is still bad; I am still learning C++.
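When you look at the ASCII table, the digits start with '0' at 0x30 and end with '9' at 0x39. Your loop starts at 'A', which is 0x41, so the digits are never visited.
Just run your loop from '0' to 'z' and it will include the digits too. (It will also include signs such as <, @, etc.) Characters below '0', such as the space at 0x20, are still skipped; start the loop there if you want every printable character.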
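If the goal is simply "every character, in ASCII order", the nested loops can be avoided altogether. A minimal sketch using std::sort from the standard library follows; sortByCode is an illustrative name, and this is an alternative to the loop in the question, not a patch to it.

```cpp
#include <algorithm>
#include <string>

// Return a copy of s with its characters in ascending code order.
// Comparing as unsigned char keeps bytes above 127 after the ASCII range.
std::string sortByCode(std::string s) {
    std::sort(s.begin(), s.end(), [](char a, char b) {
        return static_cast<unsigned char>(a) < static_cast<unsigned char>(b);
    });
    return s;
}
```

Spaces, digits, upper-case and lower-case letters then come out in that order, because that is how their ASCII codes are arranged.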

Keeping Tally of Characters in Arrays

I'm working on a Caesar Cipher program for an assignment and I have the general understanding planned out, but my function for determining the decipher key is unnecessarily long and messy.
while(inFile().peek != EOF){
inFile.get(character);
if (character = 'a'|| 'A')
{ aCount++; }
else if (character = 'b' || 'B')
{ bCount++; }
so on and so on.
What way, if it's possible, can I turn this into an array?
You can use the following code:
int count [26] = {0};
while(inFile.peek() != EOF){
inFile.get(character);
if (int (character) >=65 && int (character) <=90) // 'A'..'Z'
{ count [(int (character)) - 65] ++; }
else if (int (character) >=97 && int (character) <=122) // 'a'..'z'
{ count [(int (character)) - 97] ++; }
}
P.S. This checks the ASCII value of each character and then increments its respective element in the array, with index 0 for A/a, 1 for B/b, and so on.
Hope this helps...
P.S. - There was an error in your code: = is the assignment operator and == is the equality operator. In an if statement you don't assign a value, you check a condition, so always use == to check for equality. Also, character == 'a' || 'A' is always true, because 'A' on its own is non-zero; write character == 'a' || character == 'A' instead.
You can use an array in the following manner
int letterCount['z' + 1] = {0}; // 'z' is the highest letter code, so indices 0..'z' must exist
while(inFile.peek() != EOF){
inFile.get(character);
if (character >= 'A' && character <= 'z') // also covers the few symbols between 'Z' and 'a'
letterCount[character]++;
}
You can also use a hashmap like this
#include <unordered_map>
std::unordered_map<char,int> charMap;
while(inFile.peek() != EOF){
inFile.get(character);
if (charMap.find(character) == charMap.end())
charMap[character] = 1;
else
charMap[character] = charMap[character] + 1;
}
In case you do not know, a hashmap functions as an array, where the index can be any class you like, as long as it implements a hash function.

Word Unscrambling Program - C++

Hi, I'm working on a program to unscramble a set of letters and output all the words that can be made from that set of letters. For example, if I input the letters "vlei", the program would output "live", "evil", and "vile".
So far I have looked through the internet about this quite a bit and can't find anything on my specific questions relevant to my skill level at this point (level 2 noob).
So far I have gotten as far as making all the possible combinations from the given letters, excluding any that are less than 7 letters long, which is a problem.
This is the code I have so far:
string letter;
char newWord[7];
int main()
{
cout << "Type letters here: ";
cin >> letter;
for(int i = 0 ; i < 7 ; i++)
{
for(int j = 0 ; j < 7 ; j++)
{
for(int k = 0 ; k < 7 ; k++)
{
for(int l = 0 ; l < 7 ; l++)
{
for(int m = 0 ; m < 7 ; m++)
{
for(int n = 0 ; n < 7 ; n++)
{
for(int o = 0 ; o < 7 ; o++)
{
sprintf(newWord, "%c%c%c%c%c%c%c", letter[i], letter[j], letter[k], letter[l], letter[m], letter[n], letter[o]);
}
}
}
}
}
}
}
return 0;
}
I was wondering if anyone has any experience with anything like this, and can offer any hints or advice.
Specifically what I'm having difficulty with is how to read in a .txt file to use as a dictionary to compare words to.
Also, I was having trouble using strcmp() which is what I was planning to use to compare the scrambled words to the dictionary. So if there are any other maybe simpler ways to compare the two strings, that would be greatly appreciated.
Thanks in advance.
Hi guys, so I've just finished my program and I hope it can help someone else. Thanks a lot for all your help.
#include <iostream>
#include <fstream>
#include <string>
#include <cstring>
#include <stdio.h>
#include <stdlib.h>
#include <algorithm>
#include <vector>
#include <array>
using namespace std;
//declaring variables
int i;
int scores[531811]; //array for scores of found words
string wordlist[531811]; //array for found matched words
string word[531811]; //array of strings for dictionary words about to be read it
string tester;//string for scrambled letters that will be read in
int scorefinder(string scrab) //SCORE FINDER FUNCTION
{
int score = 0;
int x = 0;
int j = 0;
while (scrab[j])
{
char ltr = toupper(scrab[j]); //converts to all caps
//assings values to each letter and adds it to itself
if(ltr == 'A' || ltr == 'E' || ltr == 'I' || ltr == 'L' || ltr == 'N' || ltr == 'O' || ltr == 'R' || ltr == 'S' || ltr == 'T' || ltr == 'U')
x += 1;
else if(ltr == 'D' || ltr == 'G')
x += 2;
else if(ltr == 'B' || ltr == 'C' || ltr == 'M' || ltr == 'P')
x += 3;
else if(ltr == 'F' || ltr == 'H' || ltr == 'V' || ltr == 'W' || ltr == 'Y')
x += 4;
else if(ltr == 'K')
x += 5;
else if(ltr == 'J' || ltr == 'X')
x += 8;
else if(ltr == 'Q' || ltr == 'Z')
x += 10;
++j;
}
score = x;
return score;
}
int main () {
//READS IN DICTIONARY
ifstream file("words.txt"); //reads in dictionary
if (!file.is_open()){ //checks if file is being NOT read correctly
cout << "BROEKN \n"; //prints error message if so
}
if(file.is_open()){ //checks if file IS being read correctly
for(int i = 0; i < 531811; i++){
file >> word[i]; //read in each word from the file and
} //assigns each to it's position in the words array
}
//END OF READ IN DICTIONARY
cout << "Enter scrambled letters: ";
cin >> tester; //reads in scrambled letters
// is_permutation ignores order, so one pass over the dictionary is enough;
// the length check is needed because this overload only sees where tester begins
for(i=0;i<531811;i++){
if ( word[i].size() == tester.size() && is_permutation (word[i].begin(),word[i].end(), tester.begin())){
wordlist[i] = word[i]; //assigns found word to foundword array
scores[i] = scorefinder(word[i]); //assigns found word score to foundscore array
}
}
//PRINTS OUT ONLY MATCHED WORDS AND SCORES
for(i=0;i<531811;i++){
if(scores[i]!=0){
cout << "Found word: " << wordlist[i] << " " << scores[i] << "\n";
}
}
}
Well, what you need is some sort of comparison. C++ doesn't know what a valid English word is, so you need a word list. Then you can brute-force (that's what you're doing at the moment) until you find a match.
For checking your brute-forced results, you can use a .txt file with as many English words as you can find. Then you use a file stream to iterate through every word and compare it to your brute-force result.
After you have successfully unscrambled a word, you should think about your solution again. As you can see, you are limited to a specific number of chars, which is not that nice.
This sounds like an interesting task for a beginner ;)
Suppose you have found a word list in the form of a plain text file on the Internet. You can load all the words into a vector of strings first.
ifstream word_list_file("word_list.txt");
string buffer;
vector<string> all_words;
while (getline(word_list_file, buffer))
all_words.push_back(buffer);
Then we want to compare the input letters with each entry of all_words. I suggest using std::is_permutation. It compares two sequences regardless of order. But the three-iterator form used here assumes the second sequence is at least as long as the first, so compare the lengths yourself first.
// Remember to #include <algorithm>
bool match(const string& letters, const string& each_word)
{
if (letters.size() != each_word.size())
return false;
return is_permutation(begin(letters), end(letters), begin(each_word));
}
Note that I have not tested my code, but that's the idea.
An edit in response to the comment:
In short, just use std::string, not std::array. Or copy my match function directly and invoke it. That will be easier in your case.
Details:
std::is_permutation can be used with any container and any element type. For example:
#include <string>
#include <array>
#include <vector>
#include <list>
#include <algorithm>
using namespace std;
int main()
{
//Example 1
string str1 = "abcde";
string str2 = "ecdba";
is_permutation(begin(str1), end(str1), begin(str2));
//Example 2
array<double, 4> array_double_1{ 4.1, 4.2, 4.3, 4.4 };
array<double, 4> array_double_2{ 4.2, 4.1, 4.4, 4.3 };
is_permutation(begin(array_double_1), end(array_double_1), begin(array_double_2));
//Example 3
list<char> list_char = { 'x', 'y', 'z' };
string str3 = "zxy";
is_permutation(begin(list_char), end(list_char), begin(str3));
//Example 4
short short_integers[4] = { 1, 2, 3, 4 };
vector<int> vector_int = { 3, 4, 2, 1 };
is_permutation(begin(short_integers), end(short_integers), begin(vector_int));
return 0;
}
Example 1 uses std::string as a container of chars, which is exactly how my match function works.
Example 2 uses two arrays of double of size 4.
Example 3 even uses two different kinds of containers with the same element type. (Have you heard of std::list? Never mind, just focus on our problem first.)
Example 4 is even stranger. One container is an old-style raw array, the other is a std::vector. There are also two element types, short and int, but they are both integers. (The exact difference between short and int is not relevant here.)
Yet all four cases can use is_permutation. Very flexible.
The flexibility is enabled by the following facts:
is_permutation is not exactly a function. It is a function template, which is a language feature that generates new functions according to the data types you pass to it.
The containers and the is_permutation algorithm do not know each other. They communicate through a middleman called an "iterator". The begin and end functions together give us a pair of iterators representing the "range" of elements.
It takes more study to understand these facts, but the general idea is not hard. These facts are also true for other algorithms in the Standard Library.
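Since C++14, std::is_permutation also has an overload that takes both ranges in full, and it returns false when the lengths differ, so the manual size comparison is no longer needed. Here is a small sketch of the same idea as the match function, assuming a C++14 or later compiler; matchLetters is an illustrative name.

```cpp
#include <algorithm>
#include <string>

// True when each_word uses exactly the letters in `letters`, in any order.
// The four-iterator overload (C++14 and later) handles differing lengths itself.
bool matchLetters(const std::string& letters, const std::string& each_word) {
    return std::is_permutation(letters.begin(), letters.end(),
                               each_word.begin(), each_word.end());
}
```

On an older compiler the explicit length check from the match function remains the right approach.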
Try this :
# include <stdio.h>
/* Function to swap values at two pointers */
void swap (char *x, char *y)
{
char temp;
temp = *x;
*x = *y;
*y = temp;
}
/* Function to print permutations of string
This function takes three parameters:
1. String
2. Starting index of the string
3. Ending index of the string. */
void permute(char *a, int i, int n)
{
int j;
if (i == n)
printf("%s\n", a);
else
{
for (j = i; j <= n; j++)
{
swap((a+i), (a+j));
permute(a, i+1, n);
swap((a+i), (a+j)); //backtrack
}
}
}
/* Driver program to test above functions */
int main()
{
char a[] = "vlei";
permute(a, 0, 3);
getchar();
return 0;
}
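In C++ the same enumeration can be written without hand-rolled swapping by using std::next_permutation. The sketch below is an alternative to the recursive C function above, not a translation of it; allPermutations is an illustrative name. It sorts the letters first, because next_permutation walks the arrangements in ascending order and stops after the largest one; as a side effect, repeated letters do not produce duplicate results.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Collect every distinct arrangement of the given letters.
std::vector<std::string> allPermutations(std::string letters) {
    std::sort(letters.begin(), letters.end());  // start from the smallest ordering
    std::vector<std::string> result;
    do {
        result.push_back(letters);
    } while (std::next_permutation(letters.begin(), letters.end()));
    return result;
}
```

For "vlei" this produces 24 strings, among them "live", "evil" and "vile"; each one would still have to be checked against a word list.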