C++ reading sentences - c++

string a = "MwZwXxZwDwJrBxHrHxMrGrJrGwHxMrFrZrZrDrKwZxLrZrFwZxErMrXxArZw";
Assume I have this data in my string. I want to count how many times each capital letter (M, Z, X, D, J, and the others I didn't mention) occurs in the string. How can I do it? My friends say a vector can do it, but I don't really know how to use a vector. Is there an alternative way?
I tried using for loops to find every M and then resetting the index to 0 to search for the next capital letter, but I'm not sure whether there is an easier way.

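First, I'll show you what seems to me an easier way.
#include <cctype>
#include <iostream>
#include <map>
#include <string>
using namespace std;
int main() {
    string str = "MwZwXxZwDwJrBxHrHxMrGrJrGwHxMrFrZrZrDrKwZxLrZrFwZxErMrXxArZw";
    map<char, int> counts;  // key: capital letter, value: number of occurrences
    for (size_t i = 0; i < str.length(); i++) {
        char ch = str[i];
        // isupper expects a value representable as unsigned char
        if (isupper(static_cast<unsigned char>(ch))) {
            counts[ch]++;
        }
    }
    // a map iterates in key order, so the letters come out sorted
    for (const auto& item : counts) {
        cout << item.first << ':' << item.second << endl;
    }
    return 0;
}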
You only need one loop to solve your problem.
'isupper(int c)' is a function from the standard library (declared in <cctype>); it tells you whether a character is a capital letter.
'map' is a data structure from the standard library too; it stores key-value pairs for you, here one counter per letter.
The program prints:
A:1
B:1
D:2
E:1
F:2
G:2
H:3
J:2
K:1
L:1
M:4
X:2
Z:8
is this what you want?

Use regex.
// regex_search example
#include <iostream>
#include <map>
#include <regex>
#include <string>
using namespace std;
int main()
{
    std::string s("MwZwXxZwDwJrBxHrHxMrGrJrGwHxMrFrZrZrDrKwZxLrZrFwZxErMrXxArZw");
    std::smatch m;
    std::regex e("[A-Z]");  // match one capital letter at a time
    map<string, int> counts;
    std::cout << "Target sequence: " << s << std::endl;
    std::cout << "Regular expression: [A-Z]" << std::endl;
    std::cout << "The following matches were found:" << std::endl;
    while (std::regex_search(s, m, e)) {
        // m[0] is the whole match, i.e. a single capital letter
        counts[m[0].str()]++;
        // continue the search in the text after this match
        s = m.suffix().str();
    }
    for (const auto& item : counts) {
        cout << item.first << ':' << item.second << endl;
    }
    return 0;
}

The most direct translation of "loop through the string and count the uppercase letters" into C++ I can think of:
#include <cctype>
#include <iostream>
#include <map>
#include <string>
int main()
{
std::string a = "MwZwXxZwDwJrBxHrHxMrGrJrGwHxMrFrZrZrDrKwZxLrZrFwZxErMrXxArZw";
std::map<char, int> count;
// Loop through the string...
for (auto c: a)
{
// ... and count the uppercase letters.
if (std::isupper(c))
{
count[c] += 1;
}
}
// Show the result.
for (auto it: count)
{
std::cout << it.first << ": " << it.second << std::endl;
}
}

Related

How can I find the positions from characters in a string with string::find?

I need the positions of certain characters in a string.
The string contains:
"username":"secret", "password":"also secret", "id":"secret too", "token":"secret"
and I need the positions of the quotation marks from the token that are bold: "token":"secret".
I have experimented with the code from http://www.cplusplus.com/reference/string/string/find
but nothing worked. Can anyone help me?
Here is what I have tried, but it only prints 0:
#include <iostream>
#include <string>
int main() {
std::string buffer("\"username\":\"secret\", \"password\":\"also secret\", \"id\":\"secret too\", \"token\":\"secret\"");
size_t found = buffer.find('"');
if (found == std::string::npos)std::cout << "something went wrong\n";
if (found != std::string::npos)
std::cout << "first " << '"' << " found at: " << found << '\n';
for (int j = 0; j <= 17; ++j) {
found = buffer.find('"');
found + 1, 6;
if (found != std::string::npos)
std::cout << "second " << '"' << " found at : " << found << '\n';
}
return 0;
}
There are many possible solutions, so it is hard to give a single answer.
Basically, you need to iterate through the string position by position, check whether each character is the one you are searching for, and then do something with the result.
A first simple implementation could be:
#include <iostream>
#include <string>
const std::string buffer("\"username\":\"secret\", \"password\":\"also secret\", \"id\":\"secret too\", \"token\":\"secret\"");
int main() {
for (size_t position{}, counter{}; position < buffer.length(); ++position) {
if (buffer[position] == '\"') {
++counter;
std::cout << "Character \" number " << counter << " found at position " << position << '\n';
}
}
return 0;
}
But your question was about the usage of std::string::find(). In your implementation, you always start the search at the beginning of the std::string, so you always find the same " at position 0.
Solution: after you have found a match, pass the resulting position (incremented by one) as the second parameter to std::string::find(). The search then starts after the previously found " and hence finds the next one. All this can be done in a normal for loop.
See the next simple example below:
#include <iostream>
#include <string>
const std::string buffer("\"username\":\"secret\", \"password\":\"also secret\", \"id\":\"secret too\", \"token\":\"secret\"");
int main() {
for (size_t position{}, counter{}; std::string::npos != (position = buffer.find("\"", position)); ++position, ++counter) {
std::cout << "Character \" number " << counter << " found at position " << position << '\n';
}
return 0;
}
There are more solutions, depending on what you really want to do. You could extract all keywords and data with a simple regex.
Something like this:
#include <iostream>
#include <string>
#include <regex>
#include <vector>
const std::regex re{ R"(\"([ a-zA-Z0-9]+)\")" };
const std::string buffer("\"username\":\"secret\", \"password\":\"also secret\", \"id\":\"secret too\", \"token\":\"secret\"");
int main() {
std::vector part(std::sregex_token_iterator(buffer.begin(), buffer.end(), re, 1), {});
std::cout << part[7] << '\n';
return 0;
}
Or, you can split everything into tokens and values. Like this:
#include <iostream>
#include <string>
#include <regex>
#include <vector>
#include <map>
#include <iomanip>
const std::regex re1{ "," };
const std::regex re2{ R"(\"([^\"]+)\")" };
const std::string buffer("\"username\":\"secret\", \"password\":\"also secret\", \"id\":\"secret too\", \"token\":\"secret\"");
int main() {
std::vector<std::string> block(std::sregex_token_iterator(buffer.begin(), buffer.end(), re1, -1), {});
std::map<std::string, std::string> entry{};
for (const auto& b : block) {
std::vector blockPart(std::sregex_token_iterator(b.begin(), b.end(), re2, 1), {});
entry[blockPart[0]] = blockPart[1];
}
for (const auto& [token, value] : entry)
std::cout << std::setw(20) << token << " --> " << value << '\n';
return 0;
}
But if you have a complex given format, like JSON, there are so many special cases that the only meaningful approach is to use an existing library.

How do I remove repeated words from a string and only show it once with their wordcount

Basically, I have to show each word with its count, but repeated words show up again in my program.
How do I remove them using loops, or should I use a 2D array to store both the word and the count?
#include <iostream>
#include <stdio.h>
#include <iomanip>
#include <cstring>
#include <conio.h>
#include <time.h>
using namespace std;
char* getstring();
void xyz(char*);
void tokenizing(char*);
int main()
{
char* pa = getstring();
xyz(pa);
tokenizing(pa);
_getch();
}
char* getstring()
{
static char pa[100];
cout << "Enter a paragraph: " << endl;
cin.getline(pa, 1000, '#');
return pa;
}
void xyz(char* pa)
{
cout << pa << endl;
}
void tokenizing(char* pa)
{
char sepa[] = " ,.\n\t";
char* token;
char* nexttoken;
int size = strlen(pa);
token = strtok_s(pa, sepa, &nexttoken);
while (token != NULL) {
int wordcount = 0;
if (token != NULL) {
int sizex = strlen(token);
//char** fin;
int j;
for (int i = 0; i <= size; i++) {
for (j = 0; j < sizex; j++) {
if (pa[i + j] != token[j]) {
break;
}
}
if (j == sizex) {
wordcount++;
}
}
//for (int w = 0; w < size; w++)
//fin[w] = token;
//cout << fin[w];
cout << token;
cout << " " << wordcount << "\n";
}
token = strtok_s(NULL, sepa, &nexttoken);
}
}
This is the output I get:
I want to show, for example, the word "i" once with its count of 5, and then not show it again.
First of all, since you are using C++, I would recommend splitting the text the C++ way and storing every word in a map or unordered_map.
But if you don't want to rewrite your code, you can simply add a variable that indicates whether a copy of the word occurs before the current word's position. If no copy is found before it, print the word and its count.
This post gives an example that saves each word from your 'strtok' loop into a vector of strings. It then uses string::compare to compare every word with word[0]. The indexes of the words that match word[0] are recorded in the int array 'used'; the number of matches equals the number of recorded indexes ('nused'). The recorded words are then removed from the vector, and the remaining words go through the next round of comparison. The program ends when no words remain.
If you prefer not to use std::vector and std::string, you can write your own word-comparing function to replace 'str.compare(str2)'.
#include <iostream>
#include <string>
#include <vector>
#include<iomanip>
#include<cstring>
using namespace std;
char* getstring();
void xyz(char*);
void tokenizing(char*);
int main()
{
char* pa = getstring();
xyz(pa);
tokenizing(pa);
}
char* getstring()
{
static char pa[100] = "this is a test and is a test and is test.";
return pa;
}
void xyz(char* pa)
{
cout << pa << endl;
}
void tokenizing(char* pa)
{
char sepa[] = " ,.\n\t";
char* token;
char* nexttoken;
std::vector<std::string> word;
int used[64];
std::string tok;
int nword = 0, nsize, nused;
int size = strlen(pa);
token = strtok_s(pa, sepa, &nexttoken);
while (token)
{
word.push_back(token);
++nword;
token = strtok_s(NULL, sepa, &nexttoken);
}
for (int i = 0; i<nword; i++) std::cout << word[i] << std::endl;
std::cout << "total " << nword << " words.\n" << std::endl;
nsize = nword;
while (nsize > 0)
{
nused = 0;
tok = word[0] ;
used[nused++] = 0;
for (int i=1; i<nsize; i++)
{
if ( tok.compare(word[i]) == 0 )
{
used[nused++] = i; }
}
std::cout << tok << " : " << nused << std::endl;
for (int i=nused-1; i>=0; --i)
{
for (int j=used[i]; j<(nsize+i-nused); j++) word[j] = word[j+1];
}
nsize -= nused;
}
}
Notice that the used words have to be removed in backward order. If you removed them in forward order, the indexes recorded in the 'used' array would have to be adjusted after each removal. A test run:
$ ./a.out
this is a test and is a test and is test.
this
is
a
test
and
is
a
test
and
is
test
total 11 words.
this : 1
is : 3
a : 2
test : 3
and : 2
I read your last comment.
I am very sorry, but I do not know C, so I will answer in C++,
using the C++ standard approach. That is usually only about 10 lines of code . . .
#include <iostream>
#include <algorithm>
#include <map>
#include <string>
#include <regex>
// Regex Helpers
// Regex to find a word
static const std::regex reWord{ R"(\w+)" };
// Result of search for one word in the string
static std::smatch smWord;
int main() {
std::cout << "\nPlease enter text: \n";
if (std::string line; std::getline(std::cin, line)) {
// Words and its appearance count
std::map<std::string, int> words{};
// Count the words
for (std::string s{ line }; std::regex_search(s, smWord, reWord); s = smWord.suffix())
words[smWord[0]]++;
// Show result
for (const auto& [word, count] : words) std::cout << word << "\t\t--> " << count << '\n';
}
return 0;
}

How to compare two text files and find the similarities between then?

I have loaded both of my files into an array and I'm trying to compare them to find what they have in common. However, when I run my code I don't get any output.
This is the contents of both files.
file1
tdogicatzhpigu
file2
dog
pig
cat
rat
fox
cow
So the program should compare the words from search1.txt with the text from text1.txt: I want to find the occurrence of each word from search1.txt in text1.txt.
What I eventually want to output is whether each word has been found and, if so, the index of its location inside the array.
e.g
"dog". Found, location 1.
Here is my code
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
ifstream file1("text1.txt");
if (file1.is_open())
{
string myArray[1];
for (int i = 0; i < 1; i++)
{
file1 >> myArray[i];
Any further help would be greatly appreciated. Thanks in advance.
I believe the goal is to search the text in file1 for each word in file2.
You can't use equality for the two strings, as they aren't equal. You'll need to use the std::string::find method:
std::string target_string;
std::getline(file1, target_string);
std::string keyword;
while (getline(file2, keyword))
{
const std::string::size_type position = target_string.find(keyword);
std::cout << "string " << keyword << " ";
if (position == std::string::npos)
{
std::cout << "not found.\n";
}
else
{
std::cout << "found at position " << position << "\n";
}
}
Edit 1:
An implemented example:
#include <iostream>
#include <string>
using std::cout;
using std::string;
using std::endl;
int main()
{
const std::string target_string = "tdogicatzhpigu";
const std::string key_list[] =
{
"dog",
"pig",
"cat",
"rat",
"fox",
"cow",
};
static const unsigned int key_quantity =
sizeof(key_list) / sizeof(key_list[0]);
for (unsigned int i = 0; i < key_quantity; ++i)
{
const std::string::size_type position = target_string.find(key_list[i]);
std::cout << "string " << key_list[i] << " ";
if (position == std::string::npos)
{
std::cout << "not found.\n";
}
else
{
std::cout << "found at position " << position << "\n";
}
}
return 0;
}

Removing all the characters (a-z, A-Z) from a string in C++

Here's my code:
#include <iostream>
using namespace std;
string moveString(string t, int index)
{
for (int i=index; t[i]!=NULL;i++)
{
t[i]=t[i+1];
}
return t;
}
string delChars(string t)
{
for (int i=0; t[i]!=NULL; i++)
{
if (t[i]>'a' && t[i]<'z')
{
moveString(t, i);
}
else if (t[i]>'A' && t[i]<'Z')
{
moveString(t, i);
}
}
return t;
}
int main()
{
int numberOfSpaces;
string t;
cout << "Text some word: "; cin>>t;
cout<<delChars(t);
return 0;
}
The first function, moveString, should (in theory) shift every character of the string down by one index (starting from the given index), which removes one character. The rest is pretty obvious. But:
Input: abc123def
Output: abc123def
What am I doing wrong?
And an additional mini-question: what's actually the best way to "delete" an element from an array? (array of ints, chars, etc.)
Logic Stuff is right, but his answer is not complete. You shouldn't increment i after a move, since the i-th character has been removed and i now points to the next character.
string delChars(string t)
{
    for (int i = 0; t[i] != '\0'; )
    {
        // >= and <= so that 'a', 'z', 'A' and 'Z' themselves are removed too
        if (t[i] >= 'a' && t[i] <= 'z')
        {
            t = moveString(t, i);
        }
        else if (t[i] >= 'A' && t[i] <= 'Z')
        {
            t = moveString(t, i);
        }
        else
            i++;  // advance only when nothing was removed
    }
    return t;
}
moveString takes t by value and you're not assigning its return value, so it doesn't change t in delChars. So make sure the next thing you learn about is references.
Apart from that, I can't say for sure whether t[i] != NULL is undefined behavior, but we have std::string::size to get the length of a std::string, e.g. i < t.size(). And if you access t[i + 1], the condition should then be i + 1 < t.size().
In any case, don't treat the string like a char array and leave it at its previous size. You can pop_back the last (duplicate) character after shifting the characters.
It's worth mentioning that this can be done in one line with idiomatic C++ algorithms, but you want to get your code working...
What am I doing wrong?
Not using standard algorithms
Actually, what's the best way to "delete" an element from array? (array of ints, chars, etc.)
By using the standard remove-erase idiom:
#include <iostream>
#include <string>
#include <algorithm>
#include <iomanip>
#include <cctype>
int main()
{
using namespace std;
auto s = "!the 54 quick brown foxes jump over the 21 dogs."s;
cout << "before: " << quoted(s) << endl;
s.erase(std::remove_if(s.begin(),
s.end(),
[](auto c) { return std::isalpha(c); }),
s.end());
cout << "after: " << quoted(s) << endl;
return 0;
}
expected output:
before: "!the 54 quick brown foxes jump over the 21 dogs."
after: "! 54       21 ."
I'm not allowed to use standard algorithms
Then keep it simple:
#include <iostream>
#include <string>
#include <algorithm>
#include <iomanip>
#include <cctype>
std::string remove_letters(const std::string& input)
{
std::string result;
result.reserve(input.size());
for (auto c : input) {
if (!std::isalpha(c)) {
result.push_back(c);
}
}
return result;
}
int main()
{
using namespace std;
auto s = "!the 54 quick brown foxes jump over the 21 dogs."s;
cout << "before: " << quoted(s) << endl;
auto s2 = remove_letters(s);
cout << "after: " << quoted(s2) << endl;
return 0;
}

string parsing for C++

I have a text file that has #'s in it...It looks something like this.
#Stuff
1
2
3
#MoreStuff
a
b
c
I am trying to use std::string::find() function to get the positions of the # and then go from there, but I'm not sure how to actually code this.
This is my attempt:
int pos1=0;
while(i<string.size()){
int next=string.find('#', pos1);
i++;}
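Here's one I made a while ago (in C):
#include <string.h>
int char_pos(char c, const char *str) {
    const char *pch = strchr(str, c);
    if (pch == NULL)
        return -1;               /* not found */
    return (int)(pch - str) + 1; /* 1-based position of the first match */
}
Port it to C++ and there you go! ;)
If not found: returns a negative value (-1).
Otherwise: returns a positive value, the 1-based position of the first match.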
It's hard to tell from your question what you mean by "position", but it looks like you are trying to do something like this:
#include <fstream>
#include <iostream>
#include <string>
int main()
{
std::ifstream incoming{"string-parsing-for-c.txt"};
std::string const hash{"#"};
std::string line;
for (auto line_number = 0U; std::getline(incoming, line); ++line_number)
{
auto const column = line.find(hash);
if (std::string::npos != column)
{
std::cout << hash << " found on line " << line_number
<< " in column " << column << ".\n";
}
}
}
...or possibly this:
#include <fstream>
#include <iostream>
int main()
{
std::ifstream incoming{"string-parsing-for-c.txt"};
char const hash{'#'};
char byte{};
for (auto offset = 0U; incoming.read(&byte, 1); ++offset)
{
if (hash == byte)
{
std::cout << hash << " found at offset " << offset << ".\n";
}
}
}