Find the number of times a word is repeated in C++

Is there any way to find out how many times a word is repeated in a text?
The text is in character arrays (char[]).
text = this is a book,and this book
is about book.
word = book
result = 3

Because this is clearly homework and not tagged as such, I'll give you a solution you clearly can't submit as your assignment because your teacher will know you got it on the internet.
There were no requirements such as ignoring punctuation, so I've allowed myself to write a version that only works for clearly separated words and thus inserted spaces in your sample text string.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
// Count clearly separated occurrences of `word` in `text`.
std::size_t count ( const std::string& text, const std::string& word )
{
std::istringstream input(text);
return (std::count(std::istream_iterator<std::string>(input),
std::istream_iterator<std::string>(), word));
}
int main ( int, char ** )
{
const char text[] = "this is a book , and this book is about book .";
const char word[] = "book";
std::cout << count(text, word) << std::endl;
}
Output:
3

You might want to implement this using std::string and here is a sample for you to start from.

The simplest way would be to loop through the string, counting the number of times that you find the word that you're looking for. I'm sure that you could use a function in <algorithm> to do it fairly easily, but if you have to ask whether it's possible to do this in C++, I wouldn't think that you're advanced enough to try using the algorithm library, and doing it yourself would be more instructional anyway.
I would suggest using std::string though if you're allowed to (since this question does sound like homework, which could carry additional restrictions). Using std::string is easier and less error-prone than char arrays. It can be done with both though.
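A minimal sketch of the loop described above, assuming std::string is allowed. The function name count_occurrences is made up for illustration. It counts substring matches, so "book" would also be counted inside "books".

```cpp
#include <cstddef>
#include <string>

// Count occurrences of `word` in `text` by calling std::string::find repeatedly.
// Matches are substrings, not whole words, and do not overlap.
std::size_t count_occurrences(const std::string& text, const std::string& word)
{
    if (word.empty())
        return 0;  // an empty word would match at every position

    std::size_t count = 0;
    std::string::size_type pos = text.find(word);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(word, pos + word.length());  // resume after this match
    }
    return count;
}
```

Called with the text from the question, count_occurrences("this is a book,and this book is about book.", "book") returns 3, punctuation included.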

It is possible.
You have an array of characters. Try to do the search on a piece of paper, character by character:
First character is a t. This is not a b, so it can't be the first character of "book".
Second character is an h, so again, it is not b...
[...]
The next character is a b... Aha, this could be it. Is the next character an o? YES!!! And is the next one another o???... etc. etc.
When you can do it this way, you will be able to use C++ to do it.
Remember that you can access the n-th character in an array by using the [] operator:
char c = array[5] ; // c is the 6th character in the array
Now, going toward the C++ way would be, at first, to use a std::string instead of an array of chars, and use the string's methods. Google for std::string methods, and I guess you should find some that you could use...
So you should manage to write some code that will iterate over each character until the end of the array.
I guess this should be more than enough.
The point of your homework (because everyone here knows this is a homework question) is more about searching for the solution than finding it: This is not rote learning.
I doubt anyone on Stack Overflow remembers the solution to this classical problem. But I guess most will know how to find one solution. You need to learn the "how to find" mindset, so get your compiler and try again...
P.S.: Of course, if you know little or nothing of C++, then you're screwed, and you could start by Googling some C++ Tutorials.
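For readers who have already worked it out on paper, the character-by-character search described above could look roughly like this on plain char arrays. This is only a sketch: the name count_in_array is hypothetical, both arrays are assumed to be null-terminated, and overlapping matches are counted separately.

```cpp
#include <cstddef>

// Count occurrences of the null-terminated `word` in the null-terminated
// `text`, comparing one character at a time with the [] operator.
std::size_t count_in_array(const char text[], const char word[])
{
    if (word[0] == '\0')
        return 0;  // nothing to search for

    std::size_t count = 0;
    for (std::size_t i = 0; text[i] != '\0'; ++i) {
        std::size_t j = 0;
        // Advance while the characters agree and neither string has ended.
        while (word[j] != '\0' && text[i + j] != '\0' && text[i + j] == word[j])
            ++j;
        if (word[j] == '\0')  // reached the end of `word`: full match at i
            ++count;
    }
    return count;
}
```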

Related

C++ String.h Char Tables cutting-off word without strstr

I need help with C++ <string.h> char tables.... How to cut word from sentence, using "*" operator, with no strstr? For example: "StackOverFlow is online website". I have to cut off "StackOverFlow" and leave in table "is online website" using operator, with no strstr. I couldn't find it anywhere.
Mostly like:
char t[]
int main
{
strcpy(t,"Stackoverflow is online website");
???
(Setting first char to NULL, then strcat/strcpy rest of sentence into table)
}
Sorry for the English problems and bad naming... I'm just starting to learn C++.
You can do something like this. Explain better what you need, please.
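char szFirstStr[] = "StackOverflow, flowers and vine.";
// Source and destination overlap, so use memmove (from <cstring>);
// calling strcpy on overlapping buffers is undefined behavior.
memmove(szFirstStr, szFirstStr + 15, strlen(szFirstStr + 15) + 1);
std::cout << szFirstStr << std::endl;
This will output "flowers and vine.".
Using C strings is not good style for a C++ programmer; use the std::string class.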
Your code is obviously syntactically incorrect, but I guess you are aware of that.
Your variable t is a char array. Used in an expression, it yields a pointer to its first character, just as a const char* can point to the first character of a null-terminated string. With a pointer, you can change its value so that it points to the new starting point of your string.
You can either do that, or if you indeed use an array, you can copy from the pointer of the new starting point you wish to use. So if the data you wish to copy resides in memory pointed to by:
const char* str = "Stackoverflow is an online website";
This looks like the following in memory:
Stackoverflow is an online website\0
str points to: --^
If you want to point to a different starting point you can alter the pointer to point at a different starting location:
Stackoverflow is an online website\0
str + 14 points to: --------------^
You can pass the address of the "i" to your strcpy, like so:
strcpy(t, str + 14);
Obviously, you may not know the number of characters to cut off (the 14) without analyzing the string. What you might do is search through the string for the first character following whitespace.
// Notice that this is just a sample of a search that could be made
// much more elegant, but I will leave that to you.
const char* FindSecondWord(const char* strToSearch) {
// Loop until the end of the string is reached or the first
// white space character. isspace (from <cctype>) needs an unsigned char
// value; passing a negative char is undefined behavior.
while (*strToSearch && !isspace(static_cast<unsigned char>(*strToSearch))) strToSearch++;
// Loop until the end of the string is reached or the first
// non white space character is found (our new starting point)
while (*strToSearch && isspace(static_cast<unsigned char>(*strToSearch))) strToSearch++;
return strToSearch;
}
strcpy(t, FindSecondWord("Stackoverflow is an online website"));
std::cout << t << std::endl;
This will output: is an online website
Since this is most likely a school assignment, I will skip the lecture on more modern C++ string handling, as I expect this has something to do with learning pointers. But obviously this is very low level modification of a string.
As a beginner, why make it harder than it really has to be?
Use std::string and substr().
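A minimal sketch of the std::string and substr() suggestion, under the assumption that "cutting off the first word" means dropping everything up to the start of the second word. The helper name drop_first_word is invented for this example.

```cpp
#include <string>

// Return everything after the first word of `sentence`, skipping the
// whitespace that follows it. Returns an empty string if there is no
// second word.
std::string drop_first_word(const std::string& sentence)
{
    const char* whitespace = " \t\n";

    // Start of the first word (ignores any leading whitespace).
    std::string::size_type start = sentence.find_first_not_of(whitespace);
    if (start == std::string::npos)
        return "";

    // First whitespace character after the first word.
    std::string::size_type end_of_word = sentence.find_first_of(whitespace, start);
    if (end_of_word == std::string::npos)
        return "";

    // Start of the second word.
    std::string::size_type next_word = sentence.find_first_not_of(whitespace, end_of_word);
    if (next_word == std::string::npos)
        return "";

    return sentence.substr(next_word);
}
```

With the sentence from the question, drop_first_word("StackOverFlow is online website") yields "is online website".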

How can search number in string?

Hello? I want to know how can I find number in my string code.
This is my c++ code.
string firSen;
cout<<"write the senctence : "<<endl;
getline(cin,firSen);
int a=firSen.find("pizza");
int b=firSen.find("hamburger");
int aa=firSen.find(100);
int bb=firSen.find(30);
I want to type
I want to eat pizza 100g, hamburger 30g!!
and find the positions of 100 and 30.
I know how to find the positions of pizza and hamburger (that code works),
but I don't know how to find the numbers. (I think int aa=firSen.find(100); int bb=firSen.find(30); is wrong.)
Could you help me?
The std::string::find() function takes a std::string, a const char*, or a single char as its search key. firSen.find(100) compiles because 100 is converted to a char, so it searches for the character with code 100 ('d' in ASCII), not for the text "100".
If you want to search for 'generic numbers' you'll have to convert them to a std::string or use a const char* literal:
std::string::size_type aa=firSen.find("100");
or
int num = 100;
std::string::size_type aa=firSen.find(std::to_string(num));
See the std::to_string() function reference.
As it looks from your input sample, you don't know the numeric values beforehand, so looking up something like
std::string::size_type aa=firSen.find("100");
is useless.
What you actually need is a decent parser that lets you read the numeric values after certain keywords that require a numeric attribute (like the weight in your sample).
The simplest way might be to find your keywords like "hamburger" or "pizza", move on from the found position to the next digit ('0'-'9'), and extract the number from that position.
Using std::regex as proposed in @deeiip's answer might be a concise solution for your problem.
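The keyword-then-digit idea above can be sketched with std::string::find and find_first_of. The helper names find_number_after and digits_at are hypothetical. Note that this simple version picks up the next digit anywhere after the keyword, so it assumes every keyword really is followed by its own number.

```cpp
#include <string>

// Return the position of the first digit that follows `keyword` in `text`,
// or std::string::npos if the keyword or a following digit is missing.
std::string::size_type find_number_after(const std::string& text,
                                         const std::string& keyword)
{
    std::string::size_type key_pos = text.find(keyword);
    if (key_pos == std::string::npos)
        return std::string::npos;
    return text.find_first_of("0123456789", key_pos + keyword.length());
}

// Extract the run of digits starting at `pos` as text
// (empty if `pos` is std::string::npos).
std::string digits_at(const std::string& text, std::string::size_type pos)
{
    if (pos == std::string::npos)
        return "";
    std::string::size_type end = text.find_first_not_of("0123456789", pos);
    if (end == std::string::npos)
        return text.substr(pos);
    return text.substr(pos, end - pos);
}
```

For the sample sentence "I want to eat pizza 100g, hamburger 30g!!", find_number_after(firSen, "pizza") gives position 20, and digits_at(firSen, 20) gives "100". If the numeric value is needed, the result can be passed to std::stoi.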
I'd use this in your situation (if I was searching for just a number, not a specific number):
std::regex rgx("[0-9]+");
std::smatch res;
std::string s = firSen; // search a copy so firSen stays intact
while (std::regex_search(s, res, rgx)) {
std::cout << res[0] << std::endl;
s = res.suffix().str(); // continue after this match
}
This is C++11 standard code using <regex>. It searches for every occurrence of a number, which is what [0-9]+ means, and keeps searching the rest of the string for the same pattern.
This solution should only be used when you don't know what number to expect; otherwise it will be much more expensive than the other solutions mentioned.

Dynamically allocated strings in C

I was doing a relatively simple string problem in UVa's online judge to practice with strings since I've been having a hard time with them in C. The problem basically asks to check if a string B contains another string A if you remove the 'clutter' and concatenate the remaining characters, for example if "ABC" is contained in "AjdhfmajBsjhfhC" which in this case is true.
So, my question is: how can I efficiently allocate memory for a string whose length I don't know? What I did was to make a really big string, char Mstring[100000], read from input, and then use strlen(Mstring) to copy the string to a properly sized char array. Something like:
char Mstring[100000];
scanf("%s",Mstring);
int length = strlen(Mstring);
char input[length+1]={0};
for(int i = 0; i<length;i++){
input[i]=Mstring[i];
}
Is there a better/standard way to do this in C? I know that C does not have great support for strings; if there is no better way to do it in C, maybe in C++?
If you have the option of using C++ (as you mentioned), that is going to make your life a lot easier. You can then use a STL string (std::string) which manages dynamically sized strings for you. You can also drop the old scanf() beast and use std::cin.
Example:
#include <iostream>
#include <string>
int main()
{
std::string sInput;
std::getline(std::cin, sInput);
// alternatively, you could execute this line instead:
// std::cin >> sInput;
// but that will tokenize input based on whitespace, so you
// will only get one word at a time rather than an entire line
}
Describing how to manage strings that can grow dynamically in C will take considerably more explanation and care, and it sounds like you really don't need that. If so, however, here is a starting point: http://www.strchr.com/dynamic_arrays.

c++ creating ambigram from string

I have a task to implement "void makeAmbigram(char*)" that will print on screen the ambigram of a Latin string or return something like 'ambigram not possible'. I guess it's just about checking whether the string consists only of SNOXZHI and printing the string backwards. Or am I wrong?
I'm a complete noob when dealing with cpp so that's what I've created :
#include <iostream>
using namespace std;
char[]words;
char[]reversed;
char[] ret_str(char* s)
{
if(*s != '\0')
ret_str(s+1);
return s;
}
void makeAmbigram(char* c)
{
/* finding chars XIHNOZS and printing ambigram */
}
int main()
{
cin>>words;
reversed = ret_str(words);
makeAmbigram(reversed);
return 0;
}
I can reverse the string, but how do I check that my reversed string contains only the needed chars?
I've found some function, but it's hard or even impossible to use it for a larger number of chars: www.java2s.com/Code/C/String/Findcharacterinstringhowtousestrchr.htm
You need to allocate space in your arrays or use std::vector. The declarations char[]words; and char[]reversed; are not valid C++ and reserve no storage. C++ has no built-in array type that grows at run time; however, the standard library provides std::vector, which allocates space dynamically as required.
Change:
char[]words;
char[]reversed;
To:
#define MAX_LETTERS 64
char words[MAX_LETTERS + 1]; // + 1 for terminating nul character ('\0')
char reversed[MAX_LETTERS + 1];
Or:
#include <string>
std::string words;
std::string reversed;
Or:
#include <vector>
std::vector<char> words;
std::vector<char> reversed;
As far as the ambigram rules go, you need to talk to your instructor. Also, if this is homework, add a tag indicating so.
Hint: The std::string data type has some reverse iterators which may be of use to you.
std::string has an entire family of member functions along the lines of find_first_of. You can pass in a string containing all the letters your ambigram test requires, and they'll find whether any of those letters are present in the source string.
The complete list of string functions is available here.
As for the definition of ambigrams, given the wiki page you've included in the question... you need to check if a letter is legible if viewed upside down, e.g. u/n, w/m, d/p, q/b and so on. There are of course more complex rules as well, e.g. 'ui' can resemble 'm' if viewed upside down.
However, if you're only required to check if your string contains only SNOXZHI, you can look into a regular expression (regex) for that and match the input string against it.
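As a rough sketch of the simpler interpretation (the string must consist only of SNOXZHI and is then printed backwards), the check can be done without a regex using find_first_not_of, a sibling of find_first_of, and the reversal with reverse iterators. The function names below are invented for illustration, and the rule itself is the question's guess, not a confirmed requirement.

```cpp
#include <string>

// True if `s` is non-empty and consists only of the letters S, N, O, X, Z, H, I.
// find_first_not_of returns npos when no other character is present.
bool only_ambigram_letters(const std::string& s)
{
    return !s.empty() && s.find_first_not_of("SNOXZHI") == std::string::npos;
}

// Build a reversed copy of `s` from its reverse iterators.
std::string reversed_copy(const std::string& s)
{
    return std::string(s.rbegin(), s.rend());
}
```

A makeAmbigram function could then print reversed_copy(word) when only_ambigram_letters(word) is true, and 'ambigram not possible' otherwise.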

How to remove accents and tilde in a C++ std::string

I have a problem with a string in C++ which has several words in Spanish. This means that I have a lot of words with accents and tildes. I want to replace them with their unaccented counterparts. Example: I want to replace the word "había" with "habia". I tried to replace it directly with the replace method of the string class, but I could not get that to work.
I'm using this code:
for (it= dictionary.begin(); it != dictionary.end(); it++)
{
strMine=(it->first);
found=toReplace.find_first_of(strMine);
while (found!=std::string::npos)
{
strAux=(it->second);
toReplace.erase(found,strMine.length());
toReplace.insert(found,strAux);
found=toReplace.find_first_of(strMine,found+1);
}
}
Where dictionary is a map like this (with more entries):
dictionary.insert ( std::pair<std::string,std::string>("á","a") );
dictionary.insert ( std::pair<std::string,std::string>("é","e") );
dictionary.insert ( std::pair<std::string,std::string>("í","i") );
dictionary.insert ( std::pair<std::string,std::string>("ó","o") );
dictionary.insert ( std::pair<std::string,std::string>("ú","u") );
dictionary.insert ( std::pair<std::string,std::string>("ñ","n") );
and toReplace strings is:
std::string toReplace="á-é-í-ó-ú-ñ-á-é-í-ó-ú-ñ";
I must obviously be missing something, but I can't figure it out.
Is there any library I can use?
Thanks,
I disagree with the currently "approved" answer. The question makes perfect sense when you are indexing text. Like case-insensitive search, accent-insensitive search is a good idea. "naïve" matches "Naïve" matches "naive" matches "NAİVE" (you do know that an uppercase i is İ in Turkish? That's why you ignore accents)
Now, the best algorithm is hinted at in the approved answer: use NFKD (compatibility decomposition) to decompose accented letters into the base letter and a separate accent, and then remove all accents.
There is little point in the re-composition afterwards, though. You removed most sequences which would change, and the others are for all intents and purposes identical anyway. What's the difference between æ in NFKC and æ in NFKD?
First, this is a really bad idea: you’re mangling somebody’s language by removing letters. Although the extra dots in words like “naïve” seem superfluous to people who only speak English, there are literally thousands of writing systems in the world in which such distinctions are very important. Writing software to mutilate someone’s speech puts you squarely on the wrong side of the tension between using computers as means to broaden the realm of human expression vs. tools of oppression.
What is the reason you’re trying to do this? Is something further down the line choking on the accents? Many people would love to help you solve that.
That said, libicu can do this for you. Open the transform demo; copy and paste your Spanish text into the “Input” box; enter
NFD; [:M:] remove; NFC
as “Compound 1” and click transform.
(With help from slide 9 of Unicode Transforms in ICU. Slides 29-30 show how to use the API.)
I definitely think you should look into the root of the problem. That is, look for a solution that will allow you to support characters encoded in Unicode or for the user's locale.
That being said, your problem is that you're dealing with multi-byte characters: in UTF-8, for instance, each accented letter such as "á" is stored as two bytes, so it does not fit in a single char. There is std::wstring but I'm not sure I'd use that. For one thing, wide characters aren't meant to handle variable width encodings. This hole goes deep, so I'll leave it at that.
Now, as for the rest of your code, it is error prone because you mix the looping logic with translation logic. Thus, at least two kinds of bugs can occur: translation bugs and looping bugs. Do use the STL, it can help you a lot with the looping part.
The following is a rough solution for replacing characters in a string.
main.cpp:
#include <iostream>
#include <string>
#include <iterator>
#include <algorithm>
#include "translate_characters.h"
using namespace std;
int main()
{
string text;
cin.unsetf(ios::skipws);
transform(istream_iterator<char>(cin), istream_iterator<char>(),
inserter(text, text.end()), translate_characters());
cout << text << endl;
return 0;
}
translate_characters.h:
#ifndef TRANSLATE_CHARACTERS_H
#define TRANSLATE_CHARACTERS_H
#include <map>
// std::unary_function was deprecated in C++11 and removed in C++17;
// the functor works with std::transform without inheriting from it.
class translate_characters {
public:
translate_characters();
char operator()(const char c);
private:
std::map<char, char> characters_map;
};
#endif // TRANSLATE_CHARACTERS_H
translate_characters.cpp:
#include "translate_characters.h"
using namespace std;
translate_characters::translate_characters()
{
characters_map.insert(make_pair('e', 'a'));
}
char translate_characters::operator()(const char c)
{
map<char, char>::const_iterator translation_pos(characters_map.find(c));
if( translation_pos == characters_map.end() )
return c;
return translation_pos->second;
}
You might want to check out the boost (http://www.boost.org/) library.
It has a regexp library, which you could use.
In addition it has a specific library that has some functions for string manipulation, including replace.
Try using std::wstring instead of std::string. UTF-16 should work (as opposed to ASCII).
I could not link the ICU libraries, but I still think it's the best solution. As I need this program to be functional as soon as possible, I made a little program (that I have to improve) and I'm going to use that. Thank you all for the suggestions and answers.
Here's the code I'm gonna use:
for (it= dictionary.begin(); it != dictionary.end(); it++)
{
strMine=(it->first);
found=toReplace.find(strMine);
while (found != std::string::npos)
{
strAux=(it->second);
toReplace.erase(found,2);
toReplace.insert(found,strAux);
found=toReplace.find(strMine,found+1);
}
}
I will change it next time I have to turn my program in for correction (in about 6 weeks).
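One possible cleanup of the loop above, offered as a sketch and not as the poster's final code: it uses the key's own length instead of the hard-coded 2, and resumes the search after the inserted replacement. The function name replace_all is made up here, and the keys are assumed to be byte sequences in the same encoding as the text (for example UTF-8).

```cpp
#include <map>
#include <string>

// Replace every occurrence of each key of `dictionary` in `text` with its
// value. Keys are matched as byte sequences, so a two-byte UTF-8 "á" is
// matched and removed as a whole.
std::string replace_all(std::string text,
                        const std::map<std::string, std::string>& dictionary)
{
    for (std::map<std::string, std::string>::const_iterator it = dictionary.begin();
         it != dictionary.end(); ++it) {
        const std::string& from = it->first;
        const std::string& to = it->second;
        if (from.empty())
            continue;  // an empty key would match everywhere and never advance

        std::string::size_type found = text.find(from);
        while (found != std::string::npos) {
            text.replace(found, from.length(), to);
            found = text.find(from, found + to.length());  // skip the inserted text
        }
    }
    return text;
}
```

With the dictionary from the question, replace_all(toReplace, dictionary) would turn "á-é-í-ó-ú-ñ" into "a-e-i-o-u-n", provided the source file and the input share one encoding.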
If you can (if you're running Unix), I suggest using the tr facility for this: it's custom-built for this purpose. Remember, no code == no buggy code. :-)
Edit: Sorry, you're right, tr doesn't seem to work. How about sed? It's a pretty stupid script I've written, but it works for me.
#!/bin/sed -f
s/á/a/g;
s/é/e/g;
s/í/i/g;
s/ó/o/g;
s/ú/u/g;
s/ñ/n/g;
/// <summary>
///
/// Replace any accented or foreign character with its ASCII equivalent.
/// In other words, convert a string to an ASCII-compliant string.
///
/// This also gets rid of special hidden characters, like EOF, NUL, TAB and other '\0', except \n\r
///
/// Tests with accents and foreign characters:
/// Before: "äæǽaeöœoeüueÄAeÜUeÖOeÀÁÂÃÄÅǺĀĂĄǍΑΆẢẠẦẪẨẬẰẮẴẲẶАAàáâãåǻāăąǎªαάảạầấẫẩậằắẵẳặаaБBбbÇĆĈĊČCçćĉċčcДDдdÐĎĐΔDjðďđδdjÈÉÊËĒĔĖĘĚΕΈẼẺẸỀẾỄỂỆЕЭEèéêëēĕėęěέεẽẻẹềếễểệеэeФFфfĜĞĠĢΓГҐGĝğġģγгґgĤĦHĥħhÌÍÎÏĨĪĬǏĮİΗΉΊΙΪỈỊИЫIìíîïĩīĭǐįıηήίιϊỉịиыїiĴJĵjĶΚКKķκкkĹĻĽĿŁΛЛLĺļľŀłλлlМMмmÑŃŅŇΝНNñńņňʼnνнnÒÓÔÕŌŎǑŐƠØǾΟΌΩΏỎỌỒỐỖỔỘỜỚỠỞỢОOòóôõōŏǒőơøǿºοόωώỏọồốỗổộờớỡởợоoПPпpŔŖŘΡРRŕŗřρрrŚŜŞȘŠΣСSśŝşșšſσςсsȚŢŤŦτТTțţťŧтtÙÚÛŨŪŬŮŰŲƯǓǕǗǙǛŨỦỤỪỨỮỬỰУUùúûũūŭůűųưǔǖǘǚǜυύϋủụừứữửựуuÝŸŶΥΎΫỲỸỶỴЙYýÿŷỳỹỷỵйyВVвvŴWŵwŹŻŽΖЗZźżžζзzÆǼAEßssIJIJijijŒOEƒf'ξksπpβvμmψpsЁYoёyoЄYeєyeЇYiЖZhжzhХKhхkhЦTsцtsЧChчchШShшshЩShchщshchЪъЬьЮYuюyuЯYaяya"
/// After: "aaeooeuueAAeUUeOOeAAAAAAAAAAAAAAAAAAAAAAAaaaaaaaaaaaaaaaaaaaaaaaBbCCCCCCccccccDdDDjddjEEEEEEEEEEEEEEEEEEeeeeeeeeeeeeeeeeeeFfGGGGGgggggHHhhIIIIIIIIIIIIIiiiiiiiiiiiiJJjjKKkkLLLLllllMmNNNNNnnnnnOOOOOOOOOOOOOOOOOOOOOOooooooooooooooooooooooPpRRRRrrrrSSSSSSssssssTTTTttttUUUUUUUUUUUUUUUUUUUUUUUUuuuuuuuuuuuuuuuuuuuuuuuYYYYYYYYyyyyyyyyVvWWwwZZZZzzzzAEssIJijOEf'kspvmpsYoyoYeyeYiZhzhKhkhTstsChchShshShchshchYuyuYaya"
///
/// Tests with invalid 'special hidden characters':
/// Before: "\0\0\000\0000Bj��rk�\'\"\\\0\a\b\f\n\r\t\v\u0020���oacu\'\\\'te�"
/// After: "00000Bjrk'\"\\\n\r oacu'\\'te"
///
/// </summary>
private string Normalize(string StringToClean)
{
string normalizedString = StringToClean.Normalize(NormalizationForm.FormD);
StringBuilder Buffer = new StringBuilder(StringToClean.Length);
for (int i = 0; i < normalizedString.Length; i++)
{
if (CharUnicodeInfo.GetUnicodeCategory(normalizedString[i]) != UnicodeCategory.NonSpacingMark)
{
Buffer.Append(normalizedString[i]);
}
}
string PreAsciiCompliant = Buffer.ToString().Normalize(NormalizationForm.FormC);
StringBuilder AsciiComplient = new StringBuilder(PreAsciiCompliant.Length);
foreach (char character in PreAsciiCompliant)
{
//Reject all special characters except \n\r (Carriage-Return and Line-Feed).
//Get rid of special hidden character, like EOF, NUL, TAB and other '\0'
if (((int)character >= 32 && (int)character < 127) || ((int)character == 10 || (int)character == 13))
{
AsciiComplient.Append(character);
}
}
return AsciiComplient.ToString().Trim(); // Remove spaces at start and end of string if any
}