Let's assume that I have a file, wow.txt, which reads
4 a 1 c
and I would like to output the following:
d 1 a 3
That is, change each integer to the corresponding letter of the alphabet (d is the 4th letter, a is the 1st letter), and each letter to the corresponding integer (a is the 1st letter of the alphabet, c is the 3rd).
I started with the following code in C++.
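#include <fstream>
using namespace std;
int main()
{
ifstream inFile;
inFile.open("wow.txt");
ofstream outFile;
outFile.open("sample.txt");
int k, g;
char a, b;
inFile>>k>>a>>g>>b;
// 96 is one less than the ASCII code of 'a', so 1 maps to 'a' and 'a' maps to 1
outFile<<(char)(k+96)<<(int)a-96<<(char)(g+96)<<(int)b-96;
inFile.close();
outFile.close();
}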
but here, I could only do it because I knew that the text in wow.txt
goes integer, character, integer, character.
Also, even if I knew the pattern, if the text in wow.txt were very long, there's
no way I could solve the problem this way, manually typing in each input
(defining k, g as integers and a, b as characters, and
then doing inFile>>k>>a>>g>>b;).
And if I didn't know the pattern, there's no way
I could solve it at all. I was wondering if there's a C++
function that reads the input from the given text file and determines its type, so
that I could attack this type of problem in the more general case.
I'm very new to the C++ programming language (and to programming in general),
so any help would be greatly appreciated.
The term you're searching for is parsing. It is the idea of taking in text and transforming it into something meaningful. Your C++ compiler, for example, does exactly that with your program's code: it reads in text, parses it into a series of internal representations that it transforms, then outputs binary code that, when run, carries out the intent of the code you wrote.
In your case, you want to turn the problem on its head -- instead of telling the input stream what to expect next from the file, you simply extract everything as text, and then figure it out yourself (you let the stream tell you what's there). If you think about it, it's text (or rather, binary data, but close enough) all the way down anyway, even when you're asking for, say, an integer to be read from the stream -- the stream does the integer parsing for you in that case, but it's still just text being parsed.
Here's some example code (untested) to get you started:
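// Requires <cstddef>, <fstream>, <iostream>, and <vector>; the statements go inside main()
// Binary mode keeps the size reported by tellg() equal to the number of bytes read
std::ifstream fin("wow.txt", std::ios::binary);
// Read everything in (works well for short files; longer
// ones could be read incrementally (streamed), but this
// adds complexity)
fin.seekg(0, fin.end);
std::size_t size = fin.tellg();
fin.seekg(0, fin.beg);
std::vector<char> text(size);
fin.read(&text[0], size);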
fin.close();
// Now 'tokenize' the text (into words, in this case characters)
enum TokenType { Letter, Number };
struct Token {
const char* pos;
std::size_t length;
TokenType type;
};
std::vector<Token> tokens;
for (const char* pos = &text[0]; pos != &text[0] + text.size(); ++pos) {
if (*pos >= 'a' && *pos <= 'z') {
// Letter! (lowercase)
Token tok = { pos, 1, Letter };
tokens.push_back(tok);
// TODO: Validate that the next character is whitespace (or EOF)
}
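else if (*pos >= '0' && *pos <= '9') {
// Start with length 0; the loop below counts every digit, including the first
Token tok = { pos, 0, Number };
const char* end = &text[0] + text.size();
while (pos != end && *pos >= '0' && *pos <= '9') {
++pos;
++tok.length;
}
// Step back so the for loop's ++pos lands on the character after the number
--pos;
tokens.push_back(tok);
// TODO: Validate that the next character is whitespace (or EOF)
}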
else if (*pos == ' ' || *pos == '\t' || *pos == '\r' || *pos == '\n') {
// Whitespace, skip
// Note that newlines are normally tracked in order to give
// the correct line number in error messages
}
else {
std::cerr << "Unexpected character "
<< *pos
<< " at position "
<< (pos - &text[0]) << std::endl;
}
}
// Now that we have tokens, we can transform them into the desired output
std::ofstream fout("sample.txt");
for (auto it = tokens.begin(); it != tokens.end(); ++it) {
if (it->type == Letter) {
fout << static_cast<int>(*(it->pos) - 'a') + 1;
}
else {
int num = 0;
for (std::size_t i = 0; i != it->length; ++i) {
num = num * 10 + (it->pos[i] - '0');
}
// TODO: Make sure number is within bounds
fout << static_cast<char>(num - 1 + 'a');
}
fout << ' ';
}
fout.close();
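For input where tokens are always separated by whitespace, a shorter variant of the same idea is to let the stream split the text into words and then classify each word yourself. The following is only a sketch under that assumption; the names convertToken and convertLine are made up for illustration, and anything that is neither a number from 1 to 26 nor a single lowercase letter is passed through unchanged.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Classify one whitespace-free token and convert it.
// Numbers 1..26 become 'a'..'z'; a single lowercase letter becomes its
// 1-based position in the alphabet; anything else is returned unchanged.
std::string convertToken(const std::string& token) {
    bool allDigits = !token.empty();
    for (unsigned char c : token) {
        if (!std::isdigit(c)) allDigits = false;
    }
    if (allDigits && token.size() <= 2) {
        int n = std::stoi(token);
        if (n >= 1 && n <= 26) {
            return std::string(1, static_cast<char>('a' + n - 1));
        }
    }
    if (token.size() == 1 && token[0] >= 'a' && token[0] <= 'z') {
        return std::to_string(token[0] - 'a' + 1);
    }
    return token;
}

// Split a line on whitespace with operator>> and convert every token.
std::string convertLine(const std::string& line) {
    std::istringstream in(line);
    std::string token;
    std::string result;
    while (in >> token) {
        if (!result.empty()) result += ' ';
        result += convertToken(token);
    }
    return result;
}
```

The same loop works on a std::ifstream, since operator>> behaves the same way on any input stream. The hand-written tokenizer above is still the better fit once tokens can run together without whitespace.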
Related
I'm trying to write a program that looks at the last letter of each word in a single string, determines whether it is y or z, and counts the words for which it is.
For example:
"fez day" -> 2
"day fyyyz" -> 2
Everything I've looked up uses what looks to be arrays, but I don't know how to use those yet. I'm trying to figure out how to do it using for loops.
I honestly don't know where to start. I feel like some of my smaller programs could be used to help this, but I'm struggling in trying to figure out how to combine them.
This code counts the number of words in a string:
int words = 0;
bool connectedLetter = false;
for (auto c : s)
{
if (c == ' ')
{
connectedLetter = false;
}
if ( c != ' ' && connectedLetter == false)
{
++words;
connectedLetter = true;
}
}
and it might be useful for figuring out how to get the code to see separate words.
I've used this program to count the number of vowels in the entire string:
int vowels{0};
for (auto c : s)
{
if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u'
|| c == 'A' || c == 'E' || c == 'I' || c == 'O' || c == 'U')
{
++vowels;
}
}
and then I've done a small program to see every other letter in a string
auto len = s.size();
for (auto i = 0; i < len; i = i + 2)
{
result += s.at(i);
}
I feel like I know the concepts behind it, but it's combining them that is stopping me.
You may also use existing C++ functions that are dedicated to doing what you want.
The solution is to take advantage of basic iostream functionality. You may know that the extraction operator >> extracts words from a stream (like std::cin or any other stream), reading until it hits the next white space.
So reading words is simple:
std::string word{}; std::cin >> word;
will read a complete word from std::cin.
OK, we have a std::string and not a stream. But here C++ helps you with std::istringstream, which wraps a std::string in a stream object. You can then use all iostream functionality with this string stream.
Then, for counting elements that meet a special requirement, we have a standard algorithm from the C++ library: std::count_if.
It expects a begin and an end iterator. Here we simply use std::istream_iterator, which calls the extraction operator >> for each word in the stream.
With a lambda passed to std::count_if, we check whether a word meets the required condition.
We then get a very compact piece of code.
#include <iostream>
#include <sstream>
#include <string>
#include <algorithm>
#include <iterator>
int main() {
// test string
std::string testString{ "day fyyyz" };
// We want to extract words from the string, so, convert string to stream.
std::istringstream iss{ testString };
// count words, meeting a special condition
std::cout << std::count_if(std::istream_iterator<std::string>(iss), {},
[](const std::string& s) { return s.back() == 'y' || s.back() == 'z'; });
return 0;
}
Of course there are tons of other possible solutions.
Edit
Pete Becker asked for a more flexible solution. Here too, C++ offers dedicated functionality: std::sregex_token_iterator.
Here we can specify any word pattern with a regex and then simply get or count the matches.
An even simpler piece of code is the result:
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <regex>
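// \w*[zy]\b: a run of word characters whose last character is z or y; \b anchors the match at the end of the word
const std::regex re{ R"(\w*[zy]\b)" };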
int main() {
// test string
std::string s{ "day, fyyyz, abc , zzz" };
// count words, meeting a special condition
std::cout << std::vector(std::sregex_token_iterator(s.begin(), s.end(), re), {}).size();
return 0;
}
If you're not going to use an array (or something similar, like a string), it's probably easiest to just use two ints. For simplicity, let's call them current and previous. You'll also need a count, which you'll want to initialize to 0.
1. Start by initializing both to EOF.
2. Read a character into current.
3. If current is a space or EOF (well, anything you don't consider part of a word), and previous is z or previous is y, increment count.
4. If current is EOF, print out count, and you're done.
5. Copy the value in current into previous.
6. Go back to step 2.
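The steps above can be sketched roughly as follows. This is one possible reading of them, not the only one: the function name countWordsEndingInYZ is invented for the example, only a space counts as a word separator, and the result is returned instead of printed so it is easier to check.

```cpp
#include <cstdio>    // EOF
#include <istream>
#include <sstream>
#include <string>

// Counts words whose last character is 'y' or 'z', reading one character
// at a time and remembering only the previous character.
int countWordsEndingInYZ(std::istream& in) {
    int previous = EOF;   // step 1: both start out as EOF
    int current = EOF;
    int count = 0;
    for (;;) {
        current = in.get();                                // step 2
        bool endOfWord = (current == ' ' || current == EOF);
        if (endOfWord && (previous == 'y' || previous == 'z')) {
            ++count;                                       // step 3
        }
        if (current == EOF) {
            return count;                                  // step 4
        }
        previous = current;                                // step 5
    }                                                      // step 6: loop
}

// Convenience overload (an addition for this example) that reads from a string.
int countWordsEndingInYZ(const std::string& text) {
    std::istringstream in(text);
    return countWordsEndingInYZ(in);
}
```

Because the stream overload takes any std::istream, passing std::cin reads the words from standard input instead.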
std::string is much smarter than many people realize. In particular, it has member functions find_first_of, find_first_not_of, find_last_of, and find_last_not_of that are very helpful for simple parsing. I'd approach it like this:
std::string str = "fez day"; // for example
std::string targets = "yz";
int target_count = 0;
char delims = ' ';
std::string::size_type pos = str.find_first_not_of(delims);
while (pos < str.length()) {
pos = str.find_first_of(delims, pos);
if (pos == std::string::npos)
pos = str.length();
if (targets.find(str[pos-1]) != std::string::npos)
++target_count;
pos = str.find_first_not_of(delims, pos);
}
std::cout << target_count << '\n';
Now, if I need to change this to accommodate comma-separated words, I just change
char delims = ' ';
to
std::string delims = " ,";
or to
const char* delims = " ,"; // my preference
and if I need to change the characters that I'm looking for, I just change the contents of targets. (In fact, I'd use const char* targets = "yz"; and search with std::strchr, which reduces overhead a bit, but that's not particularly important.)
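Put together, the const char* variant mentioned above might look something like this. It is a sketch only: wrapping the loop in a function called countWordsEndingIn is my own choice for illustration, and the extra check for '\0' is there because std::strchr also matches the terminating null of targets.

```cpp
#include <cstring>
#include <string>

// Counts the words in str (separated by any character in delims) whose
// last character is one of the characters in targets.
int countWordsEndingIn(const std::string& str,
                       const char* targets,
                       const char* delims) {
    int count = 0;
    std::string::size_type pos = str.find_first_not_of(delims);
    while (pos != std::string::npos) {
        // Find the end of the current word.
        std::string::size_type end = str.find_first_of(delims, pos);
        if (end == std::string::npos)
            end = str.length();
        // end > pos here, so str[end - 1] is the last character of the word.
        char last = str[end - 1];
        if (last != '\0' && std::strchr(targets, last) != nullptr)
            ++count;
        // Skip the delimiters that follow the word.
        pos = str.find_first_not_of(delims, end);
    }
    return count;
}
```

Calling countWordsEndingIn("fez day", "yz", " ") handles the original case, and switching the last argument to " ," accepts comma-separated words as well.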
I have a runtime problem with code below.
The purpose is to "recognize" the formats (%s %d etc) within the input string.
To do this, it returns an integer that matches the data type.
Then the extracted types are manipulated/handled in other functions.
I want to clarify that my purpose isn't to write formatted types in a string (snprintf etc.) but only to recognize/extract them.
The problem is the crash of my application with error:
Debug Assertion Failed!
Program:
...ers\Alex\source\repos\TestProgram\Debug\test.exe
File: minkernel\crts\ucrt\appcrt\convert\isctype.cpp
Line: 36
Expression: c >= -1 && c <= 255
My code:
#include <cctype>
#include <cstring>
#include <iostream>
#include <string>
enum Formats
{
TYPE_INT,
TYPE_FLOAT,
TYPE_STRING,
TYPE_NUM
};
typedef struct Format
{
Formats Type;
char Name[5 + 1];
} SFormat;
SFormat FormatsInfo[TYPE_NUM] =
{
{TYPE_INT, "d"},
{TYPE_FLOAT, "f"},
{TYPE_STRING, "s"},
};
int GetFormatType(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return format.Type;
}
return -1;
}
bool isValidFormat(const char* formatName)
{
for (const auto& format : FormatsInfo)
{
if (strcmp(format.Name, formatName) == 0)
return true;
}
return false;
}
bool isFindFormat(const char* strBufFormat, size_t stringSize, int& typeFormat)
{
bool foundFormat = false;
std::string stringFormat = "";
for (size_t pos = 0; pos < stringSize; pos++)
{
if (!isalpha(strBufFormat[pos]))
continue;
if (!isdigit(strBufFormat[pos]))
{
stringFormat += strBufFormat[pos];
if (isValidFormat(stringFormat.c_str()))
{
typeFormat = GetFormatType(stringFormat.c_str());
foundFormat = true;
}
}
}
return foundFormat;
}
int main()
{
std::string testString = "some test string with %d arguments"; // crash application
// std::string testString = "%d some test string with arguments"; // not crash application
size_t stringSize = testString.size();
char buf[1024 + 1];
memcpy(buf, testString.c_str(), stringSize);
buf[stringSize] = '\0';
for (size_t pos = 0; pos < stringSize; pos++)
{
if (buf[pos] == '%')
{
if (buf[pos + 1] == '%')
{
pos++;
continue;
}
else
{
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
int typeFormat;
if (isFindFormat(bufFormat, stringSize, typeFormat))
{
std::cout << "type = " << typeFormat << "\n";
// ...
}
}
}
}
}
As the comments in the code indicate, everything works with the string that starts with %d, while the application crashes with the string where %d appears later.
I also wanted to ask: is there a better or more efficient way to recognize format types such as %d and %s within a string (not necessarily by returning an int)?
Thanks.
Let's take a look at this else clause:
char bufFormat[1024 + 1];
memcpy(bufFormat, buf + pos, stringSize);
bufFormat[stringSize] = '\0';
The variable stringSize was initialized with the size of the original format string. Let's say it's 30 in this case.
Let's say you found the %d code at offset 20. You're going to copy 30 characters, starting at offset 20, into bufFormat. That means you're copying 20 characters past the end of the original string. You could possibly read off the end of the original buf, but that doesn't happen here because buf is large. The third line sets a NUL into the buffer at position 30, again past the end of the data, but your memcpy copied the NUL from buf into bufFormat, so that's where the string in bufFormat will end.
Now bufFormat contains the string "%d arguments". Inside isFindFormat you search for the first isalpha character. Possibly you meant isalnum here? We can only get to the isdigit line if the isalpha check passes, and a character that is isalpha is never isdigit.
In any case, after isalpha passes, isdigit will definitely return false, so we enter that if block. Your code will find the right type here. But the loop doesn't terminate. Instead, it continues scanning up to stringSize characters, which is the stringSize from main, that is, the size of the original format string. But the string you're passing to isFindFormat only contains the part starting at '%'. So you're going to scan past the end of the string and read whatever's in the buffer. If one of those bytes is negative when read as a char, passing it to isalpha violates the c >= -1 && c <= 255 check, which is the assertion you're seeing.
There's a lot more going on here. You're mixing and matching std::string and C strings; see if you can use std::string::substr instead of copying. You can use std::string::find to find characters in a string. If you have to use C strings, use strcpy instead of memcpy followed by the addition of a NUL.
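As a rough illustration of the std::string::find suggestion, the scan could be written without any fixed-size buffers. This is a simplified sketch, not a drop-in replacement: the function name findFormatSpecifiers is invented, it returns the conversion characters themselves instead of the Formats enum values, and it only knows the three single-letter formats from the question's table.

```cpp
#include <string>
#include <vector>

// Returns the conversion characters ('d', 'f' or 's') that follow a '%'
// in text, in the order they appear. "%%" is treated as an escaped percent.
std::vector<char> findFormatSpecifiers(const std::string& text) {
    const std::string known = "dfs";
    std::vector<char> found;
    std::string::size_type pos = text.find('%');
    while (pos != std::string::npos) {
        if (pos + 1 >= text.size())
            break;                       // lone '%' at the very end
        char next = text[pos + 1];
        if (next != '%' && known.find(next) != std::string::npos)
            found.push_back(next);
        // Continue after the two characters just examined.
        pos = text.find('%', pos + 2);
    }
    return found;
}
```

Every index is checked against text.size() before it is used, so the scan cannot run past the end of the string the way the loop in isFindFormat does.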
You could delegate this to a regex engine, which is built to search through strings.
Since C++11 there's direct support in the standard library; what you have to do is
#include <regex>
then you can match against strings using various functions, for instance regex_search, which, together with an smatch, lets you find your target in just a few lines of code using the standard library
std::smatch sm;
std::regex_search(testString.cbegin(), testString.cend(), sm, str_exp);
where str_exp is the regex describing what you want to find.
sm now holds the match and its capture groups, which you can print this way
for (std::size_t i = 0; i < sm.size(); ++i)
{
std::cout << "Match:" << sm[i] << std::endl;
}
EDIT:
To better show the result you would get, I'll include a simple sample below
// target string to be searched against
std::string target_string = "simple example no.%d is: %s";
// pattern to look for
std::regex str_exp("(%[sd])");
// match object
std::smatch sm;
// iteratively search for the pattern in the string, excluding parts of the string already matched
std::cout << "My format strings extracted:" << std::endl;
while (std::regex_search(target_string, sm, str_exp))
{
std::cout << sm[0] << std::endl;
target_string = sm.suffix();
}
You can easily add any format string you want by modifying the str_exp regular expression.
I'm programming a hash table thing in C++, but this specific piece of code will not run properly. It should return a string of alpha characters and ' and -, but I get cases like "t" instead of "art" when I try to input "'aRT-*".
isWordChar() returns a bool value depending on whether the input is a valid word character or not, using isalpha().
// Words cannot contain any digits, or special characters EXCEPT for
// hyphens (-) and apostrophes (') that occur in the middle of a
// valid word (the first and last characters of a word must be an alpha
// character). All upper case characters in the word should be converted
// to lower case.
// For example, "can't" and "good-hearted" are considered valid words.
// "12mOnkEYs-$" will be converted to "monkeys".
// "Pa55ive" will be stripped to "paive".
std::string WordCount::makeValidWord(std::string word) {
if (word.size() == 0) {
return word;
}
string r = "";
string in = "";
size_t incr = 0;
size_t decr = word.size() - 1;
while (incr < word.size() && !isWordChar(word.at(incr))) {
incr++;
}
while (0 < decr && !isWordChar(word.at(decr))) {
decr--;
}
if (incr > decr) {
return r;
}
while (incr <= decr) {
if (isWordChar(word.at(incr)) || word.at(incr) == '-' || word.at(incr) == '\'') {
in =+ word.at(incr);
}
incr++;
}
for (size_t i = 0; i < in.size(); i++) {
r += tolower(in.at(i));
}
return r;
}
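Assuming you can use standard algorithms, it's better to rewrite your function using them. This achieves two goals:
code is more readable, since using algorithms shows intent along with the code itself
there is less chance of making an error
So it should be something like this:
// Requires <algorithm>, <cctype>, <iterator>, and <string>;
// assumes isWordChar can be called as isWordChar(char)
std::string WordCount::makeValidWord(std::string word) {
auto first = std::find_if(word.cbegin(), word.cend(), isWordChar);
// base() of the reverse iterator points one past the last word character
auto last = std::find_if(word.crbegin(), word.crend(), isWordChar).base();
if (first >= last) {
return ""; // no word characters at all
}
std::string i;
std::copy_if(first, last, std::back_inserter(i), [](char c) {
return isWordChar(c) || c == '-' || c == '\'';
});
std::string r;
// A lambda avoids the ambiguity between the overloads of std::tolower
std::transform(i.cbegin(), i.cend(), std::back_inserter(r),
[](unsigned char c) { return static_cast<char>(std::tolower(c)); });
return r;
}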
I am going to echo @Someprogrammerdude and say: Learn to use a debugger!
I pasted your code into Visual Studio (changed isWordChar() to isalpha()), and stepped it through with the debugger. Then it was pretty trivial to notice this happening:
First loop of while (incr <= decr) {:
Second loop:
Ooh, look at that; the variable in does not update correctly - instead of collecting a string of the correct characters it only holds the last one. How can that be?
in =+ word.at(incr); Hey, that is not right, that operator should be +=.
Many errors are that easy and effortless to find and correct if you use a debugger. Pick one up today. :)
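To see the difference without a debugger, here is a small sketch that isolates the two operators. The function names are invented for the example. The first loop compiles because in =+ c is parsed as in = (+c), which assigns a single character every time.

```cpp
#include <string>

// Uses "=+": each iteration assigns the current character, replacing
// whatever the string held before.
std::string collectWithAssignPlus(const std::string& word) {
    std::string in;
    for (char c : word) {
        in =+ c;   // same as: in = (+c);
    }
    return in;
}

// Uses "+=": each iteration appends the current character.
std::string collectWithPlusAssign(const std::string& word) {
    std::string in;
    for (char c : word) {
        in += c;
    }
    return in;
}
```

With "aRT" as input, the first version ends up holding only the last character, which after lowercasing is exactly the "t" reported in the question.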
I've been working on a program that needs a function to compare a string entered by the user against stored values, where the user may replace any characters he/she doesn't know with *. The input represents a car's license plate, which has 6 characters (for instance ABC123), and the user is allowed to leave any of those characters out (for instance AB**23 or **C12*). The function needs to return all objects whose plates match every given character in the right position; it must not return an object if, say, A is in the right position but any of the other given characters is not. The user is, however, allowed to enter only A*****, for instance, and the function should then return all objects that have A in the first position.
What I did was use a function to remove all the asterisks from the input string, then create sub-strings and send them to the function as a vector.
string removeAsterisk(string &rStr)// Function to remove asterisks from the string, if any.
{
stringstream strStream;
string delimiters = "*";
size_t current;
size_t next = -1;
do
{
current = next + 1;
next = rStr.find_first_of( delimiters, current );
strStream << rStr.substr( current, next - current ) << " ";
}
while (next != string::npos);
return strStream.str();
}
int main()
{
string newLicensePlateIn;
newLicensePlateIn = removeAsterisk(licensePlateIn);
string buf; // Have a buffer string
stringstream ss(newLicensePlateIn); // Insert the string into a stream
vector<string> tokens; // Create vector to hold our words
while (ss >> buf)
tokens.push_back(buf);
myRegister.showAllLicense(tokens);
}
The class function that receives the vector currently looks something like this:
void VehicleRegister::showAllLicense(vector<string>& tokens)//NOT FUNCTIONAL
{
cout << "\nShowing all matching vehicles: " << endl;
for (int i = 0; i < nrOfVehicles; i++)
{
if(tokens[i].compare(vehicles[i]->getLicensePlate()) == 0)
{
cout << vehicles[i]->toString() << endl;
}
}
}
If anyone understands what I'm trying to do and has some ideas, please feel free to reply. I would appreciate any advice.
Thanks for reading this/ A.
Just iterate through the characters, comparing one at a time. If either character is an asterisk, consider that a match, otherwise compare them for equality. For example:
bool LicensePlateMatch(std::string const & lhs, std::string const & rhs)
{
assert(lhs.size() == 6);
assert(rhs.size() == 6);
for (int i=0; i<6; ++i)
{
if (lhs[i] == '*' || rhs[i] == '*')
continue;
if (lhs[i] != rhs[i])
return false;
}
return true;
}
Actually, you don't have to restrict it to 6 characters. You may want to allow for vanity plates. In that case, just ensure both strings have the same length, then iterate through all the character positions instead of hardcoding 6 in there.
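A length-agnostic version along those lines could look like the sketch below. The name wildcardPlateMatch is made up for the example, and plates of different lengths are treated as a mismatch and not as an assertion failure, which is a design choice and not something the question specifies.

```cpp
#include <cstddef>
#include <string>

// Compares two plates position by position; '*' in either string matches
// any character. Strings of different lengths never match.
bool wildcardPlateMatch(const std::string& pattern, const std::string& plate) {
    if (pattern.size() != plate.size())
        return false;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == '*' || plate[i] == '*')
            continue;                 // wildcard: accept this position
        if (pattern[i] != plate[i])
            return false;             // a known character differs
    }
    return true;
}
```

In showAllLicense this would be called once per vehicle, comparing the user's input (asterisks left in place) against getLicensePlate(), so the asterisk-removal step is no longer needed.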
This is the requirement: Read a string and loop it, whenever a new word is encountered insert it into std::list. If the . character has a space, tab, newline or digit on the left and a digit on the right then it is treated as a decimal point and thus part of a word. Otherwise it is treated as a full stop and a word separator.
And this is the result I run from the template program:
foo.bar -> 2 words (foo, bar)
f5.5f -> 1 word
.4.5.6.5 -> 1 word
d.4.5f -> 3 words (d, 4, 5f)
.5.6..6.... -> 2 words (.5.6, 6)
This seems very complex to me, as it's my first time dealing with strings in C++. I'm really stuck on implementing the code. Could anyone give me a hint? Thanks.
I just sketched some rough ideas:
bool isDecimal(std::string &word) {
bool ok = false;
for (unsigned int i = 0; i < word.size(); i++) {
if (word[i] == '.') {
if ((std::isdigit(word[(int)i - 1]) ||
std::isspace(word[(int)i -1]) ||
(int)(i - 1) == (int)(word.size() - 1)) && std::isdigit(word[i + 1]))
ok = true;
else {
ok = false;
break;
}
}
}
return ok;
}
void checkDecimal(std::string &word) {
if (!isDecimal(word)) {
std::string temp = word;
word.clear();
for (unsigned int i = 0; i < temp.size(); i++) {
if (temp[i] != '.')
word += temp[i];
else {
if (std::isalpha(temp[i + 1]) || std::isdigit(temp[i + 1]))
word += ' ';
}
}
}
trimLeft(word);
}
I think you may be approaching the problem from the wrong direction. It seems much easier if you turn the condition upside down. To give you some pointers in a pseudocode skeleton:
bool isSeparator(const std::string& string, size_t position)
{
// Determine whether the character at <position> in <string> is a word separator
}
void tokenizeString(const std::string& string, std::list<std::string>& wordList)
{
// for every position in string
// if (isSeparator(string, position) || end of string)
// wordList.push_back(substring from last separator to this one)
}
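One way the skeleton could be filled in is sketched below. It follows the rule exactly as the question words it, with two assumptions of my own: whitespace also separates words, and a '.' at the very start of the string counts as having a valid left side. Under this literal reading, d.4.5f splits into d and 4.5f, so the template program apparently applies some extra condition that this sketch does not model.

```cpp
#include <cctype>
#include <cstddef>
#include <list>
#include <string>

// True if the character at pos separates words: any whitespace, or a '.'
// that does not qualify as a decimal point.
bool isSeparator(const std::string& s, std::size_t pos) {
    unsigned char c = static_cast<unsigned char>(s[pos]);
    if (std::isspace(c)) return true;
    if (c != '.') return false;
    // Left side: start of string, whitespace, or a digit.
    bool leftOk = (pos == 0);
    if (!leftOk) {
        unsigned char left = static_cast<unsigned char>(s[pos - 1]);
        leftOk = std::isspace(left) || std::isdigit(left);
    }
    // Right side: must be a digit.
    bool rightOk = (pos + 1 < s.size()) &&
                   std::isdigit(static_cast<unsigned char>(s[pos + 1]));
    return !(leftOk && rightOk);
}

// Appends every word found in s to wordList.
void tokenizeString(const std::string& s, std::list<std::string>& wordList) {
    std::string current;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isSeparator(s, i)) {
            if (!current.empty()) {
                wordList.push_back(current);
                current.clear();
            }
        } else {
            current += s[i];
        }
    }
    if (!current.empty()) wordList.push_back(current);
}
```

Keeping the decision in isSeparator means any extra condition can be added there without touching the loop in tokenizeString.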
I suggest implementing it with flex and bison, which can generate C++ scanners and parsers.