Parse int and string - c++

Hi, I'm not sure if this is the right place to ask this question.
Anyway, I have written this code to parse a molecular formula and split it into atoms and the amount of each atom.
For instance, if I input "H2O", the atom array will hold {"H", "O"} and the amount array will hold {2, 1}. I haven't accounted for amounts larger than 9, since I don't think there are molecules that can bind to more than 8 of something.
Anyway, I'm quite a newbie, so I wonder if this piece of code can be made better?
string formula = "H2O";
int no, k = 0, a = 0;
string atom[10];
int amount[10];
bool flag = true;
stringstream ss(formula);
for(int i = 0; i < formula.size(); ++i)
{
no = atoi(&formula[i]);
if(no == 0 && (flag || islower(formula[i]) ) )
{
cout << "k = " << k << endl;
atom[k] += formula[i];
flag = false;
cout << "FOO1 " << atom[k] << endl;
amount[a] = 1;
}
else if(no != 0)
{
amount[a] = no;
cout << "FOO2 " << amount[a] << endl;
a++;
flag = true;
k++;
}
else
{
k++;
a++;
atom[k] = formula[i];
cout << "FOO3 " << atom[k] << endl;
amount[a] = 1;
flag = false;
}
cout << no << endl;
}

Have you considered an approach with regular expressions? Do you have access to Boost or TR1 regular expressions? An individual atom and its count can easily be represented as:
(after edits based on comments)
([A-Z][a-z]{0,2})([0-9]*)
Then you just need to repeatedly find this pattern in your input string and extract the different parts.
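As a rough illustration of the regex idea, here is a minimal sketch that uses C++11 std::regex instead of Boost or TR1 (that substitution is an assumption; the answer did not name a specific library). The function name parseFormula is hypothetical, and a missing count is treated as 1.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: split a formula such as "H2O" into (symbol, count) pairs.
std::vector<std::pair<std::string, int>> parseFormula(const std::string& formula) {
    // One uppercase letter, up to two lowercase letters, then optional digits.
    static const std::regex atomPattern("([A-Z][a-z]{0,2})([0-9]*)");
    std::vector<std::pair<std::string, int>> atoms;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(formula.begin(), formula.end(), atomPattern); it != end; ++it) {
        const std::smatch& match = *it;
        // An empty digit group means the atom appears once.
        const int count = match[2].length() == 0 ? 1 : std::stoi(match[2].str());
        atoms.emplace_back(match[1].str(), count);
    }
    return atoms;
}
```

Calling parseFormula("H2O") yields the pairs ("H", 2) and ("O", 1). Because the digit group accepts several digits, counts above 9 (as in "C6H12O6") are handled too.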

There are many potential improvements that could be made, of course. But as a newbie, I guess you only want the immediate ones. The first improvement is to change this from a program that has a hard-coded formula to a program that reads a formula from the user. Then try testing your program by inputting different formulae, and check that the output is correct.

What if you modified it to be like this algorithm? This would maybe be less code, but would definitely be more clear:
// while not at end of input
// gather an uppercase letter
// gather any lowercase letters
// gather any numbers
// set the element in your array
This could be implemented with 3 very simple loops inside of your main loop, and would make your intentions to future maintainers much more obvious.
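The outline above might look something like the following sketch. It is only one possible reading of the pseudocode: the name splitFormula is hypothetical, the uppercase step is a single check rather than a loop, characters that cannot start an atom are simply skipped, and a missing number is assumed to mean 1.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: walk the formula once, gathering symbol and count per atom.
std::vector<std::pair<std::string, int>> splitFormula(const std::string& formula) {
    std::vector<std::pair<std::string, int>> atoms;
    const std::size_t n = formula.size();
    std::size_t i = 0;
    while (i < n) {
        // gather an uppercase letter (skip anything that cannot start an atom)
        if (!std::isupper(static_cast<unsigned char>(formula[i]))) {
            ++i;
            continue;
        }
        std::string symbol(1, formula[i++]);
        // gather any lowercase letters
        while (i < n && std::islower(static_cast<unsigned char>(formula[i]))) {
            symbol += formula[i++];
        }
        // gather any numbers
        int count = 0;
        bool sawDigit = false;
        while (i < n && std::isdigit(static_cast<unsigned char>(formula[i]))) {
            count = count * 10 + (formula[i++] - '0');
            sawDigit = true;
        }
        // set the element in the result
        atoms.emplace_back(symbol, sawDigit ? count : 1);
    }
    return atoms;
}
```

Returning a vector of pairs instead of filling two fixed-size arrays is a design choice of this sketch; it keeps each symbol next to its count and removes the 10-element limit.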

Related

Generating random text based on a regex in C++

I just started learning to code, starting with C++ yesterday. I need it for a project I'm doing, and I like to build generation tools as an "onboarding" process when I learn a new skill. So I thought I'd try building out a regex generation tool.
I googled, I binged, and I looked through the similar questions and only saw answers pertaining to Ruby, Perl, or JS. Frankly, given the utility and prevalence of C++, I'm a bit surprised that not more people have tried this.
I don't know how to go about the task, as I'm not a professional or really knowledgeable about what I'm doing. I'm not sure how to ask such questions, either. Please bear with me while I explain my current thoughts.
I am currently toying around with generating strings using byte arrays (I find the C++ type system and casting is confusing at times). I wanted to see if there were any specific ranges of random values that produce strings with latin characters more than others. I get a lot of different values, and found a few ranges that looked like sweet spots, but I ultimately don't know what numbers correlate to what characters.
I wanted to establish a pattern, then set the rand() ranges to correlate with the projected total byte value of what the pattern should generate as a string, then go fishing. I understand that I have to account for upper bounds for characters. So the generated values would be something like:
//not implemented
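std::array<int, 2> getBoundary(const string& expression){ //needs #include <array>; a function cannot return a raw array
srand(time(0));
std::array<int, 2> boundaries = {0, 0};
boundaries[0] = getCeilingValue(expression); //not implemented yet
boundaries[1] = getFloorValue(expression); //not implemented yet
return boundaries;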
}
practice.cpp
/*
Method actually producing the byte strings
*/
void practice::stuub(int boundaries[2]){
srand(time(0)); //seed
basic_string<char> byteArray = {}; //"byte array" instantiation
for (int i = 0; i < 1000; i += 1) {
if(i % 2 ==0){
byteArray.push_back(rand() % boundaries[0]);//ceiling
}else{
byteArray.push_back(rand() % boundaries[1]);//floor
}
}
std::string s(byteArray); //copy the whole string; string(str, pos) would start copying at index pos
cout << s << "\n";
}
/*
just a copy pasta validation function that I don't know if I need yet
*/
bool isNumeric(string str) {
for (int i = 0; i < str.length(); i++)
if (isdigit(str[i]) == false)
return false; //when one non numeric value is found, return false
return true;
}
/*
current putzing around. It's just been real fun to play around with,
but I plan to replace the instantiation of values of the "mod" array with
the upper/lower bounds of the string projected values This currently takes
a value and just does random stuff to it on a fishing expedition to see
if I can find any patterns.
*/
void practice::randomStringGen() {
try {
srand(time(0));
int mod[2] = {0};
string choice;
while (choice != "q") {
cout << "\n enter an integer to generate random byte strings or press (q) to quit \n";
cin >> choice;
if(choice != "q") {// make sure its not quit, otherwise it still carries out the tasks
if (isNumeric(choice)) {//make sure its numeric
mod[0] = stoi(choice);
if(mod[0] > 0) {//make sure its not 0
mod[0] = int(pow(mod[0], mod[0]));//do some weirdo math
mod[1] = rand() % mod[0]+1; //get another weirdo number
cout << "\n random string start:\n";
stuub(mod);//generate random string
cout << "\n :random string end\n";
}else{//user entered invalid integer
cout << "\n you did not enter a valid integer. Enter numbers greater than 0";
}
}else{
cout << "\n " << choice << " is not an integer";
}
}
}
}catch(std::exception& e){
cout << e.what();
}
}
I hope that provides enough explanation of what I am trying to accomplish.
I'm not any sort of pro, and I have very little understanding of what I'm doing.
I picked this up yesterday as an absolute beginner.
Talk to me like I'm 5 if you can.
Also, any recommendations on how to improve and "discretize" what I'm currently doing would be much appreciated. I think the nested "ifs" look wonky, but that's just a gut instinct.
Thanks!
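On the question of which numbers correlate to which characters: in ASCII, the printable characters are the values 32 through 126, with '0' to '9' at 48 to 57, 'A' to 'Z' at 65 to 90, and 'a' to 'z' at 97 to 122. One way to avoid fishing for ranges is to draw characters from an explicit alphabet instead of from raw byte values. The following is a minimal sketch under that assumption; randomFromAlphabet is a hypothetical helper, it expects a non-empty alphabet, and it covers only a single character class such as [a-z], not full regex syntax.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: build `length` characters, each drawn uniformly from `alphabet`.
// Assumes `alphabet` is not empty.
std::string randomFromAlphabet(const std::string& alphabet, std::size_t length, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
    std::string out;
    out.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        out.push_back(alphabet[pick(rng)]);
    }
    return out;
}
```

A caller would create one generator, for example std::mt19937 rng(std::random_device{}()), and reuse it for every call, rather than reseeding inside each function as the srand(time(0)) calls do.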

Issue Comparing strings for an Answer Key (C++)

I'm working on a midterm project for my coding class, and while I've gotten the majority of kinks worked out I'm struggling with comparing two string values and determining if they are equal or not. The strings in question are ANSWERKEY and studentAnswers. The former is a constant that the latter is compared to.
The code in question is as follows.
if (studentAnswers == ANSWERKEY)
{
percentScore = 100.0;
cout << "Score: " << percentScore << " % " << 'A' << endl;
}
else if (studentAnswers != ANSWERKEY)
{
int count = 0;
double answerCount = 0.0;
while (count < ANSWERKEY.length())
{
if (studentAnswers.substr(count, count+1) == ANSWERKEY.substr(count, count+1))
{
answerCount++;
count++;
}
else
{
cout << "Incorrect answer." << endl;
count++;
}
}
percentScore = ((answerCount) / (double)ANSWERKEY.length()) * 100.0;
cout << "Percent score is " << percentScore << "%" << endl;
}
The exact issue I'm facing is that I can't work out a better way to compare the strings. With the current method, the output is the following:
The intro to the code runs fine. Only when I get to checking the answers against the key, in this case "abcdefabcdefabcdefab", do I run into issues. Regardless of what characters are changed, the program marks roughly half of all characters as mismatching and drops the score down because of it.
I've thought of using a pair of arrays, but then I can't find a solution to setting up the array when some values of it are empty. If the student's answers are too short, e.g. only 15 characters long, I don't know how to compare the blank space, or even store it in the array.
Thank you for any help you can give.
First:
if (studentAnswers == ANSWERKEY)
{...}
else if (studentAnswers != ANSWERKEY)
{ ...}
looks like an overkill when comparing strings. And where is the else part ?
Second, this is risky. Read up on IEEE 754 and articles about cancellation, or even SO:
double answerCount = 0.0;
...
answerCount++
Third:
You are checking character by character using substr. To me it feels like using a hammer to kill a bacteria.
studentAnswers.substr(count, count+1) == ANSWERKEY.substr(count, count+1)
Fourth:
What if studentAnswers is shorter than ANSWERKEY ?
Conclusion:
You need to clarify inputs/expected outputs and use the debugger to better understand what is happening during execution. Carefully check all your variables at each step of your program.
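One detail behind the third point: the second argument of std::string::substr is a length, not an end index, so substr(count, count+1) compares progressively longer pieces rather than single characters. Indexing with operator[] avoids that. The sketch below is one possible way to put the points together; the helper names countCorrect and percentScore are hypothetical, and treating missing answers as wrong is an assumption about the grading rules.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: number of positions where the answer matches the key.
// Positions past the end of a short answer string are never counted as correct.
std::size_t countCorrect(const std::string& key, const std::string& answers) {
    const std::size_t limit = std::min(key.size(), answers.size());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < limit; ++i) {
        if (answers[i] == key[i]) {
            ++correct;
        }
    }
    return correct;
}

// Hypothetical helper: score as a percentage of the key length.
double percentScore(const std::string& key, const std::string& answers) {
    if (key.empty()) {
        return 0.0; // avoid dividing by zero
    }
    return 100.0 * static_cast<double>(countCorrect(key, answers))
                 / static_cast<double>(key.size());
}
```

With the key "abcdefabcdefabcdefab", a student string holding only the first 15 characters scores 75, and an identical string scores 100, so the separate equality branch is no longer needed.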

While loop in C++ (using break)

I'm currently working through the book C++ Primer (recommended on the SO book list). One exercise was essentially this: read through some strings and check whether any string is repeated twice in succession; if one is, print which word it was and break out of the loop; if no word was repeated, print that. Here is my solution, I'm wondering a) if it's not a good solution and b) is my test condition for no repeated words ok? Because I had to add 1 to the variable to get it to work as expected. Here is my code:
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main() {
vector<string> words = {"Cow", "Cat", "Dog", "Dog", "Bird"};
string tempWord;
unsigned int i = 0;
while (i != words.size())
{
if (words[i] == tempWord)
{
cout << "Loop exited as the word " << tempWord << " was repeated.";
break;
}
else
{
tempWord = words[i];
}
// add 1 to i to test equality as i starts at 0
if (i + 1 == words.size())
cout << "No word was repeated.";
++i;
}
return 0;
}
The definition of "good solution" will somewhat depend on the requirements - the most important will always be "does it work" - but then there may be speed and memory requirements on top.
Yours seems to work (unless you have the first string being blank, in which case it'll break); so it's certainly not that bad.
The only suggestion I could make is that you could have a go at writing a version that doesn't keep a copy of one of the strings, because what if they're really really big / lots of them and copying them will be an expensive process?
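A version without the copy could compare each word with the one before it, by index. This is only a sketch of that suggestion; the helper name firstRepeatIndex and the convention of returning words.size() when nothing repeats are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: index of the first word equal to its predecessor,
// or words.size() if no word is repeated in succession.
std::size_t firstRepeatIndex(const std::vector<std::string>& words) {
    // Start at 1 so that words[i - 1] always exists; no string is copied.
    for (std::size_t i = 1; i < words.size(); ++i) {
        if (words[i] == words[i - 1]) {
            return i;
        }
    }
    return words.size();
}
```

Starting at index 1 also sidesteps the blank-first-string case, because the first word is never compared against an empty placeholder.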
I would move the test condition outside of the loop, as it seems unnecessary to perform it at every step. For readability I would add a bool:
string tempWord;
unsigned int i = 0;
bool exited = false;
while (i != words.size())
{
if (words[i] == tempWord)
{
cout << "Loop exited as the word " << tempWord << " was repeated.";
exited = true;
break;
}
else
{
tempWord = words[i];
}
++i;
}
// Doing the check afterwards instead
if (!exited)
{
cout << "No word was repeated.";
}
a) if it's not a good solution
For the input specified it is a good solution (it works). However, tempWord is never given a value before the loop, so it starts out as an empty string and the first iteration compares against that. Because the input does not contain an empty string, it works. But if your input started with an empty string, it would falsely be reported as a repeat.
b) is my test condition for no repeated words ok? Because I had to add 1 to the variable to get it to work as expected.
Yes, and it is simply because the indexing of the array starts from zero, and you are testing it against the count of items in the array. So for example an array with count of 1 will have only one element which will be indexed as zero. So you were right to add 1 to i.
As an answer to the training task, your code (after some of the fixes suggested in other answers) looks good. However, if this were a real-world problem (and therefore didn't come with strange restrictions like "use a for loop and break"), then its writer should also consider ways of improving readability.
Using a standard library algorithm is almost always better than reinventing the wheel, so I would write this code as follows (std::adjacent_find is declared in <algorithm>):
auto equal = std::adjacent_find(words.begin(), words.end());
if (equal == words.end())
{
cout << "No word was repeated" << endl;
}
else
{
cout << "Word " << *equal << " was repeated" << endl;
}

How to "Bind" a number to a string of words/phrase so that I can call it up in a loop?

I'm working on a project where I need to have the computer print the 12 days of Christmas lyrics. I thought of making a FOR loop that repeats 12 times, with the day advancing each time via the unary operator "++". Here's what I mean:
int main()
{
string Print = first = 1; //Here I want first to become a number so that I can call it up in FOR loop.
cout << "On the first day of Christmas, \nmy true love sent to me\nA partridge in a pear tree.\n" << endl;
for(int loop = 0; loop <= 12; loop++)//This part is a simple for loop, it starts at 0 and goes to 12 until it stops.
{
cout << "On the " << (1,2,3,4,5,6,7,8,9...12) << " day of Christmas,\nmy true love sent to me\n" << endl; HERE!!!!
Here is where I'm having issue. I want the numbers to call in strings to say the day. As in x = 1 will call in "First" and then I can move the number up by using "x++" which will result in x = 2 and then it will say "Second".. all the way to 12. Anyone know how I can resolve this issue?
}
This involves a simple but important part of programming called an array. I don't want to give you the answer directly - you need to use these (or similar structures) all the time, and it is very important to practice their use and understand them. Let's make a simple program using arrays that prints "Hello World":
#include <iostream>
#include <string>
int main() {
std::string words[2]; //make an array to hold our words
words[0] = "Hello"; //set the first word (at index 0)
words[1] = "World"; //set the second word (at index 1)
int numWords = 2; //make sure we know the number of words!
//print each word on a new line using a loop
for(int i = 0; i < numWords; ++i)
{
std::cout << words[i] << '\n';
}
return 0;
}
You should be able to figure out how to use a similar tactic to get the functionality you asked for above.

Loop Design: Counting & Subsequent Code Duplication

Exercise 3-3 in Accelerated C++ has led me to two broader questions about loop design. The exercise's challenge is to read an arbitrary number of words into a vector, then output the number of times a given word appears in that input. I've included my relevant code below:
string currentWord = words[0];
words_sz currentWordCount = 1;
// invariant: we have counted i of the current words in the vector
for (words_sz i = 1; i < size; ++i) {
if (currentWord != words[i]) {
cout << currentWord << ": " << currentWordCount << endl;
currentWord = words[i];
currentWordCount = 0;
}
++currentWordCount;
}
cout << currentWord << ": " << currentWordCount << endl;
Note that the output code has to occur again outside the loop to deal with the last word. I realize I could move it to a function and simply call the function twice if I was worried about the complexity of duplicated code.
Question 1: Is this sort of workaround common? Is there a typical way to refactor the loop to avoid such duplication?
Question 2: While my solution is straightforward, I'm used to counting from zero. Is there a more-acceptable way to write the loop respecting that? Or is this the optimal implementation?
Why not use a map (http://www.cplusplus.com/reference/stl/map/) with the word as the key and the count as the value?
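A minimal sketch of the map approach, assuming the words are already in a vector as in the question; the function name countWords is hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: tally how many times each word occurs.
std::map<std::string, std::size_t> countWords(const std::vector<std::string>& words) {
    std::map<std::string, std::size_t> counts;
    for (const std::string& word : words) {
        // operator[] inserts a zero count the first time a word is seen.
        ++counts[word];
    }
    return counts;
}
```

Unlike the loop in the question, which counts runs of adjacent equal words, this does not need the input to be sorted, and the output can be written in a single loop over the map, so the duplicated print statement disappears.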