Generating random text based on a regex in C++

I just started learning to code, starting with C++ yesterday. I need it for a project I'm doing, and I like to build generation tools as an "onboarding" process when I learn a new skill. So I thought I'd try building a regex generation tool.
I googled, I binged, and I looked through the similar questions, and only saw answers pertaining to Ruby, Perl, or JS. Frankly, given the utility and prevalence of C++, I'm a bit surprised that not more people have tried this.
I don't know how to go about the task, as I'm not a professional or really knowledgeable about what I'm doing. I'm not sure how to ask such questions, either. Please bear with me while I explain my current thoughts.
I am currently toying around with generating strings using byte arrays (I find the C++ type system and casting confusing at times). I wanted to see if there were any specific ranges of random values that produce strings with Latin characters more often than others. I get a lot of different values, and found a few ranges that looked like sweet spots, but I ultimately don't know which numbers correlate to which characters.
I wanted to establish a pattern, then set the rand() ranges to correlate with the projected total byte value of what the pattern should generate as a string, then go fishing. I understand that I have to account for upper bounds for characters. So the generated values would be something like:
//not implemented
int getBoundary(string expression){
srand(time(0));
int boundaries[2] = {0};
boundaries[0] = getCeilingValue(expression)
boundaries[1] = getFloorValue(expression)
return boundaries
}
practice.cpp
/*
Method actually producing the byte strings
*/
void practice::stuub(int boundaries[2]){
srand(time(0)); //seed
basic_string<char> byteArray = {}; //"byte array" instantiation
for (int i = 0; i < 1000; i += 1) {
if(i % 2 ==0){
byteArray.push_back(rand() % boundaries[0]);//ceiling
}else{
byteArray.push_back(rand() % boundaries[1]);//floor
}
}
std::string s(byteArray); //basic_string<char> already is std::string; sizeof() gives the object size, not the length
cout << s << "\n";
}
/*
just a copy pasta validation function that I don't know if I need yet
*/
bool isNumeric(string str) {
for (int i = 0; i < str.length(); i++)
if (isdigit(str[i]) == false)
return false; //when one non numeric value is found, return false
return true;
}
/*
current putzing around. It's just been real fun to play around with,
but I plan to replace the instantiation of values of the "mod" array with
the upper/lower bounds of the string projected values This currently takes
a value and just does random stuff to it on a fishing expedition to see
if I can find any patterns.
*/
void practice::randomStringGen() {
try {
srand(time(0));
int mod[2] = {0};
string choice;
while (choice != "q") {
cout << "\n enter an integer to generate random byte strings or press (q) to quit \n";
cin >> choice;
if(choice != "q") {// make sure its not quit, otherwise it still carries out the tasks
if (isNumeric(choice)) {//make sure its numeric
mod[0] = stoi(choice);
if(mod[0] > 0) {//make sure its not 0
mod[0] = int(pow(mod[0], mod[0]));//do some weirdo math
mod[1] = rand() % mod[0]+1; //get another weirdo number
cout << "\n random string start:\n";
stuub(mod);//generate random string
cout << "\n :random string end\n";
}else{//user entered invalid integer
cout << "\n you did not enter a valid integer. Enter numbers greater than 0";
}
}else{
cout << "\n " << choice << " is not an integer";
}
}
}
}catch(std::exception& e){
cout << e.what();
}
}
I hope that provides enough explanation of what I am trying to accomplish.
I'm not any sort of pro, and I have very little understanding of what I'm doing.
I picked this up yesterday as an absolute beginner.
Talk to me like I'm 5 if you can.
Also, any recommendations on how to improve and "discretize" what I'm currently doing would be much appreciated. I think the nested "ifs" look wonky, but that's just a gut instinct.
Thanks!
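One possible starting point, sketched here rather than offered as a full answer: a char in C++ is just a small integer (in ASCII, 'a' is 97 and 'z' is 122), so instead of fishing for ranges with rand() % n you can pick characters from an explicit alphabet. That is roughly the building block a generator for a character class such as [a-z] would need. The helper name random_from_alphabet and its parameters are made up for this illustration, and this is not a regex engine.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper (not from the question): returns `length` characters,
// each drawn uniformly at random from `alphabet`.
std::string random_from_alphabet(const std::string& alphabet,
                                 std::size_t length,
                                 std::mt19937& rng) {
    std::string out;
    if (alphabet.empty()) {
        return out; // nothing to pick from
    }
    // Pick an index into the alphabet instead of a raw byte value.
    std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
    for (std::size_t i = 0; i < length; ++i) {
        out.push_back(alphabet[pick(rng)]);
    }
    return out;
}
```

For example, seeding std::mt19937 once with std::random_device{}() and calling random_from_alphabet("abcdefghijklmnopqrstuvwxyz", 8, rng) should give eight lowercase letters. Seeding once, instead of calling srand(time(0)) in several functions, avoids restarting the sequence.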

Related

Why does my function not switch the first character with the last one of my string?

I picked up a challenge on r/dailyprogrammer on reddit which wants me to match a necklace and put the last letter at the beginning of a string. I've considered using nested for loops for this but this has made me really confused.
Instead I chose to replace the first character with the last one in an if-statement. But I am not getting my desired output with it, though I've tried everything that comes to mind.
I used even std::swap() which didn't lead me to success either.
Here's the code:
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
string same_necklace(string& sInput, string& sOutput)
{
for (string::size_type i = 0; i < sInput.size(); i++)
{
if (sInput[i] == sInput.size())
{
sInput[0] = sInput[sInput.size()];
}
}
for (string::size_type j = 0; j < sOutput.size(); j++)
{
if (sOutput[j] == sOutput.size() - 1)
{
sOutput[0] = sOutput[sOutput.size()];
}
}
return sInput, sOutput;
}
int main()
{
system("color 2");
string sName{ "" };
string sExpectedOutput{ "" };
cout << "Enter a name: ";
cin >> sName;
cout << "Enter expected output: ";
cin >> sExpectedOutput;
cout << "Result: " << same_necklace(sName , sExpectedOutput) << endl;
return 0;
}
And of course the link to my challenge (don't worry, it's just Reddit!):
https://www.reddit.com/r/dailyprogrammer/comments/ffxabb/20200309_challenge_383_easy_necklace_matching/
While I am waiting (hopefully) for a nice response, I will keep on trying to solve my problem.

In your if you compare the character at the current index (inside the loop) with the size of the string. Those are two unrelated things.
Also, you use a loop though you only want to do something on a single, previously known index.
for (string::size_type i = 0; i < sInput.size(); i++)
{
if (sInput[i] == sInput.size())
{
sInput[0] = sInput[sInput.size()];
}
}
You could change the if condition like this to achieve your goal:
if (i == sInput.size()-1) /* size as the index is one too high to be legal */
But what is sufficient and more elegant is to drop the if and the loop completely:
/* no loop for (string::size_type i = 0; i < sInput.size(); i++)
{ */
/* no if (sInput[i] == sInput.size())
{*/
sInput[0] = sInput[sInput.size()-1]; /* fix the index*/
/* }
} */
I.e.
sInput[0] = sInput[sInput.size()-1]; /* fix the index*/
Same for the output, though there you already have the correct index value (size() - 1) in the if condition.
This is not intended to solve the challenge which you linked externally,
if you want that you need to describe the challenge completely and directly here.
I.e. this only fixes your code, according to the description you provide here in the body of your question,
"put the last letter at the beginning of a string".
It does not "switch" or swap first and last. If you want that please find the code you recently wrote (surely, during your quest for learning programming) which swaps the value of two variables. Adapt that code to the two indexes (first and last, 0 and size-1) and it will do the swapping.
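As a rough sketch of the swap described above, here is the classic three-step exchange through a temporary, applied to indexes 0 and size-1. The function name swap_first_and_last is invented for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: exchanges the first and last characters in place.
void swap_first_and_last(std::string& text) {
    if (text.size() < 2) {
        return; // zero or one character: nothing to swap
    }
    const std::size_t last = text.size() - 1; // last legal index
    const char temp = text[0];                // remember the first character
    text[0] = text[last];                     // last goes to the front
    text[last] = temp;                        // old first goes to the back
}
```

Calling it on "nose" should leave "eosn" in the string. std::swap(text.front(), text.back()) from <utility> does the same exchange in one line.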
So much for the loops and ifs, but there is more wrong in your code.
This
return sInput, sOutput;
does not do what you expect. Read up on the , operator, the comma-operator.
Its result is the second of the two expressions, while the first one is only evaluated for its side effects.
This means that this
cout << "Result: " << same_necklace(sName , sExpectedOutput) << endl;
will only output the modified sExpectedOutput.
If you want to output both, the modified input and the modified output, then you can simply
cout << "Result: " << sName << " " << sExpectedOutput << endl;
because both have been given as reference to the function and hence both contain the changes the function made.
This also might not answer the challenge, but it explains your misunderstandings and you will be able to adapt to the challenge now.
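To make the comma-operator behaviour concrete, here is a small illustration. Both function names are invented for this example. The first shows that only the right operand becomes the result; the second is one possible way to really return two strings, using std::pair.

```cpp
#include <string>
#include <utility>

// The left operand (++counter) is evaluated for its side effect only;
// the value of the whole comma expression is the right operand, 42.
int comma_demo(int& counter) {
    return ++counter, 42;
}

// Hypothetical alternative: return both strings together as a pair.
std::pair<std::string, std::string> both_strings(const std::string& first,
                                                 const std::string& second) {
    return {first, second};
}
```

With int counter = 0, a call to comma_demo(counter) should return 42 and leave counter at 1. The caller of both_strings can read the two results through .first and .second.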

I guess you have not understood the problem.
Here you need to compare two strings that can be made from the characters of a necklace.
Let's say your necklace is the four-letter word nose.
The possible combinations are:
1) nose
2) osen
3) seno
4) enos
Your function (same_necklace) should be able to tell that these strings belong to the same necklace.
If you give any two of these strings as inputs to same_necklace,
it should return true.
If you give one input string from the group above and a second input string that is some other word not belonging to that group, your function should return false.
In that sense, you just take your first string as the necklace string and compare the other string with all possible rotations of the first string.
Just move the first letter of the first input string to the end, and then compare each resulting string to the second input string.
Below is a function you can use for the rotation step:
void swap_character(string &test)
{
int length = test.length();
test.insert(length, 1, test[0]);
test.erase(0, 1);
}
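Putting the description above together, a minimal sketch of the comparison could look like this. It reuses swap_character as the rotation step. The bool return type and the decision to treat two empty strings as matching are assumptions made for this example, not something the challenge text here specifies.

```cpp
#include <cstddef>
#include <string>

// Rotation step from the answer: move the first character to the end.
void swap_character(std::string& test) {
    test.insert(test.length(), 1, test[0]);
    test.erase(0, 1);
}

// Returns true if `second` is some rotation of `first`.
// `first` is taken by value so the caller's string is not modified.
bool same_necklace(std::string first, const std::string& second) {
    if (first.length() != second.length()) {
        return false; // different lengths can never match
    }
    if (first.empty()) {
        return true; // assumption: two empty strings count as the same necklace
    }
    // Try every rotation of `first` exactly once.
    for (std::size_t i = 0; i < first.length(); ++i) {
        if (first == second) {
            return true;
        }
        swap_character(first);
    }
    return false;
}
```

With this, same_necklace("nose", "osen") should be true and same_necklace("nose", "noes") should be false.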

C++ string number of occurrences

This is my first time asking something on Stack Overflow, so I'm sorry if I fail in any aspect of building the topic etc.
I'm a newbie at C++, still at the beginning. I'm using a guide someone recommended to me, and I'm stuck on an exercise about char and strings.
It's the following: they ask me to create a function that reports the number of times a certain word is repeated in a string.
I'll leave my code below for anyone who can help me. If possible, don't give me an obvious response like the code that I can then just copy and paste. If you can just give me some hints on how to do it, I want to try to solve it on my own. Have a good night, everyone.
#include <iostream>
#include <string.h>
#define MAX 50
using namespace std;
int times_occ(string s, string k) {
int count = 0, i = 0;
char word[sizeof(s)];
// while (s[i] == k[i])
// {
// i++;
// if (s[i] == '\0')
// {
// break;
// }
// }
for (i = 0; i <= sizeof(s); i++) {
if (s[i] == ' ' || s[i] == '\0') {
break;
}
word[i] = s[i];
}
word[i] = '\0';
for (i = 0; i <= sizeof(k); i++) {
if (word) {
if (k[i] == word[a]) {
a++;
count++;
}
}
}
cout << word << endl;
cout << count << endl; // this was supposed to count the number of times
// certain word was said in a string.
return count;
}
int main() {
char phrase[MAX];
char phrase1[MAX];
cin.getline(phrase, MAX);
cin.getline(phrase, MAX);
times_occ(phrase, phrase1);
}

Okay, first of all, the way you've used sizeof isn't really valid.
sizeof won't tell you the length of a string. For that, you want std::string::size() instead.
In this case, std::string is an object of some class, and sizeof will tell you the size of an object of that class. Every object of that type will yield the same size, regardless of the length of the string.
For example, consider code like this:
std::string foo("123456789");
std::string bar("12345");
std::cout << sizeof(foo) << "\t" << foo.size() << "\n";
std::cout << sizeof(bar) << "\t" << bar.size() << "\n";
When I run this, I get output like this:
8 9
8 5
So on this implementation, sizeof(string) is always 8, but some_string.size() tells us the actual length of the string.
So, that should at least be enough to get you started moving in a useful direction.

As @JerryCoffin mentioned, your word array has an invalid size. But - I want to make a more fundamental point:
Your code has two loops and a bunch of variables with arbitrary names. How should I know what the difference between s and k is? I even get k and i mixed up, in the sense of forgetting that k is a string, not an integer. That kind of code is difficult to read and to debug. And we are a bit lazy and don't like debugging other people's code...
I suggest that you:
Have a very clear idea what your loops do, or what the different parts of your function do.
Create small self-contained functions - no more than one loop each please! - for each of those parts.
Use meaningful names for each function's parameters and for the local variables.
And then, if your program doesn't work - try debugging one function at a time.
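As one illustration of that advice (a single loop, descriptive names), here is a sketch of a small counting function built on std::string::find. The name count_occurrences is invented here. It counts non-overlapping substring matches, so it would also count "the" inside "other"; matching whole words only is left as an exercise.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts non-overlapping occurrences of `word` in `text`.
int count_occurrences(const std::string& text, const std::string& word) {
    if (word.empty()) {
        return 0; // an empty word would match everywhere; treat as no matches
    }
    int count = 0;
    std::size_t position = text.find(word);
    while (position != std::string::npos) {
        ++count;
        // Continue searching just past the match we found.
        position = text.find(word, position + word.size());
    }
    return count;
}
```

For instance, count_occurrences("the cat and the hat", "the") should give 2.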

Issue Comparing strings for an Answer Key (C++)

I'm working on a midterm project for my coding class, and while I've gotten the majority of kinks worked out, I'm struggling with comparing two string values and determining whether they are equal or not. The strings in question are ANSWERKEY and studentAnswers. The former is a constant that the latter is compared to.
The code in question is as follows.
if (studentAnswers == ANSWERKEY)
{
percentScore = 100.0;
cout << "Score: " << percentScore << " % " << 'A' << endl;
}
else if (studentAnswers != ANSWERKEY)
{
int count = 0;
double answerCount = 0.0;
while (count < ANSWERKEY.length())
{
if (studentAnswers.substr(count, count+1) == ANSWERKEY.substr(count, count+1))
{
answerCount++;
count++;
}
else
{
cout << "Incorrect answer." << endl;
count++;
}
}
percentScore = ((answerCount) / (double)ANSWERKEY.length()) * 100.0;
cout << "Percent score is " << percentScore << "%" << endl;
}
The exact issue I'm facing is that I can't work out a better way to compare the strings. With the current method, the output is the following:
The intro to the code runs fine. Only when I get to checking the answers against the key, in this case "abcdefabcdefabcdefab", do I run into issues. Regardless of what characters are changed, the program marks roughly half of all characters as mismatching and drops the score down because of it.
I've thought of using a pair of arrays, but then I can't find a solution to setting up the array when some values of it are empty. If the student's answers are too short, e.g. only 15 characters long, I don't know how to compare the blank space, or even store it in the array.
Thank you for any help you can give.

First:
if (studentAnswers == ANSWERKEY)
{...}
else if (studentAnswers != ANSWERKEY)
{ ...}
looks like overkill when comparing strings. And where is the else part?
Second, this is risky. Read up on IEEE 754 and articles about cancellation, or even search SO:
double answerCount = 0.0;
...
answerCount++
Third:
You are checking character by character using substr. To me it feels like using a hammer to kill a bacterium.
studentAnswers.substr(count, count+1) == ANSWERKEY.substr(count, count+1)
Fourth:
What if studentAnswers is shorter than ANSWERKEY ?
Conclusion:
You need to clarify inputs/expected outputs and use the debugger to better understand what is happening during execution. Carefully check all your variables at each step of your program.
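To illustrate the third and fourth points, here is a sketch that compares single characters by index and stops at the shorter string. Note that the second argument of std::string::substr is a length, not an end position, so substr(count, count+1) returns longer and longer pieces as count grows. The function name percent_score and the choice to count missing answers as wrong are assumptions made for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: percentage of positions where the student's answer
// matches the key. Answers missing from a too-short input count as wrong.
double percent_score(const std::string& student_answers,
                     const std::string& answer_key) {
    if (answer_key.empty()) {
        return 0.0; // avoid dividing by zero
    }
    // Only compare positions that exist in both strings.
    const std::size_t compared =
        std::min(student_answers.size(), answer_key.size());
    int correct = 0; // an integer counter is enough here
    for (std::size_t i = 0; i < compared; ++i) {
        if (student_answers[i] == answer_key[i]) {
            ++correct;
        }
    }
    return 100.0 * correct / answer_key.size();
}
```

With the key "abcd", the input "abcf" should score 75 and the too-short input "ab" should score 50.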

Confusion about correct design of a program C++

I made a small program that generates primes and lets the user check a number and see if it's a prime or not. Problem is, I'm not sure how to properly design it. This is the program:
#include <iostream>
#include <vector>
typedef unsigned long long bigint;
std::vector<bool> sieve(size_t size)
{
std::vector<bool> primelist(size);
primelist[0] = false;
primelist[1] = false;
for (bigint i = 2; i < size; ++i) { primelist[i] = true; }
for (bigint i = 2; i * i < size; ++i)
{
if (primelist[i])
{
for (bigint j = i; j * i < size; ++j)
primelist[i*j] = false;
}
}
return primelist;
}
int main()
{
bigint range;
bigint number;
std::vector<bool> primes;
std::cout << "Enter range: " << std::endl;
std::cin >> range;
primes = sieve(range);
while (1)
{
std::cout << "Enter number to check: " << std::endl;
std::cin >> number;
if (primes[number])
std::cout << "Prime" << std::endl;
else
std::cout << "Not prime" << std::endl;
}
return 0;
}
The basic flow I want to achieve is: Input range, /handle input/, input number to check, /handle input/
I also want to give the user an option to change the range at any given time, by writing a command like "change range number"
I have a few problems with this:
I want the program to be under control if the user inputs a range bigger than unsigned long long, and if the user basically exceeds any limit(like for example if the range he input was 100 then if he checks for 101) an exception will be caught. I know this needs to be implemented using try/catch/throw, but I have no idea how to do that while keeping the option to change the range and without making my code spaghetti code.
Also, I want the errors to be of enum type(I read that enums are good for exceptions), something like
enum errors
{
OUT_OF_RANGE = 1, //Out of the range specified by the user
INCORRECT_VALUE, //If user input "one" instead of 1
RANGE_SIGNED, //If user inputs a signed value for range
NUM_LIMITS //Number exceeds unsigned long long
};
I have no idea how to use exception handling, not to mention using it with enums. How the hell do I keep this program safe and running, while keeping away from spaghetti code?
I am extremely confused. If someone could help me design this program correctly and maintain readability and efficiency, it will really improve my future program designs.
Thanks for reading!

You asked a lot.
You want to validate user input. Users should not be able to enter huge numbers, non-integers, and so on.
I'm going to start off by answering that this is absolutely not a scenario that exceptions should be used for. Exceptions are used to handle exceptional circumstances. These are ones you can't anticipate or really deal with.
A user enters a number that's too big? You can handle that. Tell them that their number is too big, please enter a number between 1 and X.
A user enters the word apple? You can handle that. Tell them that they can only enter integers.
One way of doing this would be to make a ValidateInput function. You can have it return a number (or an enum, they're basically the same thing) to tell you whether there was an error.
In order to do the validation, you will most likely have to receive input as an std::string and then validate it before turning it into a number. Getting input as an unsigned int or similar integral type doesn't really allow you to check for errors.
This adds a bit of work, since you need to validate the input manually. There are libraries with functions to help with this, such as boost::lexical_cast, but that's probably too much for you right now.
Below is some very basic pseudocode to illustrate what I mean. It's only meant to give you an idea of what to do; it won't compile or do the work for you. You could extend it further by making a generic function that returns a message based on an error code and so on.
enum error_code {
SUCCESS, // No error
OUT_OF_RANGE, // Out of the range specified by the user
INCORRECT_VALUE, // If user input "one" instead of 1
RANGE_SIGNED, // If user inputs a signed value for range
NUM_LIMITS // Number exceeds unsigned long long
};
// This function will check if the input is valid.
// If it's not valid, it will return an error code to explain why it's invalid.
error_code ValidateInput(const std::string& input) {
// Check if input is too large for an unsigned long long
if (InputIsTooLarge)
return NUM_LIMITS;
// Check if input is negative
if (InputIsNegative)
return RANGE_SIGNED;
// Check if input is not an integer
if (InputIsNotInteger)
return INCORRECT_VALUE;
// If we make it here, no problems were found, input is okay.
return SUCCESS;
}
unsigned long long GetInput() {
// Get the user's input
std::string input;
std::cin >> input;
// Check if the input is valid
error_code inputError = ValidateInput(input);
// If input is not valid, explain the problem to the user.
if (inputError != SUCCESS) {
if (inputError == NUM_LIMITS) {
std::cout << "That number is too big, please enter a number between "
"1 and X." << std::endl;
}
else if (inputError == RANGE_SIGNED) {
std::cout << "Please enter a positive number." << std::endl;
}
else if (inputError == INCORRECT_VALUE) {
std::cout << "Please enter an integer." << std::endl;
}
else {
std::cout << "Invalid input, please try again." << std::endl;
}
// Ask for input again
return GetInput();
}
// If ValidateInput returned SUCCESS, the input is okay.
// We can turn it into an integer and return it.
else {
return TurnStringIntoBigInt(input);
}
}
int main() {
// Get the input from the user
unsigned long long number = GetInput();
// Do something with the input
}

I like Dauphic's answer, particularly because it illustrates breaking down the problem into bits and solving them individually. I would, however, do GetInput a bit differently:
unsigned long long GetInput() {
// Get the user's input
std::string input;
error_code inputError;
// Repeatedly read input until it is valid
do {
std::cin >> input;
inputError = ValidateInput(input);
if (inputError == NUM_LIMITS) {
std::cout << "That number is too big, please enter a number between "
"1 and X." << std::endl;
}
// ...handle all other cases similarly
} while(inputError != SUCCESS);
// If ValidateInput returned SUCCESS, the input is okay.
// We can turn it into an integer and return it.
return TurnStringIntoBigInt(input);
}
The recursive solution is nice, but has the drawback of, well, being recursive and growing the stack. Probably that's not a big deal in this case, but it is something to watch out for.
As for how to write ValidateInput, basically you're going to be scanning the string for invalid characters and, if none are found, testing whether the value will fit in your chosen integer type before reading it into a variable with e.g. >>.
note: this solution has a serious flaw in that it doesn't check the state of std::cin. If the user were to pass EOF, i.e. press ^D, the program would get stuck in the loop, which is not good behavior.
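The character scan described for ValidateInput could be sketched roughly as follows. This is one possible shape, not the answerers' implementation. The extra output parameter value is an assumption added for this example so the parsed number comes out of the same pass, the checks run in a different order than in the pseudocode, and OUT_OF_RANGE is left for the caller to check against the user's chosen range.

```cpp
#include <cctype>
#include <limits>
#include <string>

enum error_code { SUCCESS, OUT_OF_RANGE, INCORRECT_VALUE, RANGE_SIGNED, NUM_LIMITS };

// Sketch: validates `input` and, on SUCCESS, stores the parsed number in
// `value` (the out-parameter is an assumption, not part of the pseudocode).
error_code ValidateInput(const std::string& input, unsigned long long& value) {
    if (input.empty()) {
        return INCORRECT_VALUE;
    }
    if (input[0] == '-') {
        return RANGE_SIGNED; // a leading minus sign means a negative number
    }
    const unsigned long long max =
        std::numeric_limits<unsigned long long>::max();
    unsigned long long result = 0;
    for (char c : input) {
        if (!std::isdigit(static_cast<unsigned char>(c))) {
            return INCORRECT_VALUE; // e.g. "one" or "12a"
        }
        const unsigned long long digit = static_cast<unsigned long long>(c - '0');
        // Would result * 10 + digit overflow? Check before computing it.
        if (result > (max - digit) / 10) {
            return NUM_LIMITS;
        }
        result = result * 10 + digit;
    }
    value = result;
    return SUCCESS;
}
```

A caller would still need to check the state of std::cin after reading the string, as the note above points out.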

Instead of a vector of bool, you'd better use a bitset.
With that, you can use the sieve of Eratosthenes to determine whether a number is prime or not.

Parse int and string

Hi, I'm not sure if this is the right place to ask this question.
Anyway, I have written this code to parse a molecular formula and split it into atoms and the amount of each atom.
For instance, if I input "H2O", I will get {"H", "O"} in the atom array and {2, 1} in the amount array. I haven't accounted for amounts larger than 9, since I don't think there are molecules which can bind to an amount larger than 8.
Anyway, I'm quite a newbie, so I wonder if this piece of code can be made better?
string formula = "H2O";
int no, k = 0, a = 0;
string atom[10];
int amount[10];
bool flag = true;
stringstream ss(formula);
for(int i = 0; i < formula.size(); ++i)
{
no = atoi(&formula[i]);
if(no == 0 && (flag || islower(formula[i]) ) )
{
cout << "k = " << k << endl;
atom[k] += formula[i];
flag = false;
cout << "FOO1 " << atom[k] << endl;
amount[a] = 1;
}
else if(no != 0)
{
amount[a] = no;
cout << "FOO2 " << amount[a] << endl;
a++;
flag = true;
k++;
}
else
{
k++;
a++;
atom[k] = formula[i];
cout << "FOO3 " << atom[k] << endl;
amount[a] = 1;
flag = false;
}
cout << no << endl;
}

Have you considered an approach with regular expressions? Do you have access to Boost or TR1 regular expressions? An individual atom and its count can easily be represented as:
(after edits based on comments)
([A-Z][a-z]{0,2})([0-9]*)
Then you just need to repeatedly find this pattern in your input string and extract the different parts.
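A sketch of that repeated matching, assuming C++11 std::regex is available in place of Boost or TR1. The function name parse_formula_regex and the vector-of-pairs return type are choices made for this example. A missing count is treated as 1, and std::stoi would throw on an absurdly long digit run.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits a formula such as "H2O" into (symbol, count) pairs.
std::vector<std::pair<std::string, int>> parse_formula_regex(const std::string& formula) {
    // Group 1: element symbol, group 2: optional count.
    static const std::regex atom_pattern("([A-Z][a-z]{0,2})([0-9]*)");
    std::vector<std::pair<std::string, int>> atoms;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(formula.begin(), formula.end(), atom_pattern);
         it != end; ++it) {
        const std::string symbol = (*it)[1].str();
        const std::string count_text = (*it)[2].str();
        // No digits after the symbol means a single atom.
        const int count = count_text.empty() ? 1 : std::stoi(count_text);
        atoms.emplace_back(symbol, count);
    }
    return atoms;
}
```

For "H2O" this should yield ("H", 2) followed by ("O", 1). Counts with several digits, as in "C6H12O6", are handled too.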

There are many potential improvements that could be made, of course. But as a newbie, I guess you only want the immediate ones. The first improvement is to change this from a program that has a hard-coded formula to a program that reads a formula from the user. Then try testing your program by inputting different formulae, and check that the output is correct.

What if you modified it to be like this algorithm? This would maybe be less code, but would definitely be more clear:
// while not at end of input
// gather an uppercase letter
// gather any lowercase letters
// gather any numbers
// set the element in your array
This could be implemented with 3 very simple loops inside of your main loop, and would make your intentions to future maintainers much more obvious.
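The outline above might be written like this, with one small loop per step. The function name parse_formula and the return type are invented for this example, and skipping characters that do not start an element symbol is an assumption about how to treat bad input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: hand-written parser following the outline above.
std::vector<std::pair<std::string, int>> parse_formula(const std::string& formula) {
    std::vector<std::pair<std::string, int>> atoms;
    std::size_t i = 0;
    while (i < formula.size()) {
        // Assumption: silently skip anything that is not an uppercase letter.
        if (!std::isupper(static_cast<unsigned char>(formula[i]))) {
            ++i;
            continue;
        }
        // Gather an uppercase letter.
        std::string symbol(1, formula[i++]);
        // Gather any lowercase letters.
        while (i < formula.size() &&
               std::islower(static_cast<unsigned char>(formula[i]))) {
            symbol += formula[i++];
        }
        // Gather any digits; no digits means a count of 1.
        int count = 0;
        bool has_digits = false;
        while (i < formula.size() &&
               std::isdigit(static_cast<unsigned char>(formula[i]))) {
            count = count * 10 + (formula[i++] - '0');
            has_digits = true;
        }
        // Set the element in the result.
        atoms.emplace_back(symbol, has_digits ? count : 1);
    }
    return atoms;
}
```

Returning a vector instead of filling two fixed arrays of size 10 removes the limit on the number of atoms and keeps each symbol next to its count.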
