Vertical Bar Graphs in C++ - c++

OK, I'm trying to make a vertical bar graph from the values in a file. The code below works up to a point: it prints horizontally, but with one asterisk per line, which (obviously) leaves gaps. I'm not looking for a spoon-fed answer, just a push in the right direction.
using namespace std;
int main()
{
int counter;
cout<<"Please enter a number"<< "\n";
counter=0;
char *fname = "C:/Users/Jordan Moffat/Desktop/coursework/problem2.txt";
int x;
ifstream infile(fname);
while (infile >> x)
{
if (x==0 && x<=10){
cout<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>=10 && x<=20){
cout<<"\t"<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>=20 && x<=30){
cout<<"\t"<<"\t"<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>=30 && x<=40){
cout<<"\t"<<"\t"<<"\t"<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>= 40 && x<=50){
cout<<"\t"<<"\t"<<"\t"<<"\t"<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>=50 && x<=60){
cout<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>=60 && x<=70){
cout<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"*"<<"\t"<<"\t"<<"\t"<<"\n";
}
else if (x>=70 && x<=80){
cout<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"*"<<"\t"<<"\t"<<"\n";
}
else if (x>=80 && x<=90){
cout<<"*"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"*"<<"\t"<<"\n";
}
else if (x>=90 && x<=100){
cout<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"\t"<<"*"<<"\n";
}
}
cout<<"====================================================================================="<< "\n";
cout<<"0-9"<<"10-19"<<"20-29"<<"30-39"<<"40-49"<<"50-59"<<"60-69"<<"70-79"<<"80-89"<<"90-100"<<"\n";
system("PAUSE");
}

You have two problems. Apparently you want to build a histogram, and you want to visualize this histogram.
Histogram
One approach to build the histogram requires you to pre-specify the number of bins (homogeneous width), the minimum value (inclusive) and the maximum value (non-inclusive). Then you can compute the index of the bin each item should be assigned to.
Here's an (untested) example:
// needs <cmath> (floor) and <vector>
const int nbins = 10;
const double minval = .0, maxval = 100.;
std::vector<int> bins(nbins, 0);
for (double x; infile >> x; ) {
    if (x >= minval && x < maxval) {
        // floor is used because integer conversion rounds towards zero, not towards -inf
        int idx = floor((x-minval)/(maxval-minval)*nbins);
        bins[idx]++;
    }
    else {
        // handle outlier
    }
}
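As a self-contained variant of the binning step above, here is a minimal sketch that takes the values from a vector instead of a file. The helper name make_histogram and the decision to silently skip outliers are assumptions made for illustration, not part of the original answer.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: count how many values fall into each of `nbins`
// equal-width bins covering [minval, maxval). Values outside are skipped.
std::vector<int> make_histogram(const std::vector<double>& values,
                                int nbins, double minval, double maxval)
{
    std::vector<int> bins(nbins, 0);
    for (double x : values) {
        if (x < minval || x >= maxval)
            continue;                       // outlier: ignored in this sketch
        int idx = static_cast<int>(
            std::floor((x - minval) / (maxval - minval) * nbins));
        if (idx >= nbins)
            idx = nbins - 1;                // guard against floating-point round-up
        ++bins[idx];
    }
    return bins;
}
```

With 10 bins over [0, 100), a value of 55 lands in bins[5], while 100 and -1 are treated as outliers.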
Visualization
The approach described in this answer seems appropriate. For large bin counts you may need some normalization procedure, i.e. scaling the values to a range of [0,10] or similar.
Have a look at this (untested) example:
const int chart_height = 10;
// needs <algorithm>; assumes at least one bin is non-empty, so max_count > 0
const int max_count = *std::max_element(bins.begin(), bins.end());
for (int current_height = chart_height; current_height > 0; --current_height) {
    for (int count : bins) {
        const int bar_height = (count*chart_height)/max_count;
        // every branch prints exactly five characters so the columns stay aligned
        if (bar_height < current_height)
            std::cout << "     "; // we're still above the bar
        else if (bar_height == current_height)
            std::cout << "  _  "; // reached the top of the bar
        else // bar_height > current_height
            std::cout << " | | "; // now the rest of the bar ...
    }
    std::cout << '\n';
}
With a little bit of fiddling and formatting magic you can also extend it to produce a borderline flexible visualization like this:
11 | _______ _______
| | | | |
| | | | |
| | | | |
| | | | | _______
5 | | | | | | |
| | | | | | |
| | | | | | | _______
| _______ | | | | | | _______ | |
| | | | | | | | | | | | |
+------v----------v----------v----------v----------v----------v-----
3.7 - 4.3 4.3 - 4.9 4.9 - 5.6 5.6 - 6.2 6.2 - 6.8 6.8 - 7.4

To draw your bars vertically you need to:
1. get all numbers into an array
2. determine the range, i.e. compute the max and min value of the array
3. loop over the range, printing rows and leaving spaces in the columns whose values are lower than the row currently being drawn.
Here I assume steps 1 & 2 are done and just show the loop, glossing over some detail (note that the code doesn't use min and loops from 0):
int values[] = {2,5,1,9,3}, cols = 5, max = 9;
for (int r = 0; r < max; ++r) {
for (int c = 0; c < cols; ++c)
cout << (r + values[c] >= max ? '*' : ' ');
cout << endl;
}
Here is the output:
   *
   *
   *
   *
 * *
 * *
 * **
** **
*****

You should read your data into a std::vector.
Use two nested loops:
The outer loop goes over the lines you print, top line first, so that the bottom line corresponds to "0->10", the one above it to "10->20", etc.
The inner loop goes over the vector: if the value is smaller than (linecount-linenumber)*10, print " ", else print "*".
If your data goes from 0 to 100, linecount should be 10.
linenumber is the loop variable of the outer loop.
It is not clear to me how your data is organized in the file. If your data file doesn't contain values which say how many *s each column should have, you should calculate that first.
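The two nested loops could be sketched roughly as follows. The function name render_bars and the step parameter are made up for this example, and it assumes a column earns a star on a row once its value is at least (linecount - linenumber) * step; rows are returned as strings rather than printed, so they are easy to inspect.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: one text row per line, top row first.
// A column gets '*' on a row when its value reaches that row's threshold.
std::vector<std::string> render_bars(const std::vector<int>& values,
                                     int linecount, int step)
{
    std::vector<std::string> rows;
    for (int linenumber = 0; linenumber < linecount; ++linenumber) {
        const int threshold = (linecount - linenumber) * step;
        std::string row;
        for (int v : values)
            row += (v >= threshold ? '*' : ' ');
        rows.push_back(row);
    }
    return rows;
}
```

For data between 0 and 100 you would call render_bars(values, 10, 10) and print each returned row followed by a newline.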

Just having fun and practicing :)
Enter any sequence of up to 100 numbers, then enter 0 to stop and draw the graph :)
#include <iostream>
#include <limits>
using namespace std;
int main()
{
const int MAX = 100;
int values[MAX];
int input_number;
int total_number =0;
int largest_number = 0;
for (int i = 0; i < MAX; i++)
{
cin >> input_number;
if (input_number != 0)
{
total_number++;
values[i] = input_number;
}
else if (input_number == 0){
for (int t = 0;t<total_number;t++){
if(values[t]>largest_number)
largest_number = values[t];
}
for (int j = 0; j <largest_number; ++j){
for (int i = 0; i <total_number; ++i)
cout << (j+values[i] >= largest_number ? '*' : ' ') ;
cout << endl;
}
break;
}
}
system ("PAUSE");
return 0; // everything ok
}

I guess a single loop can replace the if conditions you have written.
And I agree with Kleist: loop over the values and print * for each one.

(Similar to the answer of Kleist)
1. Make an array to define the y-axis (or a formula, based on an index).
2. Read the numbers from the file and save them in a container, so you also know the number of values (x-axis).
3. Use a double loop, one for the y-axis and one for the x-axis, and decide whether an asterisk must be printed, based on step 1.
4. Let the y-axis counter decrement, so your bars rise from the bottom.

Related

Create and print 2D board functions, using specific seed value with no repetitions

Before you read ahead or try to help, this question is regarding my homework so the requirements to this question will be very specific.
I am writing a code that has 2 functions. The first function creates or initializes a 5*5 matrix for an array with numbers from 1 - 25 in random positions.
The second function prints it. However, I am required to use a seed value of 233 in the srand() function which I am unsure of how to use, despite constantly searching for it online. Anyway, the printout should look something like this:
--------------------------
| 4 | 5 | 10 | 21 | 22 |
--------------------------
| 1| 11 | 3 | 19 | 20 |
--------------------------
| 24 | 18 | 16 | 14 | 9|
--------------------------
| 17 | 7 | 23 | 15 | 6|
--------------------------
| 2 | 12 | 13 | 25 | 8 |
--------------------------
The first and most easily explainable issue that I have is that all my display function is doing is printing all the values in a straight line and not in the format that I want it to be.
The other part is that when I use srand(time(233)), it gives me an error, and I'm not sure why, even though the seed is required for my assignment.
The second issue is that some of the numbers start reoccurring in the matrix and they are not supposed to, is there a way to make sure there are no duplicates in the matrix?
Although this is in the C++ language, what I have learned is the C style syntax (no std:: kinds of code or stuff like that). So far I have learned basic arrays, loops, and functions.
#include <iostream>
#include <ctime>
using namespace std;
const int ROW_SIZE = 5;
const int COLUMN_SIZE = 5;
void createBoard(int matrix[][5]);
void display(int matrix[][5]);
int main()
{
srand(time(233)); //the seed value error
int matrix[5][5];
createBoard(matrix);
display(matrix);
}
void createBoard(int matrix[][5])
{
for (int i = 0; i < ROW_SIZE; i++)
{
for (int j = 0; j < COLUMN_SIZE; j++)
{
matrix[i][j] = 1 + rand() % 25;
}
}
}
void display(int matrix[][5])
{
cout << "--------------------------" << endl;
for (int i = 0; i < ROW_SIZE; i++)
{
for (int j = 0; j < COLUMN_SIZE; j++)
{
cout << "| " << matrix[i][j];
}
}
cout << "--------------------------" << endl;
}
Assuming the function time is a requirement, it receives the address of a time_t variable so you need something like:
time_t t = 233;
srand(time(&t));
Though the function will just replace the value of t, so, there is that.
If not, as suggested by molbdnilo, you can use srand(233) (which is probably what is being requested), but know that this will generate the same sequence on every run.
As for the repeated values in the array, one possible strategy is to walk back through the array each time you generate a number and, as soon as you find a repetition, stop and generate a new one, repeating until no equal number is found. There are better methods and algorithms, though.
Since you are not to use std:: kinds of code or stuff, as you so eloquently put it, here is a C post that may help:
Unique random number generation in an integer array
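One of the better-known alternatives to the retry strategy is a Fisher-Yates shuffle: fill an array with 1..25 once and swap the elements into random order, so duplicates cannot occur. The sketch below is only an illustration; the names fillUnique and BOARD_CELLS are invented here, and the code uses the std:: prefix only because <cstdlib> is not guaranteed to declare rand and srand without it.

```cpp
#include <cstdlib>

const int BOARD_CELLS = 25;  // 5 x 5 board, flattened (assumed size)

// Fill `cells` with 1..25 and shuffle them in place (Fisher-Yates),
// so every number appears exactly once.
void fillUnique(int cells[], unsigned seed)
{
    std::srand(seed);                    // e.g. the required seed 233
    for (int i = 0; i < BOARD_CELLS; i++)
        cells[i] = i + 1;
    for (int i = BOARD_CELLS - 1; i > 0; i--) {
        int j = std::rand() % (i + 1);   // pick a slot in [0, i]
        int tmp = cells[i];
        cells[i] = cells[j];
        cells[j] = tmp;
    }
}
```

After fillUnique(cells, 233), the board can be filled with matrix[i][j] = cells[i * 5 + j]. Because the seed is fixed, every run produces the same arrangement.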
The array print formatting issue is just a matter of printing the separator lines in the correct position. To keep the spacing consistent, you should use setw() from the <iomanip> library:
#include <cstdio>  // puts
#include <iomanip> // setw
void display(int matrix[][5])
{
cout << " +------------------------+" << endl;
for (int i = 0; i < ROW_SIZE; i++)
{
for (int j = 0; j < COLUMN_SIZE; j++)
{
cout << " | " << setw(2) << matrix[i][j]; // set spacing
}
puts(" |\n +------------------------+");
}
}
Output:
+------------------------+
| 16 | 25 | 23 | 1 | 24 |
+------------------------+
| 11 | 4 | 23 | 7 | 22 |
+------------------------+
| 21 | 23 | 12 | 6 | 15 |
+------------------------+
| 18 | 10 | 8 | 22 | 11 |
+------------------------+
| 23 | 18 | 22 | 18 | 16 |
+------------------------+
Footnote:
There are much better ways to do this that don't use rand. Even if you can't use them for your homework, they are worth a look for future reference:
https://en.cppreference.com/w/cpp/numeric/random
You can use an int array with 26 elements (so that its final index is 25) and set all of the elements to 0.
Then use a while loop to generate a number X: if it hasn't been used yet (check[X] == 0), set matrix[i][j] = X, set check[X] = 1, and break out of the while loop; if it has been used (check[X] == 1), generate another one.
As for the seed 233: time(233) doesn't compile because time() expects a pointer to a time_t, not a number. When I replace it with NULL, it runs fine :D
#include <iostream>
#include <ctime>
using namespace std;
const int ROW_SIZE = 5;
const int COLUMN_SIZE = 5;
int check[26]={0};
void createBoard(int matrix[][5]);
void display(int matrix[][5]);
int main(){
srand(time(NULL)); // seeds from the current time, so every run gives a different board
int matrix [5][5];
createBoard(matrix);
display(matrix);
}
void createBoard(int matrix[][5])
{
for (int i = 0; i < ROW_SIZE; i++)
{
for(int j = 0; j < COLUMN_SIZE; j++)
{
while (true)
{
//random number X;
int x = 1 + rand() % 25;
if(!check[x]) // If X not used;
{
matrix[i][j] = x;//add to table;
check[x]=1; //Mark that X used;
break;
}
}
}
}
}
void display(int matrix[][5]){
cout<<"--------------------------"<< endl;
for(int i = 0; i < ROW_SIZE; i++){
for(int j = 0; j < COLUMN_SIZE; j++){
cout<<"| "<< matrix[i][j];
}
}
cout<<"--------------------------"<< endl;
}
For your display function, you just have to add line endings (std::endl) at the right place:
void display(int matrix[][5]){
cout<<"--------------------------"<< endl;
for(int i = 0; i < ROW_SIZE; i++){
for(int j = 0; j < COLUMN_SIZE; j++){
cout<<"| "<< matrix[i][j] << " ";
}
cout <<"|" << endl;
}
cout<<"--------------------------"<< endl;
}
For the creation, if you use C++, you can use shuffle: http://www.cplusplus.com/reference/algorithm/shuffle/
// needs <algorithm> (shuffle), <array> and <random> (default_random_engine)
void createBoard(int matrix[][5]){
// Create an array { 1, 2 ... 25}
std::array<int,ROW_SIZE * COLUMN_SIZE> tmp;
for (int i = 0; i < ROW_SIZE * COLUMN_SIZE; i++)
{
tmp[i] = i + 1;
}
// define your seed
unsigned seed = 233;
// shuffle your array using that seed
shuffle (tmp.begin(), tmp.end(), std::default_random_engine(seed));
// store the elements in your matrix
for (int i = 0; i < ROW_SIZE; i++){
for(int j = 0; j < COLUMN_SIZE; j++){
matrix[i][j] = tmp[i * COLUMN_SIZE + j];
}
}
}
Note that if you're using C++, you can use STL containers to store your 5x5 board (like array, vector, etc.). They work with very handy standard algorithms (like shuffle).
Note also that the seed is just a number used to initialize your random generator. Setting it to 233 makes sure that two different executions of your program will always generate the same sequence of numbers (that is why, in the computer world, it is not really random but pseudo-random).

Permutations with output limitation in C++

I am trying to generate the permutations of 8 characters, but I am only interested in output that contains at most 3 occurrences of any one character. So any output in which some character occurs more than 3 times should be skipped.
Character set: a, b, c, d, e, f, g, G
Example:
Not interested in output e.g. aaaaaaab , aabcdeaa, acdGGGGg, GGGGbbbb ...
Interested in output e.g. abcdefgG, aaabcdef, abacadGf ...
I tried to write a code where I evaluate in each cycle number of occurrence of each character and skip (break/continue) to next loop if more than 3 same character occurrences are present.
Here is the problem with my code that I can't solve: the program only produces permutations starting with the character 'a' and stops at aaabgGGG, and I can't get it to continue with iterations starting with b, c, d, e, etc.
I want to filter during the cycle so that unneeded cycles never run, to make processing as fast as possible.
When I comment out the ">3 occurrences filter" code between the ##### lines, all permutations are processed correctly.
My code:
#include <iostream>
// C++ program to print all possible strings of length k
using namespace std;
int procbreak = 0;
// The main recursive method to print all possible strings of length k
void printAllKLengthRec(char set[], int setn[], string prefix, int n, int k)
{
// Base case: k is 0, print prefix
//cout << "03. In printAllKLengthRec function" << endl;
if (k == 0)
{
//print table with characters and their count
cout << (prefix) << endl;
cout << " | ";
for (size_t b = 0; b < 8; b++)
{
cout << set[b] << " | ";
}
cout << endl;
cout << " | ";
for (size_t c = 0; c < 8; c++)
{
cout << setn[c] << " | ";
}
cout << endl;
return;
}
// One by one add all characters from set and recursively call for k equals to k-1
for (int i = 0; i < n; i++)
{
cout << "04. In for loop where one by one all chars are added. K = " << k << "; I = " << i << "; N = " << n << endl;
string newPrefix;
//update characters count table
setn[i] += 1;
if (i > 0)
{
setn[i - 1] -= 1;
}
else
{
if (setn[7] > 0)
{
setn[7] -= 1;
}
}
//#############################################################################################
//check if there is any character in a table with count more than 3, then break current cycle
for (size_t d = 0; d < 8; d++)
{
if (setn[d] > 3)
{
procbreak = 1;
break; // enough to find one char with >3, then we don't need to continue and break operation
}
}
if (procbreak == 1)
{
procbreak = 0; // reset procbreak
continue; // skip to next cycle
}
//#############################################################################################
// Next character of input added
newPrefix = prefix + set[i];
// k is decreased, because we have added a new character
printAllKLengthRec(set, setn, newPrefix, n, k - 1);
}
}
void printAllKLength(char set[],int setn[], int k, int n)
{
cout << "02. In printAllKLength function" << endl;
printAllKLengthRec(set, setn, "", n, k);
}
// Main code
int main()
{
cout << "Start" << endl;
char set1[] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'G' };
int setn[] = { 0, 0, 0, 0, 0, 0, 0, 0 };
int k = 8; // string length
printAllKLength(set1, setn, k, 8); // 8 = n => number of characters in the set1
}
Where is the main mistake in my code logic?
The solution to your problem is pretty simple.
What you want to do is to take your character set: a, b, c, d, e, f, g, G
and construct a "fake" sequence with each character triplicated.
std::string perm{"GGGaaabbbcccdddeeefffggg"};
The key insight here is that you can compute your permutations as usual, e.g., using std::next_permutation. You just need to take the first 8 elements from that permutation to have the result that you need.
[Edit: In order to avoid computing permutations for the rightmost 16 values, since these will always yield duplicates for the leftmost 8 values, after each step set the rightmost 16 values to the last permutation. The next call to std::next_permutation will permute the first 8 values.]
[Edit2: Working example
#include <algorithm>
#include <chrono>
#include <iostream>
#include <iterator>
#include <string>
int main()
{
// Initial state
std::string perm{"GGGaaabbbcccdddeeefffggg"};
using clock = std::chrono::steady_clock;
auto start = clock::now();
do
{
// Output permutation
std::cout << perm.substr(0, 8) << "\n";
// Now reverse the last 16 values, so that the call to the next_permutation would change the top 8
std::reverse(std::next(perm.begin(), 8), perm.end());
} while (std::next_permutation(perm.begin(), perm.end()));
std::clog << "Elapsed: " << std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start).count() << "ms\n";
return 0;
}
]
I have found where the problem with filtering was...
The whole permutation is done by running cycles within cycles, in other words the function is calling itself.
When passing from right hand character (right most) to the left hand character (one step to the left), function is doing empty 'k' cycles (1 empty 'k' cycle when going from position 8 to 7 .... up to 7 empty 'k' cycles when going from position 2 to 1).
<-----------|
12345678
My initial code was evaluating the count of each character during each of these empty 'k' cycles.
And that was the issue.
During the empty 'k' cycles, the count of each character is changing and when the empty cycle finishes, the count of the character is real and exactly as it should be.
So the solution is to evaluate the count of each character and, if any character has a count > 3, skip only in the last cycle, when k = 1.
I was breaking out of the loop in the very first empty cycle, where the counts of the characters in the string were incorrect.
01. In for loop where one by one all chars are added. K = 1; I = 7; N = 8 <--- OK, loop when the last G was added to form string aaaaabGG
table in for loop
| a | b | c | d | e | f | g | G |
| 5 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
aaaaabGG <--- aaaaabGG was formed
table in base <--- aaaaabGG shown in the final output
| a | b | c | d | e | f | g | G |
| 5 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
02. In for loop where one by one all chars are added. K = 3; I = 2; N = 8 <--- going one character UP, next string after aaaaabGG should be aaaaacaa
table in for loop
| a | b | c | d | e | f | g | G |
| 5 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | <--- but as we can see, during the K = 3 empty loop, the string is aaaaacGG (updates only 3rd char from left)
03. In for loop where one by one all chars are added. K = 2; I = 0; N = 8 <--- second empty loop K = 2
table in for loop
| a | b | c | d | e | f | g | G |
| 6 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | <--- as we can see, during the K = 2 empty loop, the string is updating and is now aaaaacaG (now updates only 2nd char from left, 3rd is OK from previous empty loop)
04. In for loop where one by one all chars are added. K = 1; I = 0; N = 8 <--- Last loop K = 1 (string is updated 1st character in the left only, 2nd and 3rd were updated in previous empty loops respectively)
table in for loop
| a | b | c | d | e | f | g | G |
| 7 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
aaaaacaa <--- we can see that now the string is as it should be aaaaacaa
table in base <--- aaaaacaa shown in the final output
| a | b | c | d | e | f | g | G |
| 7 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
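An alternative way to keep the counts consistent at every depth, and so to prune early rather than only at k = 1, is plain backtracking: increment the count just before recursing and decrement it again right after. The sketch below is an assumption-laden illustration, not the code from the question; the names generate and limitedStrings are hypothetical, and results are collected in a vector only so they can be checked.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: append to `out` every string of length `k` over
// `alphabet` in which no character occurs more than `maxRepeat` times.
void generate(const std::string& alphabet, std::vector<int>& counts,
              std::string& prefix, std::size_t k, int maxRepeat,
              std::vector<std::string>& out)
{
    if (prefix.size() == k) {
        out.push_back(prefix);
        return;
    }
    for (std::size_t i = 0; i < alphabet.size(); ++i) {
        if (counts[i] == maxRepeat)
            continue;              // prune: one more would exceed the limit
        ++counts[i];               // choose character i
        prefix.push_back(alphabet[i]);
        generate(alphabet, counts, prefix, k, maxRepeat, out);
        prefix.pop_back();         // undo the choice before trying the next one
        --counts[i];
    }
}

std::vector<std::string> limitedStrings(const std::string& alphabet,
                                        std::size_t k, int maxRepeat)
{
    std::vector<int> counts(alphabet.size(), 0);
    std::vector<std::string> out;
    std::string prefix;
    generate(alphabet, counts, prefix, k, maxRepeat, out);
    return out;
}
```

For the full problem the call would be limitedStrings("abcdefgG", 8, 3), but the result set is very large, so printing each string in the base case instead of storing it is probably the better choice there.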

How to better understand nested loops?

My problem is that I don't understand nested loops well enough to answer this problem. I'm supposed to right-align a stack that I've made on a left alignment using nested for loops, but I can't quite figure out the conditions on the two inner ones.
Correct answer:
Height = 8
.......#
......##
.....###
....####
...#####
..######
.#######
########
My answer:
Height = 8
.......#
.......#......#
.......#......#.....#
.......#......#.....#....#
.......#......#.....#....#...#
.......#......#.....#....#...#..#
.......#......#.....#....#...#..#.#
.......#......#.....#....#...#..#.##
I've played around with it, taken it seriously, and got nowhere. I tried (k = 7, k > j, k--), (k = 0, k < n-1, k++), and k < j+7. I drew tables, and I know that the number of spaces is pretty much the height, but inverted on each line. I also know that the number of hashes plus the number of spaces should equal the height input by the user.
It's supposed to take in a value from user, but I've worked on it on a separate file with the value n being the height to simplify and work on it without the rest of the program.
#include <stdio.h>
int main(void) {
int n = 8;
for (int i = 0; i < n; i++) {
for (int j = 0; j < i; j++) {
for(int k = 7; k > j; k--) {
printf(".");
}
printf("#");
}
printf("\n");
}
}
It's actually pretty simple. Write a table with each line and how many spaces and '#' you need to print:
n == 8
| output | line | num_spaces | num_signs |
| -------- | ---- | ---------- | --------- |
| .......# | 1 | 7 | 1 |
| ......## | 2 | 6 | 2 |
| .....### | 3 | 5 | 3 |
| ....#### | 4 | 4 | 4 |
| ...##### | 5 | 3 | 5 |
| ..###### | 6 | 2 | 6 |
| .####### | 7 | 1 | 7 |
| ######## | 8 | 0 | 8 |
For line you can start from 0 or from 1 or from n and go backwards. Pick something that is the easiest. You will see that starting from 1 is the simplest in your example.
Now for each line we need to determine how many num_spaces and num_signs we print. They should depend on line and on n.
For num_spaces it's n - line and for num_signs it's line
So the code should look like this:
// for each line
for (int line = 1; line <= n; ++line)
{
// print n - line spaces
// print line # characters
// print \n
}
With loops the code will look like this:
// for each line
for (int line = 1; line <= n; ++line)
{
// print n - line spaces
for (int i = 0; i < n -line; ++i)
std::cout << ' ';
// print line # characters
for (int i = 0; i < line; ++i)
std::cout << '#';
std::cout << '\n';
}
std::cout.flush();
But that's actually not recommended. You can get rid of those inner loops. One good and easy way is to use strings:
// for each line
for (int line = 1; line <= n; ++line)
{
// print n - line spaces
std::cout << std::string(n - line, ' ');
// print line # characters
std::cout << std::string(line, '#');
std::cout << '\n';
}
std::cout.flush();
And you can go even one step further:
// for each line
for (int line = 1; line <= n; ++line)
{
// print n - line spaces and line # characters
std::cout << std::string(n - line, ' ') << std::string(line, '#') << '\n';
}
std::cout.flush();

Searching a string of ints for a repeating pattern [duplicate]

My problem is to find the repeating sequence of characters in a given array; put simply, to identify the pattern in which the characters appear.
.---.---.---.---.---.---.---.---.---.---.---.---.---.---.
1: | J | A | M | E | S | O | N | J | A | M | E | S | O | N |
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'
.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.
2: | R | O | N | R | O | N | R | O | N | R | O | N | R | O | N |
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'
.---.---.---.---.---.---.---.---.---.---.---.---.
3: | S | H | A | M | I | L | S | H | A | M | I | L |
'---'---'---'---'---'---'---'---'---'---'---'---'
.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.
4: | C | A | R | P | E | N | T | E | R | C | A | R | P | E | N | T | E | R |
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'
Example
Given the previous data, the result should be:
"JAMESON"
"RON"
"SHAMIL"
"CARPENTER"
Question
How to deal with this problem efficiently?
Tongue-in-cheek O(NlogN) solution
Perform an FFT on your string (treating characters as numeric values). Every peak in the resulting graph corresponds to a substring periodicity.
For your examples, my first approach would be to
get the first character of the array (for your last example, that would be C)
get the index of the next appearance of that character in the array (e.g. 9)
if it is found, search for the next appearance of the substring between the two appearances of the character (in this case CARPENTER)
if it is found, you're done (and the result is this substring).
Of course, this works only for a very limited subset of possible arrays, where the same word is repeated over and over again, starting from the beginning, without stray characters in between, and its first character is not repeated within the word. But all your examples fall into this category - and I prefer the simplest solution which could possibly work :-)
If the repeated word contains the first character multiple times (e.g. CACTUS), the algorithm can be extended to look for subsequent occurrences of that character too, not only the first one (so that it finds the whole repeated word, not only a substring of it).
Note that this extended algorithm would give a different result for your second example, namely RONRON instead of RON.
In Python, you can leverage regexes thus:
def recurrence(text):
    import re
    # try every candidate unit length, shortest first
    for i in range(1, len(text)//2 + 1):
        m = re.match(r'^(.{%d})\1+$'%i, text)
        if m: return m.group(1)
recurrence('abcabc') # Returns 'abc'
I'm not sure how this would translate to Java or C. (That's one of the reasons I like Python, I guess. :-)
First write a method that checks whether the substring sub repeats to fill the container string, as below.
boolean findSubRepeating(String sub, String container);
Now keep calling this method with increasingly long substrings of the container: first try a 1-character substring, then 2 characters, etc., going up to container.length/2.
Pseudocode
len = str.length
for (i in 1..len) {
if (len%i==0) {
if (str==str.substr(0,i).repeat(len/i)) {
return str.substr(0,i)
}
}
}
Note: For brevity, I'm inventing a "repeat" method for strings, which isn't actually part of Java's string; "abc".repeat(2)="abcabc"
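The pseudocode above could be sketched in C++ without any repeat helper by comparing each character with the one a full unit earlier. The function name smallestRepeatingUnit is invented for this example, and it assumes the whole string consists of complete repetitions, as in the question's samples.

```cpp
#include <cstddef>
#include <string>

// Return the shortest prefix whose repetition yields `s`
// (or `s` itself when there is no shorter repeating unit).
std::string smallestRepeatingUnit(const std::string& s)
{
    const std::size_t len = s.size();
    for (std::size_t unit = 1; unit <= len / 2; ++unit) {
        if (len % unit != 0)
            continue;                        // unit must divide the length
        bool matches = true;
        for (std::size_t i = unit; i < len && matches; ++i)
            matches = (s[i] == s[i - unit]); // same char one unit earlier
        if (matches)
            return s.substr(0, unit);
    }
    return s;
}
```

Because candidate lengths are tried in increasing order, "RONRONRONRONRON" yields "RON" rather than a longer multiple of it.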
Using C++:
#include <iostream>
#include <set>
#include <string>
using namespace std;
//Splits the string into fragments of the given size
//Returns the set of distinct fragments
set<string> split(string s, int frag)
{
set<string> uni;
int len = s.length();
for(int i = 0; i < len; i+= frag)
{
uni.insert(s.substr(i, frag));
}
return uni;
}
int main()
{
string out;
string s = "carpentercarpenter";
int len = s.length();
//Optimistic approach..hope there are only 2 repeated strings
//If that fails, then try to break the strings with lesser number of
//characters
for(int i = len/2; i>1;--i)
{
set<string> uni = split(s,i);
if(uni.size() == 1)
{
out = *uni.begin();
break;
}
}
cout<<out;
return 0;
}
The first idea that comes to my mind is trying all repeating sequences of lengths that divide length(S) = N. There is a maximum of N/2 such lengths, so this results in an O(N^2) algorithm.
But I'm sure it can be improved...
Here is a more general solution to the problem, which will find repeating subsequences within a sequence (of anything), where the subsequences do not have to start at the beginning, nor immediately follow each other.
Given a sequence b[0..n] containing the data in question, and a threshold t being the minimum subsequence length to find:
l_max = 0, i_max = 0, j_max = 0;
for (i=0; i<n-(t*2);i++) {
for (j=i+t;j<n-t; j++) {
l=0;
while (i+l<j && j+l<n && b[i+l] == b[j+l])
l++;
if (l>t) {
print "Sequence of length " + l + " found at " + i + " and " + j;
if (l>l_max) {
l_max = l;
i_max = i;
j_max = j;
}
}
}
}
if (l_max>t) {
print "longest common subsequence found at " + i_max + " and " + j_max + " (" + l_max + " long)";
}
Basically:
Start at the beginning of the data, iterate until within 2*t of the end (no possible way to have two distinct subsequences of length t in less than 2*t of space!)
For the second subsequence, start at least t bytes beyond where the first sequence begins.
Then, reset the length of the discovered subsequence to 0, and check to see if you have a common character at i+l and j+l. As long as you do, increment l.
When you no longer have a common character, you have reached the end of your common subsequence.
If the subsequence is longer than your threshold, print the result.
Just figured this out myself and wrote some code for this (written in C#) with a lot of comments. Hope this helps someone:
// Check whether the string contains a repeating sequence.
public static bool ContainsRepeatingSequence(string str)
{
if (string.IsNullOrEmpty(str)) return false;
for (int i=0; i<str.Length; i++)
{
// Every iteration, cut down the string from i to the end.
string toCheck = str.Substring(i);
// Set N equal to half the length of the substring. At most, we have to compare half the string to half the string. If the string length is odd, the last character will not be checked against, but it will be checked in the next iteration.
int N = toCheck.Length / 2;
// Check strings of all lengths from 1 to N against the subsequent string of length 1 to N.
for (int j=1; j<=N; j++)
{
// Check from beginning to j-1, compare against j to j+j.
if (toCheck.Substring(0, j) == toCheck.Substring(j, j)) return true;
}
}
return false;
}
Feel free to ask any questions if it's unclear why it works.
and here is a concrete working example:
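/* find greatest repeated substring; needs <string.h> (strrchr) and <stdio.h> (putchar) */
const char *fgrs(const char *s,size_t *l)
{
const char *r=0,*a=s; /* const-correct: the input string is never modified */
*l=0;
while( *a )
{
const char *e=strrchr(a+1,*a);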
if( !e )
break;
do {
size_t t=1;
for(;&a[t]!=e && a[t]==e[t];++t);
if( t>*l )
*l=t,r=a;
while( --e!=a && *e!=*a );
} while( e!=a && *e==*a );
++a;
}
return r;
}
size_t t;
const char *p;
p=fgrs("BARBARABARBARABARBARA",&t);
while( t-- ) putchar(*p++);
p=fgrs("0123456789",&t);
while( t-- ) putchar(*p++);
p=fgrs("1111",&t);
while( t-- ) putchar(*p++);
p=fgrs("11111",&t);
while( t-- ) putchar(*p++);
Not sure how you define "efficiently". For easy/fast implementation you could do this in Java:
private static String findSequence(String text) {
Pattern pattern = Pattern.compile("(.+?)\\1+");
Matcher matcher = pattern.matcher(text);
return matcher.matches() ? matcher.group(1) : null;
}
it tries to find the shortest string (.+?) that must be repeated at least once (\1+) to match the entire input text.
This is a solution I came up with using the queue, it passed all the test cases of a similar problem in codeforces. Problem No is 745A.
#include<bits/stdc++.h>
using namespace std;
typedef long long ll;
int main()
{
ios_base::sync_with_stdio(false);
cin.tie(NULL);
string s, s1, s2; cin >> s; queue<char> qu; qu.push(s[0]); bool flag = true; int ind = -1;
s1 = s.substr(0, s.size() / 2);
s2 = s.substr(s.size() / 2);
if(s1 == s2)
{
for(int i=0; i<s1.size(); i++)
{
s += s1[i];
}
}
//cout << s1 << " " << s2 << " " << s << "\n";
for(int i=1; i<s.size(); i++)
{
if(qu.front() == s[i]) {qu.pop();}
qu.push(s[i]);
}
int cycle = qu.size();
/*queue<char> qu2 = qu; string str = "";
while(!qu2.empty())
{
cout << qu2.front() << " ";
str += qu2.front();
qu2.pop();
}*/
while(!qu.empty())
{
if(s[++ind] != qu.front()) {flag = false; break;}
qu.pop();
}
flag == true ? cout << cycle : cout << s.size();
return 0;
}
I'd convert the array to a String object and use regex
Put all your characters in an array, e.g. a[]
i=0; j=0;
for( 0 < i < count )
{
if (a[i] == a[i+j+1])
{++i;}
else
{++j;i=0;}
}
Then the ratio of (i/j) = repeat count in your array.
You must pay attention to limits of i and j, but it is the simple solution.

How to on efficient and quick way add prefix to number and remove?

How can I efficiently and quickly add a prefix to a number and then remove it? (The number can have an arbitrary number of digits; there is no limit.)
I have a number, for example 122121, and I want to add the digit 9 at the beginning to get 9122121; after that I need to remove the first digit of the number. Currently I split the number into a vector, push the digit (in this case 9) to the front, and then rebuild the number from the digits (iterating and multiplying by 10).
Is there a more efficient way?
If you want efficiency, don't use anything other than numbers: no vectors, strings, etc. In your case:
#include <iostream>
unsigned long long add_prefix( unsigned long long number, unsigned prefix )
{
// if you want the example marked (X) below to print "9", add this line:
if( number == 0 ) return prefix;
// without the above, the result of (X) would be "90"
unsigned long long tmp = ( number >= 100000 ) ? 1000000 : 10;
while( number >= tmp ) tmp *= 10;
return number + prefix * tmp;
}
int main()
{
std::cout << add_prefix( 122121, 9 ) << std::endl; // prints 9122121
std::cout << add_prefix( 122121, 987 ) << std::endl; // prints 987122121
std::cout << add_prefix( 1, 9 ) << std::endl; // prints 91
std::cout << add_prefix( 0, 9 ) << std::endl; // (X) prints 9 or 90
}
But watch out for overflows. As long as nothing overflows, the above even works for multi-digit prefixes. I hope you can figure out the reverse algorithm to remove the prefix.
Edited: As Andy Prowl pointed out, one could interpret 0 as "no digits", so the prefix should not be followed by the digit 0. But I guess it depends on the OP's use case, so I edited the code accordingly.
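The reverse step that this answer leaves to the reader could look something like the sketch below. It removes exactly one leading digit; the function name remove_leading_digit is an assumption made here, not part of the original answer.

```cpp
// Removes the most significant decimal digit of `number`.
// Caveat: zeros that directly follow the removed digit vanish as well
// (9012 becomes 12), because an integer cannot store leading zeros.
// remove_leading_digit is a hypothetical name for this sketch.
unsigned long long remove_leading_digit(unsigned long long number) {
    unsigned long long power = 1;
    // Find the largest power of ten that does not exceed `number`.
    // Dividing instead of multiplying keeps `power` from overflowing.
    while (number / power >= 10) power *= 10;
    return number % power;
}
```

With this, remove_leading_digit(9122121) gives back 122121. A single-digit input comes out as 0, which matches the "0 means no digits" reading discussed in the edit note.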
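You can calculate the number of digits using floor(log10(number)) + 1. So the code would look like:
#include <cmath>
int number = 122121; // must be positive, log10(0) is not finite
int noDigits = static_cast<int>(std::floor(std::log10(number))) + 1;
// Integer power of ten: % is not defined for the double that pow returns
int multiplier = static_cast<int>(std::round(std::pow(10, noDigits)));
//Add 9 in front
number += 9 * multiplier;
//Now strip the 9
number %= multiplier;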
I hope I got everything right ;)
I shall provide an answer that makes use of binary search and a small benchmark of the answers provided so far.
Binary Search
The following function uses binary search to find the number of digits of the given number and prepends the desired digit to it.
int addPrefix(int N, int digit) {
    // Assumes 0 <= N <= 999999999; for 9-digit N the result can overflow int.
    int multiplier = 0;
    // [1, 5] digits
    if(N < 100000) {
        // [1, 3]
        if(N < 1000) {
            // [1, 2]
            if(N < 100) {
                // [1, 1]
                if(N < 10) {
                    multiplier = 10;
                // [2, 2]
                } else {
                    multiplier = 100;
                }
            // [3, 3]
            } else {
                multiplier = 1000;
            }
        // [4, 4]
        } else if(N < 10000) {
            multiplier = 10000;
        // [5, 5]
        } else {
            multiplier = 100000;
        }
    // [6, 7]
    } else if(N < 10000000) {
        // [6, 6]
        if(N < 1000000) {
            multiplier = 1000000;
        // [7, 7]
        } else {
            multiplier = 10000000;
        }
    // [8, 9]
    } else {
        // [8, 8]
        if(N < 100000000) {
            multiplier = 100000000;
        // [9, 9]
        } else {
            multiplier = 1000000000;
        }
    }
    return N + digit * multiplier;
}
It is rather verbose, but it finds the number of digits of any int of up to 9 digits in at most 4 comparisons.
Benchmark
I created a small benchmark that runs each of the provided algorithms for 450 million iterations, 50 million iterations for each digit count.
int main(void) {
int i, j, N = 2, k;
for(i = 1; i < 9; ++i, N *= 10) {
for(j = 1; j < 50000000; ++j) {
k = addPrefix(N, 9);
}
}
return 0;
}
The results:
+-----+-----------+-------------+----------+---------+
| run | Alexander | Daniel Frey | kkuryllo | W.B. |
+-----+-----------+-------------+----------+---------+
| 1st | 2.204s | 3.983s | 5.145s | 23.216s |
+-----+-----------+-------------+----------+---------+
| 2nd | 2.189s | 4.044s | 5.081s | 23.484s |
+-----+-----------+-------------+----------+---------+
| 3rd | 2.197s | 4.232s | 5.043s | 23.378s |
+-----+-----------+-------------+----------+---------+
| AVG | 2.197s | 4.086s | 5.090s | 23.359s |
+-----+-----------+-------------+----------+---------+
You can find the sources used in this Gist here.
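For anyone who wants to take similar measurements inside the program rather than with an external timer, a minimal sketch using std::chrono follows. It does not reproduce the setup behind the table above. Both time_calls and add_prefix_candidate are hypothetical names, and the loop-based candidate is only a stand-in for whichever implementation is being measured.

```cpp
#include <chrono>

// Placeholder for the function under test; swap in any implementation
// from this thread. Assumes 0 <= n < 100000000 so nothing overflows.
int add_prefix_candidate(int n, int digit) {
    int multiplier = 1;
    while (multiplier <= n) multiplier *= 10;
    return n + digit * multiplier;
}

// Times `iterations` calls on input `n` and returns the elapsed seconds.
double time_calls(int n, long iterations) {
    volatile int input = n;  // volatile read: the input is reloaded every iteration
    volatile int sink = 0;   // volatile write: the result cannot be discarded
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        sink = add_prefix_candidate(input, 9);
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double>(stop - start).count();
}
```

The volatile variables are there because an optimizing compiler may otherwise remove a loop whose result is never used, which would make every candidate look equally fast.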
How about using lexical_cast from Boost? That way you're not doing the iteration and all that yourself.
http://www.boost.org/doc/libs/1_53_0/doc/html/boost_lexical_cast.html
You could put the digits in a std::string and use insert and erase, but it might be overkill.
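A minimal sketch of the string idea, assuming the number is kept as decimal text throughout. The names add_prefix and remove_prefix are made up for this example. Since the question allows an arbitrary number of digits, text has the advantage of never overflowing.

```cpp
#include <string>

// Prepends one decimal digit character to a number held as text.
// add_prefix and remove_prefix are hypothetical helper names.
std::string add_prefix(std::string digits, char digit) {
    digits.insert(digits.begin(), digit);
    return digits;
}

// Drops the first digit; an empty string is returned unchanged.
std::string remove_prefix(std::string digits) {
    if (!digits.empty()) digits.erase(0, 1);
    return digits;
}
```

If the value starts out as an integer, std::to_string converts it to text and std::stoull converts back, as long as the result still fits in an unsigned long long. Inserting at the front of a string shifts every character, so each call is linear in the number of digits.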
First find the lowest power of 10 greater than your number. Then multiply the digit to add by that power and add the result to your number.
For example:
int x = 122121; // the number
int addition = 9; // the digit to put in front
int y = 1;
while (y <= x)
{
y *= 10;
}
x += addition * y;
I didn't test this code, so just take it as an example...
I also don't really understand your other instructions, you'll need to clarify.
Edit: okay, I think I understand that you also want to remove the first digit at some point. You can use a similar approach to do this.
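int x = 9122121; // the number with the prefix digit
int y = 1;
// find the largest power of 10 that does not exceed x
while (y <= x / 10)
{
y *= 10;
}
x %= y;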