Inversion Checker using Insertion Sort - c++

I need to output a group of strings ordered by the number of inversions each one contains, that is, by how many pairs of letters are out of order.
For example, the sequence “AACEDGG” has only 1 inversion (E and D) while the sequence “ZWQM” has 6 inversions. I don't actually have to sort the letters within each string, but I have to output the strings based on the number of inversions they have.
Ex:
Input: AACATGAAGG TTTTGGCCAA TTTGGCCAAA GATCAGATTT CCCGGGGGGA ATCGATGCAT
Output: CCCGGGGGGA AACATGAAGG GATCAGATTT ATCGATGCAT TTTTGGCCAA TTTGGCCAAA
I am trying to use insertion sort as a template as required by my teacher.
void inversionChecker(string dna[], int n)
{
int j,k,m;
int tempCount;
for(int i=0; i < n; i++){
int count=0;
for(j=0;j < n; j++){
for(k=j+1; k <= n; k++){
if(dna[i][j] > dna[i][k]){
count++;
tempCount = count;
}
}
}
if(i != 0 && tempCount > count)
dna[i].swap(dna[i-1]);
}
}
I am having issues because I am not too familiar with using 2D indexing to compare the letters in each string. When I try to output the array, it ends up being blank, seg faults, or produces errors caused by the way I try to swap the positions of the strings in the array.
Any help would be appreciated

Here you index the characters of dna[i] with the wrong bounds:
for(j=0;j < n; j++){
for(k=j+1; k <= n; k++){ // n is the number of strings, not the length of dna[i]
if(dna[i][j] > dna[i][k])
The bounds should come from the string being scanned:
for(j=0; j + 1 < dna[i].size(); j++){
for(k=j+1; k < dna[i].size(); k++){
if(dna[i][j] > dna[i][k])
An alternative approach using misc. standard classes and algorithms, like std::vector and std::sort.
#include <algorithm> // copy, sort
#include <cstddef> // size_t
#include <iterator> // istream_iterator, back_inserter
#include <sstream> // istringstream
#include <string> // string
#include <tuple> // tie
#include <utility> // swap
#include <vector> // vector
#include <iostream>
// count inversions in a sequence
unsigned count_inversions(std::string sequence) {
unsigned res = 0;
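// NOTE: this counts the swaps made by a simple exchange sort, which can be
// smaller than the number of out-of-order pairs (the usual meaning of "inversions")
for(size_t i = 0; i + 1 < sequence.size(); ++i) { // i + 1 avoids underflow on an empty string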
for(size_t j = i + 1; j < sequence.size(); ++j) {
if(sequence[j] < sequence[i]) {
std::swap(sequence[i], sequence[j]);
++res;
}
}
}
return res;
}
// a class to store a sequence and its inversion count
struct sequence_t {
sequence_t() = default;
explicit sequence_t(const std::string& Seq) :
seq(Seq), inversions(count_inversions(seq)) {}
// "less than" operator to compare two "sequence_t"s (used in std::sort)
bool operator<(const sequence_t& rhs) const {
// assuming lexicographical order if inversions are equal
return std::tie(inversions, seq) < std::tie(rhs.inversions, rhs.seq);
}
std::string seq;
unsigned inversions;
};
// read one sequence_t from an istream
std::istream& operator>>(std::istream& is, sequence_t& s) {
std::string tmp;
if(is >> tmp) s = sequence_t(tmp);
return is;
}
// read "sequence_t"s from an istream and put in a vector<sequence_t>
auto read_sequences(std::istream& is) {
std::vector<sequence_t> rv;
std::copy(std::istream_iterator<sequence_t>(is),
std::istream_iterator<sequence_t>{}, std::back_inserter(rv));
return rv;
}
int main() {
std::istringstream input(
"AACATGAAGG TTTTGGCCAA TTTGGCCAAA GATCAGATTT CCCGGGGGGA ATCGATGCAT");
auto sequences = read_sequences(input);
std::sort(sequences.begin(), sequences.end());
// print result
for(const auto& [seq, inversions] : sequences) {
std::cout << seq << '(' << inversions << ')' << ' ';
}
std::cout << '\n';
}
Output (including the inversions):
CCCGGGGGGA(2) AACATGAAGG(10) GATCAGATTT(10) ATCGATGCAT(11) TTTTGGCCAA(12) TTTGGCCAAA(15)
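Since the question asks for insertion sort as the template, here is a minimal sketch of that route. It assumes the usual definition of an inversion (a pair of letters that appear in the wrong order); the function names are made up for illustration. Each one-position shift inside an insertion sort undoes exactly one such pair, so counting shifts counts inversions. The counts are pair counts, so they can differ from the swap counts printed above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Insertion-sort a copy of the string and count the shifts.
// Every shift moves a larger character past a smaller one,
// which removes exactly one inversion.
std::size_t count_inversions_by_insertion(std::string s) {
    std::size_t shifts = 0;
    for (std::size_t i = 1; i < s.size(); ++i) {
        const char key = s[i];
        std::size_t j = i;
        while (j > 0 && s[j - 1] > key) {
            s[j] = s[j - 1];
            --j;
            ++shifts;
        }
        s[j] = key;
    }
    return shifts;
}

// Order the strings by inversion count. stable_sort keeps the input
// order for equal counts (an assumption; the question does not say
// how ties should be broken).
std::vector<std::string> sort_by_inversions(std::vector<std::string> dna) {
    std::stable_sort(dna.begin(), dna.end(),
                     [](const std::string& a, const std::string& b) {
                         return count_inversions_by_insertion(a) <
                                count_inversions_by_insertion(b);
                     });
    return dna;
}
```

Recomputing the count inside the comparator keeps the sketch short; for long inputs it would be cheaper to compute each count once and sort (count, string) pairs.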

Related

Repeating elements in vector

I am a C++ student, and I need to solve this problem: "Write a program that receives a number and an array of the size of the given number. The program must find all the duplicates of the given numbers, push them back to a vector of repeating elements, and print the vector". The requirements are that I'm only allowed to use the vector library and that every repeating element of the array must be pushed to the vector only once, e.g. if my array is "1, 2, 1, 2, 3, 4...", the vector must be "1, 2".
Here's what I've done so far. My code works, but I'm unable to make it add the same duplicate to the vector of repeating elements only once.
#include <iostream>
#include <vector>
int main() {
int n;
std::cin >> n;
int* arr = new int[n];
std::vector<int> repeatedElements;
for(int i = 0; i < n; ++i) {
std::cin >> arr[i];
}
for(int i = 0; i < n; ++i) {
bool foundInRepeated = false;
for(int j = 0; j < repeatedElements.size(); ++j) {
if(arr[i] == repeatedElements[j]) {
foundInRepeated = true;
break;
}
}
if(foundInRepeated) {
continue;
} else {
for(int i = 0; i < n; ++i) {
int count = 1;
for(int j = i + 1; j < n; ++j) {
if(arr[i] == arr[j]) {
++count;
}
}
if(count > 1) {
repeatedElements.push_back(arr[i]);
}
}
}
}
for(int i = 0; i < repeatedElements.size(); ++i) {
std::cout << repeatedElements[i] << " ";
}
std::cout << std::endl;
}
Consider what you're doing here:
if(foundInRepeated) {
continue;
} else {
for(int i = 0; i < n; ++i) { // why?
If the element at some index i (from the outer loop) is not found in repeatedElements, you're again iterating through the entire array, and adding elements that are repeated. But you already have an i that you're interested in, and hasn't been added to the repeatedElements. You only need to iterate through j in the else branch.
Removing the line marked why? (and its closing brace) will solve the problem.
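For reference, the corrected logic could look roughly like the sketch below. It is written as a function over a std::vector rather than the raw array from the question, and the name find_repeated is made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Collect every value that occurs more than once, each exactly once.
// Uses nothing beyond <vector>, as the assignment requires.
std::vector<int> find_repeated(const std::vector<int>& arr) {
    std::vector<int> repeated;
    for (std::size_t i = 0; i < arr.size(); ++i) {
        // Skip values that were already recorded as repeated.
        bool already_recorded = false;
        for (std::size_t j = 0; j < repeated.size(); ++j) {
            if (repeated[j] == arr[i]) {
                already_recorded = true;
                break;
            }
        }
        if (already_recorded) continue;

        // Only look to the right of i: one later match is enough.
        for (std::size_t j = i + 1; j < arr.size(); ++j) {
            if (arr[j] == arr[i]) {
                repeated.push_back(arr[i]);
                break;
            }
        }
    }
    return repeated;
}
```

For the input 1, 2, 1, 2, 3, 4 this yields a vector holding 1 and 2.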
It's always good to follow a plan. Dividing the bigger problem into a sequence of smaller problems is a good start. While this often does not yield an optimal solution, it at least yields a solution that is more or less straightforward and that can be optimized later, if need be.
How to find out, if a number in the sequence has duplicates?
We could brute force this:
is_duplicate i = arr[i+1..arr.size() - 1] contains arr[i]
and then write ourselves a helper function like
bool range_contains(std::vector<int>::const_iterator first,
std::vector<int>::const_iterator last, int value) {
// ...
}
and use it in a simple
for (auto iter = arr.cbegin(); iter != arr.cend(); ++iter) {
if (range_contains(iter+1, arr.cend(), *iter) && !duplicates.contains(*iter)) {
duplicates.push_back(*iter);
}
}
But this would be - if I am not mistaken - some O(N^2) solution.
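To make the brute-force idea concrete, here is one possible way to fill in the helper. Note that std::vector has no contains member, so this sketch reuses range_contains for the "already recorded" check as well; brute_force_duplicates is a name invented for the example.

```cpp
#include <vector>

// Linear scan over [first, last) for value.
bool range_contains(std::vector<int>::const_iterator first,
                    std::vector<int>::const_iterator last, int value) {
    for (; first != last; ++first) {
        if (*first == value) return true;
    }
    return false;
}

// A value is a duplicate if it appears again to the right of its
// position; it is recorded only if it is not in `duplicates` yet.
std::vector<int> brute_force_duplicates(const std::vector<int>& arr) {
    std::vector<int> duplicates;
    for (auto iter = arr.cbegin(); iter != arr.cend(); ++iter) {
        if (range_contains(iter + 1, arr.cend(), *iter) &&
            !range_contains(duplicates.cbegin(), duplicates.cend(), *iter)) {
            duplicates.push_back(*iter);
        }
    }
    return duplicates;
}
```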
As we know, sorting is O(N log(N)), and if we sort our array first, all duplicates end up right next to each other. Then we can iterate over the sorted array once (O(N)) and still be cheaper than O(N^2), since O(N log(N)) + O(N) is still O(N log(N)).
1 2 1 2 3 4 => sort => 1 1 2 2 3 4
Eventually, using what we have at our disposal, this could lead to a program like this:
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
using IntVec = std::vector<int>;
int main(int argc, const char *argv[]) {
IntVec arr; // aka: input array
IntVec duplicates;
size_t n = 0;
std::cin >> n;
// Read n integers from std::cin
std::generate_n(std::back_inserter(arr), n,
[](){
return *(std::istream_iterator<int>(std::cin));
});
// sort the array (in ascending order).
std::sort(arr.begin(), arr.end()); // O(N*logN)
auto current = arr.cbegin();
while(current != arr.cend()) {
// std::adjacent_find() finds the next location in arr, where 2 neighbors have the same value.
current = std::adjacent_find(current,arr.cend());
if( current != arr.cend()) {
duplicates.push_back(*current);
// skip all duplicates here
for( ; current != (arr.cend() - 1) && (*current == *(current+1)); current++) {
}
}
}
// print the duplicates to std::cout
std::copy(duplicates.cbegin(), duplicates.cend(),
std::ostream_iterator<int>(std::cout, " "));
return 0;
}

Function not printing any solutions

So, I need to make a function that returns the chromatic number of a graph. The graph is given as an adjacency matrix, which the program reads from a file whose name the user enters. I have a function that should work in theory and that the compiler accepts without complaint, yet when I run it, it simply prints an empty line and ends the program.
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
using namespace std;
int Find_Chromatic_Number (vector <vector <int>> matg, int matc[], int n) {
if (n == 0) {
return 0;
}
int result, i, j;
result = 0;
for (i = 0; i < n; i++) {
for (j = i; j < n; j++) {
if (matg[i][j] == 1) {
if (matc[i] == matc[j]) {
matc[j]++;
}
}
}
}
for (i = 0; i < n; i++) {
if (result < matc[i]) {
result = matc[i];
}
}
return result;
}
int main() {
string file;
int n, i, j, m;
cout << "unesite ime datoteke: " << endl;
cin >> file;
ifstream reader;
reader.open(file.c_str());
reader >> n;
vector<vector<int>> matg(n, vector<int>(0));
int matc[n];
for (i = 0; i < n; i++) {
for (j = 0; j < n; j++) {
reader >> matg[i][j];
}
matc[i] = 1;
}
int result = Find_Chromatic_Number(matg, matc, n);
cout << result << endl;
return 0;
}
The program is supposed to use an ifstream to read the file into a 2D vector which represents the adjacency matrix (matg). I also made an array (matc) which holds the value of each vertex, with different numbers corresponding to different colors.
The function should go through the vector and, every time there is an edge between two vertices, check whether their color values in matc are the same. If they are, it increases the second value (j) by one. After the function has passed through the vector, the matc array should contain n different numbers, with the highest number being the chromatic number I am looking for.
I hope I have explained enough of what I am trying to accomplish, if not just ask and I will add any further explanations.
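The inner vectors are created with size 0, so reader >> matg[i][j] writes out of bounds.
Either give every row its n elements up front:
vector<vector<int> > matg(n, vector<int>(n));
or keep the rows empty (as you have them now) and, instead of using reader >> matg[i][j];
use:
int tmp;
reader >> tmp;
matg[i].push_back(tmp);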
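A minimal sketch of the reading step with pre-sized rows, assuming the file holds n followed by n*n entries. Both function names are hypothetical, and returning an empty matrix on malformed input is a choice made for this example, not something the question specifies.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read n, then an n x n matrix, from any input stream.
// Returns an empty matrix if n is missing, non-positive,
// or if fewer than n*n entries can be read.
std::vector<std::vector<int>> read_adjacency_matrix(std::istream& in) {
    int n = 0;
    if (!(in >> n) || n <= 0) return {};
    // Every row is sized to n here, so indexing below is in bounds.
    std::vector<std::vector<int>> m(n, std::vector<int>(n, 0));
    for (auto& row : m) {
        for (int& cell : row) {
            if (!(in >> cell)) return {};
        }
    }
    return m;
}

// Convenience wrapper for trying the reader on an in-memory string.
std::vector<std::vector<int>> parse_adjacency_matrix(const std::string& text) {
    std::istringstream in(text);
    return read_adjacency_matrix(in);
}
```

The same read_adjacency_matrix call works with the std::ifstream from the question, since std::ifstream is a std::istream.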

How can you keep the positions of an array the same while sorting?

I am writing a Caesar cipher decoding program that sorts the frequency of letters of a message in descending order. My issue is when I print out the results the positions of the frequencies in the array no longer match the letters I have set up. How do I fix this? I have other code that removes punctuation and capitals, all characters besides spaces and lowercase letters from the message being decoded.
I have trimmed down the code to just what is being questioned.
#include<iostream>
#include<string>
#include<fstream>
using namespace std;
void sortArray(int*, int);
int main()
{
string fileContent = "a coded message which is several hundreds of characters long is being passed into the program";
int count[26];
// This code is skipping over spaces and other characters
for(int f = 0; f < fileContent.length(); f++)
{
if(fileContent[f] == 32)
{
continue;
}
if(fileContent[f] >= 48 && fileContent[f] <= 57)
{
continue;
}
count[(fileContent[f]-'a')%26]++;
}
// Here is where my issue begins. In sortArray, the position of the characters are being changed.
cout << "Letter frequency: Most common to least common" << endl;
sortArray(count, 26);
for(int p = 0; p < 26; p++)
{
cout << char(p + 97) << ": " << count[p] << endl;
}
return 0;
}
void sortArray(int* srcArray, int numElements)
{
for(int x = 0; x < numElements; x++)
{
int max = srcArray[x];
int maxIndex = x;
int hold;
for(int y = x + 1; y < numElements; y++)
{
if(srcArray[y] > max)
{
max = srcArray[y];
maxIndex = y;
}
}
hold = srcArray[x];
srcArray[x] = max;
srcArray[maxIndex] = hold;
hold = 0;
}
}
Please kindly let me know how I can solve this issue, I've been theorizing but I cannot seem to figure out a viable solution.
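After you compute the frequencies in the count array, pair each count with its letter and sort the pairs (this needs <array>, <utility> and <algorithm>):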
std::array<std::pair<char, int>, 26> pairArray;
for (int i = 0; i < 26; ++i)
{
pairArray[i] = std::make_pair('a' + i, count[i]);
}
std::sort(pairArray.begin(), pairArray.end(), myCompare);
for (int i = 0; i < 26; ++i)
std::cout << pairArray[i].first << ": " << pairArray[i].second << std::endl;
For myCompare,
bool myCompare(const std::pair<char, int>& p1, const std::pair<char, int>& p2)
{
return p1.second > p2.second;
}
This should sort the array in descending order.
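Put together, the approach might look like the sketch below. The function name letter_frequencies and the LetterCount alias are invented for the example; it zero-initializes the counts, ignores anything that is not a lowercase letter, and uses std::stable_sort so that letters with equal counts stay in alphabetical order (a choice made here, not something the question requires).

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <utility>

using LetterCount = std::pair<char, int>;

// Count lowercase letters and return (letter, count) pairs,
// most frequent first.
std::array<LetterCount, 26> letter_frequencies(const std::string& text) {
    std::array<LetterCount, 26> table;
    for (int i = 0; i < 26; ++i) {
        table[i] = {static_cast<char>('a' + i), 0};  // counts start at zero
    }
    for (char ch : text) {
        if (ch >= 'a' && ch <= 'z') {
            ++table[ch - 'a'].second;
        }
    }
    // Each count travels together with its letter while sorting.
    std::stable_sort(table.begin(), table.end(),
                     [](const LetterCount& a, const LetterCount& b) {
                         return a.second > b.second;
                     });
    return table;
}
```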
The problem you are facing is that the array holds frequencies, but the frequencies are not tied to their characters. Sorting rearranges the counts, while your print loop still walks the letters from a to z and pairs each one with whatever count now sits at that index.
What you can do is store each frequency together with its character. One option is a map keyed by char, such as std::unordered_map<char, int>; note that a map cannot be reordered by value, so you would still copy its entries into a vector to sort them by frequency.
You can also use a vector of pairs, as @lamandy suggested.
vector< pair <char, int> > vect;
for (int i = 0; i < 26; i++)
{
vect.push_back(make_pair(char(i + 97), count[i]));
}
sort(vect.begin(), vect.end(), sortbysecVal);
// Printing the sorted vector(after using sort())
cout << "The vector after sort operation is:\n";
for (int i = 0; i<26; i++)
{
// "first" and "second" are used to access
// 1st and 2nd element of pair respectively
cout << vect[i].first << " "
<< vect[i].second << endl;
}
The comparator sorts by the second value of each pair, largest first (declare it before the call to sort):
bool sortbysecVal(const pair<char, int> &a, const pair<char, int> &b)
{
return (a.second > b.second);
}
Once you have calculated the frequencies, you can use this in place of your own sort function.
P.S.: You must also initialize the count array to zero, for example int count[26] = {0};. An uninitialized local array contains indeterminate values, so count[(fileContent[f]-'a')%26]++ would not produce the frequencies you expect.
The answer is probably a three-liner for a standard library guru, which I am not quite yet. I hate the standard library. It makes programming so easy that anyone can do it.
Here are two versions that I hacked out. This is fun.
#include <algorithm>
#include <cctype> // isalpha, toupper
#include <map>
#include <string>
#include <string_view>
#include <utility> // pair
#include <vector>
using counted = std::pair<char, unsigned>;
std::vector<counted>
counted_chars(const std::string_view input) {
// Return a vector of <char, count> pairs, where char is an uppercase
// letter, and count is the number of occurrences of the letter (upper or lower).
// It is sorted from highest count to lowest.
using namespace std;
map<char, unsigned> count;
// Count them.
for(char next: input) {if (isalpha(next)) {count[toupper(next)] += 1;}}
// Sort them
vector<counted> sorted(count.size());
copy(count.cbegin(), count.cend(), sorted.begin());
sort(sorted.begin(), sorted.end(), [](counted c1, counted c2)
{ return c1.second > c2.second; });
return sorted;
}
int main() {
std::string str = "a coDed; MESSage which_is several hundreds of characters long is being passed into the program";
auto result = counted_chars(str);
return 0;
}
Another one that doesn't use std::map.
#include <algorithm>
#include <cctype> // isalpha, toupper
#include <string>
#include <utility> // pair, move
#include <vector>
using counted = std::pair<char, unsigned>;
std::vector<counted> counted_chars(std::string input) {
using namespace std;
input.resize(remove_if(input.begin(), input.end(), [](char ch) { return !isalpha(ch); })-input.begin());
for(char &ch: input) { ch = toupper(ch); }
sort(input.begin(), input.end());
string present {input};
present.resize(unique(present.begin(), present.end())-present.begin());
std::vector<counted> sorted;
for (char ch:present) {sorted.push_back(make_pair(ch, count(input.begin(), input.end(), ch)));}
sort(sorted.begin(), sorted.end(), [](counted c1, counted c2) { return c1.second > c2.second; });
return sorted;
}
int main() {
std::string str = " -- I have always wished for my computer to be as easy to use as my telephone; My wish has come true because I can no longer figure out how to use my telephone.";
auto result = counted_chars(std::move(str));
return 0;
}

Arranging a string in uppercase-first alphabetical order C++

I was trying to create a program in C++ that sorts a given string in alphabetical order in a way where the uppercase letters precede their lowercase equivalent.
Example:
DCBAdcba
Sorted string:
AaBbCcDd
Given below is the code.
#include <iostream>
#include <string>
#include <cctype>
struct char_ {
char c;
char diff;
char_();
char_(char x);
};
char_::char_() {
c = 0;
diff = 0;
}
char_::char_(char x) {
c = std::tolower(x);
diff = c - x;
}
void charswap(char_& x, char_& y) {
char_ temp;
temp = x;
x = y;
y = temp;
}
int main() {
std::string str;
getline(std::cin, str);
char_* str2 = new char_[str.length()];
for (int i = 0; i < str.length(); i++) {
str2[i] = char_(str[i]);
}
/*
for (int i = 0; i < str.length(); i++) {
std::cout << str2[i].c << std::endl;
}
*/
for (int i = 0; i < str.length(); i++) {
for (int j = i; j < str.length(); j++) {
if (str2[i].c > str2[j].c)
charswap(str2[i], str2[j]);
}
}
for (int k = 0; k < str.length(); k++) {
std::cout << str2[k].c << "\t" << (int)str2[k].diff << std::endl;
}
for (int i = 0; i < str.length(); i++) {
str2[i].c = str2[i].c - str2[i].diff;
}
for (int i = 0; i < str.length(); i++) std::cout << str2[i].c;
std::cout << "\n";
return 0;
}
A char_ struct is created to store the individual characters (converted to lowercase) and their difference from the original character (0 or 32, depending on whether the original char was lowercase or uppercase, respectively). It then sorts the char_ characters on the basis of their lowercase values. After the sort we subtract the difference from the character to retrieve the uppercase form.
But when I try giving this string, it gives the following result.
DCBAdcba
AabBCcdD
I cannot understand what's happening here.
The problem is on this line:
if (str2[i].c > str2[j].c)
charswap(str2[i], str2[j]);
It compares characters in case-insensitive way, with no provision for tie breaking when lowercase characters are the same.
You need to modify this to swap characters when lowercase on the right is greater than lowercase on the left, or when lowercase representations are the same, but the right side original character is in upper case:
if ((str2[i].c > str2[j].c) || (str2[i].c == str2[j].c && str2[j].diff))
charswap(str2[i], str2[j]);
"sorts a given string in alphabetical order in a way where the uppercase letters precede their lowercase equivalent."
You can just define a comparison functor reflecting your intention
#include <cctype>
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
struct case_cmp {
bool operator()(char lhs, char rhs) const {
return (std::isupper(lhs) && std::tolower(lhs) == rhs) || std::tolower(lhs) < std::tolower(rhs);
}
};
Then use std::sort:
int main() {
std::string s("DCBAdcba");
std::sort(std::begin(s), std::end(s), case_cmp());
// Outputs "AaBbCcDd"
std::cout << s << std::endl;
}
std::string can be considered as a container of chars, and as such you can apply STL's algorithms to its content, including std::sort() (just like you would apply an STL algorithm to e.g. std::vector).
You can specify your custom sorting criteria using a lambda, passed as the third parameter to std::sort(), e.g.:
#include <algorithm> // for std::sort
#include <cctype> // for std::isupper, std::tolower
#include <iostream> // for std::cout
#include <string> // for std::string
using namespace std;
int main() {
string s{"DCBAdcba"};
sort( s.begin(), s.end(), [](char x, char y) {
// Custom sorting criteria.
// Return true if x precedes y.
// This may work, but requires more testing...
if (isupper(x)) {
if (tolower(x) == y) {
return true;
}
}
return tolower(x) < tolower(y);
});
cout << s << '\n';
}
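Another way to express the same ordering is to compare a two-part key per character: first the lowercase form, then a flag that is false for uppercase and true for lowercase. The sketch below is one possible formulation, with sort_upper_first as a made-up name; it casts to unsigned char before calling the <cctype> functions, which avoids undefined behavior for negative char values.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

// Sort letters alphabetically, with each uppercase letter placed
// before its lowercase equivalent.
std::string sort_upper_first(std::string s) {
    // Key: (lowercase form, is-lowercase flag). Pairs compare
    // lexicographically, and false orders before true.
    auto key = [](char c) {
        const unsigned char u = static_cast<unsigned char>(c);
        return std::make_pair(std::tolower(u), std::islower(u) != 0);
    };
    std::sort(s.begin(), s.end(),
              [&key](char a, char b) { return key(a) < key(b); });
    return s;
}
```

Because the comparison is an ordinary lexicographic comparison of keys, it is a strict weak ordering by construction, which is what std::sort requires.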

Algorithm to generate all permutations by selecting some or all characters

I need to generate all permutations of a string formed by selecting some or all of its elements. For example, if my string is "abc", the output would be { a,b,c,ab,ba,ac,ca,bc,cb,abc,acb,bac,bca,cab,cba }.
I thought of a basic algorithm in which I generate all possible combinations of "abc", which are {a,b,c,ab,ac,bc,abc}, and then permute each of them.
So is there an efficient permutation algorithm with which I can generate all possible permutations of varying size?
The code I wrote for this is:
#include <algorithm> // next_permutation
#include <cstring> // strlen
#include <iostream>
#include <stdio.h>
#include <stdlib.h> // qsort
#include <map>
#include <string>
using namespace std;
int permuteCount = 1;
int compare (const void * a, const void * b)
{
return ( *(char*)a - *(char*)b);
}
void permute(char *str, int start, int end)
{
// cout<<"before sort : "<<str;
// cout<<"after sort : "<<str;
do
{
cout<<permuteCount<<")"<<str<<endl;
permuteCount++;
}while( next_permutation(str+start,str+end) );
}
void generateAllCombinations( char* str)
{
int n, k, i, j, c;
n = strlen(str);
map<string,int> combinationMap;
for( k =1; k<=n; k++)
{
char tempStr[20];
int index =0;
for (i=0; i<(1<<n); i++) {
index =0;
for (j=0,c=0; j<n; j++) if (i & (1<<j)) c++; // i < (1<<n), so only the low n bits can be set
if (c == k) {
for (j=0;j<n; j++)
if (i & (1<<j))
tempStr[ index++] = str[j];
tempStr[index] = '\0';
qsort (tempStr, index, sizeof(char), compare);
if( combinationMap.find(tempStr) == combinationMap.end() )
{
// cout<<"comb : "<<tempStr<<endl;
//cout<<"unique comb : \n";
combinationMap[tempStr] = 1;
permute(tempStr,0,k);
} /*
else
{
cout<<"duplicated comb : "<<tempStr<<endl;
}*/
}
}
}
}
int main () {
char str[20];
cin>>str;
generateAllCombinations(str);
cin>>str;
}
I need to use a hash for avoiding same combination, so please let me know how can I make this algorithm better.
Thanks,
GG
#include <algorithm>
#include <iostream>
#include <string>
int main() {
using namespace std;
string s = "abc";
do {
cout << s << '\n';
} while (next_permutation(s.begin(), s.end()));
return 0;
}
next_permutation works on a fixed-size range, but you can add a loop to deal with varying size. Or just store the prefixes in a set, which eliminates the duplicates for you:
#include <algorithm>
#include <iostream>
#include <set>
#include <string>
int main() {
using namespace std;
string s = "abc";
set<string> results;
do {
for (int n = 1; n <= s.size(); ++n) {
results.insert(s.substr(0, n));
}
} while (next_permutation(s.begin(), s.end()));
for (set<string>::const_iterator x = results.begin(); x != results.end(); ++x) {
cout << *x << '\n';
}
return 0;
}
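If the goal is to avoid both the set and a hash, a backtracking generator is one alternative. This is only a sketch under the assumption that the output should list each distinct arrangement once, even when the input has repeated characters; the names extend and partial_permutations are invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Append every unused character in turn, record the result, recurse.
void extend(const std::string& chars, std::vector<bool>& used,
            std::string& current, std::vector<std::string>& out) {
    for (std::size_t i = 0; i < chars.size(); ++i) {
        if (used[i]) continue;
        // chars is sorted: among equal characters, only take this one
        // if the previous equal one is already in use. This prevents
        // emitting the same arrangement twice.
        if (i > 0 && chars[i] == chars[i - 1] && !used[i - 1]) continue;
        used[i] = true;
        current.push_back(chars[i]);
        out.push_back(current);  // every non-empty prefix is an output
        extend(chars, used, current, out);
        current.pop_back();
        used[i] = false;
    }
}

// All arrangements of some or all characters, without duplicates.
std::vector<std::string> partial_permutations(std::string chars) {
    std::sort(chars.begin(), chars.end());
    std::vector<bool> used(chars.size(), false);
    std::string current;
    std::vector<std::string> out;
    extend(chars, used, current, out);
    return out;
}
```

For "abc" this produces the 15 strings listed in the question; each output is generated exactly once, so no lookup structure is needed.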
I don't think you can write a much faster program than you have already. The main problem is the output size: it has order of n!*2^n (number of subsets * average number of permutations for one subset), which is already > 10^9 for a string of 10 different characters.
Since STL's next_permutation adds very limited complexity for such small strings, your program's time complexity is already nearly O(output size).
But you can make your program a bit simpler. In particular, for( k =1; k<=n; k++) loop seems unnecessary: you already calculate size of subset in variable c inside. So, just have int k = c instead of if (c == k). (You'll also need to consider case of empty subset: i == 0)
edit
Actually, there are only 9864100 outputs for n == 10 (not ~ 10^9). Still, my point remains the same: your program already wastes only "O(next_permutation)" time for each output, which is very, very little.
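The corrected figure can be checked by summing, for k = 1..n, the number of ordered selections of k out of n distinct characters, n!/(n-k)!. A small sketch (the function name is made up for illustration):

```cpp
#include <cstdint>

// Sum over k = 1..n of n! / (n - k)!, i.e. the number of non-empty
// arrangements of some or all of n distinct characters.
std::uint64_t count_partial_permutations(unsigned n) {
    std::uint64_t total = 0;
    std::uint64_t term = 1;
    for (unsigned k = 1; k <= n; ++k) {
        term *= (n - k + 1);  // term is now n! / (n - k)!
        total += term;
    }
    return total;
}
```

For n == 3 this gives 15, matching the example in the question, and for n == 10 it gives 9864100.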