Sorting array of chars alphabetically then by length - c++

I have an array of structs where I keep track of how many times each unique word was seen in a given text:
struct List {
char word[20];
int repeat;
};
Now I need to sort this:
as 6
a 1
appetite 1
angry 1
are 2
and 4
...
To this:
a 1
as 6
and 4
are 2
angry 1
appetite 1
...
(By alphabetically I mean only by first letter)
So far, I have come up with this:
for (i = 0; i < length - 1; i++) {
min_pos = i;
for (j = i + 1; j < length; j++) // find min
if (array[j].word[0] < array[min_pos].word[0]) {
min_pos = j;
}
swap = array[min_pos]; // swap
array[min_pos] = array[i];
array[i] = swap;
}
This code works perfectly for sorting alphabetically, but I just can't write proper code to sort BOTH alphabetically and by length.

Make a comparator function.
Add an operator< to your List:
bool operator<(const List &rhs) const {
    if (word[0] != rhs.word[0]) {
        return word[0] < rhs.word[0];
    }
    return strlen(word) < strlen(rhs.word); // strlen needs <cstring>
}
And now use this operator to sort, using whichever algorithm strikes your fancy.
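As a minimal sketch of how the operator could be wired up, the snippet below sorts the words from the question with std::stable_sort. The helper name sortedExample and the use of std::vector are assumptions made for illustration; they do not appear in the question.

```cpp
#include <algorithm>
#include <cstring>
#include <vector>

struct List {
    char word[20];
    int repeat;

    // Order by first letter, then by word length.
    bool operator<(const List& rhs) const {
        if (word[0] != rhs.word[0]) {
            return word[0] < rhs.word[0];
        }
        return std::strlen(word) < std::strlen(rhs.word);
    }
};

// Hypothetical helper: sorts the sample words from the question.
// std::stable_sort keeps the input order of words that tie on both keys.
inline std::vector<List> sortedExample() {
    std::vector<List> words = {
        {"as", 6}, {"a", 1}, {"appetite", 1},
        {"angry", 1}, {"are", 2}, {"and", 4},
    };
    std::stable_sort(words.begin(), words.end());
    return words;
}
```

Words that share both a first letter and a length (here "are" and "and") compare as equivalent, so their relative order depends on the algorithm unless a stable sort is used.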

Others have pointed out that there are faster and cleaner ways to sort. But if you want to use your own selection sort, as you've written, then you just need to make a few changes to your code.
Separate the "do I need to swap" logic from the swapping logic itself. Then the code becomes much cleaner and it's more clear where to add the extra check.
I've only copied the inner loop here; replace your existing inner loop with this one. It only records the position of the smallest element seen so far in min_pos. The swap of array[min_pos] with array[i] stays after the inner loop, exactly as in your code. You will also need #include <cstring> for strlen.
for (j = i + 1; j < length; j++) { // find min
    // first, determine whether array[j] is the new minimum
    // It is if the first character of the new word is
    // smaller, or if the letters are equal and the length is smaller.
    bool isSmaller = false;
    if (array[j].word[0] < array[min_pos].word[0]) {
        isSmaller = true;
    }
    else if (array[j].word[0] == array[min_pos].word[0] &&
             strlen(array[j].word) < strlen(array[min_pos].word)) {
        isSmaller = true;
    }
    // remember the new minimum if necessary
    if (isSmaller) {
        min_pos = j;
    }
}
To more clearly illustrate the necessary logic changes, I've purposely avoided making major style changes or simple optimizations.

You can pass a lambda to sort to do this:
sort(begin(array), end(array), [](const auto& lhs, const auto& rhs) {
    return *lhs.word < *rhs.word ||
           (*lhs.word == *rhs.word &&
            (strlen(lhs.word) < strlen(rhs.word) ||
             (strlen(lhs.word) == strlen(rhs.word) && strcmp(lhs.word, rhs.word) < 0)));
});

Use tuple lexicographical compare operators
An easy way to not write this condition is to
#include <tuple>
Then std::tie can be used:
std::tie(array[j].word[0], array[j].repeat) < std::tie(array[min_pos].word[0], array[min_pos].repeat)
This works because std::tie creates a tuple of lvalue references to its arguments, which means std::tie requires variables. Note that the line above uses repeat as the second key; to compare by word length as in the question, or by any other value returned from a function, std::make_tuple or std::forward_as_tuple is the better fit.
And std::tuple has comparison operators, each of which
Compares lhs and rhs lexicographically, that is, compares the first elements, if they are equivalent, compares the second elements, if those are equivalent, compares the third elements, and so on.
The same idea applies whenever you need to compare by more than one value.
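A minimal sketch of the std::make_tuple variant, assuming the List struct from the question and that the second key should be the word length. The function name firstLetterThenLength is hypothetical.

```cpp
#include <cstring>
#include <tuple>

struct List {
    char word[20];
    int repeat;
};

// Hypothetical comparator: true if lhs should come before rhs.
// std::make_tuple copies its arguments, so it accepts the temporary
// returned by std::strlen, which std::tie would reject.
inline bool firstLetterThenLength(const List& lhs, const List& rhs) {
    return std::make_tuple(lhs.word[0], std::strlen(lhs.word)) <
           std::make_tuple(rhs.word[0], std::strlen(rhs.word));
}
```

The function can be passed as the third argument of std::sort, or called in place of the hand-written condition inside the selection sort's inner loop.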


How to find the power set of a given set without using left shift bit?

I'm trying to figure out how to implement an algorithm to find a power set given a set, but I'm having some trouble. The sets are actually vectors so for example I am given Set<char> set1{ 'a','b','c' };
I would do PowerSet(set1); and I would get all the sets
but if I do Set<char> set2{ 'a','b','c', 'd' };
I would do PowerSet(set2) and I would miss a few of those sets.
Set<Set<char>> PowerSet(const Set<char>& set1)
{
Set<Set<char>> result;
Set<char> temp;
result.insertElement({});
int card = set1.cardinality();
int powSize = pow(2, card);
for (int i = 0; i < powSize; ++i)
{
for (int j = 0; j < card; ++j)
{
if (i % static_cast<int> ((pow(2, j)) + 1))
{
temp.insertElement(set1[j]);
result.insertElement(temp);
}
}
temp.clear();
}
return result;
}
For reference:
cardinality() is a function in my .h where it returns the size of the set.
insertElement() inserts element into the set while duplicates are ignored.
Also, the reason I call temp.insertElement(set1[j]) and then result.insertElement(temp) is that result is a set of sets, so I need a temporary set to collect the elements before inserting it into result.
clear() is a function that empties the set.
I also have removeElem() which removes that element specified if it exists, otherwise it'll ignore it.
Your if test is nonsense -- it should be something like
if ((i / static_cast<int>(pow(2,j))) % 2)
you also need to move the insertion of temp into result after the inner loop (just before the temp.clear()).
With those changes, this should work as long as pow(2, card) does not overflow an int -- that is up to about card == 30 on most machines.
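The corrected logic can be sketched as follows. This is only an illustration under assumptions: it uses std::vector in place of the question's Set class, the name powerSet is hypothetical, and 2^n is built by repeated doubling instead of pow to stay in integer arithmetic.

```cpp
#include <cstddef>
#include <vector>

// Sketch using std::vector in place of the question's Set class.
// Subset number i contains element j when binary digit j of i is 1;
// the digit is extracted with division and modulo, not with a shift.
inline std::vector<std::vector<char>> powerSet(const std::vector<char>& items) {
    std::vector<std::vector<char>> result;
    std::size_t count = 1;  // becomes 2^n by repeated doubling
    for (std::size_t k = 0; k < items.size(); ++k) count *= 2;

    for (std::size_t i = 0; i < count; ++i) {
        std::vector<char> subset;
        std::size_t digits = i;  // remaining binary digits of i
        for (std::size_t j = 0; j < items.size(); ++j) {
            if (digits % 2 == 1) subset.push_back(items[j]);
            digits /= 2;
        }
        result.push_back(subset);  // inserted once per i, after the inner loop
    }
    return result;
}
```

For a four-element input this produces 16 subsets, starting with the empty one for i == 0.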

Is using a conditional statement based on this flag more efficient than adding more lines of code?

I have a sort function that accepts a Boolean parameter desc (descending) which sorts in reverse order if true & algo is an enum class that selects the algorithm (here the subset of the code is for algo::BUBBLE (bubblesort))
Using this inline conditional statement (if (!desc ? A[j] > A[j + 1] : A[j] < A[j + 1])), I can eliminate rewriting the entire code for reverse sort as it evaluates the appropriate condition based on the desc flag. But I wonder if this can create unnecessary overhead as it checks the flag repeatedly [(n-1)*(1+2+...+n-1) times]. Will this overhead come out as substantial for larger data elements? More code or more overhead?
template <typename T>
void Array<T>::sort(bool desc, algo a) // a default for desc would require one for a as well
{
if (a == algo::BUBBLE)
{
bool wasSwapped = true;
for (size_t i = 0; i < size - 1 && wasSwapped; i++)
{
wasSwapped = false;
for (size_t j = 0; j < size - i - 1; j++)
{
if (!desc ? A[j] > A[j + 1] : A[j] < A[j + 1])
{
wasSwapped = true;
swap(A[j], A[j + 1]);
}
}
}
}
}
A and size are private data members (Array pointer and size respectively).
For code clarity, it will be better to make that a non-member function template and pass it a compare functor. Make sure to put the function in your application's namespace so there is no confusion with the functions of the same name from the std namespace.
Assuming Array<T>::A and Array<T>::size are accessible,
// needs <functional> for std::less and <utility> for std::swap
namespace MyApp
{
    template <typename T, typename Compare = std::less<T>>
    void sort(Array<T>& array, algo a, Compare compare = Compare())
    {
        using std::swap;
        if (a == algo::BUBBLE)
        {
            bool wasSwapped = true;
            for (size_t i = 0; i + 1 < array.size && wasSwapped; i++)
            {
                wasSwapped = false;
                for (size_t j = 0; j + 1 < array.size - i; j++)
                {
                    // swap only when the pair is strictly out of order
                    if (compare(array.A[j + 1], array.A[j]))
                    {
                        wasSwapped = true;
                        swap(array.A[j], array.A[j + 1]);
                    }
                }
            }
        }
    }
}
Now you can use:
Array<int> a = { ... };
MyApp::sort(a, algo::BUBBLE); // std::less<int> is the default functor.
MyApp::sort(a, algo::BUBBLE, std::greater<int>()); // Explicit compare functor.
If you are really worried about this extra check, you can write the implementation as a function template that takes the comparator as a template argument (similar to std::sort).
Then you call the implementation function from your main one with either greater-than or less-than as the comparator, depending on the boolean flag.
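A minimal sketch of that dispatch, assuming a std::vector stands in for the question's Array<T>. The names bubbleSortImpl and bubbleSort are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical implementation function: the comparator is a template
// parameter, so each instantiation has its comparison fixed at compile time.
template <typename T, typename Compare>
void bubbleSortImpl(std::vector<T>& data, Compare comesBefore) {
    bool wasSwapped = true;
    for (std::size_t i = 0; i + 1 < data.size() && wasSwapped; ++i) {
        wasSwapped = false;
        for (std::size_t j = 0; j + 1 < data.size() - i; ++j) {
            if (comesBefore(data[j + 1], data[j])) {  // pair is out of order
                std::swap(data[j], data[j + 1]);
                wasSwapped = true;
            }
        }
    }
}

// The flag is tested once here instead of once per comparison.
template <typename T>
void bubbleSort(std::vector<T>& data, bool desc = false) {
    if (desc) {
        bubbleSortImpl(data, std::greater<T>());
    } else {
        bubbleSortImpl(data, std::less<T>());
    }
}
```

The sorting loop is still written only once; the compiler generates one copy per comparator type.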
But I wonder if this can create unnecessary overhead as it checks the flag repeatedly [(n-1)*(1+2+...+n-1) times]
No, it will not. desc never changes inside the loops, so an optimizing compiler can typically hoist the test out of them, and even when it does not, the branch always goes the same way and is cheap to predict.
And, in any case, in your example, performance will be limited by reading and writing the array. As bubble sort goes, you'll be doing far more reads and writes than needed. Compare it on a large array (a million entries) with the code as is, and with a hard-coded descending sort. My bet is on the timings being identical.

Sort Array By Parity the result is not robust

I am a new programmer and I am trying to sort a vector of integers by parity, putting even numbers in front of odd ones. The order within the even or odd numbers themselves doesn't matter. For example, given an input [3,1,2,4], the output can be [2,4,3,1] or [4,2,1,3], etc. Below is my C++ code. Sometimes I get lucky and the vector is sorted properly, sometimes it isn't. I printed the odd and even containers and they look correct, but when I try to combine them the result is messed up. Can someone please help me debug?
class Solution {
public:
vector<int> sortArrayByParity(vector<int>& A) {
unordered_multiset<int> even;
unordered_multiset<int> odd;
vector<int> result(A.size());
for(int C:A)
{
if(C%2 == 0)
even.insert(C);
else
odd.insert(C);
}
merge(even.begin(),even.end(),odd.begin(),odd.end(),result.begin());
return result;
}
};
If you just need even values before odds and not a complete sort I suggest you use std::partition. You give it two iterators and a predicate. The elements where the predicate returns true will appear before the others. It works in-place and should be very fast.
Something like this:
std::vector<int> sortArrayByParity(std::vector<int>& A)
{
std::partition(A.begin(), A.end(), [](int value) { return value % 2 == 0; });
return A;
}
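This happens because the merge function assumes that both input ranges are already sorted, as in merge sort, and an unordered_multiset iterates in no particular order. Instead, just use the insert function of vector. Note that result must start out empty (declare it as vector<int> result; rather than vector<int> result(A.size());), otherwise the inserted values end up after A.size() zeros:
result.insert(result.end(), even.begin(), even.end());
result.insert(result.end(), odd.begin(), odd.end());
return result;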
There is no need to create three separate vectors. As you have allocated enough space in the result vector, that vector can be used as the final vector also to store your sub vectors, storing the separated odd and even numbers.
The value of using a vector, which under the covers is an array, is to avoid inserts and moves. Arrays/Vectors are fast because they allow immediate access to memory as an offset from the beginning. Take advantage of this!
The code simply keeps an index to the next odd and even indices and then assigns the correct cell accordingly.
class Solution {
public:
// As this function does not access any members, it can be made static
static std::vector<int> sortArrayByParity(std::vector<int>& A) {
std::vector<int> result(A.size());
std::size_t even_index = 0;
std::size_t odd_index = A.size() - 1; // wraps around if A is empty, but is never used in that case
for(int element: A)
{
if(element%2 == 0)
result[even_index++] = element;
else
result[odd_index--] = element;
}
return result;
}
};
Taking advantage of the fact that you don't care about the order among the even or odd numbers themselves, you could use a very simple algorithm to sort the array in-place:
// Assume helper function is_even() and is_odd() are defined.
void sortArrayByParity(std::vector<int>& A)
{
int i = 0; // scanning from beginning
int j = A.size()-1; // scanning from end
do {
while (i < j && is_even(A[i])) ++i; // A[i] is an even at the front
while (i < j && is_odd(A[j])) --j; // A[j] is an odd at the back
if (i >= j) break;
// Now A[i] must be an odd number in front of an even number A[j]
std::swap(A[i], A[j]);
++i;
--j;
} while (true);
}
Note that the function above returns void, since the vector is sorted in-place. If you do want to return a sorted copy of input vector, you'd need to define a new vector inside the function, and copy the elements right before every ++i and --j above (and of course do not use std::swap but copy the elements cross-way instead; also, pass A as const std::vector<int>& A).
// Assume helper function is_even() and is_odd() are defined.
std::vector<int> sortArrayByParity(const std::vector<int>& A)
{
std::vector<int> B(A.size());
int i = 0; // scanning from beginning
int j = A.size()-1; // scanning from end
do {
while (i < j && is_even(A[i])) {
B[i] = A[i];
++i;
}
while (i < j && is_odd(A[j])) {
B[j] = A[j];
--j;
}
if (i >= j) {
if (i == j) B[i] = A[i]; // the middle element has not been copied yet
break;
}
// Now A[i] must be an odd number in front of an even number A[j]
B[i] = A[j];
B[j] = A[i];
++i;
--j;
} while (true);
return B;
}
In both cases (in-place or out-of-place) above, the function has complexity O(N), N being number of elements in A, much better than the general O(N log N) for sorting N elements. This is because the problem doesn't actually sort much -- it only separates even from odd. There's therefore no need to invoke a full-fledged sorting algorithm.

Sorting an array of structs (Cards)

I am making a blackjack game and have created an array to act as the hand the user is dealt. I want to sort the hand in numerical order so that it is simpler to determine what type of hand the user has. Here is my struct for the cards:
struct ACard{
int num;
const char *pic;
};
I want to sort the array by int num. I have tried to use a simple insertion sort, but I believe I need to overload an operator to do so, and I'm having trouble with that as I've never overloaded an operator for a struct like this before. Here is what I have for the sort so far:
int i,j;
ACard key;
for(int i = 1; i < 5; i++){
key = userHand[i].num;
j = i - 1;
while(j >= 0 && userHand[j].num > key){
userHand[j + 1] = userHand[j];
j = j - 1;
}
userHand[j + 1] = key;
}
*Note userHand is the array of ACard's that I wish to sort.
With STL containers you can use the std::sort function (it also works on plain arrays via std::begin and std::end). The first two arguments define the range of elements to be sorted. The third argument is a less-than function used to compare your custom elements (you can use a lambda expression for that).
std::vector<ACard> userHand; // or another stl container
// initialize userHand somehow
std::sort(userHand.begin(), userHand.end(),
[](const ACard& left, const ACard& right)
{
return left.num < right.num;
});
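If you would rather keep the hand in a plain array and sort it with insertion sort, a sketch along the following lines should work. It assumes the ACard struct from the question; the function name sortHand and its parameters are hypothetical. The key variable holds a whole card and only the num fields are compared, so no operator overload is required.

```cpp
#include <cstddef>

struct ACard {
    int num;
    const char* pic;
};

// Insertion sort over a plain array of cards, ordered by num.
// key is a whole card; comparisons look only at the num field.
inline void sortHand(ACard* hand, std::size_t count) {
    for (std::size_t i = 1; i < count; ++i) {
        ACard key = hand[i];
        std::size_t j = i;
        while (j > 0 && hand[j - 1].num > key.num) {
            hand[j] = hand[j - 1];  // shift larger cards one slot to the right
            --j;
        }
        hand[j] = key;
    }
}
```

For the five-card hand in the question the call would be sortHand(userHand, 5).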

C++ Checking for identical values in 2 arrays

I have 2 arrays called xVal, and yVal.
I'm using these arrays as coords. What I want to do is to make sure that the array doesn't contain 2 identical sets of coords.
Lets say my arrays looks like this:
int xVal[4] = {1,1,3,4};
int yVal[4] = {1,1,5,4};
Here I want to find the match between xVal[0] yVal[0] and xVal[1] yVal[1] as 2 identical sets of coords called 1,1.
I have tried some different things with a forLoop, but I cant make it work as intended.
You can write an explicit loop using an O(n^2) approach (see answer from x77aBs), or you can trade some memory for performance, for example by using std::set:
bool unique(std::vector<int>& x, std::vector<int>& y)
{
std::set< std::pair<int, int> > seen;
for (int i=0,n=x.size(); i<n; i++)
{
if (seen.insert(std::make_pair(x[i], y[i])).second == false)
return false;
}
return true;
}
You can do it with two for loops:
int MAX=4; //number of elements in array
for (int i=0; i<MAX; i++)
{
for (int j=i+1; j<MAX; j++)
{
if (xVal[i]==xVal[j] && yVal[i]==yVal[j])
{
//DUPLICATE ELEMENT at xVal[j], yVal[j]. Here you implement what
//you want (maybe just set them to -1, or delete them and move everything
//one position back)
}
}
}
Small explanation: first, i takes the value 0. Then j loops over all the later indices, so you compare xVal[0] and yVal[0] with every other pair. j starts at i+1 because you don't need to compare against indices before i (they have already been compared).
Edit - you should consider writing a small class, or at least a struct, that represents a point, and using std::vector instead of arrays (it's easier to delete an element in the middle). That should make your life easier :)
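A minimal sketch of that suggestion, under the assumption that later duplicates should simply be dropped. The Point struct and the withoutDuplicates function are hypothetical names, and a std::set of coordinate pairs is used here to remember which points have already been seen.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical point type replacing the parallel xVal / yVal arrays.
struct Point {
    int x;
    int y;
};

// Returns the points with later duplicates dropped, keeping the first
// occurrences in their original order.
inline std::vector<Point> withoutDuplicates(const std::vector<Point>& points) {
    std::set<std::pair<int, int>> seen;
    std::vector<Point> unique;
    for (const Point& p : points) {
        // insert() reports in .second whether the pair was new
        if (seen.insert(std::make_pair(p.x, p.y)).second) {
            unique.push_back(p);
        }
    }
    return unique;
}
```

With the coordinates from the question, (1,1), (1,1), (3,5), (4,4), the second (1,1) is the only point removed.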
int identicalValueNum = 0;
int identicalIndices[4]; // 4 is the max. possible number of identical values
for (int i = 0; i < 4; i++)
{
if (xVal[i] == yVal[i])
{
identicalIndices[identicalValueNum++] = i;
}
}
for (int i = 0; i < identicalValueNum; i++)
{
printf(
"The %ith value in both arrays is the same and is: %i.\n",
identicalIndices[i], xVal[identicalIndices[i]]);
}
For
int xVal[4] = {1,1,3,4};
int yVal[4] = {1,1,5,4};
the output of printf would be:
The 0th value in both arrays is the same and is: 1.
The 1th value in both arrays is the same and is: 1.
The 3th value in both arrays is the same and is: 4.