How to implement a natural sort algorithm in c++?

I'm sorting strings that consist of text and numbers.
I want the numeric parts to be sorted as numbers, not character by character.
For example, I want: abc1def, ..., abc9def, abc10def
instead of: abc10def, abc1def, ..., abc9def
Does anyone know an algorithm for this (in particular in C++)?
Thanks

I asked this exact question (although in Java) and got pointed to http://www.davekoelle.com/alphanum.html which has an algorithm and implementations of it in many languages.
Update 14 years later: Dave Koelle’s blog has gone offline and I can’t find his actual algorithm, but here’s an implementation.
https://github.com/cblanc/koelle-sort

Several natural sort implementations for C++ are available. A brief review:
natural_sort<> - based on Boost.Regex.
In my tests, it's roughly 20 times slower than other options.
Dirk Jagdmann's alnum.hpp, based on Dave Koelle's alphanum algorithm
Potential integer overflow issues for values over MAXINT
Martin Pool's natsort - written in C, but trivially usable from C++.
The only C/C++ implementation I've seen to offer a case insensitive version, which would seem to be a high priority for a "natural" sort.
Like the other implementations, it doesn't actually parse decimal points, but it does special case leading zeroes (anything with a leading 0 is assumed to be a fraction), which is a little weird but potentially useful.
PHP uses this algorithm.

This is known as natural sorting. There's an algorithm here that looks promising.
Be careful of problems with non-ASCII characters (see Jeff's blog entry on the subject).

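Partially reposting another answer of mine:
#include <cctype>
#include <sstream>
#include <string>

std::string toUpper(std::string s); // converts a std::string to uppercase

// Returns true if a sorts before b in natural order.
bool compareNat(const std::string& a, const std::string& b){
    // An empty string sorts before any non-empty one; two empty strings are equal.
    if (a.empty())
        return !b.empty();
    if (b.empty())
        return false;
    // Cast to unsigned char: passing a negative char to isdigit is undefined behavior.
    const bool aDigit = std::isdigit(static_cast<unsigned char>(a[0])) != 0;
    const bool bDigit = std::isdigit(static_cast<unsigned char>(b[0])) != 0;
    // Exactly one string begins with a digit --> digits sort first
    if (aDigit != bDigit)
        return aDigit;
    if (!aDigit)
    {
        if (a[0] == b[0])
            return compareNat(a.substr(1), b.substr(1));
        return (toUpper(a) < toUpper(b));
    }
    // Both strings begin with digit --> parse both numbers
    // (numbers too large for an int are not handled correctly)
    std::istringstream issa(a);
    std::istringstream issb(b);
    int ia, ib;
    issa >> ia;
    issb >> ib;
    if (ia != ib)
        return ia < ib;
    // Numbers are the same --> remove numbers and recurse
    std::string anew, bnew;
    std::getline(issa, anew);
    std::getline(issb, bnew);
    return (compareNat(anew, bnew));
}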
toUpper() function:
std::string toUpper(std::string s){
for (char& c : s) { c = static_cast<char>(std::toupper(static_cast<unsigned char>(c))); }
return s;
}
Usage:
std::vector<std::string> str;
str.push_back("abc1def");
str.push_back("abc10def");
...
std::sort(str.begin(), str.end(), compareNat);

A state machine (finite-state automaton) is the way to go for what is essentially a parsing problem. Dissatisfied with the solutions above, I wrote a simple one-pass, early bail-out algorithm that beats the C/C++ variants suggested above in terms of performance, does not suffer from numeric overflow errors, and is easy to modify to add case insensitivity if required.
Sources can be found here
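The linked sources are not reproduced here, so the following is only a minimal sketch of the general idea, not that author's code: a single pass that compares digit runs by length and then digit by digit, so no number is ever converted to an integer and nothing can overflow. The names isDigit and naturalLess are made up for this illustration, and the comparison is case sensitive and assumes ASCII input.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer): true if c is an ASCII digit.
static bool isDigit(char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Returns true if a sorts before b in natural order.
// Digit runs are compared as numbers without converting them to integers.
bool naturalLess(const std::string& a, const std::string& b) {
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (isDigit(a[i]) && isDigit(b[j])) {
            // Skip leading zeros so "007" and "7" compare as equal numbers.
            while (i < a.size() && a[i] == '0') ++i;
            while (j < b.size() && b[j] == '0') ++j;
            // Find the end of each digit run.
            std::size_t ie = i, je = j;
            while (ie < a.size() && isDigit(a[ie])) ++ie;
            while (je < b.size() && isDigit(b[je])) ++je;
            // More significant digits means a larger number.
            if (ie - i != je - j) return (ie - i) < (je - j);
            // Same length: the first differing digit decides.
            const int cmp = a.compare(i, ie - i, b, j, je - j);
            if (cmp != 0) return cmp < 0;
            i = ie;
            j = je;
        } else {
            // Plain characters (or a digit against a non-digit): compare directly.
            if (a[i] != b[j]) return a[i] < b[j];
            ++i;
            ++j;
        }
    }
    // One string ran out first: the one with characters left is the larger.
    return (a.size() - i) < (b.size() - j);
}
```

It can be passed straight to std::sort, e.g. std::sort(v.begin(), v.end(), naturalLess) for a std::vector<std::string> v. Case insensitivity could be added by folding both characters with std::toupper in the else branch.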

For those that arrive here and are already using Qt in their project, you can use the QCollator class. See this question for details.

Avalanchesort is a recursive variation of natural merge sort that merges runs while exploring the stack of data to be sorted. The algorithm sorts stably, even if you add data to the sorting heap while the algorithm is running.
The search principle is simple: only merge runs with the same rank.
After finding the first two natural runs (rank 0), avalanchesort merges them into a run of rank 1. Then it calls avalanchesort to generate a second run of rank 1 and merges the two runs into a run of rank 2. Then it calls avalanchesort to generate a second run of rank 2 on the unsorted data, and so on.
My implementation, porthd/avalanchesort, separates the sorting from the handling of the data using interface injection. You can use the algorithm for data structures such as arrays, associative arrays, or lists.
/**
* @param DataListAvalancheSortInterface $dataList
* @return DataRangeInterface
*/
public function startAvalancheSort(DataListAvalancheSortInterface $dataList)
{
$avalancheIndex = 0;
$rangeResult = $this->avalancheSort($dataList, $dataList->getFirstIdent(), $avalancheIndex);
if (!$dataList->isLastIdent($rangeResult->getStop())) {
do {
$avalancheIndex++;
$lastIdent = $rangeResult->getStop();
if ($dataList->isLastIdent($lastIdent)) {
$rangeResult = new $this->rangeClass();
$rangeResult->setStart($dataList->getFirstIdent());
$rangeResult->setStop($dataList->getLastIdent());
break;
}
$nextIdent = $dataList->getNextIdent($lastIdent);
$rangeFollow = $this->avalancheSort($dataList, $nextIdent, $avalancheIndex);
$rangeResult = $this->mergeAvalanche($dataList, $rangeResult, $rangeFollow);
} while (true);
}
return $rangeResult;
}
/**
* @param DataListAvalancheSortInterface $dataList
* @param mixed $startIdent
* @return DataRangeInterface
*/
protected function findRun(DataListAvalancheSortInterface $dataList,
$startIdent)
{
$result = new $this->rangeClass();
$result->setStart($startIdent);
$result->setStop($startIdent);
do {
if ($dataList->isLastIdent($result->getStop())) {
break;
}
$nextIdent = $dataList->getNextIdent($result->getStop());
if ($dataList->oddLowerEqualThanEven(
$dataList->getDataItem($result->getStop()),
$dataList->getDataItem($nextIdent)
)) {
$result->setStop($nextIdent);
} else {
break;
}
} while (true);
return $result;
}
/**
* @param DataListAvalancheSortInterface $dataList
* @param $beginIdent
* @param int $avalancheIndex
* @return DataRangeInterface|mixed
*/
protected function avalancheSort(DataListAvalancheSortInterface $dataList,
$beginIdent,
int $avalancheIndex = 0)
{
if ($avalancheIndex === 0) {
$rangeFirst = $this->findRun($dataList, $beginIdent);
if ($dataList->isLastIdent($rangeFirst->getStop())) {
// it is the last run
$rangeResult = $rangeFirst;
} else {
$nextIdent = $dataList->getNextIdent($rangeFirst->getStop());
$rangeSecond = $this->findRun($dataList, $nextIdent);
$rangeResult = $this->mergeAvalanche($dataList, $rangeFirst, $rangeSecond);
}
} else {
$rangeFirst = $this->avalancheSort($dataList,
$beginIdent,
($avalancheIndex - 1)
);
if ($dataList->isLastIdent($rangeFirst->getStop())) {
$rangeResult = $rangeFirst;
} else {
$nextIdent = $dataList->getNextIdent($rangeFirst->getStop());
$rangeSecond = $this->avalancheSort($dataList,
$nextIdent,
($avalancheIndex - 1)
);
$rangeResult = $this->mergeAvalanche($dataList, $rangeFirst, $rangeSecond);
}
}
return $rangeResult;
}
protected function mergeAvalanche(DataListAvalancheSortInterface $dataList, $oddListRange, $evenListRange)
{
$resultRange = new $this->rangeClass();
$oddNextIdent = $oddListRange->getStart();
$oddStopIdent = $oddListRange->getStop();
$evenNextIdent = $evenListRange->getStart();
$evenStopIdent = $evenListRange->getStop();
$dataList->initNewListPart($oddListRange, $evenListRange);
do {
if ($dataList->oddLowerEqualThanEven(
$dataList->getDataItem($oddNextIdent),
$dataList->getDataItem($evenNextIdent)
)) {
$dataList->addListPart($oddNextIdent);
if ($oddNextIdent === $oddStopIdent) {
$restTail = $evenNextIdent;
$stopTail = $evenStopIdent;
break;
}
$oddNextIdent = $dataList->getNextIdent($oddNextIdent);
} else {
$dataList->addListPart($evenNextIdent);
if ($evenNextIdent === $evenStopIdent) {
$restTail = $oddNextIdent;
$stopTail = $oddStopIdent;
break;
}
$evenNextIdent = $dataList->getNextIdent($evenNextIdent);
}
} while (true);
while ($stopTail !== $restTail) {
$dataList->addListPart($restTail);
$restTail = $dataList->getNextIdent($restTail);
}
$dataList->addListPart($restTail);
$dataList->cascadeDataListChange($resultRange);
return $resultRange;
}
}
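For readers more at home in C++, the underlying idea of finding already-sorted runs and merging neighbouring runs can be sketched with the standard library. This is a plain iterative natural merge sort written for illustration, not a port of avalanchesort: it does not track ranks and does not support adding data while sorting. The name naturalMergeSort is an assumption of this sketch.

```cpp
#include <algorithm>
#include <vector>

// Sorts v by repeatedly merging adjacent ascending runs (natural merge sort).
// Illustrative sketch only; not the rank-based recursion of avalanchesort.
template <typename T>
void naturalMergeSort(std::vector<T>& v) {
    while (true) {
        auto first = v.begin();
        // End of the first ascending run.
        auto mid = std::is_sorted_until(first, v.end());
        if (mid == v.end()) return;  // a single run: the data is sorted
        // One pass: merge runs pairwise from left to right.
        while (mid != v.end()) {
            auto last = std::is_sorted_until(mid, v.end());
            std::inplace_merge(first, mid, last);  // stable merge of two adjacent runs
            first = last;
            mid = std::is_sorted_until(first, v.end());
        }
    }
}
```

Because std::inplace_merge is stable, equal elements keep their relative order, which matches the stability claim made for the algorithm above.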

Here is my algorithm in Java, with test code. If you want to use it in your project, you can define a comparator yourself.
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Consumer;
public class FileNameSortTest {
private static List<String> names = Arrays.asList(
"A__01__02",
"A__2__02",
"A__1__23",
"A__11__23",
"A__3++++",
"B__1__02",
"B__22_13",
"1_22_2222",
"12_222_222",
"2222222222",
"1.sadasdsadsa",
"11.asdasdasdasdasd",
"2.sadsadasdsad",
"22.sadasdasdsadsa",
"3.asdasdsadsadsa",
"adsadsadsasd1",
"adsadsadsasd10",
"adsadsadsasd3",
"adsadsadsasd02"
);
public static void main(String...args) {
List<File> files = new ArrayList<>();
names.forEach(s -> {
File f = new File(s);
try {
if (!f.exists()) {
f.createNewFile();
}
files.add(f);
} catch (IOException e) {
e.printStackTrace();
}
});
files.sort(Comparator.comparing(File::getName));
files.forEach(f -> System.out.print(f.getName() + " "));
System.out.println();
files.sort(new Comparator<File>() {
boolean caseSensitive = false;
int SPAN_OF_CASES = 'a' - 'A';
@Override
public int compare(File left, File right) {
char[] csLeft = left.getName().toCharArray(), csRight = right.getName().toCharArray();
boolean isNumberRegion = false;
int diff=0, i=0, j=0, lenLeft=csLeft.length, lenRight=csRight.length;
char cLeft = 0, cRight = 0;
for (; i<lenLeft && j<lenRight; i++, j++) {
cLeft = getCharByCaseSensitive(csLeft[i]);
cRight = getCharByCaseSensitive(csRight[j]);
boolean isNumericLeft = isNumeric(cLeft), isNumericRight = isNumeric(cRight);
if (isNumericLeft && isNumericRight) {
// Number start!
if (!isNumberRegion) {
isNumberRegion = true;
// Remove prefix '0'
while (i < lenLeft && cLeft == '0') i++;
while (j < lenRight && cRight == '0') j++;
if (i == lenLeft || j == lenRight) break;
}
// Diff start: calculate the diff value.
if (cLeft != cRight && diff == 0)
diff = cLeft - cRight;
} else {
if (isNumericLeft != isNumericRight) {
// One numeric and one char.
if (isNumberRegion)
return isNumericLeft ? 1 : -1;
return cLeft - cRight;
} else {
// Two chars: if (number) diff don't equal 0 return it.
if (diff != 0)
return diff;
// Calculate chars diff.
diff = cLeft - cRight;
if (diff != 0)
return diff;
// Reset!
isNumberRegion = false;
diff = 0;
}
}
}
// The longer one will be put backwards.
return (i == lenLeft && j == lenRight) ? cLeft - cRight : (i == lenLeft ? -1 : 1) ;
}
private boolean isNumeric(char c) {
return c >= '0' && c <= '9';
}
private char getCharByCaseSensitive(char c) {
return caseSensitive ? c : (c >= 'A' && c <= 'Z' ? (char) (c + SPAN_OF_CASES) : c);
}
});
files.forEach(f -> System.out.print(f.getName() + " "));
}
}
The output is,
1.sadasdsadsa 11.asdasdasdasdasd 12_222_222 1_22_2222 2.sadsadasdsad 22.sadasdasdsadsa 2222222222 3.asdasdsadsadsa A__01__02 A__11__23 A__1__23 A__2__02 A__3++++ B__1__02 B__22_13 adsadsadsasd02 adsadsadsasd1 adsadsadsasd10 adsadsadsasd3
1.sadasdsadsa 1_22_2222 2.sadsadasdsad 3.asdasdsadsadsa 11.asdasdasdasdasd 12_222_222 22.sadasdasdsadsa 2222222222 A__01__02 A__1__23 A__2__02 A__3++++ A__11__23 adsadsadsasd02 adsadsadsasd1 adsadsadsasd3 adsadsadsasd10 B__1__02 B__22_13
Process finished with exit code 0

// -1: s0 < s1; 0: s0 == s1; 1: s0 > s1
static int numericCompare(const string &s0, const string &s1) {
    // Compare as bool: isdigit may return any nonzero value for a digit,
    // and its argument must be representable as unsigned char.
    auto digit = [](char c) { return isdigit(static_cast<unsigned char>(c)) != 0; };
    size_t i = 0, j = 0;
    while (i < s0.size() && j < s1.size()) {
        // Take the next run of all-digit or all-non-digit characters from each string
        string t0(1, s0[i++]);
        while (i < s0.size() && digit(t0[0]) == digit(s0[i])) {
            t0.push_back(s0[i++]);
        }
        string t1(1, s1[j++]);
        while (j < s1.size() && digit(t1[0]) == digit(s1[j])) {
            t1.push_back(s1[j++]);
        }
        if (digit(t0[0]) && digit(t1[0])) {
            // Strip leading zeros; after that, the longer number is the larger one
            size_t p0 = t0.find_first_not_of('0');
            size_t p1 = t1.find_first_not_of('0');
            t0 = p0 == string::npos ? "" : t0.substr(p0);
            t1 = p1 == string::npos ? "" : t1.substr(p1);
            if (t0.size() != t1.size()) {
                return t0.size() < t1.size() ? -1 : 1;
            }
        }
        if (t0 != t1) {
            return t0 < t1 ? -1 : 1;
        }
    }
    // Whichever string still has characters left is the larger one
    return i == s0.size() && j == s1.size() ? 0 : i != s0.size() ? 1 : -1;
}
I'm not sure this is exactly what you want, but you can give it a try :-)

Related

How to generate a sequence of values by specifying a start, end value, and step?

I need to generate a sequence of numbers given a start value, an end value, and a step. For example, in Haskell this is trivial, and it is called an arithmetic sequence.
[1..10] = [1,2,3,4,5,6,7,8,9,10]
I tried to implement this as follows.
namespace utility {
template<class Container, class Type>
Container generator(Type t_from, Type t_to, Type t_step = 1)
{
// Sequence storage container
Container sequence_of_numbers { };
sequence_of_numbers.reserve(static_cast<std::size_t>(std::abs(t_to - t_from + 1)) / t_step);
// For floating point data
if constexpr (std::is_floating_point_v<Type>) {
// The reverse sequence
if((t_to - t_from) < 0) {
for(Type i = t_from; ; i -= t_step) {
if(i > t_to) {
sequence_of_numbers.push_back(static_cast<typename Container::value_type>(i));
} else {
if(std::fabs(i - t_to) < std::numeric_limits<Type>::epsilon()) {
sequence_of_numbers.push_back(i);
}
break;
}
}
// The direct sequence
} else {
for(Type i = t_from; ; i += t_step) {
if(i < t_to) {
sequence_of_numbers.push_back(static_cast<typename Container::value_type>(i));
} else {
if(std::fabs(i - t_to) < std::numeric_limits<Type>::epsilon()) {
sequence_of_numbers.push_back(i);
}
break;
}
}
}
// Integer data type
} else {
if((t_to - t_from) < 0) {
for(Type i = t_from; i >= t_to; i -= t_step) {
sequence_of_numbers.push_back(static_cast<typename Container::value_type>(i));
}
} else {
for(Type i = t_from; i <= t_to; i += t_step) {
sequence_of_numbers.push_back(static_cast<typename Container::value_type>(i));
}
}
}
sequence_of_numbers.shrink_to_fit();
return sequence_of_numbers;
}
}
With the following call I get the desired result.
std::vector<int> full_reverse_sequence { utility::generator<std::vector<int>>(10000, 0) };
Is there something similar in C++17/C++20, at the syntax or STL library level?

I think std::iota does what you need, as long as the step is 1 (it fills the range by applying ++ to the start value).
For other steps, you may use std::generate or std::generate_n with a simple lambda.
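A minimal sketch of the std::generate_n variant, assuming the caller supplies the number of elements rather than an end value. The helper name makeSequence is made up for this example. Each element is computed from its index instead of by repeated addition, which avoids accumulating rounding error for floating-point types.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: returns {from, from + step, from + 2 * step, ...}
// with exactly `count` elements. A negative step gives a descending sequence.
template <typename T>
std::vector<T> makeSequence(T from, T step, std::size_t count) {
    std::vector<T> result;
    result.reserve(count);
    std::size_t i = 0;
    // Compute each element from its index rather than by repeated addition.
    std::generate_n(std::back_inserter(result), count, [&]() {
        return static_cast<T>(from + step * static_cast<T>(i++));
    });
    return result;
}
```

For example, makeSequence<int>(10, -2, 5) yields 10, 8, 6, 4, 2. C++20 also offers std::views::iota for lazily generated ranges with a step of 1.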

Tests randomly fail

I'm writing a board game and I need the following functionality: the player rolls two dice; if he rolls doubles (the same number on both dice), he gets to roll again; if he rolls doubles again, he goes to jail.
In my Game class it looks like this:
void logic::Game::rollTheDice() {
m_throwsInCurrentTurn++;
int firstThrow = m_firstDice.roll();
int secondThrow = m_secondDice.roll();
m_totalRollResult += firstThrow + secondThrow;
if (firstThrow == secondThrow) m_doublesInCurrentTurn++;
}
std::string logic::Game::checkForDoubles() {
std::string message;
if (m_doublesInCurrentTurn == 0 && m_throwsInCurrentTurn == 1) {
m_canThrow = false;
m_canMove = true;
}
if (m_doublesInCurrentTurn == 1 && m_throwsInCurrentTurn == 1) {
message = "Doubles! Roll again.";
m_canThrow = true;
m_canMove = false;
}
if (m_doublesInCurrentTurn == 1 && m_throwsInCurrentTurn == 2) {
m_canThrow = false;
m_canMove = true;
}
if (m_doublesInCurrentTurn == 2 && m_throwsInCurrentTurn == 2) {
message = "Doubles again! You are going to jail.";
m_canThrow = false;
m_canMove = false;
getActivePlayer().lockInJail();
}
return message;
}
void logic::Game::setInMotion(unsigned number) {
m_players[m_activePlayer].startMoving();
m_players[m_activePlayer].incrementPosition(number);
}
m_canThrow basically enables or disables the "Roll the Dice" button, m_canMove decides whether the player token can start moving, m_players is a std::vector<Player>, and startMoving() does this:
void logic::Player::startMoving() {
m_isMoving = true;
}
It is needed for token movement, so basically not relevant here.
The last function from the Game class I need to show you is reset(), used mainly for testing purposes:
void logic::Game::reset() {
m_throwsInCurrentTurn = 0;
m_doublesInCurrentTurn = 0;
m_totalRollResult = 0;
}
Now, finally, the unit test that sometimes goes wrong. By sometimes I mean completely at random, roughly 1 out of 10-20 runs.
//first throw is double, second throw is not
TEST_F(GameTestSuite, shouldFinishAfterSecondRollAndMove) {
auto game = m_sut.get();
do {
if (game.getThrowsInCurrentTurn() == 2) game.reset();
game.rollTheDice();
game.checkForDoubles();
if (game.getThrowsInCurrentTurn() == 1 && game.getDoublesInCurrentTurn() == 1) {
ASSERT_EQ(game.canThrow(), true);
ASSERT_EQ(game.canMove(), false);
}
} while (game.getThrowsInCurrentTurn() != 2 && game.getDoublesInCurrentTurn() != 1);
ASSERT_EQ(game.canThrow(), false);
ASSERT_EQ(game.canMove(), true);
game.setInMotion(game.getTotalRollResult());
ASSERT_EQ(game.getActivePlayer().isMoving(), true);
ASSERT_EQ(game.getActivePlayer().getPosition(), game.getTotalRollResult());
}
It is exactly this line, ASSERT_EQ(game.canThrow(), false);, that sometimes fails: canThrow() returns true after the do-while loop, which should end only once m_canThrow is set to false.

Shouldn't:
} while (game.getThrowsInCurrentTurn() != 2 && game.getDoublesInCurrentTurn() != 1);
be
} while (game.getThrowsInCurrentTurn() != 2 && game.getDoublesInCurrentTurn() <= 1);
You want to allow up to two turns but 0 or 1 doubles.
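To see the difference between the two conditions, it helps to evaluate them for the state right after a first roll that is a double (one throw, one double), where checkForDoubles() has just set m_canThrow to true. The two small functions below are only an illustration and are not part of the Game class; their names are made up.

```cpp
// Loop condition as written in the test: keep looping while this is true.
bool originalContinues(int throwsInTurn, int doublesInTurn) {
    return throwsInTurn != 2 && doublesInTurn != 1;
}

// Loop condition suggested in this answer.
bool suggestedContinues(int throwsInTurn, int doublesInTurn) {
    return throwsInTurn != 2 && doublesInTurn <= 1;
}
```

With one throw and one double, originalContinues(1, 1) is false, so the loop ends while the player is still allowed to throw and the following ASSERT_EQ(game.canThrow(), false) fails. suggestedContinues(1, 1) is true, so the loop goes on to the second throw.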

C++ std::set<string> Alphanumeric custom comparator

I'm solving a problem that involves sorting the non-redundant permutations of a string.
For example, if the input string is "8aC", then the output should be ordered like {"Ca8", "C8a", "aC8", "a8C", "8Ca", "8aC"}. I chose the C++ set data structure because each time I insert a string into a std::set, the set is automatically sorted and duplicates are eliminated. The output is fine.
But I want to sort the set in an alphanumeric order that differs from the default one. I want to customize the set's comparator so that the priority is: upper case > lower case > digit.
I tried to customize the comparator, but it was quite frustrating. How can I customize the sorting order of the set? Here's my code.
set<string, StringCompare> setl;
for (i = 0; i < f; i++)
{
setl.insert(p[i]); //p is String Array. it has the information of permutation of String.
}
for (set<string>::iterator iter = setl.begin(); iter != setl.end(); ++iter)
cout << *iter << endl; //printing set items. it works fine.
struct StringCompare
{
bool operator () (const std::string s_left, const std::string s_right)
{
/*I want to use my character comparison function in here, but have no idea about that.
I'm not sure about that this is the right way to customize comparator either.*/
}
};
int compare_char(const char x, const char y)
{
if (char_type(x) == char_type(y))
{
return ( (int) x < (int) y) ? 1 : 0 ;
}
else return (char_type(x) > char_type(y)) ? 1 : 0;
}
int char_type(const char x)
{
int ascii = (int)x;
if (ascii >= 48 && ascii <= 57) // digit
{
return 1;
}
else if (ascii >= 97 && ascii <= 122) // lowercase
{
return 2;
}
else if (ascii >= 48 && ascii <= 57) // uppercase
{
return 3;
}
else
{
return 0;
}
}

You are almost there, but you should compare your strings lexicographically.
I made a few small changes to your code.
int char_type( const char x )
{
if ( isupper( x ) )
{
// upper case has the highest priority
return 0;
}
if ( islower( x ) )
{
return 1;
}
if ( isdigit( x ) )
{
// digit has the lowest priority
return 2;
}
// something else
return 3;
}
bool compare_char( const char x, const char y )
{
if ( char_type( x ) == char_type( y ) )
{
// same type so that we are going to compare characters
return ( x < y );
}
else
{
// different types
return char_type( x ) < char_type( y );
}
}
struct StringCompare
{
bool operator () ( const std::string& s_left, const std::string& s_right ) const
{
std::string::const_iterator iteLeft = s_left.begin();
std::string::const_iterator iteRight = s_right.begin();
// we are going to compare each character in strings
while ( iteLeft != s_left.end() && iteRight != s_right.end() )
{
if ( compare_char( *iteLeft, *iteRight ) )
{
return true;
}
if ( compare_char( *iteRight, *iteLeft ) )
{
return false;
}
++iteLeft;
++iteRight;
}
// either of strings reached the end.
if ( s_left.length() < s_right.length() )
{
return true;
}
// otherwise.
return false;
}
};

Your comparator has the right shape. I would take the parameters by const reference, like this:
bool operator () (const std::string &s_left, const std::string &s_right)
and start with this simple implementation:
return s_left < s_right;
This gives the default behaviour and confirms you are on the right track.
Then compare one char at a time with a for loop over the length of the shorter of the two strings. You can read a char from a string with operator[] (e.g. s_left[i]).
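
The index loop described here could look roughly like the following. This is only a sketch: char_before is a placeholder rule (plain character-code order) and IndexLoopCompare is a made-up name; swap in your own per-character rule once the skeleton behaves like the default ordering on plain ASCII input.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Placeholder per-character rule: plain character-code order.
// Assumption: whatever replaces it is a strict "less than" on characters.
inline bool char_before(char x, char y) { return x < y; }

struct IndexLoopCompare
{
    bool operator()(const std::string& s_left, const std::string& s_right) const
    {
        // Only indices that exist in both strings are compared.
        const std::size_t n = std::min(s_left.size(), s_right.size());
        for (std::size_t i = 0; i < n; ++i)
        {
            if (char_before(s_left[i], s_right[i])) return true;
            if (char_before(s_right[i], s_left[i])) return false;
            // Characters are equivalent: keep scanning.
        }
        // One string is a prefix of the other (or they are equal):
        // the shorter one sorts first, equal strings are not "less".
        return s_left.size() < s_right.size();
    }
};
```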

You're very nearly there with what you have.
In your comparison functor you are given two std::strings. What you need to do is to find the first position where the two strings differ. For that, you can use std::mismatch from the standard library. This returns a std::pair filled with iterators pointing to the first two elements that are different:
auto iterators = std::mismatch(std::begin(s_left), std::end(s_left),
std::begin(s_right), std::end(s_right));
If neither iterator has reached the end of its string (one of them will have if the strings are equal or one is a prefix of the other), you can dereference the two iterators to get the characters:
char c_left = *iterators.first;
char c_right = *iterators.second;
You can pass those two characters to your compare_char function and it should all work :-)
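
Putting the std::mismatch idea into a complete functor might look like this. It is a sketch under a few assumptions: the four-iterator overload needs C++14, the per-character rule shown (uppercase, lowercase, digit, other) is illustrative rather than the asker's exact code, and MismatchCompare and char_class are made-up names. The end-of-string checks come first because dereferencing an end iterator is undefined behaviour.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Illustrative character classes: uppercase < lowercase < digit < other.
inline int char_class(char c)
{
    const unsigned char u = static_cast<unsigned char>(c);
    if (std::isupper(u)) return 0;
    if (std::islower(u)) return 1;
    if (std::isdigit(u)) return 2;
    return 3;
}

// Per-character "less than": class first, then character code.
inline bool compare_char(char x, char y)
{
    const int cx = char_class(x);
    const int cy = char_class(y);
    return cx != cy ? cx < cy : x < y;
}

struct MismatchCompare
{
    bool operator()(const std::string& s_left, const std::string& s_right) const
    {
        // Stops at the first differing position or at the end of the shorter string.
        const auto its = std::mismatch(s_left.begin(), s_left.end(),
                                       s_right.begin(), s_right.end());
        // Right string exhausted: left is equal or longer, so it is not "less".
        if (its.second == s_right.end()) return false;
        // Left string exhausted first: it is a proper prefix and sorts first.
        if (its.first == s_left.end()) return true;
        // Both iterators are valid here, so dereferencing is safe.
        return compare_char(*its.first, *its.second);
    }
};
```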

Not absolutely sure about this, but you may be able to use an enumerated class to your advantage, or an array, and read from chosen indices in whichever order you like.
You can use one enumerated class to define the order in which you would like to output the data and another that contains the data to be output; then a loop can assign the values to the output in a permuted order:
namespace CustomeType
{
enum Outs { Ca8= 0,C8a, aC8, a8C, 8Ca, 9aC };
enum Order{1 = 0 , 2, 3 , 4 , 5};
void PlayCard(Outs input)
{
if (input == Ca8) // Enumerator is visible without qualification
{
string[] permuted;
permuted[0] = Outs[0];
permuted[1] = Outs[1];
permuted[2] = Outs[2];
permuted[3] = Outs[3];
permuted[4] = Outs[4];
}// else use a different order
else if (input == Ca8) // this might be much better
{
string[] permuted;
for(int i = 0; i<LessThanOutputLength; i++)
{
//use order 1 to assign values from Outs
}
}
}
}

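This should work:
bool operator () (const std::string& s_left, const std::string& s_right) const
{
// only indices valid in both strings are compared (std::min needs <algorithm>)
const std::size_t n = std::min(s_left.size(), s_right.size());
for(std::size_t i = 0;i < n;i++){
if(s_left[i] == s_right[i]) continue; // equal so far, look at the next position
// assumes letters and digits only: other characters are skipped, not ordered
if(isupper(s_left[i])){
if(isupper(s_right[i])) return s_left[i] < s_right[i];
else if(islower(s_right[i]) || isdigit(s_right[i]))return true;
}
else if(islower(s_left[i])){
if(islower(s_right[i])) return s_left[i] < s_right[i];
else if(isdigit(s_right[i])) return true;
else if(isupper(s_right[i])) return false;
}
else if(isdigit(s_left[i])){
if(isdigit(s_right[i])) return s_left[i] < s_right[i];
else if(islower(s_right[i]) || isupper(s_right[i])) return false;
}
}
// one string is a prefix of the other: the shorter one sorts first
return s_left.size() < s_right.size();
}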

Best way to check if one of other objects is true or not

I am looking for the best way to implement this scenario:
I have 4 objects, each with a Boolean member that is set to true or false at various points in the app's flow, depending on conditions.
Then I have a final function that receives one of these objects and needs to check whether any of the other 3 objects has the member set to true.
I know how to do the check the dirty way, but I am looking for a cleaner one. Here is my code for the final function:
class Obj
{
public :
Obj(int _id) : status(false), id(_id) {}
bool status;
int id; // only 4 objects are created 0,1,2,3
};
m_obj0 = new Obj(0) ;
m_obj1 = new Obj(1) ;
m_obj2 = new Obj(2) ;
m_obj3 = new Obj(3) ;
bool check(Obj* obj)
{
if(obj->id == 0)
{
if(m_obj1->status || m_obj2->status || m_obj3->status)
{
return true;
}
return false;
}else if(obj->id == 1){
if(m_obj0->status || m_obj2->status || m_obj3->status)
{
return true;
}
return false;
}else if(obj->id == 2){
if(m_obj0->status || m_obj1->status || m_obj3->status)
{
return true;
}
return false;
}else if(obj->id == 3){
if(m_obj0->status || m_obj1->status || m_obj2->status)
{
return true;
}
return false;
}
return false; // id outside 0..3
}
Is there a shorter and cleaner way to write this check function?

You can store the objects in an array m_obj, then use a for loop to check:
bool check(Obj* obj)
{
for (int i = 0; i < 4; i ++) {
if (obj->id == i) continue;
if (m_obj[i]->status == true)
return true;
}
return false;
}
Or add the statuses together, subtract m_obj[obj->id]->status, and check whether the result is zero:
bool check(Obj* obj)
{
int result = m_obj[0]->status+m_obj[1]->status+m_obj[2]->status
+m_obj[3]->status-m_obj[obj->id]->status;
return (result!=0);
}
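
The same idea can also be written with std::any_of, which avoids hard-coding the number of objects in the loop. This is a sketch under assumptions: the objects are held by value in a std::array instead of in separate m_obj0..m_obj3 pointers, and check takes that array as a parameter, neither of which is in the original code.

```cpp
#include <algorithm>
#include <array>

struct Obj
{
    explicit Obj(int id_) : status(false), id(id_) {}
    bool status;
    int id; // 0..3, unique per object
};

// True if any object other than `self` has its status set.
inline bool check(const std::array<Obj, 4>& objs, const Obj& self)
{
    return std::any_of(objs.begin(), objs.end(),
                       [&self](const Obj& o) { return o.id != self.id && o.status; });
}
```

Comparing ids instead of addresses keeps the check valid even if a copy of one of the objects is passed in.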

How to implement a natural sort algorithm in c++? - c++

For those that arrive here and are already using Qt in their project, you can use the QCollator class. See this question for details.