Sorting in alphabetical with å ä ö


I have an algorithm that sorts words alphabetically by the numeric value of their letters. This works fine until I include å, ä and ö, because they return int values ranging from -103 to -124. Because of this, the words come out in the order ä å ö a, for example, when it should be a å ä ö. So how do I make it sort correctly, with å ä ö last?
Edit: I'm not allowed to use fancy functions, which is why this code is so bare-bones. The code also relies on using namespace std.
My code:
pali is a vector of type string that I use to store the words
void SortPal() {
int antal = pali.size();
string tempO;
bool byte = false;
for (int i = 0; i < antal - 1; i++) { // walks through all words in the vector
if (int(pali[i][0]) > int(pali[i + 1][0])) {
tempO = pali[i];
pali[i] = pali[i + 1];
pali[i + 1] = tempO;
i = -1;
}
else if (int(pali[i][0]) == int(pali[i + 1][0])) { // if the first letters are the same, check the following ones
int minsta = pali[i].size();
if (minsta > pali[i + 1].size()) {
minsta = pali[i + 1].size();
}
for (int a = 1; a < minsta-1; a++){
if (int(pali[i][a]) > int(pali[i + 1][a])) { // swaps if any letter after the first is smaller than the corresponding letter in the second word
tempO = pali[i];
pali[i] = pali[i + 1];
pali[i + 1] = tempO;
i = -1;
byte = true;
break;
}
}
if (byte == false && pali[i].size() > pali[i + 1].size()) { // swaps if pali[i + 1] is shorter than pali[i]
tempO = pali[i];
pali[i] = pali[i + 1];
pali[i + 1] = tempO;
i = -1;
}
}
}
}

Generally speaking, there's no relationship between the alphabetical order of letters in any given language and numerical codes assigned to said letters in any given character set. In order to compare strings according to the alphabetical order of a given language (or more generally the collation order of the current locale), C has a special function called strcoll.
In order to use it, you need to set up your locale accordingly. Unfortunately, locale names are not standard in C. If you are on Windows, the linked example is unlikely to work.
This is what you should be using in real software. It matters little for your assignment, since you are not supposed to use fancy library functions. You need to implement a function similar to strcoll yourself, and it only needs to work for your language.
In a language where each character has its own place in the alphabet, this function is simple: write a function that takes a character and returns its place in the alphabet (e.g. for 'a' return 1, for 'b' return 2, ..., for 'å' return 27, for 'ä' return 28...) Compare the strings according to numbers returned by this function. This may or may not take into account letter case depending on what exact sort order you want.
If you don't want to write a big switch, you can use the fact that letters that are in ASCII are already ordered as you want, you only need to fix the order of three additional letters. So you can write something like this:
int collation_order(int ch) {
switch (ch) {
case 'Å': return 'Z'+1;
case 'å': return 'z'+1;
case 'Ä': return 'Z'+2;
case 'ä': return 'z'+2;
case 'Ö': return 'Z'+3;
case 'ö': return 'z'+3;
default : return ch;
}
}
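int my_strcoll (const char* p, const char* q)
{
    int pp, qq;
    // Always compute both ranks before testing them, so that pp and qq
    // are initialized even when one of the strings is empty.
    do {
        pp = collation_order(*p++);
        qq = collation_order(*q++);
    } while (pp != 0 && pp == qq);
    return pp - qq;
}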
Of course this means that non-alphabetic characters that come after Z/z in the ASCII table will get sorted incorrectly. If you want to sort those after Ö/ö, you need to extend collation_order accordingly. Try doing this without resorting to a case for each individual character.
Another way to write collation_order is to use character codes (cast to unsigned char) as indices in an array of 256 integer elements.
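As a rough sketch of that array-based variant: the table below maps every possible byte to a rank, and only the six Swedish letters are moved. It assumes the text is stored in a single-byte ISO-8859-1 (Latin-1) style encoding, so the byte values 0xC5, 0xC4, 0xD6, 0xE5, 0xE4 and 0xF6 are assumptions about your encoding, and the names rank_table and init_rank_table are made up for illustration.

```cpp
// Rank of every possible byte value, indexed by the character cast to unsigned char.
// Assumption: single-byte ISO-8859-1 encoding (not UTF-8).
static int rank_table[256];

// Fills the table: ASCII keeps its own order, the Swedish letters go after Z/z.
void init_rank_table() {
    for (int i = 0; i < 256; ++i) {
        rank_table[i] = i;
    }
    rank_table[0xC5] = 'Z' + 1; // Å
    rank_table[0xC4] = 'Z' + 2; // Ä
    rank_table[0xD6] = 'Z' + 3; // Ö
    rank_table[0xE5] = 'z' + 1; // å
    rank_table[0xE4] = 'z' + 2; // ä
    rank_table[0xF6] = 'z' + 3; // ö
}

// The cast matters: a plain char may be negative, which is not a valid index.
int collation_order(char ch) {
    return rank_table[static_cast<unsigned char>(ch)];
}
```

Call init_rank_table() once before sorting. As with the switch version, the ranks 'Z'+1 and so on collide with the punctuation that follows Z and z in ASCII, so those characters would need their own entries if they can appear in the words.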
Also please note that old 8-bit encodings are old and should not be used for serious new development. For more information, read this.

Since your options are constrained and you can also constrain your input to a foreseeable universe, I'd suggest you to use a simple parser function to fit non-ASCII characters inside the places you know they should:
int parse_letter( int source )
{
switch( source )
{
case 'å':
case 'ä': return 'a';
case 'ö': return 'o';
// as many cases as needed...
default: return source;
}
}

Related

Generate string lexicographically larger than input

Given an input string A, is there a concise way to generate a string B that is lexicographically larger than A, i.e. A < B == true?
My raw solution would be to say:
B = A;
++B.back();
but in general this won't work because:
A might be empty
The last character of A may be close to wraparound, in which case the resulting character will have a smaller value i.e. B < A.
Adding an extra character every time is wasteful and will quickly result in unreasonably large strings.
So I was wondering whether there's a standard library function that can help me here, or if there's a strategy that scales nicely when I want to start from an arbitrary string.

You can duplicate A into B then look at the final character. If the final character isn't the final character in your range, then you can simply increment it by one.
Otherwise you can look at last-1, last-2, last-3. If you get to the front of the list of chars, then append to the length.

Here is my dummy solution:
std::string make_greater_string(std::string const &input)
{
std::string ret{std::numeric_limits<
std::string::value_type>::min()};
if (!input.empty())
{
if (std::numeric_limits<std::string::value_type>::max()
== input.back())
{
ret = input + ret;
}
else
{
ret = input;
++ret.back();
}
}
return ret;
}
Ideally I'd hope to avoid explicitly handling all the special cases, and use some facility that handles them more naturally. Already, looking at the answer by @JosephLarson, I see that I could increment more than just the last character, which would extend the range achievable without adding more characters.
And here's the refinement after the suggestions in this post:
std::string make_greater_string(std::string const &input)
{
constexpr char minC = ' ', maxC = '~';
// Working with limits was a pain,
// using ASCII typical limit values instead.
std::string ret{minC};
auto rit = input.rbegin();
while (rit != input.rend())
{
if (maxC == *rit)
{
++rit;
if (rit == input.rend())
{
ret = input + ret;
break;
}
}
else
{
ret = input;
++(*(ret.rbegin() + std::distance(input.rbegin(), rit)));
break;
}
}
return ret;
}

You can copy the string and append some letters - this will produce a lexicographically larger result.
B = A + "a"

other ideas for calculating a string that contains e.g. "5 + 3 / 2"

I have written a program that takes a string as input. The string should be an ordinary arithmetic term, e.g. "5 + 3 / 2", and all numbers and operators have to be separated by whitespace. The term can be as long as you like, e.g. "1 * 2 * 5 - 1 * 4 + 1 + 5 + 3 + 3 + 3" should work too. +, -, * and / are the only operators allowed.
I already have working code, but it ignores the fact that * and / come before + and -. It does everything else perfectly.
The idea is that it creates two arrays: one saves the operators in a char array (called char operators[ ]) and the other saves the numbers in a float array (called float values[ ]). Then I have this calculation method:
void calc(float values[], char operators[]) {
float res_final;
float res_array[10];
int counter = (sizeof(values) / sizeof(*values));
for (int i = 0; i < getBorder(values); i++) {
if (i == 0) {
res_array[i] = switchFunction(values[i], values[i + 1], operators[i]);
}
res_final = switchFunction(res_array[i], values[i + 2], operators[i + 1]);
res_array[i+1] = res_final;
if (i == getBorder(values)) {
break;
}
}
std::cout << "evaluation of expression is: " << res_final << std::endl;
}
float switchFunction(float val_1, float val_2, char op) {
switch (op) {
case '+': return val_1 + val_2;
break;
case '-': return val_1 - val_2;
break;
case '*': return val_1 * val_2;
break;
case '/': return val_1 / val_2;
break;
}
return 0;
}
Well, the code is not really pretty, but I couldn't come up with anything more useful. I had many ideas, but they all failed when it came to the operators. I wanted to map the character '+' to the real + operator, and likewise for the rest, but this won't work.
So if you have any suggestions on how to respect operator precedence (multiplication and division before addition and subtraction), or if you have a completely different approach from mine, I would be glad to hear about it :)

In the long run, you want to create an Object that represents the formula.
A good structure would be a tree. An inner node of such a tree is an operator, while a leaf is a number.
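Then you write a parser that parses a string into a tree. I'd do this recursively, like this:
FormulaNode parse(input){
    string left, right;
    // Split at the lowest-precedence operator first, so that it ends up
    // closest to the root and * and / are evaluated before + and -.
    // Splitting at the last occurrence keeps "a - b - c" as "(a - b) - c".
    if(split_string(input, + or -, left, right)){
        return FormulaNode(+ or -, parse(left), parse(right));
    }
    if(split_string(input, * or /, left, right)){
        ...
    }
    return FormulaNode(number, to_value(input));
}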
with split_string being a method that tries to split a string by a certain symbol, returns a boolean if that was possible and splits it into the references left and right,
FormulaNode(symbol, left child, right child) being a constructor that creates an inner node,
FormulaNode(number, value) being a constructor that creates a leaf.
Of course, all of this is pseudo-code; I didn't want to impose a style on you, just to illustrate the principle. The second constructor should probably just have the signature FormulaNode(const double). As for symbol, I'd recommend creating something like enum OperatorType {addition,...}.
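To make the recursive splitting concrete, here is a minimal sketch that evaluates the expression directly instead of building FormulaNode objects. The function name evaluate is hypothetical, and the sketch assumes the input format from the question: numbers and operators separated by single spaces, no brackets, and well-formed input (std::stod throws on anything that is not a number).

```cpp
#include <cstddef>
#include <cstring>
#include <initializer_list>
#include <string>

// Evaluates a space-separated expression such as "5 + 3 / 2".
// Hypothetical helper: it mirrors the recursive split, but returns the
// value of each subtree instead of a node object.
double evaluate(const std::string& expr) {
    // Try the lowest-precedence operators first so they end up nearest the root.
    for (const char* ops : {"+-", "*/"}) {
        // Scan right to left so operators of equal precedence associate
        // to the left: "10 - 2 - 3" is split as "(10 - 2) - 3".
        for (std::size_t i = expr.size(); i-- > 0;) {
            // An operator token has a space on both sides; this keeps the
            // sign of a negative number such as "-3" from being split.
            const bool is_operator = std::strchr(ops, expr[i]) != nullptr
                && i > 0 && i + 1 < expr.size()
                && expr[i - 1] == ' ' && expr[i + 1] == ' ';
            if (!is_operator) continue;
            const double left = evaluate(expr.substr(0, i - 1));
            const double right = evaluate(expr.substr(i + 2));
            switch (expr[i]) {
                case '+': return left + right;
                case '-': return left - right;
                case '*': return left * right;
                default:  return left / right;
            }
        }
    }
    // No operator left: the remaining text is a single number (a leaf).
    return std::stod(expr);
}
```

Calling evaluate("5 + 3 / 2") splits at the + first, so the division is evaluated inside the right-hand subtree before the addition is applied. Turning this into a tree means returning a node where the sketch returns a number.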
EDIT:
here is a bigger architecture with a somewhat different design:
class FormulaTree{
private:
class FormulaNode{
private:
bool is_number;
//used members if is number
double value;
//used members if not is number / is operator
OperatorType type;
unique_ptr<FormulaNode> left_child, right_child;
public:
FormulaNode(string input);
double evaluate() const;
};
unique_ptr<FormulaNode> root;
public:
FormulaTree(string input);
double evaluate() const;
};
with (in pseudo-code)
FormulaTree::FormulaNode::FormulaNode(string input){
    // Split at the lowest-precedence operator first, so that it ends up
    // closest to the root; the last occurrence keeps left-to-right evaluation.
    if(input contains + or -){
        char symbol = last occurrence(input, + or -);
        vector<string> split_input = split at last occurrence(input, symbol);
        type = OperatorType(symbol);
        is_number = false;
        left_child = make_unique<FormulaNode>(split_input[0]);
        right_child = make_unique<FormulaNode>(split_input[1]);
        return;
    }
    if(input contains * or /){
        ...
    }
    is_number = true;
    value = parse to int(input);
}
(in the long run, you might also want to add something that checks whether the input is legal, like "the string is not empty on one side of an operator", "parse to int worked, that is, the input contained no illegal characters" et cetera)
(also, if you continue to expand this, you need some parser that splits it by brackets first)
If you need me to explain anything about this structure, simply ask, I'll edit.
EDIT:
Slava commented that it would be better to derive FormulaNode for the different types. This is right, and I originally edited this to show such a design, but I removed it again because it might easily confuse a beginner.
Especially since such a pattern would require a somewhat different layout - we would want to let the tree itself do the parsing since the derived classes shouldn't know each other. In the long run, you want to learn such things. I'd recommend that you try out the pattern I presented, add your own style, add some more features (like a symbol for power or the option to use a minus to denote a negative number) and then put it on CodeReview. My reasoning is that this is what you want to do anyway and when you do, your code will be attacked at every part anyway, until it's "perfect".

Inexistent double decrement?

I was writing a little game in which there is a hidden word and the user must guess it character by character.
While coding this I got stuck on something: I don't understand where and how it happens.
while(true)
{
if(Hue == 0)
Try -= 1;
if(Hue == 1)
Hue = 0;
GotoXY(0, 3);
printf("Inserisci una lettera maiuscola\n>");
GotoXY(1, 4);
scanf("%c", &Key);
GotoXY(0, 4);
printf(" ");
GotoXY(0, 6);
printf("Numero di tentativi rimasti: %d ", Try);
for(unsigned short Iterator = 1; Iterator < Length - 1; ++Iterator)
if(Key == UserString[Iterator])
{
for(unsigned short SecIterator = Iterator; SecIterator < Length - 1; ++SecIterator)
{
if(Key == UserString[SecIterator])
{
GotoXY(SecIterator, 1);
printf("%c", Key);
}
}
Hue = 1;
break;
}
}
Hue is a simple control variable that records whether the key was in the word.
If it is still 0, the key wasn't in the word, so Try is decremented, and so on.
But what happens is that Try is decremented whether Hue is 0 or 1, and, even stranger, Try is decremented twice when Hue is 0, even though nothing in the code does that.
Thanks for the help.

It seems the confusion is mostly due to the double decrement: well, you are reading chars and most likely you hit return making two chars available: the entered character and the '\n' from the return. Since apparently neither character matches you get two decrements.
Just for a bit of explanation: when using the formatted input using std::cin >> Key leading whitespace is skipped. When using scanf("%c", &c) each character is extracted. I think you can have scanf() skip leading spaces using
if (1 == scanf(" %c", &c)) {
// process the input
}
Note the extra space in front of the '%c'. To debug issues like this it is generally a good idea to print what was read. ...and, of course, you always need to verify that the read was actually successful.
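The difference between the two format strings can be seen in isolation with sscanf, which parses from a string instead of the keyboard. This is only an illustrative sketch; the helper names read_raw and read_skipping_whitespace are made up, and the input "\nB" stands in for a newline left over from the previous entry followed by the next key.

```cpp
#include <cstdio>

// "%c" extracts the very next character, whitespace included.
// Returns '\0' if nothing could be read.
char read_raw(const char* text) {
    char c = '\0';
    if (std::sscanf(text, "%c", &c) != 1) {
        return '\0';
    }
    return c;
}

// " %c" first skips any leading whitespace, then extracts one character.
// Returns '\0' if nothing could be read.
char read_skipping_whitespace(const char* text) {
    char c = '\0';
    if (std::sscanf(text, " %c", &c) != 1) {
        return '\0';
    }
    return c;
}
```

With the input "\nB", read_raw returns the newline while read_skipping_whitespace returns 'B', which is the behaviour the game loop needs.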

Converting scancodes to ASCII

I'm implementing my own text editor in c++. It's going... ok. ;P
I need a way to turn a keycode (specifically Allegro, they call it scancodes) into an ASCII-char. I can do A-Z easy, and converting those to a-z is easy as well. What I do currently is use a function in Allegro that returns a name from a scancode (al_keycode_to_name), meaning if the key pressed is A-Z it returns "A" to "Z". That's easy peasy, but I can't simply read special characters like ",", ";" etc. That's where I'm having a hard time.
Is there a way to do this automatically? Maybe a library that does this? The real trick is taking different layouts into consideration.
Here's what I have so far, in case anyone's interested. The class InputState is basically a copy of the Allegro inputstate, with added functionality (keyDown, keyUp, keyPress for example):
void AllegroInput::TextInput(const InputState &inputState, int &currentCharacter, int &currentRow, std::string &textString)
{
static int keyTimer = 0;
static const int KEY_TIMER_LIMIT = 15;
for (int i = 0; i < 255; i++)
{
if (inputState.key[i].keyDown)
{
keyTimer++;
}
if (inputState.key[i].keyPress)
{
keyTimer = 0;
}
if ((inputState.key[i].keyPress) || ((inputState.key[i].keyDown) && (keyTimer >= KEY_TIMER_LIMIT)))
{
std::string ASCII = al_keycode_to_name(i);
if ((ASCII.c_str()[0] >= 32) && (ASCII.c_str()[0] <= 126) && (ASCII.length() == 1))
{
textString = textString.substr(0, currentCharacter) + ASCII + textString.substr(currentCharacter, textString.length());
currentCharacter++;
}
else
{
switch(i)
{
case ALLEGRO_KEY_DELETE:
if (currentCharacter >= 0)
{
textString.erase(currentCharacter, 1);
}
break;
case ALLEGRO_KEY_BACKSPACE:
if (currentCharacter > 0)
{
currentCharacter--;
textString.erase(currentCharacter, 1);
}
break;
case ALLEGRO_KEY_RIGHT:
if (currentCharacter < textString.length())
{
currentCharacter++;
}
break;
case ALLEGRO_KEY_LEFT:
if (currentCharacter > 0)
{
currentCharacter--;
}
break;
case ALLEGRO_KEY_SPACE:
if (currentCharacter > 0)
{
textString = textString.substr(0, currentCharacter) + " " + textString.substr(currentCharacter, textString.length());
currentCharacter++;
}
break;
}
}
}
}
}

You should be using the ALLEGRO_EVENT_KEY_CHAR event with the event.keyboard.unichar value to read text input. ALLEGRO_EVENT_KEY_DOWN and ALLEGRO_EVENT_KEY_UP correspond to physical keys being pressed. There is not a 1:1 correspondence between them and printable characters.
Say a dead key is being used to convert the two keys e' to é. You'd get two key down events for e and ' (and neither are useful for capturing the proper input), but one key char event with é. Or inversely, maybe somebody mapped F4 to a macro that unleashes an entire paragraph of text. In that case, you'd have multiple chars for a single key down.
Or a simple test: if you hold down a key for five seconds, you will get one ALLEGRO_EVENT_KEY_DOWN but multiple ALLEGRO_EVENT_KEY_CHAR as the OS' keyboard driver sends repeat events.
You can use ALLEGRO_USTR to easily store these unicode strings.
ALLEGRO_USTR *input = al_ustr_new("");
// in the event loop
al_ustr_append_chr(input, event.keyboard.unichar);
There are also ways to delete characters if backspace is pressed, etc. You can use the ustr data types with the font add-on directly via al_draw_ustr(font, color, x, y, flags, input), or you can use al_cstr(input) to get a read-only pointer to a UTF-8 string.

Switch every pair of words in a string (“ab cd ef gh ijk” becomes “cd ab gh ef ijk”) in c/c++

Switch every pair of words in a string (“ab cd ef gh ijk” becomes “cd ab gh ef ijk”) in c++ without any library function.
int main(){
char s[]="h1 h2 h3 h4";//sample input
switch_pair(s);
std::cout<<s;
return 0;
}
char * switch_pair(char *s){
char * pos = s;
char * ptr = s;
int sp = 0;//counts number of space
while(*pos){
if(*pos==' ' && ++sp==2){ //if we hit a space and it is second space then we've a pair
revStr_iter(ptr,pos-1);//reverse the pair so 'h1 h2' -> '2h 1h'
sp=0;//set no. of space to zero to hunt new pairs
ptr=pos+1;//reset ptr to nxt word after the pair i.e. h3'
}
pos++;
}
if(sp==1) //tackle the case where input is 'h1 h2' as only 1 space is there
revStr_iter(ptr,pos-1);
revWord(s); //this will reverse each individual word....i hoped so :'(
return s;
}
char* revStr_iter(char* l,char * r){//trivial reverse string algo
char * p = l;
while(l<r){
char c = *l;
*l = *r;
*r = c;
l++;
r--;
}
return p;
}
char* revWord(char* s){//this is the villain....need to fix it...Grrrr
char* pos = s;
char* w1 = s;
while(*pos){
if(*pos==' '){//reverses each word before space
revStr_iter(w1,pos-1);
w1=pos+1;
}
pos++;
}
return s;
}
Input - h1 h2 h3 h4
expected - h2 h1 h4 h3
actual - h2 h1 h3 4h
can any noble geek soul help plz :(((

IMO, what you're working on so far looks/seems a lot more like C code than C++ code. I think I'd start from something like:
break the input into word objects
swap pairs of word objects
re-construct string of rearranged words
For that, I'd probably define a really minimal string class. Just about all it needs (for now) is the ability to create a string given a pointer to char and a length (or something on that order), and the ability to assign (or swap) strings.
I'd also define a tokenizer. I'm not sure if it should really be a function or a class, but for the moment, let's just say "function". All it does is look at a string and find the beginning and end of a word, yielding something like a pointer to the beginning, and the length of the word.
Finally, you need/want an array to hold the words. For a first-step, you could just use a normal array, then later when/if you want to have the array automatically expand as needed, you can write a small class to handle it.

int Groups = 1; // Count 1 for the first group of letters
for ( int Loop1 = 0; Loop1 < strlen(String); Loop1++)
if (String[Loop1] == ' ') // Any extra groups are delimited by space
Groups += 1;
int* GroupPositions = new int[Groups]; // Stores the positions
for ( int Loop2 = 0, Position = 0; Loop2 < strlen(String); Loop2++)
{
if (String[Loop2] != ' ' && (Loop2 == 0 || String[Loop2-1] == ' ')) // test Loop2 == 0 first so String[-1] is never read
{
GroupPositions[Position] = Loop2; // Store position of the first letter
Position += 1; // Increment the next position of interest
}
}
If you can't use strlen, write a function that counts characters until it encounters the null terminator '\0'.
