Right padding a String with Zeros in XSLT

I need to right-pad a string with trailing zeros to a length of 3 in the output (which is fixed-length text).
Examples:
A becomes A00
AB becomes AB0
ABC becomes ABC
Please help.

You could do simply:
substring(concat($your-string, '000'), 1, 3)
Note that this means that "ABCD" becomes "ABC".

Related

Slicing string character correctly in C++

I'd like to count the number of 1s in my input. For example, 111 (1+1+1) must return 3, and 101 must return 2 (1+1).
To achieve this, I developed the following sample code:
#include <iostream>
#include <string>
using namespace std;

int main(){
    string S;
    cout << "input number";
    cin >> S;
    cout << "S[0]:" << S[0] << endl;
    cout << "S[1]:" << S[1] << endl;
    cout << "S[2]:" << S[2] << endl;
    int T = (int) (S[0]+S[1]+S[2]);
    cout << "T:" << T << endl;
    return 0;
}
But when I execute this code and input 111, for example, I expect it to return 3, but it returned 147.
[ec2-user@ip-10-0-1-187 atcoder]$ ./a.out
input number
111
S[0]:1
S[1]:1
S[2]:1
T:147
What is wrong here? I am a total novice, so if someone has an opinion, please let me know. Thanks.
It's because S[0] is a char. You are adding the character values of these digits, rather than the numerical value. In ASCII, numerical digits start at value 48. In other words, each of your 3 values are exactly 48 too big.
So instead of doing 1+1+1, you're doing 49+49+49.
The simplest way to convert from a character value to a digit is to subtract 48, which is the value of '0':
e.g., S[0] - '0'.
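Applied to the original code, a minimal fix might look like this (a sketch that assumes the input is always exactly three digit characters):

int T = (S[0] - '0') + (S[1] - '0') + (S[2] - '0');  // each term is now 0..9
cout << "T:" << T << endl;  // prints T:3 for input 111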
Since your goal is to count the occurrences of a character, it makes no sense to sum the characters together. I recommend this:
std::cout << std::ranges::count(S, '1');
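As a complete, runnable sketch (assuming a C++20 compiler, since std::ranges::count lives in <algorithm>):

#include <algorithm>
#include <iostream>
#include <string>

int main() {
    std::string S;
    std::cin >> S;
    // count the occurrences of the character '1' in the input
    std::cout << std::ranges::count(S, '1') << '\n';  // prints 3 for "111", 2 for "101"
    return 0;
}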
To explain the output that you get, characters are integers whose values represent various symbols (and non-printable control characters). The value that represents the symbol '1' is not 1. '1'+'1'+'1' is not '3'.

Beginner: tolower in a vector

So I have some code for counting different characters in a text file, and I don't quite understand what the bottom code line in the following section does:
string linje;
int nLetters = 'Z' - 'A' + 1;
vector<int> bokstaver(nLetters, 0);
int antallTegn = 0;
while(getline(inputfile, linje)){
    for(char tegn : linje){
        if(isLetter(tegn)){
            antallTegn++;
            bokstaver[tolower(tegn) - 'a']++;
I know it converts the tegn variable to lowercase, but I don't understand why we have to subtract 'a'.
Characters are represented by integers in computers. Each integer value represents a character (this gets more complicated with Unicode, but that's beyond the scope of this question). So, 'a' has a numerical value, as does tolower(tegn).
Numbers can be thought of as lying on a number line, where the value is the position of the number on that line. Similarly, number-encoded characters can be thought of as characters on a line where their numerical value is their position.
A number line:
0 1 2 3 4
A character line:
, . - a b c d
Subtraction of two numbers is analogous to their distance on the number line. Similarly, subtracting two characters is their distance on the character line.
So, bokstaver is an array whose indices are positions on the character line, offset by the position of the character 'a'.
Characters are represented using character codes in computers.
For example, character code representing for a is 0x61 (97) in ASCII code.
Therefore, you have to subtract 'a' (character code for a) in order to convert the character code in the input string to index for the vector starting with zero.
This method will work when character codes for alphabets are continuous. ASCII code satisfies this condition.
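For example, a minimal sketch of the index computation (the hex values assume ASCII):

#include <cctype>
#include <iostream>

int main() {
    char tegn = 'C';
    // tolower('C') == 'c' == 0x63 and 'a' == 0x61, so the index is 0x63 - 0x61 == 2
    int index = std::tolower(static_cast<unsigned char>(tegn)) - 'a';
    std::cout << index << '\n';  // prints 2: 'c' is the third letter of the alphabet
    return 0;
}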

How to convert between character and byte position in Objective-C/C/C++

I need to convert from a byte position in a UTF-8 string to the corresponding character position in Objective-C. I'm sure there must be a library to do this, but I cannot find one - does anyone know of one? (Though obviously any C or C++ library would do the job here.)
I realise that I could truncate the UTF-8 string at the required character, convert that to an NSString, then read the length of the NSString to get my answer, but that seems like a somewhat hacky solution to a problem that can be solved quite simply with a small FSM in C.
Thanks for your help.
"Character" is a somewhat ambiguous term, it means something different in different contexts. I'm guessing that you want the same result as your example, [NSString length].
The NSString documentation isn't exactly upfront about this, but [NSString length] counts the number of UTF-16 code units in the string. So U+0000..U+FFFF count as one each, but U+10000..U+10FFFF count as two each. And don't split surrogate pairs!
You can count the number of UTF-16 code units based on the leading byte of each UTF-8 sequence. The trailing bytes use a disjoint set of values, so you don't need to track any state at all, except your position in the string (good news: a finite state machine is overkill).
static const unsigned char BYTE_WIDTHS[256] = {
// 1-byte: 0xxxxxxx
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
// Trailing: 10xxxxxx
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
// 2-byte leading: 110xxxxx
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
// 3-byte leading: 1110xxxx
// 4-byte leading: 11110xxx
// invalid: 11111xxx
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0
};
size_t utf8_utf16width(const unsigned char *string, size_t len)
{
    size_t i, utf16len = 0;
    for (i = 0; i < len; i++)
        utf16len += BYTE_WIDTHS[string[i]];
    return utf16len;
}
The table is 1 for the 1-byte, 2-byte, and 3-byte UTF-8 leading bytes, and 2 for the 4-byte UTF-8 leading bytes, because those code points will end up as two UTF-16 code units (a surrogate pair) when translated to NSString.
I generated the table in Haskell with:
elems $ listArray (0,255) (repeat 0) //
[(n,1) | n <- ([0x00..0x7f] ++ [0xc0..0xdf] ++ [0xe0..0xef])] //
[(n,2) | n <- [0xf0..0xf7]]
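As a quick sanity check of the C function, a hypothetical call might look like this (assuming <stdio.h> and <string.h>; the literal spells "café 😀", where the é is two UTF-8 bytes and the emoji is four):

const char *s = "caf\xC3\xA9 \xF0\x9F\x98\x80";
printf("%zu\n", utf8_utf16width((const unsigned char *)s, strlen(s)));
/* prints 7: five BMP characters at one UTF-16 unit each, plus two units for the emoji */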
Look at the UTF-8 encoding and note that code points begin with the following 8-bit patterns:
76543210 <- bit
0xxxxxxx <- ASCII chars
110xxxxx \
1110xxxx } <- more byte(s) (of form 10xxxxxx) follow
11110xxx /
That's what you should look for when searching for the beginning of a code point.
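In code, that observation boils down to skipping the 10xxxxxx continuation bytes; a minimal sketch (this counts code points, not user-perceived characters):

#include <stddef.h>

size_t count_code_points(const unsigned char *s, size_t len)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) /* not a 10xxxxxx continuation byte */
            n++;
    return n;
}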
But that alone is only part of the solution. You need to take combining characters into account: combining diacritical marks belong together with the main character that precedes them; you cannot just separate them and treat them as independent characters.
There's probably even more to it.

format specifier for short integer

I'm not using the format specifiers in C correctly. A few lines of code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char dest[] = "stack";
    unsigned short val = 500;
    char c = 'a';
    char* final = (char*) malloc(strlen(dest) + 6);
    snprintf(final, strlen(dest)+6, "%c%c%hd%c%c%s", c, c, val, c, c, dest);
    printf("%s\n", final);
    return 0;
}
What I want is to copy:
final[0] = a random char
final[1] = a random char
final[2] and final[3] = the two bytes of the short
final[4] = another char ...
My problem is that I want to copy the two bytes of the short int into 2 bytes of the final array.
Thanks.
I'm confused - the problem is that you are saying strlen(dest)+6 which limits the length of the final string to 10 chars (plus a null terminator). If you say strlen(dest)+8 then there will be enough space for the full string.
Update
Even though a short may only be 2 bytes in size, when it is printed as a string each digit takes up a byte. That means it can require up to 5 bytes of space to write a short to a string, if the value is 10000 or above.
Now, if you write the short to a string as a hexadecimal number using the %x format specifier, it will take up no more than 4 characters (a 16-bit value is at most 0xFFFF).
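For instance, a small sketch of the hex variant, assuming <stdio.h> (%04hx zero-pads to a fixed four characters):

unsigned short val = 500;
char buf[5]; /* four hex digits plus the terminating null */
snprintf(buf, sizeof buf, "%04hx", val); /* buf now holds "01f4", since 500 == 0x1F4 */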
You need to allocate space for 13 bytes - not 11. Don't forget the terminating null.
When formatted, the number (500) takes up three characters, not one. So your snprintf should give the final length as strlen(dest)+5+3. Then also fix your malloc call to match. If you want to compute the length of the number at runtime, you can do it with something like strlen(itoa(val)) (note that itoa is non-standard). Also, don't forget the null terminator at the end: strlen does not count it, so you need one extra byte for it.
The simple answer is that you only allocated enough space for strlen(dest) + 6 characters, when in reality you're going to need 8 extra characters... since you have 2 chars + 3 chars in your number + 2 chars after + dest (5 chars) + the terminator = 13 chars, when you allocated 11.
Unsigned shorts can take up to 5 characters, right? (0 - 65535)
Seems like you'd need to allocate 5 characters for your unsigned short to cover all of the values.
Which would point to using this:
char* final = (char*) malloc(strlen(dest) + 10);
You lose one byte because you think the short variable takes 2 bytes when printed. But it takes three: one for each digit character ('5', '0', '0'). You also need a '\0' terminator (+1 byte).
==> You need strlen(dest) + 8
Use 8 instead of 6 on:
char* final = (char*) malloc(strlen(dest) + 6);
and
snprintf(final, strlen(dest)+6, "%c%c%hd%c%c%s", c, c, val, c, c, dest);
Seems like the primary misunderstanding is that a "2-byte" short can't be represented on-screen as 2 1-byte characters.
First, leave enough room:
char* final = (char*) malloc(strlen(dest) + 9);
Not every possible value of a 1-byte character is printable. If you want to display this on screen and have it be readable, you'll have to encode the 2-byte short, for example as 4 hex characters:
/* as hex, 4 characters */
snprintf(final, strlen(dest) + 9, "%c%c%04x%c%c%s", c, c, val, c, c, dest);
If you are writing this to a file, that's OK, and you might try the following:
/* print raw bytes: upper byte, then lower byte */
snprintf(final, strlen(dest) + 9, "%c%c%c%c%c%c%s", c, c, (val >> 8) & 0xFF, val & 0xFF, c, c, dest);
But that won't make sense to a human looking at it, and is sensitive to endianness. I'd strongly recommend against it.
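Pulling the sizing advice together, a minimal corrected sketch of the original program (sizing the buffer for the worst-case five-digit value, and using %hu since val is unsigned) might be:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char dest[] = "stack";
    unsigned short val = 500;
    char c = 'a';
    size_t size = strlen(dest) + 10; /* 2 chars + up to 5 digits + 2 chars + dest + '\0' */
    char* final = (char*) malloc(size);
    snprintf(final, size, "%c%c%hu%c%c%s", c, c, val, c, c, dest);
    printf("%s\n", final); /* prints "aa500aastack" */
    free(final);
    return 0;
}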

What does "%3d" mean in a printf statement?

In this code, what is the role of the symbol %3d? I know that % means it refers to a variable.
This is the code:
#include <stdio.h>

int main(void)
{
    int t, i, num[3][4];

    for(t=0; t<3; ++t)
        for(i=0; i<4; ++i)
            num[t][i] = (t*4)+i+1;

    /* now print them out */
    for(t=0; t<3; ++t) {
        for(i=0; i<4; ++i)
            printf("%3d ", num[t][i]);
        printf("\n");
    }
    return 0;
}
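For reference, the loops fill num with 1 through 12, so the program prints each value right-aligned in a three-character field:

  1   2   3   4 
  5   6   7   8 
  9  10  11  12 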
%3d can be broken down as follows:
% means "Print a variable here"
3 means "use a field at least 3 characters wide, padding with spaces as needed"
d means "the variable will be a decimal integer"
Putting these together, it means "print an integer, right-aligned in a field of at least 3 characters"
See http://www.cplusplus.com/reference/clibrary/cstdio/printf/ for more information
That is a format specifier to print a decimal number (d) in a field at least three characters wide (3).
From man printf:
An optional decimal digit string specifying a minimum field width. If the converted value has fewer characters than the field width, it will be padded with spaces on the left (or right, if the left-adjustment flag has been given) to fill out the field width.
Take a look here:
printf("%3d", X);
If X is 1234, it prints 1234.
If X is 123, it prints 123.
If X is 12, it prints _12, where _ is a single leading whitespace character.
If X is 1, it prints __1, where __ is two leading whitespace characters.
An example to complement the existing answers:
printf("%3d", x);
When:
x is 1234, it prints 1234
x is 123, it prints 123
x is 12, it prints 12 with one extra padding space
x is 1, it prints 1 with two extra padding spaces
You can specify the field width between the % and the d (for decimal). It represents the minimum total number of characters printed.
A positive value, as mentioned in another answer, right-aligns the output and is the default.
A negative value left-aligns the text.
example:
int a = 3;
printf("|%-3d|", a);
The output:
|3  |
You could also specify the field width as an additional parameter by using the * character:
int a = 3;
printf("|%*d|", 5, a);
which gives:
|    3|
It is a formatting specification. %3d says: print the argument as a decimal integer with a minimum field width of 3.
Literally, it means to print an integer padded to at least three characters with spaces. The % introduces a format specifier, the 3 indicates a minimum width of 3, and the d indicates a decimal integer. Thus, each value of num[t][i] is printed to the screen as a value such as "  1", "  2", " 12", etc.
The 3 (or 2, or any integer) is the padding/width: it specifies a minimum field width. For example, with a width of 3, if we print a = 4, the output is "  4", with two spaces before the 4 because the value is only one character wide.