ABAP equality check is wrong for INT4 and CHAR numeric - unit-testing

I've run into an issue here, and I can't figure out exactly what SAP is doing. The test is quite simple: I have two variables of completely different types, holding two completely different values.
The input is an INT4 of value 23579235. I am testing the equality function against a string '23579235.43'. Obviously my expectation is that these two variables are different: not only are they not the same type, they don't even have the same value. Nothing about them is similar, actually.
EXPECTED1 23579235.43 C(11) \TYPE=%_T00006S00000000O0000000302
INDEX1 23579235 I(4) \TYPE=INT4
However, cl_abap_unit_assert=>assert_equals reports that these two values are identical. I started debugging and noticed that the EQ operator is used to check the values, and running the same comparison in a simple ABAP report also evaluates to 'true'.
What is happening here, and why doesn't the check fail immediately after noticing that the two data types aren't even the same? Is this a mistake on my part, or are these assert classes just incorrect?
report ztest.
if ( '23579235.43' eq 23579235 ).
write: / 'This shouldn''t be shown'.
endif.

As @dirk said, ABAP implicitly converts compared or assigned variables/literals if they have different types.
First, ABAP decides that the C-type literal is to be converted to type I so that it can be compared with the I literal (and not the other way around), because of the priority rule that applies when you compare types C and I: https://help.sap.com/http.svc/rc/abapdocu_752_index_htm/7.52/en-US/abenlogexp_numeric.htm###ITOC##ABENLOGEXP_NUMERIC_2
|              | decfloat16, decfloat34 | f | p | int8 | i, s, b |
| ------------ | ---------------------- | - | - | ---- | ------- |
| string, c, n | decfloat34             | f | p | int8 | i       |
(intersection of row "string, c, n" and column "i, s, b" -> "i")
Then, ABAP converts the C-type operand to type I for the comparison, using the conversion rules given at https://help.sap.com/http.svc/rc/abapdocu_752_index_htm/7.52/en-US/abenconversion_type_c.htm###ITOC##ABENCONVERSION_TYPE_C_1 :
Source Field Type c -> Numeric Target Fields -> Target:
"The source field must contain a number in mathematical or
commercial notation. [...] Decimal places are rounded commercially
to integer values. [...]"
Workarounds so that '23579235.43' is not implicitly rounded to 23579235, and so the comparison works as expected:
either IF +'23579235.43' = 23579235. (the + makes it an arithmetic expression, i.e. it corresponds to 0 + '23579235.43', which is evaluated as a big packed type with decimals because of another rule named the "calculation type")
or IF conv decfloat16( '23579235.43' ) = 23579235. (decfloat16 and decfloat34 are big numbers with decimals)
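Here is a small sketch of both workarounds next to the original comparison, in a hypothetical report (the report name and the WRITE texts are made up); it just illustrates the conversion rules quoted above:

REPORT ztest_conversion.

" Original comparison: the character literal is converted to type I,
" so 23579235.43 is rounded commercially to 23579235 and the branch is taken.
IF '23579235.43' = 23579235.
  WRITE: / 'shown - the literal was rounded to 23579235'.
ENDIF.

" Workaround 1: the unary + turns the operand into an arithmetic expression,
" so the calculation type is packed with decimals and nothing is rounded away.
IF +'23579235.43' = 23579235.
  WRITE: / 'not shown'.
ENDIF.

" Workaround 2: explicit conversion to a decimal floating point type.
IF CONV decfloat16( '23579235.43' ) = 23579235.
  WRITE: / 'not shown'.
ENDIF.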


FORTRAN subroutine thinks the length of a string I'm passing it is not what it is

Hello Stack Overflow. First-time poster, long-time reader. I am working on debugging a large program (that started in F77 but has evolved), and I'm getting a runtime error saying that the string I'm passing to a subroutine is shorter than expected. The thing is, I put a debug statement right before calling the subroutine, and the string is indeed of the correct length. Can you help me figure this one out? Since the code is long, I'll just post the relevant snippet of the file her1pro.F here (note WORD="HUCKEL " with a space at the end, but this happens with all the strings):
SUBROUTINE PR1INT(WORD,WORK,LWORK,IORDER,NPQUAD,
& TRIANG,PROPRI,IPRINT)
...
CHARACTER WORD*7
...
WRITE(*,*)"LB debug, calling PR1IN1 from PR1INT"
WRITE(*,*)"LB debug, WORD=",WORD
WRITE(*,*)"LB debug, LENGTH(WORD)=",LEN(WORD)
CALL PR1IN1(WORK,KFREE,LFREE,WORK(KINTRP),WORK(KINTAD),
& WORK(KLBINT),WORD,IORDER,NPQUAD,TRIANG,
& PROPRI,IPRINT,DUMMY,NCOMP,TOFILE,'TRIANG',
& DOINT,WORK(KEXPVL),EXP1EL,DUMMY)
...
SUBROUTINE PR1IN1(WORK,KFREE,LFREE,INTREP,INTADR,LABINT,WORD,
& IORDER,NPQUAD,TRIANG,PROPRI,IPRINT,
& SINTMA,NCOMP,TOFILE,MTFORM,
& DOINT,EXPVAL,EXP1EL,DENMAT)
...
CHARACTER LABINT(*)*8, WORD*7, TABLE(NTABLE)*7, MTFORM*6,
& EFDIR*1, LABLOC*8
...
And this is the output I'm getting:
[xxx#yyy WORK_TEST ]$ ~/dalton/build/dalton.x
DALTON: default work memory size used. 64000000
Work memory size (LMWORK+2): 64000002 = 488.28 megabytes; node 0
0: Directories for basis set searches:
./:
LB debug, calling PR1IN1 from PR1INT
LB debug, WORD=HUCKEL
LB debug, LENGTH(WORD)= 7
At line 161 of file /p/home/lbelcher/dalton/DALTON/abacus/her1pro.F
Fortran runtime error: Actual string length is shorter than the declared one for dummy argument 'word' (6/7)
From the standard:
16.9.112 LEN (STRING [, KIND])
Description. Length of a character entity.
Class. Inquiry function.
Arguments.
STRING shall be of type character. If it is an unallocated allocatable variable or a pointer that is not associated, its length type parameter shall not be deferred.
KIND (optional) shall be a scalar integer constant expression.
Result Characteristics. Integer scalar. If KIND is present, the kind type parameter is that specified by the value of KIND; otherwise the kind type parameter is that of default integer type.
Result Value. The result has a value equal to the number of characters in STRING if it is scalar or in an element of STRING if it is an array.
Example. If C is declared by the statement
CHARACTER (11) C (100)
LEN (C) has the value 11.
16.9.113 LEN_TRIM (STRING [, KIND])
Description. Length without trailing blanks.
Class. Elemental function.
Arguments.
STRING shall be of type character.
KIND (optional) shall be a scalar integer constant expression.
Result Characteristics. Integer. If KIND is present, the kind type parameter is that specified by the value of KIND; otherwise the kind type parameter is that of default integer type.
Result Value. The result has a value equal to the number of characters remaining after any trailing blanks in STRING are removed. If the argument contains no nonblank characters, the result is zero.
Examples. LEN_TRIM (' A B ') has the value 4 and LEN_TRIM (' ') has the value 0.
I think the examples here tell the story: LEN returns the declared length of the dummy argument (7 in this case), not the length of the actual argument, so the debug WRITE in PR1INT cannot reveal that the string actually bound to WORD somewhere in the call chain (for example a trimmed 'HUCKEL') is only 6 characters long. That shorter actual argument is what the run-time check is complaining about.
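As a minimal sketch (a made-up reproduction, not the original DALTON code): the dummy argument declares seven characters while the actual argument is only six long, e.g. because it was trimmed somewhere up the call chain. Inside the routine LEN still reports the declared length, which is why a debug WRITE cannot reveal the mismatch; compiled with run-time argument checking (for instance gfortran -fcheck=all), the call aborts with the same "6/7" style error.

program len_demo
   implicit none
   character(6) :: short_word = 'HUCKEL'   ! the actual argument is only 6 characters long
   call pr1in1_demo(short_word)            ! non-conforming call: the dummy expects 7
contains
   subroutine pr1in1_demo(word)
      character(7), intent(in) :: word     ! declared length 7, like WORD*7 in PR1INT
      write(*,*) 'LEN(WORD) =', len(word)  ! prints 7: LEN reports the declared length
   end subroutine pr1in1_demo
end program len_demo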

How do I compare values for equality by Type Constructor?

Background
I'm a relative newcomer to Reason, and have been pleasantly surprised by how easy it is to compare variants that take parameters:
type t = Header | Int(int) | String(string) | Ints(list(int)) | Strings(list(string)) | Footer;
Comparing different variants is nice and predictable:
/* not equal */
Header == Footer
Int(1) == Footer
Int(1) == Int(2)
/* equal */
Int(1) == Int(1)
This even works for complex types:
/* equal */
Strings(["Hello", "World"]) == Strings(["Hello", "World"])
/* not equal */
Strings(["Hello", "World"]) == Strings(["a", "b"])
Question
Is it possible to compare the Type Constructor only, either through an existing built-in operator/function I've not been able to find, or some other language construct?
let a = String("a");
let b = String("b");
/* not equal */
a == b
/* for sake of argument, I want to consider all `String(_)` equal, but how? */
It is possible by inspecting the internal representation of the values, but I wouldn't recommend doing so as it's rather fragile and I'm not sure what guarantees are made across compiler versions and various back-ends for internals such as these. Instead I'd suggest either writing hand-built functions, or using some ppx to generate the same kind of code you'd write by hand.
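For example, a hand-written version for the type t above could look like the sketch below, exhaustively matching on pairs of constructors; it is tedious to write, but explicit and independent of the runtime representation:

/* hand-written sketch for the type t defined in the question */
let same_tag = (a: t, b: t) =>
  switch (a, b) {
  | (Header, Header)
  | (Int(_), Int(_))
  | (String(_), String(_))
  | (Ints(_), Ints(_))
  | (Strings(_), Strings(_))
  | (Footer, Footer) => true
  | _ => false
  };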
But that's no fun, so all that being said, this should do what you want, using the scarcely documented Obj module:
let equal_tag = (a: 'a, b: 'a) => {
  let a = Obj.repr(a);
  let b = Obj.repr(b);
  switch (Obj.is_block(a), Obj.is_block(b)) {
  | (true, true) => Obj.tag(a) == Obj.tag(b)
  | (false, false) => a == b
  | _ => false
  };
};
where
equal_tag(Header, Footer) == false;
equal_tag(Header, Int(1)) == false;
equal_tag(String("a"), String("b")) == true;
equal_tag(Int(0), Int(0)) == true;
To understand how this function works you need to understand how OCaml represents values internally. This is described in the section on Representation of OCaml data types in the OCaml manual's chapter on Interfacing C with OCaml (and already here we see indications that this might not hold for the various JavaScript back-ends, for example, although I believe it does for now at least; I've tested this with BuckleScript/ReScript, and js_of_ocaml tends to follow the internals even more closely).
Specifically, this section says the following about the representation of variants:
type t =
| A (* First constant constructor -> integer "Val_int(0)" *)
| B of string (* First non-constant constructor -> block with tag 0 *)
| C (* Second constant constructor -> integer "Val_int(1)" *)
| D of bool (* Second non-constant constructor -> block with tag 1 *)
| E of t * t (* Third non-constant constructor -> block with tag 2 *)
That is, constructors without a payload are represented directly as integers, while those with payloads are represented as "block"s with tags. Also note that block and non-block tags are independent, so we can't first extract some "universal" tag value from the values that we then compare. Instead we have to check whether they're both blocks or not, and then compare their tags.
Finally, note that while this function will accept values of any type, it is written only with variants in mind. Comparing values of other types is likely to yield unexpected results. That's another good reason to not use this.

Does std::scientific always result in normalized scientific notation for floating-point numbers?

Scientific notation defines how numbers are displayed using a sign, a mantissa and an exponent, but it does not state that the representation is normalized.
An example: -2.34e-2 (normalized scientific notation) is the same as -0.234e-1 (scientific notation)
Can I rely on the following code always producing the normalized outcome?
Edit: except NAN and INF as pointed out in the answers.
template<typename T>
static std::string toScientificNotation(T number, unsigned significantDigits)
{
    if (significantDigits > 0) {
        significantDigits--;
    }
    std::stringstream ss;
    ss.precision(significantDigits);
    ss << std::scientific << number;
    return ss.str();
}
If yes, please point to a section in the C++ standard stating that this is not platform/implementation-defined. Since the value 0 is also represented differently, I'm afraid that certain very small (denormalized?) numbers could be displayed differently. On my platform, with my compiler, it currently works for std::numeric_limits::min() and denorm_min().
Note: I use this to find the order of magnitude of a number without messing with all the quirky details of floating-point analysis. I wanted the standard library to do it for me :-)
Can I rely on the following code always producing the normalized outcome?
There is no guarantee of it, no. Or, better said: the Standard does not impose a guarantee as strong as the one you would like here.
std::scientific is only mentioned in the following relevant parts:
[floatfield.manip]:2
ios_base& scientific(ios_base& str);
Effects: Calls str.setf(ios_­base​::​scientific, ios_­base​::​floatfield).
Returns: str.
Table 101 — fmtflags effects
| Element    | Effect(s) if set |
| ---------- | ---------------- |
| ... | ... |
| scientific | generates floating-point output in scientific notation |
| ... | ... |
Yes, except for zero, infinity and NaN.
The C++ standard refers to the C standard for formatting, which requires normalized scientific notation.
[floatfield.manip]/2
ios_base& scientific(ios_base& str);
Effects: Calls str.setf(ios_­base​::​scientific, ios_­base​::​floatfield).
Returns: str.
[ostream.inserters.arithmetic]/1 (partial)
operator<<(float val);
operator<<(double val);
operator<<(long double val);
Effects: The classes num_­get<> and num_­put<> handle locale-dependent numeric formatting and parsing. These inserter functions use the imbued locale value to perform numeric formatting. When val is of type ..., double, long double, ..., the formatting conversion occurs as if it performed the following code fragment:
bool failed = use_facet<
num_put<charT, ostreambuf_iterator<charT, traits>>
>(getloc()).put(*this, *this, fill(), val).failed();
When val is of type float the formatting conversion occurs as if it performed the following code fragment:
bool failed = use_facet<
num_put<charT, ostreambuf_iterator<charT, traits>>
>(getloc()).put(*this, *this, fill(),
static_cast<double>(val)).failed();
[facet.num.put.virtuals]/1:5.1 (partial)
Stage 1:
The first action of stage 1 is to determine a conversion specifier. The tables that describe this determination use the following local variables
fmtflags flags = str.flags();
fmtflags floatfield = (flags & (ios_base::floatfield));
For conversion from a floating-point type, the function determines the floating-point conversion specifier as indicated in Table 70.
Table 70 — Floating-point conversions
| State | stdio equivalent |
| ------------------------------------------------ | ---------------- |
| floatfield == ios_­base​::​scientific && !uppercase | %e |
| floatfield == ios_­base​::​scientific | %E |
The representation at the end of stage 1 consists of the char's that would be printed by a call of printf(s, val) where s is the conversion specifier determined above.
C11 n1570 [7.21.6.1]:8.4
e,E
A double argument representing a floating-point number is converted in the
style [−]d.ddde±dd, where there is one digit (which is nonzero if the
argument is nonzero) before the decimal-point character and the number of
digits after it is equal to the precision; if the precision is missing, it is taken as 6; if the precision is zero and the # flag is not specified, no decimal-point
character appears. The value is rounded to the appropriate number of digits. The E conversion specifier produces a number with E instead of e
introducing the exponent. The exponent always contains at least two digits,
and only as many more digits as necessary to represent the exponent. If the
value is zero, the exponent is zero.
A double argument representing an infinity or NaN is converted in the style of an f or F conversion specifier.
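To see these cases side by side, here is a small sketch using the helper from the question; the values in the comments are what a typical IEEE-754 double implementation prints, and only the zero, infinity and NaN lines fall outside normalized scientific notation:

#include <cmath>
#include <iostream>
#include <limits>
#include <sstream>
#include <string>

template<typename T>
static std::string toScientificNotation(T number, unsigned significantDigits)
{
    if (significantDigits > 0) {
        significantDigits--;
    }
    std::stringstream ss;
    ss.precision(significantDigits);
    ss << std::scientific << number;
    return ss.str();
}

int main()
{
    std::cout << toScientificNotation(-0.0234, 3) << '\n';                                    // -2.34e-02
    std::cout << toScientificNotation(std::numeric_limits<double>::denorm_min(), 3) << '\n';  // 4.94e-324
    std::cout << toScientificNotation(0.0, 3) << '\n';                                        // 0.00e+00 (zero exponent)
    std::cout << toScientificNotation(std::numeric_limits<double>::infinity(), 3) << '\n';    // inf (f style)
    std::cout << toScientificNotation(std::nan(""), 3) << '\n';                               // nan (f style)
    return 0;
}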

Is it necessary to append _kind to number literals in modern Fortran?

This might be a stupid question, but I'm a bit confused after some recent tests. I always thought the way Fortran deals with reals is the following (modern declaration):
program testReal
   implicit none
   integer, parameter :: rkind = 8
   real(kind=rkind) :: a, b, c

   ! Test 1
   a = 1
   b = 1.
   c = 1._rkind
   write(*,"(2(1x,ES23.15))") a, b, c

   ! Test 2
   a = 1/2
   b = 1/2.
   c = 1./2.
   write(*,"(2(1x,ES23.15))") a, b, c

   ! Test 3
   a = 3.3
   b = 1 * a
   c = 1. * a
   write(*,"(2(1x,ES23.15))") a, b, c
end program testReal
Apart from Test 2's a, everything evaluates the same. I always thought I had to put e.g. 1._rkind, 0.5_rkind, etc. after every real literal in order to make sure the rest of the mantissa is filled with zeros.
Is this just pure luck, or is it really not necessary anymore to attach the _rkind?
Let's look first at Test 1. Here 1, 1. and 1._rkind are literal constants. 1 is an integer of default kind; 1. is a real of default kind; 1._rkind is a real of kind rkind (which could be the same kind as the default). These are all different things.
However, in this case what happens on the assignment is key. As a, b and c are all reals of kind rkind the corresponding right-hand sides are all converted to a real of kind rkind (assuming such a kind has greater precision than default kind). The conversion is equivalent to
a = REAL(1, rkind)
b = REAL(1., rkind)
c = 1._rkind
It just so happens that 1 and 1. are both, in your numeric model, convertible exactly to 1._rkind.
I won't touch on Test 2, as the differences are "obvious".
In Test 3, we have the literal constant 3.3 which is a real of default kind. Again
a = REAL(3.3, rkind)
b = REAL(1, rkind)*REAL(3.3, rkind)
c = REAL(1., rkind)*REAL(3.3, rkind)
due to where and how conversions happen. From this you can see that the results are the same and that the arithmetic happens as real of kind rkind.
What you will notice is a difference between
a = 3.3
b = 3.3_rkind
because the (mathematical) real number 3.3 is not exactly representable in your numeric model, and the approximations differ between real of default kind and real of kind rkind.
In particular, there is no need to worry about "fill[ing] the rest of the mantissa with zeros".
It is not necessary to specify the kind for numbers which are exactly representable. All integer numbers that are not too big are exactly representable as IEEE standard floating-point numbers, so it is not necessary for constants like 1. or 3.; one half is also exactly representable in binary, so 1./2. will work fine.
It is necessary for other values, which are not exactly representable, because without the suffix the literal is treated as default kind (single precision). In your case 3.3 is NOT exactly representable, and you will get different results:
write(*,*) 3.3, 3.3_rkind
3.29999995 3.2999999999999998
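A minimal complete sketch of that difference (rkind = 8 is assumed to select double precision, as in the question; the printed values are typical for IEEE single/double precision):

program kind_demo
   implicit none
   integer, parameter :: rkind = 8
   real(rkind) :: a, b
   a = 3.3          ! default-kind (single precision) approximation, then widened
   b = 3.3_rkind    ! approximated directly in the wider kind
   write(*,*) a     ! e.g. 3.2999999523162842
   write(*,*) b     ! e.g. 3.2999999999999998
end program kind_demo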

Char data type in C/C++

I am trying to call a C++ DLL from Java. In its C++ header file, there are the following lines:
#define a '102001'
#define b '102002'
#define c '202001'
#define d '202002'
What data type do a, b, c, and d have? Are they char or char arrays? And what is the corresponding data type in Java that I should convert them to?
As Mysticial pointed out, these are multicharacter literals. Their values are implementation-defined (the literals themselves have type int in C and C++), and since six 8-bit characters need 48 bits, the natural Java counterpart is long.
In Java, you need to convert them to long manually:
static long toMulticharConst(String s) {
    long res = 0;
    for (char c : s.toCharArray()) {
        res <<= 8;
        res |= ((long) c) & 0xFF;
    }
    return res;
}
final long a = toMulticharConst("102001");
final long b = toMulticharConst("102002");
final long c = toMulticharConst("202001");
final long d = toMulticharConst("202002");
I might try to answer the first two questions. Not being familiar with Java, I have to leave the last question to others.
Single and double quotes mean very different things in C. A character enclosed in single quotes is just the same as the integer representing it in the collating sequence (e.g. in an ASCII implementation, 'a' means exactly the same as 97).
However, a string enclosed in double quotes is a shorthand way of writing a pointer to the initial character of a nameless array that has been initialized with the characters between the quotes, plus an extra character whose binary value is 0.
Because an integer is always large enough to hold several characters, some C compilers allow multiple characters in a character constant as well as in a string constant, which means that writing 'abc' instead of "abc" may well go undetected. Yet "abc" means a pointer to an array containing four characters (a, b, c, and \0), while the meaning of 'abc' is platform-dependent: many C compilers take it to mean "an integer that is composed somehow of the values of the characters a, b, and c".
For more information, you might read section 1.4 of the book "C Traps and Pitfalls".
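A small C sketch of the difference; the multicharacter values are implementation-defined, and the numbers in the comments are what a typical ASCII, gcc-like implementation produces:

#include <stdio.h>

int main(void)
{
    printf("%d\n", 'a');             /* 97 on an ASCII system: a plain int */
    printf("%zu\n", sizeof('a'));    /* sizeof(int) in C, e.g. 4 */
    printf("%zu\n", sizeof("abc"));  /* 4: the characters a, b, c plus the trailing '\0' */
    printf("%d\n", 'ab');            /* implementation-defined, e.g. 0x6162 = 24930 with gcc */
    return 0;
}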