Getting weird result by using %I64u inside MinGW-w64 - c++

This is my code:
Note: the \n inside scanf is my way of preventing the trailing-newline problem. It isn't the best solution, but I'm using it so much that it's become a habit. :-)
...
int main()
{
    unsigned long long int input[2], calc_square;
    while(scanf("\n%I64u %I64u", input[0], input[1]) == 2)
    {
        printf("%I64u %I64u\n", input[0], input[1]);
        ...
My input and expected program result are:
Input:
89 89
For output, instead of printing back 89, it shows this:
I64u I64u
I'm using g++ (GCC) 4.9.1 from the MSYS2 package. Note that I use g++ because some portions of my code currently use the C++ STL.
Edit: I changed my code to use the standard %llu instead of %I64u; here are my input and the program result:
Input:
89 89
For output, it's a rather weird result:
25769968512 2337536

This code is wrong:
while(scanf("\n%I64u %I64u", input[0], input[1]) == 2)
input[0] and input[1] each have type unsigned long long, but scanf requires arguments of type unsigned long long * (pointer to unsigned long long). I'm unsure whether MinGW supports checking printf and scanf format specifiers, but ordinary GCC can detect these kinds of errors at compile time as long as you enable the proper warnings. I highly recommend always compiling with as many warnings enabled as you can, such as -Wall -Wextra -Werror -pedantic in the most extreme case.
You need to pass in the address of these variables:
while(scanf("\n%I64u %I64u", &input[0], &input[1]) == 2)
//                           ^          ^
//                           |          |

I suspect you have been using MSYS2's GCC, which isn't a native Windows compiler and doesn't support the MS-specific %I64 format modifiers (MSYS2's GCC is very much like Cygwin's GCC).
If you wanted to use MinGW-w64 GCC, you should have launched mingw64_shell.bat or mingw32_shell.bat and have the appropriate toolchain installed:
pacman -S mingw-w64-i686-toolchain
or
pacman -S mingw-w64-x86_64-toolchain
With that done, you can safely use either modifier on any Windows version dating back to Windows XP SP3 provided you pass -D__USE_MINGW_ANSI_STDIO=1.
FWIW, I avoid using the MS-specific modifiers and always pass -D__USE_MINGW_ANSI_STDIO=1.
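For reference, here is a minimal corrected sketch of the loop using the portable %llu, assuming it is built with that define (the elided parts of the original program are omitted):

#include <cstdio>

int main()
{
    unsigned long long int input[2];
    // Pass the addresses of the variables so scanf can write into them.
    while(scanf("\n%llu %llu", &input[0], &input[1]) == 2)
    {
        printf("%llu %llu\n", input[0], input[1]);
    }
    return 0;
}

Compiled with something like: g++ -Wall -Wextra -D__USE_MINGW_ANSI_STDIO=1 code.cpp -o code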
Finally, annoyingly, your sample doesn't work when launched from the MSYS2 shell because mintty is not a proper Windows console; you need to run it from cmd.exe.

Related

Large global array of vectors causing compilation error

I have a very simple piece of C++ code (it was a large program, but I stripped it to the essentials), and it fails to compile. I'm providing all the details below.
The code
#include <vector>
const int SIZE = 43691;
std::vector<int> v[SIZE];
int main() {
    return 0;
}
Compilation command: g++ -std=c++17 code.cpp -o code
Compilation error:
/var/folders/l5/mcv9tnkx66l65t30ypt260r00000gn/T//ccAtIuZq.s:449:29: error: unexpected token in '.section' directive
        .section .data.rel.ro.local
                                   ^
GCC version: gcc version 12.1.0 (Homebrew GCC 12.1.0_1)
Operating system: macOS Monterey, version 12.3, 64-bit architecture (M1)
My findings and remarks:
The constant SIZE isn't random here. I tried many different values, and SIZE = 43691 is the first one that causes the compilation error.
My guess is that it is caused by stack overflow, so I tried to compile using the flag -Wl,-stack_size,0x20000000, and also tried ulimit -s 65520. But neither of them has any effect on the issue; the code still fails to compile once SIZE exceeds 43690.
I also calculated the amount of stack memory this code consumes when SIZE = 43690. AFAIK, a vector takes 24 B of stack memory, so the total comes to 24 B * 43690 = 1048560 B. This number is very close to 2^20 = 1048576. In fact, SIZE = 43691 is the first value for which the consumed stack memory exceeds 2^20 B. Unless this is quite some coincidence, my stack memory is somehow limited to 2^20 B = 1 MB. If that really is the case, I still cannot find any way to increase it via the compilation command arguments. All the solutions on the internet lead to the stack_size linker argument, but it doesn't seem to work on my machine. I'm wondering now if it's somehow because of the M1 chip.
I'm aware that I can change this code to use a vector of vectors so that the memory comes from the heap (see the sketch after this question), but I very often have to deal with other people's code written in this style.
Let me know if I need to provide any more details. I've been stuck on this the whole day. Any help would be extremely appreciated. Thanks in advance!
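For reference, the vector-of-vectors alternative mentioned in the question might look like the following sketch; the per-element storage is then allocated from the heap when the program starts, instead of sitting in one huge static array:

#include <vector>

const int SIZE = 43691;

// One outer vector whose SIZE inner vectors are heap-allocated.
std::vector<std::vector<int>> v(SIZE);

int main() {
    return 0;
}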
I had the same issue, and after adding the -O2 flag to the compilation command, it started working. No idea why.
So, something like this:
g++-12 -O2 -std=c++17 code.cpp -o code
It does seem to be an M1 / M1 Pro issue. I tested your code on two separate M1 Pro machines with the same result as yours. One workaround I found is to use the x86_64 version of gcc under Rosetta, which doesn't have these allocation problems.
Works on an M1 Max running Monterey 12.5.1 with Xcode 13.4.1, using the clang 13.1.6 compiler:
% cat code.cpp
#include <vector>
const int SIZE = 43691;
std::vector<int> v[SIZE];
int main() {
    return 0;
}
% cc -std=c++17 code.cpp -o code -lc++
% ./code
Also fails with gcc-12.2.0:
% g++-12 -std=c++17 code.cpp -o code
/var/tmp/ccEnyMCk.s:449:29: error: unexpected token in '.section' directive
        .section .data.rel.ro.local
                                   ^
So it seems to be a gcc-on-M1 issue.
This is a gcc-12 problem on Darwin AArch64 targets. It should not emit sections like .section .data.rel.ro.local; section names on macOS must start with __, e.g.: .section __DATA,...
See Mach-O reference.

Cannot use fsanitize=address in minGW compiler

My C++ setup info:
OS - Windows 10,
IDE - VS Code,
compiler - MinGW
ISSUE:
I recently found sanitizers for C++ that catch some runtime errors, like out-of-bounds array access, with -fsanitize=address and -fsanitize=undefined in my VS Code environment.
This is a simple program with an out-of-bounds array access.
The terminal used in VS Code is CMD. I tried to compile the program with this line:
g++ -fsanitize=address check.cpp -o main.exe
On hitting Enter I got this error (cannot find -lasan).
Here is the C++ code:
// Program to demonstrate
// accessing an array out of bounds
#include <iostream>
using namespace std;

int main()
{
    int arr[] = {1, 2, 3, 4, 5};
    cout << arr[0] << endl;
    // arr[10] is out of bounds
    cout << arr[10] << endl;
    return 0;
}
Cause(s)/Solution(s) for this issue
How do I fix cannot find -lasan?
Does MinGW not support these sanitizers, and if so, should I use Cygwin?
Can I install Clang on a Windows machine (if possible) to use this whole bunch of sanitizers?
What are the other options besides using Visual Studio instead of VS Code?
Any other suggestions?
NOTE: kindly suggest suitable tags for this post (if I have used a wrong one or missed a crucial one).

gfortran requires format widths while ifort doesn't?

I am trying to migrate a .FOR file (for practice purposes) from ifort to gfortran. The file compiles in my Intel Visual Fortran solution with no issues. However, when I compile it in gfortran using the following command:
gfortran -ffree-form -ffree-line-length-200 -Dinternal_debug -c MyFile.FOR -o MyFile.o
I get the following error message:
MyFile.FOR:4561:22:
  102 format(A, I)
                1
Error: Nonnegative width required in format string at (1)
Does ifort simply not require a format width, or are there additional ifort options that relax this requirement? How come the file compiles smoothly under ifort but not under gfortran?
Your observation is correct; I have encountered this myself before. Intel Fortran does not enforce this requirement, while gfortran does. The field width is actually required by the Fortran standard, and I am not aware of any compiler option that could change this behaviour. The only option I know of is to fix the code to make it standard conforming.
How to do that is shown in Error: Nonnegative width required in format string at (1). Note that the g0 you asked about is not a compiler option to accept I; it is a different format descriptor to put into the code instead of I (for example, 102 format(A, I0) prints the integer using the minimum required width).

GCC 4.2.2 unsigned short error in casting

This line of code isn't compiling for me on GCC 4.2.2:
m_Pout->m_R[i][j] = MIN(MAX(unsigned short(m_Pin->m_R[i][j]), 0), ((1 << 15) - 1));
error: expected primary-expression before ‘unsigned’
However, if I add parentheses, as in (unsigned short), it works fine.
Can you please explain what type of cast (allocation) is being done here?
Why isn't the lexical parser/compiler able to understand this C++ code in GCC?
Can you suggest a "better" way to write this code, supporting GCC 4.2.2 (no C++11, and cross-platform)?
unsigned short(m_Pin->m_R[i][j]) is a declaration with initialisation of an anonymous temporary, and that cannot be part of an expression.
(unsigned short)(m_Pin->m_R[i][j]) is a cast, and is an expression.
So the first form cannot be used as an argument for MAX, but the second can be.
I think Bathsheba's answer is at least misleading: short(m_Pin->m_R[i][j]) is a cast, so why does the extra unsigned mess things up? It's because unsigned short is not a simple-type-specifier: the functional cast syntax T(E) works only if T is a single token, and unsigned short is two tokens.
Other types which are spelled with more than one token are char* and int const, and therefore these are also not valid casts: char*(0) and int const(0).
With static_cast<>, the < > are balanced so the type can be named with a sequence of identifiers, even static_cast<int const*const>(0)
You could use the second form from Bathsheba's answer, but it is more idiomatic to use static_cast in C++:
static_cast<unsigned short>(m_Pin->m_R[i][j])
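Applied to the line from the question, that gives:
m_Pout->m_R[i][j] = MIN(MAX(static_cast<unsigned short>(m_Pin->m_R[i][j]), 0), ((1 << 15) - 1));
And here is a minimal self-contained sketch of why the two-token form fails while the other two compile:

// Functional-style casts need a single-token type name.
int main()
{
    int x = 70000;
    // unsigned short a = unsigned short(x);           // error: 'unsigned short' is two tokens
    unsigned short b = (unsigned short)x;              // C-style cast: compiles
    unsigned short c = static_cast<unsigned short>(x); // idiomatic C++: compiles
    return b == c ? 0 : 1;
}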
BTW, your error is not related to GCC. You'll get the same from Clang/LLVM or any standard-conforming (C++98 or C++11) C++ compiler.
But independently of that, you should use a much newer version of GCC. As of July 2015 the current version is GCC 5.1, and your GCC 4.2.2 is from 2007, which is ancient.
Using a more recent version of GCC is worthwhile because:
it enables you to use a more recent version of C++, e.g. C++11 (compile with -std=c++11 or -std=gnu++11)
recent GCC have improved their diagnostics. Compiling with -Wall -Wextra will help a lot.
recent GCC are optimizing better, and you'll get more performance from your code
recent GCC have a better and more standard conforming standard C++ library
recent GCC are better for debugging (with a recent GDB), and have sanitizer options (-fsanitize=address, -fsanitize=undefined, other -fsanitize=.... options) which help finding bugs
recent GCC are more standard conforming
recent GCC are customizable through plugins, including MELT
GCC 4.2 is no longer supported by the FSF, and you'd need to pay big bucks to the few companies still supporting it.
You don't need root access to compile a GCC 5 compiler (or cross-compiler) from its source code. Read the installation procedures. You'll build a GCC tailored to your particular libc (you could even use musl-libc if you wanted to), perhaps by compiling outside of the source tree after having configured it with a command like
...your-path-to/gcc-5/configure --prefix=$HOME/soft/ --program-suffix=-mine
then run make, then make install, then add $HOME/soft/bin/ to your PATH and use gcc-mine and g++-mine.
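A sketch of that out-of-tree build sequence, with placeholder paths (adjust versions and directories to your setup):

mkdir build && cd build                  # build outside the source tree
...your-path-to/gcc-5/configure --prefix=$HOME/soft/ --program-suffix=-mine
make                                     # consider make -j4 to parallelize
make install
export PATH=$HOME/soft/bin:$PATH         # now gcc-mine and g++-mine are available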

Non-ASCII wchar_t literals under LLVM

I've migrated an Xcode iOS project from Xcode 3.2.6 to 4.2. Now I'm getting warnings when I try to initialize a wchar_t with a literal containing a non-ASCII character:
wchar_t c1;
if(c1 <= L'я') //That's Cyrillic "ya"
The messages are:
MyFile.cpp:148:28: warning: character unicode escape sequence too long for its type [2]
MyFile.cpp:148:28: warning: extraneous characters in wide character constant ignored [2]
And the literal does not work as expected - the comparison misfires.
I'm compiling with -fshort-wchar, the source file is in UTF-8. The Xcode editor displays the file fine. It compiled and worked on GCC (several flavors, including Xcode 3), worked on MSVC. Is there a way to make LLVM compiler recognize those literals? If not, can I go back to GCC in Xcode 4?
EDIT: Xcode 4.2 on Snow Leopard - long story why.
EDIT2: confirmed on a brand new project. The file extension does not matter - same behavior in .m files. -fshort-wchar does not affect it either. Looks like I've gotta go back to GCC until I can upgrade to a version of Xcode where this is fixed.
Not an answer, but hopefully helpful information — I could not reproduce the problem with clang 4.0 (Xcode 4.5.1):
$ uname -a
Darwin air 12.2.0 Darwin Kernel Version 12.2.0: Sat Aug 25 00:48:52 PDT 2012; root:xnu-2050.18.24~1/RELEASE_X86_64 x86_64
$ env | grep LANG
LANG=en_US.UTF-8
$ clang -v
Apple clang version 4.0 (tags/Apple/clang-421.0.60) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.2.0
Thread model: posix
$ cat test.c
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    wchar_t c1 = 0;
    printf("sizeof(c1) == %lu\n", sizeof(c1));
    printf("sizeof(L'Я') == %lu\n", sizeof(L'Я'));
    if (c1 < L'Я') {
        printf("Я люблю часы Заря!\n");
    } else {
        printf("Что за....?\n");
    }
    return EXIT_SUCCESS;
}
$ clang -Wall -pedantic ./test.c
$ ./a.out
sizeof(c1) == 4
sizeof(L'Я') == 4
Я люблю часы Заря!
$ clang -Wall -pedantic ./test.c -fshort-wchar
$ ./a.out
sizeof(c1) == 2
sizeof(L'Я') == 2
Я люблю часы Заря!
$
The same behavior is observed with clang++ (where wchar_t is a built-in type).
If the source is in fact UTF-8, then this isn't correct behavior. However, I can't reproduce the problem in the most recent version of Xcode.
MyFile.cpp:148:28: warning: character unicode escape sequence too long for its type [2]
This warning should be referring to a 'universal character name' (UCN), which looks like "\U001012AB" or "\u0403". It indicates that the value represented by the escape sequence is larger than the enclosing literal type is capable of holding. For example, if the codepoint value requires more than 16 bits, then a 16-bit wchar_t will not be able to hold the value.
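To illustrate, here is a minimal sketch (compiled as C++ with -fshort-wchar) where the codepoint genuinely does not fit in a 16-bit wchar_t, so this first warning would be legitimate:

int main()
{
    wchar_t c = L'\U0001F600'; // U+1F600 needs more than 16 bits
    return (int)c;
}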
MyFile.cpp:148:28: warning: extraneous characters in wide character constant ignored [2]
This indicates that the compiler thinks there's more than one codepoint represented inside a wide character literal, e.g. L'ab'. The behavior is implementation-defined, and both clang and gcc simply use the last codepoint value.
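For example, a sketch of that multi-codepoint case (implementation-defined; per the above, clang and gcc keep the last codepoint, so this returns 0):

int main()
{
    wchar_t c = L'ab'; // two codepoints in one wide character literal
    return c == L'b' ? 0 : 1;
}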
The code you show shouldn't trigger either of these, at least in clang: the first because it applies only to UCNs (let alone the fact that 'я' fits easily within a single 16-bit wchar_t), and the second because the source code encoding is always taken to be UTF-8, so the UTF-8 multibyte representation of 'я' is seen as a single codepoint.
You might recheck and ensure that the source actually is UTF-8. Then you should check that you're using an up-to-date version of Xcode. You can also try switching the compiler in your project settings: Compile for C/C++/Objective-C.
I don't have an answer to your specific question, but I wanted to point out that llvm-gcc has been permanently discontinued. In my experience dealing with deltas between Clang, llvm-gcc, and gcc, Clang is often correct with regard to the C++ specification, even when its behavior is surprising.