"No newline at end of file" compiler warning - c++

What is the reason for the following warning in some C++ compilers?
No newline at end of file
Why should I have an empty line at the end of a source/header file?

Think of some of the problems that can occur if there is no newline. According to the ANSI standard, #include inserts the contents of the named file exactly as they are, in place of the directive, and does not add a newline after those contents. So if you include a file with no newline at the end, the parser sees the last line of foo.h as being on the same line as the line of foo.cpp that follows the #include <foo.h>. What if the last line of foo.h was a comment without a newline? Now that line of foo.cpp is commented out. These are just a couple of examples of the types of problems that can creep up.
Just wanted to point any interested parties to James' answer below. While the above answer is still correct for C, the new C++ standard (C++11) has been changed so that this warning should no longer be issued if using C++ and a compiler conforming to C++11.
From C++11 standard via James' post:
A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file (C++11 §2.2/1).

The requirement that every source file end with a non-escaped newline was removed in C++11. The specification now reads:
A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file (C++11 §2.2/1).
A conforming compiler should no longer issue this warning (at least not when compiling in C++11 mode, if the compiler has modes for different revisions of the language specification).
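The C++11 rule quoted above can be modelled as a small string function. This is only a sketch of the wording, not how any particular compiler implements it; the name append_missing_newline is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper modelling the quoted C++11 rule: a non-empty file that
// does not end in a new-line, or that ends in backslash + new-line, is
// processed as if one more new-line were appended. Empty files stay empty.
std::string append_missing_newline(std::string src) {
    if (src.empty()) return src;
    const bool ends_in_newline = src.back() == '\n';
    const bool ends_in_backslash_newline =
        ends_in_newline && src.size() >= 2 && src[src.size() - 2] == '\\';
    if (!ends_in_newline || ends_in_backslash_newline) src.push_back('\n');
    return src;
}

// Usage sketch covering the cases named in the rule.
inline void check_append_missing_newline() {
    assert(append_missing_newline("") == "");
    assert(append_missing_newline("int x;") == "int x;\n");
    assert(append_missing_newline("int x;\n") == "int x;\n");
    assert(append_missing_newline("int x;\\\n") == "int x;\\\n\n");
}
```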

C++03 Standard [2.1.1.2] declares:
... If a source file that is not empty does not end in a new-line character, or ends in a new-line character
immediately preceded by a backslash character before any such splicing takes place, the behavior is undefined.

The answer for the "obedient" is "because the C++03 Standard says the behavior of a program not ending in newline is undefined" (paraphrased).
The answer for the curious is here: http://gcc.gnu.org/ml/gcc/2001-07/msg01120.html.

It isn't referring to a blank line, it's whether the last line (which can have content in it) is terminated with a newline.
Most text editors will put a newline at the end of the last line of a file, so if the last line doesn't have one, there is a risk that the file has been truncated. However, there are valid reasons why you might not want the newline so it is only a warning, not an error.

#include will replace its line with the literal contents of the file. If the file does not end with a newline, the line containing the #include that pulled it in will merge with the next line.
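To make the merge concrete, here is a toy model of a preprocessor that pastes header text verbatim in place of the directive. It is an assumption-laden sketch (it only understands lines of the exact form #include "name", and the names naive_expand and foo.h are illustrative); real compilers do not behave this naively.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical model: replace each  #include "name"  line with the header
// text exactly as stored, without adding a new-line of its own.
std::string naive_expand(const std::string& source,
                         const std::map<std::string, std::string>& headers) {
    const std::string prefix = "#include \"";
    std::istringstream in(source);
    std::string line, out;
    while (std::getline(in, line)) {
        if (line.compare(0, prefix.size(), prefix) == 0 &&
            line.size() > prefix.size() && line.back() == '"') {
            const std::string name =
                line.substr(prefix.size(), line.size() - prefix.size() - 1);
            const auto it = headers.find(name);
            if (it != headers.end()) {
                out += it->second;  // pasted verbatim, so lines may merge
                continue;
            }
        }
        out += line;
        out += '\n';
    }
    return out;
}

// Usage sketch: the same source with and without a final new-line in foo.h.
inline void check_naive_expand() {
    const std::string source = "#include \"foo.h\"\nint x;\n";
    assert(naive_expand(source, {{"foo.h", "// end of foo.h"}}) ==
           "// end of foo.hint x;\n");
    assert(naive_expand(source, {{"foo.h", "// end of foo.h\n"}}) ==
           "// end of foo.h\nint x;\n");
}
```

In the first case the declaration of x ends up inside the comment; in the second it survives on its own line.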

Of course in practice every compiler adds a new line after the #include. Thankfully. – #mxcl
Not specific to C/C++, but to a C dialect: when using the GL_ARB_shading_language_include extension, the GLSL compiler on OS X does NOT warn you about a missing newline. So you can write a MyHeader.h file with a header guard that ends with #endif // __MY_HEADER_H__, and you will lose the line after the #include "MyHeader.h" for sure.

I am using C-Free IDE version 5.0, and I was getting the same problem in both C and C++ programs. Go to the very end of the program, i.e. the last line (after the closing brace of the last function, whether that is main or any other function), and press Enter so that the line count increases by 1. Then build the same program again; it will run without the error.

Because the behavior differs between C/C++ versions if a file does not end with a new-line. Older C++ versions are especially nasty; e.g. in C++03 the standard says (translation phases):
If a source file that is not empty does not end in a new-line
character, or ends in a new-line character immediately preceded by a
backslash character, the behavior is undefined.
Undefined behavior is bad: a standard-conforming compiler could do more or less what it wants here (insert malicious code or whatever) - clearly a reason for a warning.
While the situation is better in C++11 it is a good idea to avoid situations where the behavior is undefined in earlier versions. The C++03 specification is worse than C99 which outright prohibits such files (behavior is then defined).

This warning might also help to indicate that a file could have been truncated somehow. It's true that the compiler will probably throw a compiler error anyway - especially if it's in the middle of a function - or perhaps a linker error, but these could be more cryptic, and aren't guaranteed to occur.
Of course this warning also isn't guaranteed if the file is truncated immediately after a newline, but it could still catch some cases that other errors might miss, and gives a stronger hint to the problem.

In my case, I use the Kotlin language with IntelliJ. I am also using a Docker container with a linter to fix possible issues with typos, imports, code usage, etc. This error is almost certainly coming from those lint checks.
In short, the error says 'Add a new line at the end of the file'. That is it.


Why can my comment consist of so much forward slash (/)?

I know that there are many types of comment, I will list out a few of them (those related):
// - Normal comment
/// - This would make the comment bold
And surprisingly, the IDE does not raise an error for this code (even though a comment is not executed, I would expect it to restrict the programmer somehow):
/////////////////// HI!
Why would the standard allow this to happen?
BTW, my IDE is Code::Blocks 20.03 if it matters.
As per the C++ standard, [lex.comment]:
The characters // start a comment, which terminates immediately before the next new-line character.
From the above, you can infer that every character (other than newline) which follows the first two / characters is part of the comment.
If that wasn't already clear enough, it goes on to note:
The comment characters //, /*, and */ have no special meaning within a // comment and are treated just like other characters.
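A short snippet illustrating both quoted rules; the function name answer is just a placeholder for this sketch.

```cpp
#include <cassert>

/////////////////// HI!
// /* this does not open a block comment */ and // this is still one comment
int answer() {
    return 42;  //// everything after the first two slashes is comment text
}

// Usage sketch: the comments above have no effect on the function.
inline void check_answer() { assert(answer() == 42); }
```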

C++: Is there a standard definition for end-of-line in a multi-line string constant?

If I have a multi-line C++11 raw string constant such as
R"(line 1
line 2
line3)"
Is it defined what character(s) the line terminator/separator consist of?
The intent is that a newline in a raw string literal maps to a single
'\n' character. This intent is not expressed as clearly as it
should be, which has led to some confusion.
Citations are to the 2011 ISO C++ standard.
First, here's the evidence that it maps to a single '\n' character.
A note in section 2.14.5 [lex.string] paragraph 4 says:
[ Note: A source-file new-line in a raw string literal results in a
new-line in the resulting execution string-literal. Assuming no
whitespace at the beginning of lines in the following example, the
assert will succeed:
const char *p = R"(a\
b
c)";
assert(std::strcmp(p, "a\\\nb\nc") == 0);
— end note ]
This clearly states that a newline is mapped to a single '\n'
character. It also matches the observed behavior of g++ 6.2.0 and
clang++ 3.8.1 (tests done on a Linux system using source files with
Unix-style and Windows-style line endings).
Given the clearly stated intent in the note and the behavior of two
popular compilers, I'd say it's safe to rely on this -- though it
would be interesting to see how other compilers actually handle this.
However, a literal reading of the normative wording of the
standard could easily lead to a different conclusion, or at least
to some uncertainty.
Section 2.5 [lex.pptoken] paragraph 3 says (emphasis added):
Between the initial and final double quote characters of the
raw string, any transformations performed in phases 1 and 2
(trigraphs, universal-character-names, and line splicing)
are reverted; this reversion shall apply before any d-char,
r-char, or delimiting parenthesis is identified.
The phases of translation are specified in 2.2 [lex.phases]. In phase 1:
Physical source file characters are mapped, in an
implementation-defined manner, to the basic source character set
(introducing new-line characters for end-of-line indicators) if
necessary.
If we assume that the mapping of physical source file characters to the
basic character set and the introduction of new-line characters are
"transformations", we might reasonably conclude that, for example,
a newline in the middle of a raw string literal in a Windows-format
source file should be equivalent to a \r\n sequence. (I can imagine
that being useful for Windows-specific code.)
(This interpretation does lead to problems with systems where the
end-of-line indicator is not a sequence of characters, for example
where each line is a fixed-width record. Such systems are rare
these days.)
As "Cheers and hth. - Alf"'s answer points out, there is an open
Defect Report for this issue. It was submitted in 2013 and has not
yet been resolved.
Personally, I think the root of the confusion is the word "any"
(emphasis added as before):
Between the initial and final double quote characters of the raw
string, any transformations performed in phases 1 and 2 (trigraphs,
universal-character-names, and line splicing) are reverted; this
reversion shall apply before any d-char, r-char, or delimiting
parenthesis is identified.
Surely the mapping of physical source file characters to
the basic source character set can reasonably be thought of
as a transformation. The parenthesized clause "(trigraphs,
universal-character-names, and line splicing)" seems to be intended
to specify which transformations are to be reverted, but that
either attempts to change the meaning of the word "transformations"
(which the standard does not formally define) or contradicts the use
of the word "any".
I suggest that changing the word "any" to "certain" would express
the apparent intent much more clearly:
Between the initial and final double quote characters of the raw
string, certain transformations performed in phases 1 and 2 (trigraphs,
universal-character-names, and line splicing) are reverted; this
reversion shall apply before any d-char, r-char, or delimiting
parenthesis is identified.
This wording would make it much clearer that "trigraphs,
universal-character-names, and line splicing" are the only
transformations that are to be reverted. (Not everything done
in translation phases 1 and 2 is reverted, just those specific
listed transformations.)
The standard seems to indicate that:
R"(line 1
line 2
line3)"
is equivalent to:
"line 1\nline 2\nline3"
From 2.14.5 String literals of the C++11 standard:
4 [ Note: A source-file new-line in a raw string literal results in a new-line in the resulting execution string literal. Assuming no whitespace at the beginning of lines in the following example, the assert will succeed:
const char *p = R"(a\
b
c)";
assert(std::strcmp(p, "a\\\nb\nc") == 0);
—end note ]
5 [ Example: The raw string
R"a(
)\
a"
)a"
is equivalent to "\n)\\\na\"\n".
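The claimed equivalence can be checked with a small sketch like the following. It uses the R"( ... )" delimiter form, and the function names are made up for illustration. Whether the check holds for every compiler and every source line-ending convention is exactly what the defect report discussed in this thread is about.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper returning a raw string literal that spans three lines.
const char* raw_lines() {
    return R"(line 1
line 2
line3)";
}

// The same text spelled with explicit escapes.
const char* escaped_lines() {
    return "line 1\nline 2\nline3";
}

// Usage sketch: each source-file new-line became a single '\n'.
inline void check_raw_lines() {
    assert(std::strcmp(raw_lines(), escaped_lines()) == 0);
}
```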
Note: the question has changed substantially since the answers were posted. Only half of it remains, namely the pure C++ aspect. The network focus in this answer addresses the original question's “sending a multi-line string to a server with well-defined end-of-line requirements”. I do not chase question evolution in general.
Internally in the program, the C++ standard for newline is \n. This is used also for newline in a raw literal. There is no special convention for raw literals.
Usually \n maps to ASCII linefeed, which is the value 10.
I'm not sure what it maps to in EBCDIC, but you can check that if needed.
On the wire, however, it's my impression that most protocols use ASCII carriage return plus linefeed, i.e. 13 followed by 10. This is sometimes called CRLF, after the ASCII abbreviations CR for carriage return and LF for linefeed. When the C++ escapes are mapped to ASCII this is simply \r\n in C++.
You need to abide by the requirements of the protocol you're using.
For ordinary file/stream i/o the C++ standard library takes care of mapping the internal \n to whatever convention the host environment uses. This is called text mode, as opposed to binary mode where no mapping is performed.
For network i/o, which is not covered by the standard library, the application code must do this itself, either directly or via some library functions.
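A minimal sketch of doing that mapping by hand, assuming the protocol wants CRLF; the helper name to_crlf is hypothetical and no real networking API is involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: rewrite each bare '\n' as "\r\n" for a protocol that
// requires CRLF. An existing "\r\n" pair is left unchanged.
std::string to_crlf(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c == '\n' && (out.empty() || out.back() != '\r')) out += '\r';
        out += c;
    }
    return out;
}

// Usage sketch: convert just before handing the bytes to the socket layer.
inline void check_to_crlf() {
    assert(to_crlf("line 1\nline 2") == "line 1\r\nline 2");
    assert(to_crlf("a\r\nb\n") == "a\r\nb\r\n");
}
```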
There is an active issue about this, core language defect report #1655 “Line endings in raw string literals”, submitted by Mike Miller 2013-04-26, where he asks,
“is it intended that, for example, a CRLF in the source of a raw string literal is to be represented as a newline character or as the original characters?”
Since line ending values differ depending on the encoding of the original file, and considering that in some file systems there is not an encoding of line endings, but instead lines as records, it's clear that the intention is not to represent the file contents as-is – since that's impossible to do in all cases. But as far as I can see this DR is not yet resolved.

Single line comment continuation

From the C++ standard (going back to at least C++98) § 2.2, note 2 states:
Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. Except for splices reverted in a raw string literal, if a splice results in a character sequence that matches the syntax of a universal-character-name, the behavior is undefined. A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file.
And, section § 2.7 states:
The characters /* start a comment, which terminates with the characters */. These comments do not nest. The characters // start a comment, which terminates with the next new-line character. If there is a form-feed or a vertical-tab character in such a comment, only white-space characters shall appear between it and the new-line that terminates the comment; no diagnostic is required. [Note: The comment characters //, /*, and */ have no special meaning within a // comment and are treated just like other characters. Similarly, the comment characters // and /* have no special meaning within a /* comment. ]
I would take these two together to mean that the following:
// My comment \
is valid
// My comment \ still valid \
is valid
are legal in C++98. In GCC 4.9.2, these both compile without any diagnostic messages. In MSVC 2013, these both produce the following:
warning C4010: single-line comment contains line-continuation character
If you have warnings-as-errors enabled (which I do), this causes the program to fail to compile (without warnings-as-errors, it works just fine). Is there something in the standard that disallows single-line comment continuations, or is this a case of MSVC non-compliance with the standard?
It's not a question of compliance. You've specifically asked the compiler to treat a valid construct as an error, so that's what it does.
GCC will give the same warning (or error, if requested) if you specify -Wcomment or -Wall.
I'd say it's MS being sensitive to the fact that if you do something like:
#define macro() \
some stuff \
// Intended as comment \
more stuff
then you get VERY interesting errors when you use macro() in the code.
Or simply accidentally typing a comment like this:
// The files for foo-project are in c:\projects\foo\
int blah;
(Strange errors such as "undefined variable blah" occur.)
I would NEVER use line continuation in a single-line comment, but if you have some good reason to, just turn THAT warning off in MSVC.
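To see why the Windows-path comment swallows the declaration, here is a toy model of the phase-2 splice applied to source text. It is a sketch only: splice_lines is a made-up name, and it ignores raw string literals and the "last backslash on the line" wording.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of line splicing: delete every backslash that is
// immediately followed by a new-line, together with that new-line.
std::string splice_lines(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // skip the new-line as well
            continue;
        }
        out += src[i];
    }
    return out;
}

// Usage sketch: after splicing, the declaration sits on the comment line.
inline void check_splice_lines() {
    assert(splice_lines("// in c:\\projects\\foo\\\nint blah;\n") ==
           "// in c:\\projects\\fooint blah;\n");
}
```

Since comments are only recognised after splicing, int blah; becomes part of the // comment and is never declared.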
Also as Mike says: Warnings are not even covered by the standard - it only says what needs to be an error. If you enable "warnings are errors", you will have to either be selective about which warnings you enable, or accept that some constructs that are technically valid (but dubious) will be unacceptable in the build, because the compiler maker has decided to warn about them. Try writing if (c = getchar()) in gcc or clang and see how far you get with -Werror and warnings on "high". Yet it is perfectly valid according to the standard.

Preprocessing multiline comments and their embedded newlines at the end of file

This is a question about the C99/C11 (and maybe C++ too) preprocessor and its standard compliance.
Let's consider two source files:
/* I'm
* multiline
* comment
*/
and
/* I'm
* multiline
* comment
*/
i_am_a_token;
If we preprocess both files with gcc or clang (several versions were tested), there is a difference. In the first case the preprocessor does not keep the newlines from the multiline comment. In the second case all newlines are kept.
All the mentioned standards say (somewhere inside "Translation phases"):
Each comment is replaced by one space character. New-line characters are retained.
Why is there a difference in handling multiline comments at the end of a file? And is this behaviour standard-compliant?
The reason is simple: line numbers and error reporting. Since the compiler reports errors with line numbers, it is convenient for line numbers in the pre-processed file to correspond to line numbers in the original file. That's the reason the lines occupied by a comment are preserved when they are followed by code, whereas they don't have to be preserved at the end of the file.
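One way to keep line numbers aligned can be sketched as follows: replace each block comment with a single space, then re-emit as many new-lines as the comment contained. This is an illustrative assumption, not a description of what gcc or clang do internally, and the helper name is hypothetical; it also ignores string literals and // comments.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: each /* ... */ comment becomes one space followed by
// the new-lines it spanned, so later tokens keep their original line number.
std::string replace_block_comments(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        if (src.compare(i, 2, "/*") == 0) {
            const std::size_t end = src.find("*/", i + 2);
            const std::size_t stop =
                (end == std::string::npos) ? src.size() : end + 2;
            out += ' ';
            for (std::size_t k = i; k < stop; ++k)
                if (src[k] == '\n') out += '\n';
            i = stop;
        } else {
            out += src[i++];
        }
    }
    return out;
}

// Usage sketch: the token stays on line 4, as in the original file.
inline void check_replace_block_comments() {
    assert(replace_block_comments("/* I'm\n * multiline\n */\ni_am_a_token;\n") ==
           " \n\n\ni_am_a_token;\n");
}
```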
As for the standards. The standards
C99: ISO/IEC 9899:1999
C11: ISO/IEC 9899:2011
specify the language, preprocessing macros etc., but they don't specify how the language should be processed. You can see it in the scope definition of C11:
ISO/IEC 9899:2011 does not specify
the mechanism by which C programs are transformed for use by a data-processing system;
which means that preprocessor output is pretty much an internal issue, outside the scope of the standard.

What's the utility of an empty C++ file?

The second part of translation phase 2 (section 2.2.2 in N3485) basically says that if a source file does not end in a newline character, the compiler should treat it as if it did.
However, if I'm reading it correctly it makes an explicit exception for empty source files, which remain empty.
The exact text (with added emphasis) is:
Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. Only the last backslash on any physical source line shall be eligible for being part of such a splice. If, as a result, a character sequence that matches the syntax of a universal-character-name is produced, the behavior is undefined. A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file.
I haven't been able to figure out any situations in which it would make a difference whether a source file was empty or consisted of only a newline character.
I'm hoping someone can shed some light on the reasoning behind this requirement.
This is to specifically support the 1994 winning entry in the international obfuscated C code contest in the category "worst abuse of rules": The world's smallest self-replicating program. Guaranteed.
I think the idea is that a source file normally consists of zero or more lines, and each line consists of a sequence of non-new-line characters followed by a new-line. Any source file not meeting that requirement needs special handling (so you don't get lines composed of text from two different source files).
An empty C++ source file is not particularly useful, but there's no point in forbidding it. The quoted clause isn't about distinguishing between an empty file and a file consisting of just one new-line (there should be no real difference between them).
I guess this means that every line ends with \n, while an empty file has no lines.
The preprocessor can be used to construct things besides program source, and a blank line can be significant -- it's often used to separate paragraphs in text, for instance.
"A source file that is not empty and that does not end in a new-line character, or that ends in a new-line character immediately preceded by a backslash character before any such splicing takes place, shall be processed as if an additional new-line character were appended to the file."
The second part of translation phase 2 (section 2.2.2 in N3485) basically says that if a source file does not end in a newline character, the compiler should treat it as if it did.
No - it says that if the file "is not empty" AND does not end in a newline, then a newline is added
However, if I'm reading it correctly it makes an explicit exception for empty source files, which remain empty.
Agreed.
I haven't been able to figure out any situations in which it would make a difference whether a source file was empty or consisted of only a newline character. I'm hoping someone can shed some light on the reasoning behind this requirement.
Consider a header file called "header.h" with last line as below with no trailing newline:
#endif // #ifndef INCLUDED_HEADER_H
Say another.cc includes it as follows:
#include "header.h"
#include "another.h"
When another.cc is parsed, the text from header.h is substituted for the line specifying its inclusion. Done naively, that would result in:
#endif // #ifndef INCLUDED_HEADER_H#include "another.h"
Obviously, the compiler would then fail to act on #include "another.h", considering it part of the comment begun in header.h.
So, the rule for incomplete lines avoids these problems (which could be terribly hard to spot).
If the file was empty anyway, this problem doesn't manifest: there's nothing like the #endif to be prepended to the next line in the including file....