$ symbol in C++

I read the following code in an open source library. What confuses me is the use of the dollar sign. Can anyone please clarify the meaning of $ in this code? Your help is greatly appreciated!
__forceinline MutexActive( void ) : $lock(LOCK_IS_FREE) {}
void lock ( void );
__forceinline void unlock( void ) {
__memory_barrier(); // compiler must not schedule loads and stores around this point
$lock = LOCK_IS_FREE;
}
protected:
enum ${ LOCK_IS_FREE = 0, LOCK_IS_TAKEN = 1 };
Atomic $lock;

There is a gcc switch, -fdollars-in-identifiers, which explicitly allows $ in identifiers.
Perhaps they enable it and use the $ as something that is highly unlikely to clash with normal names.
-fdollars-in-identifiers
Accept $ in identifiers. You can also explicitly prohibit use of $ with the option -fno-dollars-in-identifiers. (GNU C allows $ by
default on most target systems, but there are a few exceptions.)
Traditional C allowed the character $ to form part of identifiers.
However, ISO C and C++ forbid $ in identifiers.
See the gcc documentation. Hopefully the link stays good.

It is being used as part of an identifier.
[C++11: 2.11/1] defines an identifier as "an arbitrarily long sequence of letters and digits." It defines "letters and digits" in a grammar given immediately above, which names only numeric digits, lower- and upper-case roman letters, and the underscore character explicitly, but does also allow "other implementation-defined characters", of which this is presumably one.
In this scenario the $ has no special meaning other than as part of an identifier — in this case, the name of a variable. There is no special significance with it being at the start of the variable name.
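A minimal sketch of the same idea, assuming a compiler that accepts $ in identifiers as an extension (the GCC documentation quoted above says GNU C does so by default on most targets). DollarDemo and its members are hypothetical names, not taken from the library in the question.

```cpp
// Depends on a compiler extension: '$' is not part of the basic source
// character set, so a strictly conforming compiler may reject this code.
// All names below are hypothetical and exist only for illustration.

enum $ { LOCK_IS_FREE = 0, LOCK_IS_TAKEN = 1 };  // an enum whose name is just "$"

struct DollarDemo {
    int $lock = LOCK_IS_FREE;  // an ordinary data member that happens to be named "$lock"

    void take() { $lock = LOCK_IS_TAKEN; }
    void release() { $lock = LOCK_IS_FREE; }
    bool is_taken() const { return $lock == LOCK_IS_TAKEN; }
};
```

Renaming $lock to lock_ everywhere would give exactly the same program; the $ only changes the spelling of the name.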

Even if dollar signs are not valid in identifiers according to the standard, a compiler can still accept them. For example, Visual Studio (I think gcc too, but I'm not sure about that) seems to accept them.
Check this doc : http://msdn.microsoft.com/en-us/library/565w213d(v=vs.80).aspx
and this : Are dollar-signs allowed in identifiers in C++03?

The C++ standard says:
The basic source character set consists of 96 characters: the space
character, the control characters representing horizontal tab,
vertical tab, form feed, and new-line, plus the following 91 graphical
characters: a b c d e f g h i j k l m n o p q r s t u v w x y z A B C
D E F G H I J K L M N O P Q R S T U V W X Y Z 0 1 2 3 4 5 6 7 8 9
_ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
There is no $ in the basic source character set described above. The $ character in your code is an extension to the basic source character set, which an implementation isn't required to support. Consider Britain, where the pound symbol (£ or ₤) is used in place of the dollar symbol ($).

Related

Does the at symbol (@) or the dollar sign ($) have any special meaning in C or C++?

Recently a friend of mine encountered this question in an interview. The interviewer asked him whether special characters like $, @, |, ^, ~ have any use in C or C++, and where.
I know that |, ^ and ~ are used as bitwise OR, XOR and complement respectively.
But I don't know if @ and $ have any special meaning. If they do, could you please give an example where they can be applied?
@ is generally invalid in C; it is not used for anything. It is used for various purposes by Objective-C, but that's a whole other kettle of fish.
$ is invalid as well, but many implementations allow it to appear in identifiers, just like a letter. (In these implementations, for instance, you could name a variable or function $$$ if you liked.) Even there, though, it doesn't have any special meaning.
To complete the accepted answer, the @ can be used to specify the absolute address of a variable on embedded systems.
unsigned char buf[128]@0x2000;
Note this is a non-standard compiler extension.
Check out a good explanation here
To complete the other answers, the C99 standard says in 5.2.1, paragraph 3:
Both the basic source and basic execution character sets shall have
the following members:
the 26 uppercase letters of the Latin alphabet
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
the 26 lowercase letters of the Latin alphabet
a b c d e f g h i j k l m
n o p q r s t u v w x y z
the 10 decimal digits
0 1 2 3 4 5 6 7 8 9
the following 29 graphic characters
! " # % & ' ( ) * + , - . / :
; < = > ? [ \ ] ^ _ { | } ~
Any other character might not even exist in the character set (and should not be used).
But there is also this point in the Common extensions: Annex J, J.5.2:
Characters other than the underscore _, letters, and digits, that are not part of the basic
source character set (such as the dollar sign $, or characters in national character sets)
may appear in an identifier (6.4.2).
Which is basically what duskwuff already wrote.

Regular Expression: search multiple string with linefeed delimited by ";"

I have a string like this that describes a structured data source:
Header whocares;
SampleTestPlan 2
a b
c d;
Test abc;
SampleTestPlan 3
e f
g h
i l;
Wafer 01;
EndOfFile;
Every field...
... is starting with "FieldName"
... is ending with ";"
... may contain linefeed
I need a regular expression that finds the values of SampleTestPlan, which occurs twice. So...
1st value is:
2
a b
c d
2nd value is
3
e f
g h
i l
I've made several attempts with search strings such as:
/SampleTestPlan(.\s)/gm
/SampleTestPlan(.\s);/gm
/SampleTestPlan(.*);/gm
but I need to understand much better how regular expressions work, as I'm definitely a newbie and have a lot to learn.
Thanks in advance to anyone that may help me!
Stefano, Milan, ITALY
You could use the following regex:
(?<=\w\b)[^;]+(?=;)
See it working live here on regex101!
How it works:
It matches everything that is:
preceded by a sequence of characters: \w+
followed by a ;
contains anything (at least one character) except a ; (including newlines).
For example, for that input:
Header whocares;
SampleTestPlan 2
a b
c d;
Test abc;
SampleTestPlan 3
e f
g h
i l;
Wafer 01;
EndOfFile;
It matches 5 times:
whocares
then:
2
a b
c d
then:
abc
then:
3
e f
g h
i l
then:
01
Assuming your input is always well formatted like the sample, try this:
/SampleTestPlan(\s+\d+.*?);/sg
Here, the /s modifier means the dot also matches newline characters.
You can try this online.
That would be /SampleTestPlan([^;]+)/g. [^abc] means any character which is not a, b or c, so [^;]+ matches everything up to the next semicolon, including newlines.
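The question does not name a language, so the following is only a sketch of the negated character class approach using C++ std::regex. The function name extract_sample_test_plans is hypothetical, and the added \s+ (which skips the whitespace after the keyword) is an assumption about the desired output.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the text between each "SampleTestPlan"
// keyword and the next ';'. Whitespace right after the keyword is skipped.
std::vector<std::string> extract_sample_test_plans(const std::string& input) {
    // [^;]+ matches any run of characters other than ';', newlines included,
    // so no "dot matches newline" flag is needed.
    static const std::regex field(R"(SampleTestPlan\s+([^;]+);)");

    std::vector<std::string> values;
    for (std::sregex_iterator it(input.begin(), input.end(), field), end; it != end; ++it) {
        values.push_back((*it)[1].str());  // capture group 1 holds the value
    }
    return values;
}
```

For the sample input in the question this yields two strings, one per SampleTestPlan field, each still containing its internal line breaks.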

Is there a whitelist or blacklist of characters for custom sml infixes?

infix 3 .. errors out. Which characters are allowed or not allowed for defining custom infixes? Where might I find a list online?
thanks
You may infix any non-qualified identifier.
The following is from the SML '90 Definition:
The following are the reserved words used in the Core. They may not (except =) be used as identifiers.
abstype and andalso as case do datatype else
end exception fn fun handle if in infix
infixr let local nonfix of op open orelse
raise rec then type val with withtype while
( ) [ ] { } , : ; ... _ | = => -> #
....
An identifier is either alphanumeric: any sequence of letters,
digits or primes (') and underbars (_) starting with a letter or
prime, or symbolic: any non-empty sequence of the following
symbols:
! % & $ # + - / : < = > ? @ \ ~ ` ^ | *
In either case, however, reserved words are excluded. This means that
for example # and | are not identifiers, but ## and |=| are
identifiers. The only exception to this rule is that the symbol =,
which is a reserved word, is also allowed as an identifier to stand
for the equality predicate.

Valid header names

I can't quite understand what they mean in the following article:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1566.htm
It is interesting to note that C89 explicitly allowed only letters in
header and include file names. C++ added underscores, and C99 added
digits. Probably both standards should allow both.
I found the following statements in all C and C++ standards:
ISO/IEC 9899:1990
6.1.7 Header names
Syntax
1 header-name:
< h-char-sequence >
" q-char-sequence "
h-char-sequence:
h-char
h-char-sequence h-char
h-char:
any member of the source character set except
the new-line character and >
q-char-sequence:
q-char
q-char-sequence q-char
q-char:
any member of the source character set except
the new-line character and "
ISO/IEC 9899:1990
5.2.1 Character sets
...
Both the basic source and basic execution character sets shall have the following
members: the 26 uppercase letters of the Latin alphabet
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z
the 26 lowercase letters of the Latin alphabet
a b c d e f g h i j k l m
n o p q r s t u v w x y z
the 10 decimal digits
0 1 2 3 4 5 6 7 8 9
the following 29 graphic characters
! " # % & ' ( ) * + , - . / :
; < = > ? [ \ ] ^ _ { | } ~
For example, I see the underscore and digits even in C89/C90.
It's referring to this:
There shall be an implementation-defined mapping between the delimited
sequence and the external source file name. The implementation shall
provide unique mappings for sequences consisting of one or more
letters (as defined in §2.2.1) followed by a period (.) and a single
letter. The implementation may ignore the distinctions of
alphabetical case and restrict the mapping to six significant
characters before the period.
(C89)
This is the C99 version:
The implementation shall provide unique mappings for sequences
consisting of one or more letters or digits (as defined in 5.2.1)
followed by a period (.) and a single letter. The first character shall
be a letter. The implementation may ignore the distinctions of
alphabetical case and restrict the mapping to eight significant
characters before the period.

Regex for numbers on scientific notation?

I'm loading a .obj file that has lines like
vn 8.67548e-017 1 -1.55211e-016
for the vertex normals. How can I detect them and convert them to double?
A regex that would work pretty well would be:
-?[\d.]+(?:e-?\d+)?
Converting to a number can be done like this: String in scientific notation C++ to double conversion, I guess.
The regex is
-? # an optional -
[\d.]+ # a series of digits or dots (see *1)
(?: # start non capturing group
e # "e"
-? # an optional -
\d+ # digits
)? # end non-capturing group, make optional
*1) This is not 100% correct: technically there can be only one dot, and before it only one (or no) digit. In practice, though, malformed input like that should not occur, so the regex is a good approximation and false positives should be very unlikely. Feel free to make the regex more specific.
You can identify the scientific values using: -?\d*\.?\d+e[+-]?\d+ regex.
I tried a number of the other solutions to no avail, so I came up with this.
^(-?\d+)\.?\d+(e-|e\+|e|\d+)\d+$
Debuggex Demo
Anything that matches is considered to be valid Scientific Notation.
Please note: This accepts e+, e- and e; if you don't want to accept e, use this: ^(-?\d+)\.?\d+(e-|e\+|\d+)\d+$
I'm not sure whether it works in C++, but in C# you can add (?i) between the ^ and the (- in the regex to toggle inline case-insensitivity. Without it, exponents written like 1.05E+10 will not be recognised.
Edit: My previous regex was a little buggy, so I've replaced it with the one above.
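For C++, a rough sketch of the case-insensitive variant: as far as I know the default ECMAScript grammar of std::regex has no inline (?i) modifier, so the std::regex::icase flag is passed to the constructor instead. The function name is_scientific is hypothetical; the pattern is the one given above, unchanged.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole string matches the pattern above,
// ignoring the case of the exponent marker (e or E).
bool is_scientific(const std::string& text) {
    static const std::regex pattern(R"(^(-?\d+)\.?\d+(e-|e\+|e|\d+)\d+$)",
                                    std::regex::ECMAScript | std::regex::icase);
    return std::regex_match(text, pattern);
}
```

With the flag set, both 8.67548e-017 and 1.05E+10 are accepted; without it, only the lower-case form is.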
The standard library function strtod handles the exponential component just fine (so does atof, but strtod allows you to differentiate between a failed parse and parsing the value zero).
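A small sketch of how strtod can tell a failed parse apart from a parsed zero, by checking the end pointer it writes back. The helper name parse_double and the decision to reject trailing characters are assumptions made for illustration.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: returns true and stores the value in `out` only if
// the entire string was consumed as a floating point number.
bool parse_double(const char* text, double& out) {
    char* end = nullptr;
    errno = 0;
    const double value = std::strtod(text, &end);

    if (end == text) return false;      // nothing was converted (atof would just give 0)
    if (*end != '\0') return false;     // trailing characters after the number
    if (errno == ERANGE) return false;  // result out of range for double

    out = value;
    return true;
}
```

Calling it with "0" succeeds and stores 0.0, while "abc" returns false; atof would return 0 in both cases.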
If you can be sure that the format of the double is scientific, you can try something like the following:
string inp("8.67548e-017");
istringstream str(inp);
double v;
str >> scientific >> v;
cout << "v: " << v << endl;
If you want to detect whether there is a floating point number of that format, then the regexes above will do the trick.
EDIT: the scientific manipulator is actually not needed, when you stream in a double, it will automatically do the handling for you (whether it's fixed or scientific)
Well this is not exactly what you asked for since it isn't Perl (gak) and it is a regular definition not a regular expression, but it's what I use to recognize an extension of C floating point literals (the extension is permitting "_" in digit strings), I'm sure you can convert it to an unreadable regexp if you want:
/* floats: Follows ISO C89, except that we allow underscores */
let decimal_string = digit (underscore? digit) *
let hexadecimal_string = hexdigit (underscore? hexdigit) *
let decimal_fractional_constant =
decimal_string '.' decimal_string?
| '.' decimal_string
let hexadecimal_fractional_constant =
("0x" |"0X")
(hexadecimal_string '.' hexadecimal_string?
| '.' hexadecimal_string)
let decimal_exponent = ('E'|'e') ('+'|'-')? decimal_string
let binary_exponent = ('P'|'p') ('+'|'-')? decimal_string
let floating_suffix = 'L' | 'l' | 'F' | 'f' | 'D' | 'd'
let floating_literal =
(
decimal_fractional_constant decimal_exponent? |
hexadecimal_fractional_constant binary_exponent?
)
floating_suffix?
C format is designed for programming languages not data, so it may support things your input does not require.
For extracting numbers in scientific notation in C++ with std::regex I normally use
((\\+|-)?[[:digit:]]+)(\\.(([[:digit:]]+)?))?((e|E)((\\+|-)?)[[:digit:]]+)?
which corresponds to
((\+|-)?\d+)(\.((\d+)?))?((e|E)((\+|-)?)\d+)?
Debuggex Demo
This will match any number of the form +12.3456e-78 where
the sign can be either + or - and is optional
the decimal point as well as the digits after it are optional
the exponent is optional and can be written with a lower- or upper-case letter
Corresponding code for parsing might look like this:
std::regex const scientific_regex {"((\\+|-)?[[:digit:]]+)(\\.(([[:digit:]]+)?))?((e|E)((\\+|-)?)[[:digit:]]+)?"};
std::string const str {"8.67548e-017 1 -1.55211e-016"};
for (auto it = std::sregex_iterator(str.begin(), str.end(), scientific_regex); it != std::sregex_iterator(); ++it) {
std::string const match {it->str()};
std::cout << match << std::endl;
}
If you want to convert the found sub-strings to a double number std::stod should handle the conversion correctly as already pointed out by Ben Voigt.
Try it here!
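Putting the two steps together, here is a sketch that finds every number in a line with the regex above and converts each match with std::stod. The function name extract_doubles is hypothetical, and error handling is left out (std::stod throws if a value is out of range).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every number found in `line` as a double.
std::vector<double> extract_doubles(const std::string& line) {
    // Same pattern as above, written as a raw string to avoid doubled backslashes.
    static const std::regex number(
        R"(((\+|-)?[[:digit:]]+)(\.(([[:digit:]]+)?))?((e|E)((\+|-)?)[[:digit:]]+)?)");

    std::vector<double> values;
    for (std::sregex_iterator it(line.begin(), line.end(), number), end; it != end; ++it) {
        values.push_back(std::stod(it->str()));  // stod understands the exponent part
    }
    return values;
}
```

Passing the line "vn 8.67548e-017 1 -1.55211e-016" from the question gives three values; the leading "vn" is skipped because it contains no digits.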