Where to use empty character constant '' in C++? [closed] - c++

The empty character constant '' cannot be printed with cout or assigned to a char in C++. The compiler says "error: expected expression". Can we put it in C++ source code? If not, what's the usage of ''? (The empty character constant '' is one ' followed by another '.)

Can we put it in C++ source code?
No, it would be a syntax error.
If not, what's the usage of ''?
There is no usage, unless your purpose is to cause a compilation error (for which there are probably better alternatives such as static_assert).
Can it be understood that empty character constant '' is just a pure grammar error just like a variable being named as 2018ch ?
Yes. The grammar says:
character-literal:
encoding-prefix opt ' c-char-sequence '
Notice that unlike the encoding-prefix, c-char-sequence is not optional.
Side note: Yes, it is a character sequence - multi character literals exist. But you don't need to learn about them yet other than knowing that you probably won't need them. Just don't assume that they're strings.

Ok I think that the confusion comes from the fact that a string can be an empty string e.g. "", so maybe you draw a parallel and expect there to be an empty character something like ''.
Well remember what a string is: a series of characters (0, 1, or more) (terminated by the end of string character '\0'). So "" is a string of 0 characters (end of string character not counted, although it is there), aka the "empty string".
A character is well... just that one character. Not zero, not 2 or 3. A character always has a value. Thus the empty character '' does not exist and makes no sense.

'' makes no sense and thus it won't compile. What value is it supposed to have?
Remember, it's all just bits and bytes in memory at some point so what value should the bytes have that represent ''?
char a = 0;
//or
char a = '\0';
These represent "empty" chars which is the closest you'll get to ''.
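The points made in the answers above can be checked by the compiler. The following is only a minimal sketch using standard C++; the helper name empty_string_length is made up for illustration and is not from the original answers.

```cpp
#include <cstddef>
#include <cstring>

// '' does not compile, but the null character and the empty string both do.
static_assert('\0' == 0, "the null character has the value 0");
static_assert(sizeof("") == 1, "the empty string literal still stores its terminator");

// Hypothetical helper: the length of the empty string as strlen sees it.
inline std::size_t empty_string_length() {
    return std::strlen("");  // the terminator is not counted
}
```

So "" holds zero characters plus the terminator, while a char always holds exactly one value, which may be 0.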


Regex expression doesn't recognize dot at end of word - Regex (C++) [closed]

I'm trying to read a line out of a file using the following regex expression:
^([A-z.]+?\\s?[A-z]+)\\s([A-z]+)\\s(\\d{7})\\s(\\d?\\d.\\d)$
on the line:
W.W. Sneijder 0000574 10.0
(To be clear: the intent is to make any word with chars [a-z], [A-Z], or dots, match with the [A-z.]+ part.)
However, the regular expression doesn't recognize the second dot in W.W., which seems strange to me. Don't the square brackets combined with the + mean that any character from inside them is accepted, until (here) whitespace is encountered? I found a regex that does work but isn't that elegant:
^([A-z.]+[.\\s?[A-z]+)\\s([A-z]+)\\s(\\d{7})\\s(\\d?\\d.\\d)$
I'm hoping to find an elegant solution. It'd be great to hear your input.
Links such as RegEx - Not parsing dot(.) at the end of a sentence didn't seem to answer my question unfortunately.
Space separated data is just a different variant of the common CSV (Comma Separated Values) format. There are many ways to separate a string on arbitrary separators, but in C++ using space is actually very easy:
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::string> separate_on_space(std::string const& input)
{
    std::vector<std::string> output;
    std::istringstream iss(input);
    // Copy all space-separated "words" from the input to the vector
    std::copy(std::istream_iterator<std::string>(iss), // Begin iterator
              std::istream_iterator<std::string>(),    // End iterator
              std::back_inserter(output));             // Destination iterator
    return output;
}
Once you have separated the values into a vector of strings, you can then convert the numeric values to their actual type (for example using std::stod) and store into suitable objects.
Of course this doesn't handle names with spaces in them gracefully, but that can be handled at a higher level (by checking the size of the resulting vector, and by knowing that the last two elements should always be the numeric fields and the rest are the name).
On the other hand the regular expression in the question doesn't handle it at all. :)
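The "higher level" handling described in this answer might look roughly like the sketch below. The Record type, its field names, and the function parse_record are all hypothetical; the question does not say what the columns mean, so the 7-digit field is kept as text (to preserve leading zeros) and only the last field is converted with std::stod.

```cpp
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the field names are guesses.
struct Record {
    std::string name;  // all leading words, joined by single spaces
    std::string id;    // second-to-last field, kept as text
    double value;      // last field, converted with std::stod
};

// Returns false if the line does not have a name plus two trailing fields.
bool parse_record(const std::string& line, Record& out) {
    std::istringstream iss(line);
    std::vector<std::string> words;
    for (std::string w; iss >> w;) words.push_back(w);

    if (words.size() < 3) return false;  // need at least one name word

    Record rec;
    rec.id = words[words.size() - 2];
    try {
        rec.value = std::stod(words.back());
    } catch (const std::exception&) {
        return false;  // last field was not a number
    }
    // Everything before the last two fields belongs to the name.
    for (std::size_t i = 0; i + 2 < words.size(); ++i) {
        if (i != 0) rec.name += ' ';
        rec.name += words[i];
    }
    out = rec;
    return true;
}
```

With this approach a name may contain any number of words, because only the position of the last two fields is fixed.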
In your regex, the entire W.W. Sneijder is captured in the first group. Looking at your regex, I doubt you intended it that way.
I think the regex you wanted is ^([A-z.]+?\s?[A-z]+)\s(\d{7})\s(\d?\d.\d)$. Or if you wanted Sneijder to be in the second capture: ^([A-z.]+?)\s([A-z]+)\s(\d{7})\s(\d?\d.\d)$.
... or maybe you wanted ^([A-z.]+?\s?[A-z]*)\s([A-z]+)\s(\d{7})\s(\d?\d.\d)$ (* instead of + in the first capture group).
or ^([A-z.]+?(?:\s[A-z]+)?)\s([A-z]+)\s(\d{7})\s(\d?\d.\d)$ (optional space + text, again in the first capture groups).
All 4 expressions should match your test string, but behave differently on other test strings.
There certainly are improvements to be made to the regex, such as ensuring the string does not start with a dot (.).
As long as you only change the inside of each capture group and not the logic across capture groups, you can make the regex as strict as you like without affecting the code that follows the text parsing.
There will always be 4 capture groups (except for the first regex I posted above, which has only 3), with some guarantees on the text if you need to convert it to another type.
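As a rough illustration, the second expression from this answer (the one that puts Sneijder in its own capture) can be used with std::regex as sketched below. The Fields struct and the function match_line are hypothetical names. The pattern is copied as posted, so it still uses [A-z], which in ASCII also admits the punctuation characters between Z and a, and an unescaped dot in the last group.

```cpp
#include <regex>
#include <string>

// Hypothetical holder for the four captures.
struct Fields {
    std::string initials, surname, id, score;
};

// Returns true and fills 'out' when the whole line matches.
bool match_line(const std::string& line, Fields& out) {
    // A raw string literal avoids doubling every backslash.
    static const std::regex re(R"(^([A-z.]+?)\s([A-z]+)\s(\d{7})\s(\d?\d.\d)$)");
    std::smatch m;
    if (!std::regex_match(line, m, re)) return false;
    out = Fields{m[1].str(), m[2].str(), m[3].str(), m[4].str()};
    return true;
}
```

Tightening the character classes (for example [A-Za-z.] and \. for the decimal point) changes only the inside of the groups, so code that reads the captures is unaffected.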

understand C++ - "character literal" vs "string literal" [duplicate]

I was reading a textbook that was talking about "character literal" vs "string literal." It said the following:
'A' is stored as 65
"A" is stored as 65 0
char letter;
letter = 'A'; // this will work
letter = "A"; // this will not work!
The textbook's explanation confused me. It said "because char variables are only large enough to hold one character, you cannot assign string literals to them." Can anyone explain further? It's not clicking in my head. Thank you for your time.
You should see this:
Single quotes vs. double quotes in C or C++
As everyone has said here, think about arrays.
A character is only one letter, digit, or symbol, and it is written with single quotes. When you use double quotes, however, you are writing a string, that is, an array of characters. Thus, you should declare your variable as an array or as a pointer. For instance:
char letter[] = "A";
Or
const char *letter = "A";
If you want a fixed-size array, you could try something like this (note that it has no terminating '\0', so it is not a C string):
char letter[5] = {'H','E','L','L','O'};
If you want to see another point of view, you could read this:
http://www.cplusplus.com/doc/tutorial/ntcs/
Hope I was helpful.
What you might be missing is the fact that strings can be of arbitrary length. The compiler places the string somewhere in the program's memory the way you typed it, but the program needs to know where the string ends! Strings of this kind are known as zero-terminated or null-terminated. This simply means that the string is the actual string data followed by a single byte with the value 0.
So in the example, 'A' is the character A. In memory, it may immediately be followed by some garbage / unrelated data, but it's fine, because the compiler knows to only ever use that one byte.
"A" is the string A. In memory, it must be followed by a null terminator, otherwise the program could get confused because there might be garbage data immediately after the string.
Think of a string as an array of characters, where each element is a single character such as 'A'.
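The size difference described above can be verified at compile time. This is a minimal sketch; literal_size is a hypothetical helper written for this illustration, not a standard function.

```cpp
#include <cstddef>

// In C++ a character literal has type char, so it occupies one byte.
static_assert(sizeof('A') == 1, "'A' is a single char");
// "A" is an array of two chars: 'A' followed by the terminating '\0'.
static_assert(sizeof("A") == 2, "string literal includes its terminator");

// Hypothetical helper: number of bytes in a string literal, terminator included.
template <std::size_t N>
constexpr std::size_t literal_size(const char (&)[N]) {
    return N;
}
```

This is why a char variable cannot hold "A": even the shortest non-empty string literal needs two bytes.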

Perl: Regular expressions Pattern matching [closed]

Is [A] a regular expression that will match a string of characters which contains any number of occurrences of the letter A (and only the letter A, with no other characters or spaces) such as AAAA?
Anything in square brackets is a character class. This is complicated enough that it has its own Perl documentation page (in the link), so it's not a surprise it wasn't evident how it works.
A character class defines a set of possible characters; when pattern matching, a character class by itself matches one character from the input, no matter how many characters there are inside the square brackets.
/[A]/ # find one copy of 'A' anywhere in the string
/[abcd]/ # find one copy of any of 'a', 'b', 'c', or 'd' anywhere in the string
/[A-Z]/ # find any one uppercase ASCII letter somewhere in the string
If you want your class to match differently, you can add quantifiers:
/[A-Z]+/ # find one or more uppercase ASCII letters in a row
/[A]*/ # find zero or more 'A's in a row
The linked page will show you a lot of other options to specify sets of characters inside the square brackets. But the key is that one set of square brackets matches one character unless you add '+' (one or more of these) or '*' (zero or more of these).
No.
The regular expression pattern [A] can be simplified to just A. It will match any string that contains A. While that includes AAAA, it also includes ZAZ.
For starters, you will need to anchor the match.
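The question is about Perl, but the effect of anchoring can be sketched in C++ with std::regex, whose default ECMAScript grammar treats these simple patterns the same way. The function names contains_A and only_As are made up for this illustration.

```cpp
#include <regex>
#include <string>

// Unanchored: true if an 'A' appears anywhere in the string.
bool contains_A(const std::string& s) {
    static const std::regex re("[A]");
    return std::regex_search(s, re);
}

// Anchored with a quantifier: true only if the string is one or more 'A's.
bool only_As(const std::string& s) {
    static const std::regex re("^A+$");
    return std::regex_search(s, re);
}
```

Here contains_A accepts both AAAA and ZAZ, while only_As accepts AAAA and rejects ZAZ and the empty string.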

The C++ equivalent of C's format string [closed]

I have a C program that reads from keyboard, like this:
scanf("%*[ \t\n]\"%[^A-Za-z]%[^\"]\"", ps1, ps2);
For a better understanding of what this instruction does, let's split the format string as follows:
%*[ \t\n]\" => read all spaces, tabs and newlines ([ \t\n]) without storing them in any variable (hence the '*'), and keep reading until a double quote (\") is encountered; the double quote itself is not stored.
Once scanf() has found the double quote, it reads all characters that are not letters into ps1. This is accomplished with...
%[^A-Za-z] => input anything that is not an uppercase letter 'A' through 'Z' or a lowercase letter 'a' through 'z'.
%[^\"]\" => read all remaining characters up to, but not including, a double quote into ps2 ([^\"]); the string must end with a double quote (\"), which again is not stored.
Can someone show me how to do the same thing in C++?
Thank you
C++ supports the scanf function. There is no simple alternative, especially if you want to replicate the exact semantics of scanf() with all the quirks.
Note however that your code has several issues:
You do not pass the maximum number of characters to read into ps1 and ps2. Any sufficiently long input sequence will cause a buffer overflow, with dire consequences.
You could replace the first format, %*[ \t\n], with just a space in the format string. This would also allow for the case where no whitespace characters are present. As currently written, scanf() fails and returns 0 if no whitespace characters are present before the ".
Similarly, if no non-letters follow the first ", or if no other characters follow before the second ", scanf() returns a short count of 0 or 1 and leaves one or both destination arrays in an indeterminate state.
For all these reasons, it would be much safer and more predictable in C to first read a line of input with fgets() and then use sscanf() or parse the line by hand.
In C++, you definitely want to use the std::regex facilities declared in <regex>.
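One possible shape of the std::regex approach is sketched below; it is an approximation, not an exact replica of the scanf() call. The function name parse_quoted is hypothetical, ps1 and ps2 simply mirror the names in the question, and the pattern makes two deliberate choices: leading whitespace is optional (as suggested above), and the first field excludes the double quote. Unlike scanf(), a regex engine backtracks, so an input with no letters at all can still match by splitting the non-letters between the two fields.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the two fields from a line such as
//   "123 abc"   (including the double quotes)
// Returns false if the line does not have the expected shape.
bool parse_quoted(const std::string& line, std::string& ps1, std::string& ps2) {
    // optional whitespace, opening quote, non-letters, rest up to closing quote
    static const std::regex re(R"re(^\s*"([^A-Za-z"]+)([^"]+)")re");
    std::smatch m;
    if (!std::regex_search(line, m, re)) return false;
    ps1 = m[1].str();
    ps2 = m[2].str();
    return true;
}
```

Because the fields are std::string objects, there is no fixed-size buffer to overflow; the line itself can be read with std::getline before calling the helper.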

RegEx Remove "-" but not " - " from a string [closed]

I want to remove "-" but not " - " from a string.
For Example: "01-Frozen - Madonna.mp3" becomes "01Frozen - Madonna.mp3"
I will then remove all digits using \d; I have seen some patterns for it.
So can anybody help?
Let's take the example you already specified. 01-Frozen - Madonna.mp3.
The pattern is this: <non space character><hyphen><non space character>
If you need to match a space, the regex would be \s, which matches any single whitespace character. A convenient aspect of regular expressions is that most character-class escapes have an opposite, usually denoted by the capital letter of the same identifier. Since, in this case, we don't want whitespace, we can use \S, which matches any single character that is not whitespace.
So the pattern now looks like: \S-\S.
If you've tried this, it won't work as expected: the match includes the non-space characters on either side of the hyphen, so replacing the match would remove them too. We want to match only the hyphen, while still requiring non-space characters around it.
Cases like these call for lookaheads and lookbehinds. These are groups that start with a question mark followed by = or ! (lookahead) or by <= or <! (lookbehind). They assert what must, or must not, appear next to the match without consuming it. You can read more about them here.
For this case, we use =, which ensures that the token inside the group (\S in our case) won't be part of the result. This is called a positive lookahead. So the final regex looks like this:
/(?=\S)-(?=\S)/
[Edited]
Paraphrasing @jerry's comments:
Well, if you want it to work properly, you'll need a lookbehind: /(?<=\S)-(?=\S)/. Though I would prefer negative ones in this case, as it would be more natural to say 'not preceded by' and 'not followed by': /(?<!\s)-(?!\s)/
Option 1:
/(?<=\S)-(?=\S)/
Option 2:
/(?<!\s)-(?!\s)/
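For readers working in C++: std::regex uses an ECMAScript-style grammar that supports lookahead but not lookbehind, so neither option above can be used verbatim there. A common workaround, sketched here under that assumption, is to capture the preceding non-space character and put it back in the replacement. The function name remove_tight_hyphens is made up for this illustration.

```cpp
#include <regex>
#include <string>

// Removes every hyphen that has a non-space character on both sides.
std::string remove_tight_hyphens(const std::string& s) {
    // (\S)    captures the character before the hyphen (stands in for lookbehind)
    // (?=\S)  requires a non-space character after it, without consuming it
    static const std::regex re(R"((\S)-(?=\S))");
    return std::regex_replace(s, re, "$1");  // keep the captured character
}
```

Because the lookahead does not consume the following character, consecutive cases such as a-b-c are handled as well.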