How to do string concatenation in gdb/ada - gdb

According to the manual, string concatenation isn't implemented in gdb. I need it however, so is there a way to achieve this, perhaps using array functions?

I don't have a copy of gdb around to try this on, but perhaps this line from later in the Ada section of the document will help you?
Rather than use catenation and symbolic character names to introduce special characters into strings, one may instead use a special bracket notation, which is also used to print strings. A sequence of characters of the form '["XX"]' within a string or character literal denotes the (single) character whose numeric encoding is XX in hexadecimal. The sequence of characters '["""]' also denotes a single quotation mark in strings. For example, "One line.["0a"]Next line.["0a"]" contains an ASCII newline character (Ada.Characters.Latin_1.LF) after each period.
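To make the bracket notation concrete, here is a minimal C++ sketch of what the quoted passage describes. The helper name decode_brackets is hypothetical and is not part of gdb; it only handles the two forms quoted above (two hex digits, and the triple-quote form).

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a gdb API): expands ["XX"] and ["""] inside s.
std::string decode_brackets(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size();) {
        if (s.compare(i, 5, "[\"\"\"]") == 0) {
            // ["""] stands for a single quotation mark
            out += '"';
            i += 5;
        } else if (i + 6 <= s.size() && s[i] == '[' && s[i + 1] == '"' &&
                   std::isxdigit(static_cast<unsigned char>(s[i + 2])) &&
                   std::isxdigit(static_cast<unsigned char>(s[i + 3])) &&
                   s[i + 4] == '"' && s[i + 5] == ']') {
            // ["XX"] stands for the character whose code is hex XX
            out += static_cast<char>(std::stoi(s.substr(i + 2, 2), nullptr, 16));
            i += 6;
        } else {
            out += s[i++];  // ordinary character, copied unchanged
        }
    }
    return out;
}
```

Applied to the manual's example, each ["0a"] becomes one newline character.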

For Objective-C:
[@"asd" stringByAppendingString:@"zxc"]
[@"ID: " stringByAppendingString:(NSString*) [aTaskDict valueForKey:@"ID"]]

Related

How to search a unicode character using its code point in sublime text

From what I understand, unicode characters have various representations.
e.g., code point or hex byte (these two representations are not always the same if UTF-8 encoding is used).
If I want to search for a visible unicode character (e.g., 汉) I can just copy it and search. This works even if I do not know its underlying unicode representation. But for characters that are not easily visible, such as the zero-width space, that approach does not work well. For these characters, we may want to search using the code point.
My question
If I know a character's code point, how do I search for it in Sublime Text using a regular expression? I specify Sublime Text because different editors may use different formats.
Zero width space characters can be found via:
\x{200b}
Non breaking space characters can be found via:
\xa0
For a unicode character whose code point is CODE_POINT (in hexadecimal), we can safely use a regular expression of the form \x{CODE_POINT} to search for it.
General rules
For unicode characters whose code points can fit in two hex digits, it is fine to use \x without curly braces, but for those characters whose code points are more than two hex digits, you have to use \x followed by curly braces.
Some examples
For example, to find the character A, you can use either \x{41} or \x41.
As another example, to find 我 (code point U+6211), you have to use \x{6211} instead of \x6211. If you use \x6211, you will not find the character 我, because \x without braces takes only two hex digits, so the pattern is read as \x62 (the letter b) followed by the literal text 11.
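The question notes that a code point and its UTF-8 bytes are not always the same. As a rough illustration of that difference (not anything Sublime Text itself exposes), here is a small C++ sketch. The function name encode_utf8 is hypothetical, and the sketch assumes a valid code point; it does not reject surrogates or values above U+10FFFF.

```cpp
#include <string>

// Hypothetical helper: returns the UTF-8 byte sequence for one code point.
// Assumes cp is a valid Unicode scalar value (no validation is done).
std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: same value as the code point
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For A (U+0041) the single byte 41 equals the code point, while U+200B becomes the three bytes E2 80 8B and U+6211 becomes E6 88 91. The \x{...} patterns above take the code point, not these bytes.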

c++: XOR'd string with special characters won't compile as raw string literal?

I have a string I've obfuscated in my code, by XORing each character by some random value.
However, the resulting multi-line raw string literal won't compile correctly.
In MSVS2015, the string is not parsed correctly, even when using proper delimiters on either end (black text appears throughout, not being parsed as part of the string).
Trying to compile the code results in errors about not being able to find the closing brace of the literal (even though it's in the proper place, at the very end of the string after the closing delimiter, etc). Manually erasing the black bits results in a proper compilation (albeit with a string that can no longer be properly unscrambled, of course).
I'm assuming this is happening because various resulting characters of the XOR function cannot be properly saved inside the .h file. Is there a solution to this problem? I've tried switching the file format to Unicode but that didn't work.
Your use of raw strings is too simplistic. It takes the sequence ..|.. as the delimiter, and you probably don't have the sequence )..|.. at the end of your string.
Use the full specification of delimited raw strings as described in cppreference in variant (6). This is also described in the C++ standard section §2.14.5 String literals. The template is as follows:
R"d-char-sequence(your raw text)d-char-sequence"
The key is to use the "d-char-sequence". This sequence can contain the following:
any member of the basic source character set except:
space, the left parenthesis (, the right parenthesis ), the backslash \,
and the control characters representing horizontal tab,
vertical tab, form feed, and newline.
How that d-char-sequence works is described as follows:
A string literal that has an R in the prefix is a raw string literal. The d-char-sequence serves as a delimiter. The terminating d-char-sequence of a raw-string is the same sequence of characters as the initial d-char-sequence. A d-char-sequence shall consist of at most 16 characters.
This ensures that the raw string can legally contain any character that is supported by the source character set (Unicode here). The raw string may contain quotes, parentheses, backslashes, and even newlines.
It's not as complicated as it sounds. Just add a prefix and a suffix to the raw string. It could look like this:
std::string(R"my-delimiter(... long text ...)my-delimiter");
Of course, substitute your own raw text for ... long text .... Just make sure that the sequence )my-delimiter" does not appear in the raw string text.
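A minimal sketch of the delimiter rule, with a text that would break a plain R"(...)" literal. The delimiter xx and the function name sample_raw are arbitrary choices for this example.

```cpp
#include <string>

// "xx" is an arbitrary delimiter picked for this sketch (at most 16 characters).
// The text below contains )" which would end a literal with an empty delimiter
// too early, plus a backslash and a line break; all are taken literally.
std::string sample_raw() {
    return R"xx(call f(")") and a \ backslash
second line)xx";
}
```

The literal only ends at )xx", so the embedded )" is harmless, and the backslash is an ordinary character rather than the start of an escape sequence.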

What to use to represent a lambda character in C++

In the program, Lambda λ theoretically represents nothing: ''. I thought of representing this programmatically as '\0', but obviously that terminates a string, which is not necessarily what lambda does. Also, I am reading in from istringstream and it has problems reading that character in.
So what character would you use?
I'm assuming you have a reason for representing Int,Char,Int as a string, rather than just define a struct to hold the data.
As you say, \0 doesn't work as it terminates the string. But there are other invisible ASCII characters that you can use and easily escape in C++. Have a look at this list of escape codes.
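As a rough sketch of that suggestion: the code below assumes the ASCII unit separator '\x1F' never occurs as a real symbol in the input, and uses a hypothetical Transition struct for the Int,Char,Int triple. Neither name comes from the question.

```cpp
#include <sstream>
#include <string>

// Assumption: '\x1F' (ASCII unit separator) is free to act as the lambda marker.
// Unlike '\0' it does not terminate a C string, and it is not whitespace,
// so operator>> reads it like any other character.
constexpr char kLambda = '\x1F';

// Hypothetical record for one Int,Char,Int triple.
struct Transition {
    int from;
    char symbol;
    int to;
};

// Parses "from symbol to" from text; returns false if any field is missing.
bool parse_transition(const std::string& text, Transition& t) {
    std::istringstream in(text);
    return static_cast<bool>(in >> t.from >> t.symbol >> t.to);
}
```

Comparing t.symbol against kLambda then tells a lambda transition apart from one on a real character.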

C++ - Escaping or disabling backslash on string

I am writing a C++ program to solve a common problem of message decoding. Part of the problem requires me to get a bunch of random characters, including '\', and map them to a key, one by one.
My program works fine in most cases, except that when I read characters such as '\' from a string, I obviously get a completely different character representation (e.g. '\0' yields a null character, or '\' simply escapes itself when it needs to be treated as a character).
Since I am not supposed to have any control on what character keys are included, I have been desperately trying to find a way to treat special control characters such as the backslash as the character itself.
My questions are basically these:
Is there a way to turn all special characters off within the scope of my program?
Is there a way to override current digraphs definitions of special characters and define them as something else (like digraphs using very rare keys)?
Is there some obscure method on the String class that I missed which can force the actual character on the string to be read instead of the pre-defined constant?
I have been trying to look for a solution for hours now but all possible fixes I've found are for other languages.
Any help is greatly appreciated.
If you read in a string like "\0" from stdin or a file, it will be treated as two separate characters: '\\' and '0'. There is no additional processing that you have to do.
Escaping characters is only used for string/character literals. That is to say, when you want to hard-code something into your source code.
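A small sketch of the difference, using an istringstream to stand in for stdin or a file. The function name read_word is made up for this example.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one whitespace-delimited word. No escape processing happens here:
// a backslash in the input arrives as an ordinary character.
std::string read_word(std::istream& in) {
    std::string word;
    in >> word;
    return word;
}
```

If the stream holds the two characters \ and 0, read_word returns a string of length 2. Only in source code does the backslash need doubling, e.g. '\\' to write the backslash character itself.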

Failsafe conversion between different character encodings

I need to convert strings from one encoding (UTF-8) to another. The problem is that the target encoding does not have all the characters of the source encoding, and the libc iconv(3) function fails in that situation. What I want is to perform the conversion anyway, with the problematic characters replaced in the output string by some symbol, say '?'.
Programming language is C or C++.
Is there a way to address this issue?
Try appending "//TRANSLIT" or "//IGNORE" to the end of the destination charset string. Note that these suffixes are GNU extensions (supported by the GNU C library and GNU libiconv), not part of POSIX.
From iconv_open(3):
//TRANSLIT
When the string "//TRANSLIT" is appended to tocode, transliteration is activated. This means that when a character cannot be represented in the target character set, it can be approximated through one or several similarly looking characters.
//IGNORE
When the string "//IGNORE" is appended to tocode, characters that cannot be represented in the target character set will be silently discarded.
Alternatively, manually skip the offending character and insert a substitute in the output when iconv(3) fails with errno set to EILSEQ.
Another option is a regex built from the translatable source ranges, used to swap in a placeholder for any characters that don't match.
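The manual EILSEQ approach could look roughly like the sketch below. It makes several assumptions: the source is valid UTF-8, the target is a single-byte-compatible charset such as ASCII or ISO-8859-1 (so a one-byte placeholder is valid), and iconv reports unrepresentable characters with EILSEQ as glibc does. The function name convert_lossy is hypothetical.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts UTF-8 text to to_charset, writing placeholder
// for every character the target charset cannot represent.
// Assumes iconv signals such characters with errno == EILSEQ (glibc behaviour).
std::string convert_lossy(const std::string& utf8, const char* to_charset,
                          char placeholder = '?') {
    iconv_t cd = iconv_open(to_charset, "UTF-8");
    if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");

    std::string in = utf8;                       // iconv needs a mutable buffer
    std::string out(in.size() * 4 + 16, '\0');   // generous output buffer
    char* src = &in[0];
    std::size_t src_left = in.size();
    char* dst = &out[0];
    std::size_t dst_left = out.size();

    while (src_left > 0) {
        std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
        if (rc != (std::size_t)-1) break;        // remaining input converted
        if (errno != EILSEQ || dst_left == 0) {  // EINVAL, E2BIG, or no room left
            iconv_close(cd);
            throw std::runtime_error("iconv failed");
        }
        // Skip one UTF-8 character: the lead byte plus continuation bytes (10xxxxxx).
        do {
            ++src;
            --src_left;
        } while (src_left > 0 && (static_cast<unsigned char>(*src) & 0xC0) == 0x80);
        *dst++ = placeholder;                    // emit the substitute
        --dst_left;
    }
    iconv_close(cd);
    out.resize(out.size() - dst_left);           // trim unused buffer space
    return out;
}
```

Compared with //IGNORE, this keeps one placeholder per dropped character, so the output still shows where something was lost.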