How to get displayed width of a string? - c++

When a string contains non-fixed-width characters (such as \t) or escape codes (such as the ANSI color code \x1b[31m), these characters add to the .length() of a std::string but do not add to the displayed length when printed.
Is there any way in C++ to get the displayed width of a string in *nix?
For instance:
displayed_width("a\tb") would be 4 if the displayed tab width is 2
displayed_width("\x1b[33mGREEN") would be 5

Most commonly, a tab asks the terminal program to move the cursor to a column that's a multiple of 8, though many terminal programs let you configure that. With such behaviour, how much width a tab actually adds depends on where the cursor was beforehand relative to the tab stops. So, simply knowing the string content is not enough to calculate a printable width without some assumption or insight regarding prior cursor placement and tab stops.
Non-printable codes also vary per terminal type, though if you only need ANSI colour then that's pretty easy. You can move along the string counting characters; when you see an ESCAPE skip through to the terminating m. Something like (untested):
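#include <stdexcept>

int displayed_width(const char* p)
{
    int result = 0;
    for ( ; *p; ++p)
    {
        // '\e' is a compiler extension; '\x1b' is the portable spelling of ESC.
        if (p[0] == '\x1b' && p[1] == '[')
        {
            // Skip to the 'm' that terminates an ANSI colour sequence.
            while (*p != 'm')
            {
                if (*p)
                    ++p;
                else
                    throw std::runtime_error("string terminates inside ANSI colour sequence");
            }
        }
        else
            ++result;
    }
    return result;
}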
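To make the tab-stop point above concrete, here is a rough sketch that handles tabs as well as escape sequences. It is only an illustration under stated assumptions: the string is plain ASCII (no multi-byte UTF-8 or wide characters), tab stops sit every tab_width columns, the caller knows the starting column, and the only escape sequences are of the form ESC [ ... followed by a final byte in the range 0x40-0x7E. The parameter names tab_width and start_col are my own and are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch: number of terminal columns an ASCII string occupies.
// Assumptions: tab stops every `tab_width` columns, cursor starts at the
// 0-based column `start_col`, escape sequences are "ESC [ ... final byte".
std::size_t displayed_width(const std::string& s,
                            std::size_t tab_width = 8,
                            std::size_t start_col = 0)
{
    assert(tab_width > 0);
    std::size_t col = start_col;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char c = s[i];
        if (c == '\x1b' && i + 1 < s.size() && s[i + 1] == '[') {
            i += 2;  // step past ESC and '['
            // Skip parameter bytes until the final byte (0x40-0x7E), e.g. 'm'.
            while (i < s.size() && !(s[i] >= 0x40 && s[i] <= 0x7e))
                ++i;
            // The loop's ++i then steps past the final byte itself.
        } else if (c == '\t') {
            col += tab_width - (col % tab_width);  // advance to the next tab stop
        } else {
            ++col;
        }
    }
    return col - start_col;
}
```

Note that with real tab stops, "a\tb" with a tab width of 2 comes out as 3 columns rather than 4, because the tab only advances from column 1 to column 2.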

Nothing built in. The "displayed width" of the tab character is an implementation detail, as are console escape sequences. C++ doesn't care about platform-specific things like that.
Is there something in particular you're trying to do? We may be able to suggest alternatives if we know what particular task you're working on.

Not with standard methods to my knowledge. C++ does not know about terminals.
My guess would be to use NCURSES for that. Dunno if boost has something up the sleeve for that though.

Display length on what device? A console that uses a fixed-width font? A window that uses a proportional font? This is highly device-dependent question. There is no fixed answer. You will have to use the tools associated with the target output device.

How does Vim "take control" of the terminal and allow it to be used as a modifiable buffer?

How does Vim "take control" of the terminal and allow it to be used as a modifiable buffer?
How difficult would it be to create my own program which could arbitrarily modify the character buffer used by the terminal using a language like c++ (in linux)?
Essentially the output buffer is just a bunch of characters.
The most basic way to use the terminal as Vim does is to emit backspaces.
Backspaces in a terminal are non-destructive, so they just move the cursor to the left.
So you can emit backspaces until the cursor is in the right place.
If your terminal supports ANSI escape sequences (special text sequences), you can use \033[row;colH to jump around; e.g. \033[3;3H moves the cursor to row 3, column 3, after which you can print something.
In C that is:
int row = 3;
int col = 3;
printf("\033[%d;%dH", row, col);
printf("Hello world");
Of course not all terminals support ANSI escape sequences.
That's why you've got ncurses library and the move(row,col) function.
You can also try to execute native functions.
I don't know what the Linux headers offer, but on Windows, windows.h provides an efficient way to work with the buffer: SetConsoleCursorPosition.
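For C++ rather than C, the ANSI cursor-positioning sequence described above could be wrapped in a small helper. This is just a sketch; cursor_to is a made-up name, and it assumes the terminal understands ANSI sequences and that rows and columns are 1-based.

```cpp
#include <cassert>
#include <string>

// Sketch: builds the ANSI "cursor position" sequence ESC [ row ; col H.
// Assumes an ANSI-capable terminal; row and col are 1-based.
std::string cursor_to(int row, int col)
{
    assert(row >= 1 && col >= 1);
    return "\x1b[" + std::to_string(row) + ";" + std::to_string(col) + "H";
}
```

Writing std::cout << cursor_to(3, 3) << "Hello world" << std::flush; should then behave like the printf version.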

win api readConsole()

I am trying to use Win API ReadConsole(...) and I want to pass in a delimiter char to halt the input from the console.
The below code works but it only stops reading the input on \r\n.
I would like it to stop reading the console input on '.' for instance.
void read(char *cIn, char delim)
{
    HANDLE hFile;
    DWORD charsRead;
    DWORD charsToRead = MAX_PATH;
    CONSOLE_READCONSOLE_CONTROL cReadControl;
    cReadControl.nLength = sizeof(CONSOLE_READCONSOLE_CONTROL);
    cReadControl.nInitialChars = 0;
    cReadControl.dwCtrlWakeupMask = delim;
    cReadControl.dwControlKeyState = NULL;
    DWORD lpMode;
    // char cIn[MAX_PATH]; //-- buffer to hold data from the console
    hFile = CreateFile("CONIN$", GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_WRITE | FILE_SHARE_READ, NULL,
                       OPEN_EXISTING, 0, NULL);
    GetConsoleMode(hFile, &lpMode);
    // lpMode &= ~ENABLE_LINE_INPUT; //-- turns off this flag
    // SetConsoleMode(hFile, lpMode); //-- set the mode with the new flag off
    bool read = ReadConsole(hFile, cIn, charsToRead * sizeof(TCHAR), &charsRead, &cReadControl);
    cIn[charsRead - 2] = '\0';
}
I know there are other easy ways to do this but I am just trying to understand some of the win api functions and how to use them.
Thank you.
I saw this question and assumed it would be trivial, but spent the last 30 minutes trying to figure it out and finally have something.
That dwCtrlWakeupMask is pretty poorly documented in CONSOLE_READCONSOLE_CONTROL. MSDN says "A user-defined control character used to signal that the read is complete.", but why is it called mask? Why is it a ULONG instead of a TCHAR or something like that? I tried feeding it chars and wchars and nothing happened, so there must be more to the story.
I took to the web searching for that particular variable and found this link:
https://groups.google.com/forum/#!topic/golang-codereviews/KSp37ITmcUg It is a random Go library coder asking for help and the answer is that tab is 1 << '\t'. I tried it, and it works!
So, for future web searchers: dwCtrlWakeupMask is a bitmask of ASCII control characters that will cause ReadConsole to return. You can | together as many 1 << ctrl_char values as you like, but they cannot be arbitrary characters. Since it is a bitmask in a 32-bit value, only the chars 1-31 (inclusive) are possible. This group is called control characters; it includes things like tab, backspace, and bell, which do not represent printable characters per se.
Thus, this mask:
cReadControl.dwCtrlWakeupMask = (1 << '\t') | (1 << 0x08);
Will cause ReadConsole to return when tab (\t) OR when backspace (0x08) is pressed.
The character produced by ctrl+letter has the value of that letter's position in the English alphabet, starting at a == 1. So ctrl+d is 4 and ctrl+z is 26.
Therefore, this will return when the user hits ctrl+d or ctrl+z:
cReadControl.dwCtrlWakeupMask = (1 << 4) | (1 << 26);
Note that the Linux terminal driver also returns on read when the user hits ctrl+d so this might be a nice compatibility thing.
I believe the point of this argument is to allow easier tab-completion in processed input mode; otherwise, you'd have to turn processed input off and process keys one by one to do that. Now you don't have to.... though tbh, I still prefer taking my input with ReadConsoleInput for interactive programs since you get much better control over it all.
There are a lot of other ways to do what you want, and using . as a delimiter is impossible here since it has a value >= 32, so you need to handle that yourself. Still, understanding what this does is interesting to me, and there are scarce resources on the web, so I'm writing this up for future reference.
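To keep the bit-shifting in one place, the mask construction described above could be wrapped like this. It is a sketch with hypothetical helper names (wakeup_mask and ctrl are mine, not part of the Windows API); it only builds the number and does not call ReadConsole, so it compiles anywhere.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Sketch: value of ctrl+letter, e.g. ctrl('d') == 4 and ctrl('z') == 26.
constexpr char ctrl(char letter) { return static_cast<char>(letter & 0x1f); }

// Sketch: builds a value for dwCtrlWakeupMask from control characters.
// Only codes below 32 fit in the 32-bit mask, so '.' (46) is rejected.
std::uint32_t wakeup_mask(std::initializer_list<char> ctrl_chars)
{
    std::uint32_t mask = 0;
    for (char c : ctrl_chars) {
        const unsigned code = static_cast<unsigned char>(c);
        assert(code < 32 && "only control characters fit in the mask");
        mask |= std::uint32_t{1} << code;
    }
    return mask;
}
```

With that, cReadControl.dwCtrlWakeupMask = wakeup_mask({ctrl('d'), ctrl('z')}); would be equivalent to the (1 << 4) | (1 << 26) line above.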
Note that this does not appear to work in wineconsole so be sure you are on a real Windows box to test it out.
Now, dwControlKeyState is actually set BY the function. The value you pass in is ignored (at least as far as I can tell), but you can inspect it for the given flags when the function returns. For example, after calling ReadConsole and hitting the key, it will be 32 if num lock was on, and 48 if num lock was on and you pressed shift+tab. So you test it after the function returns.
I typically like MSDN docs but IMO they completely dropped the ball on explaining this parameter!
You will find this code ridiculous. It is most likely the only way to do this. If you have to adapt to use ReadFile later it is the only way that doesn't consume more input.
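Most of the time you don't really want ReadConsole at all; you want ReadFile on the standard input handle, but I digress.
char *cInptr = cIn;
BOOL ok;  // declared outside the loop so the while condition can see it
do {
    // Read one character at a time (the count is in characters, not bytes)
    // so nothing past the delimiter is consumed.
    ok = ReadConsole(hFile, cInptr, 1, &charsRead, &cReadControl);
    if (ok) cInptr += charsRead;
} while (ok && charsRead > 0 && cInptr[-1] && cInptr[-1] != '.');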
I might have too many tests in the loop due to being paranoid. I'm not inclined to look up all predicates to determine which are implied by the contract of ReadConsole.

Windows.h - SetWindowText Shows the CR-LF Characters

I am writing a basic program in Visual C++ that allows the user to enter text; the program then flips the text and displays it for the user to copy. The program works pretty well until you add a line break to the EDIT box. When the user clicks to flip the text, instead of going down one line, it displays the actual characters for \r\n.
Is there a way to display the text as it should appear instead of the raw string itself?
Here is how I set the text:
wchar_t lpwString[4096];
int length = GetWindowTextW(text->hwnd, lpwString, 4096);
SetWindowText(text->hwnd, flipText(lpwString, length));
Here is the method flipText
LPWSTR flipText(wchar_t textEntered[], const int len) {
    wchar_t text[4096];
    wchar_t flipped[4096];
    wcsncpy_s(text, textEntered, len + 1);
    wcsncpy_s(flipped, textEntered, len + 1);
    for (int i = len - 1, k = 0; i > -1; i--, k++)
        flipped[k] = text[i];
    return flipped;
}
"text" is just an object I created to store data for an EDIT box.
For an edit box, a line break is a CR+LF sequence; when you reverse the text you transform it into LF+CR, which is not recognized (the control shows the individual characters). An easy way out is to do a second pass over the reversed string and swap every LF+CR pair back into CR+LF.
Incidentally, your flipText function is seriously broken: you are performing a useless extra copy of the original string, and you are returning a pointer to a local array, which works only by chance. A much easier method is to reverse the string in place.
Also, if you are working in C++ you should consider using std::string (or std::wstring if working with wide characters), which removes whole classes of buffer lifetime and size problems.
An EDIT control needs the '\r\n' combination to break a line. When you flip all the text, you get \n\r, which means nothing to Windows but text.
Suggestion: flip the text and replace every \n\r with \r\n.
Make sure the ES_WANTRETURN style is used for the edit box.
Also, you should change \n\r back to \r\n right after the flipText() call.
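Putting the suggestions above together (reverse in place, use std::wstring, then swap each LF+CR pair back into CR+LF), a minimal sketch could look like the following. The name flip_text is my own; it is an illustration rather than a drop-in replacement for the asker's function.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Sketch: reverses the text, then repairs line breaks that the reversal
// turned from CR+LF into LF+CR.
std::wstring flip_text(std::wstring text)
{
    std::reverse(text.begin(), text.end());
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        if (text[i] == L'\n' && text[i + 1] == L'\r') {
            std::swap(text[i], text[i + 1]);  // back to "\r\n"
            ++i;                              // skip the pair just fixed
        }
    }
    return text;
}
```

Because it returns a std::wstring by value, there is no pointer to a local array to worry about; the result could then be handed to the control with something like SetWindowTextW(hwnd, flipped.c_str()), assuming hwnd is the edit box handle.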

How to use carriage return with multiple line?

When I want to print out another text in the same line, I can do this:
int i = 0;
string text = "Paragraph ";
while (i < 10) {
    if (clock() % CLOCKS_PER_SEC == 0) {
        cout << text << i + 1 << "\r";
        cout.flush();
        i++;
    }
}
But how can I do this with multiple lines? I want to keep a whole paragraph in its initial position in the terminal. If I replace text with a string that contains a paragraph with some newline characters, it prints a new block of text below the last one printed.
How can I retain its position?
Your question isn't very clear, but I'm going to assume you want to know how to overwrite text in places other than the current line.
Standard C++ doesn't give you this capability. You will have to use OS-specific functionality to place the cursor at an arbitrary place of the console.
Under Unix-like systems you will generally use ANSI escape sequences.
Under Windows you're best served by the console manipulation functions, in particular SetConsoleCursorPosition. The Windows console API documentation lists more console functions.
It is not possible in standard C++.
The technique depends on what the standard output device (i.e. std::cout) is - which is difficult, as that depends on the operating system and choices by the end user. For example, a lot of physical terminals (and terminal/console emulators) support escape sequences. Standard output can be redirected to various devices (including to a text file, which makes positioning the cursor a bit pointless).
In general terms, you will need to specify the output device (i.e. what your program can assume output is being written to), the host system, system settings, and a bunch of other things. And then use an API (or library) supported on the host system. Depending on your choices here, the techniques are highly variable.
Under unix, functions libraries like curses might be used. If you use curses, it will probably be necessary to use other curses functions to actually write your output (rather than cout).
Under windows, there is a set of console API functions (a subset of the win API), such as SetConsoleCursorPosition(). Again, it might be easier if you use other console functions, rather than cout.
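As an illustration of the ANSI escape sequence route mentioned above, here is a sketch that redraws a multi-line block in place. It assumes an ANSI-capable terminal, that standard output is not redirected, and that the whole block fits on screen; redraw_sequence is a hypothetical helper name. The sequences used are ESC [ n A (cursor up n lines) and ESC [ J (erase from the cursor to the end of the screen).

```cpp
#include <cassert>
#include <string>

// Sketch: returns the bytes that overwrite a previously printed block.
// `prev_lines` is the number of '\n' characters in the block printed last
// time (0 on the first call), i.e. how far the cursor must move back up.
std::string redraw_sequence(const std::string& block, int prev_lines)
{
    assert(prev_lines >= 0);
    std::string out;
    if (prev_lines > 0)
        out += "\x1b[" + std::to_string(prev_lines) + "A";  // cursor up
    out += "\r";      // back to the first column
    out += "\x1b[J";  // erase everything from here downwards
    out += block;
    return out;
}
```

In a loop you would write std::cout << redraw_sequence(text, lines) << std::flush; and then set lines to the number of newlines in text before the next iteration.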

FillConsoleOutputCharacter/WriteConsoleOutput and special characters

I'm messing with some of the native Windows console functions, and am impressed by their speed, if not their ease of use.
Anyway, I have long known that the following code will produce some interesting characters
for(int i = 0; i < 256; i++)
{
    cout << char(i) << endl;
}
However, I cannot get FillConsoleOutputCharacter or WriteConsoleOutput to produce all of those characters (many simply appear as question marks).
Here is an example of the code I am using:
COORD spot = {0,0};
HANDLE hOut = GetStdHandle(STD_OUTPUT_HANDLE);
DWORD Written;
for(int i = 0; i < 256; i++)
{
    FillConsoleOutputAttribute(hOut, 7, 1, spot, &Written);
    FillConsoleOutputCharacterW(hOut, char(i), 1, spot, &Written);
    spot.Y++;
}
Does anyone know of a relatively convenient way to write those characters with the native functions?
By the way, I am using Visual Studio 2010 on Windows 7 x64.
Try using FillConsoleOutputCharacterA instead of FillConsoleOutputCharacterW; the W version takes Unicode characters, which takes a little knowledge to get right.
Edit: I tried FillConsoleOutputCharacterA and it gives output equivalent to your first case.
FillConsoleOutputCharacterA should write the same set of characters that the cout function does. These characters are determined by the console's current code page.
With FillConsoleOutputCharacterW, you can still generate all the same characters (as well as any additional characters that may be included in the console font) but you need to use the Unicode (16-bit) codes for these characters, rather than the 8-bit codes used with cout.
Note that Windows internally uses an older form of Unicode, with characters limited to 16 bits (0-65535), rather than Unicode proper, whose code points run from 0 to 1,114,111 (0x10FFFF), although most of these remain unassigned. I believe the console's Unicode character set corresponds to plane 0 of Unicode, the Basic Multilingual Plane.
The question marks appear when you write a control character or a character that isn't included in the current font.
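To show what "use the Unicode codes rather than the 8-bit codes" means in practice, here is a small sketch that maps a handful of 8-bit codes to their Unicode equivalents. It assumes the console code page is 437 (the classic US OEM page), and the table is deliberately partial; cp437_to_unicode is a made-up name. On Windows itself, calling MultiByteToWideChar with the code page returned by GetConsoleOutputCP() would probably be the more complete route.

```cpp
// Sketch: partial code page 437 -> Unicode mapping (assumes CP437).
// Unlisted non-ASCII codes fall back to '?', just for illustration.
wchar_t cp437_to_unicode(unsigned char c)
{
    switch (c) {
        case 0x01: return 0x263A;  // white smiling face
        case 0x03: return 0x2665;  // black heart suit
        case 0xB0: return 0x2591;  // light shade
        case 0xB1: return 0x2592;  // medium shade
        case 0xB2: return 0x2593;  // dark shade
        case 0xDB: return 0x2588;  // full block
        default:
            // Printable ASCII has the same value in CP437 and Unicode.
            return (c >= 0x20 && c < 0x7F) ? static_cast<wchar_t>(c) : L'?';
    }
}
```

The loop in the question would then pass cp437_to_unicode(static_cast<unsigned char>(i)) to FillConsoleOutputCharacterW instead of char(i).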