C++ Unicode Bullet Point

I am trying to insert the Unicode character U+2022 (bullet •) in my C++ application.
I can't figure out how to convert that U+2022 to a char/string for use in std::string constructor...
char bullet = char(0x2022);
mPassword.SetText( std::string(mText.length(), bullet) );
This doesn't work. Hope you can help!
Thanks

A wide-character literal has type wchar_t (see §2.13.4 of the C++ Standard), and a wchar_t can hold this code point. You could use it as follows:
wchar_t bullet = L'\x2022';
In a string it looks like this:
std::wstring str_w_bullet( L"some text with \x2022" );

Use std::wstring, which is the same class template as std::string (std::basic_string) but specialized on wchar_t.
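
To tie this back to the question, here is a minimal sketch of building a string of n bullets. The helper names make_mask_wide and make_mask_utf8 are hypothetical and not from the original posts. The second variant assumes the text control accepts UTF-8 in a std::string, which the question does not state.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: n copies of U+2022 as a wide string.
std::wstring make_mask_wide(std::size_t n) {
    return std::wstring(n, L'\u2022');
}

// Hypothetical helper: n copies of U+2022 encoded as UTF-8 (3 bytes each).
// Only useful if the consumer of the std::string expects UTF-8.
std::string make_mask_utf8(std::size_t n) {
    static const char kBullet[] = "\xE2\x80\xA2"; // UTF-8 bytes of U+2022
    std::string out;
    out.reserve(n * 3);
    for (std::size_t i = 0; i < n; ++i) {
        out += kBullet;
    }
    return out;
}
```

A single char cannot hold U+2022, which is why char(0x2022) in the question is truncated. Either the element type has to be wider (wchar_t) or the character has to be spread over several char elements (UTF-8).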

Related

Converting string to wchar_t (wide character) C++ [duplicate]

Is there any method?
My computer is AMD64.
::std::string str;
BOOL loadU(const wchar_t* lpszPathName, int flag = 0);
When I used:
loadU(&str);
the VS2005 compiler says:
Error 7 error C2664:: cannot convert parameter 1 from 'std::string *__w64 ' to 'const wchar_t *'
How can I do it?

First convert it to std::wstring:
std::wstring widestr = std::wstring(str.begin(), str.end());
Then get the C string:
const wchar_t* widecstr = widestr.c_str();
This works only for ASCII strings; it will not work if the underlying string is UTF-8 encoded. A conversion routine such as MultiByteToWideChar() handles that case properly.

If you have a std::wstring object, you can call c_str() on it to get a wchar_t*:
std::wstring name( L"Steve Nash" );
const wchar_t* szName = name.c_str();
Since you are operating on a narrow string, however, you would first need to widen it. There are various options here; one is to use Windows' built-in MultiByteToWideChar routine. That will give you an LPWSTR, which is equivalent to wchar_t*.

You can use the ATL text conversion macros to convert a narrow (char) string to a wide (wchar_t) one. For example, to convert a std::string:
#include <atlconv.h>
...
std::string str = "Hello, world!";
CA2W pszWide(str.c_str());
loadU(pszWide);
You can also specify a code page, so if your std::string contains UTF-8 chars you can use:
CA2W pszWide(str.c_str(), CP_UTF8);
Very useful but Windows only.

If you are on Linux/Unix, have a look at mbstowcs() and wcstombs(), which are part of the standard C library (since ISO C90) and declared in <stdlib.h>.
mbs stands for "multibyte string" and is basically the usual zero-terminated C string.
wcs stands for "wide character string" and is an array of wchar_t.
For more background on wide characters, see the glibc documentation.
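
As a rough illustration of the mbstowcs() route, the sketch below widens a std::string using the current C locale. The helper name widen_with_locale is hypothetical. It assumes the input has no embedded NUL bytes, and it returns an empty string when the bytes are not valid in the active locale.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: widen a multibyte string using the current C locale.
std::wstring widen_with_locale(const std::string& narrow) {
    // Every wide character consumes at least one input byte, so
    // size() + 1 elements are always enough (the +1 is for the terminator).
    std::vector<wchar_t> buffer(narrow.size() + 1);
    const std::size_t written =
        std::mbstowcs(buffer.data(), narrow.c_str(), buffer.size());
    if (written == static_cast<std::size_t>(-1)) {
        return std::wstring(); // invalid multibyte sequence for this locale
    }
    return std::wstring(buffer.data(), written);
}
```

The result depends on the locale: a program normally calls std::setlocale(LC_ALL, "") once at startup so that the environment's locale (for example a UTF-8 one) is used instead of the default "C" locale.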

I need to pass a wchar_t string to a function, and first I need to create that string from a string literal concatenated with an integer variable.
The original string looks like this, where 4 is the physical drive number, but I want to be able to change it to whatever drive number I pass to the function:
auto TargetDrive = L"\\\\.\\PhysicalDrive4";
The following works:
int a = 4;
std::string stddrivestring = "\\\\.\\PhysicalDrive" + std::to_string(a);
std::wstring widedrivestring = std::wstring(stddrivestring.begin(), stddrivestring.end());
const wchar_t* TargetDrive = widedrivestring.c_str();
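
The detour through std::string can be avoided with std::to_wstring (C++11). The following is a small sketch; the function name physical_drive_path is hypothetical.

```cpp
#include <string>

// Hypothetical helper: build the device path directly as a wide string.
std::wstring physical_drive_path(int drive_number) {
    return L"\\\\.\\PhysicalDrive" + std::to_wstring(drive_number);
}
```

The pointer returned by c_str() is only valid while the std::wstring it came from is alive and unmodified, so keep the returned object in a named variable for as long as the const wchar_t* is in use.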

Using UNICODE character values in C++

How do you use Unicode in C++?
I'm aware of wchar_t and wchar_t*, but I want to know how to assign a value using only Unicode code point values, similar to the way a character can be assigned by setting the variable to its ASCII value:
char a = 92;
I'm using the MinGW compiler, if it makes a difference.

It can be as simple as:
wchar_t a=L'a';
wchar_t hello_world[]=L"Hello World";
// Or if you really want it to be (old school) C++ and not C
std::wstring s(L"Hello World");
// Or if you want to (be bleeding edge and) use C++11
std::u16string s16(u"Hello World");
std::u32string s32(U"Hello World for the ∞ᵗʰ time");

Exactly the same way:
wchar_t a = 97;
wchar_t xi = 0x03be; // ξ
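
A code point can also be spelled as a universal character name (\uXXXX) instead of a bare number. The sketch below shows that both spellings produce the same value; it assumes a C++11 compiler, and the names kXiNumeric, kXiEscaped and xi_twice are made up for illustration.

```cpp
#include <string>

// The same code point written two ways: as a number and as a
// universal character name. Both hold the value 0x03BE.
constexpr char32_t kXiNumeric = 0x03BE;
constexpr char32_t kXiEscaped = U'\u03BE';
static_assert(kXiNumeric == kXiEscaped, "both spellings name U+03BE");

// Hypothetical helper: a UTF-32 string built from both spellings.
std::u32string xi_twice() {
    std::u32string s;
    s.push_back(kXiNumeric); // assigned from the numeric value
    s.push_back(kXiEscaped); // assigned from the escape sequence
    return s;
}
```

char32_t is used here because it is always wide enough for any code point, whereas the width of wchar_t differs between platforms.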

CStringT to char[]

I'm trying to make changes to some legacy code. I need to fill a char[] ext with a file extension gotten using filename.Right(3). Problem is that I don't know how to convert from a CStringT to a char[].
There has to be a really easy solution that I'm just not realizing...
TIA.

If you have access to ATL, which I imagine you do if you're using CString, then you can look into the ATL conversion classes like CT2CA.
CString fileExt = _T ("txt");
CT2CA fileExtA (fileExt);
If a conversion needs to be performed (as when compiling for Unicode), then CT2CA allocates some internal memory and performs the conversion, destroying the memory in its destructor. If compiling for ANSI, no conversion needs to be performed, so it just hangs on to a pointer to the original string. It also provides an implicit conversion to const char * so you can use it like any C-style string.
This makes conversions really easy, with the caveat that if you need to hang on to the string after the CT2CA goes out of scope, then you need to copy the string into a buffer under your control (not just store a pointer to it). Otherwise, the CT2CA cleans up the converted buffer and you have a dangling reference.

Well you can always do this even in unicode
char str[4];
strcpy( str, CStringA( cString.Right( 3 ) ).GetString() );
If you know you AREN'T using unicode then you could just do
char str[4];
strcpy( str, cString.Right( 3 ).GetString() );
All the first code block does is copy the last 3 characters into a non-Unicode string (CStringA; CStringW is always Unicode, and CStringT depends on whether UNICODE is defined) and then get that string as a plain char string.

First use CStringA to make sure you're getting char and not wchar_t. Then just cast it to (const char *) to get a pointer to the string, and use strcpy or something similar to copy to your destination.
If you're completely sure that you'll always be copying 3 characters, you could just do it the simple way (this assumes each element of filename is a char, i.e. a non-Unicode build or a CStringA).
ext[0] = filename[filename.GetLength()-3];
ext[1] = filename[filename.GetLength()-2];
ext[2] = filename[filename.GetLength()-1];
ext[3] = 0;

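I believe this is what you are looking for:
CString theString( "This is a test" );
CStringA narrow( theString ); // force char, even in a Unicode build
char* mychar = new char[narrow.GetLength()+1];
strcpy(mychar, narrow); // _tcscpy would expect wchar_t* in a Unicode build
If I remember my old-school MS C++.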

You do not say where the CStringT type comes from. It could be anything, including your own string-handling class. Assuming it is CStringT from the MFC/ATL library available in Visual C++, you have a few options.
You have not said whether you compile with or without Unicode, so this example uses TCHAR rather than char:
CStringT<TCHAR, StrTraitMFC<TCHAR, ChTraitsCRT<TCHAR> > > file(TEXT("test.txt"));
TCHAR* file1 = new TCHAR[file.GetLength() + 1];
_tcscpy(file1, file);
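If you use CStringT specialised for ANSI strings, then:
std::string file1 = CStringA(file).GetString(); // written with '=' because std::string file1(CStringA(file)); parses as a function declaration
char const* pfile = file1.c_str(); // to copy to char[] buffer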

Assigning a "const char*" to std::string is allowed, but assigning to std::wstring doesn't compile. Why?

I assumed that std::wstring and std::string both provide more or less the same interface.
So I tried to enable unicode capabilities for our application
# ifdef APP_USE_UNICODE
typedef std::wstring AppStringType;
# else
typedef std::string AppStringType;
# endif
However, that gives me a lot of compile errors when -DAPP_USE_UNICODE is used.
It turned out that the compiler chokes when a const char[] is assigned to std::wstring.
EDIT: improved example by removing the usage of literal "hello".
#include <string>
void myfunc(const char h[]) {
std::string s = h; // compiles OK
std::wstring w = h; // compile error
}
Why does it make such a difference?
Assigning a const char* to std::string is allowed, but assigning to std::wstring gives compile errors.
Shouldn't std::wstring provide the same interface as std::string? At least for such a basic operation as assignment?
(environment: gcc-4.4.1 on Ubuntu Karmic 32bit)

You should do:
#include <string>
int main() {
const wchar_t h[] = L"hello";
std::wstring w = h;
return 0;
}
std::string is a typedef of std::basic_string<char>, while std::wstring is a typedef of std::basic_string<wchar_t>. As such, the 'equivalent' C-string of a wstring is an array of wchar_ts.
The 'L' in front of the string literal is to indicate that you are using a wide-char string constant.

The relevant part of the string API is this constructor:
basic_string(const charT*);
For std::string, charT is char. For std::wstring it's wchar_t. So the reason it doesn't compile is that wstring doesn't have a char* constructor. Why doesn't wstring have a char* constructor?
There is no one unique way to convert a string of char to a string of wchar_t. What's the encoding used with the char string? Is it just 7-bit ASCII? Is it UTF-8? Is it UTF-7? Is it SHIFT-JIS? So I don't think it would entirely make sense for std::wstring to have an automatic conversion from char*, even though you could cover most cases. You can use:
w = std::wstring(h, h + std::strlen(h)); // needs <cstring>; sizeof(h) - 1 is only correct when h is a real array, not a pointer parameter
which will convert each char in turn to wchar_t (the NUL terminator is not copied), and in this example that's probably what you want. As int3 says though, if that's what you mean it's most likely better to use a wide string literal in the first place.
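
The char-by-char widening described above can be wrapped in a small function. This is a sketch only: the name widen_ascii is hypothetical, the length is taken with std::strlen so it also works when h is a pointer parameter, and the result is only meaningful for 7-bit ASCII input.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: widen a NUL-terminated char string one char at a time.
// Going through unsigned char keeps bytes above 0x7F from being sign-extended
// into negative wchar_t values on platforms where char is signed.
std::wstring widen_ascii(const char* s) {
    const unsigned char* first = reinterpret_cast<const unsigned char*>(s);
    return std::wstring(first, first + std::strlen(s));
}
```

With this helper, the body of myfunc from the question could be written as std::wstring w = widen_ascii(h);.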

To convert from a multibyte encoding to a wide character encoding, take a look at the header <locale> and the type std::codecvt. The Dinkumware library has a class Dinkum::wstring_convert that makes performing such multibyte-to-wide conversions easier.
The class template std::codecvt_byname lets you obtain a codecvt facet for a particular named locale. Unfortunately, discovering the names of the encodings (or locales) on your system is implementation-specific.

A small suggestion: do not use "Unicode" strings (a.k.a. wide strings) under Linux. std::string is perfectly fine and holds Unicode very well (as UTF-8).
Most Linux APIs work with char* strings, and the most popular encoding is UTF-8.
So just don't bother with wstring.

In addition to the other answers, you could use a trick from Microsoft's book (specifically, tchar.h), and write something like this:
# ifdef APP_USE_UNICODE
typedef std::wstring AppStringType;
#define _T(s) (L##s)
# else
typedef std::string AppStringType;
#define _T(s) (s)
# endif
AppStringType foo = _T("hello world!");
(Note: my macro-fu is weak, and this is untested, but you get the idea.)
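
For what it's worth, here is a variant of that idea that avoids the name _T (identifiers starting with an underscore followed by an uppercase letter are reserved for the implementation). The macro names APP_TEXT and APP_TEXT_IMPL, the constant APP_GREETING and the function greeting are all hypothetical.

```cpp
#include <string>

#ifdef APP_USE_UNICODE
typedef std::wstring AppStringType;
#define APP_TEXT_IMPL(s) L##s
#else
typedef std::string AppStringType;
#define APP_TEXT_IMPL(s) s
#endif
// Second level, so that an argument which is itself a macro
// is expanded before the L prefix is pasted on.
#define APP_TEXT(s) APP_TEXT_IMPL(s)

#define APP_GREETING "hello world!" // hypothetical constant

AppStringType greeting() {
    return APP_TEXT(APP_GREETING);
}
```

The two-level macro only matters when the argument is another macro; for plain literals such as APP_TEXT("hello world!") a single level would behave the same.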

Looks like you can do something like this:
#include <sstream>
// ...
std::wstringstream tmp;
tmp << "hello world";
std::wstring our_string = tmp.str();
Although for a more complex situation, you may want to break down and use mbstowcs.

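You should use
#include <tchar.h>
tstring instead of wstring/string (tstring is not declared by tchar.h; define it yourself, e.g. typedef std::basic_string<TCHAR> tstring;)
TCHAR* instead of char*
and _T("hello") instead of "hello" or L"hello".
This selects the appropriate string and character types depending on whether _UNICODE is defined.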

We Keep Coding
