How to convert AS3 ByteArray into wchar_t const* filename?
So in my C code I have a function waiting for a file with void fun (wchar_t const* filename) how to send to that function my ByteArray? (Or, how should I re-write my function?)
A four month old question. Better late than never?
To convert a ByteArray to a String in AS3, there are two ways depending on how the String was stored. Firstly, if you use writeUTF it will write an unsigned short representing the String's length first, then write out the string data. The string is easiest to recover this way. Here's how it's done in AS3:
byteArray.position = 0;
var str:String = byteArray.readUTF();
Alternatively, toString also works in this case. The second way to store is with writeUTFBytes. This won't write the length to the beginning, so you'll need to track it independantly somehow. It sounds like you want the entire ByteArray to be a single String, so you can use the ByteArray's length.
byteArray.position = 0;
var str:String = byteArray.readUTFBytes(byteArray.length);
Since you want to do this with Alchemy, you just need to convert the above code. Here's a conversion of the second example:
std::string batostr(AS3_Val byteArray) {
AS3_SetS(byteArray, "position", AS3_Int(0));
return AS3_StringValue(AS3_CallS("readUTFBytes", byteArray,
AS3_Array("AS3ValType", AS3_GetS(byteArray, "length")) ));
}
This has a ridiculous amount of memory leaks, of course, since I'm not calling AS3_Release anywhere. I use a RAII wrapper for AS3_Val in my own code... for the sake of my sanity. As should you.
Anyway, the std::string my function returns will be UTF-8 multibyte. Any standard C++ technique for converting to wide characters should work from here. (search the site for endless reposts) I suggest leaving it is as it, though. I can't think of any advantage to using wide characters on this platform.
Related
Many topics have discussed the difference between string and char[]. However, they are not clear to me to understand why we need to bring string in c++? Any insight is welcome, thanks!
char[] is C style. It is not object oriented, it forces you as the programmer to deal with implementation details (such as '\0' terminator) and rewrite standard code for handling strings every time over and over.
char[] is just an array of bytes, which can be used to store a string, but it is not a string in any meaningful way.
std::string is a class that properly represents a string and handles all string operations.
It lets you create objects and keep your code fully OOP (if that is what you want).
More importantly, it takes care of memory management for you.
Consider this simple piece of code:
// extract to string
#include <iostream>
#include <string>
main ()
{
std::string name;
std::cout << "Please, enter your name: ";
std::cin >> name;
std::cout << "Hello, " << name << "!\n";
return 0;
}
How would you write the same thing using char[]?
Assume you can not know in advance how long the name would be!
Same goes for string concatenation and other operations.
With real string represented as std::string you combine two strings with a simple += operator. One line.
If you are using char[] however, you need to do the following:
Calculate the size of the combined string + terminator character.
Allocate memory for the new combined string.
Use strncpy to copy first string to new array.
Use strncat to append second string to first string in new array.
Plus, you need to remember not to use the unsafe strcpy and strcat and to free the memory once you are done with the new string.
std::string saves you all that hassle and the many bugs you can introduce while writing it.
As noted by MSalters in a comment, strings can grow. This is, in my opinion, the strongest reason to have them in C++.
For example, the following code has a bug which may cause it to crash, or worse, to appear to work correctly:
char message[] = "Hello";
strcat(message, "World");
The same idea with std::string behaves correctly:
std::string message{"Hello"};
message += "World";
Additional benefits of std::string:
You can send it to functions by value, while char[] can only be sent by reference; this point looks rather insignificant, but it enables powerful code like std::vector<std::string> (a list of strings which you can add to)
std::string stores its length, so any operation which needs the length is more efficient
std::string works similarly to all other C++ containers (vector, etc) so if you are already familiar with containers, std::string is easy to use
std::string has overloaded comparison operators, so it's easy to use with std::map, std::sort, etc.
String class is no more than an amelioration of the char[] variable.
With strings you can achieve the same goals than the use of a char[] variable, but you won't have to matter about little tricks of char[] like pointers, segmentation faults...
This is a more convenient way to build strings, but you don't really see the "undergrounds" of the language, like how to implement concatenation or length functions...
Here is the documentation of the std::string class in C++ : C++ string documentation
I am using ESP8266 Wifi chip with the SMING framework which uses C++. I have a tcpServer function which receives data from a TCP port. I would like to convert the incoming char *data into String data type. This is what I did.
bool tcpServerClientReceive(TcpClient& client, char *data, int size)
{
String rx_data;
rx_data = String(data);
Serial.printf("rx_data=%s\r",rx_data);
}
The contents of rx_data is rubbish. What is wrong with the code? How to make rx_data into a proper string?
Why what you are doing is wrong:
A C style string is an array of char where the last element is a 0 Byte. This is how functions now where the string ends. They scan the next character until they find this zero byte. A C++ string is a class which can hold additional data.
For instance to get the length of a string one might choose to store the length of the stirng in a member of the class and update it everytime the string is modified. While this means additional work if the string is modified it makes the call t length trivial and fast, since it simply returns the stored value.
For C Strings on the other hand length has to loop over the array and count the number of characters until it finds the null byte. thus the runime of strlen depends on the lengh of the string.
The solution:
As pointed out above you have to print it correctly, try either:
#include <iostream>
...
std::cout << "rx_data=" << rx_data << std::endl;
or if you insist on printf (why use c++ then?) you can use either string::c_str(), or (since C++11, before the reutrned array might not be null terminated) string::data(): your code would become:
Serial.printf("rx_data=%s\r",rx_data.c_str());
I would suggest you have a look at std::string to get an idea of the details. In fact if you have the time a good book could help explaining a lot of important concepts, including containers, like std::string or std::vector. Don't assume that because you know C you know how to write C++.
I am currently working on an application in C++, that ties into Lua, that ties into Flash (in that order). My goal at the moment is getting wchar_ts from C++ into Flash, via Lua. I would love any insights as to how I can accomplish this!
If any other information is required, please ask and I'll do my best to provide it
What I have tried
It's my understanding that Lua is not a fan of Unicode, but it should still be able to receive the string of bytes from my C++ application. I imagine there must be a way to then pass those bytes over to Flash to then render out my intended Unicode. So what I've done so far:
C++:
//an example wchar_t*
const wchar_t *text = L"Test!";
//this function pushes a char* to my Lua code
lua.PushString((char*)text); //directly casting text to a char*... D:
Lua:
theString = FunctionThatGetsWCharFromCpp();
flash.ShowString(theString);
Flash:
function ShowString(theString:String)
{
myTextField.text = theString;
}
Now the outcome here is that myTextField only shows "T". This made sense to me. The cast from wchar_t to char would end up padding out the chars with some zeros, especially since "T" doesn't really utilize both bytes of a wchar_t. A quick look at the documentation yields:
lua_pushstring
The string cannot contain embedded zeros; it is assumed to end at the first zero.
So I ran a little test:
C++:
//prefixing with a Japanese character
//which will use both bytes of the wchar_t
const wchar_t *text = L"たTest!";
The Flash textbox now reads: "_0T", 3 characters. Makes total sense, the 2 bytes of the Japanese character + T, then termination.
I understand what is going on, but I am still completely unsure of how to tackle this problem. And I'm really unsure of what to search for. Is there a specific Lua function I can use to pass a wad of bytes over to Lua from C++ (I've read somewhere that lua_pushlstring is often used for this, but that also terminates at first zero)? Is there a Flash datatype that will accept these bytes, then I'll need to do some sort of conversion to get them into a readable, multibyte string? or is this just really not possible?
Note:
I'm not too familiar with Unicode and code pages and whatnot, so I'm not too sure if there'll also be a step where I'll need to specify the correct encoding in Flash so that I can get the correct output - but I'm happy to cross that bridge when I get there, but if anyone has any insight here too, that would be great!
I don't know if this will work, but I'd recommend trying to use UTF-8. A string encoded in UTF-8 doesn't have any embedded zeros in it, so Lua should be able handle it, and Flash ought to also be able to handle it, depending on how exactly the languages interface.
Here's one way to convert a wide-character string to UTF-8 using setlocale(3) wcstombs(3):
// Error checking omitted for expository purposes
// Call this once at program startup. If you'd rather not change the locale,
// you can instead write your own conversion routine (but beware of UTF-16
// surrogate pairs if you do)
setlocale(LC_ALL, "en_US.UTF-8");
// Do this for each string you want to convert
const wchar_t *wideString = L"たTest!";
size_t len = wcslen(wideString);
size_t maxUtf8len = 4 * len + 1; // Each wchar_t encodes to a max of 4 bytes
char *utf8String = new char[maxUtf8len];
wcstombs(utf8String, wideString, maxUtf8len);
...
// Do stuff with utf8string
...
delete [] utf8String;
If you're on Windows, you can instead use the WideCharToMultiByte function with the CP_UTF8 code page to do the conversion, since I don't believe that the Visual Studio C runtime supports UTF-8 locales:
// Error checking omitted for expository purposes
const wchar_t *wideString = L"たTest!";
size_t len = wcslen(wideString);
size_t maxUtf8len = 4 * len + 1; // Each wchar_t encodes to a max of 4 bytes
char *utf8String = new char[maxUtf8len];
WideCharToMultiByte(CP_UTF8, 0, wideString, len + 1, utf8String, maxUtf8len, NULL, NULL);
...
// Do stuff with utf8string
...
delete [] utf8String;
I'm new to c++ (I'm a c# developer).
I have an SQLite wrapper class that requires you to pass in a database name as a const char* , however I only have it as a Platform::String (after doing a file search).
I cant seem to find a way to convert the Platform::String to const char*.
Ive seen another question on StackOverflow that explain why it isnt straight-forward, but no sample code or end-to-end solution.
Can anyone help me ?
Thanks
Disclaimer: I know little about C++/CX, and I'm basing the answer on the documentation here.
The String class contains 16-bit Unicode characters, so you can't directly get a pointer to 8-bit char-typed characters; you'll need to convert the contents.
If the string is known to only contain ASCII characters, then you can convert it directly:
String s = whatever();
std::string narrow(s.Begin(), s.End());
function_requiring_cstring(narrow.c_str());
Otherwise, the string will need translating, which gets rather hairy. The following might do the right thing, converting the wide characters to multi-byte sequences of narrow characters:
String s = whatever();
std::wstring wide(s.Begin(), s.End());
std::vector<char> buffer(s.Length()+1); // We'll need at least that much
for (;;) {
size_t length = std::wcstombs(buffer.data(), wide.c_str(), buffer.size());
if (length == buffer.size()) {
buffer.resize(buffer.size()*2);
} else {
buffer.resize(length+1);
break;
}
}
function_requiring_cstring(buffer.data());
Alternatively, you may find it easier to ignore Microsoft's ideas about how strings should be handled, and use std::string instead.
Is it possible to somehow adapt a c-style string/buffer (char* or wchar_t*) to work with the Boost String Algorithms Library?
That is, for example, it's trimalgorithm has the following declaration:
template<typename SequenceT>
void trim(SequenceT &, const std::locale & = std::locale());
and the implementation (look for trim_left_if) requires that the sequence type has a member function erase.
How could I use that with a raw character pointer / c string buffer?
char* pStr = getSomeCString(); // example, could also be something like wchar_t buf[256];
...
boost::trim(pStr); // HOW?
Ideally, the algorithms would work directly on the supplied buffer. (As far as possible. it obviously can't work if an algorithm needs to allocate additional space in the "string".)
#Vitaly asks: why can't you create a std::string from char buffer and then use it in algorithms?
The reason I have char* at all is that I'd like to use a few algorthims on our existing codebase. Refactoring all the char buffers to string would be more work than it's worth, and when changing or adapting something it would be nice to just be able to apply a given algorithm to any c-style string that happens to live in the current code.
Using a string would mean to (a) copy char* to string, (b) apply algorithm to string and (c) copy string back into char buffer.
For the SequenceT-type operations, you probably have to use std::string. If you wanted to implement that by yourself, you'd have to fulfill many more requirements for creation, destruction, value semantics etc. You'd basically end up with your implementation of std::string.
The RangeT-type operations might be, however, usable on char*s using the iterator_range from Boost.Range library. I didn't try it, though.
There exist some code which implements a std::string like string with a fixed buffer. With some tinkering you can modify this code to create a string type which uses an external buffer:
char buffer[100];
strcpy(buffer, " HELLO ");
xstr::xstring<xstr::fixed_char_buf<char> >
str(buffer, strlen(buffer), sizeof(buffer));
boost::algorithm::trim(str);
buffer[str.size()] = 0;
std::cout << buffer << std::endl; // prints "HELLO"
For this I added an constructor to xstr::xstring and xstr::fixed_char_buf to take the buffer, the size of the buffer which is in use and the maximum size of the buffer. Further I replaced the SIZE template argument with a member variable and changed the internal char array into a char pointer.
The xstr code is a bit old and will not compile without trouble on newer compilers but it needs some minor changes. Further I only added the things needed in this case. If you want to use this for real, you need to make some more changes to make sure it can not use uninitialized memory.
Anyway, it might be a good start for writing you own string adapter.
I don't know what platform you're targeting, but on most modern computers (including mobile ones like ARM) memory copy is so fast you shouldn't even waste your time optimizing memory copies. I say - wrap char* in std::string and check whether the performance suits your needs. Don't waste time on premature optimization.