How can I combine two const char*s into a third?
I'm trying to do this with the following code:
const char* pName = "Foo";
printf("\nMy name is %s.\n\n\n",pName);
const char* nName;
int num_chars = asprintf(&nName, "%s%s", "Somebody known as ", pName);
But I get this error:
'asprintf': identifier not found
I include stdio.h via this code:
#include <stdio.h>
Simple, just use C++:
const char* pName = "Foo";
std::string name("Somebody known as ");
name += pName;
const char* nName = name.c_str();
asprintf is a GNU extension. You can use snprintf or strncat instead, but then you need to handle the memory management yourself, whereas asprintf allocates the result for you.
It is better to use std::string, which makes the code much easier.
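Use sprintf (or snprintf) or strcat (or strncat). In both cases nName must be a writable char buffer that is large enough for the result, for example char nName[64], not an uninitialized const char*.
With sprintf:
sprintf(nName, "%s%s", "Somebody known as ", pName);
With strcat:
strcpy(nName, "Somebody known as ");
strcat(nName, pName);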
I will assume that you are using C, even though you've tagged this question as C++. If you want C++, see Luchian's answer.
There are a few errors in the code. The biggest is that you didn't allocate memory for the string pointed to by nName. The second is that you pass the address of the nName variable to asprintf rather than the address of a reserved memory location. The third is that asprintf is not a standard C function but a GNU extension, so it might not be available on your compiler (you didn't say which one you use): http://linux.die.net/man/3/asprintf
You should use something like this:
#include <stdio.h>
const char* pName = "Foo";
printf("\nMy name is %s.\n\n\n",pName);
char nName[30];
int num_chars = sprintf(nName, "%s%s", "Somebody known as ", pName);
Edit: I've read more about asprintf now. You should pass the address of your pointer to asprintf, but the pointer should be a char*, not a const char*, because the memory location it points to changes once asprintf has allocated enough memory.
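For compilers without asprintf, the two approaches discussed in these answers could be sketched roughly as follows. The helper names join_cstr and join_cstr_malloc are made up for illustration. The second one relies on the standard behaviour that snprintf, called with a null buffer and size 0, only reports the length it would need.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: concatenate two C strings into a std::string.
std::string join_cstr(const char* a, const char* b)
{
    std::string result(a);
    result += b;
    return result;
}

// Hypothetical helper: asprintf-like behaviour using only standard functions.
// Returns a malloc'd buffer that the caller must free(), or nullptr on failure.
char* join_cstr_malloc(const char* a, const char* b)
{
    // With a null buffer and size 0, snprintf only reports the length needed.
    int needed = std::snprintf(nullptr, 0, "%s%s", a, b);
    if (needed < 0)
        return nullptr;
    std::size_t size = static_cast<std::size_t>(needed) + 1; // room for '\0'
    char* out = static_cast<char*>(std::malloc(size));
    if (out == nullptr)
        return nullptr;
    std::snprintf(out, size, "%s%s", a, b);
    return out;
}
```

With the std::string version, join_cstr("Somebody known as ", pName).c_str() gives a const char* that stays valid only as long as the returned string object is kept alive.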
I'm trying to write a function to parse and extract the components of a URL. Moreover, I need the components (e.g. hostname) to have the type char * since I intend to pass them to C APIs.
My current approach is to save the components in the parse_url function to the heap by calling malloc. But for some reason, the following code is printing gibberish. I'm confused by this behavior because I thought memory allocated on the heap will persist even after the function allocating it returns.
I'm new to C/C++, so please let me know what I did wrong and how to achieve what I wanted. Thank you.
#include <iostream>
#include <string>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
using namespace std;
void cast_to_cstyle(string source, char *target)
{
target = (char *)malloc(source.size() + 1);
memcpy(target, source.c_str(), source.size() + 1);
}
void parse_url(string url, char *protocol_cstyle, char *hostname_cstyle, char *port_cstyle, char *path_cstyle)
{
size_t found = url.find_first_of(":");
string protocol = url.substr(0, found);
string url_new = url.substr(found + 3); // `url_new` is the url excluding the "http://" part
size_t found1 = url_new.find_first_of(":");
string hostname = url_new.substr(0, found1);
size_t found2 = url_new.find_first_of("/");
string port = url_new.substr(found1 + 1, found2 - found1 - 1);
string path = url_new.substr(found2);
cast_to_cstyle(protocol, protocol_cstyle);
cast_to_cstyle(hostname, hostname_cstyle);
cast_to_cstyle(port, port_cstyle);
cast_to_cstyle(path, path_cstyle);
}
int main() {
char *protocol;
char *hostname;
char *port;
char *path;
parse_url("http://www.example.com:80/file.txt", protocol, hostname, port, path);
printf("%s, %s, %s, %s\n", (protocol), (hostname), (port), (path));
return 0;
}
The problem is that arguments are passed by value, so the pointer to the newly allocated string never leaves the function (although the string itself exists until program termination, since free is never called on it). You can pass by reference¹ like:
void cast_to_cstyle(string source, char *&target)
or better, pass the source string by (constant) reference too (string is expensive to copy):
void cast_to_cstyle(const string &source, char *&target)
(neither function body nor the call site need to be changed).
But you may not need even that.
If the API doesn’t actually modify the string despite using non-const pointer (pretty common in C AFAIK), you can use const_cast, like const_cast<char *>(source.c_str()).
Even if it may modify the string, &source[0] is suitable (at least since C++11). It may not seem right but it is:
a pointer to s[0] can be passed to functions that expect a pointer to the first element of a null-terminated (since C++11) CharT[] array.
— https://en.cppreference.com/w/cpp/string/basic_string
(and since C++17 data() is the way to go).
However, unlike a pointer obtained from malloc, any such pointer becomes invalid when the string is resized or destroyed (be careful: "the string" means "that particular copy of the string" if you have several).
¹ Strictly speaking, pass a reference; references aren’t restricted to function arguments in C++.
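The difference between the two signatures can be sketched like this. The function names are invented for the illustration, and the by-value version frees its buffer immediately only so that the sketch does not leak.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// By value: `target` is a copy of the caller's pointer, so the assignment
// below is invisible to the caller (the buffer is freed here to avoid a leak).
void copy_by_value(const std::string& source, char* target)
{
    target = static_cast<char*>(std::malloc(source.size() + 1));
    if (target != nullptr) {
        std::memcpy(target, source.c_str(), source.size() + 1);
        std::free(target);
    }
}

// By reference: `target` is an alias for the caller's pointer, so the
// caller sees the new buffer and becomes responsible for free()ing it.
void copy_by_reference(const std::string& source, char*& target)
{
    target = static_cast<char*>(std::malloc(source.size() + 1));
    if (target != nullptr)
        std::memcpy(target, source.c_str(), source.size() + 1);
}
```

After copy_by_value the caller's pointer still holds whatever it held before the call, which is why the original program printed gibberish.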
The problem was as @WeatherVane and @JerryJeremiah mentioned. The pointer returned by malloc was assigned to target, which is a local copy inside cast_to_cstyle() and is discarded when the function returns. So the protocol, hostname, port, and path variables declared in main were never assigned, hence the gibberish output. I've fixed this by making the function return a char *:
char *cast_to_cstyle_str(string source)
{
char *target = (char *)malloc(source.size() + 1);
memcpy(target, source.c_str(), source.size() + 1);
return target;
}
Note: I forgot to free the malloc'd memory in my question.
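If manual malloc/free bookkeeping is not actually required, one possible alternative is to keep the components as std::string objects and call c_str() only at the point where a C API needs a pointer. The sketch below reuses the splitting logic from the question. The UrlParts struct is an invented name, and the code assumes the input always has the form protocol://host:port/path, with no error handling.

```cpp
#include <cstddef>
#include <string>

// Hypothetical container for the parsed pieces; each std::string owns its memory.
struct UrlParts {
    std::string protocol;
    std::string hostname;
    std::string port;
    std::string path;
};

// Same splitting logic as in the question. Assumes the input always looks
// like "protocol://host:port/path"; there is no error handling.
UrlParts parse_url(const std::string& url)
{
    UrlParts parts;
    std::size_t scheme_end = url.find(':');
    parts.protocol = url.substr(0, scheme_end);
    std::string rest = url.substr(scheme_end + 3); // skip "://"
    std::size_t colon = rest.find(':');
    std::size_t slash = rest.find('/');
    parts.hostname = rest.substr(0, colon);
    parts.port = rest.substr(colon + 1, slash - colon - 1);
    parts.path = rest.substr(slash);
    return parts;
}
```

A call such as printf("%s\n", parts.hostname.c_str()) then works without any free, as long as parts outlives the call and is not modified in between.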
I've tried implementing a function like this, but unfortunately it doesn't work:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t wc[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
My main goal here is to be able to integrate normal char strings in a Unicode application. Any advice you guys can offer is greatly appreciated.
In your example, wc is a local variable which will be deallocated when the function call ends. This puts you into undefined behavior territory.
The simple fix is this:
const wchar_t *GetWC(const char *c)
{
const size_t cSize = strlen(c)+1;
wchar_t* wc = new wchar_t[cSize];
mbstowcs (wc, c, cSize);
return wc;
}
Note that the calling code will then have to deallocate this memory, otherwise you will have a memory leak.
Use a std::wstring instead of a C99 variable length array. The current standard guarantees a contiguous buffer for std::basic_string. E.g.,
std::wstring wc( cSize, L'#' );
mbstowcs( &wc[0], c, cSize );
C++ does not support C99 variable length arrays, and so if you compiled your code as pure C++, it would not even compile.
With that change your function return type should also be std::wstring.
Remember to set relevant locale in main.
E.g., setlocale( LC_ALL, "" ).
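Put together, the std::wstring approach might look like the sketch below. The function name widen is made up here. The code uses the input's byte length as an upper bound for the number of wide characters, then trims the result to what mbstowcs actually wrote, so no trailing null ends up inside the wstring.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: convert a multibyte C string to std::wstring using
// the current C locale (set it with setlocale before calling).
std::wstring widen(const char* c)
{
    // A multibyte string never yields more wide characters than it has bytes,
    // so strlen(c) is a safe upper bound for the output length.
    std::size_t max_len = std::strlen(c);
    std::wstring wc(max_len, L'\0');
    if (max_len == 0)
        return wc;
    std::size_t written = std::mbstowcs(&wc[0], c, max_len);
    if (written == static_cast<std::size_t>(-1))
        return std::wstring(); // invalid multibyte sequence
    wc.resize(written); // drop the unused tail
    return wc;
}
```

Returning an empty string on a conversion error is just one choice for the sketch; throwing an exception would be equally reasonable.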
const char* text_char = "example of mbstowcs";
size_t length = strlen(text_char );
Example usage of mbstowcs:
std::wstring text_wchar(length, L'#');
//#pragma warning (disable : 4996)
// Or add to the preprocessor: _CRT_SECURE_NO_WARNINGS
mbstowcs(&text_wchar[0], text_char , length);
Example usage of mbstowcs_s:
Microsoft suggests using mbstowcs_s instead of mbstowcs.
Links:
Mbstowcs example
mbstowcs_s, _mbstowcs_s_l
wchar_t text_wchar[30];
mbstowcs_s(&length, text_wchar, text_char, length);
You're returning the address of a local variable allocated on the stack. When your function returns, the storage for all local variables (such as wc) is deallocated and is subject to being immediately overwritten by something else.
To fix this, you can pass the size of the buffer to GetWC, but then you've got pretty much the same interface as mbstowcs itself. Or, you could allocate a new buffer inside GetWC and return a pointer to that, leaving it up to the caller to deallocate the buffer.
Andrew Shepherd's answer works well for me. I added a couple of fixes:
1. Remove the trailing L'\0', because it sometimes causes trouble.
2. Use mbstowcs_s.
std::wstring wtos(std::string& value){
const size_t cSize = value.size() + 1;
std::wstring wc;
wc.resize(cSize);
size_t cSize1;
mbstowcs_s(&cSize1, (wchar_t*)&wc[0], cSize, value.c_str(), cSize);
wc.pop_back();
return wc;
}
The question has several problems, but so do some of the answers. The idea of returning a pointer to allocated memory "and leaving it up to the caller to de-allocate" is asking for trouble. As a rule the best pattern is always to allocate and de-allocate within the same function. For example, something like:
wchar_t* buffer = new wchar_t[get_wcb_size(str)];
mbstowcs(buffer, str, get_wcb_size(str)); // never write more elements than were allocated
...
delete[] buffer;
In general, this requires two functions, one the caller calls to find out how much memory to allocate and a second to initialize or fill the allocated memory.
Unfortunately, the basic idea of using a function to return a "new" object is problematic -- not inherently, but because of the C++ inheritance of C memory handling. Using C++ and STL's strings/wstrings/strstreams is a better solution, but I felt the memory allocation thing needed to be better addressed.
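The two-function pattern described in this answer could be sketched as follows. Both function names are invented for the illustration, and the size function simply returns an upper bound (one wide character per input byte, plus the terminator) rather than an exact count.

```cpp
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical step 1: how many wchar_t elements the caller should allocate
// (an upper bound: one per input byte, plus the terminator).
std::size_t wide_buffer_size(const char* str)
{
    return std::strlen(str) + 1;
}

// Hypothetical step 2: fill a caller-owned buffer of `size` elements.
// Returns false if the input is not valid in the current locale.
bool fill_wide(wchar_t* buffer, std::size_t size, const char* str)
{
    if (size == 0)
        return false;
    std::size_t written = std::mbstowcs(buffer, str, size - 1);
    if (written == static_cast<std::size_t>(-1))
        return false;
    buffer[written] = L'\0'; // mbstowcs may stop without writing a terminator
    return true;
}
```

A caller can keep allocation and release in one scope by writing std::vector<wchar_t> buffer(wide_buffer_size(str)); followed by fill_wide(buffer.data(), buffer.size(), str); the vector then releases the memory automatically.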
Your problem has nothing to do with encodings, it's a simple matter of understanding basic C++. You are returning a pointer to a local variable from your function, which will have gone out of scope by the time anyone can use it, thus creating undefined behaviour (i.e. a programming error).
Follow this Golden Rule: "If you are using naked char pointers, you're Doing It Wrong. (Except for when you aren't.)"
I've previously posted some code to do the conversion and communicating the input and output in C++ std::string and std::wstring objects.
I did something like this. The first two zeros are the code page and the flags: a code page of 0 is CP_ACP (the system default ANSI code page), and flags of 0 mean no special options. The general idea was to create a temporary char array and pass in the wide char array, and it works. The +1 ensures that the null terminator is converted as well.
char tempFilePath[MAX_PATH] = "I want to convert this to wide chars";
wchar_t strDestPath[MAX_PATH]; // destination buffer, assumed here to be a wchar_t array
int len = (int)strlen(tempFilePath);
// Converts the path to wide characters, including the terminating null
int needed = MultiByteToWideChar(0, 0, tempFilePath, len + 1, strDestPath, len + 1);
auto Ascii_To_Wstring = [](int code)->std::wstring
{
if (code>255 || code<0 )
{
throw std::runtime_error("Incorrect ASCII code");
}
std::string s{ char(code) };
std::wstring w{ s.begin(),s.end() };
return w;
};
I'm getting warnings when I compile something like this...
std::string something = "bacon";
sprintf("I love %s a lot", something.c_str());
It says "warning: deprecated conversion from string constant to 'char *'". I tried converting the text to be...
const char *
instead but I get a different error. I'm not committed to sprintf if there is a better option.
For sprintf to work, you need to provide an array of char big enough to write the result to as the first argument.
However, you can (and should!) just use the far easier operator+ for C++ strings:
std::string res = "I love " + something + " a lot";
sprintf("I love %s a lot", something.c_str);
In that code, you should call something.c_str() with proper function call () syntax.
Note also that the above use of sprintf() is wrong, since you didn't provide a valid destination string buffer for the resulting formatted string.
Moreover, for security reasons, you should use the safer snprintf() instead of sprintf(). In fact, with snprintf() you can specify the size of the destination buffer, to avoid buffer overruns.
The following compilable code is an example of snprintf() usage:
#include <stdio.h>
#include <string>
int main()
{
std::string something = "bacon";
char buf[128];
snprintf(buf, sizeof(buf), "I love %s a lot", something.c_str());
printf("%s\n", buf);
}
P.S.
In general, in C++ you may consider string concatenation using std::string::operator+, e.g.:
std::string result = "I love " + something + " a lot";
That doesn't look like a correct use of sprintf.
The first parameter is supposed to be a char * that already has backing memory.
For example:
char *str = (char *)malloc(BUFSIZ);
sprintf(str, "I love %s a lot", something.c_str());
I have a C++ string. I need to pass this string to a function accepting a char* parameter (for example - strchr()).
a) How do I get that pointer?
b) Is there some function equivalent to strchr() that works for C++ strings?
To get the C string equivalent of the C++ string object, use the c_str function.
To locate the first occurrence of a char in a string object, use the find_first_of function.
Example:
string s = "abc";
// strlen expects a const char *
cout<<strlen(s.c_str()); // prints 3
// on failure find_first_of returns string::npos
if(s.find_first_of('a') != string::npos)
cout<<s<<" has an a"<<endl;
else
cout<<s<<" has no a"<<endl;
Note: I used strlen just as an example of a function that takes a const char*.
Surprisingly, std::string has far, far more capabilities than C-style strings. You probably want the find_first_of() method. In general, if you find yourself using the strxxx() functions on C++ std::strings, you are almost certainly doing something wrong.
Like much of the C++ Standard Library, the string class is a complex beast. To make the most of it, you really need a good reference book. I recommend The C++ Standard Library, by Nicolai Josuttis.
You can't get a char* from a string
string does not allow you free access to its internal buffer.
The closest you can get is a const char* using .c_str() if you want it null terminated or .data() if it doesn't have to be null terminated (since C++11 both are null terminated, and since C++17 data() also has a non-const overload).
You can then cast the pointer returned by these functions to char*, but you do this at your own risk. That being said, this is a relatively safe cast to make as long as you make sure you're not changing the string. If you change it, the pointer you got from c_str() may no longer be valid.
This code:
string str("Hello World!");
char* sp = (char*)str.c_str();
sp[5] = 'K';
is probably ok
However this:
string str("Hello World!");
char* sp = (char*)str.c_str();
str = "Changed string";
sp[5] = 'K';
is most definitely not ok.
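On C++11 and later the cast can be avoided altogether, because &str[0] is a writable pointer to a null-terminated buffer (and since C++17 the non-const data() does the same). A minimal sketch, where upper_in_place is only a stand-in for some C function that takes a char*:

```cpp
#include <cctype>
#include <string>

// Stand-in for a C API that writes through a plain char* (hypothetical).
void upper_in_place(char* s)
{
    for (; *s != '\0'; ++s)
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
}

// Hands the string's own buffer to the C-style function without a cast.
// &str[0] is writable and null-terminated since C++11; the pointer is only
// valid until the string is resized, reassigned or destroyed.
std::string shout(std::string str)
{
    if (!str.empty())
        upper_in_place(&str[0]);
    return str;
}
```

The same lifetime caveat applies as with c_str(): the C function must not write past the existing length, and the pointer must not be used after the string changes.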
If you just want to assign a string literal to pw, you can do it like this (the pointer should be const, because string literals must not be modified):
const char *pw = "Hello world";
If you have a C++ std::string object whose value you want to assign to pw, you can do it like this:
const char *pw = some_string.c_str();
However, the value that pw points to will only be valid for the lifetime of some_string.
More here :
How to assign a string to char *pw in c++
GoodLUCK!!
std::string yourString("just an example");
char* charPtr = new char[yourString.size()+1];
strcpy(charPtr, yourString.c_str());
If str is your string, use the str.c_str() method to get a const char* to its contents.
Perhaps this example will help you
#include <iostream>
#include <string>
using namespace std;
int main ()
{
string str ("Replace the vowels in this sentence by asterisks.");
size_t found;
found=str.find_first_of("aeiou");
while (found!=string::npos)
{
str[found]='*';
found=str.find_first_of("aeiou",found+1);
}
cout << str << endl;
return 0;
}
The C++ Standard provides two member functions of class std::basic_string that return a pointer to the first element of the string: c_str() and data(). Both return const char * (data() gained a non-const overload only in C++17), so you may not use them with a function that has a parameter of type char *.
As for strchr, its first parameter is const char *, so you may use c_str() and data() with it. However, it is much better to use the member function find() of class std::basic_string instead of strchr.
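The two options can be compared side by side in a small sketch. The wrapper function names are invented here. Both return the index of the first match, or std::string::npos when the character is absent.

```cpp
#include <cstring>
#include <string>

// C style: strchr takes a const char*, so c_str() can be passed directly.
// Returns the index of the first `ch`, or std::string::npos if absent.
std::size_t index_with_strchr(const std::string& s, char ch)
{
    const char* start = s.c_str();
    const char* hit = std::strchr(start, ch);
    return hit ? static_cast<std::size_t>(hit - start) : std::string::npos;
}

// C++ style: find() does the same job without leaving std::string.
std::size_t index_with_find(const std::string& s, char ch)
{
    return s.find(ch);
}
```

One difference to keep in mind: strchr stops at the first null character, while find() searches the whole string, so the results can differ for strings with embedded nulls or when searching for '\0' itself.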