Weird iconv_open behavior in Linux - c++

While porting a big app from Windows to Linux, I need to be able to convert between wide characters and multibyte characters. To do this, I have code that looks like this:
void IConv(const InType* begin, const InType* end, const char* inCode, OutType* outBegin, OutType*& outEnd, const char* outCode)
{
assert(end >= begin);
assert(outEnd > outBegin);
iconv_t cd = iconv_open(outCode, inCode);
if (cd == reinterpret_cast<iconv_t>(-1))
throw (InvalidLocale ());
/* blah, blah, blah other code we never reach */
}
That code always throws an exception. To debug this, I created a simpler version that uses the same parameters as the code that fails. Here's my test code:
int main( void )
{
const char outCode[] = "";
const char inCode[] = "wchar_t";
//Using wchar_t and "" means that iconv will just use the system locale settings.
iconv_t cd = iconv_open(outCode, inCode);
if (cd == reinterpret_cast<iconv_t>(-1))
{
printf("iconv failed to use outCode %s and inCode %s\n",outCode, inCode);
return 1;
}
iconv_close(cd);
return 0;
}
Notice that the code is pretty much the same. But in my test code I never see a failure, whereas the IConv function always fails. The locale on the system is set via the LANG env variable, which in this case is always ISO-8859-1.
So, the question is, does anyone know of any particular behavior in iconv that might present itself in a big app, but not in a simple case?
Thank you

The problem is likely that the target machine doesn't have the appropriate iconv libraries and indexes installed. See /usr/lib[64]/gconv. The shared libraries are typically part of the glibc installation. Tools such as localedef can create them.
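One way to narrow this down is to report why iconv_open failed rather than only that it failed. The sketch below is a hypothetical helper (open_error is not part of iconv) that returns 0 on success or the errno value on failure. According to the iconv_open man page, EINVAL means the requested conversion is not supported by the implementation, which is the error a missing conversion module would be expected to produce.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <iconv.h>

// Hypothetical diagnostic helper: returns 0 if the conversion can be opened,
// otherwise the errno value reported by iconv_open.
int open_error(const char* toCode, const char* fromCode)
{
    errno = 0;
    iconv_t cd = iconv_open(toCode, fromCode);
    if (cd == reinterpret_cast<iconv_t>(-1))
    {
        const int err = errno;  // save before any other call can change it
        std::fprintf(stderr, "iconv_open(\"%s\", \"%s\") failed: %s\n",
                     toCode, fromCode, std::strerror(err));
        return err;
    }
    iconv_close(cd);
    return 0;
}
```

Calling open_error("", "wchar_t") inside the large application, right before the failing call, and again in the small test program should show whether both processes see the same error, or any error at all.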

Related

I need to use the shell script variable in c++ program, how can i? [duplicate]

I'd like to have access to the $HOME environment variable in a C++ program that I'm writing. If I were writing code in C, I'd just use the getenv() function, but I was wondering if there was a better way to do it. Here's the code that I have so far:
std::string get_env_var( std::string const & key ) {
char * val;
val = getenv( key.c_str() );
std::string retval = "";
if (val != NULL) {
retval = val;
}
return retval;
}
Should I use getenv() to access environment variables in C++? Are there any problems that I'm likely to run into that I can avoid with a little bit of knowledge?
There is nothing wrong with using getenv() in C++. It is defined by stdlib.h, or if you prefer the standard library implementation, you can include cstdlib and access the function via the std:: namespace (i.e., std::getenv()). Absolutely nothing wrong with this. In fact, if you are concerned about portability, either of these two versions is preferred.
If you are not concerned about portability and you are using managed C++, you can use the .NET equivalent - System::Environment::GetEnvironmentVariable(). If you want the non-.NET equivalent for Windows, you can simply use the GetEnvironmentVariable() Win32 function.
I would just refactor the code a little bit:
std::string getEnvVar( std::string const & key )
{
char * val = getenv( key.c_str() );
return val == NULL ? std::string("") : std::string(val);
}
If you are on Windows you can use the Win32 API GetEnvironmentVariable
On other linux/unix based systems use getenv
Why use GetEnvironmentVariable on Windows? From the MSDN documentation for getenv:
getenv operates only on the data structures accessible to the run-time library and not on the environment "segment" created for the process by the operating system. Therefore, programs that use the envp argument to main or wmain may retrieve invalid information.
And from the MSDN documentation for GetEnvironmentVariable:
This function can retrieve either a system environment variable or a user environment variable.
In c++ you have to use std::getenv and #include <cstdlib>
A version of @Vlad's answer with some error checking and which distinguishes empty from missing values:
inline std::string get_env(const char* key) {
if (key == nullptr) {
throw std::invalid_argument("Null pointer passed as environment variable name");
}
if (*key == '\0') {
throw std::invalid_argument("Value requested for the empty-name environment variable");
}
const char* ev_val = getenv(key);
if (ev_val == nullptr) {
throw std::runtime_error("Environment variable not defined");
}
return std::string(ev_val);
}
Notes:
You could also replace the use of exceptions in the above with an std::optional<std::string> or, in the future, with an std::expected (if that ends up being standardized).
I've chosen safety over informativity here, by not concatenating the key into the what-string of the exception. If you make the alternative choice, try and limit copying from key to within reason (e.g. 100 characters? 200 characters?), and I'd also check these characters are printable, and sanitize those characters.
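As a rough illustration of the std::optional alternative mentioned in the notes, here is a minimal sketch. The name get_env_opt is made up for this example, and treating a null or empty key as "no value" rather than an error is an assumption, not something the exception-based version does. It needs C++17.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Sketch: std::nullopt means "not defined"; an engaged optional holding ""
// means "defined but empty". Requires C++17 for std::optional.
inline std::optional<std::string> get_env_opt(const char* key)
{
    if (key == nullptr || *key == '\0') {
        return std::nullopt;  // assumption: an unusable name is "no value"
    }
    const char* ev_val = std::getenv(key);
    if (ev_val == nullptr) {
        return std::nullopt;
    }
    return std::string(ev_val);
}
```

A caller that wants a default can then write get_env_opt("HOME").value_or("/tmp"), where "/tmp" is only an illustrative fallback.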
Yes, I know this is an old thread!
Still, common mistakes are, by definition, not new. :-)
The only reasons I see for not just using std::getenv(), would be to add a known default or to adopt common pattern/API in a framework. I would also avoid exceptions in this case (not generally though) simply because a non-value return is often enough a valid response for an environment variable. Adding the complexity of handling exceptions is counter-intuitive.
This is basically what I use:
const char* GetEnv( const char* tag, const char* def=nullptr ) noexcept {
const char* ret = std::getenv(tag);
return ret ? ret : def;
}
int main() {
int ret=0;
if( GetEnv("DEBUG_LOG") ) {
// Setup debug-logging
} else {
...
}
return (-1==ret?errno:0);
}
The difference between this and the other answers may seem small, but I find such small details are very rewarding when you form habits in how you code.
Just like the fact that getenv() returns a non-const pointer, which could easily lead to bad habits!

Access pre-compiled functions within a class C++/11

Sorry if the title is misleading, I'm currently looking for solutions to the following:
I'm developing a library for other people to use. They have to follow a strict design concept in the way they structure any additional features within the library. They all use Linux and Vim, and as such are allowed to use terminal commands (i.e. to be able to compile etc.), and we all use clang as a compiler.
My question is this: Let's suppose I write a function called: "checkCode":
template<typename T>
void checkCode(T&& codeSnippet)
{
//// code
}
I want this function to run whenever they type "checkCode" in a terminal. I know clang has similar functionality, but that is understandable since you're using the whole of clang. So:
1) Is it possible to just compile a class, and then access each of the functions through the .dylib | .so file?
2) Might it be a better idea to take a copy of the clang source, add this functionality, and roll it out to those using and contributing to the library? This would be like an add-on to clang.
Thanks
You could use one executable and symbolic links to it, like busybox:
#include <string>
int main(int argc, char **argv)
{
// argv[0] holds the name the program was invoked as (possibly with a path)
std::string programName = argv[0];
size_t lastSlash = programName.find_last_of('/');
if(lastSlash != std::string::npos)
programName = programName.substr(lastSlash + 1);
if(programName == "function_1")
{
function_1();
return 0;
}
if(programName == "function_2")
{
function_2();
return 0;
}
// ...
// normal main code
return 0;
}

Trash characters when using buffers in c++

I have a DLL that I need to handle in C++. I'm using WxWidgets (standard compilation, but I also tried Unicode on/off) and NetBeans. I also tried dealing with this without WxWidgets (windows.h) and had same problems.
Here is how I access the DLL functions using WxWidgets:
// -------------------- POINTERS TO FUNCTIONS
typedef bool(*TYPE_DLL_SetLicense)(char*, char*);
typedef bool(*TYPE_DLL_PingConnection)(char*);
typedef char*(*TYPE_DLL_ERR_DESCRIPTION)(void);
class DLL_Library
{
public:
// pointers to functions inside dll
TYPE_DLL_SetLicense DLL_SetLicense; //initialize - will work fine as it returns only true/false (buffer only provides data)
TYPE_DLL_PingConnection DLL_PingConnection; //ping to server. Will return trash, because it uses the buffer both to provide data and to get the answer back
TYPE_DLL_ERR_DESCRIPTION DLL_ERR_DESCRIPTION; //error description. No buffer, no trouble. Returns correct string.
wxDynamicLibrary dynLib2;
int initialize(void)
{
//path to dll
wxString path = wxStandardPaths::Get().GetExecutablePath().BeforeLast('\\') + _("\\DLL_dll\\DLLMOK.dll");
if(!wxFile::Exists(path)) return -1;
//load dll
if(!dynLib2.Load(path)) return -2;
//Assign functions in dll to variable
DLL_SetLicense=(TYPE_DLL_SetLicense) dynLib2.GetSymbol(wxT("DLL_SetLicense"));
DLL_PingConnection=(TYPE_DLL_PingConnection) dynLib2.GetSymbol(wxT("DLL_PingConnection"));
DLL_ERR_DESCRIPTION=(TYPE_DLL_ERR_DESCRIPTION) dynLib2.GetSymbol(wxT("DLL_ERROR_DESCRIPTION"));
return 0;
}
};
And here is the function I run. It should return XML content, which I try to save to a file.
//DLL_PingConnection
//result of the ping, to be saved in a file
wxFile file_ping_xml;
file_ping_xml.Open(wxT("C:\\dll\\ping.xml"),wxFile::write);
char buffor_ping_xml[2000];
//I run the function here
bool is_ping = DLL_PingConnection(buffor_ping_xml);
if(is_ping)
{
tex_box->AppendText(wxT("DLL_PingConnection True\n"));
//we save result to file
bool is_write_ping_ok = file_ping_xml.Write(buffor_ping_xml,2000);
if (is_write_ping_ok){tex_box->AppendText(wxT("Save to file is ok\n"));}
else {tex_box->AppendText(wxT("Save to file failed :( \n"));}
}
else
{
tex_box->AppendText(wxT("DLL_PingConnection False\n"));
}
std::cout << "Error description: " << DLL_ERR_DESCRIPTION() << "\n"; //will work fine both in saving to file, and in streaming to screen.
The problem is that inside the file instead of good content I get rubbish like this:
NOTE that this only happens in functions that use buffers like:
char buffer[2000] //buffer will contain, for example, an XML file
function do_sth_with_xml(buffer) //buffer containing XML will (should) be overwritten with the XML results of the function - in our case DLL_PingConnection should save XML with connection data in the buffer
The documentation says that the DLL operates on Windows-1250. I have set the file ping.xml to Windows ANSI, but I don't think the problem lies here.
EDIT: I have reproduced the problem without WxWidgets (I load the DLL using windows.h) - same problems. Here is the code: Getting trash data in char* while using it as buffer in function. Please help :(
This
DLL_PingConnection=(TYPE_DLL_PingConnection)
shouldn't it be
DLL_PingConnection=(TYPE_DLL_PingConnection) dynLib2.GetSymbol(wxT("DLL_PingConnection"));
?
Otherwise it seems you will not get a valid pointer to the function in the DLL.
As a general rule you should check return values, especially from a DLL you load dynamically, since you will sometimes get another version of the DLL which may have a function with the same name but a different signature, or where it is missing entirely.
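A minimal sketch of such a check follows. checked_symbol is a hypothetical helper, not part of wxWidgets, and fake_ping is only a stand-in for a function exported by the DLL. Note that this catches a missing symbol only; a symbol that exists with a different signature cannot be detected this way.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: turns the void* returned by a symbol lookup into a
// typed function pointer, and throws instead of handing back a null pointer.
template <typename Fn>
Fn checked_symbol(void* raw, const std::string& name)
{
    if (raw == nullptr)
        throw std::runtime_error("symbol not found in DLL: " + name);
    return reinterpret_cast<Fn>(raw);
}

// Stand-in for a function exported by the DLL, used only to exercise the helper.
bool fake_ping(char* buffer)
{
    buffer[0] = '\0';
    return true;
}
```

With the class from the question, the assignment would then read something like DLL_PingConnection = checked_symbol<TYPE_DLL_PingConnection>(dynLib2.GetSymbol(wxT("DLL_PingConnection")), "DLL_PingConnection");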
You named a function
DLL_PingConnection=(TYPE_DLL_PingConnection) dynLib2.GetSymbol(....
and call it with
OSOZ.OSOZ_PingConnection(buffor_ping_xml);
you typedef a function
typedef bool(*TYPE_DLL_PingConnection)(char*);
you create a variable
char buffor_ping_xml[2000];
in your typedef it is char* and your buffor_ping_xml is char
how can that work ?
try
char *buffor_ping_xml = new char[2000];
/* or */
wchar_t *buffor_ping_xml = new wchar_t[2000];
/* or */
wxChar *buffor_ping_xml = new wxChar[2000];
bool is_ping = DLL_PingConnection(buffor_ping_xml);
wxString mystring = wxString::FromUTF8(buffor_ping_xml);
write mystring to file.
To do:
Look in your wxwidgets\libs folder for your libs.
Are there libwxmsw29ud_* files with a 'u' in the name (after the version number, here 29)?
If not, you cannot use Unicode.
If yes, continue with the next steps.
For each of the different tests (char *, wchar_t *, wxChar *), give the files a different name,
for example file_ping_xml.Open(wxT("C:\\dll\\ping_w_t_FromUTF8.xml"), ...
For wchar_t *, test in combination with
wxString mystring = wxString::FromUTF8(buffor_ping_xml);
and also in combination with
wxString mystring(buffor_ping_xml);
Then check what the files look like in a browser.
To test, you can go to your wxWidgets samples folder. Compile C:\wxWidgets\samples\docview\docview.cpp. Open a Unicode file with docview.exe. How does it look?
Unicode download file
Unicode-related compilation settings
You should define wxUSE_UNICODE to 1 to compile your program in Unicode mode. This currently works for wxMSW, wxGTK, wxMac and wxX11. If you compile your program in ANSI mode you can still define wxUSE_WCHAR_T to get some limited support for wchar_t type.
Here is the answer: Getting trash data in char* while using it as buffer in function.
Thanks everyone - especially for your patience.

How to use VS C++ GetEnvironmentVariable as cleanly as possible?

(This is not so much a problem as an exercise in pedantry, so here goes.)
I've made a nice little program that is native to my linux OS, but I'm thinking it's useful enough to exist on my Windows machine too. Thus, I'd like to access Windows' environment variables, and MSDN cites an example like this:
const DWORD buff_size = 50;
LPTSTR buff = new TCHAR[buff_size];
const DWORD var_size = GetEnvironmentVariable("HOME",buff,buff_size);
if (var_size==0) { /* fine, some failure or no HOME */ }
else if (var_size>buff_size) {
// OK, so 50 isn't big enough.
if (buff) delete [] buff;
buff = new TCHAR[var_size];
const DWORD new_size = GetEnvironmentVariable("HOME",buff,var_size);
if (new_size==0 || new_size>var_size) { /* *Sigh* */ }
else { /* great, we're done */ }
}
else { /* in one go! */ }
This is not nearly as nice (to me) as using getenv and just checking for a null pointer. I'd also prefer not to dynamically allocate memory since I'm just trying to make the program run on Windows as well as on my linux OS, which means that this MS code has to play nicely with nix code. More specifically:
template <class T> // let the compiler sort out between char* and TCHAR*
inline bool get_home(T& val) { // return true if OK, false otherwise
#if defined (__linux) || defined (__unix)
val = getenv("HOME");
if (val) return true;
else return false;
#elif defined (WINDOWS) || defined (_WIN32) || defined (WIN32)
// something like the MS Code above
#else
// probably I'll just return false here.
#endif
}
So, I'd have to allocate on the heap universally or do a #ifdef in the calling functions to free the memory. Not very pretty.
Of course, I could have just allocated 'buff' on the stack in the first place, but then I'd have to create a new TCHAR[] if 'buff_size' was not large enough on my first call to GetEnvironmentVariable. Better, but what if I was a pedant and didn't want to go around creating superfluous arrays? Any ideas on something more aesthetically pleasing?
I'm not that knowledgeable, so would anyone begrudge me deliberately forcing GetEnvironmentVariable to fail in order to get a string size? Does anyone see a problem with:
const DWORD buff_size = GetEnvironmentVariable("HOME",0,0);
TCHAR buff[buff_size];
const DWORD ret = GetEnvironmentVariable("HOME",buff,buff_size);
// ...
Any other ideas or any suggestions? (Or corrections to glaring mistakes?)
UPDATE:
Lots of useful information below. I think the best bet for what I'm trying to do is to use a static char[] like:
inline const char* get_home(void) { // inline not required, but what the hell.
#if defined (__linux) || defined (__unix)
return getenv("HOME");
#elif defined (WINDOWS) || defined (WIN32) || defined (_WIN32)
static char buff[MAX_PATH];
const DWORD ret = GetEnvironmentVariableA("USERPROFILE",buff,MAX_PATH);
if (ret==0 || ret>MAX_PATH)
return 0;
else
return buff;
#else
return 0;
#endif
}
Perhaps it's not the most elegant way of doing it, but it's probably the easiest way to sync up what I want to do between *nix and Windows. (I'll also worry about Unicode support later.)
Thank you to everybody who has helped.
DWORD bufferSize = 65535; //Limit according to http://msdn.microsoft.com/en-us/library/ms683188.aspx
std::wstring buff;
buff.resize(bufferSize);
bufferSize = GetEnvironmentVariableW(L"Name", &buff[0], bufferSize);
if (!bufferSize)
{
//error: the variable was not found or the call failed
}
buff.resize(bufferSize);
Of course, if you want ASCII, replace wstring with string and GetEnvironmentVariableW with GetEnvironmentVariableA.
EDIT: You could also create getenv yourself. This works because
The same memory location may be used in subsequent calls to getenv, overwriting the previous content.
const char * WinGetEnv(const char * name)
{
const DWORD buffSize = 65535;
static char buffer[buffSize];
if (GetEnvironmentVariableA(name, buffer, buffSize))
{
return buffer;
}
else
{
return 0;
}
}
Of course, it would probably be a good idea to use the wide character versions of all of this if you want to maintain unicode support.
This wasn't the original question, but it might be worth adding the MFC way to this thread for reference:
CString strComSpec;
if (strComSpec.GetEnvironmentVariable(_T("COMSPEC")))
{
//Do your stuff here
}
VC++ implements getenv in stdlib.h, see, for example, here.
The suggestion you made at the end of your post is the right way to do this - call once to get required buffer size and then again to actually get the data. Many of the Win32 APIs work this way, it's confusing at first but common.
One thing you could do is to pass in a best-guess buffer and its size on the first call, and only call again if that fails.
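The two-call pattern can be sketched as follows. read_env is a hypothetical name, and returning the value through a std::string is a choice made here so the caller never has to free anything. The ANSI function is used to match the question's update; the wide version would be needed for non-ASCII values. On non-Windows systems the sketch simply falls back to std::getenv.

```cpp
#include <cstdlib>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Sketch: copies the variable's value into `out` and returns true, or returns
// false if the variable is not set (or the lookup failed).
bool read_env(const char* name, std::string& out)
{
#ifdef _WIN32
    // First call with no buffer: returns the required size in characters,
    // including the terminating null, or 0 if the variable does not exist.
    const DWORD needed = GetEnvironmentVariableA(name, nullptr, 0);
    if (needed == 0)
        return false;
    std::string value(needed, '\0');
    // Second call fills the buffer and returns the length without the null.
    const DWORD written = GetEnvironmentVariableA(name, &value[0], needed);
    if (written >= needed)
        return false;  // the value grew between the two calls
    value.resize(written);
    out = value;
    return true;
#else
    const char* value = std::getenv(name);
    if (value == nullptr)
        return false;
    out = value;
    return true;
#endif
}
```

One caveat of this sketch: if the variable is removed between the two Windows calls, the second call returns 0 and the function reports an empty value rather than a failure.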
Don't bother. %HOME% is a path on Windows, and should be usable by all reasonable programs. Therefore, it will fit in a WCHAR[MAX_PATH]. You don't need to deal with the edge case where it's longer than that - if it's longer, most file functions will reject it anyway so you might as well fail early.
However, do not assume you can use a TCHAR[MAX_PATH] or a char[MAX_PATH]. You do not have control over the contents of %HOME%; it will contain the user's name. If that's "André" (i.e. not ASCII) you must store %HOME% in a WCHAR[MAX_PATH].

Man machine interface command syntax and parsing

What I want is to add the possibility to interact with an application, to extract information from it or even ask it to change some state.
For that purpose I thought of building a CLI utility. The utility will connect to the application, send user commands (one-line strings) to it, and wait for its response.
The command should contain:
- command name (e.g. display-session-table/set-log-level etc.)
- optionally, several arguments (e.g. log-level=10)
The question is how to choose a syntax and how to parse it quickly and correctly.
I don't want to reinvent the wheel, so maybe there's already an answer out there.
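For the command format described in the question (a name followed by optional key=value arguments), a minimal hand-rolled parser might look like the sketch below. The Command struct and parse_command are hypothetical names, and the whitespace-separated syntax is an assumption; values containing spaces would need a quoting rule that this sketch does not provide.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical result type: the command name plus its key=value arguments.
struct Command
{
    std::string name;
    std::map<std::string, std::string> args;
};

// Splits one line on whitespace. The first token is the command name; every
// later token is split at its first '='. A token without '=' is stored with
// an empty value.
Command parse_command(const std::string& line)
{
    Command cmd;
    std::istringstream in(line);
    in >> cmd.name;
    std::string token;
    while (in >> token)
    {
        const std::string::size_type eq = token.find('=');
        if (eq == std::string::npos)
            cmd.args[token] = "";
        else
            cmd.args[token.substr(0, eq)] = token.substr(eq + 1);
    }
    return cmd;
}
```

For example, parse_command("set-log-level log-level=10") yields the name set-log-level and a single argument log-level with the value 10.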
Take a look at the interpreter example (example usage) from Boost.FunctionTypes. Note however that as it is it only supports free functions.
boost::program_options is worth a look.
The Readline library could be useful.
I would suggest using a JSON library.
I use an unholy mix of readline, boost::spirit, and the factory pattern to handle all that. It wouldn't be nearly as unholy if it weren't for readline's unapologetic C syntax :)
The outer loop looks like this
while(true)
{
char *line(NULL);
line = readline((cmd.leaf() + " > ").c_str());
if (line && *line)
{
add_history(line);
int error = ParseLine(line,*s_g, std::cout);
free(line);
if (error == ErrQuit)
break;
if (error == ErrSave)
....
Each command has a completion function and a parser/completion function
char **completeCreate(const std::vector<std::string> &, const char *text, int depth)
{
switch (depth)
{
case 1:
case 2:
{
return rl_completion_matches(text, rl_filename_completion_function);
break;
}
case 3:
{
return rl_completion_matches(text, rulesFill);
break;
}
}
return NULL;
}
This defines the completer for a command that takes two arguments, a filename and a string. It gets registered with readline's completion mechanism through a factory + macro, which lets me register everything with something that looks like this:
REG_COMP(Create, completeCreate);
On the parser side, I have a similar factory setup
int parseCreate(const std::vector<std::string> &names, Game &g, std::ostream &out)
{
if (names.size() != 4)
return parseHelpC(names, g, out);
if (!CreateGame(names[1],names[2],names[3],g))
return ErrGameCreation;
return ErrNone;
}
REG_PARSE(Create,"CardList PowerList RuleSet");
that provides the actual logic and help text
I've left out huge swaths of code that glue everything together, but would be happy to share the hideousness that is the code base (it is currently a private git repository). I look forward to seeing if someone has something that works better.