How would I place a unicode character into the file name?
I have an ostringstream that I use in defining the file name through ofstream, but I cannot use unicode characters. What would be the simplest way of doing this? Renaming it in a unicode format? And please explain how I would do so.
Your question is unclear. If you want to store Unicode characters, every string and stream class in the standard library has a wide-character equivalent: std::string/std::wstring, std::stringstream/std::wstringstream. If you use std::wstringstream, here is how you would put wide characters into it:
std::wstringstream wideStream;
wideStream << L"Hello, world";
std::wstring wideString = wideStream.str();
Hope this helps.
/* This program attempts to rename a file named
* CRT_RENAMER.OBJ to CRT_RENAMER.JBO. For this operation
* to succeed, a file named CRT_RENAMER.OBJ must exist and
* a file named CRT_RENAMER.JBO must not exist.
*/
#include <stdio.h>
int main(void)
{
int result;
char old[] = "CRT_RENAMER.OBJ", newArray[] = "CRT_RENAMER.JBO";
/* Attempt to rename file: */
result = rename(old, newArray);
if(result != 0)
printf("Could not rename '%s'\n", old );
else
printf("File '%s' renamed to '%s'\n", old, newArray);
}
I'm using this function to add my program to startup, but it doesn't work, and I don't know why weird characters and words are showing up in the startup applications list. What am I doing wrong?
Instead, this is being added to startup: U㫅萹㐀蠀渐晁Ɉ U㫆萺㝈耀 U㫆萺㝈耀 and C, with no file location and no details.
HKEY NewVal;
char loggeduser[UNLEN + 1];
std::ostringstream fileinhome;
GetUserNameA(loggeduser, &len);
fileinhome << "C:\\Users\\" << loggeduser << "\\AppData\\Roaming\\snetwork\\snetwork.exe";
std::string fp = fileinhome.str();
const char* file = fp.c_str();
if (RegOpenKey(HKEY_CURRENT_USER, _T("Software\\Microsoft\\Windows\\CurrentVersion\\Run"), &NewVal) != ERROR_SUCCESS)
{
return;
}
if (RegSetValueEx(NewVal, _T("CLI-Social-Network"), 0, REG_SZ, (LPBYTE)file, sizeof(file)) != ERROR_SUCCESS)
{
return;
}
else {
// std::cout << "Program added to Startup.\n";
// Do nothing, Program was added to Startup
}
RegCloseKey(NewVal);
A possibility: you have UNICODE and/or _UNICODE defined, so RegSetValueEx is actually RegSetValueExW. That function expects the data to be UTF-16 text, but file points to a narrow (char) string, so its bytes are misread as wide characters, leading to the strange output.
To fix, use std::wstring and W functions explicitly.
Unicode considerations aside, sizeof(file) is the size of the pointer, not the length of the string. The last argument must be the size of the data in bytes, including the terminating null character.
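The wide-string approach suggested above could be sketched as follows. This is an illustration under stated assumptions, not a drop-in fix: regSzByteCount and addToStartup are hypothetical names, and the registry part compiles only on Windows. The data size is passed in bytes and counts the terminating null.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: number of bytes a REG_SZ value occupies,
// counting the terminating null character.
std::size_t regSzByteCount(const std::wstring& value)
{
    return (value.size() + 1) * sizeof(wchar_t);
}

#ifdef _WIN32
// Hypothetical helper: writes exePath under the per-user Run key,
// calling the W functions explicitly so that the UNICODE macro
// setting does not matter.
bool addToStartup(const std::wstring& valueName, const std::wstring& exePath)
{
    HKEY key;
    if (RegOpenKeyExW(HKEY_CURRENT_USER,
                      L"Software\\Microsoft\\Windows\\CurrentVersion\\Run",
                      0, KEY_SET_VALUE, &key) != ERROR_SUCCESS)
        return false;

    const LSTATUS status = RegSetValueExW(
        key, valueName.c_str(), 0, REG_SZ,
        reinterpret_cast<const BYTE*>(exePath.c_str()),
        static_cast<DWORD>(regSzByteCount(exePath)));

    RegCloseKey(key); // close the key on both success and failure
    return status == ERROR_SUCCESS;
}
#endif
```

In this sketch the executable path would be built as a std::wstring from the start, for example with std::wostringstream and GetUserNameW, rather than converted from a narrow string.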
The code below demonstrates how stat and GetFileAttributes fail when the path contains some strange (but valid) ASCII characters.
As a workaround, I would use the 8.3 DOS file name. But this does not work when the drive has 8.3 names disabled.
(8.3 names are disabled with the fsutil command: fsutil behavior set disable8dot3 1).
Is it possible to get stat and/or GetFileAttributes to work in this case?
If not, is there another way of determining whether or not a path is a directory or file?
#include "stdafx.h"
#include <sys/stat.h>
#include <string>
#include <Windows.h>
#include <atlpath.h>
std::wstring s2ws(const std::string& s)
{
int len;
int slength = (int)s.length() + 1;
len = MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, 0, 0);
wchar_t* buf = new wchar_t[len];
MultiByteToWideChar(CP_ACP, 0, s.c_str(), slength, buf, len);
std::wstring r(buf);
delete[] buf;
return r;
}
// The final characters in the path below are 0xc3 (Ã) and 0x3f (?).
// Create a test directory with the name à and set TEST_DIR below to your test directory.
const char* TEST_DIR = "D:\\tmp\\VisualStudio\\TestProject\\ConsoleApplication1\\test_data\\Ã";
int main()
{
std::string testDir = TEST_DIR;
// test stat and _wstat
struct stat st;
const auto statSucceeded = stat(testDir.c_str(), &st) == 0;
if (!statSucceeded)
{
printf("stat failed\n");
}
std::wstring testDirW = s2ws(testDir);
struct _stat64i32 stW;
const auto statSucceededW = _wstat(testDirW.data(), &stW) == 0;
if (!statSucceededW)
{
printf("_wstat failed\n");
}
// test PathIsDirectory
const auto isDir = PathIsDirectory(testDirW.c_str()) != 0;
if (!isDir)
{
printf("PathIsDirectory failed\n");
}
// test GetFileAttributes
const auto fileAttributes = ::GetFileAttributes(testDirW.c_str());
const auto getFileAttributesWSucceeded = fileAttributes != INVALID_FILE_ATTRIBUTES;
if (!getFileAttributesWSucceeded)
{
printf("GetFileAttributes failed\n");
}
return 0;
}
The problem you have encountered comes from using the MultiByteToWideChar function. CP_ACP refers to the system's default ANSI code page, which may not support some characters. If you change the default system code page to UTF-8, your code will work. Since you cannot tell your clients what code page to use, you can use a third-party library such as International Components for Unicode to convert from the host code page to UTF-16.
I ran your code using console code page 65001 and VS2015 and your code worked as written. I also added positive printfs to verify that it did work.
Don't start with a narrow string literal and try to convert it, start with a wide string literal - one that represents the actual filename. You can use hexadecimal escape sequences to avoid any dependency on the encoding of the source code.
If the actual code doesn't use string literals, the best resolution depends on the situation; for example, if the file name is being read from a file, you need to make sure that you know what encoding the file is in and perform the conversion accordingly.
If the actual code reads the filename from the command line arguments, you can use wmain() instead of main() to get the arguments as wide strings.
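A minimal sketch of the wide-literal advice, assuming a shortened, hypothetical directory (testDirPath is not from the question). The final character is written as a hexadecimal escape, so the result does not depend on how the source file is encoded.

```cpp
#include <string>

// Hypothetical helper: returns the test directory as a wide string.
// "\x00C3" is the character U+00C3 written as a hexadecimal escape.
// The universal character name "\u00C3" would work as well.
// Note that a hex escape consumes every hex digit that follows it,
// so it is safest at the end of a literal, as here.
std::wstring testDirPath()
{
    return L"D:\\tmp\\test_data\\\x00C3";
}
```

The resulting string can be passed directly to the wide functions, for example _wstat(testDirPath().c_str(), &st) or GetFileAttributesW(testDirPath().c_str()), with no MultiByteToWideChar step in between.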
I have been unable to open the file. fb.is_open() never returns true. It works only when I hard-code the data source in fb.open().
I've tried converting it to a string, char, and wstring, with no effect.
What am I missing? The correct code would be fantastic, but also an explanation.
Trying to open a file with the data source variable:
wchar_t dataSource[2048];
DWORD errNum = GetModuleFileName(NULL, dataSource, sizeof(dataSource)); //get current dir.
ifstream fb;
wcscat_s(dataSource, L".confg"); //adds ".config" to get full data Sournce
fb.open(dataSource, ios::in);
if (fb.is_open())
{
//get information
}
fb.close();
Here are some things I've tried that have not worked:
wstring x = dataSource;
x.c_str()
char* cnvFileLoc = (char*)malloc(2048);
size_t count;
count = wcstombs_s(&count, cnvFileLoc, 2048, dataSource, 2048);
What does work is:
fb.open("X:\\CPP.Lessons\\PluralSight\\PluralSight.Fundamentals\\Debug\\PluralSight.Fundamentals.exe.config", ios::in);
Your call to GetModuleFileName() is wrong. The last parameter is expressed in characters, not in bytes, and the return value tells how many characters were copied:
wchar_t dataSource[2048];
if (GetModuleFileName(NULL, dataSource, 2048) > 0)
{
...
}
Or:
wchar_t dataSource[2048];
if (GetModuleFileName(NULL, dataSource, sizeof(dataSource)/sizeof(dataSource[0])) > 0)
{
...
}
Or:
wchar_t dataSource[2048];
if (GetModuleFileName(NULL, dataSource, _countof(dataSource)) > 0)
{
...
}
Or:
wchar_t dataSource[2048];
if (GetModuleFileName(NULL, dataSource, ARRAYSIZE(dataSource)) > 0)
{
...
}
That being said, you are appending .confg to the end of the full filename. So, if your application is named myapp.exe, you are trying to open myapp.exe.confg. Is that what you really want?
If yes, then make sure the .confg file actually exists, and that your app has permission to access it. CreateFile() would offer much more useful error info than ifstream does.
Otherwise, assuming the .confg file is at least in the same folder as your app, you would have to manually remove the filename portion from the buffer and then substitute in the correct filename. Have a look at PathRemoveFileSpec() and PathCombine() for that. Or, if the file is named myapp.confg, look at PathRenameExtension().
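For comparison, the same three path manipulations can be sketched with std::filesystem (C++17) instead of the Shlwapi functions named above. This is a swapped-in technique, not what the answer proposes, and the helper names and example paths are hypothetical.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Appends a suffix to the whole file name: myapp.exe -> myapp.exe.config
fs::path appendSuffix(fs::path exe, const std::string& suffix)
{
    exe += suffix; // operator+= concatenates without adding a separator
    return exe;
}

// Swaps the extension: myapp.exe -> myapp.config
// (roughly what PathRenameExtension does)
fs::path swapExtension(fs::path exe, const std::string& ext)
{
    exe.replace_extension(ext);
    return exe;
}

// Keeps the directory and replaces the file name:
// dir/myapp.exe -> dir/settings.config
// (roughly PathRemoveFileSpec followed by PathCombine)
fs::path siblingFile(fs::path exe, const std::string& name)
{
    exe.replace_filename(name);
    return exe;
}
```

The path returned by GetModuleFileName could be wrapped as fs::path(dataSource) and the result passed straight to ifstream::open, which accepts a std::filesystem::path since C++17.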
Update: I just noticed that your code is appending .confg, but your comment says .config instead:
//wcscat_s(dataSource, L".confg");
wcscat_s(dataSource, L".config");
You may have mistyped the file extension: L".confg" instead of L".config" as stated by the comment in your code.
Hi, I am using the standard regex library (regcomp, regexec, ...). Now I need to add Unicode support to my regular-expression code.
Does the standard regex library support Unicode, or non-ASCII characters in general? I researched on the web, and I think not.
My project is resource-critical, so I don't want to use large libraries for it (ICU and Boost.Regex).
Any help would be appreciated.
It looks like POSIX regex works properly with a UTF-8 locale. I've just written a simple test (see below) and used it to match a string with Cyrillic characters against the regex "[[:alpha:]]" (for example), and everything works just fine.
Note: the main thing to remember is that the regex functions are locale-dependent, so you must call setlocale() before using them.
#include <sys/types.h>
#include <string.h>
#include <regex.h>
#include <stdio.h>
#include <locale.h>
int main(int argc, char** argv) {
int ret;
regex_t reg;
regmatch_t matches[10];
if (argc != 3) {
fprintf(stderr, "Usage: %s regex string\n", argv[0]);
return 1;
}
setlocale(LC_ALL, ""); /* Use system locale instead of default "C" */
if ((ret = regcomp(&reg, argv[1], 0)) != 0) {
char buf[256];
regerror(ret, &reg, buf, sizeof(buf));
fprintf(stderr, "regcomp() error (%d): %s\n", ret, buf);
return 1;
}
if ((ret = regexec(&reg, argv[2], 10, matches, 0)) == 0) {
int i;
char buf[256];
int size;
for (i = 0; i < sizeof(matches) / sizeof(regmatch_t); i++) {
if (matches[i].rm_so == -1) break;
size = matches[i].rm_eo - matches[i].rm_so;
if (size >= sizeof(buf)) {
fprintf(stderr, "match (%d-%d) is too long (%d)\n",
matches[i].rm_so, matches[i].rm_eo, size);
continue;
}
buf[size] = '\0';
printf("%d: %d-%d: '%s'\n", i, matches[i].rm_so, matches[i].rm_eo,
strncpy(buf, argv[2] + matches[i].rm_so, size));
}
}
return 0;
}
Usage example:
$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
... (skip)
LC_ALL=
$ ./reg '[[:alpha:]]' ' 359 фыва'
0: 5-7: 'ф'
$
The match is two bytes long (offsets 5-7) because each Cyrillic letter takes two bytes in UTF-8.
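Because the offsets reported by regexec are byte offsets, it can help to count characters separately. A minimal sketch, assuming the input is well-formed UTF-8 (countCodePoints is a hypothetical helper):

```cpp
#include <cstddef>
#include <string>

// Counts code points in a UTF-8 string by skipping continuation
// bytes, which always have the bit pattern 10xxxxxx.
// Assumes the input is well-formed UTF-8 and does not validate it.
std::size_t countCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the matched letter above, stored as the bytes 0xD1 0x84, size() reports 2 while this function reports 1.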
Basically, POSIX regexes are not Unicode aware. You can try to use them on Unicode characters, but there might be problems with glyphs that have multiple encodings and other such issues that Unicode aware libraries handle for you.
From the standard, IEEE Std 1003.1-2008:
Matching shall be based on the bit pattern used for encoding the character, not on the graphic representation of the character. This means that if a character set contains two or more encodings for a graphic symbol, or if the strings searched contain text encoded in more than one codeset, no attempt is made to search for any other representation of the encoded symbol. If that is required, the user can specify equivalence classes containing all variations of the desired graphic symbol.
Maybe libpcre would work for you? It's slightly heavier than POSIX regexes, but I would think it lighter than ICU or Boost.
If you really mean "Standard", i.e. std::regex from C++11, then all you need to do is switch to std::wregex (and std::wstring of course).
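A small sketch of the std::wregex route, assuming a C++11 compiler. The helper firstMatchLength is hypothetical. It matches a literal Cyrillic letter written with universal character names, because character classes such as [[:alpha:]] still depend on the locale and may not cover non-ASCII letters by default.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: length, in wchar_t units, of the first match
// of pattern in text, or 0 if there is no match.
std::size_t firstMatchLength(const std::wstring& pattern,
                             const std::wstring& text)
{
    const std::wregex re(pattern);
    std::wsmatch match;
    if (!std::regex_search(text, match, re))
        return 0;
    return static_cast<std::size_t>(match.length(0));
}
```

Here the match length for a single Cyrillic letter is one wchar_t unit, in contrast to the two bytes reported by the UTF-8 test earlier in the thread. Characters outside the Basic Multilingual Plane would still occupy two units on platforms where wchar_t is 16 bits wide.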
I'm new to the C++ world and am stuck on a very trivial problem: getting a file name without its extension.
I have a TCHAR variable containing sample.txt and need to extract only sample. I used the PathFindFileName function, but it just returns the same value I passed.
I tried googling for a solution, but still no luck.
EDIT: The file extension is always three letters long, so I have added the following code,
but at the end I get something like Montage (2)««þîþ. How do I avoid the junk characters at the end?
TCHAR* FileHandler::GetFileNameWithoutExtension(TCHAR* fileName)
{
int fileLength = _tcslen(fileName) - 4;
TCHAR* value1 = new TCHAR;
_tcsncpy(value1, fileName, fileLength);
return value1;
}
Here's how it's done.
#ifdef UNICODE //Test to see if we're using wchar_ts or not.
typedef std::wstring StringType;
#else
typedef std::string StringType;
#endif
StringType GetBaseFilename(const TCHAR *filename)
{
StringType fName(filename);
size_t pos = fName.rfind(_T("."));
if(pos == StringType::npos) //No extension.
return fName;
if(pos == 0) //. is at the front. Not an extension.
return fName;
return fName.substr(0, pos);
}
This returns a std::string or a std::wstring, as appropriate to the UNICODE setting. To get back to a TCHAR*, use StringType::c_str(). This is a const pointer, so you can't modify it, and it is not valid after the string object that produced it is destroyed.
You can use the PathRemoveExtension function to remove the extension from a file name.
To get only the file name, without the directory or the extension, you may first have to use PathStripPath, followed by PathRemoveExtension.
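A portable counterpart to the two Shlwapi calls above is std::filesystem::path::stem() from C++17. The sketch below is an alternative technique rather than the answer's own method, and baseName is a hypothetical helper.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: returns the file name without its directory
// and without its last extension.
// A leading dot (as in ".profile") is not treated as an extension.
std::string baseName(const std::filesystem::path& file)
{
    return file.stem().string();
}
```

For a TCHAR-based Unicode build, file.stem().wstring() would give a std::wstring instead. Only the last extension is removed, so archive.tar.gz becomes archive.tar.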
Try the solution below:
string fileName = "sample.txt";
// rfind locates the last dot, so "a.b.txt" yields "a.b"
size_t position = fileName.rfind(".");
string extractName = (string::npos == position)? fileName : fileName.substr(0, position);
TCHAR* FileHandler::GetFileNameWithoutExtension(TCHAR* fileName)
{
int fileLength = _tcslen(fileName) - 4;
TCHAR* value1 = new TCHAR[fileLength+1];
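_tcsncpy(value1, fileName, fileLength);
value1[fileLength] = _T('\0'); // _tcsncpy does not add a terminator when the source is longer than the count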
return value1;
}
Try this, assuming the file name is in a string:
string fileName = "sample.txt"; // your file name
string newFileName;
for (size_t count = 0;
count < fileName.size() && fileName[count] != '.';
count++)
{
newFileName.push_back(fileName[count]);
}
This walks through the letters of the original file name and appends them one by one to the new string until it reaches the first dot (or the end of the name).
There are several ways to do this, but this is one basic way to do it.