GetFullPathNameW and long Windows file paths - c++

In the Windows version of my current personal project, I'm looking to support extended length filepaths. As a result, I'm a little confused with how to use the GetFullPathNameW API to resolve the full name of a long filepath.
According to the MSDN (with regards to the lpFileName parameter):
In the ANSI version of this function, the name is limited to MAX_PATH characters. To extend this limit to 32,767 wide characters, call the Unicode version of the function and prepend "\\?\" to the path. For more information, see Naming a File.
If I'm understanding this correctly, in order to use an extended length filepath with GetFullPathNameW, I need to specify a path with the \\?\ prefix attached. Since the \\?\ prefix is only valid before volume letters or UNC paths, this would mean that the API is unusable for resolving the full name of a path relative to the current directory.
If that's the case, is there another API I can use to resolve the full name of a filepath like ..\somedir\somefile.txt if the resulting name's length exceeds MAX_PATH? If not, would I be able to combine GetCurrentDirectory with the relative filepath (\\?\C:\my\cwd\..\somedir\somefile.txt) and use it with GetFullPathNameW, or would I need to handle all of the filepath resolution on my own?

GetFullPathNameA is limited to MAX_PATH characters because it first converts the ANSI name to Unicode using a hardcoded buffer of MAX_PATH WCHARs. If that conversion does not fail because of the length restriction, GetFullPathNameW (or GetFullPathName_U[Ex] directly) is called and the resulting Unicode name is converted back to ANSI.
GetFullPathNameW is a very thin shell over GetFullPathName_U. It is limited to MAXSHORT (0x7fff) WCHARs, regardless of whether the \\?\ prefix is present. Even without \\?\, it works for long (> MAX_PATH) relative names. However, if the lpFileName parameter does not begin with the \\?\ prefix, the resulting name in the lpBuffer parameter will not begin with \\?\ either.
If you then pass lpBuffer to a function such as CreateFileW, that function internally converts the Win32 name to an NT name, and the result depends on the name type (RTL_PATH_TYPE). If the name does not begin with the \\?\ prefix, the conversion fails for long paths because RtlDosPathNameToRelativeNtPathName_U[_WithStatus] fails: for such a path it internally calls GetFullPathName_U (the same function that GetFullPathNameW calls) with nBufferLength hardcoded to MAX_PATH (exactly 2*MAX_PATH in bytes; NTDLL functions take buffer sizes in bytes, not in WCHARs). If the name begins with the \\?\ prefix, a different branch of RtlDosPathNameToRelativeNtPathName_U[_WithStatus] runs, RtlpWin32NtNameToNtPathName, which replaces \\?\ with \??\ and has no MAX_PATH limitation.
So the solution may look like this:
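// First call: ask for the required size in WCHARs, including the terminating null.
if (ULONG len = GetFullPathNameW(FileName, 0, nullptr, nullptr))
{
    // Reserve 4 extra WCHARs for the \\?\ prefix (stack allocation; use the heap for very long paths).
    PWSTR buf = (PWSTR)_alloca((4 + len) * sizeof(WCHAR));
    buf[0] = L'\\', buf[1] = L'\\', buf[2] = L'?', buf[3] = L'\\';
    // Second call: on success the return value excludes the terminating null, hence len - 1.
    if (len - 1 == GetFullPathNameW(FileName, len, buf + 4, nullptr))
    {
        CreateFileW(buf, ...);
    }
}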
So the \\?\ prefix does need to be attached, but after calling GetFullPathNameW, not before.
For more info, read this - The Definitive Guide on Win32 to NT Path Conversion
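One caveat about the snippet above: prepending \\?\ unconditionally fits drive-letter results such as C:\dir\file.txt, but a UNC result (\\server\share\...) has the extended-length form \\?\UNC\server\share\... instead. A minimal sketch of that prefixing step follows. The helper name ToExtendedLengthPath is hypothetical, and the function assumes its input is already a fully resolved absolute path (for example, the output of GetFullPathNameW); it only manipulates the string.

```cpp
#include <string>

// Hypothetical helper: converts an already-resolved absolute Win32 path into
// its extended-length form. It does not touch the file system.
std::wstring ToExtendedLengthPath(const std::wstring& fullPath)
{
    const std::wstring prefix = L"\\\\?\\";        // the \\?\ prefix
    const std::wstring devicePrefix = L"\\\\.\\";  // the \\.\ device prefix

    // Already extended-length, or a device path: leave it alone.
    if (fullPath.compare(0, prefix.size(), prefix) == 0 ||
        fullPath.compare(0, devicePrefix.size(), devicePrefix) == 0)
    {
        return fullPath;
    }

    // UNC path: \\server\share\... becomes \\?\UNC\server\share\...
    if (fullPath.compare(0, 2, L"\\\\") == 0)
    {
        return prefix + L"UNC\\" + fullPath.substr(2);
    }

    // Drive-letter path: C:\... becomes \\?\C:\...
    return prefix + fullPath;
}
```

The returned string can then be passed to CreateFileW in place of the manually prefixed buffer.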

Just to update with the current state:
Starting in Windows 10, version 1607, MAX_PATH limitations have been removed from common Win32 file and directory functions. However, you must opt-in to the new behavior. To enable the new long path behavior, both of the following conditions must be met: ...
For the rest, please see my answer here: https://stackoverflow.com/a/57624626/3736444

Related

Overcome MAX_PATH filename length

I have read a lot of documentation on this subject, but I can't seem to figure it out.
The problem is that I have to process file paths that may be longer than the MAX_PATH limit, which causes a lot of issues.
I have already replaced all my ANSI-functions like GetFileAttributesA with the UNICODE equivalent (GetFileAttributesW) in order to support the extended file path length with the prefix: \\?\.
However, I also need to check whether the file path for instance is a symbolic link and I need to know the filesize, last modified date, etc.
In order to do so, I use the stat function, as shown below:
if (stat(LongFilePath, &file_info) == 0) // THIS FAILS WITH ENAMETOOLONG FOR LONG FILE PATHS
So, here again, the problem is in the ENAMETOOLONG error, due to a too long filename (exceeding MAX_PATH).
So, I found out I could use fstat to access the file by its descriptor. However, to obtain the descriptor, I need to use fopen, which also has the ENAMETOOLONG limitation.
So, my question is: how can I get the file information I need (symlink, file size, last modified date, and so on, as the stat function offers) for file paths exceeding MAX_PATH?

Can I retrieve a path, containing other than Latin characters?

I call the GetModuleFileName function to retrieve the fully qualified path of a specified module, so that I can start another .exe in the same folder via the Process::Start method.
However, .exe cannot be called when the path contains other than Latin characters (in my case Greek characters).
Is there any way I can fix this?
Code:
TCHAR path[1000];
GetModuleFileName(NULL, path, 1000) ; // Retrieves the fully qualified path for the file that
// contains the specified module.
PathRemoveFileSpec(path); // Removes the trailing file name and backslash from a path (TCHAR).
CHAR mypath[1000];
// Convert TCHAR to CHAR.
wcstombs(mypath, path, wcslen(path) + 1);
// Formatting the string: constructing a string by substituting computed values at various
// places in a constant string.
CHAR mypath2[1000];
sprintf_s(mypath2, "%s\\Client_JoypadCodesApplication.exe", mypath);
String^ result;
result = marshal_as<String^>(mypath2);
Process::Start(result);
Strings in .NET are encoded in UTF-16. The fact that you are calling wcstombs() means your app is compiled for Unicode and TCHAR maps to WCHAR, which is what Windows uses for UTF-16. So there is no need to call wcstombs() at all. Retrieve and format the path as UTF-16, then marshal it as UTF-16. Stop using TCHAR altogether (unless you need to compile for Windows 9x/ME):
WCHAR path[1000];
GetModuleFileNameW(NULL, path, 1000);
PathRemoveFileSpecW(path);
WCHAR mypath[1000];
swprintf_s(mypath, 1000, L"%s\\Client_JoypadCodesApplication.exe", path);
String^ result;
result = marshal_as<String^>(mypath);
Process::Start(result);
A better option would be to use a native .NET solution instead (untested):
String^ path = Application::StartupPath; // the executable's directory, without the file name
// or:
//String^ path = Path::GetDirectoryName(Process::GetCurrentProcess()->MainModule->FileName);
Process::Start(path + L"\\Client_JoypadCodesApplication.exe");
You must use GetModuleFileNameW and store the result in a wchar_t string.
Most Win32 API functions have a "Unicode" variant, which takes/gives UTF-16 strings. Using the ANSI versions is highly discouraged.
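If the path is kept in a wide string from start to finish, the directory handling can also be done without fixed-size buffers. The sketch below is an illustration only: SiblingPath is a hypothetical helper, not a Win32 or .NET function, and it assumes modulePath is the full path returned by GetModuleFileNameW.

```cpp
#include <string>

// Hypothetical helper: returns the path of `siblingName` in the same
// directory as `modulePath` (a full path to an executable).
std::wstring SiblingPath(const std::wstring& modulePath,
                         const std::wstring& siblingName)
{
    // Find the last path separator; everything before it is the directory.
    const std::wstring::size_type slash = modulePath.find_last_of(L"\\/");
    if (slash == std::wstring::npos)
    {
        return siblingName;  // no directory component at all
    }
    return modulePath.substr(0, slash + 1) + siblingName;
}
```

Because no narrowing conversion takes place, Greek or other non-Latin characters in the directory name pass through unchanged.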

Storing and retrieving UTF-8 strings from Windows resource (RC) files

I created an RC file which contains a string table, and I would like to use some special characters: ö ü ó ú ő ű á é, so I saved the file with UTF-8 encoding.
But when I call in my cpp file, something like this:
LoadString("hu.dll", 12, nn, MAX_PATH);
I get a weird result:
How do I solve this problem?
As others have pointed out in the comments, the Windows APIs do not provide direct support for UTF-8 encoded text. You cannot pass the MessageBox function UTF-8 encoded strings and get the output that you expect. It will, instead, interpret them as characters in your local code page.
To get a UTF-8 string to pass to the Windows API functions (including MessageBox), you need to use the MultiByteToWideChar function to convert from UTF-8 to UTF-16 (what Windows calls Unicode, or wide strings). Passing the CP_UTF8 flag for the first parameter is the magic that enables this conversion. Example:
#include <windows.h>
#include <stdexcept>
#include <string>

std::wstring ConvertUTF8ToUTF16String(const char* pszUtf8String)
{
    // Determine the size required for the destination buffer.
    // Because the input length is -1, the result includes the terminating null.
    const int length = MultiByteToWideChar(CP_UTF8,
                                           0,   // no flags required
                                           pszUtf8String,
                                           -1,  // automatically determine length
                                           nullptr,
                                           0);
    if (length == 0)
    {
        throw std::runtime_error("The MultiByteToWideChar function failed");
    }
    // Allocate a buffer of the appropriate length.
    std::wstring utf16String(length, L'\0');
    // Call the function again to do the conversion.
    if (!MultiByteToWideChar(CP_UTF8,
                             0,
                             pszUtf8String,
                             -1,
                             &utf16String[0],
                             length))
    {
        // Uh-oh! Something went wrong.
        // Handle the failure condition, perhaps by throwing an exception.
        // Call the GetLastError() function for additional error information.
        throw std::runtime_error("The MultiByteToWideChar function failed");
    }
    // Drop the terminating null written by the API; std::wstring tracks its own length.
    utf16String.pop_back();
    // Return the converted UTF-16 string.
    return utf16String;
}
Then, once you have a wide string, you will explicitly call the wide-string variant of the MessageBox function, MessageBoxW.
However, if you only need to support Windows and not other platforms that use UTF-8 everywhere, you will probably have a much easier time sticking exclusively with UTF-16 encoded strings. This is the native Unicode encoding that Windows uses, and you can pass these types of strings directly to any of the Windows API functions. See my answer here to learn more about the interaction between Windows API functions and strings. I recommend the same thing to you as I did to the other guy:
Stick with wchar_t and std::wstring for your characters and strings, respectively.
Always call the W variants of Windows API functions, including LoadStringW and MessageBoxW.
Ensure that the UNICODE and _UNICODE macros are defined either before you include any of the Windows headers or in your project's build settings.

GetDiskFreeSpaceEx with NULL Directory Name failing

I'm trying to use GetDiskFreeSpaceEx in my C++ win32 application to get the total available bytes on the 'current' drive. I'm on Windows 7.
I'm using this sample code: http://support.microsoft.com/kb/231497
And it works! Well, almost. It works if I provide a drive, such as:
...
szDrive[0] = 'C'; // <-- specifying drive
szDrive[1] = ':';
szDrive[2] = '\\';
szDrive[3] = '\0';
pszDrive = szDrive;
...
fResult = pGetDiskFreeSpaceEx ((LPCTSTR)pszDrive,
    (PULARGE_INTEGER)&i64FreeBytesToCaller,
    (PULARGE_INTEGER)&i64TotalBytes,
(PULARGE_INTEGER)&i64FreeBytes);
fResult becomes true and I can go on to accurately calculate the number of free bytes available.
The problem, however, is that I was hoping to not have to specify the drive, but instead just use the 'current' one. The docs I found online (Here) state:
lpDirectoryName [in, optional]
A directory on the disk. If this parameter is NULL, the function uses the root of the current disk.
But if I pass in NULL for the Directory Name then GetDiskFreeSpaceEx ends up returning false and the data remains as garbage.
fResult = pGetDiskFreeSpaceEx (NULL,
    (PULARGE_INTEGER)&i64FreeBytesToCaller,
    (PULARGE_INTEGER)&i64TotalBytes,
(PULARGE_INTEGER)&i64FreeBytes);
//fResult == false
Is this odd? Surely I'm missing something? Any help is appreciated!
EDIT
As per JosephH's comment, I did a GetLastError() call. It returned the DWORD for:
ERROR_INVALID_NAME 123 (0x7B)
The filename, directory name, or volume label syntax is incorrect.
2nd EDIT
Buried down in the comments I mentioned:
I tried GetCurrentDirectory and it returns the correct absolute path, except it prefixes it with \\?\
it returns the correct absolute path, except it prefixes it with \\?\
That's the key to this mystery. What you got back is the name of the directory in the native api path format. Windows is an operating system that internally looks very different from what you are familiar with from winapi programming. The Windows kernel has a completely different api; it resembles the DEC VMS operating system a lot. No coincidence, David Cutler used to work for DEC. On top of that native OS there were originally three api layers: Win32, POSIX and OS/2. They made it easy to port programs from other operating systems to Windows NT. Nobody cared much for the POSIX and OS/2 layers, and they were dropped around the time of XP.
One infamous restriction in Win32 is the value of MAX_PATH, 260. It sets the largest permitted size of a C string that stores a file path name. The native api permits much larger names, roughly 32000 characters. You can bypass the Win32 restriction by passing the path name in the native api format, which is simply the same path name you are familiar with, but prefixed with \\?\.
So surely the reason that you got such a string back from GetCurrentDirectory() is because your current directory name is longer than 259 characters. Extrapolating further, GetDiskFreeSpaceEx() failed because it has a bug, it rejects the long name it sees when you pass NULL. Somewhat understandable, it isn't normally asked to deal with long names. Everybody just passes the drive name.
This is fairly typical for what happens when you create directories with such long names. Stuff just starts falling over randomly. In general there is a lot of C code around that uses MAX_PATH and that code will fail miserably when it has to deal with path names that are longer than that. This is a pretty exploitable problem too for its ability to create stack buffer overflow in a C program, technically a carefully crafted file name could be used to manipulate programs and inject malware.
There is no real cure for this problem, that bug in GetDiskFreeSpaceEx() isn't going to be fixed any time soon. Delete that directory, it can cause lots more trouble, and write this off as a learning experience.
I am pretty sure you will have to retrieve the current drive and directory and pass that to the function. I remember attempting to use GetDiskFreeSpaceEx() with the directory name as ".", but that did not work.
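The workaround suggested above (retrieve the current directory and pass an explicit drive root) can be sketched as a small string helper. This is an illustration under stated assumptions: DriveRootFromPath is a hypothetical name, the input is assumed to be the string returned by GetCurrentDirectoryW, and only drive-letter paths are handled.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper: extracts "X:\" from a drive-letter path, tolerating a
// leading \\?\ prefix. Returns an empty string for anything else (for
// example UNC paths), which the caller must handle separately.
std::wstring DriveRootFromPath(const std::wstring& path)
{
    const std::wstring prefix = L"\\\\?\\";
    std::wstring::size_type start = 0;
    if (path.compare(0, prefix.size(), prefix) == 0)
    {
        start = prefix.size();  // skip the extended-length prefix
    }

    // Expect a letter followed by a colon, e.g. "C:".
    if (path.size() >= start + 2 &&
        std::iswalpha(static_cast<std::wint_t>(path[start])) &&
        path[start + 1] == L':')
    {
        return std::wstring(1, path[start]) + L":\\";
    }
    return std::wstring();
}
```

A non-empty result can be passed as lpDirectoryName to GetDiskFreeSpaceExW instead of NULL; an empty result means the current directory is not on a lettered drive and needs separate handling.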

ExpandEnvironmentStrings Not Expanding My Variables

I have a process under the Run key in the registry. It is trying to access an environment variable that I have defined in a previous session. I'm using ExpandEnvironmentStrings to expand the variable within a path. The environment variable is a user profile variable. When I run my process on the command line it does not expand as well. If I call 'set' I can see the variable.
Some code...
CString strPath = "\\\\server\\%share%"
TCHAR cOutputPath[32000];
DWORD result = ExpandEnvironmentStrings((LPSTR)&strPath, (LPSTR)&cOutputPath, _tcslen(strPath) + 1);
if ( !result )
{
int lastError = GetLastError();
pLog->Log(_T( "Failed to expand environment strings. GetLastError=%d"),1, lastError);
}
When debugging, the output path is exactly the same as the input path. No error code is returned.
What is going on?
One problem is that you are providing the wrong parameters to ExpandEnvironmentStrings and then using a cast to hide that fact (although you do need a cast to get the correct type out of a CString).
You are also using the wrong value for the last parameter. That should be the size of the output buffer, not the length of the input string (from the documentation: the maximum number of characters that can be stored in the buffer pointed to by the lpDst parameter).
Putting that all together, you want:
ExpandEnvironmentStrings((LPCTSTR)strPath,
                         cOutputPath,
                         sizeof(cOutputPath) / sizeof(*cOutputPath));
I don't see any error checking code in your snippet, you don't assert the return value. If there's a problem, you'd never discover it. Also, you are using ANSI strings, beware of the weirdo requirement for the nSize argument (1 extra).
What about the buffer size? Is it initialized to the right value?
The documentation states that if the destination buffer is too small to hold the expanded string, the return value is the required buffer size, in characters.