Delphi alternatives to some File I/O C++ lib functions? - c++

fopen_s <--> OpenFile
fclose <--> CloseFile
Is my assumption correct?
I wonder what is better to use, OpenFile or CreateFile. The latter gives more freedom, but is it faster?

I would use neither in Delphi – I would use streams. Low level file handling is messy and error-prone, it's much better to use higher level routines if you can.
You ask which is faster, OpenFile or CreateFile. They are basically the same: any method of opening a file ends up mapping onto the same system call, so the performance will be the same no matter how you do it. What's more, the cost of opening a file rarely matters; the time is spent reading and writing.
Any questions about performance are hard to answer without context. The answer for an app which reads thousands of small text files is different from one which streams backups to a tape drive, for example.
Anyway, to stress my original point, take advantage of the excellent high-level framework that Delphi provides, use streams, avoid low-level I/O and enjoy!
So, how does one use a Delphi stream? I'll try to illustrate this with a made-up example that writes some text, held in a string, to a file.
procedure SaveTextToFile(FileName, Text: string);
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmCreate);
  try
    if Length(Text) > 0 then
      Stream.WriteBuffer(Text[1], Length(Text) * SizeOf(Char));
  finally
    Stream.Free;
  end;
end;
It's pretty self-explanatory. The second parameter to the TFileStream constructor determines the file mode. Here we want to create a brand new file and so if any contents exist, they are removed. You can also specify file sharing with this parameter.
The code to write the buffer out has a little boiler-plate but again is very simple.
Loading it back results in an almost identical routine:
function LoadTextFromFile(FileName: string): string;
var
  Stream: TFileStream;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead);
  try
    SetLength(Result, Stream.Size div SizeOf(Char));
    if Length(Result) > 0 then
      Stream.ReadBuffer(Result[1], Length(Result) * SizeOf(Char));
  finally
    Stream.Free;
  end;
end;
If you wish to seek around the file then you can set the Position property of the stream, or call the Seek() method. The advantage of the latter is that you can seek from current position or end position.
Streams are idiomatic Delphi. They are used pervasively in the RTL and VCL and by 3rd party libraries. They signal errors with exceptions in the native Delphi manner. There are many different stream classes that all derive from a common ancestor and many routines accept this common ancestor.

Low-level Delphi file handling is done like this:
procedure Proc;
var
  f: file; // or f: TextFile;
begin
  FileMode := fmOpenRead; // or fmOpenWrite or fmOpenReadWrite
  AssignFile(f, 'C:\file.txt');
  try
    // Reset/Rewrite
    // A number of BlockRead/BlockWrite/ReadLn/WriteLn...
  finally
    CloseFile(f);
  end;
end;
This is the classic way of working with files in Delphi, and this is what corresponds to the C++ functions.
OpenFile and CreateFile are not Delphi functions, so they cannot correspond to the C++ functions. Instead, these are functions of the Windows API, which is available in all (Windows) programming languages. The former, OpenFile, is not recommended. Use CreateFile instead. But if you use the Windows API file-handling functions to open/create a file, you should also use these to read/write the file, e.g. the ReadFile function, and you must finish by using the CloseHandle function.
Notice in particular that OpenFile is a function of the Windows API, whereas CloseFile is a Delphi RTL function, so you cannot even use these together! Delphi: AssignFile->CloseFile; Windows API: CreateFile->CloseHandle.
You should also know that there are high-level functions for managing files in the Delphi RTL (run-time library). I am sure other users will promote these.

It's been a long while since I have done any Delphi programming, but I remember that file I/O was much better served by the TStream suite of classes (TFileStream for file I/O). They are essentially the equivalent of C++'s I/O streams library, which is, of course, the preferred way of doing file I/O in C++. See this simple example and this wiki.
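For comparison, here is a minimal C++ sketch of the same save/load pair written with the standard stream classes. The names saveTextToFile and loadTextFromFile are made up to mirror the Delphi routines shown earlier on this page, and throwing std::runtime_error on failure is just one reasonable choice, not the only one.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Writes `text` to `fileName`, replacing any existing contents
// (roughly what fmCreate does in the Delphi version).
inline void saveTextToFile(const std::string& fileName, const std::string& text) {
    std::ofstream out(fileName, std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open for writing: " + fileName);
    out.write(text.data(), static_cast<std::streamsize>(text.size()));
    out.flush();  // push buffered data out so a failure shows up here
    if (!out) throw std::runtime_error("write failed: " + fileName);
}   // the stream's destructor closes the file, so no try/finally is needed

// Reads the whole file back into a string.
inline std::string loadTextFromFile(const std::string& fileName) {
    std::ifstream in(fileName, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open for reading: " + fileName);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

The destructor plays the role of the Delphi try/finally block: the file is closed on every exit path, including when an exception is thrown.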

Related

How to write custom input function for Flex in C++ mode?

I have a game engine and a shader parser. The engine has an API for reading from a virtual file system, and I would like to be able to load shaders through this API. I was thinking about implementing my own std::ifstream, but I don't like that idea: my API is very simple and I don't want to do a lot of unnecessary work. I just need to be able to read N bytes from the VFS. I used the C++ mode for convenience, but I cannot find a solution to this problem, since there is very little official information about it. Everything is there for the C API, where at least I can call the scan_string function; I did not find such a function in the yyFlexLexer interface.
To be honest, I wanted to abandon std::ifstream in the parser and go back to the C API only. The only things I used the Flex C++ mode for are interacting with the Bison C++ API and making the parser usable in a multi-threaded environment, but both can also be achieved with the C API.
I just couldn't compile the C parser with the C++ compiler.
I would be happy if there is a way to add such functionality through some kind of macro.
I wouldn't mind if there was a way to get the yy_scan_string function back; I could read the whole file myself and provide just a string.
The simple solution, if you just want to provide a string input, is to make the string into a std::istringstream, which is a valid std::istream. The simplicity of this solution reduces the need for an equivalent to yy_scan_string.
On the other hand, if you have a data source you want to read from which is not derived from std::istream, you can easily create a lexical scanner which does whatever is necessary. Just subclass yyFlexLexer, add whatever private data members you will need and a constructor which initialises them, and override int LexerInput(char* buffer, size_t maxsize); to read at least one and no more than maxsize bytes into buffer, returning the number of characters read. (YY_INPUT also works in the C++ interface, but subclassing is more convenient precisely because it lets you maintain your own reader state.)
Notes:
If you decide to subclass and override LexerInput, you need to be aware that "interactive" mode is actually implemented in LexerInput. So if you want your lexer to have an interactive mode, you'll have to implement it in your override, too. In interactive mode, LexerInput always reads exactly one character (unless, of course, it's at the end of the file).
As you can see in the Flex code repository, a future version of Flex will use refactored versions of these functions, so you might need to be prepared to modify your code in the future, although Flex generally maintains backwards compatibility for a long time.
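Flex's generated headers are not available outside a Flex build, so the sketch below shows only the reader state that such a subclass might own. Its read method is shaped like the LexerInput contract described above: fill at most maxsize bytes and return how many were written, with 0 meaning end of input. MemorySource is a hypothetical stand-in for the engine's VFS file; it is not a Flex type or part of any real engine API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical byte source standing in for a VFS file handle.
class MemorySource {
public:
    explicit MemorySource(std::string data) : data_(std::move(data)) {}

    // Copies up to `maxsize` bytes into `buffer` and returns the number
    // copied; returns 0 once the source is exhausted.
    std::size_t read(char* buffer, std::size_t maxsize) {
        assert(buffer != nullptr);
        const std::size_t n = std::min(maxsize, data_.size() - pos_);
        std::memcpy(buffer, data_.data() + pos_, n);
        pos_ += n;  // remember where the next call should continue
        return n;
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
};
```

In the real scanner, the LexerInput override would presumably just forward to a member like this and convert the result to whatever return type the FlexLexer.h of your Flex version declares.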

How do I get a HANDLE to the containing directory from a file HANDLE?

Given a HANDLE to a file (e.g. C:\\FolderA\\file.txt), I want a function which will return a HANDLE to the containing directory (in the previous example, it would be a HANDLE to C:\\FolderA). For example:
HANDLE hFile = CreateFileA(
    "C:\\FolderA\\file.txt",
    GENERIC_READ,
    FILE_SHARE_READ,
    NULL,
    OPEN_EXISTING,
    FILE_ATTRIBUTE_NORMAL,
    NULL);
HANDLE hDirectory = someFunc(hFile);
Possible implementation for someFunc:
HANDLE someFunc(HANDLE h)
{
    char *path = getPath(h); // "C:\\FolderA\\file.txt"
    char *parent = getParentPath(path); // "C:\\FolderA"
    HANDLE hFile = CreateFileA(
        parent,
        GENERIC_READ,
        FILE_SHARE_READ,
        NULL,
        OPEN_EXISTING,
        FILE_ATTRIBUTE_NORMAL,
        NULL);
    free(parent);
    free(path);
    return hFile;
}
But is there a way to implement someFunc without getParentPath or without making it look at the string and removing everything after the last directory separator (because this is terrible from a performance point of view)?
I don't know what getParentPath is. I assume it's a function that searches for the trailing backslash in the string and uses that to strip off the file specification. You don't have to define such a function yourself; Windows already provides one for you—PathCchRemoveFileSpec. (Note that this assumes the specified path actually contains a file name to remove. If the path doesn't contain a file name, it will remove the trailing directory name. There are other functions you can use to verify whether a path contains a file specification.)
The older version of this function is PathRemoveFileSpec, which is what you would use on downlevel operating systems where the newer, safer function is not available.
Outside of the Windows API, there are other ways of doing the same thing. If you're targeting C++17, there is the filesystem::path class. Boost provides something similar. Or you could write it yourself with the find_last_of member function of the std::string class, if you absolutely have to. (But prefer not to re-invent the wheel. There are lots of edge cases when it comes to path manipulation that you probably won't think of, and that your testing probably won't reveal.)
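As a rough illustration of the C++17 route, the sketch below wraps std::filesystem::path::parent_path. The function name parentDirectory is made up for this example. Note that the operation is purely lexical (it never touches the disk), and that which characters count as separators depends on the platform: a backslash is only treated as a separator on Windows.

```cpp
#include <filesystem>

// Returns the directory part of `file` by lexical manipulation only.
// An input with no directory component yields an empty path.
inline std::filesystem::path parentDirectory(const std::filesystem::path& file) {
    return file.parent_path();
}
```

For instance, parentDirectory("FolderA/file.txt") yields the path FolderA on any platform, while the backslash form from the question would only be split correctly when compiled for Windows.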
You express concerns about the performance of this approach. This is nonsense. Stripping some characters from a string is not a slow operation. It wouldn't even be slow if you started searching from the beginning of the string and then, once you found the file specification, made a second copy of the string, again starting from the beginning of the string. It's a simple loop searching through the characters of a reasonable-length string, and then a simple memcpy. There is absolutely no way that this operation could be a performance bottleneck in code that does file I/O.
But the implementation probably isn't even going to be that naïve. You can optimize it by starting the search from the end of the path string, reducing the number of characters that you have to iterate through, and you can avoid any type of memory copy altogether if you're allowed to manipulate the original string. With a C-style string, you just replace the trailing path separator (the one that demarcates the beginning of the file specification) with a NUL character (\0). With a C++-style string, you just call the erase member function.
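To make the in-place variant concrete, here is a small sketch that scans backwards and overwrites the last separator with a NUL. The name stripFileSpecInPlace is hypothetical, and the function deliberately ignores the edge cases that the Windows path functions handle for you (a file directly under a drive root, for example, would be reduced to just "C:").

```cpp
#include <cassert>
#include <cstring>

// Truncates `path` at its last separator ('\\' or '/').
// Returns false and leaves the string untouched if there is none.
inline bool stripFileSpecInPlace(char* path) {
    assert(path != nullptr);
    for (char* p = path + std::strlen(path); p != path; ) {
        --p;  // walk from the end towards the start
        if (*p == '\\' || *p == '/') {
            *p = '\0';  // cut the string here; no copy, no allocation
            return true;
        }
    }
    return false;
}
```

Apart from the strlen call, the cost is a scan over the file name portion only, which is why it is hard to see this step showing up in a profile next to a CreateFile call.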
In fact, if you really care about performance, this is virtually guaranteed to be faster than making a system call to retrieve the containing folder from a file object. System calls are a lot slower than some compiler-generated, inlinable code to iterate through a string and strip out a sub-string.
Once you have the path to the directory, you can obtain a HANDLE to it by calling the CreateFile function with the FILE_FLAG_BACKUP_SEMANTICS flag. (It is necessary to pass that flag if you want to retrieve a handle to a directory.)
"I have measured that this is slow and am looking for a faster way."
Your measurements are wrong. Either you've made the common mistake of benchmarking a debugging build, where the standard library functionality (e.g., std::string) is not optimized, or the real performance bottleneck is the file I/O (or both). CreateFile is not a speedy function by any stretch of the imagination. I can almost guarantee that is going to be your hotspot.
Note that if you don't already have the path, it is straightforward to obtain the path from a HANDLE to a file. As was pointed out in the comments, on Windows Vista and later, you simply need to call the GetFinalPathNameByHandle function. More details are available in this article on MSDN, including sample code and an alternative for use on downlevel versions of Windows.
As was mentioned already in the comments to the question, you can optimize this further by allocating a buffer of length MAX_PATH (or perhaps even larger) on the stack. That compiles to a single instruction to adjust the stack pointer, so it won't be a performance bottleneck, either. (Okay, I lied: you actually will need two instructions—one to create space on the stack, and the other to free the allocated space on the stack. Still not a performance problem.) That way, you don't even have to do any dynamic memory allocation.
Note that for maximum robustness, especially on Windows 10, you want to handle the case that a path is longer than MAX_PATH. In such cases, your stack-allocated buffer will be too small, and the function you call to fill it will return an error. Handle that error, and allocate a larger buffer on the free store. That will be slower, but this is an edge case and probably not one that is worth optimizing. The 99% common case will use the stack-allocated buffer.
Furthermore, eryksun points out (in comments to this answer) that, although it is convenient, GetFinalPathNameByHandle requires multiple system calls to map the file object between the NT and DOS namespaces and to normalize the path. I haven't disassembled this function, so I can't confirm his claims, but I have no reason to doubt them. Under normal circumstances, you wouldn't worry about this sort of overhead or possible performance costs, but since this seems to be a big concern for your application, you can use eryksun's alternative suggestion of calling GetFileInformationByHandleEx and requesting the FileNameInfo class. GetFileInformationByHandleEx is a general, multi-purpose function that can retrieve all different sorts of information about a file, including the path. Its implementation is simpler, calling directly down to the native NtQueryInformationFile function. I would have thought GetFinalPathNameByHandle was just a user-mode wrapper providing exactly this service, but eryksun's research suggests it is doing extra work that you might want to avoid if this is truly a performance hot-spot. I have to qualify this slightly by noting that GetFileInformationByHandleEx, in order to retrieve the FileNameInfo, is going to have to create an I/O Request Packet (IRP) and call down to the underlying device driver. That's not a cheap operation, so I'm not sure that the additional overhead of normalizing the path is really going to matter. But in this case, there's no real harm in using the GetFileInformationByHandleEx approach, since it's a documented function.
If you've written the code as described but are still having measurable performance problems, then please post that code for someone to review and help you optimize. The Code Review Stack Exchange site is a great place to get help like that on working code. Feel free to leave me a link to such a question in a comment under this answer so that I don't miss it.
Whatever you do, please stop calling the ANSI versions of the Windows API functions (the ones that end with an A suffix). You want the wide-character (Unicode) versions. These end with a W suffix, and work with strings composed of WCHAR (== wchar_t) characters. Aside from the fact that the ANSI versions have been deprecated for decades now because they do not provide Unicode support (it is not optional for any application written after the year 2000 to support Unicode characters in paths), as much as you care about performance, you should be aware of the fact that all A-suffixed API functions are just stubs that convert the passed-in ANSI string to a Unicode string and then delegate to the W-suffixed version. If the function returns a string, a second conversion also must be done by the A-suffixed version, since all native APIs work with Unicode strings. Performance isn't the real reason why you should avoid calling ANSI functions, but perhaps it's one that you'll find more convincing.
There might be a way to do what you want (map a file object via a HANDLE to its containing directory), but it would require undocumented usage of the NT native API. I don't see anything at all in the documented functions that would allow you to obtain this information. It certainly isn't accessible via the GetFileInformationByHandleEx function. For better or worse, the user-mode file system API is almost entirely path-based. Presumably, it is tracked internally, but even the documented NT native API functions that take a root directory HANDLE (e.g., NtDeleteFile via the OBJECT_ATTRIBUTES structure) allow this field to be NULL, in which case the full path string is used.
As always, if you had provided more details on the bigger picture, we could probably provide a more appropriate solution. This is what the commenters were driving at when they mentioned an XY problem. Yes, people are questioning your motives because that's how we provide the most appropriate help.

Creating a new file avoiding race conditions

I need to develop a C++ routine performing this apparently trivial task: create a file only if it does not exist, else do nothing/raise error.
As I need to avoid race conditions, I want to use the "ask forgiveness not permission" principle (i.e. attempting the intended operation and checking if it succeeded, as opposed to checking preconditions in advance), which, to my knowledge, is the only robust and portable method for this purpose [Wikipedia article][an example with getline].
Still, I could not find a way to implement it in my case. The best I could come up with is opening a fstream in app mode (or fopening with "a"), checking the output position with tellp (C++) or ftell (C) and aborting if such position is not zero. This has however two disadvantages, namely that if the file exists it gets locked (although for a short time) and its modification date is altered.
I checked other possible combinations of ios_base::openmode for fstream, as well as the mode strings of fopen but found no option that suited my needs. Further search in the C and C++ standard libraries, as well as Boost Filesystem, proved unfruitful.
Can someone point out a method to perform my task in a robust way (no collateral effects, no race conditions) without relying on OS-specific functions?
My specific problem is in Windows, but portable solutions would be preferred.
EDIT: The answer by BitWhistler completely solves the problem for C programs. Still, I am amazed that no idiomatic C++ solution seems to exist. Either one uses open with the O_EXCL attribute as proposed by Andrew Henle, which is however OS-specific (in Windows the attribute seems to be called _O_EXCL, with an additional underscore [MSDN]), or one separately compiles a C11 file and links it from the C++ code. Moreover, the file descriptor obtained cannot be converted to a stream except with nonstandard extensions (e.g. GCC's __gnu_cxx::stdio_filebuf). I hope a future version of C++ will implement the "x" subspecifier and possibly also a corresponding ios:: modifier for file streams.
The new C standard (C2011, which is not part of C++) adds a new standard subspecifier ("x"), that can be appended to any "w" specifier (to form "wx", "wbx", "w+x" or "w+bx"/"wb+x"). This subspecifier forces the function to fail if the file exists, instead of overwriting it.
source: http://www.cplusplus.com/reference/cstdio/fopen/
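A minimal sketch of how the "x" subspecifier might be used from C++ follows. It assumes the C library behind your toolchain's std::fopen honours "wx"; C++17 refers to the C11 library, but it is worth checking on your platform. The helper name createExclusive is made up for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Tries to create `path` as a new, empty file.
// Returns true only if this call created it; false if the file already
// existed or could not be created for some other reason.
inline bool createExclusive(const std::string& path) {
    assert(!path.empty());
    std::FILE* f = std::fopen(path.c_str(), "wx");  // "x": fail if it exists
    if (f == nullptr) return false;
    std::fclose(f);
    return true;
}
```

Because the existence check and the creation happen inside a single open call, there is no window between "check" and "create" for another process to slip into, and an existing file is neither truncated nor has its modification time changed. For stream users, C++23 adds std::ios_base::noreplace, which is intended to give file streams the same behaviour.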

fstream delete N bytes from the end of a binary file

Is it possible to delete N bytes from the end of a binary file in C++ using fstream (or something similar)? I don't want to read the whole file, cut it and write it again, but since it's from the end of a file it seems like it shouldn't be such a problem.
I'm not aware of a generic C++ (platform independent) way to do this without writing a new file. However, on POSIX systems (Linux, etc.) you can use the ftruncate() function. On Windows, you can use SetEndOfFile().
This also means you'll need to open the file using the native functions instead of fstream since you need the native descriptor/handle for those functions.
EDIT: If you are able to use the Boost library, it has a resize_file() function in its Filesystem library which would do what you want.
Update:
Now in C++17 you can use resize_file from filesystem
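A short sketch of what that could look like is below. The helper names dropTrailingBytes and writeThenTrim are invented for this example, and clamping to zero when N exceeds the file size is an assumption about the desired behaviour.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Removes the last `n` bytes of `file`; if the file is shorter than `n`,
// it is truncated to zero length. Throws fs::filesystem_error on failure.
inline void dropTrailingBytes(const fs::path& file, std::uintmax_t n) {
    const std::uintmax_t size = fs::file_size(file);
    fs::resize_file(file, n < size ? size - n : 0);
}

// Demonstration helper: writes `text`, trims `n` bytes, returns the new size.
inline std::uintmax_t writeThenTrim(const fs::path& file,
                                    const std::string& text,
                                    std::uintmax_t n) {
    {
        std::ofstream out(file, std::ios::binary | std::ios::trunc);
        out << text;
    }  // stream is closed here, before the file is resized
    dropTrailingBytes(file, n);
    return fs::file_size(file);
}
```

Note that the resize is applied to the path, not to an open fstream, so it is safest to close any stream on the file first, as the demonstration helper does.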
In case you want to use Qt, QFile also provides two resize() methods that allow you to truncate a file.

How to pass a Delphi Stream to a c/c++ DLL

Is it possible to pass a Delphi stream (TStream descendant) to a DLL written in c/c++? DLL will be written in Microsoft c/c++. If that is not possible, how about if we use C++ Builder to create the DLL? Alternatively, are there any Stream (FIFO) classes which can be shared between Microsoft C/C++ and Delphi?
Thanks!
You can do this using IStream and TStreamAdapter. Here's a quick example (tested in D2007 and XE2):
uses
  ActiveX;

procedure TForm1.DoSomething;
var
  MemStream: TMemoryStream;
  ExchangeStream: IStream;
begin
  MemStream := TMemoryStream.Create;
  try
    MemStream.LoadFromFile('C:\Test\SomeFile.txt');
    MemStream.Position := 0;
    ExchangeStream := TStreamAdapter.Create(MemStream) as IStream;
    // Pass ExchangeStream to C++ DLL here, and do whatever else
  finally
    MemStream.Free;
  end;
end;
Just in case, if you need to go the other way (receiving an IStream from C/C++), you can use TOleStream to get from that IStream to a Delphi TStream.
Code compiled by Microsoft C/C++ cannot call methods directly on a Delphi object. You would have to wrap the methods up and present, to the C++ code, an interface, for example.
Code compiled by C++ Builder can call methods directly on a Delphi object.
In general, wrapping up a Delphi class and presenting it as an interface is not completely trivial. One reason why you can't just expose the raw methods via an interface is that Delphi methods use the register calling convention, which is proprietary to Embarcadero compilers. You'd need to use a calling convention that is understood by the Microsoft compiler, e.g. stdcall.
Another complication comes with exceptions. You would need to make sure that your interface methods did not throw exceptions since your C++ code can't be expected to catch them. One option would be to use Delphi's safecall calling convention. The safecall calling convention is stdcall but with an added twist that converts exceptions into HRESULT values.
All rather straightforward in concept, but probably requiring a certain amount of tedious boilerplate code.
Thankfully, in the case of TStream, you can use TStreamAdapter to expose the Delphi stream as a COM IStream. In fact, the source code for this small class shows how to handle the issues I describe above.
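To illustrate the exception point from the other side of the boundary, here is a portable C++ sketch of the same idea that safecall implements: run the real work inside a guard and turn any exception into a status code. ResultCode, kOk, kFail, kOutOfMemory and guarded are names invented for this example; the two error values are chosen to match the usual numeric values of E_FAIL and E_OUTOFMEMORY, but nothing here uses the actual COM headers.

```cpp
#include <cstdint>
#include <new>

using ResultCode = std::int32_t;  // stand-in for HRESULT

constexpr ResultCode kOk = 0;  // plays the role of S_OK
constexpr ResultCode kFail = static_cast<ResultCode>(0x80004005u);         // E_FAIL
constexpr ResultCode kOutOfMemory = static_cast<ResultCode>(0x8007000Eu);  // E_OUTOFMEMORY

// Runs `body` and converts any escaping exception into a result code,
// so nothing is ever thrown across the module boundary.
template <typename F>
ResultCode guarded(F&& body) noexcept {
    try {
        body();
        return kOk;
    } catch (const std::bad_alloc&) {
        return kOutOfMemory;
    } catch (...) {
        return kFail;
    }
}
```

Each exported method would then wrap its body in a call such as guarded([&] { /* real work */ }), which is the kind of repetitive boilerplate referred to above.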