Is it possible to open the same file using wfstream and fstream - c++

I have a requirement to open the same file with a wfstream instance in one part of the code and with an fstream instance in another part. The file holds a username, which must be read as a std::wstring, and a password, which must be read as a std::string. How do I get the values of both variables in the same part of the code?
As you can see below, I need to read the username and password from the file and assign them to variables.
Type conversion is not an option, so please do not suggest it.
......file.txt.......
username-amritha
password-rajeevan
the code is written as follows:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
int main()
{
std::string y;
unsigned int l;
std::wstring username;
std::wstring x=L"username";
std::wstring q;
std::string password;
std::string a="password";
std::cout<<"enter the username:";
std::wcin>>username;
std::cout<<"enter the password:";
std::cin>>password;
std::wfstream fpp("/home/aricent/Documents/testing.txt",std::ios::in | std::ios::out );
std::getline(fpp,q);
if (q.find(x, 0) != std::string::npos) {
std::wstring z=q.substr(q.find(L"-") + 1) ;
std::wcout<<"the username is:"<<z;
fpp.seekg( 0, std::ios::beg );
fpp<<q.replace(x.length()+1, z.length(), username);
}
fpp.close();
std::fstream fp("/home/aricent/Documents/testing.txt",std::ios::in | std::ios::out );
std::getline(fp,y);
if (y.find(a, 0) != std::string::npos)
{
unsigned int len=x.length()+1;
unsigned int leng=username.length();
l=len+leng;
fp.seekg(l+1);
std::string b=y.substr(y.find("-") + 1) ;
fp<<y.replace(a.length()+1, b.length(), password);
}
fp.close();
}

It's not recommended to open multiple streams to the same file simultaneously. On the other hand, if you only read from the file and never write to it (and would thus be using ifstream and wifstream), that's probably safe.
Alternatively, you can simply open a wfstream, read the username, close the stream, open a fstream and read the password.
If you have the choice, avoid mixed encoding files entirely.
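The "open, read, close, then open again" approach can be sketched as follows. This is a minimal sketch, not the asker's code: read_value is a hypothetical helper, it assumes the layout shown in file.txt (a key, a hyphen, then the value), and it assumes the file content is plain ASCII so that both stream types decode it the same way under the default locale.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: opens the file with a stream of the requested character
// type, returns the text after "<key>-" on the first matching line, and closes
// the stream again when it goes out of scope.
template <typename CharT>
std::basic_string<CharT> read_value(const std::string& path,
                                    const std::basic_string<CharT>& key) {
    std::basic_ifstream<CharT> in(path.c_str());
    std::basic_string<CharT> line;
    while (std::getline(in, line)) {
        const bool key_matches = line.compare(0, key.size(), key) == 0;
        if (key_matches && line.size() > key.size() &&
            line[key.size()] == CharT('-'))
            return line.substr(key.size() + 1);
    }
    return std::basic_string<CharT>();  // key not found
}
```

Calling read_value<wchar_t>(path, L"username") yields a std::wstring and read_value<char>(path, "password") yields a std::string, and at no point are two streams open on the file at the same time.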

You should not try to open the same file with two descriptors. Even if it worked (in read-only mode, for example), the two descriptors would not be synchronised: each keeps its own position, so after reading the first characters on one you would read the same characters again on the other.
So IMHO, you should stick to one single solution. My advice is to use a character stream to process the file, and use a codecvt to convert from a narrow string to a wide wstring when you need it.
An example conversion function could be (ref: cplusplus.com: codecvt::in):
std::wstring wconv(const std::string& str, const std::locale mylocale) {
// define a codecvt facet for the locale
typedef std::codecvt<wchar_t,char,std::mbstate_t> facet_type;
const facet_type& myfacet = std::use_facet<facet_type>(mylocale);
// define a mbstate to use in codecvt::in
std::mbstate_t mystate = std::mbstate_t();
size_t l = str.length();
const char * ix = str.data(), *next; // narrow character pointers
wchar_t *wc, *wnext; // wide character pointers
// use a wide char array of same length than the narrow char array to convert
wc = new wchar_t[str.length() + 1];
// conversion call
facet_type::result result = myfacet.in(mystate, ix, ix + l,
next, wc, wc + l, wnext);
// should test for error conditions
*wnext = 0; // ensure the wide char array is properly null terminated
std::wstring wstr(wc); // store it in a wstring
delete[] wc; // destroy the char array
return wstr;
}
This code should test for abnormal conditions, and use try catch to be immune to exceptions but it is left as exercise for the reader :-)
A variant of the above using codecvt::out could be used to convert from wide string to narrow string.
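One possible shape for that variant is sketched below. It is an assumption about what nconv could look like, not code from the answer: it uses a std::vector instead of new[]/delete[], sizes the buffer with codecvt::max_length, and throws on anything other than a complete conversion. With the default "C" locale only ASCII input is expected to convert.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical counterpart to wconv: wide string -> narrow string via codecvt::out.
std::string nconv(const std::wstring& wstr,
                  const std::locale& mylocale = std::locale()) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& myfacet = std::use_facet<facet_type>(mylocale);
    std::mbstate_t mystate = std::mbstate_t();

    // max_length() is the largest number of narrow chars one wide char can need.
    const std::size_t per_char =
        static_cast<std::size_t>(std::max(1, myfacet.max_length()));
    std::vector<char> buffer(wstr.size() * per_char + 1);

    const wchar_t* from_next = 0;
    char* to_next = 0;
    facet_type::result result = myfacet.out(
        mystate, wstr.data(), wstr.data() + wstr.size(), from_next,
        buffer.data(), buffer.data() + buffer.size(), to_next);
    if (result != facet_type::ok)
        throw std::runtime_error("nconv: conversion failed or was incomplete");

    return std::string(buffer.data(), to_next);  // only the bytes actually written
}
```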
In the code above, I would use (assuming nconv is the function using codecvt::out to convert from wide string to narrow string):
...
#include <locale>
...
std::cin>>password;
std::locale mylocale;
std::fstream fp("/home/aricent/Documents/testing.txt",std::ios::in | std::ios::out );
std::getline(fp,y);
q = wconv(y, mylocale);
...
fp<<nconv(q.replace(x.length()+1, z.length(), username));
}
std::getline(fp, y);
...

Related

base64 encoding removing carriage return from dos header

I have been trying to encode the binary data of an application as base64 (specifically with Boost's base64), but I have run into an issue where the carriage return after the DOS header is not encoded correctly.
It should look like this:
This program cannot be run in DOS mode.[CR]
[CR][LF]
but instead the output looks like this:
This program cannot be run in DOS mode.[CR][LF]
It seems the first carriage return is being skipped, which then makes the DOS header invalid when attempting to run the program.
The code for the base64 algorithm I am using can be found at: https://www.boost.org/doc/libs/1_66_0/boost/beast/core/detail/base64.hpp
Thanks so much!
void load_file(const char* filename, char** file_out, size_t& size_out)
{
    FILE* file;
    fopen_s(&file, filename, "r");
    if (!file)
        return; // the function is void, so there is nothing to return on failure
    fseek(file, 0, SEEK_END);
    size_out = ftell(file);
    rewind(file);
    *file_out = new char[size_out];
    fread(*file_out, size_out, 1, file);
    fclose(file);
}
void some_func()
{
char* file_in;
size_t file_in_size;
load_file("filename.bin", &file_in, file_in_size);
auto encoded_size = base64::encoded_size(file_in_size);
auto file_encoded = new char[encoded_size];
memset(file_encoded, 0, encoded_size);
base64::encode(file_encoded, file_in, file_in_size);
std::ofstream orig("orig.bin", std::ios_base::binary);
for (int i = 0; i < file_in_size; i++)
{
auto c = file_in[i];
orig << c; // DOS header contains a NULL as the 3rd char, don't allow it to be null terminated early, may cause ending nulls but does not affect binary files.
}
orig.close();
std::ofstream encoded("encoded.txt"); //pass this output through a base64 to file website.
encoded << file_encoded; // for loop not required, does not contain nulls (besides ending null) will contain trailing encoded nulls.
encoded.close();
auto decoded_size = base64::decoded_size(encoded_size);
auto file_decoded = new char[decoded_size];
memset(file_decoded, 0, decoded_size); // again trailing nulls but it doesn't matter for binary file operation. just wasted disk space.
base64::decode(file_decoded, file_encoded, encoded_size);
std::ofstream decoded("decoded.bin", std::ios_base::binary);
for (int i = 0; i < decoded_size; i++)
{
auto c = file_decoded[i];
decoded << c;
}
decoded.close();
delete[] file_in; // allocated with new[], so release with delete[] rather than free()
delete[] file_encoded;
delete[] file_decoded;
}
The above code will show that the file reading does not remove the carriage return, while the encoding of the file into base64 does.
Okay thanks for adding the code!
I tried it, and indeed there was "strangeness", even after I simplified the code (mostly to make it C++, instead of C).
So what do you do? You look at the documentation for the functions. That seems complicated since, after all, detail::base64 is, by definition, not part of public API, and "undocumented".
However, you can still read the comments at the functions involved, and they are pretty clear:
/** Encode a series of octets as a padded, base64 string.
The resulting string will not be null terminated.
@par Requires
The memory pointed to by `out` points to valid memory
of at least `encoded_size(len)` bytes.
@return The number of characters written to `out`. This
will exclude any null termination.
*/
std::size_t
encode(void* dest, void const* src, std::size_t len)
And
/** Decode a padded base64 string into a series of octets.
@par Requires
The memory pointed to by `out` points to valid memory
of at least `decoded_size(len)` bytes.
@return The number of octets written to `out`, and
the number of characters read from the input string,
expressed as a pair.
*/
std::pair<std::size_t, std::size_t>
decode(void* dest, char const* src, std::size_t len)
Conclusion: What Is Wrong?
Nothing about "dos headers" or "carriage returns". Perhaps something about "rb" in fopen (see: what's the difference between r and rb in fopen), but why even use that:
template <typename Out> Out load_file(std::string const& filename, Out out) {
std::ifstream ifs(filename, std::ios::binary); // or "rb" on your fopen
ifs.exceptions(std::ios::failbit |
std::ios::badbit); // we prefer exceptions
return std::copy(std::istreambuf_iterator<char>(ifs), {}, out);
}
The real issue is: your code ignored all return values from encode/decode.
The encoded_size and decoded_size values are estimations that will give you enough space to store the result, but you have to correct it to the actual size after performing the encoding/decoding.
Here's my fixed and simplified example. Notice how the md5sums check out:
#include <boost/beast/core/detail/base64.hpp>
#include <fstream>
#include <iostream>
#include <vector>
namespace base64 = boost::beast::detail::base64;
template <typename Out> Out load_file(std::string const& filename, Out out) {
std::ifstream ifs(filename, std::ios::binary); // or "rb" on your fopen
ifs.exceptions(std::ios::failbit |
std::ios::badbit); // we prefer exceptions
return std::copy(std::istreambuf_iterator<char>(ifs), {}, out);
}
int main() {
std::vector<char> input;
load_file("filename.bin", back_inserter(input));
// allocate "enough" space, using an upperbound prediction:
std::string encoded(base64::encoded_size(input.size()), '\0');
// encode returns the **actual** encoded_size:
auto encoded_size = base64::encode(encoded.data(), input.data(), input.size());
encoded.resize(encoded_size); // so adjust the size
std::ofstream("orig.bin", std::ios::binary)
.write(input.data(), input.size());
std::ofstream("encoded.txt") << encoded;
// allocate "enough" space, using an upperbound prediction:
std::vector<char> decoded(base64::decoded_size(encoded_size), 0);
auto [decoded_size, // decode returns the **actual** decoded_size
processed] // (as well as number of encoded bytes processed)
= base64::decode(decoded.data(), encoded.data(), encoded.size());
decoded.resize(decoded_size); // so adjust the size
std::ofstream("decoded.bin", std::ios::binary)
.write(decoded.data(), decoded.size());
}
When run on "itself" using
g++ -std=c++20 -O2 -Wall -pedantic -pthread main.cpp -o filename.bin && ./filename.bin
md5sum filename.bin orig.bin decoded.bin
base64 -d < encoded.txt | md5sum
It prints
d4c96726eb621374fa1b7f0fa92025bf filename.bin
d4c96726eb621374fa1b7f0fa92025bf orig.bin
d4c96726eb621374fa1b7f0fa92025bf decoded.bin
d4c96726eb621374fa1b7f0fa92025bf -

Loading shellcode from a file into a char* produces strange characters at the end of the text

I have a char array like the following:
// MessageBox
char xcode[] = "\x31\xc9\x64\x8b\x41\x30\x8b\x40\xc\x8b\x70\x14\xad\x96\xad\x8b\x58\x10\x8b\x53\x3c\x1\xda\x8b\x52\x78\x1\xda\x8b\x72\x20\x1\xde\x31\xc9\x41\xad\x1\xd8\x81\x38\x47\x65\x74\x50\x75\xf4\x81\x78\x4\x72\x6f\x63\x41\x75\xeb\x81\x78\x8\x64\x64\x72\x65\x75\xe2\x8b\x72\x24\x1\xde\x66\x8b\xc\x4e\x49\x8b\x72\x1c\x1\xde\x8b\x14\x8e\x1\xda\x31\xc9\x53\x52\x51\x68\x61\x72\x79\x41\x68\x4c\x69\x62\x72\x68\x4c\x6f\x61\x64\x54\x53\xff\xd2\x83\xc4\xc\x59\x50\x51\x66\xb9\x6c\x6c\x51\x68\x33\x32\x2e\x64\x68\x75\x73\x65\x72\x54\xff\xd0\x83\xc4\x10\x8b\x54\x24\x4\xb9\x6f\x78\x41\x0\x51\x68\x61\x67\x65\x42\x68\x4d\x65\x73\x73\x54\x50\xff\xd2\x83\xc4\x10\x68\x61\x62\x63\x64\x83\x6c\x24\x3\x64\x89\xe6\x31\xc9\x51\x56\x56\x51\xff\xd0";
Then I put the contents of the variable above into a file (UTF-8 encoded, content without the quotes) and tried to load it this way:
ifstream infile;
infile.open("shellcode.bin", std::ios::in | std::ios::binary);
infile.seekg(0, std::ios::end);
size_t file_size_in_byte = infile.tellg();
char* xcode = (char*)malloc(sizeof(char) * file_size_in_byte);
infile.seekg(0, std::ios::beg);
infile.read(xcode, file_size_in_byte);
printf("%s\n", xcode); // << prints content of xcode after load from file
if (infile.eof()) {
size_t bytes_really_read = infile.gcount();
}
else if (infile.fail()) {
}
infile.close();
I'm seeing some strange characters at the end of the text.
What is needed to fix it?
The issue is that the printf format specifier "%s" requires that the string is null-terminated. In your case, the null-terminator just happens to be after those characters you're seeing, but nothing guarantees where the null is unless you put one there.
Since you're using C++, one way to print the characters is to use the write() function available for streams:
#include <iostream>
//...
std::cout.write(xcode, file_size_in_byte);
The overall point is this: if you have a character array that contains data and is not null-terminated, you must either:
1. Put the null in the right place before using the array in functions that look for the null terminator, or
2. Use functions that state how many characters to process from the character array.
The answer above uses item 2.
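Both options can be sketched on a small buffer. The helper names make_terminated and make_counted are hypothetical and exist only for illustration. The first copies the data into a buffer that is one byte larger and terminates it explicitly. The second stores the length together with the data, so no terminator is needed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Item 1: copy into a buffer one byte larger and add the terminator yourself.
std::vector<char> make_terminated(const char* data, std::size_t size) {
    std::vector<char> buffer(data, data + size);
    buffer.push_back('\0');  // now safe for functions that look for '\0'
    return buffer;
}

// Item 2: keep the length with the data; std::string stores its own size.
std::string make_counted(const char* data, std::size_t size) {
    return std::string(data, size);
}
```

The result of make_terminated can be handed to printf("%s", ...) through data(), while the result of make_counted can be written with std::cout.write(s.data(), s.size()). Note that any approach that searches for the terminator stops at the first null byte inside the data, whereas write() emits exactly the number of bytes it is given.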

Can't write chinese character into textfile with wofstream

I'm using std::wofstream to write characters to a text file. My characters can come from very different languages (English to Chinese).
I want to print my vector<wstring> into that file.
If my vector contains only English characters, I can print them without a problem.
But if I write Chinese characters, my file remains empty.
I browsed through Stack Overflow, and all the answers basically said to use functions from this library:
#include <codecvt>
I can't include that library, because I am using Dev-C++ in version 5.11.
I added #define UNICODE to all my header files.
I guess there is a really simple solution for that problem.
It would be great, if someone could help me out.
My code:
#define UNICODE
#include <string>
#include <fstream>
using namespace std;
int main()
{
string Path = "D:\\Users\\\t\\Desktop\\korrigiert_RotCommon_zh_check_error.log";
wofstream Out;
wstring eng = L"hello";
wstring chi = L"程序";
Out.open(Path, ios::out);
//works.
Out << eng;
//fails
Out << chi;
Out.close();
return 0;
}
Kind Regards
Even if the name of the wofstream implies it's a wide char stream, it's not. It's still a char stream that uses a convert facet from a locale to convert the wchars to char.
Here is what cppreference says:
All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.
So you could either set the global locale to one that supports Chinese or imbue the stream. In both cases you'll get a single byte stream.
#include <codecvt> // std::codecvt_utf8 is declared here
#include <locale>
//...
const std::locale loc = std::locale(std::locale(), new std::codecvt_utf8<wchar_t>);
Out.open(Path, ios::out);
Out.imbue(loc);
Unfortunately, std::codecvt_utf8 is deprecated as of C++17. The MSDN Magazine article "C++ - Unicode Encoding Conversions with STL Strings and Win32 APIs" explains how to do UTF-8 conversion using MultiByteToWideChar.
Here is the Microsoft/vcpkg variant of a to_utf8 conversion:
std::string to_utf8(const CWStringView w)
{
const size_t size = WideCharToMultiByte(CP_UTF8, 0, w.c_str(), -1, nullptr, 0, nullptr, nullptr);
std::string output;
output.resize(size - 1);
WideCharToMultiByte(CP_UTF8, 0, w.c_str(), -1, output.data(), size - 1, nullptr, nullptr);
return output;
}
Alternatively, you can use a normal binary stream and write the wstring data with write().
std::ofstream Out(Path, ios::out | ios::binary);
const uint16_t bom = 0xFEFF;
Out.write(reinterpret_cast<const char*>(&bom), sizeof(bom)); // optional Byte order mark
Out.write(reinterpret_cast<const char*>(chi.data()), chi.size() * sizeof(wchar_t));
You forgot to tell your stream what locale to use:
Out.imbue(std::locale("zh_CN.UTF-8"));
You'll obviously need to include <locale> for this.
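A further option that needs neither <codecvt> nor the Win32 API is to encode the wide string as UTF-8 by hand and write the resulting bytes through an ordinary std::ofstream. The sketch below is not taken from either answer: append_utf8 and wide_to_utf8 are hypothetical names, and it assumes the wstring holds well-formed UTF-16 (as on Windows) or UTF-32 code points.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one Unicode code point to out, encoded as 1 to 4 UTF-8 bytes.
void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Converts a wide string to UTF-8, joining UTF-16 surrogate pairs if present.
std::string wide_to_utf8(const std::wstring& wide) {
    std::string out;
    for (std::size_t i = 0; i < wide.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(wide[i]);
        // A high surrogate followed by a low surrogate forms one code point
        // (this only occurs where wchar_t is 16 bits wide).
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wide.size()) {
            const std::uint32_t low = static_cast<std::uint32_t>(wide[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        append_utf8(out, cp);
    }
    return out;
}
```

The converted string can then be written with a narrow stream opened in binary mode, for example std::ofstream(Path, std::ios::binary) << wide_to_utf8(chi), which sidesteps the locale of the stream entirely.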

writing a string to file as a sequence of bytes

I want to write a wide string to a file as a sequence of bytes. I tried two ways, the first way:
std::wstring str = L"This is a test";
LPBYTE pBuf = (LPBYTE)str.c_str();
FILE* hFile = _wfopen( L"c:\\temp.txt", L"w" );
for( int i = 0; i<(str.length()*sizeof(wchar_t)); ++i)
fwprintf( hFile, L"%02X", pBuf[i] );
fclose(hFile);
The second way:
std::wstring str = L"This is a test";
LPBYTE pBuf = (LPBYTE)str.c_str();
HANDLE hFile = CreateFile( L"c:\\temp.txt", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL );
DWORD dwRet;
WriteFile( hFile, pBuf, str.length()*sizeof(wchar_t), &dwRet, NULL );
CloseHandle(hFile);
When I open the result file, in the first case the contents of the file are:
54006800690073002000690073002000610020007400650073007400
In the second case, the contents of the file are:
This is a test
Why doesn't the first way work as expected? Both ways look equivalent to me.
In the first example, you used fwprintf to format the bytes as 2-digit hex strings so that is why you see hex in that file.
I suspect you should spend some time researching the ASCII code and UTF-16LE and looking at text using a hex editor.
Every file is just a sequence of bytes so your question is not well defined and makes me think you have some fundamental misunderstanding about bytes and encodings but I'm not sure what it is.
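The difference between the two snippets can be reproduced without any file I/O. In the sketch below, hex_dump and raw_bytes are hypothetical names that are not part of the question's code: the first formats every byte as two hex digits, which is what the "%02X" loop does, and the second copies the bytes unchanged, which is what WriteFile does.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Formats each byte as two uppercase hex digits, like fwprintf with "%02X".
std::string hex_dump(const unsigned char* data, std::size_t size) {
    std::string out;
    char digits[3];  // two hex digits plus the terminating null
    for (std::size_t i = 0; i < size; ++i) {
        std::snprintf(digits, sizeof digits, "%02X",
                      static_cast<unsigned>(data[i]));
        out += digits;
    }
    return out;
}

// Copies the bytes as they are, like WriteFile in the second snippet.
std::string raw_bytes(const unsigned char* data, std::size_t size) {
    return std::string(reinterpret_cast<const char*>(data), size);
}
```

For the two UTF-16LE code units of "Th" (bytes 54 00 68 00), hex_dump produces the eight characters 54006800, while raw_bytes produces four bytes that a text editor shows as "Th".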
Assuming you want to write out the in-memory representation of the string:
#include <fstream>
#include <string>
int main (int argc,char *argv[]) {
    std::wstring str = L"This is a test";
    // binary mode: the raw bytes must not go through newline translation
    std::ofstream fout(R"(c:\temp.txt)", std::ios::binary);
    fout.exceptions(std::ios::badbit | std::ios::failbit);
    fout.write(reinterpret_cast<const char*>(str.data()), sizeof(wchar_t) * str.size());
}
We use ofstream because this is C++ and it's better to use RAII types instead of having to manually call fclose or CloseHandle. We use a raw string for the filename so we don't have to deal with escaping the backslash. (On platforms that use a sensible path separator, the raw string here is unnecessary. ;) ) We also turn on exceptions so that we don't have to explicitly check for errors.
Then we write out the bytes using the write member function. Note that the codecvt facet is still applied to the data written using this method. This is the reason we're using ofstream instead of wofstream; The default facet for ofstream does nothing, but the default facet for wofstream would convert the wchar_t to char using the default locale.
If you simply want to write UTF-16 data out then there are better ways than trying to write the raw bytes of a wchar_t string. (wchar_t isn't necessarily UTF-16. Some platforms just happen to use UTF-16.)
One way is to use the codecvt_utf16 facet:
#include <fstream>
#include <codecvt>
#include <locale>
#include <string>
int main(int argc, char *argv[]) {
std::wstring str = L"This is a test";
std::wofstream fout(R"(C:\temp.txt)");
fout.exceptions(std::ios::badbit | std::ios::failbit);
fout.imbue(std::locale(std::locale("C"), new std::codecvt_utf16<wchar_t>));
fout << str;
}
Here we write a wchar_t string normally, but we've imbued the wstream with codecvt_utf16, so that the wchar_t is converted to UTF-16. If you want little endian UTF-16, or you want to include U+FEFF at the beginning of the file (both are common on Windows), there are flags to enable that: std::codecvt_utf16<wchar_t, 0x10FFFF, static_cast<std::codecvt_mode>(std::generate_header | std::little_endian)>. (Also note that codecvt_utf16 will treat wchar_t as UCS-2 or UCS-4, never UTF-16. The upshot is that this only handles the BMP on Windows.)
Another option is to use normal streams and the wstring_convert facility:
#include <fstream>
#include <codecvt>
#include <locale>
#include <string>
int main(int argc, char *argv[]) {
std::wstring str = L"This is a test";
std::ofstream fout(R"(C:\temp.txt)", std::ios::binary); // binary: keep the UTF-16 bytes untouched
fout.exceptions(std::ios::badbit | std::ios::failbit);
std::wstring_convert<std::codecvt_utf16<wchar_t>, wchar_t> convert;
fout << convert.to_bytes(str);
}
This is probably the option I would choose, since it allows one to almost completely avoid wchar_t.

Buffer size for reading a UTF-8-encoded file using ICU (ICU4C)

I am trying to read a UTF-8-encoded file using ICU4C on Windows with msvc11. I need to determine the size of the buffer to build a UnicodeString. Since there is no fseek-like function in the ICU4C API I thought I could use an underlying C-file:
#include <unicode/ustdio.h>
#include <stdio.h>
/*...*/
UFILE *in = u_fopen("utfICUfseek.txt", "r", NULL, "UTF-8");
FILE* inFile = u_fgetfile(in);
fseek(inFile, 0, SEEK_END); /* Access violation here */
int size = ftell(inFile);
auto uChArr = new UChar[size];
There are two problems with this code:
It "throws" access violation at the fseek() line for some reason (Unhandled exception at 0x000007FC5451AB00 (ntdll.dll) in test.exe: 0xC0000005: Access violation writing location 0x0000000000000024.)
The size returned by the ftell function will not be the size I want because UTF-8 can use up to 4 bytes for a code point (a u8"tю" string will be of length 3).
So the questions are:
How do I determine a buffer size for a UnicodeString if I know that the input file is UTF-8-encoded?
Is there a portable way to use iostream/fstream for both reading and writing ICU's UnicodeStrings?
Edit:
Here is the possible solution (tested on msvc11 and gcc 4.8.1) based on the first answer and C++11 Standard. A few things from ISO IEC 14882 2011:
"The fundamental storage unit in the C++ memory model is the byte. A
byte is at least large enough to contain any member of the basic
execution character set (2.3) and the eight-bit code units of the
Unicode UTF-8 encoding form..."
"The basic source character set consists of 96 characters...", - 7 bits needed already
"The basic execution character set and the basic execution
wide-character set shall each contain all the members of the basic
source character set..."
"Objects declared as characters (char) shall be large enough to
store any member of the implementation’s basic character set."
So, to make this portable for platforms where the implementation defined size of char is 1 byte = 8 bits (don't know where this isn't true) we can read Unicode characters into chars using unformatted input operation:
std::ifstream is;
is.open("utfICUfSeek.txt");
is.seekg(0, is.end);
int strSize = is.tellg();
auto inputCStr = new char[strSize + 1];
inputCStr[strSize] = '\0'; //add null-character at the end
is.seekg(0, is.beg);
is.read(inputCStr, strSize);
is.seekg(0, is.beg);
UnicodeString uStr = UnicodeString::fromUTF8(inputCStr);
is.close();
What troubles me is that I have to create an additional buffer for chars and only then convert them to the required UnicodeString.
This is an alternative to using ICU.
Using the standard std::fstream you can read the whole file, or part of it, into a std::string and then iterate over it with a Unicode-aware iterator: http://code.google.com/p/utf-iter/
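#include <cerrno>   // errno
#include <fstream>
#include <iterator> // std::istreambuf_iterator
#include <string>
std::string get_file_contents(const char *filename)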
{
std::ifstream in(filename, std::ios::in | std::ios::binary);
if (in)
{
std::string contents;
in.seekg(0, std::ios::end);
contents.reserve(in.tellg());
in.seekg(0, std::ios::beg);
contents.assign((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
in.close();
return(contents);
}
throw(errno);
}
Then in your code
std::string myString = get_file_contents( "foobar" );
unicode::iterator< std::string, unicode::utf8 /* or utf16/32 */ > iter = myString.begin();
while ( iter != myString.end() )
{
...
++iter;
}
Well, either you want to read in the whole file at once for some kind of postprocessing, in which case icu::UnicodeString is not really the best container...
#include <iostream>
#include <fstream>
#include <sstream>
int main()
{
std::ifstream in( "utfICUfSeek.txt" );
std::stringstream buffer;
buffer << in.rdbuf();
in.close();
// ...
return 0;
}
...or what you really want is to read into icu::UnicodeString just like into any other string object but went the long way around...
#include <iostream>
#include <fstream>
#include <unicode/unistr.h>
#include <unicode/ustream.h>
int main()
{
std::ifstream in( "utfICUfSeek.txt" );
icu::UnicodeString uStr;
in >> uStr;
// ...
in.close();
return 0;
}
...or I am completely missing what your problem really is about. ;)
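On the buffer-size question itself: a well-formed UTF-8 sequence of one to three bytes decodes to one UTF-16 code unit, and a four-byte sequence decodes to two, so the byte length of the file is always a safe upper bound for the number of UChar (UTF-16) units. If the exact count is wanted, it can be computed by scanning the lead bytes. A minimal sketch, assuming well-formed input; utf16_length is a hypothetical helper and not part of ICU:

```cpp
#include <cstddef>
#include <string>

// Counts the UTF-16 code units needed to hold a well-formed UTF-8 string.
std::size_t utf16_length(const std::string& utf8) {
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) == 0x80)
            continue;  // continuation byte, already counted with its lead byte
        // Lead bytes 0xF0 and above start a 4-byte sequence, which needs a
        // surrogate pair in UTF-16; every other sequence needs one unit.
        units += (byte >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For the "tю" example from the question, the three UTF-8 bytes need two UTF-16 code units, so a buffer sized from the byte count is slightly too large but never too small.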