Preventing Windows program from interpreting ^Z as end of file - c++

My job is to port an application from C to C++. It was installed on a Linux distribution, so I want to keep the functionality it had in C on Linux.
I have a problem with reading a binary file. The program reports EOF when it encounters a Ctrl-Z character, before it has reached the actual end of the file.
Previous execution in bash
zcat file.txt.gz | txtToBinary | binaryToOutput
Execution in command prompt
txtToBinary.exe < file.txt | binaryToOutput.exe
Raw text file
R 5643BYIDK DK0016060346 11DKKXKLY 160 1
R 10669VJK 98 1 IS0000004018 4ISKXICE 240 5000000
M814
txtToBinary.exe - Sample Output:
^#^#^# hello ^# ^Z^#^#^#^#
^#^#^[SWMA ^Y^YC
The problem is that the program interprets the first ^Z as the end of file.
Tried so far
My solution has been to do the following when compiling on Windows using C++
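#include <stdio.h>
#include <fcntl.h> /* _O_BINARY */
#include <io.h>    /* _setmode, _fileno */
int main(int argc, char* argv []){
    /* ... */
}
int loop (args_t* args){
    for (;;){
        char data [1024];
        int temp = read_msg (data, sizeof (data));
        /* ... */
    }
}
int read_msg(void* data, int size){
    _setmode(_fileno(stdin), _O_BINARY);
    _setmode(0,_O_BINARY);
    /* ... (declaration of hdr omitted in this excerpt) */
    if(fread(((unsigned char *)data)+sizeof(*hdr),hdr->size-sizeof (*hdr),1,stdin) != 1){
        if(feof(stdin))
            printf("End of file error\n");
    }
    /* ... */
}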
I have also tried Cygwin, which some of the answers suggested, but that also failed.
StackOverflow Answers
Looking at the answers here on SO (Windows, Windows EOF, Binary solution, Binary Mode, Stream data end at byte 26, and Reaching EOF early Windows), they tell me that:
- On Windows, Ctrl+Z (^Z) is treated as end of file
- I have to read in binary format

I found the answer to my question. It had to do with where the call is made. You need to put
_setmode(0,_O_BINARY);
in the main() function. Remember this, otherwise other reads or writes will not be covered.

fread() is part of stdio. What you're doing is opening the raw file as binary, but then doing text-mode standard I/O.
You could replace your existing fread() call with the read() system call. (That is, fread() is a library call that does some buffering, ultimately to call through to read().)

Related

C++ App killing USB disk access?

I've been working on this project, and I'll do my best to explain what I'm doing.
I run a bat file from native DOS (USB DOS boot thumb drive) that starts a program (SHOWDATA.EXE) and writes some of that program's output to a text file. It then launches my app (compiled for 16-bit DOS with Open Watcom), which modifies a second exe file (EDITED.EXE) using the information that was just output. Finally it should run the newly modified exe (EDITED.EXE). My test environments have been VirtualBox and a bootable USB DOS drive. So far my system and program run as intended up to the line where I display 'Finished'. When the newly modified exe is then launched, I get a
General failure reading drive C
Abort, Retry, Ignore, Fail?
VirtualBox here also displays a write error for drive A (floppy drive A).
If I restart the system, I can then run the newly modified file without issue and have the desired results.
Is there a problem with the way I am opening or editing or closing my file that would cause this behavior?
#include <stdio.h>
#include <string.h>
int main(void)
{
    FILE * pFile;
    char sn[11];   //10 characters plus the terminating NUL
    char uuid[33]; //32 characters plus the terminating NUL
    pFile = fopen ("testfile.exe","r+b");
    if (pFile == NULL) //Stop if the file cannot be opened
    {
        printf ("Cannot open testfile.exe\n");
        return(1);
    }
    printf ("PROGRAM TITLE TEXT\n");
    printf ("2013\n");
    printf ("\n");
    printf("Enter 10 Digit String from 1st field above:\n"); //From previous program output
    scanf ("%10s",sn); //only read 10 Chars
    printf("Enter 32 Digits from 2nd field above:\n"); //From previous program output
    scanf ("%32s",uuid); //only read 32 Chars
    fseek (pFile,24523,SEEK_SET);//file offset location to begin write
    fputs (sn,pFile); //Write our data!
    fseek (pFile,24582,SEEK_SET);//file offset location to begin write
    fputs (uuid,pFile); //Write our data!
    fseek (pFile,24889,SEEK_SET);//file offset location to begin write
    fputs (uuid,pFile); //Write our data!
    fclose(pFile); //Close our file
    printf ("Finished\n");
    return(0);
}
My Bat file looks like this, where I pass a variable from text file "D.txt" to SHOWDATA.EXE and write the output to info.txt. I then parse info.txt with a FOR /F to display only useful information that will be used to edit the second exe file (EDITED.EXE). Then it will launch the edited exe file.
TYPE D.txt | SHOWDATA.EXE > Info.txt
PAUSE
MYPROGRAM.EXE
PAUSE
EDITED.EXE
I'm at a loss.

In C++, after reading in a file using a terminal window, how can I use standard input?

I have a program where I read in a data file to populate a list of information
(./myProgram < dataFile.dat)
After the file is read I am unable to get standard input (cin) in that terminal window to work. It never gives me the chance to enter input, but simply skips over it, most likely grabbing some random value to store. Is there any way to use cin after reading in a file as shown above?
Depending on your shell environment, you can feed the file into another fd than stdin:
$ cat fd_in.c
int main()
{
unsigned char buf[1024];
int bytesread;
bytesread = read(3,buf,sizeof(buf));
printf("file is %d bytes\n",bytesread);
bytesread = read(0,buf,sizeof(buf));
printf("you entered %d bytes\n",bytesread);
}
$ gcc fd_in.c
$ ./a.out 3< fd_in.c
file is 220 bytes
my input!
you entered 10 bytes
./myProgram < dataFile.dat means that standard input will be sourced from dataFile.dat, not from the keyboard. So if you run the program like this, you will not be able to read keyboard data from standard input. While standard input/output are easy to use and make the program amenable to piping, etc., they work most naturally when your program has a single input and/or a single output. If your input comes from two different sources, as seems to be the case for you, then you should make your program explicitly open the file and read from it, using an ifstream. That way, keyboard input will still be available from cin.
On a separate note, it is a good idea to check if your reads succeed. If you had done so, you would have seen that all your reads from cin after the file is done are failing due to the end of file being reached.

feof() returning true when EOF is not reached

I'm trying to read from a file at a specific offset (simplified version):
typedef unsigned char u8;
FILE *data_fp = fopen("C:\\some_file.dat", "r");
fseek(data_fp, 0x004d0a68, SEEK_SET); // move filepointer to offset
u8 *data = new u8[0x3F0];
fread(data, 0x3F0, 1, data_fp);
delete[] data;
fclose(data_fp);
The problem is that data will not contain 1008 bytes, but 529 (seems random). Once it reaches 529 bytes, calls to feof(data_fp) start returning true.
I've also tried to read in smaller chunks (8 bytes at a time) but it just looks like it's hitting EOF when it's not there yet.
A simple look in a hex editor shows there are plenty of bytes left.
Opening a file in text mode, like you're doing, makes the library translate some of the file contents to other stuff, potentially triggering an unwarranted EOF or bad offset calculations.
Open the file in binary mode by passing the "b" option to the fopen call
fopen(filename, "rb");
Is the file being written to in parallel by some other application? Perhaps there's a race condition, so that the file ends at wherever the read stops, when the read is running, but later when you inspect it the rest has been written. That would explain the randomness, too.
Maybe it's a difference between textual and binary file. If you're on Windows, newlines are CRLF, which is two characters in file, but converted to only one when read. Try using fopen(..., "rb")
I can't see your link from work, but if your computer claims no more bytes exist, I'd tend to believe it. Why don't you print the size of the file rather than doing things by hand in a hex editor?
Also, you'd be better off using level 2 I/O. The f-calls are ancient C ugliness, and you're using C++ since you have new.
int fh = open(filename, O_RDONLY);
struct stat s;
fstat(fh, &s); // fstat takes a pointer to the stat buffer
cout << "size=" << hex << s.st_size << "\n";
Now do your seeking and reading using level 2 I/O calls, which are faster anyway, and let's see what the size of the file really is.

How do I check if my program has data piped into it

I'm writing a program that should read input via stdin, so I have the following construct.
FILE *fp=stdin;
But this just hangs if the user hasn't piped anything into the program. How can I check whether the user is actually piping data into my program, like
gunzip -c file.gz |./a.out #should work
./a.out #should exit program with nice msg.
thanks
Since you're using file pointers, you'll need both isatty() and fileno() to do this:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h> /* exit */
int main(int argc, char* argv[])
{
    FILE* fp = stdin;
    if(isatty(fileno(fp)))
    {
        fprintf(stderr, "A nice msg.\n");
        exit(1);
    }
    /* carry on... */
    return 0;
}
Actually, that's the long way. The short way is to not use file pointers:
#include <unistd.h>
#include <stdio.h>  /* fprintf */
#include <stdlib.h> /* exit */
int main(int argc, char* argv[])
{
    if(isatty(STDIN_FILENO))
    {
        fprintf(stderr, "A nice msg.\n");
        exit(1);
    }
    /* carry on... */
    return 0;
}
Several standard Unix programs do this check to modify their behavior. For example, if you have ls set up to give you pretty colors, it will turn the colors off if you pipe its stdout to another program.
Try "man isatty", I think that function will tell you if you are talking to the user or not.
Passing stdin to select() or poll() should tell you if input is waiting. Under many OSes you can also tell if stdin is a tty or pipe.
EDIT: I see I'm going to have to emphasize the also part of the tty test. A fifo is not a tty, yet there might be no input ready for an indefinite amount of time.
Use isatty to detect that stdin is coming from a terminal rather than a redirect.
See the function "isatty" - if STDIN is a terminal, you can skip reading from it. If it's not a terminal, you're getting data piped or redirected and you can read until EOF.
An additional option you get with select() is setting a timeout for reading from stdin (with respect to either the first read from stdin or consecutive reads from stdin).
For a code example using select on stdin see:
How to check if stdin is still opened without blocking?

Difference between files written in binary and text mode

What translation occurs when writing to a file that was opened in text mode that does not occur in binary mode? Specifically in MS Visual C.
unsigned char buffer[256];
for (int i = 0; i < 256; i++) buffer[i]=i;
int size = 1;
int count = 256;
Binary mode:
FILE *fp_binary = fopen(filename, "wb");
fwrite(buffer, size, count, fp_binary);
Versus text mode:
FILE *fp_text = fopen(filename, "wt");
fwrite(buffer, size, count, fp_text);
I believe that most platforms will ignore the "t" option or the "text-mode" option when dealing with streams. On Windows, however, this is not the case. If you take a look at the description of the fopen() function on MSDN, you will see that specifying the "t" option will have the following effect:
- Line feeds ('\n') will be translated to "\r\n" sequences on output.
- Carriage return/line feed sequences will be translated to line feeds on input.
- If the file is opened in append mode, the end of the file will be examined for a ctrl-z character (character 26) and that character removed, if possible. It will also interpret the presence of that character as being the end of file. This is an unfortunate holdover from the days of CP/M (something about the sins of the parents being visited upon their children up to the 3rd or 4th generation). Contrary to previously stated opinion, the ctrl-z character will not be appended.
In text mode, a newline "\n" may be converted to a carriage return + newline "\r\n"
Usually you'll want to open in binary mode. Trying to read binary data in text mode won't work; the data will be corrupted. You can read text fine in binary mode though. It just won't do the automatic translation between "\n" and "\r\n".
See fopen
Additionally, when you fopen a file with "rt" the input is terminated on a Ctrl-Z character.
Another difference is when using fseek
If the stream is open in binary mode, the new position is exactly offset bytes measured from the beginning of the file if origin is SEEK_SET, from the current file position if origin is SEEK_CUR, and from the end of the file if origin is SEEK_END. Some binary streams may not support the SEEK_END.
If the stream is open in text mode, the only supported values for offset are zero (which works with any origin) and a value returned by an earlier call to std::ftell on a stream associated with the same file (which only works with origin of SEEK_SET).
Even though this question was already answered and clearly explained, I think it would be interesting to show the main issue (translation between \n and \r\n) with a simple code example. Note that I'm not addressing the issue of the Ctrl-Z character at the end of the file.
#include <stdio.h>
#include <string.h>
int main() {
FILE *f;
char string[] = "A\nB";
int len;
len = strlen(string);
printf("As you'd expect string has %d characters... ", len); /* prints 3*/
f = fopen("test.txt", "w"); /* Text mode */
fwrite(string, 1, len, f); /* On Windows "A\r\nB" is written */
printf ("but %ld bytes were written to file", ftell(f)); /* prints 4 on Windows, 3 on Linux*/
fclose(f);
return 0;
}
If you execute the program on Windows, you will see the following message printed:
As you'd expect string has 3 characters... but 4 bytes were written to file
Of course you can also open the file with a text editor like Notepad++ and see the characters for yourself.
The inverse conversion is performed on Windows when reading the file in text mode.
We had an interesting problem with opening files in text mode where the files had a mixture of line ending characters:
1\n\r
2\n\r
3\n
4\n\r
5\n\r
Our requirement is that we can store our current position in the file (we used fgetpos), close the file, and later reopen the file and seek to that position (we used fsetpos).
However, where a file had a mixture of line endings, this process failed to seek to the same position. In our case (our tool parses C++), we were re-reading parts of the file we'd already seen.
Go with binary - then you can control exactly what is read and written from the file.
In 'w' mode the file is opened for writing in text mode, so the newline translation described above applies.
In 'wb' mode the file is opened for writing in binary mode and the bytes are written exactly as given. Any encoding, such as UTF-8 or UTF-16LE, is whatever the program itself writes.