Reading a huge text - c++

I am trying to read a file the way the more command does in Linux; I am essentially writing my own implementation of more.
My problem shows up depending on the file. When the file consists of short lines like:
aaaaaaaaaaaaaaaaaaaa
bbbbbbbbbbbbbbbbbbbbb
.....................
ccccccccccccccccccccc
For example: I have 100 lines in a .txt file, laid out as shown above. No line exceeds the width of the console, and each one ends with a newline.
In that case the program works well: if the file does not fit on the screen, I press SPACE and see the next screenful of the file.
But if I give it a huge text with long lines, it just reads the whole file and shows me the end of it.
What am I doing wrong? Could it be that I have not taken the width of the console into account? I thought Linux thinks for me in this case. Can you help me with this?
My arguments are: ./more 0 1.txt
./more is the executable, 0 is the line I begin at, and 1.txt is the file name.
My code :
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <cstdio> // getchar
#include <cstdlib> // atoi
#include <sys/ioctl.h>
#include <unistd.h>
#include <termios.h>
int getch();
void printRecord(int& numStr,struct winsize w, std::vector<std::string>& lines);
int main(int argc, char *argv[]) {
struct winsize w;
char ch;
ioctl(STDOUT_FILENO, TIOCGWINSZ, &w);
std::ifstream readRecord;
std::vector<std::string> lines;
std::string str;
readRecord.open(argv[2]);
int numStr= atoi(argv[1]);
while (std::getline(readRecord, str)) // stops at end of file without adding a spurious empty line
{
lines.push_back(str);
}
printRecord(numStr,w, lines);
ch=getch();
while(ch!='q'){
if (ch==32)
{
numStr--;
printRecord(numStr,w,lines);
}
if (numStr>=lines.size()){
break;}
ch=getch();
}
return 0;
}
void printRecord(int& numStr,struct winsize w,std::vector<std::string>& lines)
{
for (int i = numStr; i < numStr + w.winsize::ws_row-1; i++)
{
if (i>=lines.size())
break;
else
std::cout << lines[i] << std::endl;
}
numStr += w.winsize::ws_row;
}
int getch()
{
int ch;
struct termios oldt, newt;
tcgetattr( STDIN_FILENO, &oldt );
newt = oldt;
newt.c_lflag &= ~( ICANON | ECHO );
tcsetattr( STDIN_FILENO, TCSANOW, &newt );
ch = getchar();
tcsetattr( STDIN_FILENO, TCSANOW, &oldt );
return ch;
}

No, Linux doesn't "think for you" in this case. If your code prints out the contents of the file, as is, then the contents of the file get printed to standard output. If the terminal is not big enough to show the contents of the file, only whatever fits on the screen will remain at the end. As the contents of the file get printed, the initial parts of the file will briefly appear, before they scroll off the display.
The terminal is not going to automatically paginate your program's output for you. That's the real more's job. If you want to replicate more's functionality, you must do it yourself.
Your program can obtain the size of the terminal display by making the TIOCGWINSZ ioctl, as explained in the tty_ioctl(4) manual page, that you should read. After obtaining the terminal's size, it is going to be up to your program to calculate how much of the file's contents will fit on the screen, and to paginate it properly.
Things get complicated rather quickly, if you have to deal with multibyte UTF-8-encoded content, not to mention double-wide characters if your output contains certain Asian character sets. Computing the actual size of printed text, in the age of internationalization and localization, is surprisingly difficult. But, for plain Latin text, this should be sufficient. You should also set up a signal handler for SIGWINCH, as explained in that manual page, to detect changes to the terminal size, so that you can repaginate the contents of the file accordingly.
You can also consider using a higher-level library, like curses, which might be useful in this case.
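As a rough sketch of the pagination arithmetic described above: the helper names rowsNeeded and pageEnd are made up for illustration, and the code assumes plain single-byte text with no tabs, so that one byte occupies one column. It counts how many terminal rows each line takes once it wraps, and stops the page when the rows run out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: number of terminal rows a line of `length` columns
// occupies when the terminal wraps at `cols` columns.
std::size_t rowsNeeded(std::size_t length, std::size_t cols)
{
    assert(cols > 0);
    if (length == 0) return 1;          // an empty line still takes one row
    return (length + cols - 1) / cols;  // ceiling division
}

// Hypothetical helper: index one past the last line that fits on a page of
// `rows` rows, starting at line `first`. At least one line is always shown,
// even if it is taller than the page.
std::size_t pageEnd(const std::vector<std::string>& lines,
                    std::size_t first, std::size_t rows, std::size_t cols)
{
    std::size_t used = 0;
    std::size_t i = first;
    while (i < lines.size()) {
        const std::size_t need = rowsNeeded(lines[i].size(), cols);
        if (used + need > rows && i > first) break;  // next line would overflow
        used += need;
        ++i;
        if (used >= rows) break;                     // page is full
    }
    return i;
}
```

In the question's printRecord, the loop bound could then come from something like pageEnd(lines, numStr, w.ws_row - 1, w.ws_col) instead of a fixed count of lines.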

Related

Reading large strings in c++ [duplicate]

I am trying to read in a string whose length is on the order of 10^5. I get an incorrect string if its size grows beyond 4096.
I am using the following code:
string a;
cin>>a;
This didn't work, so I tried reading character by character with the following code:
unsigned char c;
vector<unsigned char> a;
while(count>0){
c = getchar();
a.push_back(c);
count--;
}
I have done the necessary escaping for using getchar, but this also had the 4096-byte problem. Can someone suggest a workaround or point to the correct way of reading it?
It is because your terminal inputs are buffered in the I/O queue of the kernel.
Input and output queues of a terminal device implement a form of buffering within the kernel independent of the buffering implemented by I/O streams.
The terminal input queue is also sometimes referred to as its typeahead buffer. It holds the characters that have been received from the terminal but not yet read by any process.
The size of the input queue is described by the MAX_INPUT and _POSIX_MAX_INPUT parameters.
By default, your terminal is in Canonical mode.
In canonical mode, all input stays in the queue until a newline character is received, so the terminal input queue can fill up when you type a very long line.
We can change the input mode of terminal from canonical mode to non-canonical mode.
You can do it from terminal:
$ stty -icanon (change the input mode to non-canonical)
$ ./a.out (run your program)
$ stty icanon (change it back to canonical)
Or you can do it programmatically.
To change the input mode programmatically, we have to use the low-level terminal interface.
So you can do something like:
#include <iostream>
#include <string>
#include <stdio.h>
#include <termios.h>
#include <unistd.h>
int clear_icanon(void)
{
struct termios settings;
int result;
result = tcgetattr (STDIN_FILENO, &settings);
if (result < 0)
{
perror ("error in tcgetattr");
return 0;
}
settings.c_lflag &= ~ICANON;
result = tcsetattr (STDIN_FILENO, TCSANOW, &settings);
if (result < 0)
{
perror ("error in tcsetattr");
return 0;
}
return 1;
}
int main()
{
clear_icanon(); // Changes terminal from canonical mode to non canonical mode.
std::string a;
std::cin >> a;
std::cout << a.length() << std::endl;
}
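Note that the program above leaves the terminal in non-canonical mode when it exits. One possible way to handle that is a small RAII wrapper that saves the settings and puts them back on destruction. This is only a sketch: the class name NonCanonicalGuard is made up here, and it deliberately does nothing when the descriptor is not a terminal (for example when input is redirected from a file or a pipe).

```cpp
#include <cassert>
#include <termios.h>
#include <unistd.h>

// Hypothetical RAII helper: switches `fd` to non-canonical mode and restores
// the saved settings when the object goes out of scope.
class NonCanonicalGuard {
public:
    explicit NonCanonicalGuard(int fd) : fd_(fd), active_(false)
    {
        assert(fd >= 0);
        if (!isatty(fd_)) return;                  // files and pipes have no terminal modes
        if (tcgetattr(fd_, &saved_) != 0) return;  // could not read current settings
        termios changed = saved_;
        changed.c_lflag &= ~static_cast<tcflag_t>(ICANON);
        active_ = (tcsetattr(fd_, TCSANOW, &changed) == 0);
    }
    ~NonCanonicalGuard()
    {
        if (active_) tcsetattr(fd_, TCSANOW, &saved_);  // put the terminal back
    }
    NonCanonicalGuard(const NonCanonicalGuard&) = delete;
    NonCanonicalGuard& operator=(const NonCanonicalGuard&) = delete;

    // True only if the mode was actually changed.
    bool active() const { return active_; }

private:
    int fd_;
    bool active_;
    termios saved_{};
};
```

In main, declaring NonCanonicalGuard guard(STDIN_FILENO); before the std::cin >> a; line would take the place of the clear_icanon() call.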
Using this test-program based on what you posted:
#include <iostream>
#include <string>
int main()
{
std::string a;
std::cin >> a;
std::cout << a.length() << std::endl;
}
I can do:
./a.out < fact100000.txt
and get the output:
456574
However, if I copy'n'paste from an editor to the console, it stops at 4095. I expect that's a limit somewhere in the console's copy'n'paste handling. The easy solution to that is of course to not use copy'n'paste, but redirect from a file. On some other systems, the restriction to 4KB of input may of course reside somewhere else. (Note that, at least on my system, I can happily copy and paste the 450KB of factorial result to another editor window, so on my system it's simply the console buffer that is the problem.)
This is much more likely to be a platform/OS problem than a C++ problem. What OS are you using, and what method are you using to get the string fed to stdin? It's pretty common for command-line arguments to be capped at a certain size.
In particular, given that you've tried reading one character at a time, and it still didn't work, this seems like a problem with getting the string to the program, rather than a C++ issue.

C++ Keylog Not Working Properly

I'm trying to make a simple keylogger in C++ (for learning only) and it's not quite working how I would like it to. My goal is to have it write to a txt. Here's the code I have so far:
#include <iostream>
#include <fstream>
#include <conio.h>
#define LOG(x) logger << x;
int main()
{
using std::ofstream;
using std::fstream;
ofstream logger("logger.txt", fstream::app);
char ascii;
bool typing;
for(;;)
{
if(_kbhit())
{
typing = true;
ascii = getch();
while(typing == true) //tried 'if', doesn't work
{
LOG(ascii);
std::cout << ascii << std::endl;
//typing = false;
//break
//tried using the above two and didn't work
}
}
else typing = false;
}
logger.close();
}
When I make while(typing == true) continuous, the key that is pressed continuously gets printed, but at least it actually gets saved to the txt. When I try to make the loop stop after one keyboard click, nothing gets saved to the txt.
So what am I doing wrong? Thanks for any help!
The variable typing is never set to false, so it stays true and your loop continues. The following code works:
#include <fstream>
#include <conio.h>
int main()
{
std::ofstream logger("logger.txt", std::fstream::app);
for(char ascii = 0; ascii != 3;) // initialised so the first comparison is well-defined
{
ascii = getche();
logger << ascii;
}
return 0;
}
getche() prints the character typed, and 3 is the ASCII code for Ctrl+C. This logs all characters, even non-printable ones.
A few comments on your code:
Don't use macros (#define) unless you are substituting a large amount of code and using it often, or plan on changing what something does.
You use loops and variables where you don't need to. getch and related functions wait for input.
logger.close() is automatically done when logger goes out of scope and is destructed.
return 0 should be at the end of main. It's not strictly necessary, since it is put in automatically, but it returns the exit status to the OS and is worth having for clarity.
I personally don't use using statements. Just write out the namespace, it helps avoid collisions. That's why it's in a namespace.

CMD Prompt C++: Limiting literals entered on screen

I hope the question isn't too ambiguous.
When I ask:
int main()
{
string name = {""};
cout << "Please enter a name: " << endl;
getline(cin, name);
//user enters 12 characters stop displaying next literal keypresses.
}
I would like to be able to limit the number of characters the user can enter on screen. For example, can the screen stop displaying characters after length 12?
If so, what would be the library and call for doing something like this?
I want to do this because I have ASCII art drawn in the CMD window, and when I cout the statement at x,y, any input over 12 characters long draws over the ASCII art.
I hope this makes sense :'{ Thank you!
By default the console is in cooked mode (canonical mode, line mode, ...). This means that the console driver buffers data before it hands it to your application, and that characters are automatically echoed back to the console by the console driver.
Normally, this means that your program only ever gets hold of the input after a line ends, i.e. when enter is pressed. Because of the auto-echo, those characters are then already on screen.
Both settings can be changed independently, however the mechanism is --unfortunately-- an OS-specific call:
For Windows it's SetConsoleMode():
HANDLE hStdin = GetStdHandle(STD_INPUT_HANDLE);
DWORD mode = 0;
// get chars immediately
GetConsoleMode(hStdin, &mode);
SetConsoleMode(hStdin, mode & ~ENABLE_LINE_INPUT);
// disable input echo; do this after the 12th char
GetConsoleMode(hStdin, &mode);
SetConsoleMode(hStdin, mode & ~ENABLE_ECHO_INPUT);
As noted by yourself, Windows still provides conio.h including a non-echoing _getch() (with underscore, nowadays). You can always use that and manually echo the characters. _getch() simply wraps the console line mode on/off, echo on/off switch into a function.
Edit: There is meant to be an example of the use of _getch() here. I'm a little too busy to get it done properly, so I refrained from posting potentially buggy code.
Under *nix you will most likely want to use curses/termcap/terminfo. If you want a leaner approach, the low level routines are documented in termios/tty_ioctl:
#include <sys/types.h>
#include <termios.h>
#include <unistd.h> // STDIN_FILENO
struct termios tcattr;
// enable non-canonical mode, get raw chars as they are generated
tcgetattr(STDIN_FILENO, &tcattr);
tcattr.c_lflag &= ~ICANON;
tcsetattr(STDIN_FILENO, TCSAFLUSH, &tcattr);
// disable echo
tcgetattr(STDIN_FILENO, &tcattr);
tcattr.c_lflag &= ~ECHO;
tcsetattr(STDIN_FILENO, TCSAFLUSH, &tcattr);
You can use scanf("%c",&character) on a loop from 1 to 12 and append them to a pre-allocated buffer.
As in my comments, I mentioned a method I figured out using _getch(); and
displaying each char manually.
simplified version:
#include <iostream>
#include <string>
#include <conio.h>
using namespace std;
string name = "";
int main()
{
char temp;
cout << "Enter a string: ";
for (int i = 0; i < 12; i++) { //Replace 12 with character limit you want
temp = _getch();
name += temp;
cout << temp;
}
system("PAUSE");
}
This lets you cout each key-press as its pressed,
while concatenating each character pressed to a string called name.
Then later on in what ever program you use this in, you can display the full name as a single string type.
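The loop above always reads exactly 12 keys and does not handle backspace. A slightly fuller sketch of the same manual-echo idea keeps the decision in one small function, so it works with whichever key-reading call is used (_getch() on Windows, or getchar() with ICANON and ECHO cleared under *nix). The function name applyKey is made up for illustration; it returns the text that should be written to the screen for a given key.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: apply one key press to `buffer`, accepting at most
// `limit` characters. Returns what should be echoed to the screen
// (an empty string when the key is ignored).
std::string applyKey(std::string& buffer, char key, std::size_t limit)
{
    assert(limit > 0);
    if (key == '\b' || key == 127) {       // backspace or DEL
        if (buffer.empty()) return "";
        buffer.pop_back();
        return "\b \b";                    // step back, blank the cell, step back
    }
    if (static_cast<unsigned char>(key) < 32) return "";  // ignore control keys
    if (buffer.size() >= limit) return "";                // limit reached: no echo
    buffer.push_back(key);
    return std::string(1, key);
}
```

A caller would loop until Enter ('\r' from _getch(), '\n' from getchar()), doing cout << applyKey(name, key, 12) << flush; for every other key, so nothing past the twelfth character ever reaches the screen.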

call gnuplot via fork and pipe and update plot

I want to do some real-time plots during a simulation. For this, I would like to use octave or gnuplot. My current approach is to use a frontend to gnuplot, feedgnuplot, which actually fits very well.
The simulation is written in C++, so I thought about forking (a new process for feedgnuplot) and piping the relevant data to feedgnuplot.
The problem I have is that the output is only visible after the simulation.
But I want to see the plot updated during the simulation.
Here is a MWE:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
int main()
{
FILE* stream = popen("feedgnuplot", "w");
for(int i = 0; i < 10; ++i)
{
fprintf(stream, "%d\n", i * i);
fflush(stream);
sleep(1);
}
pclose(stream); // close the pipe and wait for feedgnuplot to exit
}
The program stops after 10 seconds, showing the plot.
When using feedgnuplot directly in the shell, everything works as expected.
(That is, newly added data is plotted without the need to end the process.)
What am I doing wrong? I think I lack some understanding of how popen really works.
First, let's write a fake feedgnuplot.c:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char *buf = NULL;
size_t n = 0;
while (getline (&buf, &n, stdin) != -1) {
printf ("%s", buf);
}
free (buf);
return 0;
}
With this, your code works, i.e. the lines are printed as they arrive.
I suspect the problem lies in the way your feedgnuplot program reads incoming data. You should show the relevant part of that code.
If I had to take a guess, you probably need to add
setvbuf (stdin, NULL, _IOLBF, 0);
in feedgnuplot before you start to read from stdin.
That is because by default, when stdin corresponds to a terminal it is line buffered, whereas when it corresponds to a pipe it is fully buffered. The code above makes stdin line buffered no matter what so there should be no difference between reading from a terminal or a pipe.
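To make the suggestion concrete, here is a sketch of the fake reader with the setvbuf call added. It assumes the reading side is a C or C++ program using stdio; the function names makeLineBuffered and echoLines are made up for illustration.

```cpp
#include <cassert>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: request line buffering on `stream`. It must be called
// before any other operation on the stream. Returns true on success.
bool makeLineBuffered(FILE* stream)
{
    assert(stream != nullptr);
    return setvbuf(stream, nullptr, _IOLBF, 0) == 0;
}

// Hypothetical helper: copy `in` to `out` one line at a time, flushing after
// every line so each line is passed on as soon as it has been read.
// Returns the number of lines copied.
long echoLines(FILE* in, FILE* out)
{
    assert(in != nullptr && out != nullptr);
    char* buf = nullptr;
    size_t cap = 0;
    long count = 0;
    while (getline(&buf, &cap, in) != -1) {  // POSIX getline, as in the fake reader
        fputs(buf, out);
        fflush(out);
        ++count;
    }
    free(buf);
    return count;
}
```

The reader's main would call makeLineBuffered(stdin) first and then echoLines(stdin, stdout).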

loop to discard spurious variable input from stdin, Linux C++

I am trying to write a c++ program for my linux machine that can interact with some instrumentation that responds to simple ascii commands. The problem I'm running into, I would think, would be a fairly common request but my searches of various forums came up with nothing quite the same.
My problem is this: When I connect to the instrument, due to some communication issues, it often pukes up a bunch of data of varying length that I don't want. The data the machine prints has line endings with '\r'. I have been trying to write a simple loop that will keep reading and ignoring data until the machine is quiet for two seconds, then carry on to perform some data requests once the storm is over.
When searching forums, I found gobs and gobs of threads about cin.ignore, cin.sync, getline and cin.getline. These all seemed quite useful but when I attempted to implement them in a way that should be simple, they never behaved quite as I expected them to.
I apologize in advance if this is a duplicate post as I would have thought I wasn't the first person to want to throw away garbage input but I have found no such post.
The code I have been trying a few different arrangements of looks something like this:
sleep(2);
cin.clear();
while ( cin.peek() != char_traits<char>::eof()) {
//cin.sync();
//cin.ignore(numeric_limits<streamsize>::max(),char_traits<char>::eof());
cin.clear();
char tmp[80]; // must be at least as large as the count passed to getline
while ( cin.getline(tmp,80,'\r') ) {}
cin.clear();
sleep(2);
}
I understand from my searches that doing some sort of while(!cin.eof()) is bad practice but tried it anyway for grins as well as while(getline(cin,str,'\r')) and while(cin.ignore()). I am at a loss here as there is clearly something I'm missing.
Thoughts?
EDIT: --final code--
Alright! This did it! Thanks for pointing me to termios, @MatsPetersson! I wound up stealing quite a lot of your code, but I'm glad I had the opportunity to figure out what was going on. This website helped me make sense of the tcsetattr manual page: http://en.wikibooks.org/wiki/Serial_Programming/termios
#include <cstdlib>
#include <iostream>
#include <stdio.h>
#include <unistd.h>
#include <limits>
#include <termios.h>
#include <errno.h>
#include <cassert>
using namespace std;
const int STDIN_HANDLE=fileno(stdin);
int main()
{
string str;
//Configuring terminal behavior
termios tios, original;
assert( tcgetattr(STDIN_HANDLE, &tios)==0 );
original = tios;
tios.c_lflag &= ~ICANON; // Don't read a whole line at a time.
tios.c_cc[VTIME] = 20; // 2 second timeout (VTIME counts tenths of a second).
tios.c_cc[VMIN] = 0; // Return as soon as any data is available, or on timeout.
assert( tcsetattr(STDIN_HANDLE, TCSAFLUSH, &tios)==0 );
const int size=999; //numeric_limits<streamsize>::max() turns out to be too big.
char tmp[size];
int res;
cerr << "---------------STDIN_HANDLE=" << STDIN_HANDLE << endl;
cerr << "---------------enter loop" << endl;
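while ( (res = read(STDIN_HANDLE, tmp, sizeof(tmp) - 1)) > 0 ) { // stop on timeout (0) or error (-1)
tmp[res] = '\0'; // read() does not null-terminate the buffer
cerr << "----read: " << tmp << endl;
}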
cerr << "--------------exit loop" << endl;
cout << "END";
assert( tcsetattr(STDIN_HANDLE, TCSANOW, &original)==0 );
return 0;
}
That wasn't as bad as I began to fear it would be! Works perfectly! Obviously all the cerr << -- lines are not necessary. As well as some of the #include's but I'll use them in the full program so I left them in for my own purposes.
Well... It mostly works anyway. It works fine so long as I don't redirect the stdio for the program to a tcp-ip address using socat. Then it gives me a "Not a Typewriter" error which is what I guess happens when it attempts to control something that isn't a tty. That sounds like a different question though, so I'll have to leave it here and start again I guess.
Thanks folks!
Here's a quick sample of how to do console input (and can easily be adapted to do input from another input source, such as a serial port).
Note that it's hard to "type fast enough" for this to read more than one character at a time, but if you copy'n'paste, it will indeed read 256 characters at once, so assuming your machine that you are connecting to is indeed feeding out a large amount of stuff, it should work just fine to read large-ish chunks - I tested it by marking a region in one window, and middle-button-clicking in the window running this code.
I have added SOME comments, but for FULL details, you need to do man tcsetattr - there are a whole lot of settings that may or may not help you. This is configured to read data of "any" kind, and exit if you hit escape (it also exits if you hit an arrow key or similar, because those translate to an ESC-something sequence, and thus will trigger the "exit" functionality). It's a good idea not to crash out of the program, or else to set up some handler that restores the terminal behaviour: if you do accidentally exit before you've restored the original settings, the console will act a tad weird.
#include <termios.h>
#include <unistd.h>
#include <cassert>
#include <iostream>
const int STDIN_HANDLE = 0;
int main()
{
termios tios, original;
int status;
status = tcgetattr(STDIN_HANDLE, &tios);
assert(status >= 0);
original = tios;
// Set some input flags
tios.c_iflag &= ~IXOFF; // Turn off XON/XOFF...
tios.c_iflag &= ~INLCR; // Don't translate NL to CR.
// Set some output flags
// tios.c_oflag = ... // not needed, I think.
// Local modes flags.
tios.c_lflag &= ~ISIG; // Don't signal on CTRL-C, CTRL-Z, etc.
tios.c_lflag &= ~ICANON; // Don't read a whole line at a time.
tios.c_lflag &= ~(ECHO | ECHOE | ECHOK); // Don't show the input.
// Set some other parameters
tios.c_cc[VTIME] = 5; // 0.5 second timeout.
tios.c_cc[VMIN] = 0; // Return as soon as any data is available, or on timeout.
status = tcsetattr(STDIN_HANDLE, TCSANOW, &tios);
assert(status >= 0);
char buffer[256];
int tocount = 0;
for(;;)
{
int count = read(STDIN_HANDLE, buffer, sizeof(buffer));
if (count < 0)
{
std::cout << "Error..." << std::endl;
break;
}
if (count == 0)
{
// No input for VTIME * 0.1s.
tocount++;
if (tocount > 5)
{
std::cout << "Hmmm. No input for a bit..." << std::endl;
tocount = 0;
}
}
else
{
tocount = 0;
if (buffer[0]== 27) // Escape
{
break;
}
for(int i = 0; i < count; i++)
{
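std::cout << std::hex << static_cast<unsigned>(static_cast<unsigned char>(buffer[i])) << " "; // avoid sign extension for bytes >= 0x80
if (i % 16 == 15) // line break after every 16 bytes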
{
std::cout << std::endl;
}
}
std::cout << std::endl;
}
}
status = tcsetattr(STDIN_HANDLE, TCSANOW, &original);
return 0;
}
If your instrumentation offers a stream interface, and assuming that it would wait before returning whenever no input is available, I'd suggest to simply use :
cin.ignore(numeric_limits<streamsize>::max(),'\r'); // ignore everything until '\r'
Another alternative could be to use poll, which provides a mechanism for multiplexing (and waiting for) input/output over a set of file descriptors. This has the advantage of letting you read several instrumentation devices if you'd need.
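A minimal sketch of the poll approach, assuming the instrument is reachable through a plain file descriptor: the function name drainUntilQuiet is made up for illustration. It keeps reading and throwing away data until nothing has arrived for the given number of milliseconds. A real program would probably also want to retry when poll is interrupted by a signal, which this sketch treats as an error.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: read and discard bytes from `fd` until no data has
// arrived for `quietMs` milliseconds. Returns the number of bytes discarded,
// or -1 on error.
long drainUntilQuiet(int fd, int quietMs)
{
    assert(quietMs >= 0);
    long discarded = 0;
    char buf[256];
    for (;;) {
        pollfd pfd{};
        pfd.fd = fd;
        pfd.events = POLLIN;
        const int ready = poll(&pfd, 1, quietMs);
        if (ready < 0) return -1;                       // poll failed
        if (ready == 0) return discarded;               // quiet for quietMs
        if (!(pfd.revents & POLLIN)) return discarded;  // hang-up or error, no data
        const ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0) return -1;                           // read failed
        if (n == 0) return discarded;                   // end of file
        discarded += n;
    }
}
```

For the two-second requirement in the question, the call would be drainUntilQuiet(STDIN_FILENO, 2000). Unlike the termios approach, this does not need the descriptor to be a terminal, so it should also work when the input is a socket or a pipe.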