Split a Large File in C++

I'm trying to write a program that takes a large file (of any type) and splits it into many smaller "chunks". I think I have the basic idea down, but for some reason I cannot create a chunk larger than 12 KB. I know there are a few solutions on Google, etc., but I am more interested in learning where this limitation comes from than in actually using the program to split files.
//This program splits a larger file into smaller files of a user-specified size.
#include<iostream>
#include<fstream>
#include<string>
#include<sstream>
#include <direct.h>
#include <stdlib.h>
using namespace std;
void GetCurrentPath(char* buffer)
{
_getcwd(buffer, _MAX_PATH);
}
int main()
{
// use the function to get the path
char CurrentPath[_MAX_PATH];
GetCurrentPath(CurrentPath);//Get the current directory (used for displaying output)
fstream bigFile;
string filename;
int partsize;
cout << "Enter a file name: ";
cin >> filename; //Receive target file
cout << "Enter the number of bytes in each smaller file: ";
cin >> partsize; //Receive volume size
bigFile.open(filename.c_str(),ios::in | ios::binary);
bigFile.seekg(0, ios::end); // position get-ptr 0 bytes from end
int size = bigFile.tellg(); // get-ptr position is now same as file size
bigFile.seekg(0, ios::beg); // position get-ptr 0 bytes from beginning
for (int i = 0; i <= (size / partsize); i++)
{
//Build File Name
string partname = filename; //The original filename
string charnum; //archive number
stringstream out; //stringstream object out, used to build the archive name
out << "." << i;
charnum = out.str();
partname.append(charnum); //put the part name together
//Write new file part
fstream filePart;
filePart.open(partname.c_str(),ios::out | ios::binary); //Open new file with the name built above
//Check if near the end of file
if (bigFile.tellg() < (size - (size%partsize)))
{
filePart.write(reinterpret_cast<char *>(&bigFile),partsize); //Write the selected amount to the file
filePart.close(); //close file
bigFile.seekg(partsize, ios::cur); //move pointer to next position to be written
}
//Changes the size of the last volume because it is the end of the file
else
{
filePart.write(reinterpret_cast<char *>(&bigFile),(size%partsize)); //Write the selected amount to the file
filePart.close(); //close file
}
cout << "File " << CurrentPath << partname << " produced" << endl; //display the progress of the split
}
bigFile.close();
cout << "Split Complete." << endl;
return 0;
}
Any ideas?

You are writing to the split file, but never reading from bigFile. What you are writing is the in-memory structure of the bigFile stream object, not the contents of the file it refers to. You need to allocate a buffer, read into it from bigFile, and then write that buffer to the split file(s).
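A minimal sketch of that read-then-write loop, assuming a hypothetical helper named split_file and parts named filename.0, filename.1, and so on, as in the question. The 64 KiB copy buffer is an arbitrary choice for illustration, not something from the original post.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `filename` into filename.0, filename.1, ...
// Returns the number of parts written, or -1 if a file cannot be opened
// or part_size is zero.
int split_file(const std::string& filename, std::size_t part_size)
{
    if (part_size == 0) return -1;
    std::ifstream big(filename, std::ios::binary);
    if (!big) return -1;

    // Copy through a bounded buffer so that a huge part size does not
    // require an equally huge allocation.
    std::vector<char> buffer(std::min<std::size_t>(part_size, 64 * 1024));
    int parts = 0;

    // peek() returns eof once no bytes remain, which ends the loop.
    while (big.peek() != std::ifstream::traits_type::eof()) {
        std::ofstream part(filename + "." + std::to_string(parts),
                           std::ios::binary);
        if (!part) return -1;

        std::size_t remaining = part_size;
        while (remaining > 0) {
            const std::size_t want = std::min(remaining, buffer.size());
            big.read(buffer.data(), static_cast<std::streamsize>(want));
            const std::streamsize got = big.gcount();  // bytes actually read
            if (got <= 0) break;                       // input exhausted
            part.write(buffer.data(), got);
            remaining -= static_cast<std::size_t>(got);
        }
        ++parts;
    }
    return parts;
}
```

Calling split_file("archive.bin", 1000000) would then produce archive.bin.0, archive.bin.1, and so on. Because the data goes through read() and gcount(), the last part simply ends up shorter and needs no special case.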

Related

How would I go about storing words from a plain text file to an array using C++?

I've been tasked with writing a C++ program that opens a text file, determines the length of each word, then produces output stating how many times a particular word length occurs.
I've figured how to open and read the contents of the file.
How would I take each word and store them in an array?
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
void getFile(string);
void getFile(string filename)
{
string array[2];
short loop = 0;
string line;
ifstream myfile (filename);
if (myfile.is_open())
{
while (!myfile.eof() )
{
getline (myfile,line);
array[loop] = line;
cout << array[loop] << endl;
loop++;
}
myfile.close();
}
else{
cout << "can't open the file";
system("PAUSE");
}
}
int main(){
string fileName;
while (true){
cout << "\nEnter the name of a file: ";
getline(cin, fileName);
if (fileName == ""){
cout << "Invalid file name, enter another!!!"<<endl;
main();
}
else{
getFile(fileName);
}
}
return 0;
}
You do not need to store the words in an array.
You only need to store the word lengths and how often each of them occurred.
If you have a guaranteed and low maximum word length, you can simplify further by using an array where the length of the current word is used as an index. Initialize all entries to 0, then increment the entry whose index matches each word's length.

Way to deal with multiple user arguments in C++?

OK, I've progressed a bit from my previous work, but I can't get past this issue, which probably isn't that big. My task, which is basically a text file editor, requires multiple user arguments after the program begins running, ranging from one, such as "details", to three or four with arbitrary user input, such as display x y or savepart x y (see attached image). My code currently takes in the line the user inputs using cin and getline, but it can only recognize known strings such as load userinput.txt (which is the filename) and quit, and I don't know how to store the values the user enters in variables. How do I solve this?
What probably needs to change is the getline under the while loop or what's written in the if statements, but I've tried everything.
#include <iostream>
#include <fstream>
#include <string>
#include <stdio.h>
using namespace std;
string secondword;
string fword;
string line;
ifstream myfile;
fstream fFile;
bool running = true;
int x;
int length;
int main(int argc, char* argv[]) {
while (running) {
getline(cin, fword); //Reads first line user inputs
//if the input equals the load command, load the file into memory
if (fword == "load userinput.txt") {
ifstream myfile("userinput.txt", ifstream::binary);
myfile.seekg(0, myfile.end); //Searches till end of file
length = myfile.tellg(); //Sets length as total number of chars in file
myfile.seekg(0, myfile.beg); //Searches from the beginning
char* buffer = new char[length]; //Allocates file memory to pointer
cout << "Reading " << length << " characters... " << endl;
myfile.read(buffer,length); //Reads length bytes from the file into buffer
if (myfile)
cout << "All characters read and stored in memory successfully" << endl;
else
cout << "error: only " << myfile.gcount() << " can be read" << endl;
myfile.close();
}
//Quit
if (fword == "quit") {
running = false; //Breaks while loop statement
cout << "User has quit the program" << endl;
}
//Clear all lines in text
if (fword == "clear all") {
fFile.open("userinput.txt", ios::out | ios::trunc);
fFile.close();
cout << "All lines have been cleared" << endl;
}
//Display All Text
if (fword == "display text") {
myfile.open("userinput.txt", ios::in);
while (getline(myfile, line)) { //read data from file object and put it into string.
cout << line << endl; //print the data of the string
}
}
}
return 0;
}
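One possible approach to the question above, sketched under assumptions: split each input line into a command word and its arguments with std::istringstream, rather than comparing the whole line against fixed strings. The Command struct and parse_command function are hypothetical names, not part of the original program.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical representation of one parsed input line.
struct Command {
    std::string name;               // first word, e.g. "display"
    std::vector<std::string> args;  // remaining words, e.g. {"2", "5"}
};

Command parse_command(const std::string& line)
{
    Command cmd;
    std::istringstream in(line);
    in >> cmd.name;                 // stays empty for a blank line
    std::string arg;
    while (in >> arg) {             // collect every remaining word
        cmd.args.push_back(arg);
    }
    return cmd;
}
```

Inside the while loop, the line read by getline would be passed to parse_command, and the program would then branch on cmd.name and cmd.args.size(), for example treating "load" with one argument as a filename. Numeric arguments such as x and y can be converted with std::stoi, which throws std::invalid_argument when the text is not a number.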

Read from binary file > data into class object, missing data

I'm currently converting a program that reads an ASCII file so that it reads a binary file instead. It has to put the data into objects named Arrangements and Customers. However, something is missing: it does not read the first line of the file.
Everything else seems to be read correctly, except that the first line of the file is skipped and five zeros appear at the end, where there should be nothing. In other words, it is not reading everything. Have I made any big mistakes? In case it matters, the link below shows what readTicketsFromFile looked like when it was written to read the ASCII file.
Here's the code when reading from ASCII: https://pastebin.com/WswHzpxM
The file format is (10 lines total):
< nr (2) > < amount tickets(4) > < price(3) > < title(30) >
Here's what I tried to make, "translating" the code to read the same file in binary (the file has, of course, been converted to binary):
#include <fstream> // ifstream, ofstream
#include <iostream> // cout
#include <cstring> // strcpy
#include <cstdlib> // (s)rand
using namespace std;
const int ARRLEN = 35;
// CLASSES
class Arrangement {
private:
char title[ARRLEN]; // Arrangement title/name
int price; // Price pr. ticket
int amountSpaces; // Amounts of spaces/tickets left
int amountSold; // Amounts of tickets sold so far
int totalTicketsOrdered; // Total amount of tickets
int totalCustomersOrdered; // Amount of customers wanting a ticket
public:
Arrangement();
void update(char titt[], int ant, int pri);
void updateAmountWishedTickets(int antBill);
void updateAmountSold(int ant);
bool needPull();
int amountLeft();
int amountOrdered();
void write()
{
cout
<< '\n' << price << "\t"
<< amountSpaces << "\t"
<< amountSold << "\t"
<< totalTicketsOrdered
<< "\t" << totalCustomersOrdered
<< '\t' << title;
}
};
// READ FROM FILE 1
void readTicketsFromFile() {
char title[ARRLEN];
int nr, amount, price; // Variables for the first 3 fields of a record
int size = 0; // Number of records/lines in the file.
ifstream infile;
infile.open("tickets.res", ios::in | ios::binary);
if (infile.is_open()) {
infile.seekg(0, ios::end);
size = (int)infile.tellg() / sizeof(Arrangement);
cout << "\n# in file: " << size << endl; // Prints out: 10
// Read from file, supposed to read everything from the file
for (int i = 1; i <= size; i++) {
infile.seekg(i * sizeof(Arrangement));
infile.read((char *)& arrangementer[i], sizeof(Arrangement));
cout << '\n' << i << " read: \n";
arrangementer[i].write(); // Used to see what's inside the object; the 1st one is missing.
}
}
else cout << "Couldn't open the file";
infile.close();
}
int main() {
readTicketsFromFile();
return 0;
}
What it should look like:
https://i.gyazo.com/69c5560766c33ebc15cac25e13b4de72.png
What it looks like:
https://i.gyazo.com/a9081f65ff629f3b57cb2c20750087b5.png
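One thing that stands out in the loop above: it starts at i = 1 and seeks to i * sizeof(Arrangement), so the record at offset 0 is never visited, and the final iteration seeks to the end of the file, where nothing is left to read. That may explain both the missing first record and the zeros at the end. A minimal zero-based sketch follows, assuming a hypothetical trivially copyable Record struct in place of Arrangement and a file that was written as raw records of exactly that layout.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical fixed-size record standing in for Arrangement; it must be
// trivially copyable for a raw read() into it to be meaningful.
struct Record {
    char title[35];
    int price;
    int amount;
};

std::vector<Record> read_records(const std::string& path)
{
    std::vector<Record> records;
    std::ifstream in(path, std::ios::binary);
    if (!in) return records;

    in.seekg(0, std::ios::end);
    const std::streamoff bytes = in.tellg();
    in.seekg(0, std::ios::beg);   // rewind: the first record sits at offset 0
    if (bytes <= 0) return records;

    const std::size_t count = static_cast<std::size_t>(bytes) / sizeof(Record);
    records.resize(count);
    // Records are contiguous, so sequential reads need no per-record seekg.
    for (std::size_t i = 0; i < count; ++i) {
        if (!in.read(reinterpret_cast<char*>(&records[i]), sizeof(Record))) {
            records.resize(i);    // keep only the records that were fully read
            break;
        }
    }
    return records;
}
```

This only works when the file was produced by writing the same struct with write() on the same platform. A text file in the < nr > < amount > < price > < title > format still has to be parsed field by field.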

How to read a binary files and print 3's on screen?

I want to read a binary file of integers and print how many times the number 3 occurs in it. I managed to write a program that opens and reads a binary file.
Here are the two problems I am facing:
If I try to print the file on my terminal, the execution continues forever and the loop never ends.
I have no idea how to filter out the 3's.
Here is my code:
#include <iostream>
#include <fstream>
using namespace std;
int main () {
streampos size;
char * memblock;
ifstream file ("threesData.bin", ios::in|ios::binary|ios::ate);
if (file.is_open())
{
size = file.tellg();
memblock = new char [size];
file.seekg (0, ios::beg);
file.read (memblock, size);
file.close();
cout << "the entire file content is in memory";
for (int i = 0; i < size; i += sizeof(int))
{
cout << *(int*)&memblock[i] << endl;
}
delete[] memblock;
}
else
cout << "Unable to open file";
return 0;
}
Here is a way to implement your requirements:
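#include <fstream>
#include <iostream>
using namespace std;
int main()
{
    unsigned int quantity = 0U;
    // No ios::ate here: reading has to start at the beginning of the file.
    ifstream file ("threesData.bin", ios::in|ios::binary);
    int value;
    // Unformatted read of one int at a time (assumes the file holds native ints);
    // operator>> would parse text and skip whitespace bytes.
    while (file.read(reinterpret_cast<char*>(&value), sizeof(value)))
    {
        if (value == 3)
        {
            ++quantity;
        }
    }
    cout << "The quantity of 3s is: " << quantity << endl;
    return 0;
}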
The first step should always be to get a simple version working. Only optimize if necessary.
Allocating memory for a file and reading the entire file at once is an optimization, and it can backfire: your platform may not have enough available memory to hold the entire file before processing it.

Checksums, Data Integrity

The pseudocode for this assignment is essentially:
1. Open the specified file in binary mode
2. Save the file name in the fileNames array.
3. Determine the file size using seekg and tellg
4. Read the file contents into the character array in one statement
5. Close the file
6. Loop through the array, one character at a time and accumulate the sum of each byte
7. Store the sum into the checkSums array.
#include <iostream>
#include <string>
#include <iomanip>
#include <fstream>
#include <cstring>
using namespace std;
int main()
{
//declare variables
string filePath;
void savefile();
char choice;
int i, a, b, sum;
sum = 0;
a = 0;
b = 0;
ifstream inFile;
//arrays
const int SUM_ARR_SZ = 100;
string fileNames[SUM_ARR_SZ];
unsigned int checkSums[SUM_ARR_SZ];
do {
cout << "Please select: " << endl;
cout << " A) Compute checksum of specified file" << endl;
cout << " B) Verify integrity of specified file" << endl;
cout << " Q) Quit" << endl;
cin >> choice;
if (choice == 'a' || choice == 'A')
{
//open file in binary mode
cout << "Specify the file path: " << endl;
cin >> filePath;
inFile.open(filePath.c_str(), ios::binary);
//save file name
fileNames[a] = filePath;
a++;
//use seekg and tellg to determine file size
char Arr[100000];
inFile.seekg(0, ios_base::end);
int fileLen = inFile.tellg();
inFile.seekg(0, ios_base::beg);
inFile.read(Arr, fileLen);
inFile.close();
for (i = 0; i < 100000; i++)
{
sum += Arr[i];
}
//store the sum into checkSums array
checkSums[b] = sum;
b++;
cout << " File checksum = " << sum << endl;
}
if (choice == 'b' || choice == 'B')
{
cout << "Specify the file path: " << endl;
cin >> filePath;
if (strcmp(filePath.c_str(), fileNames[a].c_str()) == 0)
{
}
}
} while (choice != 'q' && choice != 'Q');
system("pause");
}
I'm getting values like "-540000" and I'm not sure how to fix this. Any help is greatly appreciated!
You're creating an array on the stack without zeroing its contents, so Arr will contain "garbage" data wherever the file did not overwrite it.
You're creating the buffer with a fixed size, which means you're wasting space if the file is smaller than 100,000 bytes, and you can't process a file that's larger than 100,000 bytes (without reusing the buffer).
You iterate over every byte in the buffer, instead of only those bytes that represent the file, when the file is smaller than 100,000 bytes.
I also note you're mixing C and C++ string functions. You don't need C's strcmp when you're using std::string; use operator== or string::compare instead.
C++ does not require local variables to be declared up front. Your code would be cleaner if you declared each local variable where it's first used, instead of all at once.
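Putting those points together, here is a minimal sketch of the checksum step, assuming a hypothetical helper named file_checksum. It sizes the buffer to the file and sums only the bytes that were read. It also converts each byte to unsigned char before adding, because plain char may be signed, which is another way a byte sum can go negative.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: sums every byte of the file as an unsigned value.
// Returns false if the file cannot be opened or read.
bool file_checksum(const std::string& path, unsigned int& sum)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    in.seekg(0, std::ios::end);
    const std::streamoff len = in.tellg();
    in.seekg(0, std::ios::beg);
    if (len < 0) return false;

    // Buffer sized to the file, so no uninitialized bytes are ever summed.
    std::vector<char> data(static_cast<std::size_t>(len));
    if (len > 0 && !in.read(data.data(), len)) return false;

    sum = 0;   // start fresh for every file
    for (char c : data) {
        // Plain char may be signed; going through unsigned char makes
        // bytes above 127 add 128..255 instead of negative values.
        sum += static_cast<unsigned char>(c);
    }
    return true;
}
```

In the menu loop, option A would call file_checksum(filePath, checkSums[b]) and report an error when it returns false.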