C++ reading from file puts three weird characters

When I read from a file string by string, the >> operation reads the first string, but it starts with "ï»¿i". For example, if the first string is "street", it is read as "ï»¿istreet".
The other strings are fine. I tried different txt files and the result is the same: the first string always starts with "ï»¿i". What is the problem?
Here is my code:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
int cube(int x){ return (x*x*x);}
int main(){
int maxChar;
int lineLength=0;
int cost=0;
cout<<"Enter the max char per line... : ";
cin>>maxChar;
cout<<endl<<"Max char per line is : "<<maxChar<<endl;
fstream inFile("bla.txt",ios::in);
if (!inFile) {
cerr << "Unable to open file bla.txt";
exit(1); // call system to stop
}
string word;
while(inFile >> word) { // stop when a read fails, rather than testing eof() before reading
cout<<word<<endl;
cout<<word.length()<<endl;
if(word.length()+lineLength<=maxChar){
lineLength +=(word.length()+1);
}
else {
cost+=cube(maxChar-(lineLength-1));
lineLength=(word.length()+1);
}
}
}

You're seeing a UTF-8 Byte Order Mark (BOM). It was added by the application that created the file.
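To detect and ignore the marker you could try this (untested) function:
bool SkipBOM(std::istream & in)
{
char test[4] = {0};
in.read(test, 3);
if (strcmp(test, "\xEF\xBB\xBF") == 0) // strcmp needs #include <cstring>
return true;
in.clear(); // a file shorter than 3 bytes leaves the stream in a failed state
in.seekg(0);
return false;
}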

With reference to the excellent answer by Mark Ransom above, this code skips the BOM (Byte Order Mark) on an existing stream. Call it right after opening a file.
// Skips the Byte Order Mark (BOM) that marks some text files as UTF-8.
void SkipBOM(std::ifstream &in)
{
char test[3] = {0};
in.read(test, 3);
if ((unsigned char)test[0] == 0xEF &&
(unsigned char)test[1] == 0xBB &&
(unsigned char)test[2] == 0xBF)
{
return;
}
in.clear(); // reset the stream state in case the file was shorter than 3 bytes
in.seekg(0);
}
To use:
ifstream in(path);
SkipBOM(in);
string line;
while (getline(in, line))
{
// Process lines of input here.
}
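
A slightly more defensive variant of the two functions above, as a sketch only. It accepts any std::istream, checks how many bytes were actually read, and clears the stream state before rewinding so that files shorter than three bytes stay readable. The names skipUtf8Bom and firstWordWithoutBom are illustrative and do not come from either answer; the sketch assumes the stream is seekable and positioned at its beginning.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (illustrative name): consumes a leading UTF-8 BOM
// (bytes EF BB BF) if present. Returns true when a BOM was skipped.
// Assumes `in` is seekable and positioned at its beginning.
bool skipUtf8Bom(std::istream& in)
{
    char head[3] = {0, 0, 0};
    in.read(head, 3);
    const bool hasBom = in.gcount() == 3 &&
                        static_cast<unsigned char>(head[0]) == 0xEF &&
                        static_cast<unsigned char>(head[1]) == 0xBB &&
                        static_cast<unsigned char>(head[2]) == 0xBF;
    if (!hasBom) {
        in.clear();   // a short read sets eofbit/failbit; reset before seeking
        in.seekg(0);  // rewind so that no real data is lost
    }
    return hasBom;
}

// Small demonstration on an in-memory stream instead of a file:
// returns the first whitespace-separated word after any BOM.
std::string firstWordWithoutBom(const std::string& bytes)
{
    std::istringstream in(bytes);
    skipUtf8Bom(in);
    std::string word;
    in >> word;
    return word;
}
```

With a file, the call would look the same as in the usage above: open the std::ifstream, call skipUtf8Bom(in), then read with >> or getline.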

Here are two more ideas.
If you are the one who creates the files, save their length along with them. When reading, cut off the prefix using this simple calculation: trueFileLength - savedFileLength = numOfBytesToCut
Alternatively, write your own prefix when saving the files, and when reading, search for it and delete everything found before it.

Related

How can you encode a string using a number cipher (A = 1, B = 2, etc)

I'm in my second year of university and, to be honest, I haven't been taught in the most effective way. My task is to take a word from a vector list, convert it using a substitution cipher (A = 1, B = 2, and so on), and return the substituted word for display so that the user can guess what the word might be. I'm struggling to understand how to create a cipher. Could someone please check over the code and give any comments on how to improve it? Any feedback is greatly appreciated.
#include <iostream>
#include <fstream>
#include <cstdlib>
#include <vector>
#include <string>
#include <ctime>
using namespace std;
vector<string> GetCategory(int categoryChoice)
{
ifstream ifsTeams("Premier League Teams.txt"); //gets an input file stream to read file
vector<string> linesOne; //vector to store each line from the file
string tempLine; //temp storage for each line
while (getline(ifsTeams, tempLine)) //getline returns false at end of file
{
linesOne.push_back(tempLine);
}
ifstream ifsCharacters("Hobbit Characters.txt"); //gets an input file stream to read file
vector<string> linesTwo; //vector to store each line from the file
string tempLineTwo; //temp storage for each line
while (getline(ifsCharacters, tempLineTwo)) //getline returns false at end of file
{
linesTwo.push_back(tempLineTwo);
}
ifstream ifsCountries("South American Countries.txt"); //gets an input file stream to read file
vector<string> linesThree; //vector to store each line from the file
string tempLineThree; //temp storage for each line
while (getline(ifsCountries, tempLineThree)) //getline returns false at end of file
{
linesThree.push_back(tempLineThree);
}
if (categoryChoice == 1)
{
return linesOne;
}
else if (categoryChoice == 2)
{
return linesTwo;
}
else if (categoryChoice == 3)
{
return linesThree;
}
}
void Substitute(string answer)
{
}
int main()
{
srand((unsigned)time(0));
int categoryNum = (rand() % 3) + 1; //Random number for choosing one of the categories
vector<string> category = GetCategory(categoryNum); //Stores the return of one of the categories' vector list
int listSize = (int)category.size();
int answerChoice = rand() % listSize; //valid indices are 0 to listSize - 1
string answer = category[answerChoice];
Substitute(answer);
return 0;
}

If your mapping is linear, why not use the ASCII value of each character? For a single uppercase character c of answer:
printf("%d", c - 'A' + 1);
You can use range checks if your cipher uses symbols that are not linearly mapped to ASCII:
if (c >= '0' and c <= '9') {
;
}
else if (c >= 'A' and c <= 'Z')
...
I strongly recommend working on Linux: running the command 'ascii' in the shell prints the ASCII table (where that utility is installed), and the C library functions are documented in the man pages, so 'man strtoul' gives you the manual for that function. Details like these, combined, make Linux systems very convenient for development and hacking.
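
To make the linear mapping concrete, here is a minimal sketch of what the empty Substitute function could compute. The name EncodeWord, the '-' separator between codes, and the choice to copy non-letters through unchanged are all assumptions for illustration; the assignment may require a different output format. It also assumes an ASCII-compatible character set in which 'A' to 'Z' are contiguous.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: encodes letters as A = 1, B = 2, ..., Z = 26
// (case-insensitive), separating consecutive codes with '-'.
// Characters that are not letters (spaces, digits) are copied unchanged.
// Assumes an ASCII-compatible character set where 'A'..'Z' are contiguous.
std::string EncodeWord(const std::string& answer)
{
    std::string encoded;
    bool previousWasLetter = false;
    for (unsigned char ch : answer) {
        if (std::isalpha(ch)) {
            if (previousWasLetter) {
                encoded += '-';  // keep multi-digit codes unambiguous
            }
            encoded += std::to_string(std::toupper(ch) - 'A' + 1);
            previousWasLetter = true;
        } else {
            encoded += static_cast<char>(ch);
            previousWasLetter = false;
        }
    }
    return encoded;
}
```

The separator matters: without it, "AB" (1 then 2) and "L" (12) would encode to the same digits.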

Your GetCategory function is wasteful because it opens and reads all three files even though you only need one. Here's one way to rewrite the function so it only opens one file.
vector<string> GetCategory(int categoryChoice)
{
// choose the filename depending on categoryChoice
const char* filename;
if (categoryChoice == 1)
{
filename = "Premier League Teams.txt";
}
else if (categoryChoice == 2)
{
filename = "Hobbit Characters.txt";
}
else
{
filename = "South American Countries.txt";
}
ifstream ifs(filename); //gets an input file stream to read file
vector<string> lines; //vector to store each line from the file
string tempLine; //temp storage for each line
while (getline(ifs, tempLine)) //getline returns false at end of file
{
lines.push_back(tempLine);
}
return lines;
}

c++ How to read from a file into array one word at a time

I know this is a dumb question!
But I just CAN NOT get my head around how to read my file into an array one word at a time using c++
Here is the code for what I was trying to do - with some attempted output.
void readFile()
{
int const maxNumWords = 256;
int const maxNumLetters = 32 + 1;
int countWords = 0;
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
string word;
while (fin >> word)
{
countWords++;
assert (countWords <= maxNumWords);
}
char listOfWords[countWords][maxNumLetters];
for (int i = 0; i <= countWords; i++)
{
while (fin >> listOfWords[i]) //<<< THIS is what I think I need to change
//buggered If I can figure out from the book what to
{
// THIS is where I want to perform some manipulations -
// BUT running the code never enters here (and I thought it would)
cout << listOfWords[i];
}
}
}
I am trying to get each word (defined by a space between words) from the madLib.txt file into the listOfWords array so that I can then perform some character by character string manipulation.
Clearly I can read from a file and get that into a string variable - BUT that's not the assignment (Yes this is for a coding class at college)
I have read from a file to get integers into an array - but I can't quite see how to apply that here...

The simplest solution I can imagine to do this is:
void readFile()
{
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
vector<string> listOfWords;
std::copy(std::istream_iterator<string>(fin), std::istream_iterator<string>()
, std::back_inserter(listOfWords));
}
Anyways, you stated in your question you want to read one word at a time and apply manipulations. Thus you can do the following:
void readFile()
{
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
vector<string> listOfWords;
string word;
while(fin >> word) {
// THIS is where I want to perform some manipulations
// ...
listOfWords.push_back(word);
}
}

On the suggestion of πάντα ῥεῖ
I've tried this:
void readFile()
{
int const maxNumWords = 256;
int const maxNumLetters = 32 + 1;
int countWords = 0;
ifstream fin;
fin.open ("madLib.txt");
if (!fin.is_open()) return;
string word;
while (fin >> word)
{
countWords++;
assert (countWords <= maxNumWords);
}
fin.clear();
fin.seekg(0);
char listOfWords[countWords][maxNumLetters];
for (int i = 0; i < countWords; i++) // i < countWords keeps the index inside the array
{
if (fin >> listOfWords[i]) //<<< the extraction did NOT need changing; "if" stores one word per row
{
// THIS is where I want to perform some manipulations -
cout << listOfWords[i];
}
}
}
and it has worked for me. I do think using vectors is more elegant, and so have accepted that answer.
The suggestion was also made to post this as a self answer rather than as an edit - which I kind of agree is sensible so I've gone ahead and done so.

The most simple way to do that is using the STL algorithm... Here is an example:
#include <iostream>
#include <iomanip>
#include <iterator>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
int main()
{
vector<string> words;
auto beginStream = istream_iterator<string>{cin};
auto eos = istream_iterator<string>{};
copy(beginStream, eos, back_inserter(words));
// print the content of words to standard output
copy(begin(words), end(words), ostream_iterator<string>{cout, "\n"});
}
Instead of cin of course, you can use any istream object (like file)
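
As a sketch of that last remark, the same iterator approach can be wrapped in a function that takes any std::istream, so it works for files and in-memory strings alike. The function names ReadWords, ReadWordsFromFile, and ReadWordsFromString are illustrative, not from the answer.

```cpp
#include <fstream>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Reads every whitespace-separated word from any input stream.
std::vector<std::string> ReadWords(std::istream& in)
{
    return std::vector<std::string>(std::istream_iterator<std::string>(in),
                                    std::istream_iterator<std::string>());
}

// Convenience wrapper for files; returns an empty vector if the file
// cannot be opened.
std::vector<std::string> ReadWordsFromFile(const std::string& path)
{
    std::ifstream fin(path);
    if (!fin) {
        return {};
    }
    return ReadWords(fin);
}

// Same idea for text that is already in memory.
std::vector<std::string> ReadWordsFromString(const std::string& text)
{
    std::istringstream in(text);
    return ReadWords(in);
}
```

For the question's file, the call would be ReadWordsFromFile("madLib.txt").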

How to copy text from one file to another and then turning the first letters of the text string into uppercase

I am trying to build a program that copies text from one .txt file to another and then takes the first letter of each word in the text and switches it to an uppercase letter. So far, I have only managed to copy the text with no luck or idea on the uppercase part. Any tips or help would be greatly appreciated. This is what I have so far:
int main()
{
std::ifstream fin("source.txt");
std::ofstream fout("target.txt");
fout<<fin.rdbuf(); //sends the text string to the file "target.txt"
system("pause");
return 0;
}

Try this: read the file content into a string, process it, and then write it to the target file.
#include <fstream>
#include <locale>
#include <sstream>
#include <string>
int main()
{
std::ifstream fin("source.txt");
std::ofstream fout("target.txt");
if (!fin || !fout) return 1;
// get pointer to associated buffer object
std::filebuf* pbuf = fin.rdbuf();
// get file size using buffer's members
std::size_t size = pbuf->pubseekoff (0,fin.end,fin.in);
pbuf->pubseekpos (0,fin.in);
// allocate a string to contain file data
std::string fileBuffer(size, '\0');
// get file data; keep only the characters actually read
fileBuffer.resize(pbuf->sgetn (&fileBuffer[0],size));
fin.close();
std::locale loc;
std::stringstream ss;
bool startOfWord = true; // true at the start of the file and after whitespace
for (std::string::size_type i=0; i<fileBuffer.length(); ++i){
if (startOfWord)
ss << std::toupper(fileBuffer[i],loc);
else
ss << fileBuffer[i];
startOfWord = std::isspace(fileBuffer[i],loc);
}
fout << ss.str();
fout.close();
}

Instead of copying the entire file at once, you'll need to read part or all of it into a local "buffer" variable, perhaps using while (getline(in, my_string)). Then you can simply iterate along the string, capitalising letters that are either in position 0 or preceded by a non-letter (you can use std::isalpha and std::toupper), and stream the string to out. If you have a go at that and get stuck, append your new code to the question and someone's sure to help you out.
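
The approach described above could look roughly like this. It is a sketch, not a definitive implementation: the function names CapitaliseWords and CopyCapitalised are made up for illustration, and the streams are passed in as parameters so the same code works with std::ifstream/std::ofstream or with string streams.

```cpp
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Upper-cases every letter that starts the line or follows a non-letter.
std::string CapitaliseWords(std::string line)
{
    bool previousIsLetter = false;
    for (char& ch : line) {
        // <cctype> functions need a value representable as unsigned char
        const unsigned char uch = static_cast<unsigned char>(ch);
        const bool isLetter = std::isalpha(uch) != 0;
        if (isLetter && !previousIsLetter) {
            ch = static_cast<char>(std::toupper(uch));
        }
        previousIsLetter = isLetter;
    }
    return line;
}

// Copies `in` to `out` line by line, capitalising as it goes.
// Every output line ends with '\n', including the last one.
void CopyCapitalised(std::istream& in, std::ostream& out)
{
    std::string line;
    while (std::getline(in, line)) {
        out << CapitaliseWords(line) << '\n';
    }
}
```

For the question's files, this would be called with std::ifstream fin("source.txt") and std::ofstream fout("target.txt"). Note that "preceded by a non-letter" also capitalises the letter after an apostrophe or digit, so "don't" becomes "Don'T"; treating only whitespace as a word boundary would avoid that.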

Copying the whole file at once is not going to let you edit it. You can use get() and put() to process the file one character at a time, then figure out how to detect the start of a word and make it uppercase.
Something like this:
int main()
{
std::ifstream fin("source.txt");
std::ofstream fout("target.txt");
char c;
while(fin.get(c))
{
// figure out which chars are the start
// of words (previous char was a space)
// and then use std::toupper(c)
fout.put(c);
}
}

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>
int main() {
FILE* fpin;
FILE* fpout;
int counter = 0;
int currentCharacter; /* int, not char, so EOF can be told apart from real characters */
int previousCharacter=' ';
fpin = fopen("source.txt", "r"); /* open for reading */
if (fpin == NULL)
{
printf("Fail to open source.txt!\n");
return 1;
}
fpout = fopen("target.txt", "w");/* open for writing */
if (fpout == NULL)
{
printf("Fail to open target.txt!\n");
return 1;
}
/* read a character from source.txt until END */
while((currentCharacter = fgetc(fpin)) != EOF)
{
/* find first letter of word */
if(!isalpha(previousCharacter) && previousCharacter != '-' && isalpha(currentCharacter))
{
currentCharacter = toupper(currentCharacter); /* lowercase to uppercase */
counter++; /* count number of words */
}
fputc(currentCharacter, fpout); /* put a character to target.txt */
/* printf("%c",currentCharacter); */
previousCharacter = currentCharacter; /* reset previous character */
}
printf("\nNumber of words = %d\n", counter);
fclose(fpin); /* close source.txt */
fclose(fpout); /* close target.txt */
return 0;
}

Trouble with seekp() to replace portion of file in binary mode

I'm having some trouble replacing a portion of a file in binary mode. For some reason my seekp() line is not placing the file pointer at the desired position. Right now it's appending the new contents to the end of the file instead of replacing the desired portion.
long int pos;
bool found = false;
fstream file(fileName, ios::binary|ios::out|ios::in);
file.read(reinterpret_cast<char *>(&record), sizeof(Person));
while (!file.eof())
{
if (record.getNumber() == number) {
pos=file.tellg();
found = true;
break;
}
// the record object is updated here
file.seekp(pos, ios::beg); //this is not placing the file pointer at the desired place
file.write(reinterpret_cast<const char *>(&record), sizeof(Person));
cout << "Record updated." << endl;
file.close();
Am I doing something wrong?
Thanks a lot in advance.

I don't see how your while() loop can work. In general, you should not test for eof() but instead test if a read operation worked.
The following code writes a record to a file (which must exist) and then overwrites it:
#include <iostream>
#include <fstream>
using namespace std;
struct P {
int n;
};
int main() {
fstream file( "afile.dat" , ios::binary|ios::out|ios::in);
P p;
p.n = 1;
file.write( (char*)&p, sizeof(p) );
p.n = 2;
int pos = 0;
file.seekp(pos, ios::beg);
file.write( (char*)&p, sizeof(p) );
}
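
Putting both points together (test the read itself, then seek back and overwrite), a search-and-update loop could be sketched as follows. The Record struct is a hypothetical stand-in for the question's Person class, and UpdateRecord is an illustrative name. One detail worth noting: the position is taken before each read, because tellg() called after a successful read already points at the next record.

```cpp
#include <fstream>
#include <string>

// Hypothetical fixed-size record; stands in for the question's Person class.
struct Record {
    int number;
    int value;
};

// Finds the record whose `number` matches and overwrites its `value` in place.
// Returns true if a record was updated. The file must already exist.
bool UpdateRecord(const std::string& fileName, int number, int newValue)
{
    std::fstream file(fileName, std::ios::binary | std::ios::in | std::ios::out);
    if (!file) {
        return false;
    }

    Record record;
    while (true) {
        // Remember where the record about to be read starts.
        const std::streampos pos = file.tellg();
        if (!file.read(reinterpret_cast<char*>(&record), sizeof(Record))) {
            return false;  // read failed: end of file reached without a match
        }
        if (record.number == number) {
            record.value = newValue;
            file.seekp(pos);  // back to the start of the matching record
            file.write(reinterpret_cast<const char*>(&record), sizeof(Record));
            return static_cast<bool>(file);
        }
    }
}
```

Because the write starts exactly where the matching record begins and has the same size, the file length does not change and the neighbouring records are left untouched.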

while (!file.eof())
{
if (record.getNumber() == number) {
pos=file.tellg();
found = true;
break;
}
Here you're not updating number or record, so basically you go through the whole file and then write to "some" location (pos isn't initialized).
And Neil Butterworth is right (he posted while I was typing): it seems like you omitted something.
