I need to pick numbers out of a long string

I have a string of about a thousand digits in a .txt file. I need to evaluate one digit at a time, compare it with adjacent digits, then move down the list and do it again. I'm using C++ and the get() function. Here's what I have so far:
int element[5];
ifstream file;
file.open("theNumber.txt", ios::in);
for(int i=0;i<5;i++)
{
file.seekg(1);
element[i]=file.get();
}
//read first 5 numbers.
Right now my code won't compile, and showing all of it would make most of you cry, but I wanted to check whether this part is correct. Will this give me an array with the first five digits of the number in the file?

Will this give me an array with the first five digits of the number in the file?
No, your seekg call is setting the read position to the second character every time you call it; just throw that call away (get() automatically advances the read position).
You also need to handle the text-to-binary conversion: get() returns the character code, so in ASCII the digit '7' comes back as 55, not 7. The easiest way to do it is like this:
int ch = file.get();
if (ch < '0' || ch > '9')
{
// Handle invalid input or EOF/error...
}
element[i] = ch - '0';

Will this give me an array with the first five digits of the number in the file?
No, sorry. It will give you the second digit of the file, five times over.
There are two versions of seekg: one that sets the file pointer's position from the beginning and one that sets it relative to some other position. The line file.seekg(1); sets the file pointer to absolute position 1: the second byte of the file. Thus your array contains the same digit repeated.
Consider changing the 1 to i in the call, if you want to use that particular seekg overload.
Good luck.
Also, as Brendan and spencercw note, you'll still have to convert the ASCII code.

Related

Why is the length of a string off by one on the first read of a file?

I am perplexed with the way my program is performing. I am looping the following process:
1) take the name of a course from an input file
2) output the length of the name of the course
The problem is that the first value is always one less than the actual value of the string.
My first string contains 13 characters (including the colon), but nameOfClass.length() returns 12. The next string, the character count is 16 and indeed, nameOfClass.length() returns 16.
Every value after that also returns the expected value, it is only the first that returns the expected value minus 1.
Here's the (reduced) code:
std::ifstream inf("courseNames.txt");
int numberOfClasses = 10;
string nameOfClass;
for (int i = 0; i < numberOfClasses; i++) {
std::getline(inf, nameOfClass,':');
std::cout << nameOfClass.length() << "\n";
}
The file looks like this (courseNames.txt):
Pre-Calculus:
Public-Speaking:
English I:
Calculus I:
...etc. (6 more classes)
This is what I get:
12
16
10
11
Can anyone explain this behavior of the .length() function?

You have a problem, but you've drawn the wrong conclusion. std::getline extracts the delimiter from the stream but doesn't store it in the string, so the first result is indeed 12.
It doesn't store the delimiter on any of the subsequent reads either, so why is there always one character more? Well, look at what comes after that :. That's right, a newline!
Pre-Calculus:
             ^ a newline
So your nameOfClass variable, except for the first string, always stores an extra newline before the other characters.
The fix is easy enough: just ignore the newline after reading the string.
inf.ignore(); // ignore one character
So the first result wasn't the wrong one; it was the only one that was right :)

Reading integers from a file with mixed integers, letters, and spaces C++

This is a sort of self-imposed extra credit problem I'm adding to my current programming assignment which I finished a week early. The assignment involved reading in integers from a file with multiple integers per line, each separated by a space. This was achieved easily using while(inFile >> val) .
The challenge I set myself was to read integers from a file of mixed numbers and letters, pulling out every run of contiguous digits as a separate integer. For example, if I were reading the following line from a text file:
12f 356 48 r56 fs6879 57g 132e efw ddf312 323f
The values that would be read in (and stored) would be
12, 356, 48, 56, 6879, 57, 132, 312, and 323
I've spent all afternoon digging through cplusplus.com and reading cover to cover the specifics of get, getline, cin etc. and I am unable to find an elegant solution for this. Every method I can deduce involves exhaustive reading in and storing of each character from the entire file into a container of some sort and then going through one element at a time and pulling out each digit.
My question is whether there is a way to do this while reading from the file; i.e., does the functionality of get, getline, cin and company support an operation that complex?

Read one character at a time and inspect it. Have a variable that maintains the number currently being read, and a flag telling you if you are in the middle of processing a number.
If the current character is a digit then multiply the current number by 10 and add the digit to the number (and set the "processing a number" flag).
If the current character isn't a digit and you were in the middle of processing a number, you have reached the end of the number and should add it to your output.
Here is a simple such implementation:
#include <istream>
#include <vector>

std::vector<int> read_integers(std::istream & input)
{
    std::vector<int> numbers;
    int number = 0;
    bool have_number = false;
    char c;
    // Loop until reading fails.
    while (input.get(c)) {
        if (c >= '0' && c <= '9') {
            // We have a digit.
            have_number = true;
            // Add the digit to the right of our number. (No overflow check here!)
            number = number * 10 + (c - '0');
        } else if (have_number) {
            // It wasn't a digit and we started on a number, so we hit the end of it.
            numbers.push_back(number);
            have_number = false;
            number = 0;
        }
    }
    // Make sure if we ended with a number that we return it, too.
    if (have_number) { numbers.push_back(number); }
    return numbers;
}
Now you can do something like this to read all integers from standard input:
std::vector<int> numbers = read_integers(std::cin);
This will work equally well with an std::ifstream.
You might consider making the function a template where the argument specifies the numeric type to use -- this will allow you to (for example) switch to long long int without altering the function, if you know the file is going to contain large numbers that don't fit inside of an int.

getline() Adding Character to Front of String? -- Actually substr syntax error

I'm writing a program that will balance Chemistry Equations; I thought it'd be a good challenge and help reinforce the information I've recently learned.
My program is set up to use getline(cin, std::string) to receive the equation. From there it separates the equation into two halves: a left side and right side by making a substring when it encounters a =.
I'm having issues which only concerns the left side of my string, which is called std::string leftSide. My program then goes into a for loop that iterates over the length of leftSide. The first condition checks to see if the character is uppercase, because chemical formulas are written with the element symbols and a symbol consists of either one upper case letter, or an upper case and one lower case letter. After it checks to see if the current character is uppercase, it checks to see if the next character is lower case; if it's lower case then I create a temporary string, combine leftSide[index] with leftSide[index+1] in the temp string then push the string to my vector.
My problem lies in the first iteration; I've been using CuFe3 = 8 (the right side doesn't matter right now) to test it out. The only thing stored in std::string temp is C. I'm not sure why this is happening; also, I'm still getting numbers in my final answer and I don't understand why. Some help fixing these two issues, along with an explanation, would be greatly appreciated.
[CODE]
int index = 0;
for (it = leftSide.begin(); it!=leftSide.end(); ++it, index++)
{
    bool UPPER_LETTER = isupper(leftSide[index]);
    bool NEXT_LOWER_LETTER = islower(leftSide[index+1]);
    if (UPPER_LETTER)// if the character is an uppercase letter
    {
        if (NEXT_LOWER_LETTER)
        {
            string temp = leftSide.substr(index, (index+1));//add THIS capital and next lowercase
            elementSymbol.push_back(temp); // add temp to vector
            temp.clear(); //used to try and fix problem initially
        }
        else if (UPPER_LETTER && !NEXT_LOWER_LETTER) //used to try and prevent number from getting in
        {
            string temp = leftSide.substr(index, index);
            elementSymbol.push_back(temp);
        }
    }
    else if (isdigit(leftSide[index])) // if it's a number
        num++;
}
[EDIT] When I entered in only ASDF, *** ***S ***DF ***F was the output.

string temp = leftSide.substr(index, (index+1));
substr takes the first index and then a length, rather than first and last indices. You want substr(index, 2). Since in your example index is 0 you're doing: substr(index, 1) which creates a string of length 1, which is "C".
string temp = leftSide.substr(index, index);
Since index is 0 this is substr(index, 0), which creates a string of length 0, that is, an empty string.
When you're processing parts of the string with a higher index, such as Fe in "CuFe3" the value you pass in as the length parameter is higher and so you're creating strings that are longer. F is at index 2 and you call substr(index, 3), which creates the string "Fe3".
Also, the standard library usually uses half-open ranges, so even if substr took two indices (which, again, it doesn't) you would do substr(index, index+2) to get a two-character string.
bool NEXT_LOWER_LETTER = islower(leftSide[index+1]);
You might want to check that index+1 is a valid index. If you don't want to do that manually, you might at least switch to using the bounds-checked function at() instead of operator[].
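Combining these points, the symbol extraction could be sketched roughly as follows. The function name extract_symbols is made up for this example, and it only handles the one- and two-letter symbols described in the question.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects element symbols (an uppercase letter,
// optionally followed by one lowercase letter) from one side of an equation.
std::vector<std::string> extract_symbols(const std::string& side) {
    std::vector<std::string> symbols;
    for (std::size_t i = 0; i < side.size(); ++i) {
        if (!std::isupper(static_cast<unsigned char>(side[i]))) {
            continue;  // digits and lowercase letters never start a symbol
        }
        // Only look at side[i + 1] when that index is valid.
        bool next_is_lower = i + 1 < side.size() &&
            std::islower(static_cast<unsigned char>(side[i + 1]));
        // substr takes (position, length), not (first, last).
        symbols.push_back(side.substr(i, next_is_lower ? 2 : 1));
    }
    return symbols;
}
```

For the test input CuFe3 this yields Cu and Fe, and for ASDF it yields the four single letters.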

Count occurrences of each letter in a file?

How can I count the occurrences of the letters A-Z, ignoring case, in an optimized way, even if the file is 4 GB or larger? What different implementations are possible in C++/C?
One implementation is :
Pseudocode
A[26] = {0}
loop through each character ch in file
    if isalpha(ch)
        A[tolower(ch) - 'a'] += 1
    end if
end loop

Not much optimization left, I think.
Instead of computing tolower(ch)-'a' for each character, just count occurrences of each byte value (in a 256-entry array of counters wide enough for the file size, not plain char), and do the case-aware grouping afterwards. (This might or might not be more efficient; just try it.)
Be sure to use buffered input (fopen, perhaps assigning a larger buffer with setvbuf).
E.g.:
acum[256] = {0}
loop through each character 'c' in file
    acum[c]++
end loop
group counts corresponding to same lowercase/uppercase letters
Also, bear in mind that this assumes ASCII or derived (one octet = one character) encoding.

This is not going to be instantaneous with 4GB. I see no way to do what you are doing much faster.
In addition, your code wouldn't handle tabs, spaces or other characters. You need to use isalpha() and only increment the count if it returns true.
Note that isalpha() is extremely fast. But, again, this code would not be instantaneous with a very large input.
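unsigned long long a[26] = { 0 }; // wide counters: a char-sized type would overflow on a large file
for (int i = 0; i < length; i++)
{
    unsigned char ch = text[i]; // isalpha/tolower expect unsigned char values
    if (isalpha(ch))
    {
        a[tolower(ch) - 'a']++;
    }
}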

How to remove a character from the string and change data if need it?

I have possible inputs 1M, 2M, ..., 11M and 1Y (M stands for months and Y for years), and I want to output "somestring1", "somestring2", ... and "somestring12". Note that the M and Y are removed and the last string is changed to 12.
Example: input "11M" "hello" output: hello11
input "1Y" "hello" output: hello12
char * (const char * date, const char * somestr)
{
// just need to output final string no need to change the original string
cout<< finalStr<<endl;
}

The second string is output as a whole, so nothing changes there.
The first string should be output only until M or Y is encountered. Since Stack Overflow discourages handing out complete solutions, I'll give only part of the code. There is a condition to be placed, which is up to you to figure out. (The second answer gives that as well.)
Code would be somewhat like this.
//Code for first string. Just for output.
for (auto i = 0 ; date[i] != '\0' ; ++i)
{
// A condition comes here.
cout << date[i] ;
}
Note that this assumes you only need to output the string. Otherwise you can create another string and concatenate the two into it.

is this homework? If not, here's what i'd suggest. (i ask about homework because you may have restrictions, not because we're not here to help)
1) do a find on 'M' in your string (using find), insert a '\0' at that position if one is found (btw i'm assuming you have well formatted input)
2) do a find on 'Y'. if one is found, insert a '\0' at that position. then do an atoi() or stringstream conversion on your string to convert to number. multiply by 12.
3) concatenate your string representation of part 1 or part 2 to your somestr
4) output.
This can probably be done in < 10 lines if i could be bothered.
the a.find('M') part and its checks can be conditional operator, then the conversion/concatenation in two or three lines at most.
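One possible sketch of those steps builds a new std::string rather than writing '\0' into the input, since date is a const char *. The function name append_months is made up (the original signature has no name), and well-formed input such as "11M" or "1Y" is assumed.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: "11M" + "hello" -> "hello11", "1Y" + "hello" -> "hello12".
// Assumes `date` is a number followed by 'M' (months) or 'Y' (years).
std::string append_months(const char* date, const char* somestr) {
    char* end = nullptr;
    long value = std::strtol(date, &end, 10);  // parses the leading digits
    if (*end == 'Y') {
        value *= 12;                           // convert years to months
    }
    return std::string(somestr) + std::to_string(value);
}
```

The result can then be printed with cout, for example cout << append_months("1Y", "hello") << endl;.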
