Reading from a file, multiple delimiters - C++

I have some code that reads from a file and splits each line on the commas. However, some fields in the file have spaces before or after the commas, which causes problems when the code runs.
This is the code that reads the data from the file. Keeping the same kind of format, is there a way to handle these spaces?
while(getline(inFile, line)){
stringstream linestream(line);
// each field of the inventory file
string type;
string code;
string countRaw;
int count;
string priceRaw;
int price;
string other;
//
if(getline(linestream,type,',') && getline(linestream,code,',')
&& getline(linestream,countRaw,',')&& getline(linestream,priceRaw,',')){
// optional other
getline(linestream,other,',');
count = atoi(countRaw.c_str());
price = atoi(priceRaw.c_str());
StockItem *t = factoryFunction(code, count, price, other, type);
list.tailAppend(t);
}
}

A better approach for this kind of problem is a state machine, where each character you read triggers one simple action. You don't say whether spaces between words inside a field (words not separated by commas) must be kept, so I assume they must. I also don't know what you want done with double spaces, so I assume they stay as they are.
Read one character at a time and keep two variables: the start position and the limit position. You begin in state 1, looking for the start of a field. When you find any character other than a space, set the start position to that character (and the limit position to the position just after it) and switch to state 2. In state 2, every time you find a non-space character, set the limit position to the position just after that character. When you find a comma, take the substring from start to limit and switch back to state 1.
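The state machine described above could be sketched roughly as follows. The function name splitTrimmed and the enum names are made up for illustration, and the sketch assumes that only the space character counts as padding (tabs are kept) and that an empty field yields an empty string.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a line on commas and drops the spaces
// around each field, keeping spaces that sit inside a field.
std::vector<std::string> splitTrimmed(const std::string& line) {
    enum State { SeekingStart, InField };  // state 1 and state 2 from the answer
    std::vector<std::string> fields;
    State state = SeekingStart;
    std::size_t start = 0;  // index of the first non-space character of the field
    std::size_t limit = 0;  // index just past the last non-space character seen

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (c == ',') {
            // End of a field: emit [start, limit), or an empty field
            // if no non-space character was seen since the last comma.
            fields.push_back(state == InField ? line.substr(start, limit - start)
                                              : std::string());
            state = SeekingStart;
        } else if (c != ' ') {
            if (state == SeekingStart) {
                start = i;
                state = InField;
            }
            limit = i + 1;
        }
    }
    // The last field has no trailing comma, so emit it here.
    fields.push_back(state == InField ? line.substr(start, limit - start)
                                      : std::string());
    return fields;
}
```

In the loop from the question, the chained getline calls on linestream could then be replaced by one call to splitTrimmed(line), followed by a check that at least four fields came back.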

Related

Parsing log files containing NMEA sentences C++

I have multiple log files of NMEA sentences that contain geographical positions captured by a camera.
Example of one of the sentences: $GPRMC,100101.000,A,3723.1741,N,00559.5624,W,0.000,0.00,150914,,A*63
My question is, how do you reckon I can start on that? Just need someone to push me to the right direction, thanks.
I use this checksum function in my GPSReverse driver.
string chk(const char* data)
{
// Assuming data contains a NMEA sentence (check it)
// Variables for keeping track of data index and checksum
const char *datapointer = &data[1];
char checksum = 0;
// Loop through entire string, XORing each character to the next
while (*datapointer != '\0')
{
checksum ^= *datapointer;
datapointer++;
}
// Print out the checksum in ASCII hex nybbles
char x[100] = {0};
sprintf_s(x,100,"%02X",checksum);
return x;
}
After that, append the checksum to the NMEA string (say, GGA):
string re = chk(gga.c_str());
gga += "*";
gga += re;
gga += "\r\n";
So you can read up to the *, calculate the checksum, and see if it matches the string after the *.
Read more here.
Each sentence begins with a '$' and ends with a carriage return/line
feed sequence and can be no longer than 80 characters of visible text
(plus the line terminators). The data is contained within this single
line with data items separated by commas. The data itself is just
ascii text and may extend over multiple sentences in certain
specialized instances but is normally fully contained in one variable
length sentence. The data may vary in the amount of precision
contained in the message. For example time might be indicated to
decimal parts of a second or location may be show with 3 or even 4
digits after the decimal point. Programs that read the data should
only use the commas to determine the field boundaries and not depend
on column positions. There is a provision for a checksum at the end of
each sentence which may or may not be checked by the unit that reads
the data. The checksum field consists of a '*' and two hex digits
representing an 8 bit exclusive OR of all characters between, but not
including, the '$' and '*'. A checksum is required on some sentences.
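The validation described in this answer (read up to the '*', compute the XOR, compare it with the two hex digits that follow) might look like the sketch below. The function name nmeaChecksumOk is hypothetical, and the sketch assumes the sentence is passed in as a single string that may or may not still carry its trailing "\r\n".

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true when the two hex digits after '*'
// match the XOR of every character between '$' and '*'.
bool nmeaChecksumOk(const std::string& sentence) {
    if (sentence.empty() || sentence[0] != '$') return false;

    const std::size_t star = sentence.find('*');
    // Need a '*' followed by at least two characters.
    if (star == std::string::npos || sentence.size() < star + 3) return false;

    // XOR everything between '$' and '*', excluding both.
    unsigned char sum = 0;
    for (std::size_t i = 1; i < star; ++i) {
        sum ^= static_cast<unsigned char>(sentence[i]);
    }

    const std::string hex = sentence.substr(star + 1, 2);
    if (!std::isxdigit(static_cast<unsigned char>(hex[0])) ||
        !std::isxdigit(static_cast<unsigned char>(hex[1]))) {
        return false;
    }
    const unsigned long expected = std::strtoul(hex.c_str(), nullptr, 16);
    return sum == expected;
}
```

Sentences that pass the check can then be split on commas to get at the individual fields, as the quoted text recommends.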

Splitting of string by spaces and outputting the columns into different arrays

I have a text file with 2 columns, letters and numbers, separated by spaces. I want to split these 2 columns and place them in separate arrays.
I tried using getline with a space as the delimiter, but I am only able to place them in the same array. I can do this with the fileOpen.eof method, but that causes too many problems in my program.
while(getline(openFile, letters, ' ')){
index++;
lettersArray[index] = letters;
}
I expect the output of lettersArray[index] to be a column of letters only.
I think you are using the getline function in the wrong way. Take a look at how it works here: http://www.cplusplus.com/reference/string/string/getline/
You are telling getline to use the space character as the delimiter instead of the newline. The first call therefore reads the letters of the first line, but the next call reads the number, the newline, and the letters of the following line as a single string, because the newline no longer ends a field.
If you want to stick to using the getline function, here is a possible modification to make it work.
while(getline(openFile, letters, ' ')){
index++;
lettersArray[index] = letters;
getline(openFile, letters);
}
The call to getline on the last line of the while loop discards the remaining part of the current line.
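If both columns are needed, an alternative to getline is the stream extraction operator, which splits on any whitespace (spaces and newlines alike). The following is a minimal sketch under that assumption; the function name readColumns is made up, and std::vector is used in place of the fixed arrays from the question.

```cpp
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated pairs and stores the
// first item of each pair in `letters` and the second in `numbers`.
void readColumns(std::istream& in,
                 std::vector<std::string>& letters,
                 std::vector<std::string>& numbers) {
    std::string first;
    std::string second;
    // The loop stops at end of input or when a pair is incomplete.
    while (in >> first >> second) {
        letters.push_back(first);
        numbers.push_back(second);
    }
}
```

Any std::istream works here, so an std::ifstream opened on the text file can be passed in directly.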

Input filtering using scanf

I want to filter input, but I don't know the best way. I only want words that start with a letter to be read. For example, if the input is:
This is 1 EXAMPLE1 input.
The string should be like this:
This is EXAMPLE1 input
What is the easiest way to filter input like this?
I tried using "%[a-zA-Z]s", but it is not working.
Your scan string "%[a-zA-Z]s" probably isn't what you think it is. Drop that trailing s.
"%[a-zA-Z]" will scan a string consisting entirely of lowercase and uppercase letters, so numbers are excluded. However, you want to scan alphanumeric strings that begin with a lowercase or uppercase letter. scanf doesn't provide a facility to look for a string in that way. You can, instead, scan for an alphanumeric string with "%[a-zA-Z0-9]", and then drop the scanned input if the first character of the string is a digit.
Using scanf is tricky for various reasons. The string may be longer than you expect, and cause buffer overflow. If the input isn't in the format you expect, then scanf may fail to advance past the unexpected input. It is usually more reliable to read the input into a buffer unconditionally, and parse the buffer. For example:
const char *wants
= "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
std::string word;
while (std::cin >> word) {
if (!isalpha(word[0])) continue;
std::string::size_type p = word.find_first_not_of(wants);
word = word.substr(0, p);
//... do something with word
}
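For comparison, the scanf-style approach mentioned earlier in this answer (scan alphanumeric runs with "%[a-zA-Z0-9]" and drop those that start with a digit) could be sketched with sscanf over a buffer. The function name filterWords and the 63-character word limit are assumptions for illustration; a longer run would simply be scanned in several pieces.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: keeps only the alphanumeric words that start
// with a letter and joins them with single spaces.
std::string filterWords(const char* input) {
    std::string out;
    char word[64];
    std::size_t pos = 0;
    while (input[pos] != '\0') {
        int consumed = 0;
        // The width 63 keeps the scan inside `word`; %n reports how many
        // characters the scanset matched.
        if (std::sscanf(input + pos, "%63[a-zA-Z0-9]%n", word, &consumed) == 1) {
            if (std::isalpha(static_cast<unsigned char>(word[0]))) {
                if (!out.empty()) out += ' ';
                out += word;
            }
            pos += static_cast<std::size_t>(consumed);
        } else {
            ++pos;  // not alphanumeric (space, punctuation): skip one character
        }
    }
    return out;
}
```

Calling filterWords("This is 1 EXAMPLE1 input.") yields "This is EXAMPLE1 input", which matches the output asked for in the question.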

Wrap String with For Loop or If Statement

Let's say that we have a string declared...
string paragraphy = "This is a really really long string containing a paragraph long content. I want to wrap this text by using a for loop to do so.";
With this string variable, I want to wrap the text if it is more than 60 characters wide, breaking where there is a space around the 60-character mark.
Can someone please provide me with the code or any help in creating something like this?
A basic idea for solving this is to keep track of the last space in a segment of the string before the 60th character in that segment.
Since this is homework, I'll let you come up with the code. However, here's some rough pseudo-code of the above suggestion:
- current_position = start of the string
- WHILE current_position NOT past the end of the string
  - LOOP 1...60 from the current_position (also don't go past the end of the string)
    - IF the character at the loop position is a space, set the space_position to this position
  - Replace the character (the space) at the space_position with a newline
  - Set the current_position to the next character after the space_position
- If you're printing the string rather than inserting newline characters into it, you would print any remaining part of the string here.
You might also want to consider the case where you don't have any spaces in a block of 60 characters.
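One way the pseudo-code above could be turned into C++ is sketched below. It uses rfind to locate the last space in each block instead of an explicit inner loop, which is a substitution rather than a literal translation. The function name wrapText is made up, and the sketch assumes that a block with no space at all is broken at the next space after it, so such a line ends up longer than the width.

```cpp
#include <string>

// Hypothetical helper: replaces spaces with newlines so that each line
// is at most `width` characters long whenever a suitable space exists.
std::string wrapText(std::string text, std::size_t width = 60) {
    std::size_t lineStart = 0;
    while (text.size() - lineStart > width) {
        // Last space at or before the position just past the allowed width.
        std::size_t space = text.rfind(' ', lineStart + width);
        if (space == std::string::npos || space < lineStart) {
            // No space inside this block: fall back to the next space, if any.
            space = text.find(' ', lineStart + width);
            if (space == std::string::npos) break;
        }
        text[space] = '\n';
        lineStart = space + 1;
    }
    return text;
}
```

Printing the result of wrapText(paragraphy) would then show the paragraph from the question broken into lines of at most 60 characters.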

How to replace a string between two substrings in a string in VC++/MFC?

Say I have a CString object strMain="AAAABBCCCCCCDDBBCCCCCCDDDAA";
I also have two smaller strings, say strSmall1="BB";
strSmall2="DD";
Now, I want to replace all occurrences of strings which occur between strSmall1("BB") and strSmall2("DD") in strMain, with say "KKKKKKK"
Is there a way to do it without Regex? I cannot use regex as adding another file to the project is prohibited.
Is there a way in VC++/MFC to do it? Or any easy algorithm you can point me to?
// Replaces only the first BB...DD span.
int begin = strMain.Find(strSmall1, 0);
if (begin != -1) {
    begin += strSmall1.GetLength();            // first character after "BB"
    int end = strMain.Find(strSmall2, begin);  // look for "DD" only after "BB"
    if (end != -1) {
        CString left = strMain.Left(begin);
        CString right = strMain.Mid(end);
        strMain = left + _T("KKKKKKK") + right;
    }
}
The easiest way is probably to handle the replacement recursively. Search for the starting delimiter and the ending delimiter. If you find them, put together a new string consisting of the string up to the starting delimiter, followed by the replacement string, followed by the return from recursively doing the replacement in the remainder of the string following the ending delimiter.
That, of course, assumes you want to replace all the occurrences in the main string -- if you only want to replace the first one, John Weldon's solution (for one example) will work quite nicely.
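The recursive idea described above can be sketched as follows. To keep the example self-contained it uses std::string as a stand-in for CString, and the function name replaceBetween is made up; the sketch also assumes the delimiters themselves are kept in the result, as in the question's example.

```cpp
#include <string>

// Hypothetical helper: replaces whatever lies between each `open` and
// the next `close` with `repl`, keeping the delimiters themselves.
std::string replaceBetween(const std::string& s,
                           const std::string& open,
                           const std::string& close,
                           const std::string& repl) {
    if (open.empty() || close.empty()) return s;  // avoid endless recursion

    const std::size_t openPos = s.find(open);
    if (openPos == std::string::npos) return s;

    const std::size_t contentStart = openPos + open.size();
    const std::size_t closePos = s.find(close, contentStart);
    if (closePos == std::string::npos) return s;

    // Text up to and including `open`, the replacement, `close`, then the
    // same treatment applied to everything after `close`.
    return s.substr(0, contentStart) + repl + close +
           replaceBetween(s.substr(closePos + close.size()), open, close, repl);
}
```

With the strings from the question, replaceBetween("AAAABBCCCCCCDDBBCCCCCCDDDAA", "BB", "DD", "KKKKKKK") returns "AAAABBKKKKKKKDDBBKKKKKKKDDDAA".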
pseudocode:
loop over string
if curlocation matches string strsmall1 save index break
loop over remaining string
replace till curlocation matches string strsmall2
Extra credit:
What will the next assignment be?
My answer:
Speed it up by jumping the length of strsmall1 and strsmall2 in loop iterations