C++: Pick portions/data from a string having a fixed format

I have a string with a fixed format, for example:
This is 24 day of Aug of 2016
Is there a simple way in C++ (similar to strtol in C) to extract the data into variables, as below:
day = 24;
month = "Aug";
year = 2016;

You can accomplish this with a std::stringstream. You can load the string into the stringstream and then read it into the variables that you want. It will do the conversions for you into the data types that you are using. For example you could use
std::string input = "This is 24 day of Aug of 2016";
std::stringstream ss(input);
std::string eater; // used to eat unneeded input
std::string month;
int day, year;
ss >> eater >> eater >> day >> eater >> eater >> month >> eater >> year;
It looks a little verbose but now you don't need to use find and substr and conversion functions.
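The stream approach above can be wrapped in a small function that also reports whether the extraction succeeded. This is a minimal sketch: DateParts and parseDate are hypothetical names chosen for illustration, and the input is assumed to keep the exact word order of the example sentence.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the three extracted values.
struct DateParts {
    int day = 0;
    std::string month;
    int year = 0;
};

// Parses text shaped like "This is 24 day of Aug of 2016".
// Returns false if a number is not where the format expects it.
bool parseDate(const std::string& input, DateParts& out) {
    std::istringstream ss(input);
    std::string skip; // receives the filler words
    ss >> skip >> skip >> out.day >> skip >> skip >> out.month >> skip >> out.year;
    return !ss.fail();
}
```

Calling parseDate("This is 24 day of Aug of 2016", parts) fills parts with 24, "Aug" and 2016. Checking the stream state matters because a failed extraction otherwise goes unnoticed.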

You can use the substr() function.
Sample code snippet as below:
std::string str = "This is 24 day of Aug of 2016";
std::string day = str.substr (8,2); //day = 24
std::string month = str.substr (18,3); //month = Aug
std::string year = str.substr (25,4); //year = 2016
The first parameter of substr() is the start position of the substring; while the second parameter specifies the number of characters to be read from that position.
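Since the question asks for day and year as numbers, the substrings can be passed through std::stoi. The sketch below uses hypothetical helper names (dayOf, monthOf, yearOf) and assumes the layout never shifts: a one-digit day or a longer month name would move every later offset.

```cpp
#include <string>

// Fixed-offset extraction; the offsets only hold while the layout
// "This is DD day of MMM of YYYY" stays exactly the same.
int dayOf(const std::string& s) { return std::stoi(s.substr(8, 2)); }

std::string monthOf(const std::string& s) { return s.substr(18, 3); }

int yearOf(const std::string& s) { return std::stoi(s.substr(25, 4)); }
```

Note that substr() throws std::out_of_range when the start position lies past the end of the string, and std::stoi throws std::invalid_argument when the substring does not begin with a number, so callers may want to catch those.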

Related

Reading specific column from .csv in C++

I have a .csv file that has around 5 rows and it looks something like this:
"University of Illinois, Chicago","1200, West Harrison","41.3233313","88.221376"
The first column is the name of the building, the second is the address and the third and fourth column represent the latitude and longitude.
I want to take only the values in the 3rd and 4th column for every row.
If I use the getline method and separate every entry with , I do not get the desired result. Here is a sample of what I am doing:
ifstream file("Kiosk Coords.csv");
double latt[num_of_lines];
double longg[num_of_lines];
string name;
string address;
string latitude;
string longitude;
flag = 0;
while(file.good()){
getline(file,name,',');
getline(file,address,',');
getline(file,latitude,',');
getline(file,longitude,'\n');
//cout<<name<<" "<<address<<" "<<latitude<<" "<<longitude<<endl;
cout<<longitude<<endl;
}
For the above given input, I get the following values in the variable if I use my method:
name = "University of Illinois"
address = "Chicago
latitude = "1200"
longitude = "West Harrison,41.3233313,88.221376"
What I specifically want is this:
latitude = "41.3233313"
longitude = "88.221376"
Please help
C++14's std::quoted to the rescue:
char comma;
file >> std::quoted(name) >> comma // name: University of Illinois, Chicago
>> std::quoted(address) >> comma // address: 1200, West Harrison
>> std::quoted(latitude) >> comma // latitude: 41.3233313
>> std::quoted(longitude) >> std::ws; // longitude: 88.221376
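A self-contained version of the std::quoted approach might look like the sketch below. Row and readRow are hypothetical names, and every field is assumed to be wrapped in double quotes as in the sample row. The function takes any std::istream, so it works with an ifstream or a string stream.

```cpp
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical record type for one row of the file.
struct Row {
    std::string name, address, latitude, longitude;
};

// Reads one row of four double-quoted, comma-separated fields.
// Returns false at end of input or on a malformed row.
bool readRow(std::istream& in, Row& row) {
    char comma; // receives the separator between fields
    in >> std::quoted(row.name) >> comma
       >> std::quoted(row.address) >> comma
       >> std::quoted(row.latitude) >> comma
       >> std::quoted(row.longitude);
    return !in.fail();
}
```

A loop such as while (readRow(file, row)) then visits every row, and std::stod(row.latitude) converts a field to a double when a number is needed.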
I think you have to parse it manually. Given that all elements are wrapped in quotes, you can easily extract them by just looking for the quotes.
Read a whole line, then look for each pair of quotes and take the content between them.
std::string str;
std::getline(file, str);
std::vector<std::string> cols;
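std::size_t a = str.find('"'); // position of the opening quote
while (a != std::string::npos) {
std::size_t b = str.find('"', a + 1); // position of the closing quote
if (b == std::string::npos) break; // unmatched opening quote: stop
cols.push_back(str.substr(a, b - a + 1)); // keep both quotes
a = str.find('"', b + 1); // next opening quote, if any
}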
Results (double quotes included):
cols[0]: "University of Illinois, Chicago"
cols[1]: "1200, West Harrison"
cols[2]: "41.3233313"
cols[3]: "88.221376"
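If the quotes are not wanted in the result, the same search can take only the text between them, which makes the latitude and longitude columns ready for std::stod. This is a sketch under the assumption that fields never contain an escaped quote; quotedFields is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the text between each pair of double quotes, quotes stripped.
std::vector<std::string> quotedFields(const std::string& line) {
    std::vector<std::string> cols;
    std::size_t open = line.find('"');
    while (open != std::string::npos) {
        std::size_t close = line.find('"', open + 1);
        if (close == std::string::npos) break; // unmatched quote: stop
        cols.push_back(line.substr(open + 1, close - open - 1));
        open = line.find('"', close + 1);
    }
    return cols;
}
```

For the sample row, the third and fourth elements hold 41.3233313 and 88.221376, and std::stod(cols[2]) yields the latitude as a double.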

Read file with multiple spaces word by word c++

I want to read .txt file word by word and save words to strings. The problem is that the .txt file contains multiple spaces between each word, so I need to ignore spaces.
Here is the example of my .txt file:
John Snow 15
Marco Polo 44
Arya Stark 19
As you see, each line is another person, so I need to check each line individually.
I guess the code must look something like this:
ifstream file("input.txt");
string line;
while(getline(file, line))
{
stringstream linestream(line);
string name;
string surname;
string years;
getline(linestream, name, '/* until first not-space character */');
getline(linestream, surname, '/* until first not-space character */');
getline(linestream, years, '/* until first not-space character */');
cout<<name<<" "<<surname<<" "<<years<<endl;
}
Expected result must be:
John Snow 15
Marco Polo 44
Arya Stark 19
You can just use operator>> of istream; it takes care of multiple whitespace for you. Use the extraction itself as the loop condition, because testing eof() before reading processes the last record twice:
ifstream file("input.txt");
string name;
string surname;
string years;
while(file >> name >> surname >> years)
{
cout<<name<<" "<<surname<<" "<<years<<endl;
}
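Because the question wants each line checked on its own, operator>> can also be combined with getline, so that a malformed line does not shift the fields of the lines after it. The sketch below assumes three fields per line; Person and readPeople are hypothetical names.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one line of the file.
struct Person {
    std::string name, surname;
    int years = 0;
};

// Reads one person per line; runs of spaces between words are skipped
// by operator>>, and lines that do not hold three fields are ignored.
std::vector<Person> readPeople(std::istream& in) {
    std::vector<Person> people;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Person p;
        if (fields >> p.name >> p.surname >> p.years) {
            people.push_back(p);
        }
    }
    return people;
}
```

Passing an ifstream opened on input.txt returns one entry per valid line, regardless of how many spaces separate the words.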

String tokenisation, split by token not separator

I see how to tokenise a string in the traditional manner (e.g. the answers to "How do I tokenize a string in C++?"), but how can I split a string by its tokens while also keeping the separators?
For example given a date/time picture such as yyyy\MMM\dd HH:mm:ss, I would like to split into an array with the following:
"yyyy", "\", "MMM", "\", "dd", " " , "HH", ":", "mm", ":", "ss"
The "tokens" are yyyy, MMM, dd, HH, mm, ss in this example. I don't know what the separators are, only what the tokens are. The separators need to appear in the final result however. The complete list of tokens is:
"yyyy" // – four-digit year, e.g. 1996
"yy" // – two-digit year, e.g. 96
"MMMM" // – month spelled out in full, e.g. April
"MMM" // – three-letter abbreviation for month, e.g. Apr
"MM" // – two-digit month, e.g. 04
"M" // – one-digit month for months below 10, e.g. 4
"dd" // – two-digit day, e.g. 02
"d" // – one-digit day for days below 10, e.g. 2
"ss" // - two digit second
"s" // - one-digit second for seconds below 10
"mm" // - two digit minute
"m" // - one-digit minute for minutes below 10
"tt" // - AM/PM designator
"t" // - first character of AM/PM designator
"hh" // - 12 hour two-digit for hours below 10
"h" // - 12 hour one-digit for hours below 10
"HH" // - 24 hour two-digit for hours below 10
"H" // - 24 hour one-digit for hours below 10
I've noticed the standard library std::string isn't very strong on parsing and tokenising and I can't use boost. Is there a tight, idiomatic solution? I'd hate to break out a C-style algorithm for doing this. Performance isn't a consideration.
Perhaps http://www.cplusplus.com/reference/cstring/strtok/ is what you're looking for, with a useful example.
However, it eats the delimiters. You could work around that by comparing each returned pointer with the base pointer and adding the token's length, which gives the index of the delimiter that followed it.
#include <cstring>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
int main()
{
char data[] = "yyyy\\MMM\\dd HH:mm:ss";
std::vector<std::string> tokens;
char* pch = strtok (data,"\\:"); // pch holds 'yyyy'
while (pch != NULL)
{
tokens.push_back(pch);
int delimiterIndex = static_cast<int>(pch - data + strlen(pch)); // delimiter index: 4, 8, ...
std::stringstream ss;
ss << delimiterIndex;
tokens.push_back(ss.str());
pch = strtok (NULL,"\\:"); // pch holds 'MMM', 'dd', ...
}
for (const auto& token : tokens)
{
std::cout << token << ", ";
}
}
This gives output of:
yyyy, 4, MMM, 8, dd HH, 14, mm, 17, ss, 20,
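Since the question knows the tokens but not the separators, another option is to scan the picture and try the known tokens at each position, collecting everything else as separator text. The sketch below is one possible approach, not the method from the answer above: splitPicture is a hypothetical name, longer tokens are tried before their prefixes, and consecutive non-token characters are assumed to form a single separator element.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits a date/time picture into known tokens and the text between them.
std::vector<std::string> splitPicture(const std::string& picture) {
    // Longer tokens come before their prefixes, so "MMM" is not read as "MM" + "M".
    static const std::vector<std::string> tokens = {
        "yyyy", "yy", "MMMM", "MMM", "MM", "M", "dd", "d", "ss", "s",
        "mm", "m", "tt", "t", "hh", "h", "HH", "H"};

    std::vector<std::string> parts;
    std::string separator; // run of characters that match no token
    std::size_t pos = 0;
    while (pos < picture.size()) {
        const std::string* match = nullptr;
        for (const auto& t : tokens) {
            if (picture.compare(pos, t.size(), t) == 0) {
                match = &t;
                break;
            }
        }
        if (match) {
            if (!separator.empty()) { // flush the pending separator first
                parts.push_back(separator);
                separator.clear();
            }
            parts.push_back(*match);
            pos += match->size();
        } else {
            separator += picture[pos++];
        }
    }
    if (!separator.empty()) parts.push_back(separator);
    return parts;
}
```

For the picture yyyy\MMM\dd HH:mm:ss this returns the eleven elements listed in the question, separators included. It uses only the standard library, so no boost is needed.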

C++ Reading tab-delimited input and skipping blank fields

I am making a program that will take a long string of tab-delimited metadata pasted into the console by the user and split them into their correct variables. I have completed the code to split the line up by tab, but there are empty fields that should be skipped in order to put the correct metadata into the correct string variable, which I can't get to work.
Here is the code that I have so far:
string dummy;
string FAImport;
cin.ignore(1000, '\n');
cout << "\nPlease copy and paste the information from the finding aid and press Enter: ";
getline(cin, FAImport);
cout << FAImport;
stringstream ss(FAImport);
auto temp = ctype<char>::classic_table();
vector<ctype<char>::mask> bar(temp, temp + ctype<char>::table_size);
bar[' '] ^= ctype_base::space;
ss.imbue(locale(cin.getloc(), new ctype<char>(bar.data())));
ss >> coTitle >> altTitle >> description >> dateSpan >> edition >> publisher >>
physicalDescription >> scale >> extentField >> medium >> dimensions >> arrangement >>
degree >> contributing >> names >> topics >> geoPlaceNames >> genre >> occupations >>
functions >> subject >> langIN >> audience >> condition >> generalNotes >> collection >>
linkToFindingAid >> source >> SIRSI >> callNumber;
checkFAImport(); //shows the values of each variable
cout << "\n\nDone";
With this code, I get this output after entering the metadata:
coTitle = William Gates photograph with Emiliano Zapata
altTitle = 1915
description = 1915
datespan = Electronic version
edition = 1 photograph : sepia ; 11 x 13 cm
publisher = L. Tom Perry Special Collections, Harold B. Lee Library, Brigham Young University
physicalDescription = Photographs
scale = William Gates papers
extentField = http://findingaid.lib.byu.edu/viewItem/MSS%20279/Series%2011/Subseries%205/Item%20979/box%20128/folder%2012
medium = William Gates photograph with Emiliano Zapata; MSS 279; William Gates papers; L. Tom Perry Special Collections; 20th Century Western & Mormon Manuscripts; 1130 Harold B. Lee Library; Brigham Young University; Provo, Utah 84602; http://sc.lib.byu.edu/
dimensions = MSS 279 Series 11 Subseries 5 Item 979 box 128 folder 12
arrangement =
degree =
contributing =
names =
topics =
geoPlaceNames =
genre =
occupations =
functions =
subject =
langIN =
audience =
condition =
generalNotes =
collection =
linkToFindingAid =
source =
SIRSI =
callNumber =
In this example, fields like altTitle and description should be blank and skipped. Any help would be much appreciated.
You've solved the issue with spaces in the fields in an elegant manner. Unfortunately, operator>> skips consecutive tabs as if they were one single separator, so the empty fields are lost.
One easy way to keep them is to use getline() with '\t' as the delimiter to read the individual string fields:
getline (ss, coTitle, '\t');
getline (ss, altTitle, '\t');
getline (ss, description, '\t');
...
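The getline() calls above can also be written as a loop that collects every field, empty ones included, so that each value keeps its position. This is a minimal sketch; splitTabs is a hypothetical name, and assigning the elements to coTitle, altTitle and the other variables is left to the caller.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a line on tabs, keeping empty fields in their positions.
std::vector<std::string> splitTabs(const std::string& line) {
    std::vector<std::string> fields;
    std::istringstream ss(line);
    std::string field;
    while (std::getline(ss, field, '\t')) {
        fields.push_back(field); // an empty string marks a blank field
    }
    return fields;
}
```

One caveat: if the line ends with a tab, no empty element is produced for the field after it, so the caller should not assume a fixed number of elements.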
Another way is

How to achieve this specific datetime format using boost?

I want to format a datetime like this:
YYYYMMDD_HHMMSS
i.e., a 4-digit year, followed by a 2-digit month, a 2-digit day, an underscore, the hour in 24-hour form, 2-digit minutes, and 2-digit seconds.
For example, 16 February 2011, 8:05 am and 2 seconds would be:
20110216_080502
What format string should I use in the following code to achieve this? (And, if necessary, what code changes are needed):
//...#includes, namespace usings...
ptime now = second_clock::universal_time();
wstringstream ss;
time_facet *facet = new time_facet("???"); //what goes here?
ss.imbue(locale(cout.getloc(), facet));
ss << now;
wstring datetimestring = ss.str();
Here are some strings I've tried so far:
%Y%m%d_%H%M%S : "2011-Feb-16 16:51:16"
%Y%m%d : "2011-Feb-16"
%H%M%S : "16:51:16"
Here's another one:
%Y : "2011-Feb-16 16:51:16" huh??
I believe you need to use wtime_facet, not time_facet. See the working program I posted on your other question.
From date/time facet format flags:
"%Y%m%d_%H%M%S"
wtime_facet *facet = new wtime_facet(L"%Y%m%d_%H%M%S"); // wide facet and wide literal, to match the wstringstream
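To see what this format string produces without setting up boost, the same flags can be tried with std::put_time from the standard library. This swaps in a different facility purely for illustration and is not the boost API; stamp is a hypothetical name, and the caller is assumed to supply an already filled std::tm.

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Formats a broken-down time as YYYYMMDD_HHMMSS using the standard library.
std::string stamp(const std::tm& t) {
    std::ostringstream ss;
    ss << std::put_time(&t, "%Y%m%d_%H%M%S");
    return ss.str();
}
```

With a std::tm holding 16 February 2011, 08:05:02 (tm_year = 111, tm_mon = 1), the result is 20110216_080502, the string the question asks for.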