How to use istream_iterator to split an equation? - c++

I'm trying to split a string like (1 + 2) into a vector, but istream_iterator<string> doesn't split off the parentheses, so I get vector contents like
(1 , + , 2) when I want ( , 1 , + , 2 , )
Is it possible to use istream_iterator to achieve this?
string eq = "(1 + 2)";
istringstream ss(eq);
istream_iterator<string> begin(ss);
istream_iterator<string> end;
vector<string> vec(begin, end);

You can do this by creating a custom type Token and using it with
istream_iterator. Bonus feature: this code will parse multiple digits, multiple operators, and nested expressions. So enjoy. :)
#include <iterator>
#include <string>
#include <sstream>
#include <vector>
#include <iostream>
#include <cctype>
using namespace std;
class Token {
private:
string val;
public:
Token() : val("") {}
Token(const string& v) : val(v) {}
friend istream& operator>>(istream &in, Token& tok);
friend ostream& operator<<(ostream &out, const Token& tok);
};
istream& operator>>(istream &in, Token& tok) {
char c;
string v;
// operator>> skips leading whitespace, so no explicit space handling is needed
if (in >> c) {
v.push_back(c);
if (isdigit(static_cast<unsigned char>(c))) {
// peek() instead of extract-and-putback, so a number at the very end
// of the input does not leave the stream in a failed state
while (isdigit(in.peek())) {
v.push_back(static_cast<char>(in.get()));
}
}
tok = v;
}
return in;
}
ostream& operator<<(ostream &out, const Token& tok) {
out << tok.val;
return out;
}
int main() {
string eq = "(1 + 2)";
//eq = "(100 + 200)"; // multiple digits
//eq = "(100 + 200 * 300)"; // extra operator
//eq = "(100 + (200 * 300))"; // nested parens
istringstream ss(eq);
istream_iterator<Token> begin(ss);
istream_iterator<Token> end;
vector<Token> vec(begin, end);
for (auto& x : vec) {
cout << "[" << x << "] ";
}
cout << endl;
}

I don't think you can do it using istream_iterator. Instead, simply do it by hand:
vector<string> vec;
vec.reserve(eq.size() / 4); // rough guess
bool in_number = false;
for (char ch : eq) {
if (isspace(ch)) {
in_number = false;
} else if (isdigit(ch)) {
if (in_number) {
vec.back().push_back(ch);
} else {
vec.emplace_back(1, ch);
in_number = true;
}
} else {
vec.emplace_back(1, ch);
in_number = false;
}
}
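For reference, the loop above can be wrapped into a self-contained function. This is only a sketch: the name tokenize is hypothetical, and it assumes every token is either a run of decimal digits or a single non-space character.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits an expression into digit runs and
// single-character tokens, skipping whitespace.
std::vector<std::string> tokenize(const std::string& eq) {
    std::vector<std::string> vec;
    bool in_number = false;
    for (char ch : eq) {
        // The cast avoids undefined behaviour for negative char values.
        const unsigned char uch = static_cast<unsigned char>(ch);
        if (std::isspace(uch)) {
            in_number = false;
        } else if (std::isdigit(uch)) {
            if (in_number) {
                vec.back().push_back(ch);   // extend the current number
            } else {
                vec.emplace_back(1, ch);    // start a new number token
                in_number = true;
            }
        } else {
            vec.emplace_back(1, ch);        // operator or parenthesis
            in_number = false;
        }
    }
    return vec;
}
```

Calling tokenize("(1 + 2)") should give the five tokens the question asks for: ( , 1 , + , 2 , ).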

Related

Tokenize a std::string to a struct

Let's say I have the following string that I want to tokenize as per the delimiter '>':
std::string veg = "orange>kiwi>apple>potato";
I want every item in the string to be placed in a structure that has the following format:
struct pack_item
{
std::string it1;
std::string it2;
std::string it3;
std::string it4;
};
I know how to do it this way:
pack_item pitem;
std::stringstream veg_ss(veg);
std::string veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it1 = veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it2 = veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it3 = veg_item;
std::getline(veg_ss, veg_item, '>');
pitem.it4 = veg_item;
Is there a better and one-liner kind of way to do it?
Something like this:
#include <string>
#include <vector>
#include <sstream>
#include <iostream>
std::string veg = "orange>kiwi>apple>potato";
typedef std::vector<std::string> it_vec;
int main(int argc, char* argv[]) {
it_vec vec;
std::stringstream veg_ss(veg);
std::string veg_item;
while (std::getline(veg_ss, veg_item, '>')) {
vec.push_back(veg_item);
}
for (const std::string& vec_item : vec) {
std::cout << vec_item << std::endl;
}
}
You don't need an intermediate variable.
pack_item pitem;
std::stringstream veg_ss(veg);
std::getline(veg_ss, pitem.it1, '>');
std::getline(veg_ss, pitem.it2, '>');
std::getline(veg_ss, pitem.it3, '>');
std::getline(veg_ss, pitem.it4, '>');
You might want to make that a function, e.g. operator >> (with a similar operator <<)
std::istream& operator >>(std::istream& is, pack_item & pitem) {
std::getline(is, pitem.it1, '>');
std::getline(is, pitem.it2, '>');
std::getline(is, pitem.it3, '>');
std::getline(is, pitem.it4, '>');
return is;
}
std::ostream& operator <<(std::ostream& os, const pack_item & pitem) {
return os << pitem.it1 << '>'
<< pitem.it2 << '>'
<< pitem.it3 << '>'
<< pitem.it4 << '>';
}
int main() {
std::stringstream veg_ss("orange>kiwi>apple>potato>");
pack_item pitem;
veg_ss >> pitem;
}
Is there a better and one-liner kind of way to do it?
You can make a type whose >> reads in a string up to a delimiter, and read all four elements in one statement. Is that really "better"?
template <bool is_const>
struct delimited_string;
template<>
struct delimited_string<true> {
const std::string & string;
char delim;
};
template<>
struct delimited_string<false> {
std::string & string;
char delim;
};
delimited_string(const std::string &, char) -> delimited_string<true>;
delimited_string(std::string &, char) -> delimited_string<false>;
std::istream& operator >>(std::istream& is, delimited_string<false> s) {
return std::getline(is, s.string, s.delim);
}
template <bool is_const>
std::ostream& operator <<(std::ostream& os, delimited_string<is_const> s) {
return os << s.string << s.delim;
}
std::istream& operator >>(std::istream& is, pack_item & pitem) {
return is >> delimited_string { pitem.it1, '>' }
>> delimited_string { pitem.it2, '>' }
>> delimited_string { pitem.it3, '>' }
>> delimited_string { pitem.it4, '>' };
}
std::ostream& operator <<(std::ostream& os, const pack_item & pitem) {
return os << delimited_string { pitem.it1, '>' }
<< delimited_string { pitem.it2, '>' }
<< delimited_string { pitem.it3, '>' }
<< delimited_string { pitem.it4, '>' };
}
As suggested in the comments, you could use a for loop as such:
pack_item a;
std::array<std::reference_wrapper<std::string>, 4> arr{a.it1, a.it2, a.it3, a.it4};
constexpr std::string_view veg = "orange>kiwi>apple>potato";
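std::istringstream ss{std::string(veg)};
std::string str;
// stop after arr.size() fields so extra input cannot index past the array
for(std::size_t idx = 0; idx < arr.size() && std::getline(ss, str, '>'); ++idx){
arr[idx].get() = std::move(str);
}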
If you meant "one-liner" in its true sense, then you could be nasty and use:
std::getline(std::getline(std::getline(std::getline(ss, a.it1, '>'), a.it2, '>'), a.it3, '>'), a.it4, '>');
Indeed:
#include <iostream>
#include <sstream>
#include <string>
struct pack_item
{
std::string it1;
std::string it2;
std::string it3;
std::string it4;
};
pack_item pack( const std::string & s )
{
pack_item p;
getline(getline(getline(getline(std::istringstream(s), p.it1,'>'), p.it2,'>'), p.it3,'>'), p.it4);
return p;
}
int main()
{
auto pitem = pack( "orange>kiwi>apple>potato" );
std::cout << pitem.it4 << "<" << pitem.it3 << "<" << pitem.it2 << "<" << pitem.it1 << "\n";
}
BTW, there is nothing wrong with multiple lines of code. The quest for the one-liner is often a distraction from doing things the Right Way™.
What I would do is create a constructor that takes a std::string_view (plus a second, defaulted argument for the separator) and use the find function.
The reason for using std::string_view is explained here: How exactly is std::string_view faster than const std::string&?
struct pack_item
{
std::string it1;
std::string it2;
std::string it3;
std::string it4;
pack_item():it1(){}
pack_item(std::string_view in, char sep = '>'){
auto ptr = in.begin();
auto l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it1 = std::string(l_ptr, ptr++);
l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it2 = std::string(l_ptr, ptr++);
l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it3 = std::string(l_ptr, ptr++);
l_ptr = ptr;
ptr = std::find(ptr, in.end(), sep);
it4 = std::string(l_ptr, ptr); // last field: ptr may be in.end(), so do not increment it
}
};
This is easy to convert into a loop if you want; stop it by checking:
if(ptr == in.end()) break;
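A loop version of the same find-based splitting might look like the sketch below. The helper name split_fields is hypothetical, and returning a std::vector rather than filling pack_item directly is an assumption made to keep the example short.

```cpp
#include <algorithm>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: splits `in` on `sep` and returns the fields in order.
std::vector<std::string> split_fields(std::string_view in, char sep = '>') {
    std::vector<std::string> fields;
    auto first = in.begin();
    while (true) {
        auto last = std::find(first, in.end(), sep);
        fields.emplace_back(first, last);
        if (last == in.end()) break;  // no separator left: this was the final field
        first = last + 1;             // step over the separator
    }
    return fields;
}
```

A pack_item constructor could then check that exactly four fields came back before assigning them to it1 through it4, which also covers inputs with too few or too many separators.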

How do I split a string into a list of strings between opening and closing parentheses in c++?

I needed some help with a method on how to parse a string into multiple substrings. The form of the string may be (if (a = b) (a) (b)) or something similar with many opening and closing parentheses. For instance I need a list of strings such as,
element(0) = "(if (a = b) (a) (b))"
element(1) = "(a = b)",
element(2) = "(a)", and
element(3) = "(b)".
I have already tried going through the string character by character using String.at() and counting the opening and closing parentheses. However, this gets very tricky, and I don't believe it's the most efficient or easiest way to do this. Any ideas would be greatly appreciated!
You can start with a simple stack-based algorithm:
#include <iostream>
#include <stack>
#include <string>
#include <deque>
std::deque<std::string> parse(const std::string &str)
{
std::deque<std::string> result;
std::stack<std::string::const_iterator> stack;
for ( auto it = str.begin(); it != str.end();) {
if (*it == '(') {
stack.push(it++);
} else if (*it == ')') {
auto start = stack.top(); stack.pop();
result.push_back(std::string{start, ++it});
} else {
it++;
}
}
return result;
}
int main(int , char **) {
std::string input = "(if (a = b) (a) (b))";
auto output = parse(input);
for(const auto & s:output) {
std::cout << s << " ";
}
std::cout <<std::endl;
return 0;
}
Don't forget to add a check for stack underflow (a ')' with no matching '(').
Or, if you want to preserve the exact order shown in the question, use a std::map<std::size_t, std::deque<std::string>> keyed by nesting depth:
#include <iostream>
#include <stack>
#include <string>
#include <deque>
#include <map>
std::deque<std::string> parse(const std::string &str)
{
std::map<std::size_t, std::deque<std::string>> map;
std::stack<std::string::const_iterator> stack;
for ( auto it = str.begin(); it != str.end();) {
if (*it == '(') {
stack.push(it++);
} else if (*it == ')') {
auto start = stack.top(); stack.pop();
map[stack.size()].push_back(std::string{start, ++it});
} else {
it++;
}
}
std::deque<std::string> result;
for (const auto & p : map) {
for (const auto & s : p.second) {
result.push_back(s);
}
}
return result;
}
int main(int , char **) {
std::string input = "(if (a = b) (a) (b))";
auto output = parse(input);
for(const auto & s:output) {
std::cout << s << " ";
}
std::cout <<std::endl;
return 0;
}
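The underflow check mentioned above could be added roughly as follows. This is a sketch based on the first parse() function; the name parse_checked and the choice to throw std::invalid_argument on unbalanced input are assumptions, not part of the original answer.

```cpp
#include <deque>
#include <stack>
#include <stdexcept>
#include <string>

// Hypothetical variant of the first parse() that rejects unbalanced input.
// Substrings are returned in the order their closing ')' is seen.
std::deque<std::string> parse_checked(const std::string& str) {
    std::deque<std::string> result;
    std::stack<std::string::const_iterator> open;  // positions of unmatched '('
    for (auto it = str.begin(); it != str.end(); ++it) {
        if (*it == '(') {
            open.push(it);
        } else if (*it == ')') {
            if (open.empty()) {
                throw std::invalid_argument("unmatched ')'");
            }
            auto start = open.top();
            open.pop();
            result.emplace_back(start, it + 1);  // include the closing ')'
        }
    }
    if (!open.empty()) {
        throw std::invalid_argument("unmatched '('");
    }
    return result;
}
```

For the input (if (a = b) (a) (b)) this yields the three inner groups first and the full expression last, the same order as the first version.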

How to report custom istream failures

I would like to use operator>>() for reading linear algebraic data from console. I would like operator>>() to behave like it does for built-in data (like int, double), but I also wish to report appropriate messages when input cannot be parsed.
I’ve finally constructed a ‘custom_istream_failure’ class, but all together it was quite a hassle. Now I wonder: Is this the way to go, or does another mechanism exist for this purpose? Is this in the spirit of the standard?
I’ve included a small test program which reports a custom failure in the ‘expect’ function. Additionally I’ve included the ‘custom_istream_failure.h’ header file this question is about.
#include <iostream>
#include "custom_istream_failure.h"
struct vector_t { int x, y, z; };
bool expect(std::istream& is, char e)
{
if (is.get() != e)
{
custom_istream_failure(is)
<< "Expected '" << e << '\''
<< and_throw;
return false;
}
return true;
}
std::istream& operator>>(
std::istream& is, vector_t& v)
{
expect(is, '(') &&
(is >> v.x) &&
expect(is, ',') &&
(is >> v.y) &&
expect(is, ',') &&
(is >> v.z) &&
expect(is, ')');
return is;
}
int main()
{
try
{
std::cin.exceptions(std::istream::failbit);
vector_t vector;
std::cin >> vector;
}
catch (std::exception& e)
{
std::cout << e.what() << std::endl;
}
return 0;
}
#ifndef CUSTOM_ISTREAM_FAILURE
#define CUSTOM_ISTREAM_FAILURE
#include <iostream>
#include <string>
#include <sstream>
class custom_istream_failure
: protected std::stringstream
{
public:
explicit custom_istream_failure(std::istream& is)
: m_is(is)
{}
custom_istream_failure& operator<<(
custom_istream_failure&
(*pf)(custom_istream_failure&))
{
return ((*pf)(*this));
}
#define CUSTOM_ISTREAM_FAILURE_SOP(D) \
custom_istream_failure& operator<<(D) \
{ \
*static_cast<std::stringstream*>(this) << v; \
return *this; \
}
CUSTOM_ISTREAM_FAILURE_SOP(bool v)
CUSTOM_ISTREAM_FAILURE_SOP(short v)
CUSTOM_ISTREAM_FAILURE_SOP(unsigned short v)
CUSTOM_ISTREAM_FAILURE_SOP(int v)
CUSTOM_ISTREAM_FAILURE_SOP(unsigned int v)
CUSTOM_ISTREAM_FAILURE_SOP(long v)
CUSTOM_ISTREAM_FAILURE_SOP(unsigned long v)
CUSTOM_ISTREAM_FAILURE_SOP(long long v)
CUSTOM_ISTREAM_FAILURE_SOP(unsigned long long v)
CUSTOM_ISTREAM_FAILURE_SOP(float v)
CUSTOM_ISTREAM_FAILURE_SOP(double v)
CUSTOM_ISTREAM_FAILURE_SOP(long double v)
CUSTOM_ISTREAM_FAILURE_SOP(void* v)
CUSTOM_ISTREAM_FAILURE_SOP(std::streambuf* v)
CUSTOM_ISTREAM_FAILURE_SOP(
std::ostream& (*v)(std::ostream&))
CUSTOM_ISTREAM_FAILURE_SOP(
std::ios& (*v)(std::ios&))
CUSTOM_ISTREAM_FAILURE_SOP(
std::ios_base& (*v)(std::ios_base&))
CUSTOM_ISTREAM_FAILURE_SOP(char v)
CUSTOM_ISTREAM_FAILURE_SOP(signed char v)
CUSTOM_ISTREAM_FAILURE_SOP(unsigned char v)
CUSTOM_ISTREAM_FAILURE_SOP(const char* v)
CUSTOM_ISTREAM_FAILURE_SOP(const signed char* v)
CUSTOM_ISTREAM_FAILURE_SOP(const unsigned char*v);
#undef CUSTOM_ISTREAM_FAILURE_SOP
private:
std::istream& m_is;
friend custom_istream_failure& and_throw(
custom_istream_failure&);
};
inline custom_istream_failure& and_throw(
custom_istream_failure& cif)
{
try { throw std::ios_base::failure(cif.str()); }
catch (...)
{
cif.m_is.setstate(std::ios::failbit, true);
}
return (cif);
}
#endif // CUSTOM_ISTREAM_FAILURE
One approach is to make the input operator for your type handle the error conditions itself (example below). If a user wants exceptions to be thrown on error, the setstate call will trigger an appropriate exception. Users who don't want exceptions can just check the status of the input stream in the usual manner.
std::istream &operator>>(std::istream &is, vector_t &v) {
char c;
if (is >> c && c == '(') {
int x = 0;
if (is >> x >> c && c == ',') {
int y = 0;
if (is >> y >> c && c == ',') {
int z = 0;
if (is >> z >> c && c == ')') {
v = {x, y, z};
return is;
}
}
}
}
is.setstate(std::ios_base::failbit);
return is;
}
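To illustrate the two usage styles described above, here is a small sketch. The operator is repeated in condensed form so the example is self-contained; the helper names parse_quietly and parse_throwing are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>

struct vector_t { int x, y, z; };

// Condensed form of the operator>> above: sets failbit on any mismatch
// and leaves v untouched unless the whole "(x,y,z)" pattern was read.
std::istream& operator>>(std::istream& is, vector_t& v) {
    char c;
    int x = 0, y = 0, z = 0;
    if (is >> c && c == '(' &&
        is >> x >> c && c == ',' &&
        is >> y >> c && c == ',' &&
        is >> z >> c && c == ')') {
        v = {x, y, z};
        return is;
    }
    is.setstate(std::ios_base::failbit);
    return is;
}

// Hypothetical helper: status-flag style, no exceptions involved.
bool parse_quietly(const std::string& text, vector_t& out) {
    std::istringstream in(text);
    return static_cast<bool>(in >> out);
}

// Hypothetical helper: exception style, opted into by the caller.
bool parse_throwing(const std::string& text, vector_t& out) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit);
    try {
        in >> out;
        return true;
    } catch (const std::ios_base::failure&) {
        return false;  // raised when failbit was set during extraction
    }
}
```

The operator itself never throws explicitly; whether a failure surfaces as an exception or as a stream state depends only on the exceptions() mask the caller chose.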

Is there a way to stop std::cin when it encounters a `\n`

I've written code that works pretty well except for one thing. The task I'm writing this code for feeds data into the program as a whitespace-separated string of doubles, and their precision may go beyond 10^-25, so I made my own class that can handle that.
The problem is that while writing the code, I tested it by entering two values into the console by hand, pressing enter after each one so that my program could tell where one double ends and the next starts (it was basically looking for a '\n').
Now I really need to adapt this code to make it work with my task's input (a whitespace-separated list of doubles like 2.521 32.12334656 23.21 .....). But I'm having a problem with getline in my overloaded >> operator. It simply eats the '\n' character and starts looking for more input. The only way I can get it to work is by manually entering the values, manually entering an additional whitespace after the last value, and only then hitting enter.
I'm asking for your help.
Here's the full code:
#include <iostream>
#include <string>
#include <algorithm>
class BigNumber {
private:
std::string fullPart;
std::string floatPart;
public:
BigNumber() : fullPart("0"), floatPart("0") {}
friend std::ostream & operator << (std::ostream & os, const BigNumber & bn);
friend std::istream & operator >> (std::istream & os, BigNumber & bn);
void operator+=(BigNumber & bn);
};
int main()
{
BigNumber bn, bntemp;
while (std::cin >> bntemp)
{
bn += bntemp;
if (std::cin.peek() == '\n')
break;
}
std::cout << bn << std::endl;
return 0;
}
void addFullPart(const std::string & add, std::string & add_to)
{
auto addConv = std::stold(add);
auto addToConv = std::stold(add_to);
auto newFull = std::to_string(addConv + addToConv);
add_to = std::string(newFull.begin(), std::find(newFull.begin(), newFull.end(), '.'));
}
bool carryReminder(std::string & add_to, int32_t indx_from)
{
for (auto curr = indx_from; curr >= 0; --curr)
{
if (add_to[curr] != '9')
{
++(add_to[curr]);
return true;
}
else
add_to[curr] = '0';
}
return false;
}
std::pair<std::string, int32_t> addFloatPart(std::string & add, std::string & add_to)
{
std::string resultFloat;
int32_t reminderReturn{};
// don't forget to reverse str
if (add.size() != add_to.size())
{
// add remaining 0's
if (add.size() < add_to.size())
{
while (add.size() != add_to.size())
{
auto tempBigger = add_to.back();
add_to.pop_back();
resultFloat.push_back(tempBigger);
}
}
else
{
while (add.size() != add_to.size())
{
auto tempBigger = add.back();
add.pop_back();
resultFloat.push_back(tempBigger);
}
}
}
// now they are equal and have a form of 120(3921) 595
for (int32_t i = add_to.size() - 1; i >= 0; --i)
{
int32_t add_toDigit = add_to[i] - '0';
int32_t addDigit = add[i] - '0';
if (add_toDigit + addDigit >= 10)
{
resultFloat.append(std::to_string((add_toDigit + addDigit) - 10));
// we have a remainder
if (i == 0 || !carryReminder(add_to, i - 1))
reminderReturn = 1;
}
else
{
resultFloat.append(std::to_string(add_toDigit + addDigit));
}
}
std::reverse(resultFloat.begin(), resultFloat.end());
return std::make_pair(resultFloat, reminderReturn);
}
std::ostream & operator<<(std::ostream & os, const BigNumber & bn)
{
os << bn.fullPart << "." << bn.floatPart;
return os;
}
std::istream & operator>>(std::istream & is, BigNumber & bn)
{
std::string temp;
std::getline(is, temp, ' ');
auto fullPartTemp = std::string(temp.begin(), std::find(temp.begin(), temp.end(), '.'));
auto floatPartTemp = std::string(std::find(temp.begin(), temp.end(), '.') + 1, temp.end());
bn.floatPart = floatPartTemp;
bn.fullPart = fullPartTemp;
return is;
}
void BigNumber::operator+=(BigNumber & bn)
{
auto pair = addFloatPart(bn.floatPart, floatPart);
floatPart = pair.first;
if (pair.second > 0)
addFullPart(std::to_string(std::stoi(bn.fullPart) + 1), fullPart);
else
addFullPart(bn.fullPart, fullPart);
}
I suggest that you first use getline to read a line. Then you can make an istringstream and use your >> on that. Specifically, you could add #include <sstream> and change the main function to the following:
int main()
{
BigNumber bn, bntemp;
std::string temp;
std::getline(std::cin, temp);
std::istringstream ln(temp);
// test the read itself, so a failed extraction is never added to the sum
while (ln >> bntemp) {
bn += bntemp;
}
std::cout << bn << std::endl;
return 0;
}
Two changes are needed. In main, discard the peek approach (it is too brittle) and read the whole line first. This needs #include <sstream>:
int main()
{
BigNumber bn, bntemp;
std::string line;
std::getline(std::cin, line);
std::stringstream stream(line);
while (stream >> bntemp)
{
bn += bntemp;
}
std::cout << bn << std::endl;
return 0;
}
And in operator>>
std::istream & operator >> (std::istream & is, BigNumber & bn)
{
std::string temp;
// also do NOTHING if the read fails!
if (std::getline(is, temp, ' '))
{
// recommend some isdigit testing in here to make sure you're not
// being fed garbage. Set fail flag in stream and bail out.
auto fullPartTemp = std::string(temp.begin(), std::find(temp.begin(), temp.end(), '.'));
// if there is no . you are in for a world of hurt here
auto floatPartTemp = std::string(std::find(temp.begin(), temp.end(), '.') + 1, temp.end());
bn.floatPart = floatPartTemp;
bn.fullPart = fullPartTemp;
}
return is;
}
So it should probably look more like
std::istream & operator >> (std::istream & is, BigNumber & bn)
{
std::string temp;
if (std::getline(is, temp, ' '))
{
if (std::all_of(temp.cbegin(), temp.cend(), [](char ch) { return isdigit(ch) || ch == '.'; }))
{
auto dotpos = std::find(temp.begin(), temp.end(), '.');
bn.fullPart = std::string(temp.begin(), dotpos);
std::string floatPartTemp;
if (dotpos != temp.end())
{
floatPartTemp = std::string(dotpos + 1, temp.end());
}
bn.floatPart = floatPartTemp;
}
else
{
is.setstate(std::ios::failbit);
}
}
return is;
}
Perhaps you could then use
std::string temp;
is >> temp;
instead of std::getline().
As far as I remember, operator>> skips leading whitespace, stops at the next whitespace character, and leaves that character (including a newline) in the buffer.
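A sketch of how that behaviour could be used to read exactly one line of tokens follows. The helper name read_line_tokens is hypothetical, and skipping blanks and tabs before looking for the newline is an added assumption so that a trailing space does not hide the end of the line.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated tokens with operator>>
// and stops once the end of the current line is reached.
std::vector<std::string> read_line_tokens(std::istream& is) {
    std::vector<std::string> tokens;
    std::string temp;
    while (is >> temp) {
        tokens.push_back(temp);
        // operator>> leaves the delimiter in the stream; skip blanks and
        // tabs by hand so a newline right after them is still detected.
        while (is.peek() == ' ' || is.peek() == '\t') {
            is.get();
        }
        if (is.peek() == '\n') {
            is.get();  // consume the newline and stop
            break;
        }
    }
    return tokens;
}
```

With BigNumber, the same loop shape would apply if its operator>> read the token with is >> temp instead of std::getline(is, temp, ' ').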

Binary Cosine Coefficient

I was given the following formula for calculating this:
sim = |Q∩D| / (√|Q| · √|D|)
I went ahead and implemented a class to compare strings consisting of a series of words:
#pragma once
#include <vector>
#include <string>
#include <iostream>
using namespace std;
class StringSet
{
public:
StringSet(void);
StringSet( const string the_strings[], const int no_of_strings);
~StringSet(void);
StringSet( const vector<string> the_strings);
void add_string( const string the_string);
bool remove_string( const string the_string);
void clear_set(void);
int no_of_strings(void) const;
friend ostream& operator <<(ostream& outs, StringSet& the_strings);
friend StringSet operator *(const StringSet& first, const StringSet& second);
friend StringSet operator +(const StringSet& first, const StringSet& second);
double binary_coefficient( const StringSet& the_second_set);
private:
vector<string> set;
};
#include "StdAfx.h"
#include "StringSet.h"
#include <iterator>
#include <algorithm>
#include <stdexcept>
#include <iostream>
#include <cmath>
StringSet::StringSet(void)
{
}
StringSet::~StringSet(void)
{
}
StringSet::StringSet( const vector<string> the_strings)
{
set = the_strings;
}
StringSet::StringSet( const string the_strings[], const int no_of_strings)
{
copy( the_strings, &the_strings[no_of_strings], back_inserter(set));
}
void StringSet::add_string( const string the_string)
{
try
{
if( find( set.begin(), set.end(), the_string) == set.end())
{
set.push_back(the_string);
}
else
{
//String is already in the set.
throw domain_error("String is already in the set");
}
}
catch( const domain_error& e)
{
cout << e.what();
exit(1);
}
}
bool StringSet::remove_string( const string the_string)
{
//Find the string; if it is present, erase it via the iterator pointing to it.
vector<string>::iterator iter;
if( ( iter = find( set.begin(), set.end(), the_string) ) != set.end())
{
set.erase(iter);
return true;
}
return false;
}
void StringSet::clear_set(void)
{
set.clear();
}
int StringSet::no_of_strings(void) const
{
return set.size();
}
ostream& operator <<(ostream& outs, StringSet& the_strings)
{
vector<string>::const_iterator const_iter = the_strings.set.begin();
for( ; const_iter != the_strings.set.end(); const_iter++)
{
//Write to the stream that was passed in, not to cout.
outs << *const_iter << " ";
}
outs << endl;
return outs;
}
//This function returns the union of the two string sets.
StringSet operator *(const StringSet& first, const StringSet& second)
{
vector<string> new_string_set;
new_string_set = first.set;
for( unsigned int i = 0; i < second.set.size(); i++)
{
vector<string>::const_iterator const_iter = find(new_string_set.begin(), new_string_set.end(), second.set[i]);
//String is new - include it.
if( const_iter == new_string_set.end() )
{
new_string_set.push_back(second.set[i]);
}
}
StringSet the_set(new_string_set);
return the_set;
}
//This method returns the intersection of the two string sets.
StringSet operator +(const StringSet& first, const StringSet& second)
{
//For each string in the first set, look through the second set and see if
//there is a matching pair, in which case include the string in the result.
vector<string> new_string_set;
vector<string>::const_iterator const_iter = first.set.begin();
for ( ; const_iter != first.set.end(); ++const_iter)
{
//Then search through the entire second string to see if
//there is a duplicate.
vector<string>::const_iterator const_iter2 = second.set.begin();
for( ; const_iter2 != second.set.end(); const_iter2++)
{
if( *const_iter == *const_iter2 )
{
new_string_set.push_back(*const_iter);
}
}
}
StringSet new_set(new_string_set);
return new_set;
}
double StringSet::binary_coefficient( const StringSet& the_second_set)
{
double coefficient;
StringSet intersection = the_second_set + set;
coefficient = intersection.no_of_strings() / sqrt((double) no_of_strings()) * sqrt((double)the_second_set.no_of_strings());
return coefficient;
}
However when I try and calculate the coefficient using the following main function:
// Exercise13.cpp : main project file.
#include "stdafx.h"
#include <boost/regex.hpp>
#include "StringSet.h"
using namespace System;
using namespace System::Runtime::InteropServices;
using namespace boost;
//This function takes as input a string, which
//is then broken down into a series of words
//where the punctuation is ignored.
StringSet break_string( const string the_string)
{
regex re;
cmatch matches;
StringSet words;
string search_pattern = "\\b(\\w)+\\b";
try
{
// Assign the regular expression for parsing.
re = search_pattern;
}
catch( regex_error& e)
{
cout << search_pattern << " is not a valid regular expression: \""
<< e.what() << "\"" << endl;
exit(1);
}
sregex_token_iterator p(the_string.begin(), the_string.end(), re, 0);
sregex_token_iterator end;
for( ; p != end; ++p)
{
string new_string(p->first, p->second);
String^ copy_han = gcnew String(new_string.c_str());
String^ copy_han2 = copy_han->ToLower();
char* str2 = (char*)(void*)Marshal::StringToHGlobalAnsi(copy_han2);
string new_string2(str2);
words.add_string(new_string2);
}
return words;
}
int main(array<System::String ^> ^args)
{
StringSet words = break_string("Here is a string, with some; words");
StringSet words2 = break_string("There is another string,");
cout << words.binary_coefficient(words2);
return 0;
}
I get an index which is 1.5116 rather than a value from 0 to 1.
Does anybody have a clue why this is the case?
Any help would be appreciated.
You need more parentheses in the final calculation. a / b * c is parsed as (a / b) * c, but you want a / (b * c).
Maybe it's just a matter of operator precedence.
coefficient = intersection.no_of_strings() / sqrt((double) no_of_strings()) * sqrt((double)the_second_set.no_of_strings());
doesn't multiply first and then divide. / and * have the same precedence and are evaluated left to right, so the division happens first. Did you try making the grouping explicit?
coefficient = intersection.no_of_strings() / (sqrt((double) no_of_strings()) * sqrt((double)the_second_set.no_of_strings()));
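The effect of the missing parentheses can be checked in isolation. In the sketch below the two function names are hypothetical, and the set sizes are passed in directly instead of going through StringSet. The example sentences in the question appear to yield 7 and 4 distinct words with 2 in common, which is assumed here.

```cpp
#include <cmath>

// Hypothetical stand-alone version of the original expression.
// Parsed as (common / sqrt(size_q)) * sqrt(size_d), so it can exceed 1.
double binary_cosine_buggy(int common, int size_q, int size_d) {
    return common / std::sqrt(static_cast<double>(size_q))
                  * std::sqrt(static_cast<double>(size_d));
}

// Parenthesised denominator: common / (sqrt(size_q) * sqrt(size_d)).
double binary_cosine(int common, int size_q, int size_d) {
    return common / (std::sqrt(static_cast<double>(size_q))
                     * std::sqrt(static_cast<double>(size_d)));
}
```

With common = 2, size_q = 7 and size_d = 4, the first function gives roughly 1.51 (outside the expected range), while the second gives roughly 0.378.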