boost spirit qi match multiple elements - c++

I would like to create a parser based on Boost Spirit Qi that can parse a list of integer values. That is obviously extremely easy and there are tons of examples. The list, though, is a bit smarter than a comma-separated list, and it could look like:
17, 5, fibonacci(2, 4), 71, 99, range(5, 7)
the result of the parser should be a std::vector with the following values:
17, 5, 1, 2, 3, 71, 99, 5, 6, 7
Where fibonacci(2, 4) results in 1, 2, 3 and range(5, 7) results in 5, 6, 7
Edit: What I am looking for is this: if I already have parsers with an int attribute (say int_) and parsers fibonacci and range with a std::vector attribute, how can I combine their results in a single parser? Something like:
list %= *(int_ | elements [ fibonacci | range ] );
Where elements is the magic that makes the results from fibonacci and range fit into the list.
Note: I am not looking for a solution that uses append functions like
list = *(int_[push_back(_val, _1)] | fibonacci[push_back(_val, _1)] | range[push_back(_val, _1)]);

Here's a simplistic take: Live On Coliru
typedef std::vector<int64_t> data_t;
value_list = -value_expression % ',';
value_expression = macro | literal;
literal = int_;
macro = (_functions > '(' > value_list > ')')
[ _pass = phx::bind(_1, _2, _val) ];
Where _functions is a qi::symbols table of functions:
qi::symbols<char, std::function<bool(data_t const& args, data_t& into)> > _functions;
Now, note that the input "17, 5, fibonacci(2, 4), 71, 99, range(5, 7)" results in
parse success
data: 17 5 1 2 3 71 99 5 6 7
But you can even get more funky: "range(fibonacci(13, 14))" results in:
parse success
data: 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377
As you can see, it prints the range from [fib(13)..fib(14)] which is [233..377] (Wolfram Alpha).
Full code (including demo implementations of fibonacci and range :)):
//#define BOOST_SPIRIT_DEBUG
#define BOOST_SPIRIT_USE_PHOENIX_V3
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx = boost::phoenix;
typedef std::vector<int64_t> data_t;
template <typename It, typename Skipper = qi::space_type>
struct parser : qi::grammar<It, data_t(), Skipper>
{
parser() : parser::base_type(value_list)
{
using namespace qi;
value_list = -value_expression % ',';
value_expression = macro | literal;
literal = int_;
macro = (_functions > '(' > value_list > ')')
[ _pass = phx::bind(_1, _2, _val) ];
_functions.add("fibonacci", &fibonacci);
_functions.add("range", &range);
BOOST_SPIRIT_DEBUG_NODES((value_list)(value_expression)(literal)(macro));
}
private:
static bool fibonacci(data_t const& args, data_t& into) {
// unpack arguments
if (args.size() != 2)
return false;
auto f = args[0], l = args[1];
// iterate
uint64_t gen0 = 0, gen1 = 1, next = gen0 + gen1;
for(auto i = 0u; i <= l; ++i)
{
switch(i) {
case 0: if (i>=f) into.push_back(gen0); break;
case 1: if (i>=f) into.push_back(gen1); break;
default:
{
next = gen0 + gen1;
if (i>=f) into.push_back(next);
gen0 = gen1;
gen1 = next;
break;
}
}
}
// done
return true;
}
static bool range(data_t const& args, data_t& into) {
// unpack arguments
if (args.size() != 2)
return false;
auto f = args[0], l = args[1];
if (l>f)
into.reserve(1 + l - f + into.size());
for(; f<=l; ++f)
into.push_back(f); // to optimize
return true;
}
qi::rule<It, data_t(), Skipper> value_list ;
qi::rule<It, data_t(), Skipper> value_expression, macro;
qi::rule<It, int64_t(), Skipper> literal;
qi::symbols<char, std::function<bool(data_t const& args, data_t& into)> > _functions;
};
bool doParse(const std::string& input)
{
typedef std::string::const_iterator It;
auto f(begin(input)), l(end(input));
parser<It, qi::space_type> p;
data_t data;
try
{
bool ok = qi::phrase_parse(f,l,p,qi::space,data);
if (ok)
{
std::cout << "parse success\n";
std::cout << "data: " << karma::format_delimited(karma::auto_, ' ', data) << "\n";
}
else std::cerr << "parse failed: '" << std::string(f,l) << "'\n";
if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
return ok;
} catch(const qi::expectation_failure<It>& e)
{
std::string frag(e.first, e.last);
std::cerr << e.what() << "'" << frag << "'\n";
}
return false;
}
int main()
{
assert(doParse("range(fibonacci(13, 14))"));
}

Related

leetcode 295 median in stream, runtime error?

Leetcode 295 is to find median in a data stream.
I want to use two heaps to implement it, which makes adding a value from the stream O(log n) and getting the percentile O(1).
left_data_ is a max-heap used to hold the values at or below the required percentile.
right_data_ is a min-heap used to hold the values larger than the percentile.
The class SortedStream makes adding data O(log n) and findMedian O(1).
#include <iostream>
#include <vector>
#include <climits>
#include <algorithm>
using namespace std;
class SortedStream {
public:
SortedStream(double percent, size_t rsize = 65536*16) : percent_(percent), reserve_size_(rsize) {
init();
}
void push(double v) { // time complexity, o(logn)
++size_;
double left_top = left_data_.back();
if (left_data_.empty() || v <= left_top) { left_data_.push_back(v); std::push_heap(left_data_.begin(), left_data_.end(), std::less<double>{}); }
else { right_data_.push_back(v); std::push_heap(right_data_.begin(), right_data_.end(), std::greater<double>{}); }
size_t idx = size_ * percent_ + 1;
size_t left_size = left_data_.size();
if (idx < left_size) {
// pop left top into right
std::pop_heap(left_data_.begin(), left_data_.end(), std::less<double>{});
double left_top = left_data_.back();
left_data_.pop_back();
right_data_.push_back(left_top);
std::push_heap(right_data_.begin(), right_data_.end(), std::less<double>{});
} else if (idx > left_size) {
// pop right top into left
std::pop_heap(right_data_.begin(), right_data_.end(), std::greater<double>{});
double right_top = right_data_.back();
right_data_.pop_back();
left_data_.push_back(right_top);
std::push_heap(left_data_.begin(), left_data_.end(), std::greater<double>{});
}
}
void init() {
size_t lsize = reserve_size_ * percent_ + 2;
left_data_.reserve(lsize);
right_data_.reserve(reserve_size_ - lsize + 2);
max_ = INT_MIN;
min_ = INT_MAX;
std::make_heap(left_data_.begin(), left_data_.end(), std::less<double>{});
std::make_heap(right_data_.begin(), right_data_.end(), std::greater<double>{});
size_ = 0;
}
size_t size() const { return size_; }
double max() const { return max_; }
double min() const { return min_; }
double percentile() const { // time complexity o(1)
return left_data_.back();
}
public:
double percent_;
size_t size_;
double max_, min_;
std::vector<double> left_data_, right_data_;
size_t reserve_size_;
};
class MedianFinder {
public:
MedianFinder() : ss(0.5){}
void addNum(int num) {ss.push(num);}
double findMedian() {return ss.percentile();}
SortedStream ss;
};
int main() {
MedianFinder* obj = new MedianFinder();
for (size_t i = 0; i< 15; ++i) {
obj->addNum(i);
double param_2 = obj->findMedian();
cout << "i = " << i << " median = " << param_2 << endl;
}
}
It runs fine on my laptop, but when I submit it to LeetCode, I get:
Line 863: Char 45: runtime error: applying non-zero offset 18446744073709551608 to null pointer (stl_iterator.h)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stl_iterator.h:868:45
I have never seen this error before.
Can you help with this?
I like your (OP's) idea that a heap can be used to solve the task: one heap of smaller and one of larger values. Also, as @ArminMontigny suggested, one can use std::priority_queue instead of a plain heap, because a priority queue is based on a heap and adds easy-to-use helper methods. A regular heap is a kind of low-level backend for a priority queue.
Based on these two suggestions, and inspired by your interesting question, I decided to implement a short (30-line) solution for your task (it uses random numbers as example input):
Try it online!
#include <queue>
#include <random>
#include <iostream>
int main() {
std::mt19937_64 rng{123};
std::priority_queue<int> smaller;
std::priority_queue<int, std::vector<int>, std::greater<int>> larger;
for (size_t i = 0; i < 100; ++i) {
int n = rng() % 1000;
if (smaller.empty() || n <= smaller.top())
smaller.push(n);
else
larger.push(n);
while (smaller.size() + 1 < larger.size()) {
smaller.push(larger.top());
larger.pop();
}
while (larger.size() + 1 < smaller.size()) {
larger.push(smaller.top());
smaller.pop();
}
double median = smaller.size() == larger.size() ?
(smaller.top() + larger.top()) / 2.0 :
smaller.size() < larger.size() ? larger.top() : smaller.top();
std::cout << "n = " << n << " med = " << median << " | ";
if ((i + 1) % 4 == 0)
std::cout << std::endl;
}
}
Output:
n = 504 med = 504 | n = 771 med = 637.5 | n = 101 med = 504 | n = 790 med = 637.5 |
n = 380 med = 504 | n = 388 med = 446 | n = 888 med = 504 | n = 406 med = 455 |
n = 53 med = 406 | n = 240 med = 397 | n = 749 med = 406 | n = 438 med = 422 |
n = 566 med = 438 | n = 238 med = 422 | n = 741 med = 438 | n = 817 med = 471 |
n = 810 med = 504 | n = 376 med = 471 | n = 816 med = 504 | n = 503 med = 503.5 |
n = 599 med = 504 | n = 264 med = 503.5 | n = 704 med = 504 | n = 132 med = 503.5 |
n = 740 med = 504 | n = 391 med = 503.5 | n = 563 med = 504 | n = 778 med = 533.5 |
n = 768 med = 563 | n = 136 med = 533.5 | n = 964 med = 563 | n = 368 med = 533.5 |
n = 653 med = 563 | n = 941 med = 564.5 | n = 976 med = 566 | n = 680 med = 582.5 |
n = 546 med = 566 | n = 200 med = 564.5 | n = 387 med = 563 | n = 698 med = 564.5 |
n = 562 med = 563 | n = 251 med = 562.5 | n = 257 med = 562 | n = 735 med = 562.5 |
n = 822 med = 563 | n = 212 med = 562.5 | n = 576 med = 563 | n = 368 med = 562.5 |
n = 783 med = 563 | n = 964 med = 564.5 | n = 234 med = 563 | n = 805 med = 564.5 |
n = 952 med = 566 | n = 162 med = 564.5 | n = 936 med = 566 | n = 493 med = 564.5 |
n = 88 med = 563 | n = 313 med = 562.5 | n = 580 med = 563 | n = 274 med = 562.5 |
n = 353 med = 562 | n = 701 med = 562.5 | n = 882 med = 563 | n = 249 med = 562.5 |
n = 19 med = 562 | n = 482 med = 554 | n = 327 med = 546 | n = 402 med = 525 |
n = 379 med = 504 | n = 521 med = 512.5 | n = 977 med = 521 | n = 550 med = 533.5 |
n = 434 med = 521 | n = 82 med = 512.5 | n = 581 med = 521 | n = 134 med = 512.5 |
n = 532 med = 521 | n = 860 med = 526.5 | n = 562 med = 532 | n = 225 med = 526.5 |
n = 907 med = 532 | n = 837 med = 539 | n = 671 med = 546 | n = 785 med = 548 |
n = 593 med = 550 | n = 533 med = 548 | n = 471 med = 546 | n = 352 med = 539.5 |
n = 388 med = 533 | n = 532 med = 532.5 | n = 310 med = 532 | n = 135 med = 532 |
n = 323 med = 532 | n = 81 med = 526.5 | n = 849 med = 532 | n = 577 med = 532 |
n = 643 med = 532 | n = 956 med = 532.5 | n = 204 med = 532 | n = 383 med = 532 |
Regarding your question about the sanitizer error: this sanitizer is part of Clang. You can download Clang yourself and try it on your own laptop to reproduce exactly the same error.
To get the same error, add the option -fsanitize=undefined when compiling with Clang at home.
For Windows, Clang can be downloaded from this page. Also, on Windows, if you have the great package manager Chocolatey, you can install Clang + LLVM through the short command choco install llvm.
For Linux, Clang can be installed through sudo apt install clang.
You can also use the great online website GodBolt, via this link; at that link I have already chosen Clang for compilation and set the necessary options -std=c++11 -O0 -fsanitize=undefined, so you just have to start coding in the window on the left-hand side when you open the link.
You have one minor problem.
In the line
double left_top = left_data_.back();
at the very beginning, the std::vector left_data_ will be empty. Accessing the last element of an empty vector is undefined behavior, which is what the sanitizer reports as a runtime error.
If you modify this line to, for example:
double left_top = left_data_.empty() ? 0.0 : left_data_.back();
then your program will work as you expect.
I personally find the approach a little bit too complicated. Maybe you could use a std::multiset or a std::priority_queue. The std::priority_queue in particular will implement a max-heap or a min-heap for you, without the overhead of calling std::vector's heap functions.
But I am still in favor of the std::multiset . . .

Interpreting / Reading text files written for Assembly application

I am just starting out in C++.
I am writing a console application to "read in" an .evt file (custom, not to be confused with Windows Event Viewer files) and its contents, but now I need to write a method to:
a) store each block of 'EVENT X', up to and including its 'END';
b) make the contents of each block searchable/selectable.
If the content weren't so 'wildly' varied, I would be happy to put this into an SQL table or experiment with an array, but I don't know a starting point, since the number of 'fields' or parameters varies. The maximum number of lines I have seen in a block is around 20, and the maximum number of parameters per line is around 13.
I'm not asking for an explicit answer or the whole code (although that is welcome), just a generic sample of code to get started.
This is my function to just load the data as it is.
void event_data_loader()
{
string evt_data;
string response2;
cout << "You have chosen to Create/Load Soma events\n\n";
ifstream named_EVT("C:/evts/1.evt");
while (getline(named_EVT, evt_data))
{
// Output the text from the file
cout << evt_data << "\n"; // Iterate out each line of the EVT file including spaces
//name_EVT.close();*/
}
cout << "Does the output look ok?(Y/N)";
cin >> response2;
if (response2 == "Y")
{
// Vectors? Dynamic array? to re-arrange the data?
}
}
The files themselves have content like this. I know what most of the functions do, less so all of the parameters.
EVENT 01
A CHECK_HUMAN
A CHECK_POSITION 1 250 90 350 90
E BBS_OPEN 1 0
END
EVENT 02
E SELECT_MSG 336 363 314 337 03 338 12 -1 -1
END
EVENT 03
E RUN_EVENT 761
E RUN_EVENT 04
E RUN_EVENT 05
END
EVENT 761
A EXIST_ITEM 373 1
E SELECT_MSG 857 315 762 316 763 -1 -1 -1 -1
E RETURN
END
EVENT 762
A EXIST_ITEM 373 1
E ROB_ITEM 373 1
E SHOW_MAGIC 6
E CHANGE_HP 1 10000
E CHANGE_MP 1 10000
E MESSAGE_NONE 858
E RETURN
END
EVENT 1862
A ABSENT_EVENT 1582
A EXIST_ITEM 1800 1
A EXIST_ITEM 1801 1
A EXIST_ITEM 1802 1
A EXIST_ITEM 1803 1
A EXIST_ITEM 1804 1
A EXIST_ITEM 1805 1
A EXIST_ITEM 1806 1
A EXIST_ITEM 1807 1
A WEIGHT 365 1854 1 1832 1 -1 1 -1 -1 -1 -1
A CHECK_ITEMSLOT 393 1854 1 1832 1 -1 1 -1 -1 -1 -1
A GENDER 1
E ADD_EVENT 1582
E MESSAGE_NONE 3237
E ROB_ITEM 1800 1
E ROB_ITEM 1801 1
E ROB_ITEM 1802 1
E ROB_ITEM 1803 1
E ROB_ITEM 1804 1
E ROB_ITEM 1805 1
E ROB_ITEM 1806 1
E ROB_ITEM 1807 1
E GIVE_ITEM 1854 1
E GIVE_ITEM 1832 1
E RETURN
END
I would do something like this:
#include <algorithm>
#include <cctype>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
struct Subevent {
std::string selector;
std::string name;
std::vector<int> params;
};
struct Event {
int id;
std::vector<Subevent> subevents;
};
std::vector<Event> load_events(std::istream& input_stream) {
std::vector<Event> out;
Event current_event {}; // current event being built
std::string line;
bool inside_event = false; // are we inside the scope of an event?
while (std::getline(input_stream, line)) {
// strip trailing whitespace
while (!line.empty() && isspace(static_cast<unsigned char>(line.back()))) {
line.pop_back();
}
// skip empty lines
if (line.size() == 0) {
continue;
}
// read first token (until first space)
std::stringstream ss(line);
std::string first_token;
ss >> first_token;
bool is_new_event_line = first_token == "EVENT";
bool is_end_line = first_token == "END";
if (is_new_event_line) {
// line: EVENT <id>
if (inside_event) {
// error: "not expecting new event"
// choose your own error messaging method
}
int id;
ss >> id; // read <id>
// setup new event
current_event.id = id;
inside_event = true;
}
else if (is_end_line) {
// line: END
if (!inside_event) {
// error: "unexpected END"
}
// record and clear current event
out.push_back(current_event);
inside_event = false;
current_event = Event();
}
else {
// line: <selector> <name> <params...>
// e.g.: A GENDER 1
if (!inside_event) {
// error: "unexpected property entry"
}
// read subevent
Subevent subevent {};
subevent.selector = first_token;
ss >> subevent.name;
// copy over the int params from the line
std::copy(
std::istream_iterator<int>(ss),
std::istream_iterator<int>(),
std::back_inserter(subevent.params)
);
// push back subevent
current_event.subevents.push_back(subevent);
}
}
return out;
}

Tensorflow variable_scope for adam optimizer?

Versions: Python 2.7.13 and TF 1.2.1
Background: I'm trying to create a single LSTM cell and pass an input of N x M and output N x M+1. I want to pass the output through a softmax layer and then through an Adam optimizer with a loss function of negative log likelihood.
Problem: As stated in the title, when I try to set my training_op = optimizer.minimize(nll) it crashes and asks about a variable scope. What should I do?
Code:
with tf.variable_scope('lstm1', reuse=True):
LSTM_cell_1 = tf.nn.rnn_cell.LSTMCell(num_units=n_neurons, activation=tf.nn.relu)
rnn_outputs_1, states_1 = tf.nn.dynamic_rnn(LSTM_cell_1, X_1, dtype=tf.float32)
rnn_outputs_1 = tf.nn.softmax(rnn_outputs_1)
stacked_rnn_outputs_1 = tf.reshape(rnn_outputs_1, [-1, n_neurons])
stacked_outputs_1 = tf.layers.dense(stacked_rnn_outputs_1, n_outputs)
outputs_1 = tf.reshape(stacked_outputs_1, [-1, n_steps, n_outputs])
mu = tf.Variable(np.float32(1))
sigma = tf.Variable(np.float32(1))
def normal_log(X, mu, sigma, left=-np.inf, right=np.inf):
val = -tf.log(tf.constant(np.sqrt(2.0 * np.pi), dtype=tf.float32) * sigma) - \
tf.pow(X - mu, 2) / (tf.constant(2.0, dtype=tf.float32) * tf.pow(sigma, 2))
return val
nll = -tf.reduce_sum(normal_log(outputs, mu, sigma))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(nll)
Error message:
ValueError Traceback (most recent call last)
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.pyc in minimize(self, loss, global_step, var_list, gate_gradients, aggregation_method, colocate_gradients_with_ops, name, grad_loss)
323
324 return self.apply_gradients(grads_and_vars, global_step=global_step,
--> 325 name=name)
326
327 def compute_gradients(self, loss, var_list=None,
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.pyc in apply_gradients(self, grads_and_vars, global_step, name)
444 ([str(v) for _, _, v in converted_grads_and_vars],))
445 with ops.control_dependencies(None):
--> 446 self._create_slots([_get_variable_for(v) for v in var_list])
447 update_ops = []
448 with ops.name_scope(name, self._name) as name:
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/adam.pyc in _create_slots(self, var_list)
126 # Create slots for the first and second moments.
127 for v in var_list:
--> 128 self._zeros_slot(v, "m", self._name)
129 self._zeros_slot(v, "v", self._name)
130
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/optimizer.pyc in _zeros_slot(self, var, slot_name, op_name)
764 named_slots = self._slot_dict(slot_name)
765 if _var_key(var) not in named_slots:
--> 766 named_slots[_var_key(var)] = slot_creator.create_zeros_slot(var, op_name)
767 return named_slots[_var_key(var)]
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.pyc in create_zeros_slot(primary, name, dtype, colocate_with_primary)
172 return create_slot_with_initializer(
173 primary, initializer, slot_shape, dtype, name,
--> 174 colocate_with_primary=colocate_with_primary)
175 else:
176 val = array_ops.zeros(slot_shape, dtype=dtype)
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.pyc in create_slot_with_initializer(primary, initializer, shape, dtype, name, colocate_with_primary)
144 with ops.colocate_with(primary):
145 return _create_slot_var(primary, initializer, "", validate_shape, shape,
--> 146 dtype)
147 else:
148 return _create_slot_var(primary, initializer, "", validate_shape, shape,
/usr/local/lib/python2.7/site-packages/tensorflow/python/training/slot_creator.pyc in _create_slot_var(primary, val, scope, validate_shape, shape, dtype)
64 use_resource=_is_resource(primary),
65 shape=shape, dtype=dtype,
---> 66 validate_shape=validate_shape)
67 variable_scope.get_variable_scope().set_partitioner(current_partitioner)
68
/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, var_store, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
960 collections=collections, caching_device=caching_device,
961 partitioner=partitioner, validate_shape=validate_shape,
--> 962 use_resource=use_resource, custom_getter=custom_getter)
963
964 def _get_partitioned_variable(self,
/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource, custom_getter)
365 reuse=reuse, trainable=trainable, collections=collections,
366 caching_device=caching_device, partitioner=partitioner,
--> 367 validate_shape=validate_shape, use_resource=use_resource)
368
369 def _get_partitioned_variable(
/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in _true_getter(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, use_resource)
350 trainable=trainable, collections=collections,
351 caching_device=caching_device, validate_shape=validate_shape,
--> 352 use_resource=use_resource)
353
354 if custom_getter is not None:
/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape, use_resource)
662 " Did you mean to set reuse=True in VarScope? "
663 "Originally defined at:\n\n%s" % (
--> 664 name, "".join(traceback.format_list(tb))))
665 found_var = self._vars[name]
666 if not shape.is_compatible_with(found_var.get_shape()):
ValueError: Variable lstm1/dense/kernel/Adam_1/ already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
File "<ipython-input-107-eed033b85dc0>", line 11, in <module>
training_op = optimizer.minimize(nll)
File "/usr/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "/usr/local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
if self.run_code(code, result):
So it turns out I was executing the cell over and over again inside a Python notebook. To all TF rookies out there: remember to restart your kernel (or call tf.reset_default_graph()) before re-running graph-construction code.

Tensorflow: TypeError: Expected string, got 1 of type 'int64' instead

I'm trying to create a logistic regression model in tensorflow.
When I try to execute model.fit(input_fn=train_input_fn, steps=200) I get the following error.
TypeError Traceback (most recent call last)
<ipython-input-44-fd050d8188b5> in <module>()
----> 1 model.fit(input_fn=train_input_fn, steps=200)
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.pyc in fit(self, x, y, input_fn, steps, batch_size, monitors)
180 feed_fn=feed_fn,
181 steps=steps,
--> 182 monitors=monitors)
183 logging.info('Loss for final step: %s.', loss)
184 return self
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.pyc in _train_model(self, input_fn, steps, feed_fn, init_op, init_feed_fn, init_fn, device_fn, monitors, log_every_steps, fail_on_nan_loss)
447 features, targets = input_fn()
448 self._check_inputs(features, targets)
--> 449 train_op, loss_op = self._get_train_ops(features, targets)
450
451 # Add default monitors.
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/linear.pyc in _get_train_ops(self, features, targets)
105 if self._linear_feature_columns is None:
106 self._linear_feature_columns = layers.infer_real_valued_columns(features)
--> 107 return super(LinearClassifier, self)._get_train_ops(features, targets)
108
109 #property
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.pyc in _get_train_ops(self, features, targets)
154 global_step = contrib_variables.get_global_step()
155 assert global_step
--> 156 logits = self._logits(features, is_training=True)
157 with ops.control_dependencies([self._centered_bias_step(
158 targets, self._get_weight_tensor(features))]):
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.pyc in _logits(self, features, is_training)
298 logits = self._dnn_logits(features, is_training=is_training)
299 else:
--> 300 logits = self._linear_logits(features)
301
302 return nn.bias_add(logits, self._centered_bias())
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.pyc in _linear_logits(self, features)
255 num_outputs=self._num_label_columns(),
256 weight_collections=[self._linear_weight_collection],
--> 257 name="linear")
258 return logits
259
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/feature_column_ops.pyc in weighted_sum_from_feature_columns(columns_to_tensors, feature_columns, num_outputs, weight_collections, name, trainable)
173 transformer = _Transformer(columns_to_tensors)
174 for column in sorted(set(feature_columns), key=lambda x: x.key):
--> 175 transformed_tensor = transformer.transform(column)
176 predictions, variable = column.to_weighted_sum(transformed_tensor,
177 num_outputs,
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/feature_column_ops.pyc in transform(self, feature_column)
353 return self._columns_to_tensors[feature_column]
354
--> 355 feature_column.insert_transformed_feature(self._columns_to_tensors)
356
357 if feature_column not in self._columns_to_tensors:
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/layers/python/layers/feature_column.pyc in insert_transformed_feature(self, columns_to_tensors)
410 mapping=list(self.lookup_config.keys),
411 default_value=self.lookup_config.default_value,
--> 412 name=self.name + "_lookup")
413
414
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/contrib/lookup/lookup_ops.pyc in string_to_index(tensor, mapping, default_value, name)
349 with ops.op_scope([tensor], name, "string_to_index") as scope:
350 shared_name = ""
--> 351 keys = ops.convert_to_tensor(mapping, dtypes.string)
352 vocab_size = array_ops.size(keys)
353 values = math_ops.cast(math_ops.range(vocab_size), dtypes.int64)
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.pyc in convert_to_tensor(value, dtype, name, as_ref)
618 for base_type, conversion_func in funcs_at_priority:
619 if isinstance(value, base_type):
--> 620 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
621 if ret is NotImplemented:
622 continue
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/constant_op.pyc in _constant_tensor_conversion_function(v, dtype, name, as_ref)
177 as_ref=False):
178 _ = as_ref
--> 179 return constant(v, dtype=dtype, name=name)
180
181
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/constant_op.pyc in constant(value, dtype, shape, name)
160 tensor_value = attr_value_pb2.AttrValue()
161 tensor_value.tensor.CopyFrom(
--> 162 tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
163 dtype_value = attr_value_pb2.AttrValue(type=tensor_value.tensor.dtype)
164 const_tensor = g.create_op(
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.pyc in make_tensor_proto(values, dtype, shape)
351 nparray = np.empty(shape, dtype=np_dt)
352 else:
--> 353 _AssertCompatible(values, dtype)
354 nparray = np.array(values, dtype=np_dt)
355 # check to them.
/home/praveen/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.pyc in _AssertCompatible(values, dtype)
288 else:
289 raise TypeError("Expected %s, got %s of type '%s' instead." %
--> 290 (dtype.name, repr(mismatch), type(mismatch).__name__))
291
292
TypeError: Expected string, got 1 of type 'int64' instead.
I'm not sure which feature to check. Could somebody tell me how I could debug this, please? Thanks in advance.
I had a few categorical feature columns whose data type was int64, so I converted those columns from int to string. After that, the fit step ran to completion. Apparently, TensorFlow expects categorical features to have dtype string.

Boost Spirit: parse a section of an input

I have thousands of lines of input; each line consists of 3 ints with a comma at the end, and they look like this:
5 6 10,
8 9 45,
.....
How can I create a grammar that parses only a certain section of the input, for example the first 100 lines, or lines 1000 to 1200, and ignores the rest?
My grammar currently looks like this:
qi::int_ >> qi::int_ >> qi::int_ >> qi::lit(",");
But obviously it parses the whole input.
You could just seek up to the interesting point and parse 100 lines there.
A sketch on how to skip 100 lines from just spirit:
Live On Coliru
#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/adapted/std_tuple.hpp>
#include <tuple>
namespace qi = boost::spirit::qi;
int main() {
using It = boost::spirit::istream_iterator;
using Tup = std::tuple<int, int, int>;
It f(std::cin >> std::noskipws), l;
std::vector<Tup> data;
using namespace qi;
if (phrase_parse(f, l,
omit [ repeat(100) [ *(char_ - eol) >> eol ] ] >> // omit 100 lines
repeat(10) [ int_ >> int_ >> int_ >> ',' >> eol ], // parse 10 3-tuples
blank, data))
{
int line = 100;
for(auto tup : data)
std::cout << ++line << "\t" << boost::fusion::as_vector(tup) << "\n";
}
}
When tested with some random input like
od -Anone -t d2 /dev/urandom -w6 | sed 's/$/,/g' | head -200 | tee log | ./test
echo ============== VERIFY WITH sed:
nl log | sed -n '101,110p'
It'll print something expected, like:
101 (15400 5215 -20219)
102 (26426 -17361 -6618)
103 (-15311 -6387 -5902)
104 (22737 14339 16074)
105 (-28136 21003 -11594)
106 (-11020 -32377 -4866)
107 (-24024 10995 22766)
108 (3438 -19758 -10931)
109 (28839 22032 -7204)
110 (-25237 23224 26189)
============== VERIFY WITH sed:
101 15400 5215 -20219,
102 26426 -17361 -6618,
103 -15311 -6387 -5902,
104 22737 14339 16074,
105 -28136 21003 -11594,
106 -11020 -32377 -4866,
107 -24024 10995 22766,
108 3438 -19758 -10931,
109 28839 22032 -7204,
110 -25237 23224 26189,
Just because I want to learn more about Spirit X3, and because the world would like to know more about this upcoming version of the library, here's a more intricate version that shows a way to dynamically filter lines according to some expression.
In this case the lines are handled by this handler:
auto handle = [&](auto& ctx) mutable {
using boost::fusion::at_c;
if (++line_no % 10 == 0)
{
auto& attr = x3::_attr(ctx);
data.push_back({ at_c<0>(attr), at_c<1>(attr), at_c<2>(attr) });
}
};
As you'd expect, every 10th line is included.
Live On Coliru
#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <iostream>
namespace x3 = boost::spirit::x3;
int main() {
using It = boost::spirit::istream_iterator;
It f(std::cin >> std::noskipws), l;
struct Tup { int a, b, c; };
std::vector<Tup> data;
size_t line_no = 0;
auto handle = [&](auto& ctx) mutable {
using boost::fusion::at_c;
if (++line_no % 10 == 0)
{
auto& attr = x3::_attr(ctx);
data.push_back({ at_c<0>(attr), at_c<1>(attr), at_c<2>(attr) });
}
};
if (x3::phrase_parse(f, l, (x3::int_ >> x3::int_ >> x3::int_) [ handle ] % (',' >> x3::eol), x3::blank))
{
for(auto tup : data)
std::cout << tup.a << " " << tup.b << " " << tup.c << "\n";
}
}
Prints e.g.
g++ -std=c++1y -O2 -Wall -pedantic -pthread main.cpp -o test
od -Anone -t d2 /dev/urandom -w6 | sed 's/$/,/g' | head -200 | tee log | ./test
echo ============== VERIFY WITH perl:
nl log | perl -ne 'print if $. % 10 == 0'
-8834 -947 -8151
13789 -20056 -11874
6919 -27211 -19472
-7644 18021 13523
-20120 16923 -11419
27772 31149 14005
3540 4894 -24790
10698 10223 -30397
-22533 -32437 -13665
25813 3264 -16414
11453 11955 18268
5092 27052 17930
10915 6493 20432
-14380 -6085 -25430
18599 6710 17279
22049 22259 -32189
1048 14621 6452
-24996 10856 29429
3537 -26338 19623
-4117 6617 14009
============== VERIFY WITH perl:
10 -8834 -947 -8151,
20 13789 -20056 -11874,
30 6919 -27211 -19472,
40 -7644 18021 13523,
50 -20120 16923 -11419,
60 27772 31149 14005,
70 3540 4894 -24790,
80 10698 10223 -30397,
90 -22533 -32437 -13665,
100 25813 3264 -16414,
110 11453 11955 18268,
120 5092 27052 17930,
130 10915 6493 20432,
140 -14380 -6085 -25430,
150 18599 6710 17279,
160 22049 22259 -32189,
170 1048 14621 6452,
180 -24996 10856 29429,
190 3537 -26338 19623,
200 -4117 6617 14009,