I'm trying to create a recursive descent parser. So far I have the foundation in place; I just need to properly implement a few functions to enforce the grammar. Everything looks right to me, but I suspect my Aop, Expr, or Term function is doing something wrong: sometimes the input stream gets cut off and tokens aren't recognized, and I don't see why.
Is there any site or source that explains this more in depth, with code examples? Everything I've seen is very generic, which is fine, but I'm stuck on implementation.
NOTE (edit, April 17 2016): My functions were pretty much all right and well structured for the context of my program. The problem I was having, without realizing it, was that in certain cases calling getToken "ate up" characters from the input stream. Sometimes this is fine, but at other times the input stream needed to be reset. So I simply added a small loop for the cases where I needed to put a lexeme back character by character, e.g.:
if(t.getType() !=Token::EQOP)
{
//cout<<"You know it" << endl;
int size = t.getLexeme().size();
while(size>0)
{
br->putback(t.getLexeme().at(size-1));
size--;
}
return ex;
}
So with that being said, I pretty much was able to edit my program accordingly and everything worked out once I saw what was eating up the characters.
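For anyone hitting the same thing, the put-back loop can be wrapped in a small helper. This is only a sketch: `putBackLexeme` and `readTwice` are names made up for illustration, and it assumes the lexeme is exactly the text that was just consumed from the stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: return a consumed lexeme to the stream,
// last character first, so it is read again in the original order.
void putBackLexeme(std::istream& in, const std::string& lexeme) {
    for (auto it = lexeme.rbegin(); it != lexeme.rend(); ++it) {
        in.putback(*it);
    }
}

// Demo: read one word, put it back, then read it again.
std::string readTwice(const std::string& text) {
    std::istringstream in(text);
    std::string first, second;
    in >> first;               // consumes the first word
    putBackLexeme(in, first);  // un-reads it
    in >> second;              // sees the same word again
    return first + "|" + second;
}
```

How many characters can be put back depends on the underlying stream buffer, so for long lexemes a token-level lookahead buffer is usually the safer design.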
This is the grammar :
Program::= StmtList
StmtList::= Stmt | StmtList
Stmt::= PRINTKW Aop SC | INTKW VAR SC | STRKW VAR SC | Aop SC
Expr::= Expr PLUSOP Term | Expr MINUSOP Term | Term
Term::= Term STAROP Primary | Primary
Primary::= SCONST | ICONST | VAR | LPAREN Aop RPAREN
Here's the main program with all of the functions: http://pastebin.com/qMB8h8vE
The functions that I seem to be having the most trouble with are AssignmentOperator (Aop), Expression (Expr), and Term. I'll list them here.
ParseTree* Aop(istream *br)
{
ParseTree * element = Expr(br);
if(element!=0)
{
if(element->isVariable())
{
Token t= getToken(br);
if(t==Token::EQOP)
{
cout<<"No" << endl;
ParseTree * rhs = Aop(br);
if(rhs==0)
return 0;
else
{
return new AssignOp(element, rhs);
}
}
else
{
return element;
}
}
}
return 0;
}
ParseTree* Expr(istream *br)
{
ParseTree * element = Term(br);
if(element!=0)
{
Token t=getToken(br);
if(t==Token::MINUSOP || t==Token::PLUSOP)
{
if(t==Token::PLUSOP)
{
ParseTree* rhs = Expr(br);
if(rhs==0)
return 0;
else
{
return new AddOp(element, rhs);
}
}
if(t==Token::MINUSOP)
{
ParseTree* rhs = Expr(br);
if(rhs==0)
return 0;
else
{
return new SubtractOp(element, rhs); //or switch the inputs idk
}
}
}
else
{
return element;
}
}
return 0;
}
ParseTree* Term(istream *br)
{
ParseTree *element = Primary(br);
if(element!=0)
{
Token t=getToken(br);
if(t==Token::STAROP)
{
ParseTree* rhs =Term(br);
if(rhs==0)
return 0;
else
{
return new MultiplyOp(element, rhs);
}
}
else
{
return element;
}
}
return 0;
}
In order to write a recursive descent parser, you need to convert your grammar into LL form, getting rid of left recursion. For the rules
Term::= Term STAROP Primary | Primary
you'll get something like:
Term ::= Primary Term'
Term' ::= epsilon | STAROP Primary Term'
this then turns into a function something like:
ParseTree *Term(istream *br) {
ParseTree *element = Primary(br);
while (element && peekToken(br) == Token::STAROP) {
Token t = getToken(br);
ParseTree *rhs = Primary(br);
if (!rhs) return 0;
element = new MulOp(element, rhs);
}
return element;
}
Note that you're going to need a peekToken function to look ahead at the next token without consuming it. It's also possible to use getToken + ungetToken to do the same thing.
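One way to get a peek operation is to keep a single buffered token in front of the stream. The following is a minimal sketch under assumptions: `Tok`, `readToken`, `Lexer`, `parseTerm` and `demo` are hypothetical names, tokens are plain strings (a run of digits or a single symbol), and the term is evaluated to an int instead of building a ParseTree.

```cpp
#include <cctype>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <utility>

using Tok = std::string;  // hypothetical token: digits or one symbol

// Reads the next token from the stream; returns "" at end of input.
Tok readToken(std::istream& in) {
    in >> std::ws;
    int c = in.peek();
    if (c == std::char_traits<char>::eof()) return "";
    Tok tok;
    if (std::isdigit(c)) {
        while (std::isdigit(in.peek())) tok += static_cast<char>(in.get());
    } else {
        tok += static_cast<char>(in.get());
    }
    return tok;
}

class Lexer {
public:
    explicit Lexer(std::istream& in) : in_(in) {}
    // Look at the next token without consuming it.
    const Tok& peek() {
        if (!lookahead_) lookahead_ = readToken(in_);
        return *lookahead_;
    }
    // Consume and return the next token.
    Tok get() {
        Tok t = peek();
        lookahead_.reset();
        return t;
    }
private:
    std::istream& in_;
    std::optional<Tok> lookahead_;  // the one buffered token, if any
};

// Term ::= Primary { '*' Primary }, with Primary limited to integers.
// Assumes well-formed input; std::stoi throws on anything else.
int parseTerm(Lexer& lex) {
    int value = std::stoi(lex.get());
    while (lex.peek() == "*") {
        lex.get();                      // consume '*'
        value *= std::stoi(lex.get());  // left-associative product
    }
    return value;
}

// Parses one term and reports which token was left unconsumed.
std::pair<int, Tok> demo(const std::string& text) {
    std::istringstream in(text);
    Lexer lex(in);
    int v = parseTerm(lex);
    return {v, lex.peek()};
}
```

Because the loop only peeks before deciding, a following `+` is still available to the caller (Expr) when Term returns, which is exactly what the getToken-only version loses.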
I am getting an error that I am having problems fixing as recursion hasn't "sunk in" yet.
It is supposed to go through an array of symbols already placed by the Class OrderManager Object and check if the symbol passed in is already there or not, if it is not there it should allow the trade, otherwise it will block it (multiple orders on the same currency compounds risk)
[Error] '}' - not all control paths return a value.
I believe it is because of the retest portion not having a return value but again I'm still newish to making my own recursive functions. However it may also be because my base and test cases are wrong possibly?
P.S I added (SE) comments in places to clarify language specific things since it is so close to C++.
P.P.S Due to the compiler error, I have no clue if this meets MVRC. Sorry everyone.
bool OrderManager::Check_Risk(const string symbol, uint iter = 0) {
if((iter + 1) != ArraySize(m_symbols) &&
m_trade_restrict != LEVEL_LOW) // Index is one less than Size (SE if
// m_trade_restrict is set to LOW, it
// allows all trades so just break out)
{
if(OrderSelect(OrderManager::Get(m_orders[iter]),
SELECT_BY_TICKET)) // Check the current iterator position
// order (SE OrderSelect() sets an
// external variable in the terminal,
// sort of like an environment var)
{
string t_base = SymbolInfoString(
OrderSymbol(),
SYMBOL_CURRENCY_BASE); // Test base (SE function pulls apart
// the Symbol into two strings
// representing the currency to check
// against)
string t_profit =
SymbolInfoString(OrderSymbol(), SYMBOL_CURRENCY_PROFIT);
string c_base =
SymbolInfoString(symbol, SYMBOL_CURRENCY_BASE); // Current base
// (SE does the same as above but for the passed variable instead):
string c_profit = SymbolInfoString(symbol, SYMBOL_CURRENCY_PROFIT);
// Uses ENUM_LEVELS from Helpers.mqh (SE ENUM of 5 levels: Strict,
// High, Normal, Low, None in that order):
switch(m_trade_restrict) {
case LEVEL_STRICT: {
if(t_base == c_base || t_profit == c_profit) {
return false; // Restrictions won't allow doubling
// orders on any currency
} else
return Check_Risk(symbol, iter++);
};
case LEVEL_NORMAL: {
if(symbol == OrderSymbol()) {
return false; // Restrictions won't allow doubling
// orders on that curr pair
} else
return Check_Risk(symbol, iter++);
};
default: {
// TODO: Logging Manager
// Hardcoded constant global (SE set to LEVEL_NORMAL):
ENB_Trade_Restrictions(default_level);
return Check_Risk(symbol, iter);
}
}
}
} else {
return true;
}
}
I must have been staring at the code for too long: the problem was that the if(OrderSelect(...)) on line 7 had no return path for the case where the order could not be selected in the terminal. The version below also passes ++iter rather than iter++, so the recursive call receives the incremented index. I will need to polish this, but the following code removes the error.
bool OrderManager::Check_Risk(const string symbol, uint iter=0)
{
if((iter + 1) != ArraySize(m_symbols) && m_trade_restrict != LEVEL_LOW) // Index is one less than Size
{
if(OrderSelect(OrderManager::Get(m_orders[iter]), SELECT_BY_TICKET)) //Check the current iterator position order
{
string t_base = SymbolInfoString(OrderSymbol(), SYMBOL_CURRENCY_BASE); //Test base
string t_profit = SymbolInfoString(OrderSymbol(), SYMBOL_CURRENCY_PROFIT);
string c_base = SymbolInfoString(symbol, SYMBOL_CURRENCY_BASE); //Current base
string c_profit = SymbolInfoString(symbol, SYMBOL_CURRENCY_PROFIT);
switch(m_trade_restrict) // Uses ENUM_LEVELS from Helpers.mqh
{
case LEVEL_STRICT :
{
if(t_base == c_base || t_profit == c_profit)
{
return false;
}
else return Check_Risk(symbol, ++iter);
};
case LEVEL_NORMAL :
{
if(symbol == OrderSymbol())
{
return false;
}
else return Check_Risk(symbol, ++iter);
};
default: {
// TODO: Logging Messages
ENB_Trade_Restrictions(default_level); //Hardcoded constant global
return Check_Risk(symbol, iter);
}
}
}
else {return Check_Risk(symbol, ++iter);}
}
else {return true;}
}
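Stripped of the terminal-specific calls, the shape that satisfies the compiler looks like the sketch below. It is plain C++ rather than MQL, and `checkRisk`, `checkRiskLoop` and the vector of symbols are simplified stand-ins for the real members, not the actual OrderManager code. Every branch ends in a return, and the recursion passes `iter + 1`, which avoids the post-increment pitfall where `iter++` hands the unchanged value to the next call.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if a trade on `symbol` is allowed, i.e. no open order uses it.
bool checkRisk(const std::vector<std::string>& openSymbols,
               const std::string& symbol, std::size_t iter = 0) {
    if (iter >= openSymbols.size()) return true;      // base case: nothing left to check
    if (openSymbols[iter] == symbol) return false;    // duplicate found: block the trade
    return checkRisk(openSymbols, symbol, iter + 1);  // every remaining path recurses
}

// The same check as a loop; often easier to reason about than the recursion.
bool checkRiskLoop(const std::vector<std::string>& openSymbols,
                   const std::string& symbol) {
    for (const auto& open : openSymbols) {
        if (open == symbol) return false;
    }
    return true;
}
```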
I am new to regex, can you please tell me how to take a query parameter with all the below combinations.
(ParamName=Operator:ParamValue) is the form of a single query parameter value. These are separated with ; (AND) or , (OR), and I want to group them within parentheses, as in the example below.
Ex: http://<host:port>/get?search=(date=gt:2020-02-06T00:00:00.000Z;(name=eq:Test,department=co:Prod))
Here the date should be greater than 2020-02-06 and name = Test or department contains Prod.
How can I parse these query parameters? Please suggest an approach.
Thanks, Vijay
So, I wrote a solution in JavaScript, but it should be adaptable in other languages as well, with a bit of research.
It's quite a bit of code, but what you're looking to achieve is not super easy!
So here's the code below. It's thoroughly commented, but if there is something you don't understand, ask away, and I'll be happy to answer you :)
//
// The 2 first regexes are a parameter, which looks like date=gt:2020-02-06T00:00:00.000Z for example.
// The difference between those 2 is that the 1st one has **named capture group**
// For example '(?<operator>...)' is a capture group named 'operator'.
// This will come in handy in the code, to keep things clean
//
const RX_NAMED_PARAMETER = /(?:(?<param>\w+)=(?<operator>\w+):(?<value>[\w-:.]+))/
const parameter = "((\\w+)=(\\w+):([\\w-:.]+)|(true|false))"
//
// The 3rd regex is an operation between 2 parameters
//
const RX_OPERATION = new RegExp(`\\((?<param1>${parameter})(?:(?<and_or>[,;])(?<param2>${parameter}))?\\)`, '');
// '---------.---------' '-------.------' '----------.---------'
// 1st parameter AND or OR 2nd parameter
const my_data = {
date: new Date(2000, 1, 1), // months are zero-based, so this is 1 February 2000
name: 'Joey',
department: 'Production'
}
/**
* This function compares the 2 elements and returns true if the first is greater than the second.
* The elements might be dates, numbers, or anything that can be compared.
* The elements **need** to be of the same type
*/
function isGreaterThan(elem1, elem2) {
if (elem1 instanceof Date) {
const date = new Date(elem2).getTime();
if (isNaN(date))
throw new Error(`${elem2} - Not a valid date`);
return elem1.getTime() > date;
}
if (typeof elem1 === 'number') {
const num = Number(elem2);
if (isNaN(num))
throw new Error(`${elem2} - Not a number`);
return elem1 > num;
}
return elem1 > elem2;
}
/**
* Makes an operation as you defined them in your
* post, you might want to change that to suit your needs
*/
function operate(param, operator, value) {
if (!(param in my_data))
throw new Error(`${param} - Invalid parameter!`);
switch (operator) {
case 'eq':
return my_data[param] == value;
case 'co':
return my_data[param].includes(value);
case 'lt':
return !isGreaterThan(my_data[param], value); // note: also true when both are equal
case 'gt':
return isGreaterThan(my_data[param], value);
default:
throw new Error(`${operator} - Unsupported operation`);
}
}
/**
* This parses the URL, and returns a boolean
*/
function parseUri(uri) {
let finalResult;
// As long as there are operations (of the form <param1><; or ,><param2>) on the URL
while (RX_OPERATION.test(uri)) {
// We replace the 1st operation by the result of this operation ("true" or "false")
uri = uri.replace(RX_OPERATION, rawOperation => {
// As long as there are parameters in the operations (e.g. "name=eq:Bob")
while (RX_NAMED_PARAMETER.test(rawOperation)) {
// We replace the 1st parameter by its value ("true" or "false")
rawOperation = rawOperation.replace(RX_NAMED_PARAMETER, rawParameter => {
const res = RX_NAMED_PARAMETER.exec(rawParameter);
return '' + operate(
res.groups.param,
res.groups.operator,
res.groups.value,
);
// The "res.groups.xxx" syntax is allowed by the
// usage of capture groups. See the top of the file.
});
}
// At this point, the rawOperation should look like
// (true,false) or (false;false) for example
const res = RX_OPERATION.exec(rawOperation);
let operation;
if (res.groups.param2 === undefined)
operation = res.groups.param1; // In case this is an isolated operation
else
operation = res.groups.param1 + ({',': ' || ', ';': ' && '}[res.groups.and_or]) + res.groups.param2;
finalResult = eval(operation);
return '' + finalResult;
});
}
return finalResult;
}
let res;
res = parseUri("http://<host:port>/get?search=(date=gt:2020-02-06T00:00:00.000Z;(name=eq:Test,department=co:Prod))");
console.log(res);
res = parseUri("http://<host:port>/get?search=(date=lt:2020-02-06T00:00:00.000Z)");
console.log(res);
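Since the groups can nest, an alternative to repeated regex replacement is a small recursive descent parser. Here is a rough sketch in C++, under several assumptions: `FilterParser`, `Record` and `matches` are made-up names, the record is a flat map of strings, `;` and `,` are applied left to right with no precedence between them, and `gt`/`lt` compare strings lexicographically (which only works for things like same-format ISO-8601 timestamps).

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

using Record = std::map<std::string, std::string>;  // hypothetical data row

class FilterParser {
public:
    FilterParser(const std::string& text, const Record& rec) : s_(text), rec_(rec) {}

    bool parse() {
        bool result = expr();
        if (pos_ != s_.size()) throw std::runtime_error("unexpected trailing input");
        return result;
    }

private:
    // Expr ::= Term { (';' | ',') Term }   (';' = AND, ',' = OR)
    bool expr() {
        bool result = term();
        while (pos_ < s_.size() && (s_[pos_] == ';' || s_[pos_] == ',')) {
            char op = s_[pos_++];
            bool rhs = term();  // always parsed so the input is consumed
            result = (op == ';') ? (result && rhs) : (result || rhs);
        }
        return result;
    }

    // Term ::= '(' Expr ')' | Cond
    bool term() {
        if (pos_ < s_.size() && s_[pos_] == '(') {
            ++pos_;
            bool result = expr();
            expect(')');
            return result;
        }
        return cond();
    }

    // Cond ::= name '=' operator ':' value
    bool cond() {
        std::string name = until("=");
        expect('=');
        std::string op = until(":");
        expect(':');
        std::string value = until(";,)");  // the value itself may contain ':'
        auto it = rec_.find(name);
        if (it == rec_.end()) throw std::runtime_error("unknown parameter: " + name);
        const std::string& actual = it->second;
        if (op == "eq") return actual == value;
        if (op == "co") return actual.find(value) != std::string::npos;
        if (op == "gt") return actual > value;  // lexicographic comparison
        if (op == "lt") return actual < value;  // lexicographic comparison
        throw std::runtime_error("unsupported operator: " + op);
    }

    // Returns the text up to (not including) the next stop character.
    std::string until(const std::string& stops) {
        std::size_t end = s_.find_first_of(stops, pos_);
        if (end == std::string::npos) end = s_.size();
        std::string out = s_.substr(pos_, end - pos_);
        pos_ = end;
        return out;
    }

    void expect(char c) {
        if (pos_ >= s_.size() || s_[pos_] != c)
            throw std::runtime_error(std::string("expected '") + c + "'");
        ++pos_;
    }

    std::string s_;
    const Record& rec_;
    std::size_t pos_ = 0;
};

// Evaluates a filter such as "(name=eq:Test,department=co:Prod)" against a record.
bool matches(const std::string& filter, const Record& rec) {
    return FilterParser(filter, rec).parse();
}
```

The filter passed to `matches` is just the text after `search=`; pulling that out of the URL (and URL-decoding it) is left out of the sketch.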
I have several functions that try and evaluate some data. Each function returns a 1 if it can successfully evaluate the data or 0 if it can not. The functions are called one after the other but execution should stop if one returns a value of 1.
Example functions look like so:
int function1(std::string &data)
{
// do something
if (success)
{
return 1;
}
return 0;
}
int function2(std::string &data)
{
// do something
if (success)
{
return 1;
}
return 0;
}
... more functions ...
How would be the clearest way to organise this flow? I know I can use if statements as such:
void doSomething(void)
{
if (function1(data))
{
return;
}
if (function2(data))
{
return;
}
... more if's ...
}
But this seems long winded and has a huge number of if's that need typing. Another choice I thought of is to call the next function from the return 0 of the function like so
int function1(std::string &data)
{
// do something
if (success)
{
return 1;
}
return function2(data);
}
int function2(std::string &data)
{
// do something
if (success)
{
return 1;
}
return function3(data);
}
... more functions ...
This makes calling cleaner, because you only need to call function1() to evaluate as far as you need to, but it seems to make the code harder to maintain. If another check needs to be inserted into the middle of the flow, or the order of the calls changes, then the functions around the change will need to be edited to account for it.
Am I missing some smart, clear C++ way of achieving this kind of program flow, or is one of these methods best? I am leaning towards the if method at the moment, but I feel like I am missing something.
void doSomething() {
function1(data) || function2(data) /* || ... more function calls ... */;
}
Logical-or || operator happens to have the properties you need - evaluated left to right and stops as soon as one operand is true.
I think you can make a vector of lambdas, where each lambda contains the specific process for evaluating your data. Something like this:
std::vector<std::function<bool(std::string&)>> listCheckers;
listCheckers.push_back([](std::string& p_data) -> bool { return function1(p_data); });
listCheckers.push_back([](std::string& p_data) -> bool { return function2(p_data); });
listCheckers.push_back([](std::string& p_data) -> bool { return function3(p_data); });
//...and so on...
//-----------------------------
std::string theData = "Hello I'm a Data";
//evaluate all data
bool bSuccess = false;
for(auto& fnChecker : listCheckers){
if(fnChecker(theData)) {
bSuccess = true;
break;
}
}
if(bSuccess) { std::cout << "A function has evaluated the data successfully." << std::endl; }
You can modify the list however you like at runtime by: external objects, config settings from file, etc...
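The loop over the checkers can also be expressed with `std::any_of`. A small sketch, where `startsWithDigit` and `isHello` are hypothetical stand-ins for function1, function2 and so on, and `Checker` and `evaluate` are names made up for the example:

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for function1, function2, ...
bool startsWithDigit(std::string& data) {
    return !data.empty() && data[0] >= '0' && data[0] <= '9';
}
bool isHello(std::string& data) { return data == "hello"; }

using Checker = std::function<bool(std::string&)>;

// Tries the checkers in order and reports whether any of them succeeded.
bool evaluate(const std::vector<Checker>& checkers, std::string& data) {
    return std::any_of(checkers.begin(), checkers.end(),
                       [&data](const Checker& check) { return check(data); });
}
```

In practice `std::any_of` stops at the first checker that returns true, like the explicit loop with `break`. If you need to know which checker succeeded, the explicit loop is the better fit.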
I'm having some trouble with the following method and I need some help trying to figure out what I am doing wrong.
I want to return a reference to a Value in a document. I am passing the Document from outside the function so that when I read a json file into it I don't "lose it".
const rapidjson::Value& CTestManager::GetOperations(rapidjson::Document& document)
{
const Value Null(kObjectType);
if (m_Tests.empty())
return Null;
if (m_current > m_Tests.size() - 1)
return Null;
Test& the_test = m_Tests[m_current];
CMyFile fp(the_test.file.c_str()); // non-Windows use "r"
if (!fp.is_open())
return Null;
u32 operations_count = 0;
CFileBuffer json(fp);
FileReadStream is(fp.native_handle(), json, json.size());
if (document.ParseInsitu<kParseCommentsFlag>(json).HasParseError())
{
(...)
}
else
{
if (!document.IsObject())
{
(...)
}
else
{
auto tests = document.FindMember("td_tests");
if (tests != document.MemberEnd())
{
for (SizeType i = 0; i < tests->value.Size(); i++)
{
const Value& test = tests->value[i];
if (test["id"].GetInt() == the_test.id)
{
auto it = test.FindMember("operations");
if (it != test.MemberEnd())
{
//return it->value; is this legitimate?
return test["operations"];
}
return Null;
}
}
}
}
}
return Null;
}
Which I am calling like this:
Document document;
auto operations = TestManager().GetOperations(document);
When I inspect the value of test["operations"] inside the function I can see everything I would expect (debug code removed from the above code).
When I inspect the returned value outside the function I can see that it's an array (which I expect). The member count in the array is correct as well, but when I print it out, I only see garbage instead.
When I "print" the Value to a string inside the methods, I get what I expect (i.e. well-formatted JSON), but when I do it outside, all keys show up as "IIIIIIII" and values that aren't strings show up correctly.
rapidjson::StringBuffer strbuf2;
rapidjson::PrettyWriter<rapidjson::StringBuffer> writer2(strbuf2);
ops->Accept(writer2);
As this didn't work I decided to change the method to receive a Value as a parameter and do a deep copy into it like this
u32 CTestManager::GetOperationsEx(rapidjson::Document& document, rapidjson::Value& operations)
{
(...)
if (document.ParseInsitu<kParseCommentsFlag>(json).HasParseError())
{
(...)
}
else
{
if (!document.IsObject())
{
(...)
}
else
{
auto tests = document.FindMember("tests");
if (tests != document.MemberEnd())
{
for (SizeType i = 0; i < tests->value.Size(); i++)
{
const Value& test = tests->value[i];
if (test["id"].GetInt() == the_test.id)
{
const Value& opv = test["operations"];
Document::AllocatorType& allocator = document.GetAllocator();
operations.CopyFrom(opv, allocator); //would Swap work?
return operations.Size();
}
}
}
}
}
return 0;
}
Which I'm calling like this:
Document document;
Value operations(kObjectType);
u32 count = TestManager().GetOperationsEx(document, operations);
But... I get same thing!!!!
I know that it's going to be something silly but I can't put my hands on it!
Any ideas?
The problem in this case lies with the use of ParseInsitu. When either GetOperations variant exits, the CFileBuffer goes out of scope and is cleaned up. Because the JSON is parsed in situ, the document's strings point into that buffer, so when the file buffer goes, so does the data.
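The lifetime rule can be illustrated without rapidjson. The sketch below is only an analogy, not rapidjson code: `ParsedFile` and `parseOwned` are invented names, and `std::string_view` plays the role of the in-situ string pointers. The point is that the parsed result and the buffer it points into must live and die together.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// The parsed fields are only views; the bytes live in `buffer`.
struct ParsedFile {
    std::vector<char> buffer;              // owns the bytes
    std::vector<std::string_view> fields;  // point into buffer
};

// Splits comma-separated text "in situ": no field text is copied.
ParsedFile parseOwned(const std::string& text) {
    ParsedFile out;
    out.buffer.assign(text.begin(), text.end());
    std::string_view all(out.buffer.data(), out.buffer.size());
    std::size_t start = 0;
    while (start <= all.size()) {
        std::size_t comma = all.find(',', start);
        if (comma == std::string_view::npos) comma = all.size();
        out.fields.push_back(all.substr(start, comma - start));
        start = comma + 1;
    }
    // Moving a std::vector keeps its heap storage, so the views stay valid.
    return out;
}
```

For the rapidjson case, the equivalent options are to keep the buffer alive for as long as the Document is used, or to parse with the non-in-situ Parse, which copies the strings into the document.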
I want to define a sweet macro that transforms
{ a, b } # o
into
{ o.a, o.b }
My current attempt is
macro (#) {
case infix { { $prop:ident (,) ... } | _ $o } => {
return #{ { $prop: $o.$prop (,) ... } }
}
}
However, this gives me
SyntaxError: [patterns] Ellipses level does not match in the template
I suspect I don't really understand how ... works, and may need to somehow loop over the values of $prop and build syntax objects for each and somehow concatenate them, but I'm at a loss as to how to do that.
The problem is the syntax expander thinks you're trying to expand $o.$prop instead of $prop: $o.$prop. Here's the solution:
macro (#) {
rule infix { { $prop:ident (,) ... } | $o:ident } => {
{ $($prop: $o.$prop) (,) ... }
}
}
Notice that I placed the unit of code in a $() block of its own to disambiguate the ellipsis expansion.
Example: var x = { a, b } # o; becomes var x = { a: o.a, b: o.b };.