Use of paratheses symbols in django-viewflow flow - django

I have been at a loss understanding the use of parentheses in django-viewflow flow code.
For example in the code below
start = (
flow.Start(views.StartView)
.Permission('shipment.can_start_request')
.Next(this.split_clerk_warehouse)
)
# clerk
split_clerk_warehouse = (
flow.Split()
.Next(this.shipment_type)
.Next(this.package_goods)
)
from here
It seems as though, a tuple containing functions is assigned to start and to split_clerk_warehouse e.t.c. What does it mean. From my best guess it would seem that the .Next functions accept a tuple as input.
NOTE I do understand the method chaining used here. I am just at a loss to understand the use of braces.
Thanks.

If I understand correctly, you wonder what the use is of the outer brackets.
Let us first write the (first, but applicable to the second) statement without outer brackets:
start = flow.Start(views.StartView).Permission('shipment.can_start_request').Next(this.split_clerk_warehouse)
This is exactly equivalent to code in your sample. But you probably agree that this is quite unreadable. It requires a user to scroll over the code, and furthermore it is a long chain of characters, without any structure. A programmer would have a hard time understanding it, especially if - later - we would also use brackets inside the parameters of calls.
So perhaps it would make sense to write it like:
start = flow.Start(views.StartView).
Permission('shipment.can_start_request').
Next(this.split_clerk_warehouse)
But this will not work: Python is a language that uses spacing as a way to attach semantics on code. As a result it will break: Python will try to parse the separate linkes as separate statements. But then what to do with the tailing dot? As a result the parser would error.
Now Python has some ways to write statements in a multi-line fashion. For example with backslashes:
start = flow.Start(views.StartView). \
Permission('shipment.can_start_request'). \
Next(this.split_clerk_warehouse)
with the backslash we specify that the next line actually belongs to the current one, and thus it is parsed like we wrote this all on a single line.
The disadvantage is that we easily can forget a backslash here, and this would again let the parser error. Furthermore this requires linear work: for every line we have to add one element.
But programming languages actually typically have a feature that programmers constantly use to group (sub)expressions together: brackets. We use it to give precedence (for example 3 * (2 + 5)), but we can use it to simply group one expression over multiple lines as well, like:
start = (
flow.Start(views.StartView)
.Permission('shipment.can_start_request')
.Next(this.split_clerk_warehouse)
)
Everything that is within the brackets belongs to the same expression, so Python will ignore the new lines.
Note that tuple literals also use brackets. For example:
() # empty tuple
(1, ) # singleton tuple (one element)
(1, 'a', 2, 5) # 4-tuple
But here we need to write a comma at the end for a singleton tuple, or multiple elements separated by comma's , (except for the empty tuple).

Related

OpenModelica SimulationOptions 'variableFilter' not working with '^' exceptions

To reduce size of my simulation output files, I want to give variable name exceptions instead of a list of many certain variables to the simulationsOptions/outputFilter (cf. OpenModelica Users Guide / Output) of my model. I found the regexp operator "^" to fullfill my needs, but that didn't work as expected. So I think that something is wrong with the interpretation of connected character strings when negated.
Example:
When I have any derivatives der(...) in my model and use variableFilter=der.* the output file will contain all the filtered derivatives. Since there are no other varibles beginning with character d the same happens with variableFilter=d.*. For testing I also tried variableFilter=rde.* to confirm that every variable is filtered.
When I now try to except by variableFilter=^der.*, =^rde.* or =^d.*, I get exactly the same result as without using ^. So the operator seems to be ignored in this notation.
When I otherwise use variableFilter=[^der].*, =[^rde].* or even =[^d].*, all wanted derivation variables are filtered from the ouput, but there is no difference between those three expressions above. For me it seems that every character is interpretated standalone and not as as a connected string.
Did I understand and use the regexp usage right or could this be a code bug?
Side/follow-up question: Where can I officially report this for software revision?
_
OpenModelica v.1.19.2 (64-bit)

Matlab: What's the most efficient approach to parse a large table or cell array with regexp when sometimes there is no match?

I am working with a messy manually maintained "database" that has a column containing a string with name,value pairs. I am trying to parse the entire column with regexp to pull out the values. The column is huge (>100,000 entries). As a proxy for my actual data, let's use this code:
line1={'''thing1'': ''-583'', ''thing2'': ''245'', ''thing3'': ''246'', ''morestuff'':, '''''};
line2={'''thing1'': ''617'', ''thing2'': ''239'', ''morestuff'':, '''''};
line3={'''thing1'': ''unexpected_string(with)parens5'', ''thing2'': 245, ''thing3'':''246'', ''morestuff'':, '''''};
mycell=vertcat(line1,line2,line3);
This captures the general issues encountered in the database. I want to extract what thing1, thing2, and thing3 are in each line using cellfun to output a scalar cell array. They should normally be 3 digit numbers, but sometimes they have an unexpected form. Sometimes thing3 is completely missing, without the name even showing up in the line. Sometimes there are minor formatting inconsistencies, like single quotes missing around the value, spaces missing, or dashes showing up in front of the three digit value. I have managed to handle all of these, except for the case where thing3 is completely missing.
My general approach has been to use expressions like this:
expr1='(?<=thing1''):\s?''?-?([\w\d().]*?)''?,';
expr2='(?<=thing2''):\s?''?-?([\w\d().]*?)''?,';
expr3='(?<=thing3''):\s?''?-?([\w\d().]*?)''?,';
This looks behind for thingX' and then tries to match : followed by zero or one spaces, followed by 0 or 1 single quote, followed by zero or one dash, followed by any combination of letters, numbers, parentheses, or periods (this is defined as the token), using a lazy match, until zero or one single quote is encountered, followed by a comma. I call regexp as regexp(___,'tokens','once') to return the matching token.
The problem is that when there is no match, regexp returns an empty array. This prevents me from using, say,
out=cellfun(#(x) regexp(x,expr3,'tokens','once'),mycell);
unless I call it with 'UniformOutput',false. The problem with that is twofold. First, I need to then manually find the rows where there was no match. For example, I can do this:
emptyout=cellfun(#(x) isempty(x),out);
emptyID=find(emptyout);
backfill=cell(length(emptyID),1);
[backfill{:}]=deal('Unknown');
out(emptyID)=backfill;
In this example, emptyID has a length of 1 so this code is overkill. But I believe this is the correct way to generalize for when it is longer. This code will change every empty cell array in out with the string Unknown. But this leads to the second problem. I've now got a 'messy' cell array of non-scalar values. I cannot, for example, check unique(out) as a result.
Pardon the long-windedness but I wanted to give a clear example of the problem. Now my actual question is in a few parts:
Is there a way to accomplish what I'm trying to do without using 'UniformOutput',false? For example, is there a way to have regexp pass a custom string if there is no match (e.g. pass 'Unknown' if there is no match)? I can think of one 'cheat', which would be to use the | operator in the expression, and if the first token is not matched, look for something that is ALWAYS found. I would then still need to double back through the output and change every instance of that result to 'Unknown'.
If I take the 'UniformOutput',false approach, how can I recover a scalar cell array at the end to easily manipulate it (e.g. pass it through unique)? I will admit I'm not 100% clear on scalar vs nonscalar cell arrays.
If there is some overall different approach that I'm not thinking of, I'm also open to it.
Tangential to the main question, I also tried using a single expression to run regexp using 3 tokens to pull out the values of thing1, thing2, and thing3 in one pass. This seems to require 'UniformOutput',false even when there are no empty results from regexp. I'm not sure how to get a scalar cell array using this approach (e.g. an Nx1 cell array where each cell is a 3x1 cell).
At the end of the day, I want to build a table using these results:
mytable=table(out1,out2,out3);
Edit: Using celldisp sheds some light on the problem:
celldisp(out)
out{1}{1} =
246
out{2} =
Unknown
out{3}{1} =
246
I assume that I need to change the structure of out so that the contents of out{1}{1} and out{3}{1} are instead just out{1} and out{3}. But I'm not sure how to accomplish this if I need 'UniformOutput',false.
Note: I've not used MATLAB and this doesn't answer the "efficient" aspect, but...
How about forcing there to always be a match?
Just thinking about you really wanting a match to skip this problem, how about an empty match?
Looking on the MATLAB help page here I can see a 'emptymatch' option, perhaps this is something to try.
E.g.
the_thing_i_want_to_find|
Match "the_thing_i_want_to_find" or an empty match, note the | character.
In capture group it might look like this:
(the_thing_i_want_to_find|)
As a workaround, I have found that using regexprep can be used to find entries where thing3 is missing. For example:
replace='$1 ''thing3'': ''Unknown'', ''morestuff''';
missingexpr='(?<=thing2'':\s?)(''?-?[\w\d().]*?''?,) ''morestuff''';
regexprep(mycell{2},missingexpr,replace)
ans =
''thing1': '617', 'thing2': '239', 'thing3': 'Unknown', 'morestuff':, '''
Applying it to the entire array:
fixedcell=cellfun(#(x) regexprep(x,missingexpr,replace),mycell);
out=cellfun(#(x) regexp(x,expr3,'tokens','once'),fixedcell,'UniformOutput',false);
This feels a little roundabout, but it works.
cellfun can be replaced with a plain old for loop. Your code will either be equally fast, or maybe even faster. cellfun is implemented with a loop anyway, there is no advantage of using it other than fewer lines of code. In your explicit loop, you can then check the output of regexp, and build your output array any way you like.

How can you require an undetermined character to be repeated consecutively a certain number of times in Ruby Treetop?

I want to create a rule that requires a non-number, non-letter character to be consecutively repeated three times. The rule would look something like this:
# Note, this code does not do what I want!
grammar ThreeCharacters
rule threeConsecutiveCharacters
(![a-zA-Z0-9] .) 3..3
end
end
Is there any way to require the first character that it detects to be repeated three times?
There was previously a similar question about detecting the number of indentations: PEG for Python style indentation
The solution there was to first initialize the indentation stack:
&{|s| #indents = [-1] }
Then save the indentation for the current line:
&{|s|
level = s[0].indentation.text_value.length
#indents << level
true
}
Whenever a new line begins it peeks at the indentation like this:
!{|s|
# Peek at the following indentation:
save = index; i = _nt_indentation; index = save
# We're closing if the indentation is less or the same as our enclosing block's:
closing = i.text_value.length <= #indents.last
}
If the indentation is larger it adds the new indentation level to the stack.
I could create something similar for my problem, but this seems like a very tedious way to solve it.
Are there any other ways to create my rule?
Yes, you can do it this way in Treetop. This kind of thing not generally possible with a PEG because of the way packrat parsing works; it's greedy but you need to limit its greed using semantic information from earlier in the parse. It's only the addition in Treetop of semantic predicates (&{...}} that make it possible. So yes, it's tedious. You might consider using Rattler instead, as it has a significant number of features in addition to those available in Treetop. I can't advise (as maintainer of Treetop, but not being a user of Rattler) but I am very impressed by its feature set and I think it will handle this case better.
If you proceed with Treetop, bear in mind that every semantic predicate should return a boolean value indicating success or failure. This is not explicit in the initialisation of #indents above.

Vim, C++, look up member function

I am using vim 7.x
I am using alternate file.
I have a mapping of *.hpp <--> *.cpp
Suppose I'm in
class Foo {
void some_me#mber_func(); // # = my cursor
}
in Foo.hpp
is there a way to tell vim to do the following:
Grab word under # (easy, expand("")
Look up the class I'm inside of ("Foo") <-- I have no idea how to do this
Append `1 & 2 (easy: using ".") --> "Foo::some_member_func"
4: Switch files (easy, :A)
Do a / on 4
So basically, I can script all of this together, except the "find the name of the enclosing class I'm in part (especially if classes are nested).
I know about ctags. I know about cscope. I'm choosing to not use them -- I prefer solutions where I understand where they break.
This is relatively easy to do crudely and very difficult to do well. C and C++ are rather complex languages to parse reliably. At the risk of being downvoted, I'd personally recommend parsing the tags file generated by ctags, but if you really want to do it in Vim, there are a few of options for the "crude" method.
Make some assumptions. The assumptions you make depend on how complicated you want it to be. At the simplest level: assume you're in a class definition and there are no other nearby braces. Based on your coding style, assume that the opening brace of the class definition is on the same line as "class".
let classlineRE = '^class\s\+\(\k\+\)\s\+{.*'
let match = search(classlineRE, 'bnW')
if match != 0
let classline = getline(match)
let classname = substitute(classline, classlineRE, '\1', '')
" Now do something with classname
endif
The assumptions model can obviously be extended/generalised as much as you see fit. You can just search back for the brace and then search back for class and take what's in between (to handle braces on a separate line to "class"). You can filter out comments. If you want to be really clever, you can start looking at what level of braces you're in and make sure it's a top level one (go to the start of the file, add 1 every time you see '{' and subtract one every time you see '}' etc). Your vim script will get very very very complicated.
Another one risking the downvote, you could use one of the various C parsers written in python and use the vim-python interface to make it act like a vim script. To be honest, if you're thinking of doing this, I'd stick with ctags/cscope.
Use rainbow.vim. This does highlighting based on depth of indentation, so you could be a little clever and search back (using search('{', 'bW') or similar) for opening braces, then interrogate the syntax highlighting of those braces (using synIDattr(synID(line("."), col("."),1), "name")) and if it's hlLevel0, you know it's a top-level brace. You can then search back for class and parse as per item 1.
I hope that all of the above gives you some food for thought...

Tokenize the text depending on some specific rules. Algorithm in C++

I am writing a program which will tokenize the input text depending upon some specific rules. I am using C++ for this.
Rules
Letter 'a' should be converted to token 'V-A'
Letter 'p' should be converted to token 'C-PA'
Letter 'pp' should be converted to token 'C-PPA'
Letter 'u' should be converted to token 'V-U'
This is just a sample and in real time I have around 500+ rules like this. If I am providing input as 'appu', it should tokenize like 'V-A + C-PPA + V-U'. I have implemented an algorithm for doing this and wanted to make sure that I am doing the right thing.
Algorithm
All rules will be kept in a XML file with the corresponding mapping to the token. Something like
<rules>
<rule pattern="a" token="V-A" />
<rule pattern="p" token="C-PA" />
<rule pattern="pp" token="C-PPA" />
<rule pattern="u" token="V-U" />
</rules>
1 - When the application starts, read this xml file and keep the values in a 'std::map'. This will be available until the end of the application(singleton pattern implementation).
2 - Iterate the input text characters. For each character, look for a match. If found, become more greedy and look for more matches by taking the next characters from the input text. Do this until we are getting a no match. So for the input text 'appu', first look for a match for 'a'. If found, try to get more match by taking the next character from the input text. So it will try to match 'ap' and found no matches. So it just returns.
3 - Replace the letter 'a' from input text as we got a token for it.
4 - Repeat step 2 and 3 with the remaining characters in the input text.
Here is a more simple explanation of the steps
input-text = 'appu'
tokens-generated=''
// First iteration
character-to-match = 'a'
pattern-found = true
// since pattern found, going recursive and check for more matches
character-to-match = 'ap'
pattern-found = false
tokens-generated = 'V-A'
// since no match found for 'ap', taking the first success and replacing it from input text
input-text = 'ppu'
// second iteration
character-to-match = 'p'
pattern-found = true
// since pattern found, going recursive and check for more matches
character-to-match = 'pp'
pattern-found = true
// since pattern found, going recursive and check for more matches
character-to-match = 'ppu'
pattern-found = false
tokens-generated = 'V-A + C-PPA'
// since no match found for 'ppu', taking the first success and replacing it from input text
input-text = 'u'
// third iteration
character-to-match = 'u'
pattern-found = true
tokens-generated = 'V-A + C-PPA + V-U' // we'r done!
Questions
1 - Is this algorithm looks fine for this problem or is there a better way to address this problem?
2 - If this is the right method, std::map is a good choice here? Or do I need to create my own key/value container?
3 - Is there a library available which can tokenize string like the above?
Any help would be appreciated
:)
So you're going through all of the tokens in your map looking for matches? You might as well use a list or array, there; it's going to be an inefficient search regardless.
A much more efficient way of finding just the tokens suitable for starting or continuing a match would be to store them as a trie. A lookup of a letter there would give you a sub-trie which contains only the tokens which have that letter as the first letter, and then you just continue searching downward as far as you can go.
Edit: let me explain this a little further.
First, I should explain that I'm not familiar with these the C++ std::map, beyond the name, which makes this a perfect example of why one learns the theory of this stuff as well as than details of particular libraries in particular programming languages: unless that library is badly misusing the name "map" (which is rather unlikely), the name itself tells me a lot about the characteristics of the data structure. I know, for example, that there's going to be a function that, given a single key and the map, will very efficiently search for and return the value associated with that key, and that there's also likely a function that will give you a list/array/whatever of all of the keys, which you could search yourself using your own code.
My interpretation of your data structure is that you have a map where the keys are what you call a pattern, those being a list (or array, or something of that nature) of characters, and the values are tokens. Thus, you can, given a full pattern, quickly find the token associated with it.
Unfortunately, while such a map is a good match to converting your XML input format to a internal data structure, it's not a good match to the searches you need to do. Note that you're not looking up entire patterns, but the first character of a pattern, producing a set of possible tokens, followed by a lookup of the second character of a pattern from within the set of patterns produced by that first lookup, and so on.
So what you really need is not a single map, but maps of maps of maps, each keyed by a single character. A lookup of "p" on the top level should give you a new map, with two keys: p, producing the C-PPA token, and "anything else", producing the C-PA token. This is effectively a trie data structure.
Does this make sense?
It may help if you start out by writing the parsing code first, in this manner: imagine someone else will write the functions to do the lookups you need, and he's a really good programmer and can do pretty much any magic that you want. Writing the parsing code, concentrate on making that as simple and clean as possible, creating whatever interface using these arbitrary functions you need (while not getting trivial and replacing the whole thing with one function!). Now you can look at the lookup functions you ended up with, and that tells you how you need to access your data structure, which will lead you to the type of data structure you need. Once you've figured that out, you can then work out how to load it up.
This method will work - I'm not sure that it is efficient, but it should work.
I would use the standard std::map rather than your own system.
There are tools like lex (or flex) that can be used for this. The issue would be whether you can regenerate the lexical analyzer that it would construct when the XML specification changes. If the XML specification does not change often, you may be able to use tools such as lex to do the scanning and mapping more easily. If the XML specification can change at the whim of those using the program, then lex is probably less appropriate.
There are some caveats - notably that both lex and flex generate C code, rather than C++.
I would also consider looking at pattern matching technology - the sort of stuff that egrep in particular uses. This has the merit of being something that can be handled at runtime (because egrep does it all the time). Or you could go for a scripting language - Perl, Python, ... Or you could consider something like PCRE (Perl Compatible Regular Expressions) library.
Better yet, if you're going to use the boost library, there's always the Boost tokenizer library -> http://www.boost.org/doc/libs/1_39_0/libs/tokenizer/index.html
You could use a regex (perhaps the boost::regex library). If all of the patterns are just strings of letters, a regex like "(a|p|pp|u)" would find a greedy match. So:
Run a regex_search using the above pattern to locate the next match
Plug the match-text into your std::map to get the replace-text.
Print the non-matched consumed input and replace-text to your output, then repeat 1 on the remaining input.
And done.
It may seem a bit complicated, but the most efficient way to do that is to use a graph to represent a state-chart. At first, i thought boost.statechart would help, but i figured it wasn't really appropriate. This method can be more efficient that using a simple std::map IF there are many rules, the number of possible characters is limited and the length of the text to read is quite high.
So anyway, using a simple graph :
0) create graph with "start" vertex
1) read xml configuration file and create vertices when needed (transition from one "set of characters" (eg "pp") to an additional one (eg "ppa")). Inside each vertex, store a transition table to the next vertices. If "key text" is complete, mark vertex as final and store the resulting text
2) now read text and interpret it using the graph. Start at the "start" vertex. ( * ) Use table to interpret one character and to jump to new vertex. If no new vertex has been selected, an error can be issued. Otherwise, if new vertex is final, print the resulting text and jump back to start vertex. Go back to (*) until there is no more text to interpret.
You could use boost.graph to represent the graph, but i think it is overly complex for what you need. Make your own custom representation.