Cypher IF A AND NOT(B) in match - if-statement

I want to write a Cypher clause that says (I know this is not correct syntax, it's just an example to show the general idea):
IF (S1)-->(B1{attr:TRUE})-->(G) AND NOT((S2)-->(B2{attr:FALSE})-->(G))
THEN stuff
So, if I have only the node B1 with attr=TRUE, I want the pattern to match. If I have B1 with attr=TRUE and also B2 with attr=FALSE, I want the pattern not to match. In all other cases where at least B1 with attr=TRUE is found, the pattern should also match.
But I cannot figure out how to implement this logic.

Here is an example of how to do that:
MATCH (S1)-->(B1{attr:TRUE})-->(G)
WHERE NOT ()-->({attr:FALSE})-->(G)
... stuff ...
Or, if stuff needs to use the B2 nodes:
MATCH (S1)-->(B1{attr:TRUE})-->(G), (B2{attr:FALSE})
WHERE NOT ()-->(B2)-->(G)
... stuff ...
This should give you an idea of how to start. It all depends on what data your stuff needs to use, and how much you want to specify about the S and B nodes.

RegEx for square brackets' string but not vector's index, it's possible?

I'm using Harbour in Sublime Text 3.
How can I create a regex that matches square-bracket strings like the ones below:
a:= [text] // same as a:= "text"
b:= [3] // same as b:= "3"
c:= {2,[text]} // same as c:= {2,"text"}
d:=[text] // same as d:="text"
Function([text]) // Same as Function("text")
but does not match vector indices, like:
aVet[index] // Same as aVet[1], aVet[2]...
e:= aVet[index] // Same as aVet[1], aVet[2]...
f:= aVet[2,3] // Same as aVet[1,2], aVet[2,5]...
g:= aVet[CONSTANT] // Same as aVet[FOO], aVet[BAR]...
This should work for you:
[^a-zA-Z0-9\s]\s*(\[.*?\])
Regex101
vet\[.*?\]|(\[.*?\])
This assumes that vector indices always start with vet. You should tag the question with whichever language you're using to clear up this confusion. Anyway, the code above should do the trick. Follow the link for a detailed breakdown of what's happening behind the curtain.
The code above might not be that intuitive, but the basic idea is this: the engine looks for statements that contain the word vet followed by square brackets. If it finds one, it matches it without capturing anything. Otherwise, it captures the bracketed text matched by the right-hand side of the alternation (what we want). The only issue is that if you add comments containing square brackets to your code, it might capture those too. If you plan to do so, the regex needs to be modified for more conditions, but this will work as long as you don't do that. Let me know if that is not the case.
First things first: I'm using Harbour in Sublime Text 3.
iismathwizard, your code almost does the magic.
[^a-zA-Z0-9\s]\s*(\[.*?\])
It always includes the first character before the opening square bracket, like:
"=" in the a, b and d examples
"," in the c example
"(" in the function example
However, it doesn't match any vector index.
user41235, your code does not exclude the vector indices.
vet\[.*?\]|(\[.*?\])
I added a few examples for more detail.
Sorry for my English...
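One way to avoid both problems described above (the extra leading character and the unexcluded vector indices) is a negative lookbehind: accept an opening bracket only when the character just before it cannot end an identifier or an index expression. The sketch below uses Python's re module purely for illustration; whether Sublime Text's syntax engine accepts the same lookbehind is an assumption to verify, and the names BRACKET_STRING and bracket_strings are hypothetical. It does not handle comments, nested brackets, or an index written with a space before the bracket.

```python
import re

# An opening "[" counts as a string delimiter only if it is NOT directly
# preceded by a letter, digit, underscore, "]" or ")" (assumed to be the
# characters that can end an indexable expression).
BRACKET_STRING = re.compile(r"(?<![A-Za-z0-9_\])])\[([^\[\]]*)\]")

def bracket_strings(line: str) -> list:
    """Return the contents of every square-bracket string literal in a line."""
    return BRACKET_STRING.findall(line)

if __name__ == "__main__":
    for sample in ["a:= [text]", "d:=[text]", "c:= {2,[text]}",
                   "Function([text])", "aVet[index]", "f:= aVet[2,3]"]:
        print(sample, "->", bracket_strings(sample))
```

Because the lookbehind consumes nothing, the match starts at the bracket itself, so no "=", "," or "(" is dragged into the result.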

Vim matches a rectangle area

I want to match a rectangle area in Vim using a regex expression, for example:
abcd test1
abcd test2
I want to match test1 and test2 at once, but not the abcd parts.
(test1 and test2 are constant, we don't need to consider [0-9], that's just an example)
I want to match every column-aligned test1 and test2, like this:
test1
test2
The rectangular area may appear anywhere; I can't assume it is at "column 3" or something of that sort.
If they are not aligned, don't match it.
I tried \1\@<=test1\n\(.*\)\@<=test2 but no luck, because a lookbehind breaks the group. (from :help /\@<=)
Does anyone know how to do it with only vim-regex? Thanks.
Edit:
A complicated example may be this one:
aaaaaaaaa
b test1 b
c test2 c
ddddddddd
Match only test1 and test2.
Using two or more regexes is acceptable (one for test1 and the other for test2?)
Edit2:
This is just for fun. I am just curious about how much Vim can achieve; it's not a serious problem. It may be boring and meaningless for many people, and that is fine with me. Please don't be bothered, good night :)
Simply searching for /test[0-9] will suffice. But I think the spirit of the question is really more about visual blocks. In visual mode you can use text objects for movement. So, in this case:
Search for test1.
Press Control-V (to turn on visual block mode)
Press w to visually select the entire word.
Press j to visually select the next word in the column below the first one. (use a range to extend this rectangular block, e.g. 10j would visually select the next ten items in that column.)
Try the following to find the match you need.
The syntax is
/somethingWeAreLooking\(\_.\)*followedByTheOtherThing
In this case it will be like this:
/test\(\_.\)*[1-9]

Hunspell/Aspell data conversion to human-readable inflection list

Is there an easy way to generate a human-readable inflection list from Hunspell/Aspell dictionary data files?
For example, I'd like to generate the following outputs (for different languages):
...
book, books
book, books, booked, booking
...
go, goes, went, gone, going
...
I looked at the Hunspell/Aspell docs, but couldn't find an API call that would do this.
The command-line tool has a way to do this, but it doesn't output quite the format you're looking for. You could also do this manually with some simple regex scripting.
The format for each set of affixes is
TYPE TAG REMOVE REPLACE MATCH
Where TAG matches what follows the / for a given word in the .dic file, you can do the following (presuming you've already stripped the /... from the word):
$word =~ s/$remove$/$replace/ if $word =~ /$match$/;
Notice the $ there, matching the end of the line/word. Use ^ instead if it's a prefix.
There are three caveats:
The $match taken directly from the .aff file is in almost all cases equivalent to standard regex. There are minor variations: if the match is something like [abc-gh], you'd be better off changing it to (a|b|c|-|g|h) or [abcgh-] (hunspell doesn't use the hyphen as a metacharacter), otherwise it'll be interpreted as [abcdefgh] (standard regex). For a negated character class, your options are to manually move the - to the end of the expression (e.g. [^a-df] to [^adf-]) or to use negative lookbehinds.
If $replace is 0, then you should change it to an empty string.
If your result ends with /..., you need to reprocess it again because it has a double affix.
Be careful. By my rough calculations, the dictionary I'm working on could have more than 50 million words being formed (and I wouldn't be surprised if it hits beyond 100 million).
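The suffix step described above can be sketched in Python as follows. This is a minimal illustration, not a Hunspell API: the function name apply_suffix and the sample rules are made up for the example, it treats 0 in either the REMOVE or the REPLACE field as an empty string (an assumption that extends the caveat above to both fields), and it leaves out the hyphen rewriting and double-affix reprocessing.

```python
import re

def apply_suffix(word, remove, replace, match):
    """Apply one suffix rule (REMOVE REPLACE MATCH) to a bare word.

    Returns the inflected form, or None when the rule does not apply.
    """
    # "0" stands for "nothing" in the affix file (assumed for both fields).
    remove = "" if remove == "0" else remove
    replace = "" if replace == "0" else replace

    # The MATCH condition must hold at the end of the word.
    if not re.search("(?:" + match + ")$", word):
        return None
    # The part to strip must actually be there.
    if remove and not word.endswith(remove):
        return None

    stem = word[: len(word) - len(remove)] if remove else word
    return stem + replace

if __name__ == "__main__":
    # Hypothetical rules written in the REMOVE REPLACE MATCH order.
    print(apply_suffix("try", "y", "ies", "[^aeiou]y"))
    print(apply_suffix("book", "0", "s", "[^sxzhy]"))
    print(apply_suffix("day", "y", "ies", "[^aeiou]y"))
```

Running every matching rule for every word in the .dic file and grouping the results by stem would give lines in the "book, books" shape asked for in the question.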

How to create a regex to check whether a set of words exists in a given string?

How can I write a regex to check if a set of words exist in a given string?
For example, I would like to check if a domain name contains "yahoo.com" at the end of it.
'answers.yahoo.com', would be valid.
'yahoo.com.answers', would be wrong. 'yahoo.com' must come in the end.
I got a hint from somewhere that it might be something like this.
"/^[^yahoo.com]$/"
But I am totally new to regex. So please help with this one, then I can learn further.
When asking regex questions, always specify the language or application, too!
From your history it looks like JavaScript / jQuery is most likely.
Anyway, to test that a string ends in "yahoo.com" use /.*yahoo\.com$/i
In JS code:
if (/.*yahoo\.com$/i.test (YOUR_STR) ) {
//-- It's good.
}
To test whether a set of words has at least one match, use:
/word_one|word_two|word_three/
To limit matches to just the most-common, legal sub-domains, ending with "yahoo.com", use:
/^(\w+\.)+yahoo\.com$/
(As a crude, first pass)
For other permutations, please clarify the question.
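Two details of the patterns above are worth knowing: /.*yahoo\.com$/ also accepts a host such as "notyahoo.com", and /^(\w+\.)+yahoo\.com$/ rejects the bare "yahoo.com". A minimal Python sketch of a variant that anchors on a label boundary is shown here for comparison; the names ENDS_WITH_YAHOO and is_yahoo_domain are hypothetical, and whether the bare domain should count as valid is an assumption about the requirement.

```python
import re

# "yahoo.com" must be the whole host or be preceded by a dot,
# and it must sit at the very end of the string.
ENDS_WITH_YAHOO = re.compile(r"(?:^|\.)yahoo\.com$", re.IGNORECASE)

def is_yahoo_domain(host: str) -> bool:
    """True when host is yahoo.com or one of its subdomains."""
    return ENDS_WITH_YAHOO.search(host) is not None

if __name__ == "__main__":
    for host in ["answers.yahoo.com", "yahoo.com.answers",
                 "yahoo.com", "notyahoo.com"]:
        print(host, is_yahoo_domain(host))
```

The same pattern works unchanged as a JavaScript regex literal, since it uses only a non-capturing group and anchors.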

What would be the best (runtime performance) application or pattern or code or library for matching string patterns

I have been trying to figure out a decent way of matching string patterns. I will try my best to provide as much information as I can regarding what I am trying to do.
The simplest way to put it is that there are some specified patterns, and we want to know which of these patterns match a given request completely or partially. The specified patterns hardly change. There are about 10K requests per day, but the results have to be provided ASAP, so runtime performance is the highest priority.
I have been thinking of using Assembly Compiled Regular Expression in C# for this, but I am not sure if I am headed in the right direction.
Scenario:
Data File:
Let's assume that data is provided as an XML request in a known schema format. It has anywhere between 5 and 20 rows of data. Each row has 10-30 columns. Each of the columns can only hold data in a pre-defined pattern. For example:
A1 - Will be "3 digits" followed by a "." followed by "2 digits" - [0-9]{3}.[0-9]{2}
A2 - Will be "1 character" followed by "digits" - [A-Z][0-9]{4}
The sample would be something like:
<Data>
<R1>
<A1>123.45</A1>
<A2>A5567</A2>
<A4>456EV</A4>
<An>xxx</An>
</R1>
</Data>
Rule File:
Rule ID   A1             A2
1001      [0-9]{3}.45    A55[0-8]{2}
2002      12[0-9].55     [X-Z][0-9]{4}
3055      [0-9]{3}.45    [X-Z][0-9]{4}
Rule Location - I am planning to store the Rule IDs in some sort of bit mask.
Each rule ID is then assigned a position in a bit string:
Rule ID   Location (from left to right)
1001      1
2002      2
3055      3
Pattern file: (This is not the final structure, but just a thought)
Column   Pattern          Rule Location
A1       [0-9]{3}.45      101
A1       12[0-9].55       010
A2       A55[0-8]{2}      100
A2       [X-Z][0-9]{4}    011
Now let's assume that SOMEHOW (not sure how I am going to limit the search to save time) I run the regex and make sure that the A1 column is only matched against A1 patterns and the A2 column against A2 patterns. I would end up with the following results for "Rule Location":
Column   Pattern          Rule Location
A1       [0-9]{3}.45      101
A2       A55[0-8]{2}      100
Doing an AND on the locations gives me location 1 - 1001 - Complete match.
Doing an XOR on the locations gives me location 3 - 3055 - Partial match. (I am purposely not doing an OR, because that would have returned 1001 and 3055 as the result, which would be wrong for a partial match.)
The final results I am looking for are:
1001 - Complete Match
3055 - Partial Match
Start Edit_1: Explanation on Matching results
Complete Match - This occurs when all of the patterns in a given Rule are matched.
Partial Match - This occurs when NOT all of the patterns in a given Rule are matched, but at least one pattern matches.
Example Complete Match (AND):
Rule ID 1001 matched for A1 (101) and A2 (100). If you look at the first character in 101 and 100, it is "1". When you do an AND, 1 AND 1, the result is 1. Thus position 1, i.e. 1001, is a Complete Match.
Example Partial Match (XOR):
Rule ID 3055 matched for A1 (101). If you look at the last character in 101 and 100, it is "1" and "0". When you do an XOR, 1 XOR 0, the result is 1. Thus position 3, i.e. 3055, is a Partial Match.
End Edit_1
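The AND/XOR arithmetic above can be sketched with plain Python integers used as bit masks. This is only an illustration under stated assumptions: the names RULE_IDS, rules_in and combine are hypothetical, each column is assumed to contribute exactly one mask (0 if nothing matched), and a complete match is assumed to mean a hit in every column. With two columns, XOR and "OR minus AND" give the same answer; with three or more columns they differ (a rule hit in three columns XORs back to 1), so the sketch uses OR minus AND for partial matches.

```python
# Position i (counted from the left) in a mask stands for RULE_IDS[i].
RULE_IDS = [1001, 2002, 3055]

def rules_in(mask: int) -> list:
    """Translate a bit mask back into the rule IDs whose bit is set."""
    width = len(RULE_IDS)
    return [RULE_IDS[i] for i in range(width) if mask >> (width - 1 - i) & 1]

def combine(masks: list) -> tuple:
    """Return (complete, partial) masks for the per-column masks given."""
    complete = masks[0]
    any_hit = 0
    for mask in masks:
        complete &= mask   # bit survives only if every column hit the rule
        any_hit |= mask    # bit is set if at least one column hit the rule
    partial = any_hit & ~complete
    return complete, partial

if __name__ == "__main__":
    # Masks from the example: A1 matched 101, A2 matched 100.
    complete, partial = combine([int("101", 2), int("100", 2)])
    print("complete:", rules_in(complete))
    print("partial:", rules_in(partial))
```

Python integers have arbitrary precision, so the same operators work on 100K-bit masks; in C#, System.Collections.BitArray offers And, Or, Xor and Not for the same purpose.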
Input:
The data will be provided in some sort of XML request. It can be one big request with 100K Data nodes or 100K requests with one data node only.
Rules:
The matching values have to be initially saved as some sort of pattern to make them easier to write and edit. Let's assume that there are approximately 100K rules.
Output:
Need to know which rules matched completely and partially.
Preferences:
I would prefer doing as much of the coding as I can in C#. However if there is a major performance boost, I can use a different language.
The "Input" and "Output" are my requirements; how I manage to get the "Output" does not matter. It has to be fast: let's say each Data node has to be processed in approximately 1 second.
Questions:
Are there any existing patterns or frameworks to do this?
Is using Regex the right path, specifically Assembly Compiled Regex?
If I end up using Regex, how can I specify for A1 patterns to only match against the A1 column?
If I do specify rule locations in a bit-type pattern, how do I process ANDs and XORs when it grows to be 100K characters long?
I am looking for any suggestions or options that I should consider.
Thanks..
The regular expression API only tells you when they fully matched, not when they partially matched. What you therefore need is some variation on a regular expression API that lets you try to match multiple regular expressions at once, and at the end can tell you which matched fully, and which partially matched. Ideally one that lets you precompile a set of patterns so you can avoid compilation at runtime.
If you had that, then you could match your A1 patterns against the A1 column, your A2 patterns against the A2 column, and so on. Then do something with the list of partially and fully matched regular expressions.
The bad news is that I don't know of any software out there that implements this.
The good news is that the strategy described in http://swtch.com/~rsc/regexp/regexp1.html should be able to implement this. In particular the State sets can be extended to have information about your current state in multiple patterns at the same time. This extended set of State sets will result in a more complex state diagram (because you're tracking more stuff), and a more complex return at the end (you're returning a set of State sets), but runtime won't be changed a bit, whether you're matching one pattern or 50.
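Short of building such a multi-pattern matcher, the per-column idea can be approximated with ordinary precompiled regexes. The sketch below is a simple baseline under stated assumptions, not the State-set approach from the linked article: it compiles each distinct pattern once, tries a column's value only against that column's patterns, and counts hits per rule. The names RULES, build_index and classify are hypothetical, the rule data is copied from the question, and the dots are escaped on the assumption that a literal "." was intended.

```python
import re

# Rules from the question, keyed by rule ID, then by column name.
RULES = {
    1001: {"A1": r"[0-9]{3}\.45", "A2": r"A55[0-8]{2}"},
    2002: {"A1": r"12[0-9]\.55", "A2": r"[X-Z][0-9]{4}"},
    3055: {"A1": r"[0-9]{3}\.45", "A2": r"[X-Z][0-9]{4}"},
}

def build_index(rules):
    """Group patterns by column, compiling each distinct pattern once."""
    index = {}
    for rule_id, columns in rules.items():
        for column, pattern in columns.items():
            by_pattern = index.setdefault(column, {})
            entry = by_pattern.setdefault(pattern, (re.compile(pattern), []))
            entry[1].append(rule_id)
    return index

def classify(row, rules, index):
    """Map rule ID to 'complete' or 'partial' for one data row."""
    hits = {rule_id: 0 for rule_id in rules}
    for column, value in row.items():
        # Only this column's patterns are tried against this column's value.
        for regex, rule_ids in index.get(column, {}).values():
            if regex.fullmatch(value):
                for rule_id in rule_ids:
                    hits[rule_id] += 1
    result = {}
    for rule_id, count in hits.items():
        if count == len(rules[rule_id]):
            result[rule_id] = "complete"
        elif count > 0:
            result[rule_id] = "partial"
    return result

if __name__ == "__main__":
    index = build_index(RULES)  # done once, since the rules hardly change
    print(classify({"A1": "123.45", "A2": "A5567"}, RULES, index))
```

Sharing one compiled regex between rules that use the same pattern keeps the work per value proportional to the number of distinct patterns for that column rather than the number of rules.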