Using ^ (caret) inside states in lex/flex - regex

I'll post my lex code first (lex body only).
%%
ps {BEGIN STATE1;}
. ;
<STATE1>^[0-9] { printf("number after ps"); }
With this code I'm trying to match a number right after the letters "ps". That's why I used the ^ character.
But the code doesn't match any of the strings it should, such as ps3, ps4fd, ps554, etc.
When I removed the ^ it worked, but it also matched strings like pserd7, psfh45, psfhdjh4er, etc.
I know that I can solve the problem without using states (ps[0-9].*), but I have to do this with states. How can I fix this? Thanks.

With this code I'm trying to match a number right after the letters "ps". That's why I used the ^ character.
But ^ doesn't mean that. It means 'beginning of line'.
I know that I can solve the problem without using states (ps[0-9].*). But I have to do this with states.
Why? Very strange requirement.
You need to add more rules to cover the other possibilities. For example:
<STATE1>. { BEGIN INITIAL; }
But this depends on what else, if anything, is legal after 'ps'.
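As a rough illustration of the two-state idea, here is a small Python model of the scanner logic. It is not flex, and the function name numbers_after_ps is made up for this sketch. It assumes that a run of digits should count as one number, and that a non-digit after "ps" simply sends the scanner back to the initial state without being consumed.

```python
import re

DIGITS = re.compile(r"[0-9]+")

def numbers_after_ps(text):
    """Return the numbers that appear immediately after 'ps' in text."""
    hits = []
    state = "INITIAL"              # models the default start condition
    i = 0
    while i < len(text):
        if state == "INITIAL":
            if text.startswith("ps", i):
                state = "STATE1"   # models: ps { BEGIN STATE1; }
                i += 2
            else:
                i += 1             # models: . ;
        else:
            m = DIGITS.match(text, i)
            if m:
                hits.append(m.group())   # digits directly after "ps"
                i = m.end()
            # Either way, leave STATE1. A non-digit is not consumed here,
            # so the INITIAL branch rescans it on the next iteration.
            state = "INITIAL"
    return hits

print(numbers_after_ps("ps3"))      # ['3']
print(numbers_after_ps("pserd7"))   # []
```

In flex itself, when two rules match the same amount of text the rule listed first wins, so if STATE1 is an inclusive (%s) start condition, the <STATE1>. rule would need to come before the unqualified . ; rule to take effect.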

Regex-Match while ignoring a char from Searchword

I am using an engineering program that lets me write formulas to filter specific lines out of a database. I am looking for lines in the database that contain, e.g., "concrete" as a property.
In the formula I can use regular expressions.
The regex I have been using so far looks like this:
".*(concrete).*";
So if the line in the database contains concrete, I get the desired result.
Now the problem: I would like to replace the word concrete with a variable, so that it looks like this:
".*(#VARIABLE1).*";
(The # syntax works in the program, by the way.)
The problem is that if I set the variable to concrete, the program automatically substitutes 'concrete' for it. The word concrete can then no longer be found, because the search term now has a ' character at the beginning and at the end.
Is there a way to ignore those two characters using the right regex?
What I want is the following:
If a line in the database contains "25cm concrete in Grey"
I should get a match from the regex.
With the search term ".*(concrete).*"; it works; with the variable ".*(#VARIABLE1).*"; it doesn't.
EDIT:
The whole "formula" in the program looks like this:
if(Match(QTO(Typ:="Attribut{FloorsLayer_02_MaterialName}");".*(#V_QUALITY).*" ;"regex") ;QTO(Typ:="Attribut{Fläche}");0)
I want the if-condition to be true when the match inside is true.
The QTO function is just the program's syntax for passing a certain attribute to the Match function; the middle part is my problem. I don't really know the programming language, I'm new to this. Hope it helps!
That's more of a hack than a real solution, and I'm not sure it even works:
If you use the regex
.*(#VARIABLE1)?).*
and set the variable to the string ?concrete(
this will result in a regex that looks like this:
.*('?concrete(')?).*
which makes the additional characters optional.
This relies on the following assumption:
the string (#VARIABLE1) gets replaced by ('<content of VARIABLE1>')
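The substitution can be checked outside the program with a short Python sketch. It assumes, as above, that the program wraps the variable's content in single quotes, and that the match has to cover the whole line; the helper name build_pattern is made up for illustration.

```python
import re

def build_pattern(template, value):
    # Assumption: the program replaces #VARIABLE1 with the value
    # wrapped in single quotes.
    return template.replace("#VARIABLE1", "'" + value + "'")

pattern = build_pattern(".*(#VARIABLE1)?).*", "?concrete(")
print(pattern)  # .*('?concrete(')?).*

# Both quote characters are now optional, so the plain word matches.
print(bool(re.fullmatch(pattern, "25cm concrete in Grey")))  # True
print(bool(re.fullmatch(pattern, "25cm steel in Grey")))     # False
```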

RegEx for square-bracket strings but not vector indices: is it possible?

I'm using Harbour in Sublime Text 3.
How can I create a regex for square-bracket strings like the ones below:
a:= [text] // same as a:= "text"
b:= [3] // same as b:= "3"
c:= {2,[text]} // same as c:= {2,"text"}
d:=[text] // same as d:="text"
Function([text]) // Same as Function("text")
but does not include vector indices, like:
aVet[index] // Same as aVet[1], aVet[2]...
e:= aVet[index] // Same as aVet[1], aVet[2]...
f:= aVet[2,3] // Same as aVet[1,2], aVet[2,5]...
g:= aVet[CONSTANT] // Same as aVet[FOO], aVet[BAR]...
This should work for you:
[^a-zA-Z0-9\s]\s*(\[.*?\])
Regex101
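A quick way to try this pattern is a few lines of Python (used here only as a test harness; the regex is the one above, and the helper name bracket_string is made up). Note that the overall match includes the symbol before the bracket, while capture group 1 holds only the bracketed text.

```python
import re

# Requires a character that is not a letter, digit or whitespace,
# then optional whitespace, then the bracketed text (group 1).
PRECEDED_BY_SYMBOL = re.compile(r"[^a-zA-Z0-9\s]\s*(\[.*?\])")

def bracket_string(line):
    """Return the first bracketed string on the line, or None."""
    m = PRECEDED_BY_SYMBOL.search(line)
    return m.group(1) if m else None

print(bracket_string("d:=[text]"))        # [text]
print(bracket_string("c:= {2,[text]}"))   # [text]
print(bracket_string("e:= aVet[index]"))  # None
```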
vet\[.*?\]|(\[.*?\])
This assumes that vector indices always start with vet. You should tag the question with the language you're using to clear up this confusion. Anyway, the code above should do the trick. Follow the link for a detailed breakdown of what's happening behind the curtain.
The code above might not be that intuitive, but the basic idea is this: the engine looks for text that has the word vet followed by square brackets. If it finds that, it matches it without capturing anything. Otherwise it captures the bracketed text with the right-hand side of the alternation (which is what we want). The only issue is that if your code has comments with square brackets in them, it might capture those too. If you plan to write such comments, the regex needs to be modified to handle more conditions, but it will work as long as you don't. Let me know if that is not the case.
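The "match what you don't want, capture what you do" idea can be sketched in Python as follows. The IGNORECASE flag is an assumption added here so that vet also matches the aVet spelling used in the question, and the helper name bracket_strings is made up for illustration.

```python
import re

# Left alternative: swallow vector indices without capturing them.
# Right alternative: capture any other bracketed text in group 1.
SKIP_OR_CAPTURE = re.compile(r"vet\[.*?\]|(\[.*?\])", re.IGNORECASE)

def bracket_strings(line):
    """Return only the bracketed texts that were captured in group 1."""
    return [m.group(1) for m in SKIP_OR_CAPTURE.finditer(line)
            if m.group(1) is not None]

print(bracket_strings("a:= [text]"))          # ['[text]']
print(bracket_strings("g:= aVet[CONSTANT]"))  # []
```

The caller has to check group 1 rather than the overall match, because the vector indices still match; they just leave the group empty.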
First things first: I'm using Harbour in Sublime Text 3.
iismathwizard, your code almost does the magic.
[^a-zA-Z0-9\s]\s*(\[.*?\])
It always includes the first character before the opening square bracket, like:
"=" in the a, b and d examples
"," in the c example
"(" in the function example
However, it doesn't match any vector index.
user41235, your code does not exclude the vector indices.
vet\[.*?\]|(\[.*?\])
I added a few examples for more detail.
Sorry for my English...

Using Regex to find function containing a specific method or variable

This is my first post on stackoverflow, so please be gentle with me...
I am still learning regex, mostly because I have finally discovered how useful it can be, in part through using Sublime Text 2. So this is Perl regex (I believe).
I have searched this and other sites but I am now genuinely stuck. Maybe I am trying to do something that can't be done.
I would like a regex pattern that finds the function, method or procedure that contains a given variable or method call.
I have tried a number of expressions and they get part of the way but not all the way. In particular, when searching in JavaScript I pick up multiple function declarations instead of the one nearest to the call or variable that I am looking for.
For example:
I am looking for the function that calls the method savedata().
I have learnt from this excellent site that I can use (?s) to make . match newlines as well:
function.*(?=(?s).*?savedata\(\))
However, that finds the first instance of the word function and then all the text up to and including savedata().
If there are multiple procedures, it then starts at the next function and repeats until it reaches savedata() again.
function(?s).*?savedata\(\) does something similar
I have tried asking it to ignore the second function (I believe) by using something like:
function(?s).*?(?:(?!function).*?)*savedata\(\)
But that doesn't work.
I have done some investigation with lookaheads and lookbehinds, but either I am doing it wrong (highly possible) or they are not the right tool.
In summary (I guess): how do I go backwards from a given word to the nearest occurrence of a different word?
At the moment I am using this to search through some JavaScript files to try to understand the structure, calls, etc., but ultimately I am hoping to use it on C# files and some VB.NET files.
Many thanks in advance.
Thanks for the swift responses, and sorry for not adding an example block of code, which I will do now (modified but still sufficient to show the issue).
If I have a simple block of JavaScript like the following:
function a_CellClickHandler(gridName, cellId, button){
    var stuffhappenshere;
    var and here;
    if(something or other){
        if (anothertest) {
            event.returnValue=false;
            event.cancelBubble=true;
            return true;
        }
        else{
            event.returnValue=false;
            event.cancelBubble=true;
            return true;
        }
    }
}
function a_DblClickHandler(gridName, cellId){
    var userRow = rowfromsomewhere;
    var userCell = cellfromsomewhereelse;
    //this will need to save the local data before allowing any inserts to ensure that they are inserted in the correct place
    if (checkforarangeofthings){
        if (differenttest) {
            InsSeqNum = insertnumbervalue;
            InsRowID = arow.getValue()
            blnWasInsert = true;
            blnWasDoubleClick = true;
            SaveData();
        }
    }
}
When I run the regexes against this in Sublime Text 2, including the second one, which was identified as one that should work, everything from the first function through to SaveData() is selected.
I would like to get just a_DblClickHandler in this case, not both.
Hopefully this code snippet adds some clarity. Sorry for not posting it originally; I had hoped a standard code file would suffice.
This regex will find every JavaScript function that contains a call to the SaveData method:
(?<=[\r\n])([\t ]*+)function[^\r\n]*+[\r\n]++(?:(?!\1\})[^\r\n]*+[\r\n]++)*?[^\r\n]*?\bSaveData\(\)
It will match all the lines in the function up to, and including, the first line containing the SaveData method.
Caveats:
The source code must have well-formed indentation for this to work, as the regex uses matching indentation to detect the end of a function.
It will not match a function that starts on the first line of the file.
Explanation:
(?<=[\r\n]) Start at the beginning of a line
([\t ]*+) Capture the indentation of that line in Capture Group 1
function[^\r\n]*+[\r\n]++ Match the rest of the declaration line of the function
(?:(?!\1\})[^\r\n]*+[\r\n]++)*? Match more lines (lazily) which are not the last line of the function, until:
[^\r\n]*?\bSaveData\(\) Match the first line of the function containing the SaveData method call
Note: The *+ and ++ are possessive quantifiers, only used to speed up execution.
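For experimenting outside the editor, the same pattern can be tried in Python. This is only a sketch: the possessive quantifiers are replaced by ordinary greedy ones (Python's re module only accepts *+ and ++ from version 3.11 on, and they are described above as a speed optimisation), and the sample source and the helper name enclosing_function_text are hypothetical.

```python
import re

# Same structure as the regex above, with *+ and ++ written as * and +.
FUNCTION_WITH_SAVEDATA = re.compile(
    r"(?<=[\r\n])([\t ]*)function[^\r\n]*[\r\n]+"  # declaration line
    r"(?:(?!\1\})[^\r\n]*[\r\n]+)*?"                # body lines, lazily
    r"[^\r\n]*?\bSaveData\(\)"                      # line with the call
)

# Hypothetical sample; the first line keeps the first function off
# line 1, because of the caveat about the start of the file.
SAMPLE = """// sample file
function first() {
    var x = 1;
    if (x) {
        x = 2;
    }
}
function second() {
    if (ready) {
        SaveData();
    }
}
"""

def enclosing_function_text(source):
    """Return the text from 'function' through 'SaveData()', or None."""
    m = FUNCTION_WITH_SAVEDATA.search(source)
    return m.group(0) if m else None

print(enclosing_function_text(SAMPLE))
```

On this sample the match starts at function second, because the closing brace of first() sits at the same indentation as its declaration and stops the scan there.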
EDIT:
Fixed two minor problems with the regex.
EDIT:
Fixed another minor problem with the regex.

How to create a regex to check whether a set of words exists in a given string?

How can I write a regex to check whether a set of words exists in a given string?
For example, I would like to check whether a domain name contains "yahoo.com" at the end of it.
'answers.yahoo.com' would be valid.
'yahoo.com.answers' would be wrong. 'yahoo.com' must come at the end.
I got a hint from somewhere that it might be something like this:
"/^[^yahoo.com]$/"
But I am totally new to regex. So please help with this one, then I can learn further.
When asking regex questions, always specify the language or application, too!
From your history it looks like JavaScript / jQuery is most likely.
Anyway, to test that a string ends in "yahoo.com" use /.*yahoo\.com$/i
In JS code:
if (/.*yahoo\.com$/i.test(YOUR_STR)) {
    //-- It's good.
}
To test whether a set of words has at least one match, use:
/word_one|word_two|word_three/
To limit matches to just the most-common, legal sub-domains, ending with "yahoo.com", use:
/^(\w+\.)+yahoo\.com$/
(As a crude, first pass)
For other permutations, please clarify the question.
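Since the language was never confirmed, here is the same pair of checks as a Python sketch, using the two patterns from above unchanged. The function names are made up for illustration.

```python
import re

ENDS_WITH_YAHOO = re.compile(r".*yahoo\.com$", re.IGNORECASE)
YAHOO_SUBDOMAIN = re.compile(r"^(\w+\.)+yahoo\.com$")

def ends_with_yahoo(domain):
    """True if the string ends in 'yahoo.com' (case-insensitive)."""
    return ENDS_WITH_YAHOO.match(domain) is not None

def is_yahoo_subdomain(domain):
    """True only for one or more labels followed by '.yahoo.com'."""
    return YAHOO_SUBDOMAIN.match(domain) is not None

print(ends_with_yahoo("answers.yahoo.com"))  # True
print(ends_with_yahoo("yahoo.com.answers"))  # False
print(ends_with_yahoo("notyahoo.com"))       # True
print(is_yahoo_subdomain("notyahoo.com"))    # False
```

The third and fourth lines show why the stricter pattern can be preferable: the plain "ends with" test also accepts any domain whose last label merely ends in yahoo.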

Regex help: Matching paths (using django)

Hate coming up with titles. I need something that'll actually capture the following:
site.com/500/ (a number as the first param)
site.com/500/ABC/ (a number and a 3 letter code)
site.com/500/ABC/DEF/ (a number and 2x 3 letter codes)
What I have been messing with:
^(\d+/)?(\w{3}/)?(\w{3}/)?$
That sort of works but includes the slashes in the arguments (so I end up with "500/"). Moving the slashes outside of the brackets won't match /500/ABC/ since the ? only works on the slash.
Obviously I can make it in multiple ones but I'm sure there's a way to do it in one go.
As well, I only want the actual arguments, since as I said it can work but ends up adding slashes to them, which isn't too good.
Thanks for any help.
How about:
((\d+/)|(\d+/\w{3}/)|(\d+/\w{3}/\w{3}/))$
The result will be:
site.com/500/ABC/DEF/ => 500/ABC/DEF/
site.com/500/ABC/ => 500/ABC/
site.com/500/ => 500/
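A different approach, not the one above, keeps the slashes out of the captured arguments: put each slash outside its capture group and nest the optional parts in non-capturing groups. The sketch below tests this with Python's re module on its own rather than inside Django. It assumes the pattern is applied to the path after the domain and leading slash, and that the number is required; the helper name parse_path is made up.

```python
import re

# (?: ... )? makes a segment optional without creating a capture group,
# so only the number and the 3-character codes are captured.
PATH = re.compile(r"^(\d+)/(?:(\w{3})/(?:(\w{3})/)?)?$")

def parse_path(path):
    """Return (number, code1, code2) with None for missing codes."""
    m = PATH.match(path)
    return m.groups() if m else None

print(parse_path("500/"))          # ('500', None, None)
print(parse_path("500/ABC/"))      # ('500', 'ABC', None)
print(parse_path("500/ABC/DEF/"))  # ('500', 'ABC', 'DEF')
```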