Sublime Text 2 TM_FILEPATH regex snippet - regex

I'm trying to make a CodeIgniter CRUD snippet for Sublime Text 2, and I can't figure out how to write a regex snippet that returns a specific part of the TM_FILEPATH variable.
I found this one in one of the CodeIgniter snippets:
${TM_FILEPATH/.+((?:application).+)/$1/:application/controllers/${1/(.+)/\l$1.php/}}
If the file location is for example:
/D/Web/MyApp/application/controllers/admin/user.php
This snippet will return:
application/controllers/admin/user.php
What I need is only the part after "controllers" and without extension, in this example:
admin/user
PS: The path after controllers can contain any number of directories: it can be user, or admin/something/user.

${TM_FILEPATH/.+(?:controllers\/)(.+)\.\w+/\l$1/}
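The regular expression in this answer can be checked outside Sublime Text. The following is a minimal Python sketch, not Sublime's own substitution engine: it applies the same pattern with `re.sub` and emulates the `\l` modifier by lowercasing the first captured character. The function name `controller_path` is made up for illustration.

```python
import re

# Same pattern as the snippet: skip everything up to "controllers/",
# capture the rest of the path, and leave the trailing ".ext" outside the group.
PATTERN = re.compile(r".+(?:controllers/)(.+)\.\w+")


def controller_path(filepath: str) -> str:
    """Return the part after 'controllers/' without the file extension."""

    def lower_first(match: re.Match) -> str:
        captured = match.group(1)
        # Emulates \l in the snippet: lowercase only the first character.
        return captured[:1].lower() + captured[1:]

    # If the pattern does not match, the input is returned unchanged.
    return PATTERN.sub(lower_first, filepath, count=1)


if __name__ == "__main__":
    print(controller_path("/D/Web/MyApp/application/controllers/admin/user.php"))
```

Because `(.+)` is greedy, nested directories such as admin/something/user are captured whole, and only the final extension is dropped.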

Related

Adapting Regular Expression in Django URL to match filepath

So I am currently working on a web application that takes as input the location of a malware file for one of the functions.
This is passed via the views file. However, after some changes to the models section of the application, I found it was unable to parse the full filepath.
The code below works for the following pcap as input:
8cdddcd3-35fa-468d-8647-816518a9836a435be1c6e904836ad65f97f3eac4cbe19ee7ba0da48178fc7f00206270469165.pcap
url(r'^analyse/(?P<pcap>[\w\-]+\.pcap)$', views.analyse, name='analyse'),
However, this code no longer works when the pcap value contains the full filepath:
/home/freddie/malwarepcaps/8cdddcd3-35fa-468d-8647-816518a9836a435be1c6e904836ad65f97f3eac4cbe19ee7ba0da48178fc7f00206270469165.pcap
Any suggestions or pointers on how to alter the regular expression to accommodate the full filepath in the string passed to the route would be very much appreciated.
Regex: ((/\w+?)+/)?([\w-]+\.pcap)
Django regex: ^analyse(?P<pcap>((/\w+?)+/)?([\w-]+\.pcap))$
Note that there is no slash after analyse, because the leading slash is now part of pcap.
So analyse/home/freddie/malwarepcaps/foo-bar.pcap should match this pattern, and pcap will be equal to /home/freddie/malwarepcaps/foo-bar.pcap.
Test:
https://pythex.org/?regex=((%2F%5Cw%2B%3F)%2B%2F)%3F(%5B%5Cw-%5D%2B%5C.pcap)&test_string=8cdddcd3-35fa-468d-8647-816518a9836a435be1c6e904836ad65f97f3eac4cbe19ee7ba0da48178fc7f00206270469165.pcap%20%0A%2Fhome%2Ffreddie%2Fmalwarepcaps%2F8cdddcd3-35fa-468d-8647-816518a9836a435be1c6e904836ad65f97f3eac4cbe19ee7ba0da48178fc7f00206270469165.pcap&ignorecase=0&multiline=0&dotall=0&verbose=0
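The same check can be run locally with Python's `re` module. This is only a sketch of the pattern's behaviour, not Django's URL resolver; the helper name `extract_pcap` is hypothetical.

```python
import re

# The route pattern from the answer, compiled as a plain regex.
ROUTE = re.compile(r"^analyse(?P<pcap>((/\w+?)+/)?([\w-]+\.pcap))$")


def extract_pcap(url_path: str):
    """Return the value of the 'pcap' group, or None if the route does not match."""
    match = ROUTE.match(url_path)
    return match.group("pcap") if match else None


if __name__ == "__main__":
    print(extract_pcap("analyse/home/freddie/malwarepcaps/foo-bar.pcap"))
    print(extract_pcap("analysefoo-bar.pcap"))
    print(extract_pcap("analyse/foo-bar.pcap"))
```

Two things are worth noticing when trying it: a bare filename only matches when it follows analyse directly with no slash, and directory names may only contain `\w` characters, so a directory containing a hyphen would not match.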
PS: I think it's better to move such a parameter (the path, e.g. /home/f/m/f.pcap) into the query string (for a GET request) or into the HTTP body (for a POST request), so it is easier to obtain the parameter without URL matching.
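As a rough illustration of the query-string idea, the sketch below uses only the standard library to encode the path into a URL and read it back. The parameter name `pcap`, the `/analyse/` prefix, and both helper names are assumptions for the example; in a Django view the value would typically be read with `request.GET.get("pcap")` instead.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_analyse_url(pcap_path: str) -> str:
    """Put the file path into the query string instead of the URL path."""
    return "/analyse/?" + urlencode({"pcap": pcap_path})


def read_pcap_param(url: str):
    """Read the 'pcap' parameter back; returns None when it is absent."""
    values = parse_qs(urlsplit(url).query).get("pcap", [])
    return values[0] if values else None


if __name__ == "__main__":
    url = build_analyse_url("/home/freddie/malwarepcaps/foo-bar.pcap")
    print(url)
    print(read_pcap_param(url))
```

With this layout the route itself can stay a fixed string, and slashes in the path are percent-encoded rather than interpreted as URL segments.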

AppleScript to extract the Digital Object Identifier (DOI) from a PDF file

I looked for an AppleScript to extract the DOI from a PDF file, but could not find one. There is enough information available on the actual format of the DOI (i.e. the regular expression), but how could I use this to get the identifier from the PDF file?
(It would be no problem if some external program were used, such as Hazel.)
If you're OK with using an app, I'd recommend Skim, which has good AppleScript support. I'd probably structure it like this (especially if the document might be large):
set DOIFound to false
tell application "Skim"
    set pp to pages of document 1
    repeat with p in pp
        set t to text of p
        -- look for DOI and set DOIFound to true
        if DOIFound then exit repeat -- if it's not found then use url?
    end repeat
end tell
I'm assuming a DOI would always sit on a single page (not split across two). It looks like they are invariably (?) on the first page of an article, which would make this quick, even with a large document.
[edit]
Another way would be to get the Xpdf OSX binaries from http://www.foolabs.com/xpdf/download.html and use pdftotext in the command line (just tested this; it works well) and parse the text using AppleScript. If you want to stay in AppleScript, you can do something like:
do shell script "path/to/pdftotext 'path/to/pdf/file.pdf'"
which would output a file in the same directory with a txt file extension, which you can then parse for the DOI.
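For the parsing step, here is a minimal sketch in Python rather than AppleScript. It assumes `pdftotext` is on the PATH and uses the commonly cited Crossref-style DOI pattern, which covers most modern DOIs but not every valid one. The function names are hypothetical.

```python
import re
import subprocess

# Crossref-style pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def find_doi(text: str):
    """Return the first DOI-like string in text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Sentence punctuation right after the DOI is matched by the pattern; trim it.
    return match.group(0).rstrip(".,;")


def doi_from_pdf(pdf_path: str, pdftotext: str = "pdftotext"):
    """Run pdftotext on pdf_path ("-" sends the text to stdout) and search it."""
    result = subprocess.run(
        [pdftotext, pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_doi(result.stdout)


if __name__ == "__main__":
    print(find_doi("J. Example 12 (2010). doi:10.1000/xyz123. Received May 2010"))
```

From AppleScript, a script like this could be invoked with `do shell script`, in the same way as the pdftotext call above.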
Have you tried pdfgrep? It works really well on the command line:
pdfgrep -r -n --max-count 1 --include "*.pdf" "DOI"
I have no idea how to build an AppleScript for this, but I would be interested in one too, so that when I drop a PDF into a folder it automatically extracts the DOI and renames the file to include the DOI.

Export a specific line in Notepad++

I have a large XHTML file that contains a lot of code, see the below example:
<a:CreationDate>0</a:CreationDate>
<a:Creator/>
<a:ModificationDate>0</a:ModificationDate>
<a:Modifier/>
<a:name>stack</a:name>
<a:CreationDate>0</a:CreationDate>
<a:Creator/>
<a:ModificationDate>0</a:ModificationDate>
<a:Modifier/>
<a:name>user</a:name>
How can I export or select specific lines? In the example, I want this result:
<a:name>stack</a:name>
<a:name>user</a:name>
and the rest of the code should be ignored.
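OK, I found the result I wanted. This regex matches every line that does not contain an <a:name> element, so those lines can be removed: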
^((?!<a:name>.*</a:name>).)*$
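To see what this pattern selects, the sketch below applies it line by line in Python and keeps only the lines it does not match. This is an illustration of the regex, not of Notepad++'s search dialog; the function name is made up.

```python
import re

# Matches a whole line only if no position in it starts "<a:name>...</a:name>".
NOT_NAME_LINE = re.compile(r"^((?!<a:name>.*</a:name>).)*$")


def keep_name_lines(text: str) -> str:
    """Drop every line the pattern matches, leaving only the <a:name> lines."""
    kept = [line for line in text.splitlines() if not NOT_NAME_LINE.match(line)]
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "\n".join([
        "<a:CreationDate>0</a:CreationDate>",
        "<a:Creator/>",
        "<a:name>stack</a:name>",
        "<a:Modifier/>",
        "<a:name>user</a:name>",
    ])
    print(keep_name_lines(sample))
```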
Since this seems to be a kind of XML document, if you want to find a line such as
<a:CreationDate>0</a:CreationDate>
or
<a:name>user</a:name>
you can search for the closing tag, like </a:name> or </a:CreationDate>,
or you can use a scripting language like PHP or JavaScript to select the line.

Regex for converting file path to package/namespace

Given the following file path:
/Users/Lawrence/MyProject/some/very/interesting/Code.scala
I would like to generate the following using a single regex replace (the root can be a constant):
some.very.interesting
This is for the purpose of generating a snippet for Sublime Text which can automatically insert the correct package/namespace header for my scala/java classes :)
Sublime Text uses the following syntax for their regex replace patterns (aka 'substitutions'):
{input/regex/replace/flags}
Hence why an iterative approach cannot be taken - it has to be done in one pass! Also, substitutions cannot be nested :(
If you know the maximum number of nested folders, you can specify that in your regex.
For 1 to 3 nested folders:
Regex: /Users/Lawrence/MyProject/(\w+)/?(\w+)?/?(\w+)?/[^/]+$
Replace: $1.$2.$3
For 1 to 5 nested folders:
Regex: /Users/Lawrence/MyProject/(\w+)/?(\w+)?/?(\w+)?/?(\w+)?/?(\w+)?/[^/]+$
Replace: $1.$2.$3.$4.$5
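The fixed-depth pattern can be tried out in Python. The sketch below uses the 1-to-3 folder regex and, unlike a plain `$1.$2.$3` replacement, joins only the groups that actually matched; with fewer than three folders the unmatched groups are empty, so a literal replacement would presumably leave trailing dots. The function name is hypothetical.

```python
import re

# The "1 to 3 nested folders" pattern from the answer.
PACKAGE_RE = re.compile(r"/Users/Lawrence/MyProject/(\w+)/?(\w+)?/?(\w+)?/[^/]+$")


def package_from_path(path: str):
    """Return the dotted package name, or None if the path does not match."""
    match = PACKAGE_RE.search(path)
    if match is None:
        return None
    # Optional groups that did not participate in the match are None; skip them.
    return ".".join(group for group in match.groups() if group)


if __name__ == "__main__":
    print(package_from_path("/Users/Lawrence/MyProject/some/very/interesting/Code.scala"))
    print(package_from_path("/Users/Lawrence/MyProject/some/very/Code.scala"))
```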
Given the constraints, this is the only thing you can do.
Input
/Users/Lawrence/MyProject/some/very/interesting/Code.scala
Regex
^/Users/Lawrence/MyProject/([^/]+)/([^/]+)/([^/]+)/Code\.scala$
or
^/[^/]+/[^/]+/[^/]+/([^/]+)/([^/]+)/([^/]+)/[^/]+$
Replace
\1.\2.\3
Update
This gets you closer, but not exactly it:
Regex
(^/Users/Lawrence/MyProject/|/Code\.scala$|/)
Replacement
.
Output would be:
.some.very.interesting.
Without multiple replacements in a single line and without recursive back references it's going to be hard.
You might have to do a second replacement, replacing something like this with an empty string (if you can):
(^\.|\.$)
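The two-pass idea from the update can be verified with a short Python sketch. It runs the two substitutions one after the other, which is exactly what a single Sublime Text substitution cannot do; the function name is made up for illustration.

```python
import re

# First pass: turn the root prefix, the file name, and every "/" into a dot.
TO_DOTS = re.compile(r"(^/Users/Lawrence/MyProject/|/Code\.scala$|/)")
# Second pass: strip the leading and trailing dots left by the first pass.
EDGE_DOTS = re.compile(r"(^\.|\.$)")


def package_two_pass(path: str) -> str:
    dotted = TO_DOTS.sub(".", path)   # ".some.very.interesting."
    return EDGE_DOTS.sub("", dotted)  # "some.very.interesting"


if __name__ == "__main__":
    print(package_two_pass("/Users/Lawrence/MyProject/some/very/interesting/Code.scala"))
```

Unlike the fixed-depth patterns, this version works for any number of nested folders, at the cost of needing the second replacement.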

What is the mappings.ts file and how should it be set up in Tritium?

I'm using the Moovweb SDK and using Tritium. I want my mobile site to behave like my desktop site. I have different URLs pointing to my homepage. Should I use regex? A common element? And what's the best syntax for matching the path?
The mappings.ts file in the scripts directory is where particular pages are matched. The file is imported in html.ts and allows us to say "when a certain page is matched, make the following transformations."
Most projects already have a mappings file generated. A simple layout looks like this:
match($path) {
  with(/home/) {
    log("--> Importing pages/home.ts in mappings.ts")
    #import pages/home.ts
  }
}
Every time you start working on a new page, you need to set up a new "map".
First: Match with a unique path
The Tritium above matches the path for the homepage. The path is the bit of a URL after the domain. For example, in www.example.com/search/item, "www.example.com" is the domain and "search/item" is the path.
The /home/ matcher specifies the "home" part with a regular expression. You could also use a plain string if necessary:
with("home")
If Tritium matches the path with the matcher, it will import the home page.
The homepage of a site often doesn't contain the word home in its URL, though. Most homepages are just the domain with nothing but / as the path. A better string matcher could be:
match($path) {
  with ("/")
}
Or, using regex:
with(/index|^\/$/) {
As you can see, the with() function of the mappings file is where knowledge of regex can really come in handy. Check out our short guide on regex. Sometimes the matcher will be simpler, such as (/search/).
Remember to match on the most unique aspect of the URL possible. If two with() functions match the same URL, then the one that appears first in the mappings file will be used. If you cannot find a unique URL matcher for different page types, you may have to match via other means.
Why Use Regex?
It might seem easier to use a string rather than a regex matcher. However, regex provides a lot more flexibility over which URLs are matched.
For example, a site could use a string of numbers in its product page URLs. Using a normal string matcher would not be practical: you'd have to list out all the numbers possible for all the items on the site. An easier way would be to use regex to say, "If there's a string of 5 digits, continue!" (The code for matching 5 digits: /\d{5}/.)
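The matching rules described so far (regex matchers, first match wins) can be illustrated outside Tritium. The sketch below is plain Python, not Tritium, and only mimics the behaviour described above; the page files pages/product.ts and pages/search.ts and the function name are hypothetical.

```python
import re

# Ordered like the with() blocks in a mappings file: the first match wins.
MAPPINGS = [
    (re.compile(r"index|^/$"), "pages/home.ts"),
    (re.compile(r"\d{5}"), "pages/product.ts"),  # hypothetical product page
    (re.compile(r"search"), "pages/search.ts"),  # hypothetical search page
]


def page_for(path: str):
    """Return the page file for the first matcher that fits, or None."""
    for pattern, page in MAPPINGS:
        if pattern.search(path):
            return page
    return None


if __name__ == "__main__":
    for path in ["/", "/products/12345", "/search/item", "/search/12345", "/about"]:
        print(path, "->", page_for(path))
```

The path /search/12345 matches both the 5-digit matcher and the search matcher, and resolves to the product page because that matcher is listed first, which is why the most unique part of the URL should be chosen for each matcher.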
Second: Log the match
When matching a particular path, you should also use log() statements so you know exactly what's getting imported. The log statement will be printed in the command line window, so you can see if your regular expression accurately matches your path.
match($path) {
  with(/index|^\/$/) {
    log("--> importing pages/home.ts in mappings.ts")
  }
}
Third: Import the file
Finally, use the #import function to include the page-specific Tritium file.
match($path) {
  with(/index|^\/$/) {
    log("--> importing pages/home.ts in mappings.ts")
    #import pages/home.ts
  }
}