Export a specific line in Notepad++ - regex

I have a large XHTML file that contains a lot of code; see the example below:
<a:CreationDate>0</a:CreationDate>
<a:Creator/>
<a:ModificationDate>0</a:ModificationDate>
<a:Modifier/>
<a:name>stack</a:name>
<a:CreationDate>0</a:CreationDate>
<a:Creator/>
<a:ModificationDate>0</a:ModificationDate>
<a:Modifier/>
<a:name>user</a:name>
How can I export or select specific lines? In the example above, I want this result:
<a:name>stack</a:name>
<a:name>user</a:name>
and the rest of the code should be ignored.

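OK, I found the result I wanted. The following regex matches every line that does not contain an <a:name> element, so those lines can then be cleared, for example by replacing each match with nothing: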
^((?!<a:name>.*</a:name>).)*$

This looks like some kind of XML document. If you want to find a line such as
<a:CreationDate>0</a:CreationDate>
or
<a:name>user</a:name>
you can search for its closing tag, such as </a:name> or </a:CreationDate>.
Alternatively, you can use a scripting language such as PHP or JavaScript to select the lines.
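As a rough illustration of the scripting route, here is a minimal Python sketch (Python is my choice here, not something the answer names). It keeps only the lines that contain an <a:name> element; the function name `select_name_lines` and the shortened sample text are made up for the example.

```python
import re

# Matches a line that holds a complete <a:name>...</a:name> element.
NAME_LINE = re.compile(r"<a:name>.*</a:name>")

def select_name_lines(text):
    """Return the lines of text that contain an <a:name> element."""
    return [line.strip() for line in text.splitlines() if NAME_LINE.search(line)]

sample = """<a:CreationDate>0</a:CreationDate>
<a:Creator/>
<a:name>stack</a:name>
<a:Modifier/>
<a:name>user</a:name>"""

# Print only the selected lines, one per line.
print("\n".join(select_name_lines(sample)))
```

This selects the wanted lines directly instead of matching the unwanted ones, which avoids the negative lookahead entirely.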

Change font style for RegEx globally in LaTeX

Although I know that LaTeX is a "semantic language", I'm trying to find a solution to the following problem.
My 60k-word document contains about 250 strings that I can find with a regex, e.g. something like [A-Z]{2}\d\d[A-Z]{2} for a string like SE25GH.
Is it possible to define a font style for this RegEx string globally in my document, so that I can change it easily without crawling through my document or typing a defined command like \makemytextbold{SE25GH}?
Thanks.

Opencart hide default text if no special offers available

My question is similar to OpenCart Hide special offers title if no special offers available.
I want to remove the default text, "There are no special offer products to list." that displays when there are no specials offered.
I do not know which file contains this text, so I cannot find and remove it.
I'd be grateful for some help with this. Thanks.
The text you are looking for is in the following language file:
catalog/language/english/product/special.php
You can either remove the text between the single quotes, so the $_['text_empty'] variable would be the following:
$_['text_empty'] = '';
... or as a better solution, you could just disable displaying it in your theme. Search for the following file:
catalog/view/theme/YOUR_THEME_NAME/template/product/special.tpl
... and then remove the following string from it:
<?php echo $text_empty; ?>
An even better way would be to use a vQmod script, but you mentioned in your question that you'd like to remove it from the file, so I will skip that part.

Applescript to extract the Digital Object Identifier (DOI) from a PDF file

I looked for an AppleScript to extract the DOI from a PDF file, but could not find one. There is enough information available on the actual format of the DOI (i.e. the regular expression), but how could I use this to get the identifier from the PDF file?
(It would be no problem if some external program were used, such as Hazel.)
If you're ok with using an app, I'd recommend Skim. Good AppleScript support. I'd probably structure it like this (especially if the document might be large):
set DOIFound to false
tell application "Skim"
	set pp to pages of document 1
	repeat with p in pp
		set t to text of p
		-- look for DOI and set DOIFound to true
		if DOIFound then exit repeat -- if it's not found then use url?
	end repeat
end tell
I'm assuming a DOI would always sit on one page (not split across two). It looks like they are invariably (?) on the first page of an article, which would make this quick of course, even with a large doc.
[edit]
Another way would be to get the Xpdf OSX binaries from http://www.foolabs.com/xpdf/download.html and use pdftotext in the command line (just tested this; it works well) and parse the text using AppleScript. If you want to stay in AppleScript, you can do something like:
do shell script "path/to/pdftotext 'path/to/pdf/file.pdf'"
This writes a file with a .txt extension to the same directory; you can then parse that file for the DOI.
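The parsing step could look roughly like the Python sketch below, shown in Python rather than AppleScript purely for illustration. The pattern is an assumption: it follows the commonly cited shape of modern DOIs (a `10.` prefix, a 4 to 9 digit registrant code, a slash, then a suffix) and will miss unusual or very old identifiers. The name `find_doi` and the sample text are hypothetical.

```python
import re

# Assumed DOI shape: "10.", 4-9 digits, "/", then a suffix of common characters.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")

def find_doi(text):
    """Return the first DOI-like string in text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop sentence punctuation that may directly follow the identifier.
    return match.group(0).rstrip(".,;")

page_text = "Journal of Tests\ndoi:10.1000/xyz123.\nAbstract"
print(find_doi(page_text))
```

To use it on the pdftotext output, read the generated .txt file and pass its contents to `find_doi`.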
Have you tried it with pdfgrep? It works really well on the command line:
pdfgrep -n --max-count 1 --include "*.pdf" "DOI"
I have no idea how to build an AppleScript for this, but I would be interested in one too, so that if I drop a PDF into a folder it automatically extracts the DOI and renames the file to include the DOI in the filename.

Using regex to eliminate chunks in a file (categorized events in iCal file)

I have one .ics file from which I would like to create individual new .ics files depending on the event categories (I can't get egroupware to export only events of one category, I want to create new calendars depending on category). My intended approach is to repeatedly eliminate all events but those of one category and then save the file using EditPad Lite 7 (Windows).
I am struggling to get the regular expression right. .+? is still too greedy and negating the string (e.g. to eliminate all but events from one category) doesn't work either.
Sample
BEGIN:VEVENT
DESCRIPTION:Event 2
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:Event 3
CATEGORIES:Sports
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:Event 4
END:VEVENT
The regex BEGIN:VEVENT.+?CATEGORIES:Sports.+?END:VEVENT should only match sports events but it catches everything from the first BEGIN to the first END following the category.
Edit: negating doesn't work either: BEGIN:VEVENT.+?((?!CATEGORIES:Sports).).+?END:VEVENT.
What am I missing? Any pointers are highly appreciated.
I guess newlines are removed or ignored, because your regex does not care about them.
I only have a correction to the match after CATEGORIES
BEGIN:VEVENT.+?CATEGORIES:Sports.*?END:VEVENT
                                 ^
                                 Zero or more
The first part of your regex looks good; maybe the regex engine in EditPad is not so good.
Try it with a different editor or a scripting language (such as Eclipse, Perl, Notepad++ or Notepad2).
You could split the input and then grep the matching Sports events
@sportevents = grep /Sports/, split /END:VEVENT/, $input;
map $_ .= "END:VEVENT", @sportevents;
This is Perl; maybe you can launch a script from EditPad to do it.
The second line just restores the END:VEVENT that was stripped during split.
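The same split-then-filter idea can be sketched in Python. Matching each event block on its own means a match can never run from one event into the next, which is the problem the single large regex runs into. The function names are made up for this example, and the sketch assumes the CATEGORIES line is not folded across several lines.

```python
import re

def has_category(block, category):
    """True if the event block lists category on a CATEGORIES line."""
    for line in block.splitlines():
        if line.startswith("CATEGORIES:"):
            names = line[len("CATEGORIES:"):].split(",")
            if category in (name.strip() for name in names):
                return True
    return False

def events_in_category(ics_text, category):
    """Return every complete VEVENT block that belongs to category."""
    # Lazy match with DOTALL: each block ends at its own END:VEVENT.
    blocks = re.findall(r"BEGIN:VEVENT.*?END:VEVENT", ics_text, flags=re.DOTALL)
    return [block for block in blocks if has_category(block, category)]

sample = """BEGIN:VEVENT
DESCRIPTION:Event 2
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:Event 3
CATEGORIES:Sports
END:VEVENT
BEGIN:VEVENT
DESCRIPTION:Event 4
END:VEVENT"""

for event in events_in_category(sample, "Sports"):
    print(event)
```

Unlike the Perl version, the blocks keep their END:VEVENT line, so nothing has to be restored afterwards.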
OK. Solved it. I found something here which can be used to split ics files. I tweaked it to use the category rather than the summary in the file name and then merged the individually generated files according to category. I added the usual ics header and footer to all files and, voilà, I had individual calendar files.

Sublime Text 2 TM_FILEPATH regex snippet

I'm trying to make a CodeIgniter CRUD snippet for Sublime Text 2, and I can't figure out how to write a regex snippet that returns a specific part of the TM_FILEPATH variable.
I found this one in one of the CodeIgniter snippets:
${TM_FILEPATH/.+((?:application).+)/$1/:application/controllers/${1/(.+)/\l$1.php/}}
If the file location is for example:
/D/Web/MyApp/application/controllers/admin/user.php
This snippet will return:
application/controllers/admin/user.php
What I need is only the part after "controllers" and without extension, in this example:
admin/user
PS: The path after controllers can have various number of directories, it can be user or also admin/something/user.
${TM_FILEPATH/.+(?:controllers\/)(.+)\.\w+/PATH\l$1/}
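To check the capture-group idea outside the editor, a similar pattern can be tried in Python. This is only a sketch: Sublime Text snippets use their own regex engine and substitution syntax, the function name `controller_route` is hypothetical, and the trailing `$` anchor is my addition.

```python
import re

# Everything after "controllers/" up to, but not including, the file extension.
CONTROLLER_PATH = re.compile(r".+controllers/(.+)\.\w+$")

def controller_route(filepath):
    """Return the path after controllers/ without its extension, or None."""
    match = CONTROLLER_PATH.match(filepath)
    if match is None:
        return None
    return match.group(1)

print(controller_route("/D/Web/MyApp/application/controllers/admin/user.php"))
```

The greedy `(.+)` group takes in any number of subdirectories, so deeper paths such as admin/something/user come back whole.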