searching source code got messed up because extra spaces between words - regex

I was told to add some text to an existing paragraph on a webpage. I opened the corresponding file, copied some of the existing text, and searched for it in the source code. The search found nothing and I was going nuts. It turned out that in the source code some of the words had extra spaces between them, and since these weren't displayed in the browser, it screwed up searching for them in the code.
Anyone have tips to avoid this? For example is there a way to ignore white spaces, perhaps using regular expressions? It should be simple to add a sentence of text but I ended up using the design view in Dreamweaver.

Using regular expressions, you can search for multiple spaces with a pattern like "a +pattern +like +this": put a + after each space, which instructs the search to look for one or more spaces at that position.
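If the search has to be done from a script rather than an editor, the same idea can be sketched in Python. This is only an illustration under assumptions: the function name `whitespace_tolerant_pattern` and the sample markup are made up, and `\s+` is used instead of ` +` so that line breaks and tabs between words are tolerated as well.

```python
import re

def whitespace_tolerant_pattern(phrase):
    """Build a regex that matches the phrase with any run of whitespace between words."""
    words = phrase.split()
    # Escape each word so characters such as '.' or '(' are matched literally.
    return re.compile(r"\s+".join(re.escape(word) for word in words))

# Hypothetical source text with extra spaces and a line break between words.
source = "<p>Welcome   to  our\n   site</p>"
match = whitespace_tolerant_pattern("Welcome to our site").search(source)
print(match.group(0) if match else "not found")
```

The text copied from the browser goes in as the phrase; the match object then gives the position of the passage in the source.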

Related

Why isn't Atom recognizing my regular expressions?

I'm using Atom to format some text data for analysis (I know there are probably better ways of doing it than this so I'm all ears) but it doesn't seem to be recognizing my regular expression.
The text is POS tagged tokens with sentences being delineated with newlines, formatted as such:
good\tJJ\n
workout\tNN\n
.\t.\n
''\t''\n
\n
Perhaps\tRB\n
the\tDT\n
I was able to replace all of the tabs (\t) with a forward slash (/) no problem, but I'm now trying to replace all newlines that DON'T delineate sentences with just a space. I tried \S\n and it "wasn't found". I also tried to highlight all delineating newlines with ^\n$ but there were only two matches and only at the end of the document.
Am I doing this wrong? My only usage of regex is with Python, so maybe there's just a different way to do it in Atom.
EDIT: I'm just giving up and gonna use Python to process it. Nothing suggested worked. The search function seemed to just be bugging out in general (e.g. one search would not work, but then if I closed the search function and reopened it, the same search would work) because it's a long file (700,000+ lines) despite it not being a large file, data-wise (6,235 KB). If anyone can recommend a large file text editor, though, it would be appreciated.
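For anyone taking the same Python route, here is a minimal sketch of that processing, assuming the format shown in the question (one token and tag per line separated by a tab, with a blank line between sentences). The function name `join_tagged_sentences` is hypothetical, not something from the question.

```python
def join_tagged_sentences(text):
    """Turn token<TAB>tag lines into one 'token/tag token/tag ...' line per sentence."""
    sentences = []
    current = []
    for line in text.splitlines():
        if line.strip():
            # Token line: swap the tab for a slash and keep it in the current sentence.
            current.append(line.replace("\t", "/"))
        elif current:
            # Blank line: the current sentence is complete.
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return "\n".join(sentences)

sample = "good\tJJ\nworkout\tNN\n.\t.\n''\t''\n\nPerhaps\tRB\nthe\tDT\n"
print(join_tagged_sentences(sample))
```

Going line by line avoids regex matching across newlines entirely, which is the part that was misbehaving in the editor.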

Visual Studio Code - Removing Lines Containing criteria

This probably isn't a VS Code-specific question but it's my tool of choice.
I have a log file with a lot of lines containing the following:
Company.Environment.Security.RightsBased.Policies.RightsUserAuthorizationPolicy
Those are debug-level log records that clutter the file I'm trying to process. I'm looking to remove the lines with that content.
I've looked into Regex but, unlike removing a blank line where you have the whole content in the search criteria (making find/replace easy), here I need to match from line break to line break on some criteria between the two, I think...
What are your thoughts on how criteria like that would work?
If the criterion is a particular string and you don't want to have to remember regexes, there are a few handy keyboard shortcuts that can help you out. I'm going to assume you're on a Mac.
Cmd-F to open find.
Paste your string.
Opt-Enter to select all of the instances of the string on the page.
Cmd-L to broaden the selection to the entire line of each instance on the page.
Delete/Backspace to remove those lines.
I think you should be able to just search for ^.*CONTENT.*$\n, where the content is the text you showed us. That is, search on the following pattern:
^.*Company\.Environment\.Security\.RightsBased\.Policies\.RightsUserAuthorizationPolicy.*$\n
And then just replace with empty string.
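The same pattern can also be applied outside the editor. Below is a rough Python sketch, assuming the log is small enough to read into memory; `drop_lines_containing` and the sample log lines are made up for illustration. `re.MULTILINE` makes ^ and $ match at each line, and the trailing `\n?` lets the last line match even without a final newline.

```python
import re

def drop_lines_containing(text, needle):
    """Remove every line that contains the literal string `needle`."""
    # re.escape handles the dots in the class name, as in the escaped pattern above.
    pattern = re.compile(r"^.*" + re.escape(needle) + r".*$\n?", re.MULTILINE)
    return pattern.sub("", text)

needle = "Company.Environment.Security.RightsBased.Policies.RightsUserAuthorizationPolicy"
# Hypothetical log content for illustration only.
log = (
    "INFO start\n"
    "DEBUG " + needle + " check passed\n"
    "INFO end\n"
)
print(drop_lines_containing(log, needle))
```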
I have already upvoted #james's answer, but I also found a VS Code extension that is easier to use and has more features. Here it is
It has much easier options for applying filters.
To match the specific case mentioned in the question, I am attaching a screenshot that shows how to use it. I am posting this for others who come here searching for the same issue (like I did).

RegEx: Best way to search and replace $!esc.html($!{XYZ})

I have inherited a project that includes html email templates and the text files that get sent along with it.
The back-end puts it all together, so that it's a multipart email message in the end. In other words, if someone has HTML turned off, they can read the text version. TMI.
Problem:
The guy before me left all kinds of $!esc.html($!{XYZ}) in the text files, where XYZ stands for various different strings in the code.
I haven't touched RegEx in years and am at a loss.
Question
Is it possible to look for every occurrence of such variables in the text files and replace it with: $!{LAST_NAME}?
Can someone point me in the right direction? I have tried one of those RegEx recipe sites, but I got stuck. Any suggestions and/or help with this would be tremendously appreciated.
I am using SublimeText3, and I know how to find & replace in .txt files only.
Peace. Calm. Light.
Not sure what 'flavor' of regex sublime uses, but this should work. I'm assuming the XYZ means it will only be letters in there?
\$!esc\.html\(\$!\{\w*\}\)
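Note that \w already matches underscores, so the pattern above also handles names such as LAST_NAME. The following version only makes that explicit: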
\$!esc\.html\(\$!\{(\w|_)*\}\)
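To show the replacement side as well, here is a small Python sketch. It assumes the goal is to drop the $!esc.html(...) wrapper and keep the inner variable, so that $!esc.html($!{LAST_NAME}) becomes $!{LAST_NAME}; the question does not spell this out. The name `strip_esc_html` is hypothetical. The inner variable is captured in a group and reused in the replacement.

```python
import re

# Group 1 captures the inner variable, e.g. $!{LAST_NAME}.
ESC_HTML = re.compile(r"\$!esc\.html\((\$!\{\w*\})\)")

def strip_esc_html(text):
    """Replace $!esc.html($!{NAME}) with $!{NAME}, keeping the variable name."""
    return ESC_HTML.sub(r"\1", text)

print(strip_esc_html("Dear $!esc.html($!{FIRST_NAME}) $!esc.html($!{LAST_NAME}),"))
```

In an editor with regex find and replace, the equivalent is to put the same parentheses around the inner part of the search pattern and refer to the first group in the replace field.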

How do I join two regular expressions into one in Notepad++?

I've been searching a lot on the web and here, but I can't find a solution to this.
I have to make two replacements in all registry paths saved in a text file as follows:
replace all asterisks with: [#42]
replace all single backslashes with two.
I already have two expressions that do this right:
1st case:
Find: (\*) - Replace: \[#42\]
2nd case:
Find: ([^\\])(\\)([^\\]) - Replace: $1$2\\$3
Now, all I want is to join them together into just one expression so that I can run this in a single pass.
I'm using Notepad++ 6.5.1 in Windows 7 (64 bits).
Example line in which I want this to work (I include backslashes but I don't know if they will appear right in the HTML):
HKLM\SOFTWARE\Classes\*\shellex\ContextMenuHandlers\
I already tried separating it with a pipe, like I do in Jscript (WSH), but it doesn't work here. I also tried a lot of other things but none worked.
Any help?
Thanks!
Edit: I have put all the backslashes right, but the page HTML seems to be "eating" some of them!
Edit2: Someone reedited my text to include an accent that doesn't remove the backslashes, so the expressions went wrong again. But I got it and fixed it. ;-)
Sorry, but this was my first post here. :)
As everyone else already mentioned this is not possible.
But, you can achieve what you want in Notepad++ by using a Macro.
Go to the "Macro" > "Start Recording" menu, apply those two search and replace regular expressions, press "Stop Recording", then "Save Current Recorded Macro", give it a name, assign a shortcut, and you are done. You can now reuse the same replacements whenever you want with one shortcut.
Since your replacement strings are totally different and use data that does not come from any capture (i.e. [#42]), you can't.
Keep in mind that replacement strings are only masks, and cannot contain any conditional content.
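For comparison, outside Notepad++ the limitation goes away when the replacement can be a function instead of a mask. The following Python sketch is only an illustration under that assumption; `escape_registry_path` is a hypothetical name. One alternation matches either an asterisk or a single backslash, and the callback picks the replacement for each match.

```python
import re

# Matches either an asterisk or a backslash that is not next to another backslash.
SPECIAL = re.compile(r"\*|(?<!\\)\\(?!\\)")

def escape_registry_path(path):
    """Replace '*' with '[#42]' and double every single backslash, in one pass."""
    def replace(match):
        # The callback's return value is inserted literally, with no escape processing.
        return "[#42]" if match.group(0) == "*" else "\\\\"
    return SPECIAL.sub(replace, path)

example = r"HKLM\SOFTWARE\Classes\*\shellex\ContextMenuHandlers" + "\\"
print(escape_registry_path(example))
```

The lookarounds stand in for the ([^\\]) groups of the original second expression, so a backslash at the very start or end of a line is handled too.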

Regex select XML Element (containing hyphen) and inside content

I'm working with an enterprise CMS, and in order to properly create our weekly-updated dropdown menu without republishing our entire site, I have an XML document being created which has a number of useful XML elements. However, when pulling in a link with the CMS, the generated XML also outputs the link's contents (the entire HTML for the page). Needless to say, with roughly 50 items, the XML file is too big for use on the web (as it stands I think it's over 600KB). The element is <page-content>filler here</page-content>.
What I'm trying to do is use TextWrangler to find and replace all <page-content> tags as well as their containing content.
I've tried a few different regexes, but I can't seem to match the closing tag, so it will just trail on.
Here's what I've tried:
(<page-content>)(.*?)
The above will match up until the next starting <page-content> tag, which is not what I want.
(<page-content>)(.*?)(<\/page-content>)
(<page-content>)(.*?)(<\/page\-content>)
The above finds no matches, even though the below will find the 7 matches it should.
(<content>)(.*?)(<\/content>)
I don't know if there's a special way to deal with hyphens (I'm inexperienced in regular expressions), but if anyone could help me out, it would be greatly appreciated.
Thanks!
EDIT: Before you tell me that Regex isn't meant to parse HTML, I know that, but there seems to be no other way for me to easily find and replace this. There are too many occurrences to manually delete it and save the file again every week.
It seems the problem is that your . is not matching newlines that exist between your open and close tags.
An easy solution for this would be to add the s flag in order for your . to match over newlines. TextWrangler appears to support inline modifiers (?s). You could do it like this:
(<page-content>)(?s)(.*?)(<\/page-content>)
More information on modifiers here.
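Since this has to be repeated every week, the same fix could also be scripted. Here is a minimal Python sketch, assuming the XML can be read into a string; `strip_page_content` and the sample XML are made up for illustration. `re.DOTALL` plays the role of the (?s) modifier, so . also matches newlines, and the lazy .*? stops at the first closing tag.

```python
import re

# re.DOTALL is Python's equivalent of the inline (?s) modifier.
PAGE_CONTENT = re.compile(r"<page-content>.*?</page-content>", re.DOTALL)

def strip_page_content(xml):
    """Remove every <page-content> element together with its content."""
    return PAGE_CONTENT.sub("", xml)

# Hypothetical sample with content spanning several lines.
sample_xml = (
    "<item><title>A</title><page-content><p>\nfiller\n</p></page-content></item>"
    "<item><title>B</title><page-content>more\nfiller</page-content></item>"
)
print(strip_page_content(sample_xml))
```

The hyphen needs no escaping here because it is only special inside a character class.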