Okay, so I know how to use the wildcard feature of 'Find and Replace' in Microsoft products. What I can't figure out is how to use the wildcards in my replacement text. Below is an example of what I want to accomplish.
I have a 92-page Word document and I am trying to add color formatting quickly. The statement 'filling in page # - ITEM # factory or technical' appears 125 times throughout the document. The # after page ranges from 1-5 and the # after ITEM ranges from 1-25 for each page. I did a wildcard look-up to find everything, 'filling in page [0-9] - ITEM [0-9] factory or technical', but what I need to do is change the font color on all 125 occurrences from black to green. I know how to adjust all the formatting; I just don't know how to use the wildcard feature in the 'Replace with' field.
And perhaps this isn't possible. But I have multiple similar phrases that recur throughout this document, and I was hoping to use the 'Find and Replace' feature to speed up this process.
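One thing worth checking in the search quoted above: [0-9] matches a single digit, so items 10-25 would not be found. The sketch below uses Python's re module rather than Word's wildcard syntax (an assumption, since the two differ), with made-up sample text and a made-up [green] marker standing in for the font color change. It shows the one-digit versus two-digit search and a replacement that reuses the whole match.

```python
import re

# Hypothetical sample text; the real document is not shown in the thread.
text = (
    "filling in page 1 - ITEM 3 factory or technical\n"
    "filling in page 5 - ITEM 25 factory or technical\n"
)

# One digit only after ITEM, as in the question's search.
one_digit = re.compile(r"filling in page [0-9] - ITEM [0-9] factory or technical")
# One or two digits after ITEM, to cover items 1-25.
two_digit = re.compile(r"filling in page [1-5] - ITEM [0-9]{1,2} factory or technical")

one_digit_hits = len(one_digit.findall(text))
two_digit_hits = len(two_digit.findall(text))

# \g<0> stands for the whole match, so the replacement can wrap the found
# text without retyping it. The [green] marker is purely illustrative.
marked = two_digit.sub(r"[green]\g<0>[/green]", text)

print(one_digit_hits, two_digit_hits)
print(marked)
```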
Hi, I'm doing some infosec research and searching through text bins.
I'm using a text editor to search files, and I want to search for email addresses that meet certain conditions. The text is comma-separated.
Say, for example, I know the email is 20 characters long, I know that the domain is gmail.com, and I also know it starts with t.
[tT](.{9})#gmail.com
If it was correct it should pick up, for example, tqwertyuio#gmail.com and tzxcvb1234#gmail.com. Right?
I'm using EmEditor, which I think uses the Boost Regex engine. This regex is just not working, as it also returns anything that contains that expression.
I've tried to use anchors, but they are not working. Perhaps it's this engine. I would have thought it would be:
^[tT](.{9})#gmail.com$
But it's not working. Any help? Thanks so much; I really just want to learn why I can't do this.
I believe you are looking for 20-character email addresses that are NOT surrounded by letters or digits. In that case, you can search for:
(?<!\w)[tT](.{9})#gmail.com(?!\w)
where \w matches a letter, a digit, or an underscore, (?<!\w) is a negative lookbehind, and (?!\w) is a negative lookahead.
In the Find dialog box, you can enter this regular expression, and make sure you select the "Regular Expressions" option.
You might also want to try the Filter toolbar with the same regular expression.
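To see the lookaround idea in action outside the editor, here is a minimal sketch using Python's re module (an assumption; EmEditor's Boost engine may differ in details). The sample line is made up. Two changes from the pattern above are my own assumptions, not part of the answer: the dot in gmail.com is escaped so it matches only a literal dot, and .{9} is narrowed to [^,\s]{9} so the nine characters cannot run across a comma into the next field. Anchors such as ^ and $ only match at line boundaries, which is why they do not help when many addresses sit on one comma-separated line.

```python
import re

# Hypothetical comma-separated sample; '#' stands in for '@' as in the question.
line = (
    "abc,tqwertyuio#gmail.com,xtzxcvb1234#gmail.com,"
    "Tzxcvb1234#gmail.com,tshort#gmail.com"
)

# (?<!\w) and (?!\w): no word character directly before or after the match.
# [^,\s]{9} keeps the nine characters inside a single field (assumption).
pattern = re.compile(r"(?<!\w)[tT]([^,\s]{9})#gmail\.com(?!\w)")

matches = [m.group(0) for m in pattern.finditer(line)]
print(matches)
```

The address preceded by "x" is skipped because of the lookbehind, and the short one is skipped because it does not have nine characters after the t.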
I have no programming experience and thought this would be simple, but I have searched for days without luck. I am using a program to strip content from a web page. The program uses regex filters to display what you want from the stripped content. The stripped content can be any letters and is in the form USD/SEK. I want to display USDSEK, without the "/".
Thanks
To elaborate further - I am using a program called Data toolbar for chrome, which makes it easy to strip content from web pages. After it strips the content, it provides a regex filter to display what part of the content is displayed. But I have to know the regex command to remove the / from USD/SEK, to display just USDSEK. I've tried [A-Z.,]+ but that only displays USD. I need the regex command to grab the first 3 and last 3 characters only, or to omit the / from the string.
Try adding parentheses around the groups which you wish to capture:
([a-zA-Z]{3})\/([a-zA-Z]{3})
or
([a-zA-Z]{3})\/((?1))
Depending on the functionality of the program you are using, you can then reference these captured groups as $1 and $2 (or \1 and \2, depending on the flavor).
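As a rough illustration of how the two captured groups get glued back together, here is a sketch in Python's re module. Whether Data Toolbar exposes a replacement field at all is not stated in the thread, so treat this only as a demonstration of the group references.

```python
import re

raw = "USD/SEK"  # sample value from the question

# Two capture groups around the slash; \1\2 joins them without the "/".
joined = re.sub(r"([a-zA-Z]{3})/([a-zA-Z]{3})", r"\1\2", raw)
print(joined)
```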
I'm using AutoBlogged to pull a feed in as a blog post. I need to create a regular expression to convert the title of the item into things I can use as metadata. I've attached a screen of the backend I have access to. Any help would be greatly appreciated!
Here are examples of the title from the feed.
Type One Training Event New New Mexico, WY November 2012
Type Two Training Event Seattle, WA November 2012
I need that to become this:
<what>Type One Training Event</what> <city>New New Mexico</city>, <state>WY</state> <month>November</month> <year>2012</year>
<what>Type Two Training Event</what> <city>Seattle</city>, <state>WA</state> <month>November</month> <year>2012</year>
Essentially says take whatever is before the word event and make that "what"
Take anything after the word event and before the comma and make that "city"
Take the two letters after the comma and make that "state"
Take the last two words and make em month and year
AutoBlogged backend:
We actually have an email in queue to respond to you directly once we get our v2.9 update out. The update fixes a bug in the regex feature but I thought I would go ahead and comment here so this question isn't just left open.
The ability to extract info from a feed is one of the coolest and most powerful features of AutoBlogged and this is a perfect example of what you can do with those features.
First of all, here are the regex patterns you would use:
What: (.*)\sTraining\sEvent
City: Training\sEvent\s([^,]*)
State: .*,\s([A-Z]{2})
To use these, you create new custom fields in the feed settings. Note that the custom fields also use the same syntax as the post templates so you can use the powerful regex function to extract info from the feed. This is how the fields should look:
Once you have these custom fields set up, you can use them in your post template as %what%, %city% or %state_code%. This also creates custom fields on your blog post in WordPress. If you don't want that, you can use %regex("%title%", "(.*)\sTraining\sEvent", "1")% instead of %what% directly in your post template.
Quick explanation of the syntax:
If you use %regex("%title%", "(.*)\sTraining\sEvent", "1")% it means this:
Get this info from the %title% field
Use the regex pattern (.*)\sTraining\sEvent
Use match reference 1, the (.*) part.
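To check what each pattern actually captures before putting it into a template, you can run the three patterns against the sample titles. This sketch uses Python's re module as a stand-in (an assumption; AutoBlogged's own regex engine is not named here), and extract is a hypothetical helper, not an AutoBlogged function.

```python
import re

titles = [
    "Type One Training Event New New Mexico, WY November 2012",
    "Type Two Training Event Seattle, WA November 2012",
]

# Patterns quoted in the answer; group 1 corresponds to match reference 1.
patterns = {
    "what": r"(.*)\sTraining\sEvent",
    "city": r"Training\sEvent\s([^,]*)",
    "state": r".*,\s([A-Z]{2})",
}

def extract(title):
    """Return group 1 of each pattern, or None when a pattern does not match."""
    fields = {}
    for name, pattern in patterns.items():
        m = re.search(pattern, title)
        fields[name] = m.group(1) if m else None
    return fields

results = [extract(t) for t in titles]
for r in results:
    print(r)
```

Note that the "what" pattern captures only the text before "Training Event" (for example "Type One"), so the words "Training Event" themselves are not part of the captured value.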
Perhaps match:
^(.* Event) (.*), ([A-Z]{2}) +(?i:(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)) +((?:19|20)\d{2})\b
EDIT: re your comment, it looks like you have to surround your regex in delimiters. Try:
/insert_above_regex_here/
If you want case-insensitive matching, then do:
/insert_above_regex_here_but_remove_(?i:_and_matching_)/i
However, if you do case-insensitive, your state ([A-Z]{2}) will also match two lower-case letters. If this is OK then go for it. You could also try changing that part of the regex to (?-i:([A-Z]{2})), which says "be case-sensitive for this part", but it depends on whether that engine supports it (don't worry, you'll get an error if it doesn't).
Then replace with:
<what>$1</what> <city>$2</city>, <state>$3</state> <month>$4</month> <year>$5</year>
I'm not sure what flavour of regex that interface has, so you might not be able to do the (?i: bit in the Month regex (it just makes that bit case-insensitive). In that case you'll just have to be careful to write your months with one capital letter and the rest lowercase, or you can modify the regex to allow upper-case too.
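For anyone who wants to try the single-pattern approach before pasting it into the plugin, here is a sketch in Python's re module (an assumption; the plugin's engine is unknown). It needs Python 3.6 or later for the scoped (?i:...) group, and tag_title is a hypothetical helper name. Python writes backreferences as \1 where the replacement above uses $1.

```python
import re

months = (
    r"Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|June?|July?|"
    r"Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?"
)

# (?i:...) makes only the month part case-insensitive; the state stays
# restricted to two upper-case letters.
title_re = re.compile(
    r"^(.* Event) (.*), ([A-Z]{2}) +(?i:(" + months + r")) +((?:19|20)\d{2})\b"
)

template = (
    r"<what>\1</what> <city>\2</city>, <state>\3</state> "
    r"<month>\4</month> <year>\5</year>"
)

def tag_title(title):
    """Wrap the parts of a feed title in tags; unmatched titles pass through."""
    return title_re.sub(template, title)

print(tag_title("Type One Training Event New New Mexico, WY November 2012"))
print(tag_title("Type Two Training Event Seattle, WA November 2012"))
```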
I have a 2000-page website that contains over 500 acronyms. What regular expression could I use to find all the acronyms in the text only? I'm using Dreamweaver. Some examples would be AFD, GTDC, IJQW and so on. These are 2 or more capitals that might be bounded or surrounded by other characters, such as (DFT) or l'WQF. Any ideas?
If Dreamweaver has search-via-grep capability, you could just search for any string of letters with all capitals, including whatever necessary punctuation you need, e.g. [A-Z'-]{3,}. The 3 is the minimum number of characters in the acronym; you can change that as needed.
This would probably be better done via shell script, though, just for speed's sake. Let us know what OS you're using and someone else can leave a comment as to how to script that, as I probably don't know.
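In case a script is an option, here is a minimal sketch in Python rather than shell (my choice of language, not something from the thread). The sample text is made up from the examples in the question, and the pattern is one assumption about what counts as an acronym: two or more capitals not glued to other letters or digits. It does not strip HTML tags, so tag and attribute names in upper case would also be reported.

```python
import re
from collections import Counter

# Hypothetical sample text, built from the examples in the question.
text = "The AFD and GTDC teams (DFT) met; l'WQF and IJQW followed. AFD again. A test."

# Two or more consecutive capitals, with no letter or digit directly
# before or after. Raise the {2,} minimum to cut false positives.
acronym_re = re.compile(r"(?<![A-Za-z0-9])[A-Z]{2,}(?![A-Za-z0-9])")

counts = Counter(acronym_re.findall(text))
for acronym, n in sorted(counts.items()):
    print(acronym, n)
```

Counting the hits gives a quick list of distinct acronyms, which is probably more useful across 2000 pages than stepping through every occurrence.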
I keep having problems extracting data using regex: my result is not what I wanted because there might be newlines, spaces, HTML tags, etc. in the string. Is there any way to actually see what is in the string? The debugger seems to show only the real text. How do you deal with this?
If the content of the string is HTML, then the debugger gives you a choice of viewing "HTML" or "Source". Source should show you any HTML tags that are there.
However if your concern is white space, this may not be enough. Your only option is to "view source" on the original page.
The best course of action is to explicitly handle these possibilities in your regex. For example, if you think you might be getting white space in your target string, use the \s* pattern in the critical positions. That will match zero or more spaces, tabs, and new lines (you must also have the "s" option checked in the regex panel for new lines).
However, without specific examples of source text and the regex you are using - advice can only be generic.
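If you can get the string into a scripting language, one generic way to see the invisible characters is to print its escaped form. The sketch below uses Python (an assumption; the tool in question has its own debugger) with a made-up scraped string, and then shows \s* absorbing the whitespace at the critical positions.

```python
import re

# Hypothetical scraped string with a tab, a newline and trailing spaces.
scraped = "Price:\t\n  42 USD  "

# repr() prints escapes such as \t and \n instead of rendering them.
visible = repr(scraped)
print(visible)

# \s* tolerates any run of whitespace, including newlines, at that position.
m = re.search(r"Price:\s*(\d+)\s*USD", scraped)
value = m.group(1) if m else None
print(value)
```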
What I do is use a regex tester (whichever uses the same regex engine that you are using) and I test my pattern on it. I've tried using text editors that display invisible characters but to me they only add to the confusion.
So I just go by trial and error. For instance, if a line ends in:
</a>
Then I'll try the following patterns on the regex tester until I find one that works:
</a>.
</a>..
</a>\s
</a>\s*
</a>\n
</a>\r
</a>\r\n
Etc.
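The trial-and-error loop above can also be run in one go. This is a sketch in Python's re module with a made-up sample line that ends in a Windows line break; another engine may treat the dot and line breaks differently, so the results only apply to the engine you test with.

```python
import re

# Hypothetical sample: the closing tag is followed by \r\n.
sample = 'see <a href="#">link</a>\r\nnext line'

candidates = [
    r"</a>.", r"</a>..", r"</a>\s", r"</a>\s*",
    r"</a>\n", r"</a>\r", r"</a>\r\n",
]

# Record what each candidate matched (None when it did not match at all).
report = {}
for pattern in candidates:
    m = re.search(pattern, sample)
    report[pattern] = m.group(0) if m else None

# repr() makes the invisible characters in each match visible.
for pattern, matched in report.items():
    print(f"{pattern!r:14} -> {matched!r}")
```

Here the candidates that expect a bare \n right after the tag fail, which points to the \r sitting in between.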