I am trying to remove some data in about 1500 lines. Here is my problem:
Text<br />random text<br />
Text<br />slight different random text to the above<br />
Text<br />slightly different again random text the above 2<br />
What I need to do is remove
<br />slightly different again random text the above 2<br />
and everything in it. The problem is that the text changes every time. Is there a wildcard that I can use in the replace function?
You can use the "Use regular expressions" option in the Find and Replace dialog. You'll need something like:
<br />.+?<br />
Try this regular expression in the Find and Replace dialog box:
(<br\s*\/?>\s*)+[a-zA-Z]*(<br\s*\/?>\s*)+
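To see how the non-greedy pattern from the first answer behaves outside an editor, here is a minimal Python sketch. The sample lines are modeled on the question, and the variable names are only illustrative.

```python
import re

# Sample lines modeled on the question; the middle text differs on every line.
lines = [
    "Text<br />random text<br />",
    "Text<br />slightly different random text<br />",
]

# .+? is non-greedy: it stops at the first closing <br /> it can reach.
first_pattern = re.compile(r"<br />.+?<br />")
cleaned = [first_pattern.sub("", line) for line in lines]
print(cleaned)

# The pattern from the second answer, for comparison.
second_pattern = re.compile(r"(<br\s*\/?>\s*)+[a-zA-Z]*(<br\s*\/?>\s*)+")
second_hits = [bool(second_pattern.search(line)) for line in lines]
print(second_hits)
```

In this sketch the second pattern finds nothing, because `[a-zA-Z]*` does not match the spaces in the sample text. It would only match when the text between the tags is a single run of letters.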
I have an XML file that I am trying to parse into my database, but am getting an error stating a certain field exceeds my max character count (2000). I've identified the field in question, but don't have a row number in my error, so I have to find and delete the offender(s) in the XML itself.
Below is a sample. I need to find any entries where the characters between the first occurrence of "CCCStmts Correction" and "RoAmts" is over 2000 characters. I'm using Notepad++ and can only think this will work with regex. Ideas?
<CCCStmts Correction="sample text" />
<CCCStmts Correction="sample text" />
<CCCStmts Correction="sample text" />
<CCCStmts Correction="sample text" />
<CCCStmts Correction="sample text" />
<CCCStmts Correction="sample text" />
<CCCStmts Correction="sample text" />
<RoAmts PayType="x" AmtType="x" TotalAmt="x" />
Regex is not the answer. You could do it with regex, of course, but I assume you have used an API to represent the XML programmatically in a model? Or, even if not, that you are parsing it in order to submit the relevant value contained within the XML, to your database. So once you acquire the value, simply test its length then, and submit it if it conforms to the field's requirements.
To check the length of the string, simply use...
// if the length is 2000 or less
if (string.length() < 2001) {
//your routine
}
... and it will skip over any value that is composed of 2001+ characters.
This approach does not require an additional iteration purely to search, and does not require any replacements to be made. It will be much tidier, and much more efficient.
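As a rough illustration of checking the length while parsing rather than searching with regex, here is a Python sketch using the standard library parser. It assumes the oversized field is a single `Correction` attribute. The `<Order>` wrapper element and the sample values are made up for the example and are not from the question.

```python
import xml.etree.ElementTree as ET

MAX_LEN = 2000  # field limit stated in the question

# Hypothetical sample: the <Order> wrapper and the values are invented for illustration.
xml_text = """<Order>
  <CCCStmts Correction="short text" />
  <CCCStmts Correction="{long}" />
  <RoAmts PayType="x" AmtType="x" TotalAmt="x" />
</Order>""".format(long="y" * 2500)

root = ET.fromstring(xml_text)

# Collect the position and length of every Correction value that is too long.
too_long = [
    (index, len(elem.get("Correction", "")))
    for index, elem in enumerate(root.iter("CCCStmts"))
    if len(elem.get("Correction", "")) > MAX_LEN
]
print(too_long)
```

The same loop could skip or truncate the offending values before the database insert instead of only reporting them.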
I want to find the following pattern:
Image not found: /Images/IMG-20160519-WA0015.jpg
And replace with some markup, including the image name from the above text like:
<a href="IMG-20160519-WA0015.jpg"><img src="IMG-20160519-WA0015.jpg" width="180" height="240" alt="IMG-20160519-WA0015.jpg" class="image" />
Is it possible with some kind of Regex or plugin or I'm simply burning neurones?
Thanks.
Try finding ^Image not found: \/Images\/(IMG-.*\.jpg) and replacing with <a href="\1"><img src="\1" width="180" height="240" alt="\1" class="image" />
Note that the caret (^) in the regex says that it must be at the beginning of the line, not sure if that's the case for you but I suspect that it is. I also assumed that the "IMG-" prefix is constant, if not then you can just remove those four characters from the regex.
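The same find and replace can be tried in Python, which also uses `\1` for the first capture group. This is only a sketch for testing the pattern; `re.MULTILINE` is added here so that `^` matches at the start of every line, which is an assumption about how the input is laid out.

```python
import re

text = "Image not found: /Images/IMG-20160519-WA0015.jpg"

# The parentheses capture the file name so it can be reused as \1.
pattern = re.compile(r"^Image not found: /Images/(IMG-.*\.jpg)", re.MULTILINE)
replacement = (
    r'<a href="\1"><img src="\1" width="180" height="240" '
    r'alt="\1" class="image" />'
)

result = pattern.sub(replacement, text)
print(result)
```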
If you're not aware of it, RegExr is a nice interactive way to build and test regular expressions.
EDIT: Since you mentioned having trouble in the comments, here's an image of my settings:
I need help setting up a macro or a code-style rule for reformatting code. The code I write (ColdFusion) is tag based, and the ending of each tag needs to be formatted.
The CFML code formatter does not do this. All I want is that, when I format my code, any tag that ends with /> without a space before it (after any character) gets a space inserted before the />.
Example: any code line that ends with )/> or "/> or letter/> or digit/> or anything/> needs to be changed to ) /> or " /> or letter /> or digit /> or anything /> respectively.
How do I get this done?
If you are looking for an automatic conversion for <empty-tag/> to <empty-tag />, you can do that in IntelliJ preferences: Editor->Code Style->XML. Open "Other" and under "Spaces" section on the left check "In empty tag".
I don't think it's possible to configure the IntelliJ formatter to do this. You need to use Find | Replace in Path... with a regular expression.
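One regular expression that could do this uses a lookbehind, so that only a `/>` directly preceded by a non-whitespace character is changed. The sketch below tests the idea in Python; the function name and the sample lines are hypothetical, and the exact replacement syntax in the IDE dialog may differ.

```python
import re

def space_self_closing(line: str) -> str:
    # (?<=\S) is a lookbehind: the character before "/>" must be non-whitespace.
    return re.sub(r"(?<=\S)/>", " />", line)

# Made-up CFML lines for illustration.
samples = [
    '<cfset x = foo()/>',
    '<cfset y = "a"/>',
    '<cfset z = 1 />',  # already spaced, should stay unchanged
]
fixed = [space_self_closing(s) for s in samples]
print(fixed)
```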
I am trying to find multiple occurrences of some text in Eclipse in multiple files but not able to write the correct regular expression for the same.
I have multiple occurrences of text which matches the following pattern
<import resource="classpath:META-INF/cxf/cxf-extension-soap.xml" />
<import resource="classpath:META-INF/cxf/cxf-extension-http.xml" />
After cxf-extension there could be anything.
So I want to find all occurrences that start with "<import", contain the word "cxf-extension", and end with "/>".
I think this will work:
<import.*?cxf-extension.*?\/>
And here are some tests:
http://www.regex101.com/r/yX3mH3
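A quick way to check the pattern locally is a short Python sketch like the one below. The first sample line (without cxf-extension) is invented to show a non-match. By default `.` does not cross line breaks, so each match stays on one line.

```python
import re

# Sample input; the first line is a made-up non-matching import.
xml_text = '''<import resource="classpath:META-INF/cxf/cxf.xml" />
<import resource="classpath:META-INF/cxf/cxf-extension-soap.xml" />
<import resource="classpath:META-INF/cxf/cxf-extension-http.xml" />'''

# Non-greedy .*? keeps each match as short as possible.
pattern = re.compile(r"<import.*?cxf-extension.*?/>")
matches = pattern.findall(xml_text)
print(len(matches))
```

If several import tags share one line, the first `.*?` can run across a tag that lacks cxf-extension, so a stricter pattern such as `<import[^>]*cxf-extension[^>]*/>` may be safer in that case.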
I want to get the string between the em tags, including any other HTML inside them.
for example:
<em>UNIVERSALPOSTAL UNION - International Bureau Circular<br />
By: K.J.S. McKeown</em>
output should be as:
UNIVERSALPOSTAL UNION - International Bureau Circular<br />
By: K.J.S. McKeown
please help me.
Thanks
Use the regular expression function like this:
REMatch("(?s)<em>.*?</em>", html)
See also: http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=regexp_01.html
The (?s) sets the mode to single line, so that the dot also matches line feeds and the input text is treated as one line. As Peter pointed out in a comment, this is not the default and therefore must be set.
The .*? matches all characters between <em> and </em>. The question mark after the quantifier makes it "non-greedy", so that as few characters as possible are matched. This is needed in case the input HTML contains something like <em>foo</em><em>bar</em>, where a greedy match would otherwise run from the first <em> to the last </em>.
The returned array contains all matches found, i.e. every <em> element with its text and any HTML inside it.
Note that this could fail where </em> also occurs as attribute text and is incorrectly not HTML-encoded, for example: <em><a title="Help for </em> tag">click</a></em>, or in other rare circumstances (e.g. JavaScript script tags). A regex cannot replace a full HTML/XML parser, and if you need 100% accuracy, you should consider using one: http://livedocs.adobe.com/coldfusion/8/htmldocs/help.html?content=functions_t-z_23.html
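The effect of (?s) and the non-greedy quantifier can be seen in this small Python sketch, which uses the same regex syntax for these features. The sample string is made up, and the second pattern adds a capture group (not part of the answer above) to return only the content between the tags.

```python
import re

# Made-up input: two adjacent em elements, the second spanning a line break.
html = "<em>foo</em><em>bar<br />\nbaz</em>"

# Full matches, including the surrounding <em> tags.
matches = re.findall(r"(?s)<em>.*?</em>", html)

# With a capture group, findall returns only the inner content.
inner = re.findall(r"(?s)<em>(.*?)</em>", html)

# A greedy quantifier swallows everything up to the last </em>.
greedy = re.findall(r"(?s)<em>.*</em>", html)

print(matches, inner, greedy)
```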
If your input is exactly in the format given above, you don't even need regex - just strip the outer tags:
<cfsavecontent variable="Input">[text from above]</cfsavecontent>
<cfset Output = mid( Input, 5 , len(Input) - 9 ) />
If your input is more than this (i.e. a significant piece of HTML, or a full HTML document), regex is still not the ideal tool - instead, you should be using a HTML parser, such as JSoup:
<cfset jsoup = createObject('java','org.jsoup.Jsoup') />
<cfset Output = jsoup.parse(Input).select('em').html() />
(With CF8, this code requires placing the jsoup JAR file in CF's lib directory, or using a tool such as JavaLoader.)
If you are using jQuery, you can also do this pretty easily.
$("em").html();
This returns the HTML between the tags of the first matched em element.
I had to remove any text that followed a particular tag. The HTML content was generated dynamically from a database that caters to 5 different languages, so I only had the div tag to help me. I am not sure why REMatch("(?s).*?", html) did not work for me. However, Ben helped me here (http://www.bennadel.com/blog/769-Learning-ColdFusion-8-REMatch-For-Regular-Expression-Matching.htm). My code looks like this:
<cfset extContentArr = REMatch("(?i)<div class=""inlineBlock"" style=""margin-right:30px;"">.+?</div>",qry_getContent.colval) />
<cfif !ArrayIsEmpty(extContentArr)>
<!--- Loop over the array and do whatever you need with the extracts; I just deleted them. --->
</cfif>