When using MediaWiki's markup language, the only thing that I hate is creating numbered lists. The only way I know to create a list is to do something like this:
#Item1
#Item2
However, if I want to add spaces or some other text between those lines, the numbering gets lost. For example, the following will create text that has two number one items:
#Item1
Somestuff
#Item2
Is there any way around this, or should I just use bullet points instead? I noticed just now that Stack Overflow does not allow numbering like this either; you have to do it all manually.
Like this:
#Item1
#:Somestuff
#Item2
I use <ol></ol> and <li></li> to embed the <pre></pre> code formatting portions. Works great for me! :-)
There are a couple of options, but you can start an ordered list from an arbitrary number like this:
#Item1
Something
<ol start="2">
#Item2
</ol>
You can also use "#:" if you don't mind "Something" being indented a lot:
#Item1
#:
#: Something
#:
#Item2
There are quite a lot of options with lists; you can find more info on Wiki's Help Pages:List.
Update:
Newer versions work more like regular HTML markup. The old syntax will now give you a double indent and will not adjust the start offset, but the following works well, even with the source/syntaxhighlight tag.
<ol>
<li>Item1</li>
Something
</ol>
<ol start="2">
<li>Item2</li>
<source lang=javascript>
var a = 1;
</source>
</ol>
In short, everything within the ol tag will have the same indentation and will not be numbered if it is outside a li tag. The following will now work, and it means you don't have to offset groups manually.
<ol>
<li>Item1</li>
Something
<li>Item2</li>
<source lang=javascript>
var a = 1;
</source>
</ol>
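If you have many existing lists in the old "#" syntax, the conversion to the single-<ol> form above can be scripted. This is only a minimal sketch in Python; to_html_list is a hypothetical helper of mine, not part of MediaWiki, and it assumes one flat list with no nesting (no "##" or "#:" lines).

```python
def to_html_list(lines):
    """Wrap wiki-style '#' items and the plain text between them in one <ol>.

    Hypothetical helper: handles a single flat list only.
    """
    out = ["<ol>"]
    for line in lines:
        if line.startswith("#"):
            # A numbered item: strip the marker and wrap it in <li>.
            out.append("<li>" + line[1:].strip() + "</li>")
        else:
            # Text between items stays inside the <ol>, outside any <li>,
            # so it is indented with the list but not numbered.
            out.append(line)
    out.append("</ol>")
    return "\n".join(out)


converted = to_html_list(["#Item1", "Somestuff", "#Item2"])
print(converted)
```

The printed markup has the same shape as the example above: both items sit in one ol, so the numbering continues past "Somestuff".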
The #: works, but you cannot create a section with spaces, so I would prefer the non-working option. Does anyone know a similar syntax that does the trick (start numbering at a given value)?
This response is probably a bit late, but I figure I'll add it in case anyone stumbles across this, as I have.
You can create a section with spaces by doing something like:
# Item 1
#:
#:
# Item 2
This will appear as:
Item 1
Item 2
Now, before you say this doesn't work, the trick is to add a no-break space (U+00A0) after the #: rather than simply hitting the spacebar. On Windows you can add this by holding ALT and typing 0160 on the numeric keypad. Doing this should add the usual Wiki paragraph formatting while preserving your numbering between #s.
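If typing the character by hand is awkward, the lines can be generated instead. A minimal sketch in Python, assuming the character meant here is U+00A0 (which is what ALT+0160 produces on Windows); spaced_items is a hypothetical helper name.

```python
NBSP = "\u00a0"  # no-break space, the character ALT+0160 produces on Windows


def spaced_items(items, blank_lines=2):
    """Build '#' list markup with '#:' + NBSP spacer lines between items."""
    out = []
    for i, item in enumerate(items):
        out.append("# " + item)
        if i < len(items) - 1:
            # Spacer lines go between items only, not after the last one.
            out.extend(["#:" + NBSP] * blank_lines)
    return "\n".join(out)


markup = spaced_items(["Item 1", "Item 2"])
print(markup)
```

The spacer lines look empty when printed, but each one ends in the no-break space rather than a regular space.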
Hope that helps!
"#:" will not work with other tags like
<source lang=javascript>
//...
</source>
I'm using Mediawiki 1.13.3 and this works:
#Item1
Somestuff
<ol start="2">
<li>Item2 </li>
</ol>
And for cases where you want to have some block text within your numbered wiki list try this
# one
#:<pre>
#:some stuff
#:some more stuff</pre>
# two
Which produces:
1. one
some stuff
some more stuff
2. two
From the Wiki Help Page I was able to get the numbering in a list to stay consistent using <p> and <pre>:
# Item 1
# Item 2 <p><pre>Item 2 Pre Stuff</pre></p>
# Item 3
Would generate
1. Item 1
2. Item 2
[ Item 2 Pre Stuff ]
3. Item 3
Following the link to Wiki Help, I found an example that meets what I think are the implied requirements:
(1) The list needs to keep numbering.
(2) Sometimes the "Somestuff" should be on its own line in the source.
To get (1) there are a few solutions proposed, but one way is to use paragraph delimiters around the extra "somestuff".
Example 1:
# Paragraph 1.<p>Paragraph 2.</p><p>Paragraph 3.</p>
# Second item.
To meet (2), you use paragraph marking in combination with commenting out the new lines (with <!-- newline-->).
Example 2:
# Paragraph 1.<!--
--><p>Paragraph 2.</p><!--
--><p>Paragraph 3.</p>
# Second item.
Both examples display as
Result:
1. Paragraph 1.
Paragraph 2.
Paragraph 3.
2. Second item
Note that the comment eats all of the white space between the end of one element and the start of the next, which seems to be standard practice, and makes sense if you're trying to have whitespace without the "wiki effects" of the white space.
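The layout of Example 2 can be produced mechanically, which may help if list items are assembled by a script. A minimal sketch in Python; multiline_item is a hypothetical helper, and it only reproduces the source layout shown in Example 2, it does not render anything.

```python
def multiline_item(paragraphs):
    """Return one '#' list item whose paragraphs sit on separate source lines.

    Each line break is wrapped in an HTML comment, so the wiki parser
    still sees a single logical line.
    """
    first, *rest = paragraphs
    parts = ["# " + first] + ["<p>" + p + "</p>" for p in rest]
    # The comment opens at the end of one line and closes at the start
    # of the next, hiding the newline between them.
    return "<!--\n-->".join(parts)


item = multiline_item(["Paragraph 1.", "Paragraph 2.", "Paragraph 3."])
print(item)
```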
Extension:ComplexList
https://www.mediawiki.org/w/index.php?oldid=2126533
was put together but not maintained (for lack of time). It works with 1.26.2 of MediaWiki.
For example.
<cl>
1. list item A1
* list item A2
continuing list item A2
further continuing list item A2
* list item A3
</cl>
becomes
list item A1
list item A2continuing list item A2further continuing list item A2
list item A3
You can do:
# one
# two<br />spanning more lines<br />doesn't break numbering
# three
## three point one
## three point two
Regular old <br> works as well but probably pisses off someone.
You can put additional HTML formatting in as well, such as <pre> formatting, without breaking the numbering. This also works with other list formats.
From:
http://www.mediawiki.org/wiki/Help:Formatting
edit: Also found that inside a <pre></pre> many of my old tricks don't work, but using &#10; works as a newline, and allows multi-line blocks. The cost is that you jam all your lines on one line.
# one
#: <pre>foo&#10;bar</pre>
Have I found a bug in Notepad++ or am I doing something wrong?
Background info
(Please note that I do know that one is not supposed to parse HTML with regex, but I think this is a special case that should work, apart from the possible Notepad++ bug ;-)
I have exported Apple Notes as HTML using Exporter 3.0 on a Mac. In the HTML output every Note line is between <div> - </div> elements and also "header/title lines" like <h1> - </h1> or <h2> - </h2> etc. Each "header/title line" is often split in several unnecessary HTML header elements as in the following simplified example.
<div><h1>TEST </h1><h1>Title<br></h1></div>
<div><b><h2>T1</h2><u><h2>T2</h2></u><h2> </h2></b><h2>(</h2><h2>T3</h2><u><h2>T4</h2></u><h2>)</h2><b><h2><br></h2></b></div>
This HTML can't be imported into OneNote with the same result as seen in Apple Notes; instead, each "header/title" line is split into multiple lines. That's true even when changing the <h1>/<h2> block elements to inline elements using an initial <style>h1, h2 {display: inline;}</style> statement. (Maybe that is a bug or restriction in OneNote, but I need to find a workaround.)
Therefore, I need to clean the example HTML output above from the unnecessary HTML header <h1> or <h2> (all but the first in every line) and </h1> or </h2> (all but the last in every line), to get the following result that can be imported to OneNote without problem.
<div><h1>TEST Title<br></h1></div>
<div><b><h2>T1<u>T2</u> </b>(T3<u>T4</u>)<b><br></h2></b></div>
Solution ? - Developed Regex
I'm quite new to Regex, especially advanced Regex, but I think I have found a way to clean the erroneous HTML code using TWO different Regex expressions as follows.
Both work well when tested using regex101.com, I think.
The first one is used to remove unnecessary </h1> or </h2> elements and uses a positive lookahead (it works both in regex101 and in Notepad++):
(</h[1-6]>)(?=.*?\1)
(Demo)
Picture 1 shows a working Find All + Mark All in Notepad++
Picture 2 shows a working Replace All
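For anyone who wants to check the lookahead pattern outside Notepad++, here is a minimal sketch in Python. Python's re module is a different engine from the one in Notepad++, so this only confirms what the pattern itself matches; drop_redundant_closers is a hypothetical name.

```python
import re

# Pattern from the question: a closing heading tag that is followed,
# later on the same line, by an identical closing tag.
REDUNDANT_CLOSE = re.compile(r"(</h[1-6]>)(?=.*?\1)")


def drop_redundant_closers(html):
    """Remove every </hN> except the last one of its level on each line."""
    return REDUNDANT_CLOSE.sub("", html)


sample = "<div><h1>TEST </h1><h1>Title<br></h1></div>"
cleaned = drop_redundant_closers(sample)
print(cleaned)  # <div><h1>TEST <h1>Title<br></h1></div>
```

Only the closing tags are handled here; the surplus opening <h1> is left in place.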
The second one is used to remove unnecessary <h1> or <h2> elements and uses a positive lookbehind (it works in regex101 but NOT fully in Notepad++):
(?<=(<(h[1-6])>))(?:.*?)\K\1
(Demo)
Picture 3 shows a working Find All + Mark All in Notepad++ = All 8 occurrences found
Picture 4 shows a NOT working Replace All in Notepad++ = Only 5 occurrences (of the 8 found) are replaced
If I redo the same Replace All a second time, 2 of the remaining 3 occurrences are replaced.
If I redo the same Replace All a third time, the last remaining occurrence is replaced.
BUG ?
Is this a bug in Notepad++ or is this behavior normal or am I doing something strange here? Please help me understand.
So, rather than make multiple passes through your data, you can get it all in one pass with this:
(^.*?<h[1-6]>)?(.*?)</?h[1-6]>(?=.*</h[1-6]>.*?$)
and replace it with \1\2. The first capture group skips the first <h#> on each line and is null after line start. The second capture group captures everything up to the next <h#> tag. The optional slash (/?) scans and deletes both open and close tags. The last part is a positive lookahead to make sure the last </h#> is not deleted.
In the two lines of your examples all the header levels were the same on the line and this regex is fine. If the first open and last close don't match, then you have a problem but I think your solutions also have that same problem. In any case you can fix that in a second pass with ^(.*<h)([1-6])(.*<h)[1-6] and replace it with \1\2\3\2.
I would also point out that this creates unbalanced HTML with a <b>, followed by <h1>, followed by </b>, followed by </h1>. I don't know if that is OK for your case. If not, it might be better to remove ALL the <h#> tags and anchor new ones just inside the <div> </div> pair.
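The single-pass replacement can also be tried outside Notepad++. A minimal sketch in Python, assuming the two example lines from the question as input; merge_headings is a hypothetical name, and the MULTILINE flag is my addition so that ^ and $ work per line on multi-line input.

```python
import re

# The single-pass pattern from this answer.
SINGLE_PASS = re.compile(
    r"(^.*?<h[1-6]>)?(.*?)</?h[1-6]>(?=.*</h[1-6]>.*?$)",
    re.MULTILINE,
)


def merge_headings(html):
    """Keep the first opening and last closing heading tag on each line."""
    # Group 1 is unset for every match after the first on a line,
    # so fall back to an empty string (the equivalent of \1\2).
    return SINGLE_PASS.sub(lambda m: (m.group(1) or "") + m.group(2), html)


source = "\n".join([
    "<div><h1>TEST </h1><h1>Title<br></h1></div>",
    "<div><b><h2>T1</h2><u><h2>T2</h2></u><h2> </h2></b><h2>(</h2>"
    "<h2>T3</h2><u><h2>T4</h2></u><h2>)</h2><b><h2><br></h2></b></div>",
])
result = merge_headings(source)
print(result)
```

On these two lines the output matches the desired result given in the question, including the unbalanced <b>/<h2> nesting mentioned above.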
In any event here is a REGEX101 screenprint with this regex working on your examples:
In BBEdit there is a feature where you select the text, choose text->change case->make title case from the menu, and it will act accordingly.
Is there a way to select multiple strings of text across files in your project then apply the same text formatting?
I know you can do some regex stuff and change it, but none of that is the true title case where it ignores the words like "of" "and" "the" etc. The title case works great, I just need to do it on many items.
For example, 5 HTML files have <h2>THIS IS THE TITLE</h2>, so now I go to each file, select the text, and use the above menu item. That is fine if there are 5, but if there are 2500 that I want to make <h2>This is the Title</h2>, then I need to be able to select more than one at a time.
Thanks in advance!
------edits
So if you were searching across multiple files for all your <h2> tags and you get back several across different files.....
<h2>MY TITLE</h2>
<h2>this is a title</h2>
<h2>Another title</h2>
The title case would change each of them accordingly to:
<h2>My Title</h2>
<h2>This is a Title</h2>
<h2>Another Title</h2>
Currently you select each one individually to do so via the menu. We'd like to do this with a find-all and change case if that makes sense....
F:<h2>(.*?)</h2>
R: <h2>\1</h2>[make this a certain case]
Thanks.
Almost any operation which involves repetition in BBEdit can be automated using AppleScript. :-)
Here's the text of an AppleScript script which will perform the operation you describe. You can copy and paste this into the AppleScript editor, and save it in BBEdit's "Scripts" folder for future use as needed.
use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

tell application "BBEdit"
    -- make sure we start at the top, because searches will (by default) proceed from the end of the selection range.
    tell text of document 1 to select insertion point before first character
    repeat
        tell text of document 1
            -- Find matches for a string inside of heading tags (any level)
            -- NOTE extra backslashes in the search pattern to keep AppleScript happy
            set aSearchResult to find "(<(h\\d)>)(.+?)(</\\2>)" options {search mode:grep}
        end tell
        if (not found of aSearchResult) then
            exit repeat -- we're done
        end if
        -- the opening tag is the first capture group. We'll use this below
        set openingTagText to grep substitution of "\\1"
        -- the title is the third capture group
        set titleText to grep substitution of "\\3"
        -- use "change case" to titlecase the title
        set changedTitleText to change case (titleText as string) making title case
        -- select the range of text containing the title, so that we can replace it
        set rangeStart to (characterOffset of found object of aSearchResult) + (length of openingTagText)
        set rangeEnd to (rangeStart + (length of changedTitleText) - 1)
        select (characters rangeStart through rangeEnd of text of document 1)
        -- replace the range
        set text of selection to changedTitleText
        -- select found object of aSearchResult
    end repeat
    -- put the insertion point back at the top, because it's a nice thing to do
    tell text of document 1 to select insertion point before first character
end tell
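The script above works on the front document. For the 2500-file case, one alternative is to do the rewrite outside BBEdit. This is a rough sketch in Python, not BBEdit's own title-case logic: the SMALL_WORDS list is my assumption (chosen so the examples in the question come out as shown), and title_case, title_case_headings and rewrite_files are hypothetical names.

```python
import re
from pathlib import Path

# Assumed list of words left lowercase unless they start the title.
SMALL_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for",
               "in", "is", "of", "on", "or", "the", "to"}

# Same heading pattern as the AppleScript: any <hN>...</hN> pair.
HEADING = re.compile(r"(<(h\d)>)(.+?)(</\2>)")


def title_case(text):
    """Capitalize each word except small words that are not first."""
    words = text.lower().split(" ")
    out = []
    for i, word in enumerate(words):
        if i != 0 and word in SMALL_WORDS:
            out.append(word)
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


def title_case_headings(html):
    """Title-case the text inside every heading tag pair."""
    return HEADING.sub(
        lambda m: m.group(1) + title_case(m.group(3)) + m.group(4), html)


def rewrite_files(paths):
    """Apply title_case_headings to each file in place (keep backups!)."""
    for path in map(Path, paths):
        original = path.read_text(encoding="utf-8")
        path.write_text(title_case_headings(original), encoding="utf-8")


print(title_case_headings("<h2>THIS IS THE TITLE</h2>"))
```

Headings that contain nested tags or span several lines are not handled by this sketch.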
Our database stores a "table of contents" for each issue of our magazine as an unordered list. I want to create an "Articles related to #specificString#" page, so I'd like to query for the Table of Contents, and then find and display only those list items containing that specific string.
For example, suppose the specific string is "bumblebee," and the stored table of contents list is like so:
<ul>
<li>"The Secret Life of the Honeybee" by Anonymous</li>
<li>I got stung by a bumblebee!</li>
<li>"Flight of the Bumblebee" was composed by Rimsky-Korsakov.</li>
<li>"The Case of the Disappearing Honeybee" by A. Conan Doyle, 1904</li>
</ul>
I'd like to match and display the text from the second and third list item but not the first or the fourth. I do not need to return the HTML -- only the relevant text. Conversely, if I could bleep out any list items that do NOT contain the relevant text, that would be fine as well!
I have tried
REMatchNoCase("<li>.*bumblebee.*</li>", text)
which finds all list items, even those that do not contain "bumblebee." Any suggestions or nudges in the right direction would be greatly appreciated! Many thanks!
Instead of .*, use negated character classes that exclude the characters enclosing your match, so the match cannot run past the end of a list item.
You can do this:
<li>[^>]*bumblebee[^<]*</li>
Here is Demo
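The same idea can be tried in Python. This is only a sketch of the technique, not ColdFusion code: it uses [^<]* on both sides of the term (a slight variation on the pattern above), a capture group to return just the text, and the sample list with "Bumblebee" spelled out. items_mentioning is a hypothetical name.

```python
import re

toc = """<ul>
<li>"The Secret Life of the Honeybee" by Anonymous</li>
<li>I got stung by a bumblebee!</li>
<li>"Flight of the Bumblebee" was composed by Rimsky-Korsakov.</li>
<li>"The Case of the Disappearing Honeybee" by A. Conan Doyle, 1904</li>
</ul>"""


def items_mentioning(html, term):
    """Return the text of each <li> whose content contains term."""
    # [^<]* cannot cross a tag boundary, so a match stays inside one <li>.
    pattern = re.compile(
        r"<li>([^<]*" + re.escape(term) + r"[^<]*)</li>", re.IGNORECASE)
    return pattern.findall(html)


matches = items_mentioning(toc, "bumblebee")
for text in matches:
    print(text)
```

Because the character class excludes "<", this only works for list items that contain no nested tags.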
Hi, I am using the following code to remove contents from a page I do not know.
I am using regex and cannot use jsoup, so please do not provide any jsoup links or code, because they would be useless to me here.
<cfset removetitle = rereplacenocase(cfhttp.filecontent, '<title[^>]*>(.+)</title>', "\1")>
Now, in the same way, I want to handle the following things:
1. <base href="http://search.google.com">
2. <link rel="stylesheet" href="mystyle.css">
3. There are 5 tables inside the body; I want to remove the 2nd table.
Can anyone guide me on this?
Scott is right, and Leigh was right before, when you asked a similar question, jSoup is your best option.
As to a regex solution: this is possible with regex, but there are problems that regex cannot always solve. For instance, if the first or second table contains a nested table, this regex would trip. (Note that text is not required between the tables; I'm just demonstrating that things can be between the tables.)
(If there is always a nested table, regex can handle it, but if there is only sometimes a nested table, in other words it is unknown, it gets a lot messier.)
<cfsavecontent variable="sampledata">
<body>
<table cellpadding="4"></table>stuff
is <table border="5" cellspacing="7"></table>between
<table border="3"></table>the
<table border="2"></table>tables
<table></table>
</body>
</cfsavecontent>
<cfset sampledata = rereplace(sampledata,"(?s)(.*?<table.*?>.*?<\/table>.*?)(<table.*?>.*?<\/table>)(.*)","\1\3","ALL") />
<cfoutput><pre>#htmleditformat(sampledata)#</pre></cfoutput>
What this does is
(?s) sets . to match newlines as well.
(.*?<table.*?>.*?<\/table>.*?) Matches everything before the first table, the first table, and everything between it and the second table and sets it as capture group 1.
(<table.*?>.*?<\/table>) Matches the second table and creates capture group 2.
(.*) matches everything after the second table and creates capture group 3.
And then the third parameter, \1\3, picks up the first and third capture groups.
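For comparison, the same three-group pattern behaves the same way in Python's re module. This is a sketch outside ColdFusion, using the sample data from above; remove_second_table is a hypothetical name, and the nested-table caveat applies here too.

```python
import re

# Same three capture groups as the rereplace call: everything up to and
# including the first table, then the second table, then the rest.
SECOND_TABLE = re.compile(
    r"(?s)(.*?<table.*?>.*?</table>.*?)(<table.*?>.*?</table>)(.*)")


def remove_second_table(html):
    """Drop the second <table>...</table>; breaks on nested tables."""
    return SECOND_TABLE.sub(r"\1\3", html, count=1)


sampledata = """<body>
<table cellpadding="4"></table>stuff
is <table border="5" cellspacing="7"></table>between
<table border="3"></table>the
<table border="2"></table>tables
<table></table>
</body>"""

stripped = remove_second_table(sampledata)
print(stripped)
```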
If you have control of the source document, you can create html comments like
<!-- table1 -->
<table>...</table>
<!-- /table1 -->
And then use that in the regex and end up with a more regex-friendly document.
However, still, Scott said it best, not using the proper tool for the task is:
That is like telling a carpenter, build me a house, but don't use a hammer.
These tools are created because programmers frequently run into precisely the problem you're having, and so they create a tool, and often freely share it, because it does the job much better.
I have an apparently simple regex query for pipes - I need to truncate each item from its (<img>) tag onwards. I thought a loop with string regex of <img[.]* replaced by blank field would have taken care of it but to no avail.
Obviously I'm missing something basic here - can someone point it out?
The item as it stands goes along something like this:
sample text title
<a rel="nofollow" target="_blank" href="http://example.com"><img border="0" src="http://example.com/image.png" alt="Yes" width="20" height="23"/></a>
<a.... (a bunch of irrelevant hyperlinks I don't need)...
Essentially I only want the title text and hyperlink that's why I'm chopping the rest off
Going one better because all I'm really doing here is making the item string more manageable by cutting it down before further manipulation - anyone know if it's possible to extract a href from a certain link in the page (in this case the 1st one) using Regex in Yahoo Pipes? I've seen the regex answer to this SO q but I'm not sure how to use it to map a url to an item attribute in a Pipes module?
You need to remove the line returns with a RegEx Pipe and replace the pattern [\r\n] with null text on the content or description field to make it a single line of text, then you can use the .* wildcard which will run to the end of the line.
http://www.yemkay.com/2008/06/30/common-problems-faced-in-yahoo-pipes/
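The two steps can be tried outside Pipes to see the effect. This is a minimal sketch in Python, not a Pipes module; truncate_at_img is a hypothetical name and the second link in the sample item is made up. Note also that inside square brackets "." is a literal dot, so <img[.]* only matches "<img" followed by dots; the wildcard form is <img.*.

```python
import re


def truncate_at_img(item):
    """Join the item into one line, then cut from the first <img onwards."""
    # Step 1: strip line returns so .* can run to the end of the item.
    single_line = re.sub(r"[\r\n]", "", item)
    # Step 2: remove everything from the <img tag to the end.
    return re.sub(r"<img.*", "", single_line)


item = (
    "sample text title\n"
    '<a rel="nofollow" target="_blank" href="http://example.com">'
    '<img border="0" src="http://example.com/image.png" alt="Yes" '
    'width="20" height="23"/></a>\n'
    '<a href="http://example.org/other">another link</a>'  # made-up link
)
short = truncate_at_img(item)
print(short)
```

What remains is the title text and the opening tag of the first link, with its href intact.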