New line breaks custom markdown extension - regex

I'm writing an extension for python-markdown that is supposed to put the text inside some custom tags of mine into a styled div.
I have created a simple inline pattern class that wraps the matched expression in a div tag. My regex is r'(\{mytag_start\})(.+)(\{mytag_end\})'. On compilation, the markdown.inlinepatterns.Pattern class embeds it in "^(.*?)%s(.*?)$", so the call is effectively re.compile("^(.*?)%s(.*?)$" % r'(\{mytag_start\})(.+)(\{mytag_end\})').
At first glance this seems to do the trick, but I've noticed that all line breaks need to be hardcoded as <br> tags.
So
{mytag_start}This code<br>
will work{mytag_end}
However, the following code breaks the entire markdown
{mytag_start}This code
will not{mytag_end}
So instead I just get the entire above block unprocessed in plain text.
I tried supplying re.MULTILINE and re.DOTALL to the re.compile but it didn't help. Any ideas?
EDIT: Here is a sample extension file that exhibits the aforementioned problems. I then load the extension in my django template using {{ content:"mdx_MyExtension"}}.

Try using a non-greedy quantifier (+ immediately followed by ?):
r'(\{mytag_start\})(.+?)(\{mytag_end\})'
Full regex :
^(?:.*?)(\{mytag_start\})(.+?)(\{mytag_end\})(?:.*?)$
Flags :
DOTALL, IGNORECASE, MULTILINE
Data test :
blah
blash
<h1>Title</h1>
{mytag_start}This code<br>
will work{mytag_end}
<b>bold</b>
{mytag_start}This code
will not{mytag_end}
Output :
# Run findall
>>> regex.findall(string)
[(u'{mytag_start}', u'This code<br>\nwill work', u'{mytag_end}'), (u'{mytag_start}', u'This code\n\nwill not', u'{mytag_end}')]
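As a rough illustration of the non-greedy pattern above, here is a minimal standalone sketch using only Python's re module. It runs outside python-markdown, so it says nothing about how the extension itself feeds text to the pattern. The sample string and the CSS class name mytag-box are assumptions made up for this example.

```python
import re

# Non-greedy body, with DOTALL so that "." also matches newline characters.
TAG_RE = re.compile(r"(\{mytag_start\})(.+?)(\{mytag_end\})", re.DOTALL)

# Greedy variant from the question, kept only for comparison.
GREEDY_RE = re.compile(r"(\{mytag_start\})(.+)(\{mytag_end\})", re.DOTALL)

# Hypothetical sample input with two tagged blocks.
sample = (
    "{mytag_start}This code<br>\nwill work{mytag_end}\n"
    "<b>bold</b>\n"
    "{mytag_start}This code\nwill not{mytag_end}\n"
)

# The non-greedy pattern finds each tagged block separately.
bodies = [m.group(2) for m in TAG_RE.finditer(sample)]

# The greedy pattern runs from the first start tag to the last end tag.
greedy_matches = GREEDY_RE.findall(sample)

# Wrap each body in a div; "mytag-box" is an invented class name.
wrapped = TAG_RE.sub(r'<div class="mytag-box">\2</div>', sample)

print(bodies)
print(len(greedy_matches))
print(wrapped)
```

With DOTALL the body may span several lines, and the non-greedy quantifier keeps two tagged blocks from being merged into one match.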

Related

Problem when combining the truncatewords and safe filters

I use Django 1.11 for my blog.
I present all articles on my first page with a title, an image and a few words.
For the few words, I'm using this method:
{{post.text|safe|linebreaks|truncatewords:"50"}}
I use a text editor and sometimes I use, for example, italic text:
<i>Italic text</i>
Let's imagine the value of truncatewords is set to "1". It then returns:
<i>Italic
Here is my problem: some HTML tags are left open. The italic text never ends, so the formatting is applied to the rest of the page.
Do you know if a trick or workaround exists?
Thank you.

New Line on Django admin Text Field

I am trying to create a blog in Django where the admin posts articles from the admin site.
I have used a TextField for the content and now want to insert a new line.
I have tried using \n but it doesn't help. The output on the main HTML page is still the same, with \n printed in it. I have also tried the tag and allowed tags=True in my models file. Still the same. All the tags appear as they are on the HTML page.
My Django admin form submitted:
The result displayed in my public template:
You should use the template filter linebreaks, which will convert real \n characters (that means the newlines in the textarea, not the ones you typed using \ then n) into <br />:
{{ post.content|linebreaks }}
Alternatively, you can use linebreaksbr if you don't want to have the surrounding <p> block of course.
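To make the distinction between a real newline and a typed backslash followed by n concrete, here is a small sketch in plain Python. It is only a rough stand-in for what linebreaksbr does, not Django's implementation: the function name linebreaksbr_sketch is invented for this example, and the exact tag spelling (<br> or <br />) may differ between Django versions.

```python
from html import escape


def linebreaksbr_sketch(text: str) -> str:
    """Approximate linebreaksbr: escape HTML, then turn newlines into <br>."""
    # Normalize Windows and old Mac line endings to "\n" first.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return escape(normalized).replace("\n", "<br>")


# A real newline character, as produced by pressing Enter in the textarea.
real_newline = "first line\nsecond line"

# The two characters "\" and "n", as produced by typing them literally.
typed_backslash_n = r"first line\nsecond line"

converted = linebreaksbr_sketch(real_newline)
unchanged = linebreaksbr_sketch(typed_backslash_n)

print(converted)
print(unchanged)
```

Only the real newline is converted. The literally typed \n passes through untouched, which matches the symptom described in the question.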
After searching the internet and trying different Django template filters, I came across one specific filter: safe.
For me, the linebreaks filter suggested by @Maxime above didn't work, but safe did.
Use it like this in your HTML template file:
{{post.content|safe}}
To get a better understanding of the safe filter, I suggest reading the documentation.
{{post.content|linebreaks}}
This will make the line in the textbox appear as it is without using \n or \.
{{post.content|linebreaksbr}}
Besides, the newline function in your CSS declaration will work too.

Binding HTML strings in Ember.JS

I am using a third-party indexing service (Swiftype) to search through my database. The returned records contain a property called highlight. This simply adds <em> tags around matching strings.
I then bind this highlight property in Ember.JS Handlebars as such:
<p> Title: {{highlight.title}} </p>
Which results in the following output:
Title: Example <em>matching</em> text
The browser actually displays the <em> tags instead of formatting them. I.e. Handlebars is not recognizing the HTML tags and simply prints them as a string.
Is there a way around this?
Thanks!
Handlebars escapes HTML by default. To prevent escaping, use triple braces:
<p> Title: {{{highlight.title}}} </p>
See http://handlebarsjs.com/#html-escaping
Ember escapes HTML because it could contain potentially harmful code that would be executed. To avoid the escaping, use
new Ember.Handlebars.SafeString("<em>MyString</em>");
Here are the docs
http://emberjs.com/guides/templates/writing-helpers/
Once you've done that, you can use {{highlight.title}} as you wished.
HTH

Parsing HTML tags using XSLT/MarkLogic

I am trying to convert an XML file to HTML. The XML file has a bunch of HTML tags of the form:
<item><text>Line 1<br/>Line 2<br/>Line 3</text></item>
Ultimately, the output that appears in Internet Explorer is:
<text>Line 1<br/>Line 2<br/>Line 3</text>
When I would like:
Line 1Line 2Line 3
Once I discovered disable-output-escaping, the text rendered properly in IE. Unfortunately, MarkLogic does not support this attribute.
I was able to eliminate the tags altogether using replace(), but I cannot replace the line break tags with an actual new line character.
Does anyone have any ideas on how to either:
1) Render the HTML properly in MarkLogic, or
2) Properly parse the HTML tags in XSLT.
Thanks!
Maybe you want this
let $foo := <item><text>Line 1<br/>Line 2<br/>Line 3</text></item>
return xdmp:unquote($foo/text())

Add a newline after each closing html tag in web2py

Original
I want to parse a string of html code and add newlines after closing tags + after the initial form tag. Here's the code so far. It's giving me an error in the "re.sub" line. I don't understand why the regex fails.
def user():
    tags = "<form><label for=\"email_field\">Email:</label><input type=\"email\" name=\"email_field\"/><label for=\"password_field\">Password:</label><input type=\"password\" name=\"password_field\"/><input type=\"submit\" value=\"Login\"/></form>"
    result = re.sub("(</.*?>)", "\1\n", tags)
    return dict(form_code=result)
PS. I have a feeling this might not be the best way... but I still want to learn how to do this.
EDIT
I was missing "import re" from my default.py. Thanks ruakh for this.
import re
Now my page source code shows up like this (inspected in client browser). The actual page shows the form code as text, not as UI elements.
<form><label for="email_field">Email:</label>
<input type="email" name="email_field"/><label
for="password_field">Password:</label>
<input type="password" name="password_field"/><input
type="submit" value="Login"/></form>
EDIT 2
The form code is rendered as UI elements after adding XML() helper into default.py. Thanks Anthony for helping. Corrected line below:
return dict(form_code=XML(result))
FINAL EDIT
I figured out the regex fix myself. This is not an optimal solution, but at least it works. The final code:
import re

def user():
    tags = "<form><label for=\"email_field\">Email:</label><input type=\"email\" name=\"email_field\"/><label for=\"password_field\">Password:</label><input type=\"password\" name=\"password_field\"/><input type=\"submit\" value=\"Login\"/></form>"
    tags = re.sub(r"(<form>)", r"<form>\n ", tags)
    tags = re.sub(r"(</.*?>)", r"\1\n ", tags)
    tags = re.sub(r"(/>)", r"/>\n ", tags)
    tags = re.sub(r"( </form>)", r"</form>\n", tags)
    return dict(form_code=XML(tags))
The only issue I see is that you need to change "\1\n" to r"\1\n" (using the "raw" string notation); otherwise \1 is interpreted as an octal escape (meaning the character U+0001). But that shouldn't give you an error, per se. What error-message are you getting?
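To see the difference the raw-string prefix makes, here is a short standalone sketch using only the re module. The shortened tags string is an assumption made for brevity, not the full form from the question.

```python
import re

# Shortened, hypothetical version of the form markup from the question.
tags = '<form><label for="email_field">Email:</label><input type="email" name="email_field"/></form>'

# Without the raw prefix, "\1" is an octal escape for the character U+0001,
# so every closing tag is replaced by that control character plus a newline.
broken = re.sub("(</.*?>)", "\1\n", tags)

# With raw strings, \1 reaches the regex engine as a backreference to
# group 1, so the closing tag is kept and a newline is appended after it.
fixed = re.sub(r"(</.*?>)", r"\1\n", tags)

print(repr(broken))
print(repr(fixed))
```

Neither call raises an exception, which fits the remark that the missing raw prefix alone should not produce an error.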
By default, web2py escapes all text inserted in the view for security reasons. To avoid that, simply use the XML() helper, either in the controller:
return dict(form_code=XML(result))
or in the view:
{{=XML(form_code)}}
Don't do this unless the code is coming from a trusted source; otherwise it could contain malicious JavaScript.