Creating new lines before a-tags - prettier

How do I stop Prettier from adding new lines inside a paragraph when it contains a link?
Input:
<p><b>asfd;a sd;lfkja sd;lfka</b></p>
Output:
<p>
<b>asfd;a sd;lfkja sd;lfka</b>
</p>
If there is no A tag, then everything looks fine.
<p><b>asfd;a sd;lfkja sd;lfka</b></p>
Line length limit (printWidth) does not help.
playground

Related

Bootstrap group list shows text sprites at runtime

The code is listed below, using the bootstrap class class="list-group-item-text".
As the image here shows, the text seems to have a duplicate version just above it - looks almost like dotted lines.
`
<asp:Panel ID="pnlInstructions" runat="server" Visible="true">
<h4>Instructions:</h4>
<div class="form-inline col-lg-12" id="Instructions" style="padding-top:10px;padding-bottom:20px;padding-left:10px;">
<ol class="list-group-item-text" style="text-emphasis:filled;outline:none;">
<li>First, please use MyEd to get your extract file ready.</li>
<li>Then, fill in the following to Log in.</li>
</ol>
</div>
</asp:Panel>
`
I've researched the problem using the word "sprites", but that seems to have a different meaning than I expected, since I thought it meant "unwanted junk" in a display.
I'm not sure if this appears on all browsers.

How to label a text with multiple paragraphs in AWS Ground Truth?

I was trying to set up a single-label labeling task in AWS Ground Truth through the console. My goal is to match some users on social media; for each user I have several possible candidates, out of which one should be selected (the label). My CSV looks like this:
firtname | lastname | candidates
Romeo Montague x
Juliet Capulet x
Instead of "x", I would like to have something like this
candidate_1
description_1
link_1
candidate_2
description_2
link_2
candidate_3
description_3
link_3
The human worker should then select whether the correct label is candidate_1, candidate_2, candidate_3, or none of the above.
I am aware that SageMaker Ground Truth does not accept newline characters and that it renders the input as HTML, so I tried to input the following:
candidate_1 <br/> description <br/> link <br/><br/> candidate_2 <br/> description <br/> link <br/><br/> candidate_3 <br/> description <br/> link <br/><br/>
Unfortunately, when I take a look at the console, the input on the left does not render correctly:
The line breaks within the div tag seem to be simply ignored by the UI.
I found this post which contains the answer, but I am struggling to adapt it to my concrete use case.
How can I change my CSV so that the multiple paragraphs get rendered correctly?

django-ckeditor: Is there a django config/setting to get <br> instead of <br />?

I'd like to have <br> elements from CKEditor instances like that, not <br />. I found this question with an answer, but I don't know where that code should go, in config.js? In my <script> element/JS file? And, actually, I want to know if I can use a config in CKEDITOR_CONFIGS to change this.
Example:
I write the following in my browser:
This is the first line
This is the second line.
I get this in the inspector:
<body class="cke_editable cke_editable_themed cke_contents_ltr cke_show_borders" spellcheck="false" contenteditable="true">
This is the first line
<br>
This is the second line.
<br type="_moz">
</body>
I get this in my view when I process the POST request:
'This is the first line<br />\r\nThis is the second line.'
When I save this to the database and then reload the page, I get in the browser:
This is the first line
This is the second line.
Because the \r\n was read as another newline and now I have two line breaks. (See edit below)
Why the <br />? Also, why do I get a newline after the <br> element in the HTML? This causes me to get <br />\r\n in my model instance attributes and thus, in my database. The \r\n won't actually cause a problem (see edit below), but it would be easier if I get rid of it; I think <br> should suffice.
EDIT
Actually, having the \r\n does cause a problem. When I submit the form, I get, as I said, <br />\r\n in my database. Then, when I render this in the browser, I get two line breaks (<br>) because the first <br /> is rendered as-is and the \r\n is converted into another <br>. This happens because I am using the linebreaksbr template filter; my processing code outputs newlines in this way. It would be easier if I could get rid of the \r\n too, instead of changing my code.
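One server-side way to sidestep the doubled line break described in the edit is to normalize the submitted HTML before saving it. The following is only a minimal sketch under that assumption: `normalize_breaks` is a hypothetical helper (not part of django-ckeditor or CKEditor), and it does not change what the editor itself emits.

```python
import re

# Matches <br>, <br/> or <br /> (any case) plus one optional trailing newline.
_BR_WITH_NEWLINE = re.compile(r"<br\s*/?>(?:\r\n|\r|\n)?", re.IGNORECASE)


def normalize_breaks(html: str) -> str:
    """Rewrite each line-break tag as a bare <br> and drop the newline after it."""
    return _BR_WITH_NEWLINE.sub("<br>", html)


posted = "This is the first line<br />\r\nThis is the second line."
print(normalize_breaks(posted))
```

A helper like this could be called in the view or in the form's cleaning step before the value reaches the model, so that linebreaksbr no longer finds a newline to convert.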

awk alpha regular expression improvement

I have a text file that contains alpha lines. Some of the alpha lines start with 'Narrated' and need to be processed differently from all other alpha lines. Below is the test data:
This is my article
<img src="">
<a href="">
New magazine
Narrated by abc
<a href="">
Is this a new paper?
<img src="">
<a href="link1">
<a href="link2">
That is an old journal
<img src="">
<a href="">
A fine book!
<img src="">
<a href="">
Yes, this is some book.
Narrated by xyz
<img src="">
<a href="">
My current script looks like this:
BEGIN {
    title = "^[A-Z].*"
    narrated = "to be defined"
    image = "^<img.*"
    links = "^<a.*"
}
$0 ~ title {
    pos = index($0, "Narrated")  # check if the line contains "Narrated"
    if (pos == 0) {
        print $0  # print other line
    } else {
        print $0  # print narrated line
    }
}
# The pattern variables below must use the names set in BEGIN (image, links).
$0 ~ image {
    # do processing
}
$0 ~ links {
    # do processing
}
I want to define the "narrated" regular expression and improve the "title" regular expression. Thanks for help!
The input is a series of data sets that have optional and mandatory items. Some of the items can be repeated. Each set has the following items, in this order:
1) description of the item (mandatory)
2) narrated by (optional)
3) link description (one or more links per set. mandatory)
Additional info about the data set
a) All items of the set are separated by new lines
b) Last item of the set has ']' as the last character i.e. ]
c) Raw file has other data issues which are not mentioned here (e.g.
What should be RS and FS for this data set?
The expected output is a json array that is produced by parsing the input file and combining the elements based upon other characteristics that are embedded within the data. All related elements occur in sequence so line-by-line processing of the 'processed data file - not raw data file' with awk works as a solution for this problem. Raw file processing by awk will probably work as well but I've not given it a shot as it contains data elements that need to be discarded anyway and required data elements are surrounded by other text elements.
To handle narrated lines, just do:
/^Narrated/ {do some thing}
or
$1=="Narrated" {do some thing}
The action will only run if the line starts with Narrated.
I do not see why you need a more complex regular expression here.
Title lines could be:
/^This is/ {do some thing}
Please post the expected output of your code.
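For comparison, the same line classification can be sketched in Python. This is only an illustration of the idea above (test the specific "Narrated" prefix before the generic capital-letter title pattern); the names `PATTERNS` and `classify` are hypothetical and the sample lines are taken from the test data in the question.

```python
import re

# Order matters: "Narrated" lines also start with a capital letter,
# so they are tested before the generic title pattern.
PATTERNS = [
    ("narrated", re.compile(r"^Narrated\b")),
    ("image", re.compile(r"^<img\b")),
    ("link", re.compile(r"^<a\b")),
    ("title", re.compile(r"^[A-Z]")),
]


def classify(line: str) -> str:
    """Return the name of the first pattern that matches, or "other"."""
    for kind, pattern in PATTERNS:
        if pattern.search(line):
            return kind
    return "other"


sample = [
    "New magazine",
    "Narrated by abc",
    '<a href="">',
    "Is this a new paper?",
    '<img src="">',
]
for line in sample:
    print(f"{classify(line):9}{line}")
```

In awk the equivalent ordering would be to put the /^Narrated/ rule first and end its action with `next`, so the title rule never sees those lines.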

HTML: sanitize a set of tags but allow all tags in <code> blocks

I'm using Django+Markdown for processing user input. Text produced by the markdown filter needs to be 'safe' and is not protected by Django's auto-escape mechanism, so I have to escape user input myself. This is how I do it now:
{{ text|force_escape|markdown:"codehilite" }}
However, if text contains something that would be marked as <code> by markdown, it is escaped as well and the output is pretty ugly (e.g., '<' is displayed as &lt; in <code>). For example, if
text = u'''
<script>alert("I'm not working 'cause I'll be escaped")</script>
The following would be marked as a code block:
<script>alert("not xss 'cause I'm in <code>")</script>
'''
Using the filter mentioned above, the produced text is:
<p>
&lt;script&gt;alert(&quot;I&#39;m not working &#39;cause I&#39;ll be escaped&quot;)&lt;/script&gt;
The following would be marked as a code block:
</p>
<pre class="codehilite">
<code>
&amp;lt;script&amp;gt;alert(&amp;quot;not xss &amp;#39;cause I&amp;#39;m in &amp;lt;code&amp;gt;&amp;quot;)&amp;lt;/script&amp;gt;
</code>
</pre>
What I want is:
<p>
&lt;script&gt;alert(&quot;I&#39;m not working &#39;cause I&#39;ll be escaped&quot;)&lt;/script&gt;
The following would be marked as a code block:
</p>
<pre class="codehilite">
<code>
&lt;script&gt;alert(&quot;not xss &#39;cause I&#39;m in &lt;code&gt;&quot;)&lt;/script&gt;
</code>
</pre>
I'm thinking about using BeautifulSoup to get the <code> blocks produced by markdown and reverse-escape their content. But soup.code.text returns only the text, excluding the tags, so I couldn't get my hands on any of the escaped characters (<, >, ', ", &) in it.
Don't escape the input before passing it to Markdown. As you found, this breaks user input in some cases. And, it doesn't ensure security: consider, e.g., "[clickme](javascript:alert%28%22xss%22%29)".
Instead, the correct approach is to use Markdown in its safe mode. I've written elsewhere about how to do so, but the short version in Django is to use something like {{ text | markdown:"safe" }}. (Alternatively, you can apply an HTML sanitizer, like HTML Purifier, to the output of the Markdown processor.)
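To illustrate why escaping the input does not stop the javascript: link above, here is a minimal sketch using only the Python standard library. The payload contains no characters that HTML escaping changes, so it passes through untouched; a sanitizer has to look at the URL scheme instead. The allow-list and the `has_allowed_scheme` helper are hypothetical and are not a substitute for a real sanitizer.

```python
import html
from urllib.parse import urlsplit

# Hypothetical allow-list; anything else (including javascript:) is rejected.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def has_allowed_scheme(url: str) -> bool:
    """Return True only if the URL carries an explicitly allowed scheme."""
    scheme = urlsplit(url.strip()).scheme  # urlsplit lowercases the scheme
    return scheme in ALLOWED_SCHEMES


payload = "javascript:alert%28%22xss%22%29"

# Escaping leaves the payload unchanged, so it offers no protection here.
print(html.escape(payload) == payload)

# A scheme check rejects the payload and accepts an ordinary link.
print(has_allowed_scheme(payload))
print(has_allowed_scheme("https://example.com/"))
```

This sketch deliberately rejects relative URLs as well, which keeps the check simple at the cost of being stricter than most sanitizers.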