Embed XML from file into RMD - r-markdown

I have an XML file (SOME.XML), the contents of which I would like to embed in an .RMD document.
If I was content with having the XML reside directly within the .RMD, I'm aware that I could simply do:
```xml
---some xml here---
```
My embarrassing attempts so far are:
```{xml code=readr::read_file('SOME.XML')}
```
...which failed as there is no XML engine.
I have also tried:
````{r results='asis'}
cat('```xml')
cat(readr::read_file('SOME.XML'))
cat('```')
````
...for which, although the knitting completes, the output is not at all correct.
Is this possible? (No doubt there is some trick here that I am missing!)
If needed, I could have a pre-knit stage where the content of the XML file is substituted in before subsequently passing a modified version of the .RMD to rmarkdown::render. However, I'd like to avoid this if possible.
Update:
Using readLines as proposed in the answer produces a warning: "incomplete final line found on 'SOME.XML'". Using readr::read_file() avoids this for me. For my particular project, I am now using:
```xml
`r readr::read_file("SOME.XML")`
```
Thank you to @user2554330 for the elegant solution!

Here's one way to do it:
```xml
`r paste(readLines("SOME.XML"), collapse = "\n")`
```
The idea is to put the XML into inline R code. I think knitr ignores the chunk wrappers because they aren't in the {xml ...} chunk-header format it looks for, but it will see the inline code and expand it. Then Pandoc will handle the formatting.
Here's how to modify your approach to get it to work:
````{r results='asis', echo = FALSE}
cat('```xml\n')
cat(readr::read_file('SOME.XML'), sep = "\n")
cat('\n```\n')
````

Change default behavior of callout blocks in Quarto

In Quarto, I'd like to change the default behavior of a single callout block type so that it will:
- always automatically have the same caption (e.g. "Additional Resources")
- always be folded (collapse="true")
Let's say I want this for the tip callout block type while the others (note, warning, caution, and important) should not be affected.
In other words, I want the behavior/output of this:
:::{.callout-tip collapse="true"}
## Additional Resources
- Resource 1
- Resource 2
:::
by only having to write this:
:::{.callout-tip}
- Resource 1
- Resource 2
:::
Update:
I have since converted the following Lua filter into a Quarto filter extension, collapse-callout, which makes it easier to specify default options for specific callout blocks. See the GitHub README for detailed instructions on installation and usage.
As @stefan mentioned, you can use a pandoc Lua filter to do this more neatly.
quarto_doc.qmd
---
title: "Callout Tip"
format: html
filters:
- custom-callout.lua
---
## Resources
:::{.custom-callout-tip}
- Resource 1
- Resource 2
:::
## More Resources
:::{.custom-callout-tip}
- Resource 3
- Resource 4
:::
custom-callout.lua
local h2 = pandoc.Header(2, "Additional Resources")

function Div(el)
  if quarto.doc.isFormat("html") then
    if el.classes:includes('custom-callout-tip') then
      local content = el.content
      table.insert(content, 1, h2)
      return pandoc.Div(
        content,
        {class="callout-tip", collapse='true'}
      )
    end
  end
end
Just make sure that quarto_doc.qmd and custom-callout.lua files are in the same directory (i.e. folder).
After a look at the docs, and based on my experience with customizing R Markdown, I would guess that this requires creating a custom template and/or using pandoc Lua filters.
A more lightweight approach I have used in the past is a small custom function that writes the markup for your custom callout block into your Rmd or Qmd. One drawback is that this requires a code chunk. However, to make your life a bit easier, you could create an RStudio snippet that adds a code chunk template to your document.
---
title: "Custom Callout"
format: html
---
```{r}
my_call_out <- function(...) {
  cat(":::{.callout-tip collapse='true'}\n")
  cat("## Additional Resources\n")
  # c(...) turns the arguments into one vector, so each resource gets its own bullet
  cat(paste0("- ", c(...), collapse = "\n\n"))
  cat("\n:::\n")
}
```
```{r results="asis"}
my_call_out(paste("Resource", 1:2))
```
Blah blah
```{r results="asis"}
my_call_out("Resource 3", "Resource 4")
```
Blah blah

Why is libxml not storing html in my htmlDocPtr?

I am working on a piece of software that uses libxml to store xml on webpages in an xmlDocPtr. I need to expand this functionality to do the same for html.
The original code:
xmlDocPtr doc = xmlParseEntity(filename.c_str());
Where filename = 10.1.1.135/poll_data.xml and everything works just fine
Now, I have html filename = 10.1.1.165/index.htm and would like to store this as well. I have tried using htmlParseDoc with no success.
htmlDocPtr doc = htmlParseFile(filename.c_str(), "windows-1252");
The resulting doc object is not null, but it does not contain the contents of index.htm.
NetBeans spits out:
http://10.1.1.165/index.htm:1: HTML parser error : Document is empty
Any suggestions?

golang template escape first char

I'm trying to build sitemap XML file with the standard template package.
But the first character "<" becomes "&lt;", which makes the XML unreadable for clients.
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

const (
	tmplStr = `{{define "indexSitemap"}}<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<sitemap>
<loc>https://www.test.com/sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://www.test.com/events-sitemap.xml</loc>
</sitemap>
<sitemap>
<loc>https://www.test.com/gamesAndTeams-sitemap.xml</loc>
</sitemap>
</sitemapindex>{{end}}`
)

func main() {
	// Parse the template and check for error
	tmpl, parseErr := template.New("test").Parse(tmplStr)
	if parseErr != nil {
		fmt.Println(parseErr)
		return
	}

	// Init the writer
	buf := new(bytes.Buffer)

	// Execute and get the template error if any
	tmplExErr := tmpl.ExecuteTemplate(buf, "indexSitemap", nil)
	if tmplExErr != nil {
		fmt.Println(tmplExErr)
		return
	}

	// Print the content malformed
	fmt.Println(buf)
}
playground golang
Is that normal?
How can I make it work normally?
Thanks in advance
Your example shows you're using the html/template package, which auto-escapes text for HTML usage.
If you want a raw template engine, use the text/template package instead. The html one just wraps it with context-aware escaping.
However, you'll need to make sure yourself that the text you output with the raw template engine is XML-safe. You can do this by exposing an escape function to your template and passing all text through this function instead of writing it directly.
[EDIT] It looks like a bug in html/template, if you omit the ? from the xml declaration it works okay. But still my advice stands - if it's not html you're better off using the text/template package. Actually, even better, describe the site map as a struct and don't use a template at all, just XML serialization.
Also see issue #12496 on GitHub, which confirms they are not planning to fix this.
https://github.com/golang/go/issues/12496
Probably because this is the HTML templating package and you're trying
to produce XML. I suspect that it doesn't know how to parse the
directives with the question mark there.
You probably want to use the text/template package instead, if you're
not going to be taking advantage of any of the HTML auto-escaping
features.

How to replace text in a content control after XML binding, using docx4j

I am using docx4j 2.8.1 with Content Controls in my .docx file. I can replace the CustomXML part by injecting my own XML and then calling BindingHandler.applyBindings after supplying the input XML. I can add a token in my XML such as ¶ then I would like to replace that token in the MainDocumentPart, but using that approach, when I iterate through the content in the MainDocumentPart with this (link) method none of my text from my XML is even in the collection extracted from the MainDocumentPart. I am thinking that even after binding the XML, it remains separate from the MainDocumentPart (??)
I haven't tried this with anything more than a little test doc yet. My token is the Pilcrow: ¶. Since it's a single character, it won't be split in separate runs. My code is:
private void injectXml(WordprocessingMLPackage wordMLPackage) throws JAXBException {
    MainDocumentPart part = wordMLPackage.getMainDocumentPart();
    String xml = XmlUtils.marshaltoString(part.getJaxbElement(), true);
    xml = xml.replaceAll("¶", "</w:t><w:br/><w:t>");
    Object obj = XmlUtils.unmarshalString(xml);
    part.setJaxbElement((Document) obj);
}
The pilcrow character comes from the XML and is injected by applying the XML bindings to the content controls. The problem is that the content from the XML does not seem to be in the MainDocumentPart so the replace doesn't work.
(Using docx4j 2.8.1)

Does Qt Linguist offer the ability to add new entries to the editable .ts file?

I didn't find a way to do this - only to edit the translations to the existing fields.
If there is no way to achieve this, how should it be done? Ideally somehow automatically, because right now I am manually adding
<message>
    <source>x</source>
    <translation>xx</translation>
</message>
blocks to my .ts file, and I assume that's not the correct way.
No, that's not the correct way :) Use tr() in the code to mark strings for translation.
For example
label->setText( tr("Error") );
Then you run lupdate on your project to extract them into a .ts file. See here for more details.
Or do you need to translate strings that are not in the source code?
I just wrote a Python script that uses ElementTree to insert new entries into the .ts file for a homegrown parser. It doesn't produce pretty output when it adds them, but I believe it works just fine (so far):
from xml.etree import ElementTree as ET

tree = ET.parse(infile)
# iter() replaces getiterator(), which was removed in Python 3.9
for context in tree.iter("context"):
    # find() returns the first direct <name> child, or None
    name = context.find("name")
    if name is not None and name.text == target:
        message = ET.SubElement(context, "message")
        source = ET.SubElement(message, "source")
        source.text = newtext
        translation = ET.SubElement(message, "translation")
        translation.text = "THE_TRANSLATION"
tree.write(outfile)
Where infile is the .ts file, outfile may be the same as infile or different.
target is the context you are looking for to add a new message into,
and newtext is of course the new source text.