Extract a portion of a docx into a new docx - python-2.7

I have a docx file with just text. I would like to create a new docx file containing just part of a page in the original docx. I am using python-docx for this.
So far I have been able to traverse the original docx document and copy each desired paragraph/run from the original to the new document as follows (this example should make an exact copy, I believe):
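import docx

# Open the source document and start a new one from the default template.
Doc = docx.Document('/tmp/input.docx')
OutDoc = docx.Document()
for para in Doc.paragraphs:
    # Recreate each paragraph with the same paragraph style.
    currentParagraph = OutDoc.add_paragraph(style=para.style)
    for run in para.runs:
        # Copy the run text and its character style.
        currentParagraph.add_run(run.text, style=run.style)
OutDoc.save('/tmp/output.docx')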
Even though I am copying all style information, it seems that I am missing something, as the output lacks some of the formatting.

In Word, the style name applied to a paragraph or run (or any other content) is ignored if that style is not explicitly defined in the new document.
You can either parse through the styles in the source document and recreate each one in the new document, or create a blank "template" document for the new document that already contains the styles you want.
The "default" python-docx document template includes many of the built-in styles, but if your document uses any customized styles, that would explain the symptom you're seeing.
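One way to check whether missing style definitions explain the symptom is to compare the style names defined in the two files. The following is a minimal sketch, not part of python-docx: it reads word/styles.xml directly with the standard library, and the helper names style_names and missing_styles are made up for this illustration. It only sees styles that are explicitly defined in styles.xml.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/styles.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def style_names(docx_file):
    """Return the set of style names defined in a .docx (path or file object)."""
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/styles.xml"))
    names = set()
    for style in root.iter("{%s}style" % W_NS):
        name = style.find("{%s}name" % W_NS)
        if name is not None:
            names.add(name.get("{%s}val" % W_NS))
    return names


def missing_styles(source_docx, target_docx):
    """Hypothetical helper: styles defined in the source but not in the target."""
    return style_names(source_docx) - style_names(target_docx)
```

For example, missing_styles('/tmp/input.docx', '/tmp/output.docx') would list the style names that the copy refers to but does not define. Any name it reports is a candidate for being recreated in the new document or added to a template.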
See these pages in the documentation for more:
http://python-docx.readthedocs.org/en/latest/user/styles-understanding.html
http://python-docx.readthedocs.org/en/latest/user/styles-using.html
http://python-docx.readthedocs.org/en/latest/api/document.html#docx.document.Document.styles
http://python-docx.readthedocs.org/en/latest/api/style.html

Related

Qt C++ Project (Creating Login with XML)

I have a question about my internship project. They want me to create a basic login page (ID, password). I created an XML file for the username and password. The program should check the XML file for the username and password. If they are correct, it should open a second window. I'm stuck on processing the XML file. How can I read that information from the XML file?
As @JarMan said, I would recommend QXmlStreamReader. You can feed it from a file (any QIODevice), a QString, a QByteArray, and so on.
Reading an attribute value could look like this, for example:
xml.attributes().value( attribute ).toString();
where attribute is a QString and xml is the QXmlStreamReader.
See the documentation: https://doc.qt.io/qt-5/qxmlstreamreader.html
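The same pull-parsing idea can be sketched in Python with the standard library (this is not Qt code). The XML layout is an assumption, since the question does not show the file: a user element with name and password attributes. The function name credentials_match is made up for the illustration.

```python
import xml.etree.ElementTree as ET


def credentials_match(xml_file, username, password):
    """Stream through an open XML file and report whether any <user>
    element carries the given name and password attributes.

    Assumed (hypothetical) layout:
        <users><user name="..." password="..."/></users>
    """
    # "start" events fire as soon as an opening tag, with its attributes,
    # has been read, so the whole file never has to be loaded at once.
    for _event, elem in ET.iterparse(xml_file, events=("start",)):
        if elem.tag == "user":
            if elem.get("name") == username and elem.get("password") == password:
                return True
    return False
```

With QXmlStreamReader the loop has the same shape: read the next token, check whether it is a start element with the expected name, then read its attributes.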
There are several ways to do it. Marris mentioned one; another is to have this kind of code generated automatically. You first write an XML Schema that describes what your XML data looks like. An introduction to the XML Schema language can be found e.g. here.
Then you use an XML Schema compiler to translate the XML Schema to C++ classes. The schema compiler will also generate code to parse an XML file into objects, meaning you don't have to write any code to deal with XML by hand. It's a purely declarative approach: declare what the data looks like, and let the computer figure out the details.

Using document as a template: problem with heading numbering

The document body is hardcoded and then inserted into a template document which contains the cover, summary, headers and styles. Heading styles are numbered 1, 1.1, 1.2, and so on. But inserting a heading with just the 'Heading [n]' style does not work: the numbering is lost. I think this happens because numbering is set through a multilevel list with the headings attached.
Question: is it possible to use a document as a template without coding any formatting, or is it inevitable to deal with list styles in the code?
Yes, you can use a document as a template without coding any formatting. Please note that when you copy nodes from one document to another, the importFormatMode option specifies how formatting is resolved when both documents have a style with the same name but different formatting.
The formatting is resolved as follows:
Built-in styles are matched using their locale-independent style identifier. User-defined styles are matched using the case-sensitive style name.
If a matching style is not found in the destination document, the style (and all styles referenced by it) are copied into the destination document and the imported nodes are updated to reference the new style.
If a matching style already exists in the destination document, what happens depends on the importFormatMode parameter passed to Document.ImportNode as described below.
When using the UseDestinationStyles option, if a matching style already exists in the destination document, the style is not copied and the imported nodes are updated to reference the existing style.
So, in your case, I suggest you use the UseDestinationStyles option when inserting one document into another.
I work with Aspose as a Developer Evangelist.

Changing citation style to number-letter in papaja Rmarkdown package

I'm writing a scientific manuscript in RMarkdown using the papaja package, which enables me to report my statistics beautifully. However, the journal now requires me to submit a Word document with number-letter referencing. Is it possible to change the referencing style to a number-letter style in Papaja?
I tried opening the LaTeX output from papaja, but it has the citations set out in the text in APA format (e.g. "Apthorp, Bolbecker, Bartolomeo, O'Donnell, \& Hetrick, 2018"), which is not useful to me.
Here's the code from the top of the manuscript:
bibliography : ["PD_sway-1.bib"]
floatsintext : no
figurelist : no
tablelist : no
footnotelist : no
linenumbers : yes
mask : no
draft : no
documentclass : "apa6"
classoption : "man"
output : papaja::apa6_pdf
It would be great if I could get a Word document with number-letter referencing that I could then edit, but a LaTeX file or PDF with the correct citation format would be fine too.
The references are already typed out in APA style in the LaTeX document because they are handled by pandoc-citeproc rather than LaTeX. This has the advantage that the automatic reference formatting also works when you output your document in Word format. To get a Word document all you need to do is change the output line in the YAML front matter:
output: papaja::apa6_docx
Note that the formatting of Word documents that pandoc supports is somewhat limited and you may have to fix some things manually. From the corresponding section in the papaja manual:
Moreover, rendered documents in DOCX format require some manual work before they fully comply with APA guidelines.
We, therefore, provide the following checklist of necessary changes:
Always,
  add a header line with running head and page number
If necessary,
  position author note at the bottom of page 1
  move figures and tables to the end of the manuscript
  add colon to level 3-headings
  in figure captions,
    add a colon following the figure numbers and italicize (e.g. "Figure 1. This is a caption.")
  in tables,
    add horizontal rules above the first and below the last row
    add midrules
Changing the citation style works just as it does in any R Markdown document. The work-in-progress papaja manual has a section on this:
Other styles can be set in the YAML front matter by specifying a CSL, or Citation Style Language, file. You can use either one of the large number of existing CSL files, customize an existing CSL file, or create a new one entirely.
To change the citation style, download the CSL file and add the following to the YAML front matter:
csl: "path/to/mystyle.csl"
I'm not sure what style the journal requires but most likely a corresponding CSL file already exists.

Create new language in mid-project

The client has requested that a new language be added to our Sitecore project, and I'm having some issues with that.
I've created the language, but all the renderings are empty for the new language.
There is also no version in the new language for any item.
As a workaround, I tried exporting the English language, opening the generated XML file, renaming the tags to the requested language, and importing it again. But when I try to import it, it is treated as if I were importing the English language and not the new one.
How can I create a new language with all the renderings and content (even if it falls back to English for empty strings; I know that is the default behavior)?
Thanks
You can use a PowerShell script for that.
See e.g. this blog: http://www.coreblimeysitecore.com/blog/create-language-versions-using-sitecore-powershell/
Sample code:
Add-ItemLanguage -Path "master:\sitecore\content" -Language "en" -TargetLanguage "de-AT","de-de","en-za","fi-fi","fr-be","it-it","pl-pl","ru-ru","sv-se","fr-fr" -IfExist OverwriteLatest -IgnoredFields ""
And here is a very similar question on http://sitecore.stackexchange.com site:
https://sitecore.stackexchange.com/questions/1584/create-new-language-version-for-content-branch

How to disable DOI/URL for bibtex in Rmarkdown

I am using Better BibTeX and Zotero to generate references in R Markdown.
It works very well, except that journal articles and books are listed with an associated URL/DOI.
My adviser is not too happy about it, and I could not figure out how to disable the URL/DOI in the R Markdown config or elsewhere.
What I know is that you have to edit your *.csl file (asa.csl, apa.csl, or whichever you use). You can do this easily by uploading it to this online CSL editor. Browse to bibliography/layout/access(macro)/Group/conditional/ and check whether there is a URL entry. I got rid of the DOI by setting a condition there that the variable should be 'url' AND the document type 'webpage'. Then download the new *.csl file, save it to your preferred directory, and knit the document. (See also here, with pictures.)
Note: make a backup copy before editing your *.csl file.
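A scripted alternative to the online editor is sketched below. It takes a blunter approach than the conditional described above: it removes every text element that prints the DOI or URL variable, so web pages lose their URL as well. The function name strip_variables is made up for this illustration, and the sketch assumes the style prints these fields through plain text elements with a variable attribute; styles that build them another way would need a different edit.

```python
import xml.etree.ElementTree as ET

# Namespace used by CSL style files.
CSL_NS = "http://purl.org/net/xbiblio/csl"


def strip_variables(csl_xml, variables=("DOI", "URL")):
    """Return the CSL XML (as UTF-8 bytes) with every
    <text variable="..."/> element for the given variables removed."""
    ET.register_namespace("", CSL_NS)  # keep CSL as the default namespace on output
    root = ET.fromstring(csl_xml)
    text_tag = "{%s}text" % CSL_NS
    # Snapshot the element list first, because children are removed while looping.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == text_tag and child.get("variable") in variables:
                parent.remove(child)
    return ET.tostring(root, encoding="utf-8")
```

To use it, read the contents of your *.csl file, pass them to strip_variables, write the result to a new file, and point the csl: entry in the YAML front matter at that file.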