Is it possible to use figure captions (i.e., fig.cap) in bookdown but without numbering, that is, with neither "Figure" nor the number shown?
The best solution would be one that works for all output formats (say, gitbook, pdf_book and epub_book).
Related
By default, the {bookdown} package formats figure and table captions as, for example:
Figure 2.1: Here is a nice figure!
Is there a way to remove the colon following "Figure 2.1", or to change it to an arbitrary character (e.g., a period):
Figure 2.1. Here is a nice figure!
I can change the Figure label to any arbitrary text as described here; however, I have not been able to find documentation for customizing the separator between the figure number and caption.
From the source code here, I fear that the colon is hard-coded. Any suggestions or resources are greatly appreciated!
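For reference, the label-text customization mentioned above lives in `_bookdown.yml`; it changes the word before the number, but not the trailing separator:

```yaml
# _bookdown.yml
language:
  label:
    fig: "Fig. "   # replaces the default "Figure " prefix
    tab: "Tab. "
```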
I have built a gensim Doc2Vec model; let's call it doc2vec. Now I want to find the words most relevant to a given document according to my Doc2Vec model.
For example, I have a document about "java" with the tag "doc_about_java". When I ask for similar documents, I get documents about other programming languages and topics related to java. So my document model works well.
Now I want to find the most relevant words to "doc_about_java".
I followed the solution from the closed question How to find most similar terms/words of a document in doc2vec?, and it gives me seemingly random words; the word "java" is not even among the first 100 similar words:
docvec = doc2vec.docvecs['doc_about_java']
print(doc2vec.most_similar(positive=[docvec], topn=100))
I also tried this:
print(doc2vec.wv.similar_by_vector(doc2vec["doc_about_java"]))
but it didn't change anything. How can I find the most similar words to a given document?
Not all Doc2Vec modes even train word-vectors. In particular, the PV-DBOW mode (dm=0), which often works very well for doc-vector comparisons, leaves word-vectors at randomly-assigned (and unused) positions.
So that may explain why the results of your initial attempt to get a list-of-related-words seem random.
To get word-vectors, you'd need to use PV-DM mode (dm=1), or add optional concurrent word-vector training to PV-DBOW (dm=0, dbow_words=1).
(If this isn't the issue, there may be other problems in your training setup, so you should show more detail about your data source, size, and code.)
(Separately, your alternate attempt, by using doc2vec["doc_about_java"], is retrieving a word-vector for the token "doc_about_java" (which may not be present at all). To get the doc-vector, use doc2vec.docvecs["doc_about_java"], as in your first code block.)
When converting from some format (say, HTML or Docx) to Markdown in Pandoc, is it possible to render all footnotes in the inline style ("this is the main text^[this is a footnote]") rather than as numbered references with a corresponding list at the end of the document? I want to work on my Markdown documents (converted from a Docx of my thesis) as master texts, but now if I add a new footnote it messes up the numbering.
Alternatively, is there another convenient way (i.e. not Pandoc) that this could be done? Cutting text in one part of a file and adding corresponding text in another part seems a bit beyond a simple regex.
Thanks in advance for any help.
EDIT: I've just hacked up an extremely simple Python script to do this, in case anyone else has the same issue.
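A minimal sketch of such a conversion (not the asker's actual script; it assumes every footnote definition fits on one line and that code blocks contain no [^id] patterns):

```python
import re

def inline_footnotes(text):
    """Convert reference-style Pandoc footnotes to inline ^[...] notes."""
    defs = {}

    def collect(match):
        defs[match.group(1)] = match.group(2).strip()
        return ""  # drop the definition line entirely

    # Gather lines like "[^1]: Here is the footnote." and remove them.
    text = re.sub(r"^\[\^([^\]]+)\]:[ \t]*(.*)\n?", collect, text, flags=re.M)
    # Splice each footnote body into its reference site.
    return re.sub(r"\[\^([^\]]+)\]",
                  lambda m: "^[" + defs[m.group(1)] + "]", text)

src = ("Here is a footnote reference[^1] and some more text.\n\n"
       "[^1]: Here is the footnote.\n\nHere's the next paragraph.\n")
print(inline_footnotes(src))
```

Because the footnote text travels with its reference, inserting a new footnote no longer disturbs any numbering.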
Pandoc's Markdown syntax is quite flexible about footnotes:
The footnotes themselves need not be placed at the end of the document. They may appear anywhere except inside other block elements (lists, block quotes, tables, etc.).
Like:
Here is a footnote reference[^1] and some more text.
[^1]: Here is the footnote.
Here's the next paragraph.
However, the Markdown Writer (the module that generates markdown files, as opposed to reading them) currently simply places all of them at the end of the document. But this could be implemented behind a flag, similar to the --reference-links flag. Feel free to submit an issue or pull request!
Inline footnotes and references are quite nice for writing and editing markdown documents, but cumbersome for reading them.
I used ltrgoddard's inliner with success to process several files that I use with pandoc and latexmk to produce PDF. inliner works well for transforming end-style references to inline style references in an already-written document.
Cross references to other questions and clues for posterity:
Convert markdown links from inline to reference
Vim plugin for adding external links
Also see http://drbunsen.github.io/formd/
and https://instant-thinking.de/2014/02/20/markdown-footnotes-with-vim/ for more info re: formd, which should work for converting inline references to end-style references, and vice versa.
Note that formd works on URLs and ignores footnotes, so this may be seen as a similar project (with different goals) but not an alternative.
I am relatively new to Solr so please forgive me if I'm missing something obvious. I have an application that allows users to search for musical artists. The indexing comes from a read-only database with correct spellings so on the index side I have it figured out.
On the query side, however, I need to anticipate various spelling errors/differences and want to help Solr find those instances. From our old home-grown search solution, I have a list of regexes and the artists they apply to. When I tried to translate those to Solr using the PatternReplaceCharFilterFactory, I noticed that some worked perfectly while others didn't work at all, with seemingly no rhyme or reason between them.
For example:
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="em[ei]n[ei]m" replacement="Eminem"/>
accurately captures the common misspellings of Eminem. But for the band 311:
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[Tt]hree [Ee]leven" replacement="311"/>
Does not work. Another example is Nine Inch Nails:
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="((nine|9).*inch.*nails\b)|(n\.? ?i\.? ?n\.?\b)" replacement="Nine Inch Nails"/>
works perfectly for finding the most common patterns for the band's name. But for Eve 6:
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[Ee]ve.{0,4}([Ss]ix|6)" replacement="Eve 6"/>
This one also does not work. Is there something fundamental I'm missing about the usage of this filter? I've tried a number of variations on the regexes mentioned above (even going so far as to use literals like 'three eleven'), but still with no success. I've tried making the filter in question the only PatternReplaceCharFilterFactory in the analyzer. I also know for sure that these items are in the index correctly, because when I search for the correct spelling it returns the proper results.
Any suggestions?
Snowdall
I suspect the problem is not with your char filter, but with what comes after it in the analysis chain, specifically the tokenizer. If you use the standard tokenizer, it will get rid of the numbers you have just put into your stream. If you don't need the text split into tokens, you could look at KeywordTokenizerFactory instead.
In general, the best way to troubleshoot this in Solr 4+ is the Analysis screen in the Admin WebUI. It allows you to enter your text against particular field type and see what happens with it after each component in the analysis chain.
I would recommend using the SynonymFilter for the kind of application you describe. It allows you to provide an external file where you list words and their synonyms, like:
eminem <=> emenem
nine <=> 9
If you precede this with a LowerCaseFilter, you won't have to fuss about case normalization in your synonyms. You should be able to handle the 311 case too, as long as you don't tokenize (i.e., use a KeywordTokenizer, as Alexander Rafalovitch suggested).
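A sketch of such an analyzer chain (the fieldType name and synonyms filename are placeholders, and exact filter classes may vary by Solr version):

```xml
<!-- schema.xml sketch: the keyword tokenizer keeps "311" and "eve 6" intact -->
<fieldType name="artist_name" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- synonyms.txt holds lines like:  311, three eleven -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```

The Analysis screen mentioned above is the quickest way to confirm what each stage of this chain emits for a given input.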
I have a question about XSL-FO; the generator is FOP. What I want to do:
In the PDF I want to generate an item list, where each item sits in a box with a specific width and height. If the content does not fit that box, it should be displayed in a bigger box (also with specific dimensions).
I do not see any way to achieve that in XSL-FO, especially with FOP.
Does someone have an idea how to solve this?
Thanks for any ideas!
There are two separate, independent processing steps involved here:
Generation of XSL-FO markup (using a stylesheet and an XSLT processor).
Rendering of XSL-FO markup as PDF (using a FO processor, such as FOP).
The second step cannot influence the first. It is not possible to test for overflow conditions during rendering and somehow decide what template to invoke. There is no feedback loop. What you are asking for is not possible.
It is possible to do crude text fitting by estimating the length of text strings in XSLT. That is the idea behind "Saxon Extension for Guessing Composed Text String Length".
I have not used this extension, and it may no longer be available (the announcement about it is from 2004). In any case, this is very far from an actual layout feedback mechanism.
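As a very rough illustration of that length-estimation idea in plain XSLT (the element names and the character threshold are hypothetical; this only guesses at fit, it never measures composed text):

```xml
<xsl:choose>
  <!-- 120 characters is an arbitrary cut-off, not a real measurement -->
  <xsl:when test="string-length(description) &lt;= 120">
    <fo:block-container width="50mm" height="25mm">
      <fo:block><xsl:value-of select="description"/></fo:block>
    </fo:block-container>
  </xsl:when>
  <xsl:otherwise>
    <fo:block-container width="50mm" height="50mm">
      <fo:block><xsl:value-of select="description"/></fo:block>
    </fo:block-container>
  </xsl:otherwise>
</xsl:choose>
```

Anything depending on font metrics, line breaking, or hyphenation will defeat a character count like this, which is why it is only a crude approximation.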