Type "p" Annotations with Plain Text files - gate

I am trying to understand whether GATE can extract annotations of type "p" from UTF-8 encoded plain text files.
HTML and PDF files work fine: "p" annotations are added when these two file types are analysed.
I tried several different PRs, but I cannot get type "p" annotations under Original Markups.
Is there a way to achieve this for plain text files?

I think you should use the Annotation Set Transfer PR, which will move "p" annotations from Original Markups to the Default set. You will then be able to use them according to your requirements.

Quarto - How to embed files in outputted word documents programmatically?

My Quarto document uses knitr as the backend, and the output is set to create a Word document. I would like the generated Word document to also contain an embedded object: the source file used to create the Word document itself.
So create_word.qmd will create output_word_file.docx, and I would like output_word_file.docx to contain an embedded copy of create_word.qmd. Is this possible programmatically, or do I need to do this part manually?
If it is only possible with the .Rmd file type, I can make that work.

Setting the supported file types for a MFCBrowseEdit control in File mode? [duplicate]

How do I give a default file name extension in CMFCEditBrowseCtrl::EnableFileBrowseButton? How should the arguments be passed? I tried the following code.
CMFCEditBrowseCtrl py_file_path;
py_file_path.EnableFileBrowseButton(_T"PY",_T"*.py");
But it does not display the .py files; it says "no items matches". I guess there is a problem with the lpszDefExt and lpszFilter values I use. Could anyone tell me what values those arguments need in order to list all .py files?
You need to set it like this:
CMFCEditBrowseCtrl py_file_path;
py_file_path.EnableFileBrowseButton(_T("PY"), _T("Python files|*.py||"));
The second argument (lpszFilter) is a filter string in which each description and its pattern are delimited by |, and the whole string ends with ||.
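As a minimal sketch of that layout, the helper below builds a filter string from (description, pattern) pairs. MakeFileFilter is a hypothetical name, not part of MFC, and it uses std::string for portability; in an MFC project you would more likely write the literal directly with _T(...) as shown above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MFC): joins (description, pattern) pairs
// into the "desc|pattern|desc|pattern||" layout used by the filter string.
std::string MakeFileFilter(
    const std::vector<std::pair<std::string, std::string>>& entries) {
    std::string filter;
    for (const auto& entry : entries) {
        // Neither part may contain the delimiter itself.
        assert(entry.first.find('|') == std::string::npos);
        assert(entry.second.find('|') == std::string::npos);
        filter += entry.first + "|" + entry.second + "|";
    }
    filter += "|";  // The extra '|' terminates the whole list.
    return filter;
}
```

For example, MakeFileFilter({{"Python files", "*.py"}, {"All files", "*.*"}}) yields "Python files|*.py|All files|*.*||".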

How to get email attachment MIME type by file extension?

I'm trying to write a class that uses the Mandrill APIs to send an email with an attachment. To do that I need to provide the MIME type of the attachment for the base64-encoded attachment contents. The question is how to get it from the file extension of the attachment.
PS. I was hoping for something better than a long switch/case situation. But if that's my only option, where can I get the most exhaustive list of such associations?
You can look in the Registry, under either:
HKEY_CLASSES_ROOT\<file extension>\ and see if it has a "Content Type" value.
HKEY_CLASSES_ROOT\MIME\Database\Content Type\, enumerating each subkey until you find one whose "Extension" value contains the file extension.
There is also a FindMimeFromData() function.
If you don't find a matching content type, you can always use application/octet-stream.
Apart from what was suggested in a different answer, here's a hard-coded list of file-extension-to-MIME-type associations that I converted from this Ruby project's JSON list into a C struct.
Oops, can't post it here. It's too long. Here's the file instead.
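A minimal sketch of the hard-coded table approach, combined with the application/octet-stream fallback mentioned earlier. The function name MimeTypeForFile and the handful of entries are illustrative assumptions; this is not the linked list, and a real table would be far longer.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <unordered_map>

// Illustrative lookup: maps a file name's extension to a MIME type.
// The table is a tiny assumed subset, not an exhaustive list.
std::string MimeTypeForFile(const std::string& file_name) {
    static const std::unordered_map<std::string, std::string> kTypes = {
        {"txt", "text/plain"},
        {"html", "text/html"},
        {"png", "image/png"},
        {"jpg", "image/jpeg"},
        {"pdf", "application/pdf"},
        {"zip", "application/zip"},
    };
    const std::string kFallback = "application/octet-stream";

    // No dot, or a trailing dot, means there is no usable extension.
    const std::size_t dot = file_name.find_last_of('.');
    if (dot == std::string::npos || dot + 1 == file_name.size()) {
        return kFallback;
    }

    // Compare case-insensitively by lowercasing the extension.
    std::string ext = file_name.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });

    const auto it = kTypes.find(ext);
    return it != kTypes.end() ? it->second : kFallback;
}
```

The function expects a bare file name; a path whose directory part contains a dot but whose file has no extension simply falls through to the fallback type.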

Is there any way to rename the "Source" button to something like "HTML"?

Is there any way to rename the "Source" button to something like "HTML"? I ask because users are confused about how to add HTML code using the editor.
Yes, inside of the "lang" folder you will see all of the various language files.
In my case, and probably yours, you will want to edit the file "en.js". The file is "compressed" to some degree so it may be hard to read, but changing one string is still not too difficult. If you plan to change multiple strings, you will most likely want to use a service to format the JavaScript first.
Search for the following segment of code. It was one of the very last lines in the file.
"sourcearea":{"toolbar":"Source"}
Change it to
"sourcearea":{"toolbar":"HTML"}
Avoid This Method Unless Required
As a strongly discouraged alternative, if you can't modify the language files for some reason, you can modify the ckeditor.js file and force a specific label.
Inside of "ckeditor.js" change the line below
a.ui.addButton("Source",{label:a.lang.sourcearea.toolbar,command:"source",toolbar:"mode,10"});
to the following code
a.ui.addButton("Source",{label:"HTML",command:"source",toolbar:"mode,10"});
The only thing modified is the "label" value in the above line. We remove the reference to a.lang.sourcearea.toolbar and insert a string in its place instead.

HTML templating in C++ and translations

I'm using HTML_Template for templating in my C++-based web app (don't ask). I chose that because it was very simple and it turns out to be a good solution.
The only problem right now is that I would like to be able to include translatable strings in the HTML templates (HTML_Template does not really support that).
Ultimately, what I would like is to have a single file that contains all the strings to be translated. It can then be given to a translator and plugged back in to the app and used depending on which language the user chose in settings.
I've been going back and forth on some options and was wondering what others feel is the best choice (or whether there's a better choice that isn't listed):
1. Extend HTML_Template to include a tag for holding the literal string to translate. So, for example, in the HTML I would put something like
<TMPL_TRANS "this is the text to translate"/>
2. Use a completely separate scheme for translation and preprocess the HTML files to generate the final template files (without the special translation lingo). For example, in the pre-processed file, translatable text would look like this:
{{this is the text to translate}}
and the final would look like:
this is the text to translate
3. Don't do anything and let the translators find the strings to translate in the HTML and JS files themselves.
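For illustration, the preprocessing option with {{...}} markers could be sketched roughly as follows. ExpandTranslations and the translations map are hypothetical names, not part of HTML_Template; the sketch assumes the original text doubles as the lookup key and is kept when no translation exists.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical preprocessor: replaces each {{text}} marker with its
// translation, or with the plain text when no translation is known.
std::string ExpandTranslations(
    const std::string& tmpl,
    const std::unordered_map<std::string, std::string>& translations) {
    std::string out;
    std::size_t pos = 0;
    while (true) {
        const std::size_t open = tmpl.find("{{", pos);
        if (open == std::string::npos) break;
        const std::size_t close = tmpl.find("}}", open + 2);
        if (close == std::string::npos) break;  // Unterminated marker: copied as-is below.

        out.append(tmpl, pos, open - pos);  // Text before the marker.
        const std::string key = tmpl.substr(open + 2, close - open - 2);
        const auto it = translations.find(key);
        out += (it != translations.end()) ? it->second : key;
        pos = close + 2;
    }
    out.append(tmpl, pos, std::string::npos);  // Remaining text after the last marker.
    return out;
}
```

Running this once per language over each template would produce the final template files, with the map loaded from the single strings file handed to the translator.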
You may want to consider arrays, if you have not already.
A popular implementation for translating strings is to use tables and indices. One index is for the language and the second index is for the string. Create a function that returns strings based on these two indices:
const std::string& Get_String(unsigned int language_index, unsigned int string_index);
Each language would have a table of strings (or const char *). There would be a table of pointers to language tables, one for each supported language.
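A minimal sketch of this two-index scheme, assuming two languages and two strings. The enum names and the sample strings are made up for illustration; only the Get_String signature comes from the description above.

```cpp
#include <array>
#include <cassert>
#include <string>

// Illustrative indices; a real project might generate these from the strings file.
enum Language : unsigned int { LANG_ENGLISH, LANG_FRENCH, LANG_COUNT };
enum StringId : unsigned int { STR_HELLO, STR_GOODBYE, STR_COUNT };

using StringTable = std::array<std::string, STR_COUNT>;

const std::string& Get_String(unsigned int language_index,
                              unsigned int string_index) {
    // One table of strings per language, in StringId order.
    static const StringTable english = {{"Hello", "Goodbye"}};
    static const StringTable french = {{"Bonjour", "Au revoir"}};
    // Table of pointers to the language tables, one per supported language.
    static const std::array<const StringTable*, LANG_COUNT> tables = {
        {&english, &french}};

    assert(language_index < LANG_COUNT && string_index < STR_COUNT);
    return (*tables[language_index])[string_index];
}
```

With this layout, Get_String(LANG_FRENCH, STR_GOODBYE) returns "Au revoir", and adding a language means adding one table and one pointer.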
The biggest pain is to convert existing code to use this system.
Hope this helps.