Is html file produced by r markdown contain raw data? - r-markdown

I'm planning to use R markdown for reporting health checkup data in company. What I concern about is that if html file produced by r markdown contain the raw data (such as patient name) and accesible via that file in anyway. If that is the case, is there any way to avoid that?

This is difficult to answer without an example of your data (which I realise will be hard to share, given your question).
Basically, there's nothing in the html-version of Rmarkdown that would necessarily embed the data if you don't ask it to. If all you're planning to do is to summarise and ignore (or anonymise) individual cases then Rmarkdown should work fine.
For more information, take a look at the useful help page for html pages using Rmarkdown.
I hope this helps you.

Related

Converting RTF tables into xml

I was given the task to convert great amount of RTF tables into XML ones (around or way more than 100.000), but I have no idea how to even start it and i cannot get help from the lead developer, because ironically he had never written a line of code.
I was thinking about c++ as I need t to be fast, but I'm open to any ideas.
What I need is some information I can start the project with or any library/program I could use for my help, thank you.
EDIT: I have XSD schemas to work with.
Found the solution after looking for a while. I can use LibreOffice to save it as html or other various forms that will keep the table as it is and also give a clear code i can pull an XSD on to make it valid also.

How to parse the java comments of a groovy file to html format?

I have a set of .groovy files (Java). All of these files have the same comment format.
I developped a tool with wich I'm able to read those files and applying a REGEX to get all the comments in a list. (Finally i just have to copy paste these comments to .html file)
I would like to know if it's a correct practice in order to generate a HTML page with the comment (a kind of documentation). If not, what would you recommend ?
I read about Doxygen and Javadoc but i'm not sure about using them (if they can be really useful in my case since the comments are already written)
If you can suggest a library in order to generate easily a HTML Webpage or any other advice.
Any help is appreciated.
There exists Groovydoc which is roughly the equivalent of Javadoc, just for Groovy.
As your setup is not that (you already have comments, probably not in Groovydoc format, and you have half the tooling), there are still multiple ways open to you. As you already extract the documentation from groovy, if I were you, I would do a minimal post-formatting, if necessary, and output the documentation as markdown (e.g., github markdown) or asciidoc (e.g., asciidoctor). Then you can use any preferred tool to convert the post-formatted documentation into HTML.
To answer the question "How to parse the java comments" – you shouldn't. If possible, especially in a new project, stick with the standard tooling. In the case of Groovy that's Groovydoc. The normal (non Java/Groovy-Doc style) comments themselves you should never need to extract from the source code. They should be so much context-specific, that without the corresponding code they are anyways useless.

Converting HTML file to PDF using Win32/MFC

As part of my application, my client has requested that I include an automated e-mailing system. As part of this system, I generate HTML code and use automation to send it via. Outlook.
However, they also require a PDF copy of the HTML document to be sent as an attachment. My initial attempts involved using libHaru, which proved difficult to use efficiently, as I was required to create the PDF document from scratch, which required computation of the position of each of the lines in a table, and positioning of all the text, etc.
I was wondering if there would be a way to programmatically convert HTML code (or an HTML file if need be) into a PDF document either by using Win32/MFC itself or an external library.
Thanks in advance!
EDIT: Just to clarify, I am looking for solutions which minimize external dependencies.
You should evaluate this utility wkhtmltopdf:
http://code.google.com/p/wkhtmltopdf/
You can call it from the command line without the need to run a setup.
I use it generating my output documents as html then cal a ShellExecute(...) to convert it to PDF. It's great!
Inside uses webkit + qt. So compability with modern HTML is OK.
Hope it helps.
I'd take a look at PDF Creator, which can be used as a COM object (that acts pretty much like a printer). I haven't used it to print HTML, so I'm not sure, but my guess is that you'll probably end up having to instantiate a web browser control to render the HTML, and then feed it from there to the PDF control.
Some possible answers are in this thread:
C++ Library to Convert HTML to PDF?
Not sure if they will satisfy your particular requirements, but these might at least get you started.
Edit:
Some other possible options here.
Not MFC but you can try QtWebKit. It can render and export HTML to PDF, PNG, JPEG

C++ Logger-Should I use an ordinary xml parser?

I'm working on a logging system for my 2D engine, and I'm confused on how I should go about creating/editing the file, and how I should output that file.
I've learned that XML is more of a data carrier rather than a data displayer like HTML is. I've read that I can use XML to HTML converters. One method I've thought about is writing characters to a file in HTML.
Clarity on these matters is what I ask of you, stack overflow.
Creating an XML (or HTML) file doesn't need any special library. Straightforward string concatenation is usually good enough, you may have to encode some special characters (e.g. > into >.
But as Owen says, plain text is a log more common for log files. One reasonable compromise is comma-separated values in a text file, this gives you a little bit of structure without much overhead. For example, the Windows web server (IIS) uses this format by default, and if you have some fields that are output for each line such as timestamp or source filename and line number, this makes it easy to separate those out again.
Just about every log I've ever worked with has been pure text delimited by newlines. If you're going to depart from that, you may want to ask yourself what it is about your logging needs that you want to accomplish with markup.
If you must go the way of markup, I would suggest an XML format that contains a minimal set of markup that would be useful in your situation. You could use XML to capture structure in your log entries (timestamp, severity, and operational code, for example) that would be inconvenient to code for in HTML.
Note that you could also go hybrid and embed some XHTML tags in an XML element whose purpose is to capture displayable text, if you want.
The problem with XML or HTML files is that you cannot append at any time. You have to close the final tag (document tag) properly at the end of writing.
Therefore, it's not a popular format for logging.
For logging, I suggest using one of the existing log engines, such as Apache logger, or, John Torjo's boost log candidate. They will support log levels, runtime configuration, etc.
If you are considering writing logs in XML files, please, stop.
Log files should be simple plain text files, XML-izing it is introducing needless complexity. They are not structured data, they are meant to be read by people, not automated tools.
It all starts with XML logs, and then it goes downhill from there.

Everything inside < > lost, not seen in html?

I have many source/text file, say file.cpp or file.txt . Now, I want to see all my code/text in browser, so that it will be easy for me to navigate many files.
My main motive for doing all this is, I am learning C++ myself, so whenever I learn something new, I create some sample code and then compile and run it. Also, along these codes, there are comments/tips for me to be aware of. And then I create links for each file for easy navigation purpose. Since, there are many such files, I thought it would be easy to navigate it if I use this html method. I am not sure if it is OK or good approach, I would like to have some feedback.
What I did was save file.cpp/file.txt into file.html and then use pre and code html tag for formatting. And, also some more necessare html tags for viewing html files.
But when I use it, everything inside < > is lost
eg. #include <iostream> is just seen as #include, and <iostream> is lost.
Is there any way to see it, is there any tag or method that I can use ?
I can use regular HTML escape code < and > for this, to see < > but since I have many include files and changing it for all of them is bit time-consuming, so I want to know if there is any other idea ??
So is there any other solution than s/</< and s/>/>
I would also like to know if there any other ideas/tips than just converting cpp file into html.
What I want to have is,
in my main page something like this,
tip1 Do this
tip2 Do that
When I click tip1, it will open tip1.html which has my codes for that tip. And also there is back link in tip1.html, which will take me back to main page on clicking it. Everything is OK just that everything inside < > is lost,not seen.
Thanks.
You might want to take a look at online tools such as CodeHtmler, which allows you to copy into the browser, select the appropriate language, and it'll convert to HTML for you, together with keyword colourisation etc.
Or, do like many other people and put your documentation in Doxygen format (/** */) with code samples in #verbatim/#endverbatim tags. Doxygen is good stuff.
A few ideas:
If you serve the files as mimetype text/plain, the browser should display the text for you.
You could also possibly configure your browser to assume .cpp is text/plain.
Instead of opening the files directly in the browser, you could serve them with a web server than can change the characters for you.
You could also use SyntaxHighlighter to display the code on the client side using JavaScript.
It is pretty much essential that somewhere along the line you use a program to prevent the characters '<>&' from being (mis-)interpreted by your browser (and expand significant repeated blanks into '` '). You have a couple of options for when/how to do that. You could use static HTML, simply converting each file once before putting it into the web server document hierarchy. This has the least conversion overhead if the files are looked at more often than they are modified. Alternatively, you can configure your web server to server the pages via a filter program (CGI, or something more sophisticated) and serve the output of that in lieu of the file. The advantage is that files are only converted when needed; the disadvantage is that the files are converted each time they are needed. You could get fancy and consider a caching solution - convert the file on first demand but retain the converted file for future use. The main downside there is that the web server needs to be able to write to where the converted file is cached - not necessarily a good idea for security reasons. (A minimalist approach to security requires the document hierarchy to be owned by and only writable by one user, say webmaster, and the web server runs as another user, say webserver. Now the web server cannot do any damage because it cannot write anywhere in the document hierarchy. Simple; effective; restrictive.)
The program can be a simple Perl script or a simple C program (the C source for webcode 1.3 is available here).