How can I programmatically generate documents using TeX? - web-services

This applies to popular languages used for web development e.g. Python, Java, Rails etc.
I want to be able to programmatically generate TeX documents. For example, a user submits a form in which a field contains the LaTeX code to be typeset, and the web service returns the typeset PDF.
Are there libraries available for such a task? I can't find any.
The only other solution I can think of is to use external shell command functionality that's usually available. But this is a bit messy.

Some time ago I created an Etherpad Lite plugin that compiles LaTeX server-side. FlyLaTeX does something similar, but it didn't really work for me, and when I looked at its code about four months ago it seemed messy and almost impossible to fix and debug.
Basically you need to generate a temporary file that you can then compile with LaTeX.
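The temporary-file approach can be sketched in Python roughly as follows. This is a minimal sketch, not a hardened service: the function name `compile_latex` and the file name `document.tex` are made up for illustration, the flags are the usual TeX Live `pdflatex` options, and the `run` parameter exists only so the function can be exercised without a TeX installation.

```python
import subprocess
import tempfile
from pathlib import Path


def compile_latex(source, engine="pdflatex", run=subprocess.run):
    """Write `source` to a temporary .tex file, compile it, return the PDF bytes.

    Assumptions: `engine` is on PATH and accepts TeX Live style flags;
    `run` is injectable (hypothetical convenience) for testing.
    """
    with tempfile.TemporaryDirectory() as workdir:
        tex_path = Path(workdir) / "document.tex"
        tex_path.write_text(source, encoding="utf-8")
        command = [
            engine,
            "-interaction=nonstopmode",  # never stop to prompt on errors
            "-halt-on-error",            # fail on the first error
            "-no-shell-escape",          # the document may not run shell commands
            "-output-directory=" + workdir,
            str(tex_path),
        ]
        # check=True raises CalledProcessError if the compile fails.
        run(command, cwd=workdir, check=True, capture_output=True, timeout=60)
        return (Path(workdir) / "document.pdf").read_bytes()
```

Since the LaTeX comes from a form field, it is untrusted input, so a timeout and a disabled shell escape are the bare minimum; running the compiler in a sandbox would be safer still.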
I don't know of any generation libraries, but LaTeX is quite easy to generate. Pandoc can also convert various formats into LaTeX.
There is also https://github.com/manuels/texlive.js/ which is an Emscripten-based client-side port of LaTeX (it unfortunately has very limited capabilities and is quite large).

Related

How to create a pdf from html5 and css3 in Django?

I have a very simple use case: I want to create a PDF from an HTML file that I have.
Problem:
I have tried xhtml2pdf, but it does not work with CSS3 and gives errors while parsing Bootstrap files.
I am currently trying Prince, but it is giving me some issues when I run it.
Has anyone faced such an issue, and have you been able to solve it?
There are surprisingly few options available when you are not prepared to pay for a commercial library. One of my clients had the same requirement and did not want to pay for any third-party tools, so I had to make a plan. This is what I did; it is not the best solution, but it got the job done.
I downloaded the newest version of wkhtmltopdf. Unfortunately, the wkhtmltopdf tool did not display some of the Google graphs embedded in my HTML when converting to PDF. So I used the wkhtmltoimage tool, which is also included, to convert to a PNG; this worked as expected and displayed all the graphs.
I then downloaded the newest version of ImageMagick and converted the PNG to PDF.
I automated this process using C#. You should be able to do this using Python as well (please note that I have no knowledge of Python and could be wrong).
Unfortunately, this is not the most elegant solution, because you have to perform two conversions and do a bit of work to automate everything, but it is the best solution I could come up with that gave me the desired results and quality.
Of course, there is lots of commercial software out there that will do a faster and better job.
Just a side note:
The web page that I had to convert was developed in HTML5 and CSS3 using version 3 of Bootstrap, and it contained some Google graphs and charts. Everything was converted without any problems.
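For the Python side, the two-step conversion described above might be automated along these lines. This is only a sketch under assumptions: both tools are installed and on PATH, ImageMagick is invoked through its classic `convert` command (version 7 calls it `magick`), and the function names and the injectable `run` parameter are invented for illustration.

```python
import subprocess


def build_pipeline(html_path, png_path, pdf_path):
    """Return the two commands: HTML -> PNG, then PNG -> PDF."""
    return [
        # wkhtmltoimage picks the output format from the file extension.
        ["wkhtmltoimage", html_path, png_path],
        # ImageMagick wraps the PNG in a single-page PDF.
        ["convert", png_path, pdf_path],
    ]


def html_to_pdf(html_path, png_path, pdf_path, run=subprocess.run):
    """Run both conversion steps in order, stopping at the first failure."""
    for command in build_pipeline(html_path, png_path, pdf_path):
        run(command, check=True)
```

A call such as `html_to_pdf("report.html", "report.png", "report.pdf")` would then perform both steps. Note that the resulting PDF contains an image of the page, so its text is not selectable.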

Extensible lightweight markup language

Lightweight markup languages offer a fixed set of features. This feature set is growing, but every time I write a more complex article, I find that something is missing. Examples include proper image captions, a table of figures, file includes, cross-references, etc. So I end up creating a tool chain around it, with a Makefile and tricky sed commands.
I typically want to insert ad-hoc markers into my text and process them later. They can be one-liners, or more complex -- and this is where the whole regex approach fails. Here is a snippet of an imaginary markup.
I can generate an image from an external dot file [.myDot diag.dot The process],
and it will be included with a caption.
Or the dot source is right here [.myDotHere
foo->bar->Done;
]
I'm looking for a markup tool which can be easily extended to suit my ad-hoc needs. The options I have found so far:
Makefile, pre- and postprocessing with sed/perl scripts
Built in regex pre-processing in txt2tags
Pandoc parses markdown into an internal AST which can be transformed with haskell scripts
So what I'm looking for is:
a language designed with customization and extensibility in mind
lightweight; no TeX/LaTeX please
not something which handles all my specific issues but is not extensible
My output is usually just html, so it doesn't have to support many targets
I created Glyph with extensibility in mind. You can create your own macros either using Glyph itself or Ruby.
Glyph aims to make publishing easier while giving all possible control to the writer. It can manage book metadata, a ToC, internal links, snippets, etc.
For documentation on all its features check out the Glyph book, which was created using Glyph itself.
Your "toolchain" approach is a good one. IMO you won't find a single project that will handle your specific needs; it is best to follow the *nix philosophy and use the best tool for each job, one that plugs into your open toolchain.
If macro inclusion is an issue, don't worry about solving that by your choice of markup syntax - find the right tool for that specific job and use it upstream.
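As an illustration of handling macros upstream, here is a minimal Python sketch of a preprocessor for the bracket syntax from the question. Everything in it is an assumption made for the example: the handler registry, the HTML it emits, the idea that `diag.dot` has already been rendered to `diag.png` by another step, and the restriction that a macro body never contains a closing bracket.

```python
import re

# Matches [.name arguments], including bodies that span several lines.
# Assumption: macro bodies never contain a closing bracket.
MACRO = re.compile(r"\[\.(\w+)\s*([^\]]*)\]")

HANDLERS = {}


def macro(name):
    """Register a handler for the macro called `name`."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@macro("myDot")
def my_dot(args):
    # First word is the dot file, the rest is the caption.
    filename, _, caption = args.partition(" ")
    image = filename.rsplit(".", 1)[0] + ".png"  # assumed to be rendered elsewhere
    return f'<figure><img src="{image}"><figcaption>{caption}</figcaption></figure>'


def expand(text):
    """Replace every known macro; leave unknown ones untouched."""
    def substitute(match):
        name, args = match.group(1), match.group(2).strip()
        handler = HANDLERS.get(name)
        return handler(args) if handler else match.group(0)
    return MACRO.sub(substitute, text)
```

The output of `expand` can then be fed to whatever markup converter sits next in the toolchain. Adding a new ad-hoc marker means writing one more decorated function.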
The choice of markup should IMO be based on the availability of transformation tools for your desired output. IMO Pandoc is by far the most actively developed project in this space, and very flexible, especially with its scripting facility. Note it's also very well supported on Google Groups - John (Pandoc's author) will likely respond directly and quickly to any issues you may have.
Note that Pandoc's flexibility also means your master source text isn't as "locked in", as you can easily convert for example from its extended markdown syntax to reST, if say you wanted to take advantage of Sphinx's or DocBook's capabilities. (BTW also check out AsciiDoc, which the latest Pandoc outputs - apparently a reader is also in the works)
Check out Pandoc's "extras" wiki page. I've been particularly excited by the ConTeXt filter script; I'm not sure if it'll be a good fit for you, but it includes some macro include capabilities, and IMO nothing will give you better typographical control.

Python modules for visualization of C++ code

I'm looking for Python modules that can help with grepping C++ code. I have a large code base that I would like to do some analysis on. Ultimately I would like to come up with a graphical map of the software. There is lots of message passing going on among the apps, so I would like to be able to capture that information and present it visually. I have been looking around at some of the data visualization packages but have only stumbled on math- and plotting-related ones.
What are the best tools for this job, preferably in Python?
Your best tool for the job is Graphviz. If you look at their gallery you'll find the sort of thing that you're interested in along with links to projects.
Under the language bindings section here there are a few python entries. Personally I don't use them as the dot language format is simple enough that you can build up fairly complex graphs from Python just using print statements.
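To show how little is needed, here is a minimal sketch that builds dot source from Python strings. The function name `to_dot` and the (sender, receiver, message) triples are hypothetical; in practice they would come from whatever you extract from the C++ code. The sketch assumes names contain no double quotes.

```python
def to_dot(edges, name="messages"):
    """Render (sender, receiver, message) triples as a Graphviz digraph."""
    lines = [f"digraph {name} {{"]
    for sender, receiver, message in edges:
        # One labelled edge per message passed between two applications.
        lines.append(f'    "{sender}" -> "{receiver}" [label="{message}"];')
    lines.append("}")
    return "\n".join(lines)
```

Printing `to_dot([("ui", "broker", "Submit"), ("broker", "worker", "Run")])` to a file and running `dot -Tsvg graph.dot -o graph.svg` on it should give a rendered diagram.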
You can look at Doxygen and see if it does (at least part of) what you want. It generates call graphs and class diagrams directly in HTML or XML format (I believe you need to have dot installed for the fancy graphs).

How to replace text in a PowerPoint (.ppt) document?

What solutions are there? I only know of solutions for replacing bookmarks in Word (.doc) files with Apache POI.
Is it also possible to change images, layouts, and text styles in .doc and .ppt documents?
I am thinking of replacing areas in Word and PowerPoint documents for bulk processing.
Platform: MS-Office 2003
What are your platform limitations?
Obviously Apache POI will get you at least part of the way there.
Microsoft's own COM APIs are fairly powerful and are documented here. I would recommend using them if a) you are not running in a server (many users, multithreaded) environment; b) you can have a proper version of PowerPoint installed on the production machine; and c) you can code against a COM object model.
It's a bit pricey, but Aspose.Slides is a very powerful library for manipulating PowerPoint files.
If you include using other Office suites as an option, here's a list of possible solutions:
Apache POI-HSLF
PowerPoint 2007 APIs
OpenOffice.org UNO
Using POI you can't edit the .pptx file format, but you don't depend on the apps installed on the system. The other two options, on the contrary, make use of other apps, but they are definitely better for dealing with presentations. OpenOffice has better compatibility with older formats, by the way. Also, if you use UNO, you'll have a great choice of languages: UNO exists for Java, C++, Python and other languages.
My experience is not directly with PowerPoint, but I've actually rolled my own WordML (XML) generator. It a) removed all dependencies on Word, b) was very fast, and c) let me build up documents from scratch.
But it was a lot of work to create, and I was only creating a write-only implementation.
I'm not as familiar with PowerPoint, so this is conjecture, but you may be able to roll your own by reading XML (PowerPoint 2003??) and/or cracking the Office Open XML file (zipped XML), then using XPath to manipulate the data, and then saving everything back to disk.
This won't work on older OLE Compound Document based PowerPoint files though.
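For the Office Open XML case (.pptx, not the 2003 .ppt format), the "unzip, edit the XML, zip again" idea might look like this in Python using only the standard library. Treat it as a sketch of the conjecture above rather than a tested tool: the function name `replace_text` is made up, it only touches text runs (`a:t` elements) in the slide parts, and it assumes each placeholder sits inside a single run.

```python
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for text runs (<a:t>) in Office Open XML slides.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def replace_text(src, dst, old, new):
    """Copy the .pptx at `src` to `dst`, replacing `old` with `new` in slide text."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            is_slide = (item.filename.startswith("ppt/slides/slide")
                        and item.filename.endswith(".xml"))
            if is_slide:
                root = ET.fromstring(data)
                for node in root.iter(f"{{{A_NS}}}t"):
                    if node.text and old in node.text:
                        node.text = node.text.replace(old, new)
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # Every other part (layouts, media, relationships) is copied unchanged.
            zout.writestr(item, data)
```

One caveat: ElementTree rewrites namespace prefixes when it serializes, which real presentations that rely on prefix names in attribute values may not tolerate, so a parser that preserves prefixes may be needed in practice.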
I've done something like that before: programmatically accessed and manipulated PowerPoint presentations. Back when I did it, it was all in C++ using COM, but similar principles apply to C#/VB .NET apps, since they do COM interop very easily.
What you're looking for is called the Office Document Model. Basically, Office applications expose their documents programmatically, as trees of objects that define their contents. These objects are accessible via an API, and you can manipulate them, add new ones, and do whatever other processing you want. It's exceedingly powerful; you can use it to manipulate pretty much all aspects of a document. But you'll need an installation of Office and Visual Studio to be able to use it.
Some links:
Intro: http://msdn.microsoft.com/en-us/library/d58327k6.aspx
Hope this helps!
Here's the other link I meant to include:
Example of manipulating PowerPoint documents programmatically: http://msdn.microsoft.com/en-us/library/cc668192.aspx

Workflow to Turn Wiki content into a system manual

We're in the middle of deploying a new software system to lots of users in lots of places (200+ users over 8 countries). In the past we've written a manual for the users, then updated it every so often. This works OK, in that all the users have the same manual and it covers the main things, but it has its problems: it doesn't get updated that often, we sometimes miss updates, and some users will have old copies.
We've been talking about using a wiki during the testing and deployment phases to build a knowledge base about the system. Ideally we'd then like some way to convert that into some form of electronic document that we can then prettify and send out as the official manual, as well as letting users use and update the wiki.
Has anyone else done anything similar? Any suggestions for wiki systems, workflows, document formats etc.?
Most wikis support export to PDF, e.g.:
MediaWiki PDF Export
DokuWiki PDF Export
TWiki PDF Export
You can write something that generates LaTeX from the wiki and renders a manual to PDF. With packages like hyperref you can retain cross-references as hyperlinks.
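A very reduced sketch of such a generator, in Python: it assumes MediaWiki-style `== Heading ==` lines and `[[Heading]]` internal links, handles only that one heading level, and does not escape LaTeX special characters. The function names and the `sec:` label scheme are invented for the example.

```python
import re


def label_for(title):
    """Derive a LaTeX label from a title, e.g. 'User Setup' -> 'sec:user-setup'."""
    return "sec:" + re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def wiki_to_latex(text):
    """Convert '== Heading ==' lines and '[[Heading]]' links to LaTeX."""
    # Headings become labelled sections.
    text = re.sub(
        r"^==[ \t]*(.+?)[ \t]*==[ \t]*$",
        lambda m: f"\\section{{{m.group(1)}}}\\label{{{label_for(m.group(1))}}}",
        text,
        flags=re.MULTILINE,
    )
    # Internal links become clickable references to those labels.
    return re.sub(
        r"\[\[(.+?)\]\]",
        lambda m: f"\\hyperref[{label_for(m.group(1))}]{{{m.group(1)}}}",
        text,
    )
```

With `\usepackage{hyperref}` in the preamble, each converted link becomes a clickable jump to the matching section in the PDF.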
Additionally, you can integrate content from multiple sources such as a data dictionary into the LaTeX document, which can be mixed and matched with the wiki content. You could also set the architecture up so it can support cross-referencing that goes either way.
FrameMaker could also support this using generated MIF files. You could also use Lout in a similar way, or convert your wiki content to DocBook, which would allow you to use any of the many rendering options available for that format.
As an aside, the following Stack Overflow postings discuss various systems for maintaining documentation.
Application (Not a Markup Language) for Producing a User Manual
Can LaTeX be used for producing any documentation that accompanies software?
What tools are used to write documentation?
What tools does your team use for writing user manuals?
How best to write documentation (ideally in latex) targeting both the web (html) and paper (pdf)?
Best tool(s) for working with DocBook XML documents?
What is the recommended toolchain for formatting XML DocBook?
Is a successor for TeX/LaTeX in sight?
MadCap Flare is a help-and-manual authoring tool that uses HTML for the source of each topic. You could pretty easily do a mass import of the wiki pages. They would then require some cleanup, but after that you have a nice single-source system that can output CHM, web-browsable help, PDF, DOC/DOCX, etc.
How are you storing the help source at the moment? Is it MS Word files, MS help, LaTeX?
If you put your help source files under version control then you will get all the benefits of a wiki without having to migrate to a new system - people can make edits to the help files easily - those changes can be tracked, reverted etc. and you get the prettified manuals as before.
I followed Node's links and came across some MediaWiki pages that I thought were noteworthy.
Extension:OpenDocument Export
Extension:PDF Writer
Category:Data extraction extensions
I gave a previous answer which may be useful for the "wiki to PDF" part -- look at using the open source PediaPress code or functionality. You can get ODFs from it too, although their PDFs are already quite pretty (but you might want to rebrand it and restyle it for your company I suppose).