Extensible lightweight markup language - customization

Lightweight markup languages offer a fixed set of features. This feature set is growing, but every time I write a more complex article, I have to realize something is missing. Examples include: proper image captions, table of figures, file include, cross-references, etc. So I end up creating a tool chain around it, with a Makefile and tricky sed commands.
I typically want to insert ad-hoc markers into my text and process them later. They can be one-liners, or more complex -- and this where the whole regex approach fails. Here is a snippet of an imaginary markup.
I can generate an image from an external dot file [.myDot diag.dot The process],
and it will be included with a caption.
Or the dot source is right here [.myDotHere
foo->bar->Done;
]
I'm looking for a markup tool which can be easily extended to suite my ad-hoc needs. The options I found so far
Makefile, pre- and postprocessing with sed/perl scripts
Built in regex pre-processing in txt2tags
Pandoc parses markdown into an internal AST which can be transformed with haskell scripts
So what I'm looking for is
a language designed with customization and extensibility in mind
lightweight; no TeX/LaTeX please
not something which handles all my specific issues, but not extensible
My output is usually just html, so it doesn't have to support many targets

I created Glyph with extensibility in mind. You can create your own macros either using Glyph itself or Ruby.
Glyph aims to make publishing easier while giving all possible control to the writer, it can manage book metadata, ToC, internal links, snippets, etc. etc.
For documentation on all its features check out the Glyph book, which was created using Glyph itself.

Your "toolchain" approach is a good one - You won't IMO find a single project that will handle your specific needs, best to follow the *nix philosophy and use the best tool for the job that plugs into your open toolchain.
If macro inclusion is an issue, don't worry about solving that by your choice of markup syntax - find the right tool for that specific job and use it upstream.
The choice of markup should be IMO based on the availability of transformation tools to your desired output. IMO Pandoc is by far the most actively developed project in this space, and very flexible, especially with its scripting facility. Note it's also very well supported in GoogleGroups - John will likely respond directly and quickly to any issues you may have.
Note that Pandoc's flexibility also means your master source text isn't as "locked in", as you can easily convert for example from its extended markdown syntax to reST, if say you wanted to take advantage of Sphinx's or DocBook's capabilities. (BTW also check out AsciiDoc, which the latest Pandoc outputs - apparently a reader is also in the works)
Check out Pandoc's "extras" wiki page, I've been particularly excited by the ConTeXt filter script; I'm not sure if it'll be a good fit for you, but it includes some macro include capabilities, and IMO nothing will give you better typographical control.

Related

How can I programmatically generate documents using TeX?

This applies to popular languages used for web development e.g. Python, Java, Rails etc.
I want to be able to programatically generate TeX documents. For example a user submits a form and a field contains the LaTeX code to be typeset and the web service returns the typeset PDF.
Are there libraries available for such a task? I can't find any.
The only other solution I can think of is to use external shell command functionality that's usually available. But this is a bit messy.
Some time ago I created an Etherpad lite plugin that allows you to compile the LaTeX serverside. FlyLaTeX does it similarly, but didn't really work for me and the code looked pretty messy and almost impossible to fix and debug when I was having a look at it about 4 months ago.
Basically you need to generate a temporary file that you can then compile with LaTeX.
I don't know of any generation libraries, but LaTeX is quite easy to generate. However, pandoc can convert different formats into LaTeX.
There is also https://github.com/manuels/texlive.js/ which is an emscripten-based clientside port of LaTeX (that unfortunately has very limited capabilities and is quite large).

When should one use Custom Tag in CFML?

What are some common use cases for implementing CFML Custom Tag (not CFX tag)? In 3 yrs of my CF exp I've never written one. Would someone please enlighten me, under which use case / situation would one choose custom tag over cfc / udf?
Remember that custom tags were, at one time, the only method available to extend CFML (up until version 4) - UDFs came later (CF 5) and CFCs later still (CF MX). They're not as commonly used as they once were for the simple reason that there are more options.
Custom tags are basically procedural in nature in a language that, with CFCs, become more and more OO in practice. This is another reason that they're not very common.
But there's still cases where they come in handy (but are never required) - mostly for interface work. The ability to create both a start and end state can definately come in handy. A simple example could be a "wrapper" for page content the opening tag might add the HTML header and page navigation while the closing tag would add the footer and end the page.
In this way your page content could be nothing more than:
<cfmodule... >
Page Content!
</cfmodule>
Of course there are other ways to do this as well - but sometimes the classics still have value. ;^)
Look at the CFUniform project for a great example of custom tag usage. Custom Tags are great when building reusable pieces for the UI portion of an application.
I think that, for the most part, custom tags have mostly fallen by the wayside since UDFs, CFCs, and integration with Java (and to a lesser degree .NET) allowed easier and more straightforward ways to do similar things.
Looking back to when I started in CF5, I can think of several examples. A good one might be CFX)Zip, which allowed interaction with Zip files before that was available directly through CF.
The only use I can think of offhand in a more modern context would be to provide precompiled code that wasn't written in Java or .NET, such as proprietary doodads written in C. That's a pretty niche use, though.
Honestly, I imagine at this point they exist more or backwards comatibility than anything else.
Ever since CFCs came out I've stopped using custom tags simply because of the overhead. They take too long to initiate and execute. But like #Jim Davis said, they may be useful where you need to write a tag that wraps around other content.
But in a well defined solution, you can do way with them all together.

Internationalization in MFC

It's finally (after years of postponing) the time to localize my app in a few other languages other than English.
The first challenge is to design the integration into my C++ / MFC application that has dozens of dialogs and countless strings. I came across two possible alternative implementations:
Compile and deploy localized resource files as DLLs
Extract and replace all strings with the localized version. For each
language there will be an XML (or simple text) file.
Personally I opt for the second alternative since it seems to me more flexible. The changes are many but not hard to make, and very importantly the XML files will be very easy to modify for the translators.
Any advise is greatly appreciated.
Regards,
Cosmin Unguru
http://www.batchphoto.com/
I did some long-living MFC projects with different languages.
I strongly recommend the first approach with resource-only DLLs.
The reasons:
(1) If the user does a XCOPY install, he always has the default language (English) in the main executables.
(2) If you don't translate everything (e.g. you're late with your release or forget some strings), the Windows resource functions if properly used return the resource in the default language automatically - you don't have to implement it on your own.
(3) My very person opinion: (a) Line breaks, tabs, whitespaces in XML files are a pain in your a**. (b) Merging XML files is even worse...
(4) Don't forget the encoding. It's okay in XML but your translators might use an unsuitable editor and damage the file.
And now for the main reason:
(5) You will have to rearrange many of your dialogs, because many strings are longer in e.g. French or German than in English. And making all statics, buttons, ... larger "just in case" looks crappy.
Another hint: Spend some bucks and buy one of the translation tools which import your projects / binaries and build up a translation database. This will be amortized after the first release.
Another hint (2): If possible make a release which doesn't contain any changes but only the multi-language feature. Also in future, if possible: Release your product in English. Then do the translation in one single step (per language) and release the other languages.
My good and friendly suggestion from somebody who worked a lot with localization:
Grab GNU Gettext,
Mark all your strings as _("English").
Extract all strings using gettext tool xgettext and compile dictionalries
Translate string using great tools like poedit.
Use gettext in your project and make your localization life simpler!
You can also use boost::locale for same purpose - it uses GNU Gettext dictionaries and approach but provides different and more powerful runtime and for windows developer it has very good addon - it supports wide strings that MFC requires to use for normal Unicode support.
Don't use resources and other "translation" tools that are total crap from linguistic point of view (and developer's point of view as well).
Further reading: http://cppcms.sourceforge.net/boost_locale/html/tutorial.html
Using a DLL resource library is a relatively straightforward operation, and allows you to manage not only strings, but other resources as well. And this is its main advantage, because i18n is not only about string translation.
However, depending on your needs, a text-based solution may be a better decision, because of its easier handling - resource scripts being more complex than xml files, especially for the average translator.
I would suggest creating your own abstraction layer, something like "LoadLocalizedString", etc.; in this way, you can start implementing it just with text files, and then change to something more complex when and if required in a transparent way - all the effort for making your software i18n aware would still be valid.
In our case we had diffrent dialogues per Language. The resource file was the same as the multiple laguages were implemented at development time. You could basically append on existing resource files the diferent languages. I hope it helps to find your way.
The DLL option is commonly used for this since the resource loading procedure (e.g. LoadLibrary) is already written - meaning you don't have to write any parsing/loading code. While XML is easier to edit, DLLs have a bit more security (users won't be able to easily edit them) and will allow the developer (meaning you) more time to work on application logic instead of writing a language loading system.
HMODULE hLangDLL = LoadLibrary("text_en.dll");
// more stuff
TCHAR mybuffer[1024] = {0};
LoadString(hLangDLL, IDS_MYSTRING, mybuffer, 1023);
If it is just the strings that are changing then I agree that XML is the way forward here for the exact reasons you outline. Easy for other people to edit, easy to change language at runtime, etc.
The only reason (in my eyes) that you'd choose option 1 is if things other than strings are being localized such as needing different icons.
If it's just text? I say go with the XML.

Python modules for visualization of C++ code

I'm looking for python modules that can help with grepping C++ code. I have a large code base that I would like to do some analysis on. Ultimately I would like to come up with a graphical map of the software. There is lots of message passing going on amongst apps so I would like to be able to capture that information and present it visually. I have been looking around at some of the data visualization packages but have only stumbled on math and plotting related ones.
What are the best tools for this job, preferably in python?
Your best tool for the job is Graphviz. If you look at their gallery you'll find the sort of thing that you're interested in along with links to projects.
Under the language bindings section here there are a few python entries. Personally I don't use them as the dot language format is simple enough that you can build up fairly complex graphs from Python just using print statements.
You ca look at doxygen and see if it does (at least some part of) what you want. It generates call graph and class diagrams directly in html or xml format (I believe you need to have dot installed for fancy graphs).

Workflow to Turn Wiki content into a system manual

We're in the middle of deploying a new software system to lot's of users in lot's of places (200+ users over 8 countries). In the past we've written a manual for the users, then update it every so often. This works ok, in that all the users ahve the same manual and it covers the main things but it has it's problems, like it doesn't get updated that often, we sometimes miss updates, and some users will have old copies.
We've been talking about using a wiki during the testing and deployment phases to build a knowledge base about the system. Ideally we'd then like some way to convert that into some form fo electronic document that we can then 'pretty-fie' and send out as the official manual, as well as letting users use and update the wiki.
Has anyone else done anything similar ? Any suggestions for wiki systems, workflows, document formats etc?
Most wikis support export via PDF e.g.:
MediaWiki PDF Export
DokuWiki PDF Export
TWiki PDF Export
You can write something that generates LaTeX from the wiki and renders a manual to PDF. With packages like hyperref you can retain cross-references as hyperlinks.
Additionally, you can integrate content from multiple sources such as a data dictionary into the LaTeX document, which can be mixed and matched with the wiki content. You could also set the architecture up so it can support cross-referencing that goes either way.
Framemaker could also support this using generated MIF files, and you could also use Lout in a similar way or convert your wiki content to docbook, which would allow you to use any of the many rendering options available to that format.
As an aside, the following Stackoverflow postings discuss various systems for maintaining documentation.
Application (Not a Markup Language) for Producing a User Manual
Can LaTeX be used for producing any documentation that accompanies software?
What tools are used to write documentation?
What tools does your team use for writing user manuals?
How best to write documentation (ideally in latex) targeting both the web (html) and paper (pdf)?
Best tool(s) for working with DocBook XML documents?
What is the recommended toolchain for formatting XML DocBook?
Is a successor for TeX/LaTeX in sight?
Madcap Flare is a help-and-manual authoring tool that uses HTML for the source of each topic. You could pretty easily do a mass import of the Wiki pages. Would then require some cleaning but after that you have a nice single-source system that can output CHM, web-browsable help, PDF, DOC/DOCX, etc.
How are you storing the help source at the moment? Is it MS Word files, MS help, LaTeX?
If you put your help source files under version control then you will get all the benefits of a wiki without having to migrate to a new system - people can make edits to the help files easily - those changes can be tracked, reverted etc. and you get the prettified manuals as before.
I followed Node's links and came across some mediawiki pages that I thought were noteworthy.
Extension:OpenDocument Export
Extension:PDF Writer
Category:Data extraction extensions
I gave a previous answer which may be useful for the "wiki to PDF" part -- look at using the open source PediaPress code or functionality. You can get ODFs from it too, although their PDFs are already quite pretty (but you might want to rebrand it and restyle it for your company I suppose).