How to set up a single MediaWiki install in two languages? - wiki

I would like to set up MediaWiki in two languages without installing it twice and configuring the installs as a wiki family. The Language Selector extension allows the UI language to be switched, but it does not support content in multiple languages. The aim is to have corresponding pages in both languages (e.g. About Us in English, About Us in French) and a way to switch between them. Not all pages will have counterparts.

I see two ways to do this. One is instructional, e.g. "For English, use [[en/Page Title|Page Title]], and for Spanish use [[es/Título de la página|Título de la página]]."
The other is to use an extension that detects the language when a page is saved, edited, or encoded on the wiki, and modifies the page to add the en/ or es/ prefix to the page title or page link. That would be best, I think.

Related

A tool which checks that a local version of a site is fully translated (for continuous integration)

I'm working on a project in which we design a localized version of an existing site (written in English) for another country (which is not English-speaking). The business requirement is "no English text for all possible and impossible cases".
Does anyone know of checker software or a service that can verify that a site is fully translated, i.e. that there is no English text in it?
I know that there are sites for checking broken links, HTML validity, etc. I need something like http://validator.w3.org/checklink, but for checking that no page of the site contains English text.
The reasons I think this is needed are:
1. There is a lot of code which is common (both on backend and frontend) for all countries
2. If someone commits anything to the common code, I need to be sure that it will not lead to English text issues in the localized version.
3. From a business point of view, it is preferable that the site lacks some functionality rather than shows English text (legal matters).
4. The code on both frontend and backend changes a lot.
5. There are a lot of files that affect the text on the client's screen, not just one file with messages, unfortunately. Some of the messages come from the backend, but most of them are in the frontend.
6. Due to all these facts, someone currently fills in all the forms manually and checks them with his own eyes, and that happens before each deploy...
I think you're approaching the problem from the wrong direction. You're looking for an algorithm or web crawler that can detect whether any text is English or not? I don't know, but I doubt such a thing even exists.
If you have translated the website, you have full access to the codebase and/or translation texts, right? Can't you just open both the English and non-English strings files (.resx or whatever you are using) in a compare tool like Notepad++ and check the differences to see if there are any missing strings? Then check the source code and verify that all parts that can output user-displayable text use the meta:resourceKey property (or whatever you are using).
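The strings-file comparison can also be automated so it runs on every commit. Here is a minimal sketch in Python, assuming both files have already been loaded into plain key/text dictionaries (the loading step depends on your file format and is not shown). The function name and the sample keys are made up for illustration.

```python
def missing_translations(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Compare two key -> text mappings.

    'missing' lists keys that exist in the source but not in the target.
    'untranslated' lists keys whose target text is identical to the source
    text, which usually means nobody translated the entry.
    """
    missing = sorted(key for key in source if key not in target)
    untranslated = sorted(
        key
        for key in source
        if key in target
        and source[key].strip()
        and target[key].strip() == source[key].strip()
    )
    return {"missing": missing, "untranslated": untranslated}


# Hypothetical sample data standing in for the two strings files.
english = {"greeting": "Hello", "submit": "Submit", "cancel": "Cancel"}
localized = {"greeting": "Hallo", "submit": "Submit"}

report = missing_translations(english, localized)
print(report)
```

A CI job could fail the build whenever either list is non-empty. This only covers text that lives in the strings files; hard-coded text in the source would still need the manual check described above.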
If you want to go the way of crawling, I'm not aware of an existing crawler that does this, but it sounds like a combination of two simple issues:
Finding existing open-source code for a web crawler should be dead simple
Identifying a language through n-gram analysis is trivial if there's a limited number of languages the text can be in.
The only difficult part would be to ensure that the analyzer always has a decent chunk of text to work with. You could extract stuff paragraph by paragraph. For forms you'd probably have to combine the text of several form labels.
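To give an idea of what the n-gram part could look like, here is a rough sketch in Python that builds character-trigram profiles and picks the closest one by cosine similarity. The two training sentences are tiny made-up samples; a real checker would need much larger reference texts per language, and the crawler that extracts the paragraphs is not shown.

```python
from collections import Counter


def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in lowercased, whitespace-normalized text."""
    cleaned = " ".join(text.lower().split())
    padded = f" {cleaned} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def guess_language(text: str, profiles: dict[str, Counter]) -> str:
    """Return the language whose profile is most similar to the text."""
    sample = trigram_profile(text)
    return max(profiles, key=lambda lang: cosine_similarity(sample, profiles[lang]))


# Illustrative training samples only; far too small for production use.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then the user "
          "submits the form with their name and address",
    "de": "der schnelle braune fuchs springt über den faulen hund und dann "
          "sendet der benutzer das formular mit seinem namen und seiner adresse",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}

english_guess = guess_language("please enter your name and address", PROFILES)
german_guess = guess_language("bitte geben sie ihren namen und ihre adresse ein", PROFILES)
print(english_guess, german_guess)
```

Short inputs such as a single form label give unreliable results, which is why combining several labels into one chunk before classifying is worth doing.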

how to handle multiple languages on website

I have a website that I am translating into different languages. I have the content translated and stored in a database. I also wrote, into the PHP files, different mechanisms that will display the language based on a global define I set high in the code. I am happy with all of this. My question is how do I control this global define?
I currently have a JavaScript toggle that sets a cookie and then reloads the current page. Every subsequent page just reads that cookie to set the global define. It works very well, but I am running into two big problems. (1) I can't have a URL to send to somebody that has the language in it (I could do something like domain.com/forwarder.php?lan=spanish&gotopage=page.php that would set a cookie and then forward, but that's ugly). And (2), search engines can't view the multiple languages since they don't really use cookies and JavaScript.
So how do I solve this? Does anybody have experience in this? Can you share your experiences?
I'm leaning towards just using the URL and dropping the cookie; that seems popular among various international sites I've seen. So I'm guessing the URLs would be:
domain.com/page (for English, equivalent to domain.com/en/page)
domain.com/es/page (for Spanish)
domain.com/fr/page (for French)
etc.
Is this a good idea? I will have to go through my code and prepend the language code to all my hrefs, which might be a pain.
So does anybody have any comments on this? Is this a good plan? Am I failing to realize something?
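For illustration, the URL scheme described above comes down to two small helpers: one that splits the language prefix off an incoming path, and one that builds hrefs so they do not have to be edited by hand. This is only a sketch, written in Python rather than PHP; the function names and the list of supported languages are assumptions.

```python
SUPPORTED_LANGUAGES = ("en", "es", "fr")  # assumed set of site languages
DEFAULT_LANGUAGE = "en"


def split_language(path: str) -> tuple[str, str]:
    """Return (language, path without the language prefix).

    Paths without a recognized prefix are treated as the default language,
    so /page and /en/page resolve to the same content.
    """
    stripped = path.lstrip("/")
    first, _, rest = stripped.partition("/")
    if first in SUPPORTED_LANGUAGES:
        return first, "/" + rest
    return DEFAULT_LANGUAGE, "/" + stripped


def localized_href(path: str, language: str) -> str:
    """Build the href for a path in the given language.

    The default language gets no prefix; every other language is prefixed.
    """
    _, bare = split_language(path)
    if language == DEFAULT_LANGUAGE:
        return bare
    return f"/{language}{bare}"


print(split_language("/es/page"))
print(localized_href("/page", "fr"))
```

Routing every link through one helper like `localized_href` means the language code is added in a single place instead of in each href.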
It's been a long time, but can't you use $_SERVER["HTTP_ACCEPT_LANGUAGE"] to set it automatically? Then, prior to writing the cookie for the first time, show a message on the screen, in English or another language from the array, asking whether this is the correct language, with a drop-down of available languages. Once one is selected, store it as the default website language.
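The Accept-Language value is a comma-separated list of language tags with optional q weights, so it needs a little parsing before it can be matched against the site's languages. A rough sketch of that step, in Python for illustration (the function names and the supported-language tuple are assumptions, and only the primary subtag is matched):

```python
def parse_accept_language(header: str) -> list[str]:
    """Return language tags from an Accept-Language value, best match first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header: str, supported: tuple[str, ...], default: str = "en") -> str:
    """Pick the first supported language the browser asked for, else the default."""
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default


print(pick_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5", ("en", "es", "fr")))
```

The result is best treated as a first guess that the visitor can override from the drop-down.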
You can use string constants in global resource files. Have only one website that calls those string constants based on the current language.

Change the language of a Sitecore item (tree)?

We have a small website which was developed with the default language ('en'), without paying any mind to the language or versioning capabilities of Sitecore (ouch!). We simply forgot to set the correct default language at the start of the project.
Now we have an entire content tree of 'en' items, when they should be 'nl-NL' (it's a Dutch site). I am wondering if there is an easy way of changing the language for all items in that tree (one that does not involve hacking).
I found this Q&A, but it just talks about setting the default language. I'd like to do that, yes, but I would also like to set the correct language for the existing items (and their versions).
Thoughts?
From what I remember, we had a similar problem before. Ours was not a website filled in the wrong language, but empty content that had to be filled with default English content after a new language was created. What we did was export the language. In your case you could export the English language, create a Dutch language, and replace the language values in the exported English XML file with nl-NL.
After you've done that, you can import the language file as the Dutch language and all items are filled.
To me this sounds like the easiest and quickest approach, since you only have to search and replace some XML tags.
Good luck!
You could write a .NET program that goes through your whole content tree and updates the language parameter of each item accordingly. The Sitecore APIs give you access to almost everything you see in the backend (including content manipulation), so it shouldn't be much of a problem to automate this task.
As an alternative, you could copy your whole content from one language to another and then remove the language you don't want. Here's how to do it.
I'll caveat that my experience with it is limited, but from what I've seen, the Sitecore Rocks plugin for Visual Studio might allow you to script this.
http://visualstudiogallery.msdn.microsoft.com/44a26c88-83a7-46f6-903c-5c59bcd3d35b

How Do Search Engines See A Localized Django Site?

I have a Django site that uses the localization middleware in combination with gettext and the trans/blocktrans template tags to show visitors different pages depending on the preferred language in their user agent string (which seems to be the standard way of doing things in Django).
This works great for supported languages (currently only Spanish, English, and German with more coming). If I set the preferred language in my browser to a different language, I get the pages for that translation. However, I have no idea how it appears for search engines.
When a search engine crawls a site, does it typically have a preferred language in its agent string? Will German spiders get the German site and will Spanish ones get the Spanish site, or will they just get the default English site that's displayed when a user has no language set? Does this vary by search engines and is there a "standard way" of doing things that individual crawlers may or may not stick to?
Bots typically do not send an Accept-Language header, which means that Django will serve your default language.
Regional search engines can have bots with Accept-Language set to whatever they prefer, but you cannot rely on that.
It is best to have different pages for each language, such as http://your.website.com/english/,
and then set up a redirect in your middleware to the right language page if a specific Accept-Language is present.
Don't rely on what the search engine may do in this regard. You want all versions to be crawled. To achieve that:
Have different URLs for the different language versions.
Make sure the search engines can find the different versions.
Overall, I believe that the way I did it on my homepage is close to ideal in regard to both search engines and regular users:
When a user arrives at, e.g., brazzy.de/index.php, the site tries to determine the language from the cookie (if present) or browser settings (Accept-Language header), defaults to English, and does not redirect.
Every page has links to the different language versions of that page (IMO the most important factor for user convenience, and also makes sure search engines can easily find the different versions).
These links lead to e.g. brazzy.de/en/index.php, which is in my case rewritten to brazzy.de/index.php?lang=en - this ensures that search engines see distinct URLs for the different language versions.
Visiting such a subdirectory sets the language cookie to that language.
The pages without a language-specific URL (i.e. where the language depends on client data) use e.g. <link rel="canonical" href="/en/"> to tell the search engine at which language-specific URL that page can be found.
Use XML sitemaps to further make sure search engines can find all pages and all different language versions.
Use hreflang annotations (these are <link rel="alternate" hreflang="..."> elements rather than a meta tag), but make sure you use different URLs for different languages. Even better, use different domain extensions (example.de, example.es) in conjunction with the Django sites framework.
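As a small illustration of the hreflang suggestion, the alternate links for a page can be generated from the list of languages and placed in the page head. This sketch assumes a /<language>/<path> URL layout like the one described earlier in this thread; the helper name and the example.com base URL are made up.

```python
from html import escape


def hreflang_links(base_url: str, path: str, languages: list[str]) -> str:
    """Build one <link rel="alternate"> element per language version of a page."""
    lines = []
    for language in languages:
        href = f"{base_url}/{language}{path}"
        lines.append(
            f'<link rel="alternate" hreflang="{escape(language)}" href="{escape(href)}">'
        )
    return "\n".join(lines)


links = hreflang_links("https://example.com", "/about/", ["en", "es", "de"])
print(links)
```

In a Django project the same output could come from a template loop or a custom template tag, so that every language version of a page lists all the others.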

Workflow to Turn Wiki content into a system manual

We're in the middle of deploying a new software system to lots of users in lots of places (200+ users across 8 countries). In the past we've written a manual for the users, then updated it every so often. This works OK, in that all the users have the same manual and it covers the main things, but it has its problems: it doesn't get updated that often, we sometimes miss updates, and some users will have old copies.
We've been talking about using a wiki during the testing and deployment phases to build a knowledge base about the system. Ideally we'd then like some way to convert that into some form of electronic document that we can prettify and send out as the official manual, while still letting users use and update the wiki.
Has anyone else done anything similar? Any suggestions for wiki systems, workflows, document formats, etc.?
Most wikis support export to PDF, e.g.:
MediaWiki PDF Export
DokuWiki PDF Export
TWiki PDF Export
You can write something that generates LaTeX from the wiki and renders a manual to PDF. With packages like hyperref you can retain cross-references as hyperlinks.
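To make the LaTeX idea a bit more concrete, here is a very small sketch in Python that converts a few MediaWiki-style constructs (headings, bold, italics, internal links) into LaTeX, turning internal links into hyperref cross-references. It is only an illustration: the function names are made up, it ignores tables, templates, lists and images, and it does not escape LaTeX special characters.

```python
import re


def slug(title: str) -> str:
    """Turn a page or section title into a label-safe identifier."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def _link_to_hyperref(match: re.Match) -> str:
    """Convert [[Target]] or [[Target|Label]] into \\hyperref[target]{Label}."""
    target, _, label = match.group(1).partition("|")
    return f"\\hyperref[{slug(target)}]{{{label or target}}}"


def wiki_to_latex(wikitext: str) -> str:
    """Convert a small subset of wiki markup to LaTeX, line by line."""
    out = []
    for line in wikitext.splitlines():
        heading = re.fullmatch(r"(={2,3})\s*(.+?)\s*\1", line.strip())
        if heading:
            command = "section" if len(heading.group(1)) == 2 else "subsection"
            title = heading.group(2)
            out.append(f"\\{command}{{{title}}}\\label{{{slug(title)}}}")
            continue
        line = re.sub(r"'''(.+?)'''", r"\\textbf{\1}", line)  # bold
        line = re.sub(r"''(.+?)''", r"\\emph{\1}", line)      # italics
        line = re.sub(r"\[\[(.+?)\]\]", _link_to_hyperref, line)
        out.append(line)
    return "\n".join(out)


latex = wiki_to_latex("== Setup ==\nSee [[Setup]] for '''details'''.")
print(latex)
```

The output would still need a preamble that loads hyperref, and anything beyond this subset is probably better handled by an existing converter than by hand-written regular expressions.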
Additionally, you can integrate content from multiple sources such as a data dictionary into the LaTeX document, which can be mixed and matched with the wiki content. You could also set the architecture up so it can support cross-referencing that goes either way.
FrameMaker could also support this using generated MIF files. You could also use Lout in a similar way, or convert your wiki content to DocBook, which would allow you to use any of the many rendering options available for that format.
As an aside, the following Stack Overflow postings discuss various systems for maintaining documentation.
Application (Not a Markup Language) for Producing a User Manual
Can LaTeX be used for producing any documentation that accompanies software?
What tools are used to write documentation?
What tools does your team use for writing user manuals?
How best to write documentation (ideally in latex) targeting both the web (html) and paper (pdf)?
Best tool(s) for working with DocBook XML documents?
What is the recommended toolchain for formatting XML DocBook?
Is a successor for TeX/LaTeX in sight?
MadCap Flare is a help-and-manual authoring tool that uses HTML for the source of each topic. You could pretty easily do a mass import of the wiki pages. They would then require some cleaning, but after that you have a nice single-source system that can output CHM, web-browsable help, PDF, DOC/DOCX, etc.
How are you storing the help source at the moment? Is it MS Word files, MS help, LaTeX?
If you put your help source files under version control, you get many of the benefits of a wiki without having to migrate to a new system: people can make edits to the help files easily, those changes can be tracked, reverted, etc., and you get the prettified manuals as before.
I followed Node's links and came across some MediaWiki pages that I thought were noteworthy.
Extension:OpenDocument Export
Extension:PDF Writer
Category:Data extraction extensions
I gave a previous answer which may be useful for the "wiki to PDF" part: look at using the open source PediaPress code or functionality. You can get ODFs from it too, although their PDFs are already quite pretty (but you might want to rebrand and restyle them for your company, I suppose).