How to create a report generator in word - templates

I am currently trying to figure out to to make it so I can load a template for word and be prompted to fill in data that will then be populated on the word document. The report has a 2 main sections. The first in the over view:
The summary will be calculated automatically.
The second section is the summary where data will be appended:

There are several approaches to report generation in Word, each with its pros and cons:
1. Automate report generation in MS Word itself (or other VBA enabled MS Office applications), using VBA modules:
In this case you will need to have MS Word installed on the machine, where the report generation will take place. You will also have to allow execution of scripts in Word VBA modules. After the document is open, VBA can start executing and there you can do pretty much anything - like connecting to some source of external data, filling this data into the Word template. I have been using bookmarks in Word as placeholders for the data. In this way you can also fill tables - just put the bookmark in the first cell, jump to this bookmark and then programmatically fill data one cell at the time and then programmatically move to the adjacent cell and when you are in the last cell in a row the new row opens and you are in the first cell of the new row.
However, you have to be careful how to release WINWORD processes after you are done with document generation. VBA is no longer supported in Office 2013, so this is probably not the best solution.
2. Use Interop assemblies in .NET to do this programmatically: To start with, there is an article on MSDN about using a Word template and manipulate it programmatically. You can find this article here. Again you will need Word installed on the machine where you run the code. You can achieve results very quickly, but note that it is not designed run on a server, because you may soon have performance issues and memory leaks.
3. Use OperXML SDK to programmatically manipulate Word documents from managed .NET application: This gives you total freedom in document generation, but the learning curve is long and steep. Depending on the scale of your business domain you may consider investing your time in learning this technology.
4. Use some of the third party Tools:
In my case I didn't want to use OpenXML. I have replaced VBA-based report automation as explained above with SDK toolkit for Word template generation and report automation. You can check this link for an example. So far this toolkit has covered all our needs, so I am not looking any further. I prepared my first templates about 3 years ago using this toolkit and they smoothly survived migrations from Office 2007 to 2010 to 2013. Now I have a library of about 90 different Word templates which are being used by different applications, using this toolkit in the background. Its kind like 'set it and forget it', and it saves me quite a bit of time compared to VBA document automation which we used before.

Related

A tool which checks that a local version of a site is fully translated (for continuous integration)

I'm working on a project, in which we design a localized version of an existing site (written in English) for another country (which is not English-speaking). And the business requirement is "no English text for all possible and impossible cases".
Does anyone know if there is a checker software/service which could check if a site is fully translated, that is which checks that there are no English text in it.
I new that there are sites for checking broken links, html validity etc, I need something like http://validator.w3.org/checklink but for checking that on all pages of the site there is no English text.
The reasons I think this way is needed are:
1. There is a lot of code which is common (both on backend and frontend) for all countries
2. If someone commits anything to the common code I need to be sure that this will not lead to english text issues in localized version.
3. From business point of view it is preferable that site does not support some functionality, than it shows english text ( legal matters)
4. The code both on frontend and backend changes a lot
5. There are a lot of files which affect text on the client's screen. Not just one with messages, unfortunately. And some of messages comes from backend, but most of them are in frontend
6. Due to all those fact currently someone manually fills all the forms and watch with his own eyes, and that is before each deploy...
I think you're approaching the problem from the wrong direction. You're looking for an algorithm or webcrawler that can detect wether any text is English or not? I don't know, but I doubt such a thing even exists.
If you have translated the website, you have full access to the codebase and/or translation texts, right? Can't you just open both the English and non-English strings files (.resx or whatever you are using) in a comparetool like Notepad++ to check the differences to see if there are any missing strings? And check the sourcecode and verify that all parts that can output user-displayable text use the meta:resourceKey property (or whatever you are using).
If you want to go the way of crawling, I'm not aware of an existing crawler that does this, but it sounds like a combination of two simple issues:
Finding existing open-source code for a web crawler should be dead simple
Identifying a language through n-gram analysis is trivial if there's a limited number of languages the text can be in.
The only difficult part would be to ensure that the analyzer always has a decent chunk of text to work with. You could extract stuff paragraph by paragraph. For forms you'd probably have to combine the text of several form labels.

Getting the amplitude(or rms voltage) of audio signal captured in C++ by wavin lib.?

I am working on a very basic robotics project, and wish to implement voice recognition in it.
i know its a complex thing but i wish to do it for only 3 or 4 commands(or words).
i know that using wavin i can record audio. but i wish to do real-time amplitude analysis on the audio signal, how can that be done, the wave will be inputed as 8-bit, mono.
i have thought of divinding the signal into a set of some specific time, further diving it into smaller subsets, getting the average rms value over the subset and then summing them up and then see how much different they are from the actual stored signal.If the error is below accepted value for all(or most) of the sets, then print the word.
How can this be implemented?
if you can provide me any other suggestion also, it would be great.
Thanks, in advance.
There is no simple way to recognize words, because they are basically a sequence of phonemes which can vary in time and frequency.
Classical isolated word recognition systems use signal MFCC (cepstral coefficients) as input data, and try to recognize patterns using HMM (hidden markov models) or DTW (dynamic time warping) algorithms.
You will also need a silence detection module if you don't want a record button.
For instance Edimburgh University toolkit provides some of these tools (with good documentation).
If you don't want to build it "from scratch" or have a source of inspiration, here is an (old but free) implementation of such a system (which uses its own toolkit) with a full explanation and practical examples on how it works.
This system is a LVCSR (Large-Vocabulary Continuous Speech Recognition) and you only need a subset of it. If someone know an open source reduced vocabulary system (like a simple IVR) it would be welcome.
If you want to make a basic system from your own, I recommend you to use MFCC and DTW:
For each target word to modelize:
record some instances of the word
compute some (eg each 10ms) delta-MFCC through the word to have a model
When you want to recognize a signal:
compute some delta-MFCC of this signal
use DTW to compare these delta-MFCC to each modelized word's delta-MFCC
output the word that fits the best (use a threshold to drop garbage)
If you just want to recognize a few commands, there are many commercial and free products you can use. See Need text to speech and speech recognition tools for Linux or What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition? or Speech Recognition on iPhone. The answers to these questions link to many available products and tools. Speech recognition and understanding of a list of commands is a very common problem solved commercially. Many of the voice automated phone systems you call uses this type of technology. The same technology is available for developers.
From watching these questions for few months, I've seen most developer choices break down like this:
Windows folks - use the System.Speech features of .Net or Microsoft.Speech and install the free recognizers Microsoft provides. Windows 7 includes a full speech engine. Others are downloadable for free. There is a C++ API to the same engines known as SAPI. See at http://msdn.microsoft.com/en-us/magazine/cc163663.aspx. or http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx
Linux folks - Sphinx seems to have a good following. See http://cmusphinx.sourceforge.net/ and http://cmusphinx.sourceforge.net/wiki/
Commercial products - Nuance, Loquendo, AT&T, others
Online service - Nuance, Yapme, others
Of course this may also be helpful - http://en.wikipedia.org/wiki/List_of_speech_recognition_software

Upgrade Path For Legacy Reporting ("Embedded" Access 2003)?

Update: Albert D. Kallal has kindly started the discussion off, and to get some more opinions I'm adding a bounty.
This is a nontrivial question about maintenance of a legacy application myself and two other developers support. We are not the original developers, and the code base is 300,000 lines of MFC and business logic tightly coupled together. We don't know every single line of code 100%.
We do know the code behind the major components, and we know that it's poorly written. Our objective is to refactor the application out of 1995 and into 2010. Between the three of us there is (in aggregate) enough experience in software architecture and database design for us to fix the components that are poorly architected in code or incorrectly modelled in the database, but we don't have a lot of experience with modern reporting systems. Thus my question (once you get to the end of it...) is about reporting systems.
For anybody who reads this entire post, I am appreciative of your time. For anybody who reads this post and replies with solutions, experience (or sympathy!), I am both appreciative and thankful.
At work I have inherited the maintenance of an Access 2003 database that contains approximately 250 reports (and thousands of supporting queries) that acts as a reporting engine for our application.
The reports all have swathes of VBA in them for particular formatting or pulling extra information into the report. For this reason we are entirely locked into the Access platform, we can't use tools like BIDS to import the Access report objects without messing around to make the report display the same without VBA.
So to get ourselves out of this Access solution we need to put some time in going over every single report. Which means we're looking to pick the best longterm solution, since we're going to have to redevelop every report regardless of the platform we choose.
Furthermore our customers have a choice of Microsoft Access or SQL Server as their database. This means that all our SQL has to be written with the lowest common denominator in mind - JET SQL. We've got some wiggle room to drop support for Microsoft Access, but we'd need to build a case for it. If the best reporting system we can identify has strong support for SQL Server but little or no support for Microsoft Access this will accelerate us dropping support for Microsoft Access as a database.
The overall implementation of the report system is quite mediocre, when we want to display reports in our application we start a Microsoft Access process, find its window and reparent it to our application, strip off its window styles and then use the Access.Application COM interface to invoke some VBA that creates linked tables to the database (either a Microsoft Access MDB or a SQL Server database) and then opens up the report we want. Probably the only supported part of the process is using the public COM interfaces, the rest is an ugly hack. The other components in the application are equally underwhelming.
To "fix" our application we've got a new development plan, with development of our application split into (approximately) three parts every year.
4 months upgrading our application to support the latest government legislation in our industry
4 months delivering a new major feature
4 months "consolidation" (fixing what is broken)
We're currently at #3 now (for this year), and we really want to take advantage of the downtime to fix up the application, refactoring the major components. We have three developers, and want AppName v5.0 out at the end of 2012 (it's currently AppName v4.12). This gives us 36 months of development effort to approportion between several components (user interface, underlying database structure, reporting, etc) over the three consolidation periods we will have before then. The sum of the components that we fix will give us v5.0.
We've scoped out what we'd like to do with most of the components except for our reporting engine, and I'm posting on SO in the hope of getting some good ideas, or at least a feel for the work that's required.
I have two ideas for improving our reporting system. Both of them involve a moderate amount of work, and there is one consideration that neither solution addresses completely: in addition to the reports that we develop, our customers also have the opportunity to request bespoke development of reports. They're customer-specific, we take their Access database, augment it with their report and give it back to the customer. There's hundreds of unique reports out there - unusable if we turned the old system off. (And we have to turn the old system off eventually - we don't know how much longer we're going to be able to mess around with the Microsoft Access window to make it look like an embedded report. We already have two distinct code paths for Access 2003 and 2007. What if we can't hack up a code path for Access 2010 and all our customers have to use Access 2007?)
For both ideas, the intention is to stop supporting our current reporting system and let it run for as long as it will without maintenance. Maybe we can hack in Access 2010 and Access 2014 support, and the customer reports that were developed keep putting along for 5 more years. Over time, we'd migrate the most commonly used reports from the old Access database into their new format.
Idea 1: Microsoft.Reporting.WinForms.ReportViewer
The first idea is to write a wrapper around the ReportViewer control as a replacement reporting engine.
We'd need to move the project to C++/CLI (already on the cards), and instead of having to launch an entire process each time we needed to view a report we could simply instantiate this control. A bonus of this that the RDLC files that contain the reports are much easier to version control in Subversion than the Access 2003 database we currently have (we use Visual SourceSafe because the tools to integrate SVN with Access don't work well with the size of our Access database). The visual designer for RDLC files is also nicely integrated into Visual Studio.
This is more of an evolutionary rather than revolutionary change to the way we do reports, the ReportViewer control will take an RDLC file that has the report layout, and our application will take care of querying the data. Because our database might be SQL Server or Microsoft Access, we still have to write simple JET SQL. We're gaining better reporting (drill down looks nice), stronger authoring tools and easier version control, but is this worth the effort?
Idea 2: SQL Server Reporting Services and SharePoint 2010 with Access Services
The second idea is to kill Access as a database platform and migrate all our customers to SQL Server (we have hosted instances of our application for those customers who don't have the skill set to set up their own SQL Server instances). Once they're migrated we would use SQL Server Reporting Services as the reporting engine, with the ReportViewer control in server rendering mode.
In addition to SQL Server Reporting Services, I am curious as to whether SharePoint 2010 with Access Services could be used to rapidly migrate existing Access reports into a more manageable format. We'd take the Access report that the customer uses, convert it to an Access Web Report then make it available for them on a SharePoint site. This would only be for our hosted customers, but if we find a way to deal quickly massage the VBA out of customer reports we could churn through the several hundred custom reports our customers have.
I'm also interested in the ability to use an Access Web Navigation Form to act as a portal to all our reports. We'd host a web browser control inside our application which would give customers access to their own reports and to our standard suite.
We'd get all the benefits of Idea #1 plus the ability to write in full Transact SQL, a reports portal, and (hopefully) a reasonable upgrade path for customer's proprietary reports.
So, my question is: am I going about this the right way? Are these viable solutions for modern reporting systems, or laughable? We have a strong preference for using the ReportViewer control either in client rendering mode where our application processes the data, or in server rendering mode in conjunction with SQL Server - but are there reporting systems like Crystal Reports which offer better reporting and better migration paths for our legacy Access reports?
If you had up to 36 months of developer time, how would you do this?
Well, ok, no one else jumping in, I give this a go.
Quite interesting how you talking about a report writer that 15+ years old. Back then the Access report writer was beyond state of the art. It was a country mile ahead of everything else in the industry. Even today a lot of competing report writers don't have the concept of sub reports that allows modeling of relational data without having to resort to code or even SQL. Then, throw in programmable VBA, then the result is something that's very unique and powerful.
For access 2007, the report writer received some more nice upgrades in terms of layout controls but that going to be of little help here.
And, for 2010 we can now display reports in a sub-form control. This feature was added to facilitate use of the new access navigation control. Access 2010 has a new web browser control (works in forms or reports), and there also a new navigation control. Your post hints that the new navigation control and the web control are somehow related to each other but they completely different features.
Both the new web browser control, and navigation control can be used in both web appliations or 100% client only applications. The navigation control is nice since you can build that nav contorl by drag and dropping reports onto the nav control to build up a up a list of reports to choose from (it is slick and easy and nice). And with this navigation control, we can actually build some nice drill down type of interfaces for reports.
As you noted for access 2010 we now have web publishing of access reports and this feature is based on SQL server reporting services (they are RDL reports). However, two important issues here is no VBA is allowed inside of the web reports. And, I also point out that there is no automatic conversion utility that is built into access that will convert existing reports into web based reports. So to build a report that's going to be designated and published to the web, you have to choose specifically to create a web report to accomplish this goal. So this answers and clears up one question of yours of will this help you convert existing reports to SQL server, and the answer is no. So, Access will not help you convert existing reports to web based RDL reports (As noted, Access uses RDL and sql reporting for those web reports - those reports also render in the access client side without conversion).
Access has a great path for web based reports via SharePoint and also Access Web is coming to Office 365. However, keep in mind this ability is not going to help much with the existing reports that you have.
In fact one of the things I would be looking at if you're going to use winforms report viewer is the change in where that existing VBA report code will be moved to? You not really mentioned this issue. As noted one really interesting and great feature of those reports is that imbedded VBA code. Often that VBA will have been used because SQL and something like RDL will NOT work because neither of those languages (sql, and RDL) are procedural code.
I can't stress how important this concept is. So, this quite much means any report writer replacements means that code will now have to be OUTSIDE of the reports and moved into your application. So, keep this issue in mind as now when you issue new reports, you also be issuing new procedural code that NOT be contained in those reports. This code will have to become part of your application (so, to issue new reports, you will thus also be issusing a new version of your software).
You are not likely to find much that allows procedural code to be imbedded inside the report like you can with access. So, that report code and logic will now have to be built and maintained within your main application and outside of the reports.
At the end of the day, I should point out the old adage if it ain't broke, then don't change it. Access been around for a very long time, but we seen significant investments from the folks in Redmond into this product during the last few years, so it shows no signs of dying anytime soon.
So, one possible suggestion is to keep the status quo, and continue going the way it works now. I mean you stated that you have to continue supporting JET for this anyway so you not getting away from having to use a major part of Access anyway. So, you continue to have to use JET engine anyway. So, you just dumping the report side and you still have use the JET data engine anyway.
However, assuming this decision's been made, I can't really suggest what report writer you should replace the access one with. Obviously considerations for the next report writer should have a seamless path to web even if they are NOW going to be rendered on the desktop. It makes no sense to make a large investment today without web considerations in some fashion.
I do think SQL server reporting services is a good choice due to the web ability. And, as an access developer we also have the option to create web based reports but they also render perfect in the access client on the desktop side (and this works when you have no server and no conversion issues exist when publishing these reports to the web, or using them local on the client). So, even if you don't use access, do choose something that allows reports to render both desktop and web like access 2010 allows.
I would consider building the report system around some .net tools. This would likely not play too well as an embedded report system inside of your existing application, but it would allow you to issue new reports, and you would not have to touch your existing code base for each new report issued. This issuing of new reports that have procedural code needs to be resolved. You likely can now issue new reports without having to modify the main application because those reports can contain code inside. I would be looking to use something that would allow new reports to be built and issued but you not having to issue new edition of your main software. You might not embeed the code in the reports anymore, but you need to palce it somewhere, and hopefully outside of your main application.
Wow, this is a great question and Albert has given you a teriffic answer.
Unfortunately I do not believe there are any magic bullets to solve your problem. I have used Microsoft Access since it's first version, and always felt it's strongest feature was as a report generator, particulary when used with SQL Server. As you undoubtably know, one can often have issues with corrupted Access databases in a multi-user environment and SQL Server addresses that issue very nicely.
To my way of thinking the biggest problem with Access is that Microsoft brought out managed code (.Net) ten years ago now but Access is still a native application. In an ideal world Microsoft would rewrite Access in C# using all the latest features such as improved support for multiple processors etc. Unfortunately I do not expect this to happen any time soon.
Visual Basic for Applications (VBA) was definately far abead of "state of the art" when it was introduced, but today I believe most would agree that coding in VB.Net with Visual Studio is much more productive than continuing to develop in VBA.
SInce the selection of a new report generator is something you will need to live with for several years, perhaps it would be helpful to consider what an "ideal" report generator for the next ten years should look like?
Personally I would want:
1) All the great graphics and ease of skinning and branding that Silverlight provides.
2) Great multiprocessor support (you must have noticed how the UI thread in Access often appears "unresponsive" when running long queries or reports).
3) Support for lots of devices such as cellphones, iPads etc. While today the desktop and web dominate, these are becoming increasingly important (unless for some particular reason they are not important to your customers going forward).
4) Support for modern programming practices such as test driven development, dependency injection etc.
Please do let us know what you decide upon.
This is a long shot, but is there a possibility of using Access to generate a saved PDF and displaying that in your app in a PDF-viewing control that is part of your app, rather than external? Or export to XML or something (I haven't a clue what XML export options are available for reports in recent versions of Access, if any)?
The point is that you'd not have to rewrite the Access reporting logic, but you'd have eliminated the fake embedding and replaced it with something that's really embedded in your application.
What you'd be giving up is the perhaps the options that the Access UI gives the user, but I'm not sure how useful that is (I'd tend to not want those options available!).
Also, you'd be persisting the reports to disk, but I'm not sure this is much of any kind of significant issue, either, but it would entirely depend on the context (I'm assuming you have no 1000-page reports with heavy graphics, etc.).
You could take a look at ActiveReports by Data Dynamics. We use it within our apps for paperwork type reports (eg, invoices) and it's extremely flexible, far more so than what you can achieve with the MS reporting tools. For reports that are genuine reports rather than paperwork we use reporting services. It's been a while since I had to port an access report to active reports, but there is little or nothing you could do in access that you can't do in active reports. I'm also fairly certain that it has a decent tool for import access reports. There's a fully functional evaluation version available for download, which, unless they've changed things, just printes a watermark in the report footer rather than expires after a fixed evaluation period. Well worth a look, I'd say - Here's a link to their site
I won't get into any specifics since I'm not a Microsoft developer, but I can answer on how to integrate a legacy product into the current or new product. As for the 36-month question, see the end of this answer.
Usage Requirements - how do you intend to use the legacy code in the context of your new code?
Identify Use Cases - drill down into usage and create a use case for each transaction between the new and old code.
Identify I/O - drill down into each use case and identify I/O requirements
Write Tests - for each I/O pair, write tests to determine the best way to handle that I/O pair.
Reuse - reuse your tests to create a wrapper/API for the legacy code.
Future - as you replace legacy code with new code, let it match your wrapper/API so you can keep refactoring to a minimum.
If I had 36 months of development time to spend, I would spend 3-6 months writing a wrapper/API and then replace each unit tested I/O pair with new code every 7-10 days utilizing sprints (scrum/agile).
For the data store, I would absolutely move from Access to some SQL server product and prioritize that requirement for the new code.
I've used Crystal, Access (2 - 2007), SQL Reporting and now DevExpress and am very happy with DevExpress's reporting engine. It is specific to .net, but can be utilized by Windows Forms, ASP.net Web pages, WPF and Silverlight. If you are willing to utilize some .net controls, I highly recomend it. It can use just about anything as a datasource and is very flexible. My current projects aren't as complex as some things I have done in the past, but I would venture to say that I would rather do complex reports using the DX engine over any other I have used.
They have an End User designer that includes scripting capabiliities and DX is actively adding functionality.
I would recommend taking a look at: http://devexpress.com/Products/Index/Reporting.xml

How to replace text in a PowerPoint (.ppt) document?

What solutions are there? I know only solutions for replacing Bookmarks in Word (.doc) files with Apache POI?
Are there also possibilities to change images, layouts, text-styles in .doc and .ppt documents?
I think about replacement of areas in Word and PowerPoint documents for bulk processing.
Platform: MS-Office 2003
What are your platform limitations?
Obviously Apache POI will get you at least part of the way there.
Microsoft's own COM API's are fairly powerful and are documented here. I would recommend using them if a) you are not running in a server (many users, multithreaded) environment; b) you can have a proper version of powerpoint installed on the production machine; and c) you can code against a COM object model.
It's a bit pricey, but Aspose.Slides is a very powerful library for manipulating PowerPoint files
If you include using other Office suits as an option, here's a list of possible solutions:
Apache POI-HSLF
PowerPoint 2007 APIs
OpenOffice.org UNO
Using POI you can't edit .pptx file format, but you don't depend on the apps installed on the system. Other two options, on the contrary, make use of other apps, but they are definitely better for dealing with presentations. OpenOffice has better compability with older formats, by the way. Also if you use UNO, you'll have a great choice of languages, UNO exists for Java, C++, Python and other languages.
My experience is not directly with Power Point, but I've actually rolled my own WordML (XML) generator. It a) removed all dependencies on Word, b) was very fast c) and let me build up documents from scratch.
But it was a lot of work to create. And I was only creating a write only implementation.
I'm not as familiar with Power Point, so this is conjecture, but you may be able to roll your own by reading XML (Power Point 2003??) and/or cracking the Office Open XML file (zipped XML), then using XPath to manipulate the data, and then saving everything back to disk.
This won't work on older OLE Compound Document based Power Point files though.
I've done something like that before: programmatically accessed and manipulated PowerPoint presentations. Back when I did it, it was all in C++ using COM, but similar principles apply to C#/VB .NET apps, since they do COM interop very easily.
What you're looking for is called the Office Document Model. Basically, Office applications expose their documents programmatically, as trees of objects that define their contents. These objects are accessible via an API, and you can manipulate them, add new ones, and do whatever other processing you want. It's exceedingly powerful; you can use it to manipulate pretty much all aspects of a document. But you'll need an installation of Office and Visual Studio to be able to use it.
Some links:
Intro: http://msdn.microsoft.com/en-us/library/d58327k6.aspx
Hope this helps!
Apparently new users can only include one link per posting. How lame! :)
Here's the other link I meant to include:
Example of manipulating PowerPoint documents programmatically: http://msdn.microsoft.com/en-us/library/cc668192.aspx

Enterprise-grade template printing system

I'm looking for an enterprise-grade template printing system. I'm interested in every software I can get my hands on to evaluate. Commercial or not.
What I need - a separate system ready to receive tags in order to print (digital or paper) a template (like a contract, invoice, etc). Templates should be managed by the same software. It should operate via web services or via enterprise bus (preferable JMS or MQSeries connectors).
Can I ask for some names and possibly some URLs? Anything will be helpful even if it does not fit the requirements exactly.
Thanks.
This is an old question, but for the Googlers out there, we use a couple of products to render documents in XSL-FO (a W3C standard paper specification that we generate using XSL) either to PDF, PostScript, etc. We use it to show documents online as well as bulk print a few hundred thousand of them monthly.
RenderX (.NET, Java, whatever)
provides a very powerful solution for
our bulk printing needs
IBEX PDF Creator (.NET
only) for online rendering to PDF
Calligo is a commercial package from InSystems. Can't reach the web site right now; could be a bad sign.
Then there are these open source possibilities.