Comparing Oracle Forms modules

We've inherited a large Oracle project that was originally Oracle 9 and is now Oracle 11.
We don't have much confidence in the Forms change register.
Are there any tools we could use to compare the two Forms projects?
(Because two different versions of Forms are involved, there will be differences even where the underlying code hasn't changed.)

The general consensus of the Oracle Tools Developers Users Group is that the ORCL Toolbox FormsTool diff capability is the best available. It is well worth the cost of a license if you have an ongoing need to diff forms.
The problem with the FMB-to-FMT conversion is that the queries in the forms are converted to a binary format which is unreadable by humans.
Forms version 10 can convert FMB files to XML, which you can then read and diff.
Don't assume that just because you are using two versions of Forms, the underlying FMB files need to change. I've taken FMB files for Forms 4.5 and run them through the compilers for Forms 4.5, 6i and 9i, and gotten all of them to work just fine.
Which versions of Forms are you using?

No answers yet but doing some research:
Caffo Forms Diff, which is free
FormsTool, which has a 30-day free trial with some restrictions
Convert FMB (binary file) to FMT (text file). From the form, choose the menu File/Administration/Convert/Binary_to_text. This generates "source code" which can be compared with a normal text comparison tool, e.g. diff or Beyond Compare. But then you have to do each form individually, and we have lots of forms!
Anyone used any of these? Any recommendations? Other tools / better way?
Update: Beyond Compare also has a compare facility. Refer here.

"Best" Input File Formats for C++? [closed]

I am starting work on a new piece of software that will end up needing some robust and expandable file I/O. There are a lot of formats out there: XML, JSON, INI, etc. However, there are always pluses and minuses, so I thought I would ask for some community input.
Here are some rough requirements:
The format is a "standard"...I don't want to reinvent the wheel if I don't have to. It doesn't have to be a formal IEEE standard, but something you could Google and get some information on as a new user, may have some support tools (editors) beyond vi. (Though the software users will generally be computer savvy and happy to use vi.)
Easily integrates with C++. I don't want to have to pull along a 100 MB library and three different compilers to get it up and running.
Supports tabular input (2d, n-dimensional)
Supports POD types
Can expand as more inputs are required, binds well to variables, etc.
Parsing speed is not terribly important
Ideally, as easy to write (reflect) as it is to read
Works well on Windows and Linux
Supports compositing (one file referencing another file to read, and so on.)
Human Readable
In a perfect world, I would use a header-only library or some clean STL implementation, but I'm fine with leveraging Boost or some small external library if it works well.
So, what are your thoughts on various formats? Drawbacks? Advantages?
Edit
Options to consider? Anything else to add?
XML
YAML
SQLite
Google Protocol Buffers
Boost Serialization
INI
JSON
There is one excellent format that meets all your criteria:
SQLite!
Please read the article about using SQLite as an application file format. Also, watch the Google Tech Talk by D. Richard Hipp (the SQLite author) about this very topic.
Now, let's see how SQLite meets your requirements:
The format is a "standard"
SQLite has become the format of choice for most mobile environments and for many desktop apps (Firefox, Thunderbird, Google Chrome, Adobe Reader, you name it).
Easily integrates with C++
SQLite has a standard C interface, which is only one source file and one header file. There are C++ wrappers too.
Supports tabular input (2d, n-dimensional)
A SQLite table is as tabular as you could possibly imagine. To represent, say, 3-dimensional data, create a table with columns x, y, z, value and store your data as a set of rows like this:
x1,y1,z1,value1
x2,y2,z2,value2
...
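Here is a minimal C++ sketch of that layout (plus the ATTACH-based compositing mentioned further down), assuming the standard sqlite3 C API; the file names and the grid table are purely illustrative, and error handling is mostly omitted:
    #include <sqlite3.h>

    int main() {
        sqlite3* db = nullptr;
        if (sqlite3_open("inputs.db", &db) != SQLITE_OK) return 1;

        // The 3-dimensional table described above: one row per (x, y, z) point.
        sqlite3_exec(db,
                     "CREATE TABLE IF NOT EXISTS grid (x REAL, y REAL, z REAL, value REAL)",
                     nullptr, nullptr, nullptr);

        // Write with a prepared statement; SQLite handles quoting and types for us.
        sqlite3_stmt* stmt = nullptr;
        sqlite3_prepare_v2(db, "INSERT INTO grid (x, y, z, value) VALUES (?, ?, ?, ?)",
                           -1, &stmt, nullptr);
        sqlite3_bind_double(stmt, 1, 1.0);
        sqlite3_bind_double(stmt, 2, 2.0);
        sqlite3_bind_double(stmt, 3, 3.0);
        sqlite3_bind_double(stmt, 4, 42.0);
        sqlite3_step(stmt);
        sqlite3_finalize(stmt);

        // Compositing: attach a second (hypothetical) file so its tables are queryable too.
        sqlite3_exec(db, "ATTACH DATABASE 'more_inputs.db' AS extra",
                     nullptr, nullptr, nullptr);

        sqlite3_close(db);
        return 0;
    }
Reading back is just a SELECT through sqlite3_prepare_v2, sqlite3_step and sqlite3_column_double, so writing and reading stay symmetric.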
Supports POD types
I assume by POD you mean Plain Old Data, or a BLOB. SQLite lets you store BLOB fields as is.
Can expand as more inputs are required, binds well to variables
This is where it really shines.
Parsing speed is not terribly important
But SQLite speed is superb. In fact, parsing is basically transparent.
Ideally, as easy to write (reflect) as it is to read
Just use INSERT to write and SELECT to read - what could be easier?
Works well on Windows and Linux
You bet, and all other platforms as well.
Supports compositing (one file referencing another file to read)
You can ATTACH one database to another.
Human Readable
Not in binary, but there are many excellent SQLite browsers/editors out there. I like SQLite Expert Personal on Windows and sqliteman on Linux. There is also SQLite editor plugin for Firefox.
There are other advantages that SQLite gives you for free:
Data is indexable, which makes it very fast to search. You just cannot do this with XML, JSON or any other text-only format.
Data can be edited partially, even when the amount of data is very large. You do not have to rewrite a few gigabytes just to edit one value.
SQLite is fully transactional: it guarantees that your data is consistent at all times. Even if your application (or the whole computer) crashes, your data will be automatically restored to the last known consistent state on the next attempt to connect to the database.
SQLite stores your data verbatim: you do not need to worry about escaping junk characters in your data (including zero bytes embedded in your strings); simply always use prepared statements, and that's all it takes to make it transparent. Escaping can be a big and annoying problem when dealing with text data formats, XML in particular.
SQLite stores all strings in Unicode: UTF-8 (default) or UTF-16. In other words, you do not need to worry about text encodings or international support for your data format.
SQLite allows you to process data in small chunks (row by row, in fact), so it works well in low-memory conditions. This can be a problem for any text-based format, because such formats often need to load all the text into memory to parse it. Granted, there are a few efficient stream-based XML parsers out there, but in general any XML parser will be quite memory-greedy compared to SQLite.
Having worked quite a bit with both XML and json, here's my rather subjective opinion of both as extendable serialization formats:
The format is a "standard": Yes for both
Easily integrates with C++: Yes for both. In each case you'll probably wind up with some kind of library to handle it. On Linux, libxml2 is a standard, and libxml++ is a C++ wrapper for it; you should be able to get both of those from your distro's package manager. It will take some small effort to get those working on Windows. There appears to be some support in Boost for json, but I haven't used it; I've always dealt with json using libraries. Really, the library route is not very onerous for either.
Supports tabular input (2d, n-dimensional): Yes for both
Supports POD types: Yes for both
Can expand as more inputs are required: Yes for both - that's one big advantage to both of them.
Binds well to variables: If what you mean is some way inside the file itself to say "This piece of data must be automatically deserialized into this variable in my program", then no for both.
As easy to write (reflect) as it is to read: Depends on the library you use, but in my experience yes for both. (You can actually do a tolerable job of writing json using printf(); see the sketch below.)
Works well on Windows and Linux: Yes for both, and ditto Mac OS X for that matter.
Supports one file referencing another file to read: If you mean something akin to a C #include, then XML has some ability to do this (e.g. document entities), while json doesn't.
Human readable: Both are typically written in UTF-8, and permit line breaks and indentation, and thus can be human-readable. However, I've just been working with a 479 KB XML file that's all on one line, so I had to run it through a prettyprinter to make sense of it. json can also be pretty unreadable, but in my experience is often formatted better than XML.
When starting new projects, I generally prefer json; it's more compact and more human-readable. The main reason I might select XML over json would be if I were worried about receiving badly-formed documents, since XML supports automated document format validation, while you have to write your own validation code with json.
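To illustrate the printf() remark in the list above, here is a hedged sketch; it does no escaping of special characters, which is exactly the kind of thing a real json library would handle for you, and the field names and values are invented:
    #include <cstdio>

    int main() {
        // Invented configuration values, just to show the shape of the output.
        const char* name = "run42";
        double threshold = 0.75;
        int iterations = 100;

        std::printf("{\n");
        std::printf("  \"name\": \"%s\",\n", name);        // note: no string escaping done here
        std::printf("  \"threshold\": %g,\n", threshold);
        std::printf("  \"iterations\": %d\n", iterations);
        std::printf("}\n");
        return 0;
    }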
Check out Google Protocol Buffers. They handle most of your requirements.
From their documentation, the high level steps are:
Define message formats in a .proto file.
Use the protocol buffer compiler.
Use the C++ protocol buffer API to write and read messages.
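As a rough sketch of those steps, assume a hypothetical config.proto declaring message Config { string name = 1; int32 iterations = 2; }; running protoc on it generates config.pb.h/.cc, and the C++ side then looks roughly like this:
    #include <fstream>
    #include "config.pb.h"  // generated by protoc from the hypothetical config.proto

    int main() {
        // Assumed .proto: message Config { string name = 1; int32 iterations = 2; }
        Config cfg;
        cfg.set_name("run42");
        cfg.set_iterations(100);

        // Write the binary message...
        std::ofstream out("config.bin", std::ios::binary);
        cfg.SerializeToOstream(&out);
        out.close();

        // ...and read it back.
        Config loaded;
        std::ifstream in("config.bin", std::ios::binary);
        loaded.ParseFromIstream(&in);
        return 0;
    }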
For my purposes, I think the way to go is XML.
The format is a standard, but allows for modification and flexibility for the schema to change as the program requirements evolve.
There are several library options. Some are larger (Xerces-C), some are smaller (ezxml), but there are many options, so we won't be locked into a single provider or a very specific solution.
It can support tabular input (2d, n-dimensional). This requires more parsing work on "our" end and is likely the weakest point for XML; see the sketch below.
Supports POD types: Absolutely.
Can expand as more inputs are required, binds well to variables, etc. through schema modifications and parser modifications.
Parsing speed is not terribly important, so processing a text file or files is not an issue.
XML can be programmatically written just as easily as read.
Works well on Windows and Linux or any other OS that supports C and text files.
Supports compositing (one file referencing another file to read, and so on.)
Human Readable with many text editors (Sublime, vi, etc.) supporting syntax highlighting out of the box. Many web browsers display the data well.
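To give a rough idea of the parsing work mentioned above, here is a hedged sketch that reads a small table out of an XML file with ezxml (one of the smaller libraries named earlier); the file name, element names and attribute names are invented, and the other libraries would look broadly similar:
    extern "C" {
    #include "ezxml.h"   // ezxml is a C library, so give it C linkage from C++
    }
    #include <cstdio>

    int main() {
        // Hypothetical file layout: <inputs><row x="1" y="2" z="3" value="42"/>...</inputs>
        ezxml_t inputs = ezxml_parse_file("inputs.xml");
        if (!inputs) return 1;

        // Walk the <row> children; attributes are assumed present in this invented layout.
        for (ezxml_t row = ezxml_child(inputs, "row"); row; row = ezxml_next(row)) {
            std::printf("x=%s y=%s z=%s value=%s\n",
                        ezxml_attr(row, "x"), ezxml_attr(row, "y"),
                        ezxml_attr(row, "z"), ezxml_attr(row, "value"));
        }

        ezxml_free(inputs);
        return 0;
    }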
Thanks for all the great feedback! I think if we wanted a purely binary solution, Protocol Buffers or boost::serialization is likely the way that we would go.

KORMARC to MARC21 converter

Does anyone know if there is a free open-source solution to convert KORMARC (Korean MARC) into MARC21 (aka USMARC)?
While I'm not certain it has KORMARC support, you may want to try USEMARCON if you can find a mapping. From the USEMARCON page:
USEMARCON facilitates the conversion of catalogue records from one MARC format to another e.g. from UKMARC to UNIMARC. The software was designed as a toolbox-style application, allowing users with detailed knowledge of the source and target MARC formats to develop rules governing the behaviour of the conversion. Rules files may be supplemented by additional tables for more accurate conversion of MARC-specific character sets or coded information. The tables and rules files are simple ASCII text files and can be created using any standard text editor such as MS Windows Notepad.
Also, this thread from the Ask a Korean Studies Librarian Google Group might be useful, particularly the following message:
The Library of Congress once tried to download records from the National Library of Korea (NLK) to use as order records. LC wrote a specification and developed an in-house program to convert KORMARC to USMARC. Since NLK records only provide Korean script, LC used a transliterator, developed by a non-LC programmer, to provide romanization for the Voyager system. Feedback on this method from LC staff was not very positive. ... Instead of converting KORMARC to USMARC, a few research libraries including LC are currently using MarcEdit with Excel spreadsheets which are provided by Korean vendors under contract. Vendors provide both Korean script and romanization for several elements of the MARC fields (ISBN, title, author, publisher, place, series, etc.) in different columns of the spreadsheet for your order items. It sounds a lot simpler to set up initially, and once MarcEdit is set up properly, it creates MARC records.

Is there such a thing as a printer markup language?

I'd like to print a document. The content of the document is tables and text in different colors. Does a lightweight printer file format exist which can be used like a template?
PS, PDF and DOC files are, in my opinion, too heavy to parse. Does some XML or YAML file format exist which supports:
Easy creation (maybe with a WYSIWYG-Editor)
Parsing and manipulation with library support
Easy sending to the printer (maybe with library support)
Or do I have to do it the usual way and paint within a CDC?
I noticed you’re using MFC (so, Windows). In that case the answer is a qualified yes. In recent versions of Windows, Microsoft offers the XPS Document API which lets you create and manipulate a PDF-like document using XML, which can then be printed using the XPS Print API.
(For earlier versions of Windows that don’t support this API, you could try to deal with the XPS file format directly, but that is probably a lot harder than using CDC. Even with the API you will be working at a fairly low level.)
End users can generate XPS documents using the XPS print driver that is available for free from Microsoft (and bundled with certain MS products—they probably already have it on their system).
There is no universal language that is supported across all (or even many) printers. While PCL and PS are the most used, there are also printers which only work with specific printer drivers because they only support a proprietary data format (often pre-rendered on the client).
However, you could use XSL-FO to create documents which can then be rendered to a printer driver using library support.
I think something like TeX or LaTeX (or even troff or groff) may meet your needs. Google them and see.
There are also libraries to render documents for print. Look at http://libharu.sourceforge.net/ for example; it outputs a printer-ready .PDF.
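As a hedged sketch of that library route, here is roughly what generating a printer-ready PDF with libharu looks like; the text, coordinates and file name are invented, and error handling is omitted:
    #include <hpdf.h>

    int main() {
        // HPDF_New can take an error callback; NULL keeps the sketch short.
        HPDF_Doc pdf = HPDF_New(NULL, NULL);
        if (!pdf) return 1;

        HPDF_Page page = HPDF_AddPage(pdf);
        HPDF_Font font = HPDF_GetFont(pdf, "Helvetica", NULL);

        // Draw one line of text near the top of the page (coordinates in points, origin bottom-left).
        HPDF_Page_SetFontAndSize(page, font, 12);
        HPDF_Page_BeginText(page);
        HPDF_Page_TextOut(page, 50, HPDF_Page_GetHeight(page) - 50, "Hello, printer");
        HPDF_Page_EndText(page);

        HPDF_SaveToFile(pdf, "report.pdf");   // printer-ready PDF, invented file name
        HPDF_Free(pdf);
        return 0;
    }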
I think that PostScript is a really good choice for that.
It is actually a very simple language, and it must be very easy to parse because it is stack-oriented. Also, most printers support it, and even if yours doesn't you can use Ghostscript to convert it to many different formats (consider GS a "virtual PS-supporting printer").
Finally, there are a lot of books and tutorials for the language.
About the parsing: you can actually define new variables and functions in PS, so your problem can perhaps be solved (almost) entirely using PS.
HTML + CSS can be printed -- properly. CSS was designed to support this with the media attribute to specify that your CSS is for printer layout, not for screen layout. Tools like PRINCE (free + commercial versions) exist to render this for printing.
I think PostScript is the markup language used by printers. I read this somewhere, so correct me if PostScript is now outdated.
http://en.wikipedia.org/wiki/PostScript
For a more powerful suite you can use LaTeX. It gives you the option of creating templates into which you can just copy the text.
On a more GUI-friendly note, MS Word and other word processors have templates. The issue is that they don't follow a common standard or markup.
You can also use HTML to render things in a common markup, but it will not be very printer-friendly.

library for doing diffs

I've been tasked with creating a tool that can diff and merge the configuration files for my company's product. The configurations are stored as either XML or URL-encoded strings. I'm looking for a library, preferably open source with a license compatible with commercial software, that can do these diffs. Our app is written in C++, so C++ libraries would be best, but I'm willing to look at libraries that are C#-specific since I can write a wrapper that exposes it to C++ via COM. Three-way diffs would be ideal, but two-way is acceptable. If it has an understanding of XML, that would also be a plus (since XML nodes can be reordered without changing the document, etc). Any library suggestions? Should I even consider writing my own diff tools in the hopes of giving it semantic knowledge of our formats?
Thanks to this similar question, I've already discovered this google library, which seems really great, but I'm still looking for other options. It also seems to be able to output the diffs in HTML format (using the <ins> and <del> tags that I didn't know existed before I discovered it), which could be really handy, but it seems to be a unified diff only. I'm going to need to display the results in a web browser, and probably have to build an interface for doing the merges in the browser as well. I don't expect a library to be able to help with these tasks, but it must produce output in a format that is amenable to me building this on top of it. I'm currently envisioning something along the lines of TortoiseMerge (side-by-side diffs, not unified), except browser-based. Any tips/tricks/design ideas on how to present this would be appreciated too.
Subversion comes with libsvn_diff and libsvn_delta, licensed under the Apache Software License.
Here is a C++ library that can diff what the author calls semistructured data. It deals nicely with HTML and XML. Since your data is XML it would make a lot of sense to use this instead of plain text diff. This is especially the case when the files are machine generated.
I am currently trying to use this library to build a tool that diffs Visual Studio project files. These are basically XML files and using a plain diff tool like Winmerge is too painful because Visual Studio pretty much mucks up the whole file by crazy reordering. The idea is to do some kind of a structured diff to address the problem.
For diffing the XML I would propose that you normalize it first: sort all the elements in alphabetical order, then generate a stream of tokens/XML that represents the original document but is independent of the original formatting. After running the diff, parse the result to get a tree containing what was added or removed.
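A rough sketch of that normalization step follows; it uses TinyXML-2 purely for illustration (it isn't mentioned above, and any DOM-style parser would do). It flattens every element, attribute and text node into a path string, sorts the strings, and writes them one per line so an ordinary text diff can compare two normalized files:
    #include <algorithm>
    #include <fstream>
    #include <string>
    #include <vector>
    #include <tinyxml2.h>

    // Flatten an element subtree into "path@attr=value" and "path=text" strings.
    static void flatten(const tinyxml2::XMLElement* e, const std::string& path,
                        std::vector<std::string>& out) {
        std::string here = path + "/" + e->Name();
        for (const tinyxml2::XMLAttribute* a = e->FirstAttribute(); a; a = a->Next())
            out.push_back(here + "@" + a->Name() + "=" + a->Value());
        if (const char* text = e->GetText())
            out.push_back(here + "=" + text);
        for (const tinyxml2::XMLElement* c = e->FirstChildElement(); c;
             c = c->NextSiblingElement())
            flatten(c, here, out);
    }

    int main(int argc, char** argv) {
        if (argc != 3) return 1;                       // usage: normalize in.xml out.txt
        tinyxml2::XMLDocument doc;
        if (doc.LoadFile(argv[1]) != tinyxml2::XML_SUCCESS) return 1;

        std::vector<std::string> lines;
        flatten(doc.RootElement(), "", lines);
        std::sort(lines.begin(), lines.end());         // order-independent representation

        std::ofstream out(argv[2]);
        for (const std::string& line : lines) out << line << "\n";
        return 0;
    }
Run it over both versions of a configuration file and diff the two outputs; mapping the diff result back onto a tree, as described above, is then a separate parsing step.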

C++ Header files - put them in one directory or merged in a tree structure?

I have a substantial body of source code (OOFILE) which I'm finally putting up on Sourceforge. I need to decide if I should go with a monolithic include directory or keep the header files with the source tree.
I want to make this decision before pushing to the svn repo on SourceForge. I expect a lot of people who use it after that move will keep a working copy checked out directly from SF so won't want to change their structure.
The full source tree has about 262 files in 25 folders. There are a lot more classes than that suggests because, due to conforming to 8.3 character file names (yes, it dates back to Win3.1), many classes share a single file. As I used to develop with ObjectMaster, that never bothered me, but I will be splitting it up to conform to the more recent trend of minimising the number of classes per file. From a quick skim of the class list, there are about 600 classes.
OOFILE is a cross-platform product expected to be built on Mac, Windows and assorted Unix platforms. As it started life on Mac, with compilers that point to include trees rather than flat include dirs, headers were kept with the source.
Later, mainly to keep some Visual Studio users happy, a build was reorganised with a single include directory. I'm trying to choose between those models.
The entire OOFILE product covers quite a few domains:
database front-end
range of database backends
simple 2D graphing engine for Mac and Windows
simple character-mode report-writer for trivial html and text listing
very rich banding report-writer with Mac and Windows Preview and Printing and cross-platform generation of text, RTF, HTML and XML reports
forms integration engine for easy CRUD forms binding to the database, with implementations on PowerPlant and MFC
cross-platform utility classes
file and directory manipulation
strings
arrays
XML and tag generation
Many people only want to use it on a single platform and some of those code areas are pure legacy (eg: PowerPlant UI framework on classic Mac). It therefore seems people would appreciate not having headers from those unwanted areas dumped in their monolithic include directory.
I started thinking about having an include directory split up into a few of the domains above and then realised that was sounding more like the original structure.
In summary, the choices seem to be:
Keep original model, all headers adjacent to source - max flexibility at cost of some complex includes in projects.
one include directory with everything inside
split includes by domain, so there may be about 6 directories for someone using the lot but a pure database user would probably have a single directory.
From a Unix build aspect, the recommended structure has been 2. My situation is complicated by needing to keep Visual Studio and XCode users happy (sniff, CodeWarrior, how I doth miss thee!).
Edit - the chosen solution:
I went with four subdirectories in include. I started trying to divide them up further by platform but it just got very noisy very quickly.
Personally I would go with 2, or 3 if really pushed.
But whichever you choose, please make it crystal clear in the build instructions how to set up the include paths. Nothing dooms an open source project more than it being really difficult to build - developers want a quick out-of-the-box experience and if it involves faffing around with many undocumented environment variables (or whatever) most will simply go away.