EMF - Convert XML model to XMI

I have created an .ecore and .genmodel metamodel from an .xsd file. I'm trying to create a model instance from an .xml file conforming to the .xsd file (and consequently to the .ecore metamodel). How can I achieve this goal?

You just have to load your XML file into an EMF resource, setting the load option XMLResource.OPTION_SUPPRESS_DOCUMENT_ROOT to true. After that, create an output resource whose URI points to your .xmi file. Finally, take the root element from the XML resource's contents, add it to the XMI resource, and save the output resource. Done.
// Create a resource set; this assumes the generated package and the
// XML/XMI resource factories have already been registered
ResourceSet resourceSet = new ResourceSetImpl();
// Create a resource for the source XML file
Resource loadResource = resourceSet.createResource(sourceURI);
// The option below suppresses the document root in the output file
Map<Object, Object> options = new HashMap<>();
options.put(XMLResource.OPTION_SUPPRESS_DOCUMENT_ROOT, Boolean.TRUE);
loadResource.load(options); // now the XML model is loaded
// Create an output resource to copy the elements into
Resource resourceOut = resourceSet.createResource(targetURI); // resource for the .xmi file
// Copy the root element from the input resource to the output resource
EObject root = loadResource.getContents().get(0);
resourceOut.getContents().add(root);
resourceOut.save(Collections.emptyMap()); // serialize the resource to the XMI file

At the end I just needed to change the root node name. To achieve this, follow these steps:
In your ecore diagram, right-click your root node (the equivalent of the root node in your XML file).
Click "Create Dynamic Instance...".
Make a test model. This model is an XMI instance.
Finally, replace your root node information with the new one (the one generated in the XMI model).
In my case I replaced
/* At the XML file */
<featureModel>
<!-- here you find the model nodes -->
...
</featureModel>
with
/* The XML file converted to an XMI file. This file conforms to the XSD and the ecore model. */
<ide:FeatureModelType [here you will find some attributes]>
<!-- here you find the model nodes, just as they were defined earlier -->
...
</ide:FeatureModelType>
Of course, this could also be done programmatically.


How to Read multiple parquet files or a directory using apache arrow in cpp

I am new to the Apache Arrow C++ API.
I want to read multiple parquet files using the Apache Arrow C++ API, similar to what the Python API offers (reading them as a single table).
However, I don't see any example of it.
I know I can read a single parquet file using:
arrow::Status st;
arrow::MemoryPool* pool = arrow::default_memory_pool();
arrow::fs::LocalFileSystem file_system;
std::shared_ptr<arrow::io::RandomAccessFile> input =
    file_system.OpenInputFile("/tmp/data.parquet").ValueOrDie();
// Open Parquet file reader
std::unique_ptr<parquet::arrow::FileReader> arrow_reader;
st = parquet::arrow::OpenFile(input, pool, &arrow_reader);
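And from there, if I understand correctly, the whole file can then be pulled into a table:
// Read the entire Parquet file into a single arrow::Table
std::shared_ptr<arrow::Table> table;
st = arrow_reader->ReadTable(&table);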
Please let me know if you have any questions.
Thanks in advance
The feature is called "datasets"
There is a fairly complete example here: https://github.com/apache/arrow/blob/apache-arrow-5.0.0/cpp/examples/arrow/dataset_parquet_scan_example.cc
The C++ documentation for the feature is here: https://arrow.apache.org/docs/cpp/dataset.html
I'm working on a recipe for the cookbook but I can post some snippets here. These come from this work-in-progress: https://github.com/westonpace/arrow-cookbook/blob/feature/basic-dataset-read/cpp/code/datasets.cc
Essentially you will want to create a filesystem and select some files:
// Create a filesystem
std::shared_ptr<arrow::fs::LocalFileSystem> fs =
    std::make_shared<arrow::fs::LocalFileSystem>();
// Create a file selector which describes which files are part of
// the dataset. This selector performs a recursive search of a base
// directory which is typical with partitioned datasets. You can also
// create a dataset from a list of one or more paths.
arrow::fs::FileSelector selector;
selector.base_dir = directory_base;
selector.recursive = true;
Then you will want to create a dataset factory and a dataset:
// Create a file format which describes the format of the files.
// Here we specify we are reading parquet files. We could pick a different format
// such as Arrow-IPC files or CSV files or we could customize the parquet format with
// additional reading & parsing options.
std::shared_ptr<arrow::dataset::ParquetFileFormat> format =
    std::make_shared<arrow::dataset::ParquetFileFormat>();
// Create a partitioning factory. A partitioning factory will be used by a dataset
// factory to infer the partitioning schema from the filenames. All we need to specify
// is the flavor of partitioning which, in our case, is "hive".
//
// Alternatively, we could manually create a partitioning scheme from a schema. This is
// typically not necessary for hive partitioning as inference works well.
std::shared_ptr<arrow::dataset::PartitioningFactory> partitioning_factory =
    arrow::dataset::HivePartitioning::MakeFactory();
arrow::dataset::FileSystemFactoryOptions options;
options.partitioning = partitioning_factory;
// Create a dataset factory
ASSERT_OK_AND_ASSIGN(
    std::shared_ptr<arrow::dataset::DatasetFactory> dataset_factory,
    arrow::dataset::FileSystemDatasetFactory::Make(fs, selector, format, options));
// Create the dataset, this will scan the dataset directory to find all of the files
// and may scan some file metadata in order to determine the dataset schema.
ASSERT_OK_AND_ASSIGN(std::shared_ptr<arrow::dataset::Dataset> dataset,
                     dataset_factory->Finish());
Finally, you will want to "scan" the dataset to get the data:
// Create a scanner
arrow::dataset::ScannerBuilder scanner_builder(dataset);
ASSERT_OK(scanner_builder.UseAsync(true));
ASSERT_OK(scanner_builder.UseThreads(true));
ASSERT_OK_AND_ASSIGN(std::shared_ptr<arrow::dataset::Scanner> scanner,
                     scanner_builder.Finish());
// Scan the dataset. There are a variety of other methods available on the scanner as
// well
ASSERT_OK_AND_ASSIGN(std::shared_ptr<arrow::Table> table, scanner->ToTable());
std::cout << "Read in a table with " << table->num_rows() << " rows and "
          << table->num_columns() << " columns" << std::endl;
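If you only need a subset of the columns, or want to push a filter down into the scan, the scanner builder can also be configured before calling Finish(). A rough sketch; the column names "year" and "total" here are just placeholders:
// Only materialize the columns we actually need
ASSERT_OK(scanner_builder.Project({"year", "total"}));
// Push a row filter down into the scan
ASSERT_OK(scanner_builder.Filter(arrow::compute::equal(
    arrow::compute::field_ref("year"), arrow::compute::literal(2021))));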

Listener to execute at the end of the input file

I have a process that uses a chunk step to read a file and insert the records into a table. I need to insert a row into a parent table when the input file is opened, and when the file is closed I need to update that parent-table row inserted at the start. Is there a listener or an approach to make this happen?
The closest one is StepListener, where you can implement its beforeStep and afterStep methods to update the parent table. You can inject StepContext into the step listener class to access context data through step metrics or step transient data.
But beforeStep is called before the input file is opened; I'm not sure if this difference is significant for your case.
Otherwise, you can implement your own item reader class to achieve your requirement.

How To Generate Dynamic Target File In Informatica Based On Column Value

Scenario: generate a different flat-file target based on the Location name, i.e. separate files such as Mumbai.dat, Bangalore.dat, and Delhi.dat.
Source file:
Dept name | Dept ID | Location
DWH       | 1       | Mumbai
Java      | 2       | Bangalore
Dot net   | 3       | Delhi
I am able to achieve it with a Transaction Control transformation and the output file target field. The problem is that I am trying to create a workflow, and in the session associated with this mapping I want to pass the input file and output file as parameters populated through a parameter file. I get an error reading the input file, yet when I hard-code the path and filename it reads perfectly. Apart from this, the output file is created with zero bytes, and not dynamically, when I try to pass it as a parameter. Can someone please help with the workflow parameter file and how to use it in this case?
Edit your target file definition and tick the option to include a FileName port. Then it's normal mapping logic to pass the desired filename through to this port; Informatica will send each record to the file with the name specified by the port.

Simple document switcher functionality?

I'm writing an application that will allow a user to drag/drop specific files onto the application window, parse those files, put the contents into a table (via a QStandardItemModel), and add each file's name (or alias) to a separate tree view (which acts as the document switcher).
I'll use NotePad++ as a simple example.
When I click any of the new files in the leftmost "Doc Switcher," it shows the contents in the right pane. Imagine that right pane is a table. And for instance, imagine that the list on the left is a list of .csv files that were imported into the application.
What I want to do is, upon clicking each item in the list, I want the corresponding parsed .csv file to show up in the table pane on the right.
My table is just a QTableView that displays the contents of the .csv files in a QStandardItemModel. Everything works when it comes to implementing the table and parsing the files.
I also set up a QTreeWidget as the "document switcher." Now, I need to link the document switcher selection to the table so that each file's respective contents will be shown in the table view.
I can have the application populate the tableView with the model contents when the QTreeView's top level item selection changes. That's no problem. The problem is with what I should be checking for when that selection changes and how.
I'm unsure of how to implement this. How do I store a bunch of QStandardItemModel objects and then link them to their names in the document switcher? Should I even be doing that? Do I have to create a new QStandardItemModel for each file that is imported? Or should I create one QStandardItemModel, save it somewhere to be pulled back up later, and re-use that same table model object for each file that is added? I'm just unsure how this is supposed to work and feel like I am missing a fundamental part of all of this.
I would suggest two approaches to solve your problem:
1) Watch the document switcher's selection-changed signal and create a new model for the currently selected data. Your table view on the right will show the data when you set the model. When a new file item is selected, delete the existing model and create a new one with the new data (a sketch of this approach follows below).
2) The same as the first approach, but instead of recreating the model on each change, use a single model and reset its data each time you switch files.
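Here is a minimal, untested sketch of the first approach; treeWidget, tableView, and parseCsvIntoModel() are hypothetical names standing in for your own widgets and CSV-parsing code:
// Rebuild the table model whenever the selection in the switcher changes.
// parseCsvIntoModel() is a hypothetical helper that fills a model with the
// parsed contents of the file named by the tree item.
connect(treeWidget, &QTreeWidget::currentItemChanged, this,
        [this](QTreeWidgetItem *current, QTreeWidgetItem * /*previous*/) {
            if (!current)
                return;
            QStandardItemModel *newModel = new QStandardItemModel(this);
            parseCsvIntoModel(current->text(0), newModel);
            QAbstractItemModel *oldModel = tableView->model();
            tableView->setModel(newModel); // the view now shows the new file
            delete oldModel;               // free the model for the old file
        });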

Creating IXMLDOMnode from string

I have a string which contains the XML representation of an XML node which I intend to insert into an XML document loaded in memory. The XML string (of the node) is something like this:
<ns1:Feature name=\"PageSize\">\
<ns1:Option name=\"A4\" />\
</ns1:Feature>
So it has namespace prefixes on the tag names as well.
Is there a way I can achieve this?
I tried to use XMLDOMNode->put_text(), but it does not work, as it replaces the "<" and ">" characters with their escaped text representations (&lt; etc.).
I was wondering if loading the string buffer into a separate in-memory XML document and then getting the node pointer from there would work on my original document. But again, I'm not sure if XMLDOMNodes are transferable between documents.
I solved this myself using the second approach:
1) Create an in-memory XML document based on the IXMLDOMDocument3 interface and load the XML string into it.
2) Select the node you require using the selectSingleNode() method.
3) Now go back to your original XML document, where you want the node placed, also loaded through the IXMLDOMDocument3 interface.
4) Use the importNode() method of the IXMLDOMDocument3 from step 3 to clone the node obtained in step 2.
5) You can now use the cloned node in an appendChild() on the original XML.
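In code, the steps above might look roughly like this with MSXML6 smart pointers (#import "msxml6.dll"); the ns1 namespace URI is a placeholder for whatever your document actually uses, COM is assumed to be initialized already, and error handling is omitted:
// Step 1: load the XML string into a separate in-memory document
MSXML2::IXMLDOMDocument3Ptr fragDoc(__uuidof(MSXML2::DOMDocument60));
fragDoc->loadXML(
    L"<ns1:Feature xmlns:ns1=\"http://example.com/ns1\" name=\"PageSize\">"
    L"<ns1:Option name=\"A4\"/>"
    L"</ns1:Feature>");
// Step 2: grab the node we need; here it is the fragment's root, so
// documentElement is enough (selectSingleNode() works for deeper nodes)
MSXML2::IXMLDOMNodePtr sourceNode = fragDoc->documentElement;
// Steps 3-5: 'targetDoc' is the already-loaded IXMLDOMDocument3 to modify;
// clone the node into it, then append the clone
MSXML2::IXMLDOMNodePtr cloned = targetDoc->importNode(sourceNode, VARIANT_TRUE);
targetDoc->documentElement->appendChild(cloned);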