parse ACDSee categories - c++

How could one parse an XML-like string and convert it to a separated list?
I am trying to convert the following string:
<Categories>
<Category Assigned="0">
6 Level
<Category Assigned="1">
6.2 Level
<Category Assigned="0">
6.3 Level
<Category Assigned="0">
6.4 Level
<Category Assigned="1">
6.5 Level
</Category>
</Category>
</Category>
</Category>
</Category>
</Categories>
To a separated list like:
6 Level/6.2 Level/6.3 Level/6.4 Level/6.5 Level, 6 Level/6.2 Level
Robin Mills of exiv2 provided a perl script:
http://dev.exiv2.org/boards/3/topics/1912?r=1923#message-1923
That script would also need to parse Assigned="1". How can this be done in C++ for use in digikam, inside dmetadata.cpp, with a structure like:
QStringList ntp = tagsPath.replaceInStrings("<Category Assigned=\"0\">", "/");
I don't have enough programming background to figure this out, and haven't found any code samples online that do something similar. I'd also like to include the code in exiv2 itself, so that other applications can benefit.
Working code will be included in digikam: https://bugs.kde.org/show_bug.cgi?id=345220

The code you have linked makes use of Perl's XML::Parser::Expat module, which is a glue layer on top of James Clark's Expat XML parser.
If you want to follow the same route you should write C++ that uses the same library, but it can be clumsy to use because the API works through callbacks that you register to be called when certain events occur in the incoming XML stream. You can see them in the Perl code, commented as processing a start-of-element event, and so on.
Once you have linked to the library, it should be simple to write C code that is equivalent to the Perl in the callbacks; they are only a single line each. Please open a new question if you are having problems understanding the Perl.
Note also that Expat is a non-validating parser, which will let through malformed data without comment.
Given that the biggest task is to parse the XML data in the first place, you may prefer a different solution that allows you to build an in-memory document structure from the XML data and interrogate it using the Document Object Model (DOM). The libxml library allows you to do that, and has its own Perl glue layer in the XML::LibXML module.
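For input as regular as the sample in the question, the start-of-element and end-of-element bookkeeping described above can also be sketched with the C++ standard library alone. This is only an illustration, not Expat and not the digikam code: the function name assignedCategoryPaths is hypothetical, and the sketch assumes that the tags look exactly like those in the sample (no other attributes, no escaped characters, no extra spaces inside the tags). It keeps a stack of open category names and emits one path for every Category with Assigned="1", in document order.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Strip leading and trailing whitespace from a category name.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return "";
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Hypothetical helper (not part of digikam or exiv2): returns one
// "/"-separated path for every <Category> whose Assigned attribute is "1".
std::vector<std::string> assignedCategoryPaths(const std::string& xml)
{
    std::vector<std::string> stack;   // names of the currently open categories
    std::vector<std::string> paths;   // result, in document order
    std::size_t pos = 0;

    while ((pos = xml.find('<', pos)) != std::string::npos)
    {
        const std::size_t close = xml.find('>', pos);
        if (close == std::string::npos)
            break;                    // unterminated tag: stop quietly

        const std::string tag = xml.substr(pos + 1, close - pos - 1);
        pos = close + 1;

        if (tag.compare(0, 9, "Category ") == 0)
        {
            // Start of element: the name is the text up to the next tag.
            const std::size_t next = xml.find('<', pos);
            const std::string raw = (next == std::string::npos)
                                        ? xml.substr(pos)
                                        : xml.substr(pos, next - pos);
            stack.push_back(trim(raw));

            if (tag.find("Assigned=\"1\"") != std::string::npos)
            {
                std::string path;
                for (std::size_t i = 0; i < stack.size(); ++i)
                {
                    if (i > 0)
                        path += '/';
                    path += stack[i];
                }
                paths.push_back(path);
            }
        }
        else if (tag == "/Category" && !stack.empty())
        {
            stack.pop_back();         // end of element: leave this level
        }
        // Any other tag (<Categories>, </Categories>) is ignored.
    }
    return paths;
}
```

Applied to the sample string from the question, this yields the two paths "6 Level/6.2 Level" and "6 Level/6.2 Level/6.3 Level/6.4 Level/6.5 Level". Real-world data with entities or additional attributes would still call for a proper XML parser.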

Maik Qualmann has provided a working patch for digikam!
QString xmlACDSee = getXmpTagString("Xmp.acdsee.categories", false);
if (!xmlACDSee.isEmpty())
{
    xmlACDSee.remove("</Categories>");
    xmlACDSee.remove("<Categories>");
    // Free up "/" for use as the path separator; closing tags become "<|Category>".
    xmlACDSee.replace("/", "|");
    QStringList tagsXml = xmlACDSee.split("<Category Assigned");
    int category = 0;
    int length;
    int count;
    foreach(const QString& tags, tagsXml)
    {
        if (!tags.isEmpty())
        {
            // 5 is the length of the leading ="0"> or ="1">,
            // 11 is the length of each trailing "<|Category>".
            count = tags.count("<|Category>");
            length = tags.length() - (11 * count) - 5;
            if (category == 0)
            {
                tagsPath << tags.mid(5, length);
            }
            else
            {
                tagsPath.last().append(QString("/") + tags.mid(5, length));
            }
            category = category - count + 1;
            if (tags.left(5) == QString("=\"1\">") && category > 0)
            {
                tagsPath << tagsPath.value(tagsPath.size() - count - 1);
            }
        }
    }
    if (!tagsPath.isEmpty())
    {
        return true;
    }
}

Related

XML Name space issue revisited

I am still not able to find a good solution to the problem that findnodes or findvalue does not work when xmlns has a value.
The moment I manually set xmlns="", it starts working, at least in my case. Now I need to automate this.
Consider this:
<root xmlns="something">
--
---
</root>
My proposed solution:
dynamically set the value to xmlns=""
and, once the work is done, automatically reset it to the original value xmlns="something".
This seems to work for my XML files, but it is still manual.
I need to automate this.
I see two options for doing it:
using a Perl regex, or
using proper LibXML calls such as setNamespace.
Please share your thoughts on this.
You register the namespace. The point of XML is not having to kludge around with regexes!
Besides, it's easier: you create an XML::LibXML::XPathContext, register your namespaces, and use its find* calls with your chosen prefixes.
The following example is verbatim from a script of mine to list references in Visual Studio projects:
(...)
# namespace handling, see the XML::LibXML::Node documentation
my $xpc = new XML::LibXML::XPathContext;
$xpc->registerNs( 'msb',
'http://schemas.microsoft.com/developer/msbuild/2003' );
(...)
my $tree; eval { $tree = $parser->parse_file($projfile) };
(...)
my $root = $tree->getDocumentElement;
(...)
foreach my $attr ( find( '//msb:*/@Include', $root ) )
{
    (...)
}
(...)
sub find { $xpc->find(@_)->get_nodelist; }
(...)
That's all it takes!
I only have one xmlns attribute, once at the top of the XML, so this works for me.
All I did was first remove the namespace part, i.e. clear the xmlns attribute in my XML file.
NODE : for my $node ($conn->findnodes("//*[name()='root']")) {
my $att = $node->getAttribute('xmlns');
$node->setAttribute('xmlns', "");
last NODE;
}
I use last just to make sure I come out of the for loop in time.
Then, once I am done with the XML parsing, I will replace the
<root>
with
<root xmlns="something">
using simple Perl file operation or sed editor.

Parse XML with unknown elements in C++ and Qt

I've got an XML document that I receive from a REST service that I wish to parse. The service might change their element and tag names so I'm trying to come up with some completely generic solution that saves the document structure as some kind of object with attributes.
I'm not sure how, or whether, the C++ and Qt APIs allow me to accomplish such a thing, however. I've thought of creating some kind of keyed map that holds element names as string keys, with their children as values in some recursive fashion. Being very new to Qt and C++, I'm not sure how to accomplish this.
This could be an example XML document:
<root>
<element id="1">
<name>SomeName</name>
<desc>SomeDesc</desc>
<params>
<param pid="1">True</param>
<param pid="2">False</param>
</params>
...
<Some unknown element></Some unknown element>
</element>
</root>
This is how I convert the HTTP response (QNetworkReply) to a DOM document I can use in Qt:
QByteArray data = reply->readAll();
QDomDocument doc;
doc.setContent(data);
QDomNodeList nodes = doc.elementsByTagName("root");
if (nodes.size() > 0) {
// Prints all elements, should be able to save in a map somehow? Perhaps there is a better way?
qDebug() << nodes.at(0).toElement().text();
}
I would love some input on how I can parse this in a way that allows me to keep all information in the XML even without knowing the element, attribute and tag names. Something like this:
element = {
id : 1,
name : 'SomeName',
desc : 'SomeDesc',
params : [{
pid : 1,
param : True
}, {
pid : 2,
param : False
}
],
some unknown element : some unknown values
}
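The recursive keyed structure described in the question could be sketched roughly as below. This is only an assumption about one possible design, not a Qt API: XmlNode and collectByName are hypothetical names, the sketch uses only the standard library and assumes C++17. Filling it from a QDomDocument would be a recursive walk over each element's attributes and child elements, which is left out here.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical generic element: nothing here depends on known tag names.
struct XmlNode
{
    std::string name;                              // tag name, e.g. "param"
    std::string text;                              // direct text content, e.g. "True"
    std::map<std::string, std::string> attributes; // attribute name -> value
    std::vector<XmlNode> children;                 // child elements in document order
};

// Depth-first search: collects the node itself and every descendant
// whose tag name equals `name`.
void collectByName(const XmlNode& node, const std::string& name,
                   std::vector<const XmlNode*>& out)
{
    if (node.name == name)
        out.push_back(&node);
    for (const XmlNode& child : node.children)
        collectByName(child, name, out);
}
```

A vector of children is used rather than a map keyed by tag name because sibling elements may repeat (two param elements in the sample), and a map with unique keys would keep only one of them.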

Reading XML data using boost::property_tree library functions in C++

<?xml version="1.0"?>
<sked>
<version>2</version>
<flight xmlns:xsi="some_uri" xsi:type="emirates">
<carrier>BA</carrier>
<number>4001</number>
<date>2011-07-21</date>
</flight>
<flight xmlns:xsi="some_uri" xsi:type="cathey-pacific">
<flight_class>
<type>Economy</type>
<fare>400</fare>
</flight_class>
<date>2011-07-21</date>
</flight>
</sked>
I have an XML document which describes 2 types of flights using the same keywords. The sub-fields depend on the type of the flight. I have to read the XML and store it in C++ data classes according to the type of flight.
This is my code segment that is used for this purpose.
typedef boost::property_tree::ptree Node;
Node pt;
read_xml("test.xml", pt);
Node skedNode = pt.get_child("sked");
Node flightNode = skedNode.get_child("flight");
BOOST_FOREACH(Node::value_type const& v, skedNode.get_child("sked"))
{
if (v.first == "flight")
{
if (v.second.get("<xmlattr>.xsi:type", "Null") == "cathey-pacific")
{
BOOST_FOREACH(Node::value_type const& v1, flightNode.get_child("flight"))
{
if(v1.first == "flight_class")
FlightClass fclass = FlightClass(static_cast<Node>(flightNode));
}
}
}
}
When I run the above code, I get nothing inside FlightClass. I debugged it and found that v1.first takes the values "carrier", "number" and "value" only.
I was surprised, because those are the parameters of the emirates type of flight. I couldn't get the cathey-pacific flight information.
Please help me find out what the issue is.
I really want to get the information for cathey-pacific flights from this XML file and store it in C++ data classes.
What should I do to correct this?
Note: Instead of the second BOOST_FOREACH, I tried v.second.get_child("flight"); but it throws an exception. Then I replaced that with v.second.get_child("flight_class"); and it gives its sub-fields, such as type and fare.
What may be the reason for that? It seems to be returning its grandchild nodes.
Boost doesn't provide any functionality like "get_next_child" to get the different child nodes when there is more than one child node with the same name.
So I just removed the unwanted fields from the tree before iterating for the above purpose.
flightNode.pop_front(); // to remove xmlattr
flightNode.pop_front(); // to remove version field
flightNode.pop_front(); // to remove first flight field.
Then used BOOST_FOREACH to reach above goal.

C++ Boost Property Tree Update Existing Node By Attribute Qualifier

Ok, so here's a sample of the XML structure:
<config>
<Ignored>
<Ignore name="Test A">
<Criteria>
<value>actual value</value>
</Criteria>
</Ignore>
<Ignore name="Test B">
<Criteria>
<value>actual value</value>
</Criteria>
</Ignore>
</Ignored>
</config>
I would like to be able to do two things:
Perform a get directly on the Test A element without having to loop over all Ignore elements, like a selector on an attribute.
If nothing else, I need a method of updating either of the Ignore elements, and I can't seem to figure it out.
Do I have to delete the element and recreate it? I can't seem to figure out a way to perform a put which qualifies an element (where there are many with the same name at the same level) by an attribute (which would be unique at that level).
Something like:
pt.put("config.Ignored.Ignore.<xmlattr>.name='Test A'.Criteria.value",some_var)
Or anything else that can achieve the end goal. Thank you very much!
Full disclosure: I'm pretty new to C++ and may be missing something blatantly obvious.
Boost.property_tree xml parser (RapidXML) doesn't support this.
Consider using something like TinyXPath if you want such functionality out of the box.
Or use an explicit loop to find the Ignore node with the required attribute. Then you can use
someIgnoreNode.put("Criteria.value", some_var);
You may use a method like:
auto & pt_child = pt.get_child("config.Ignored");
BOOST_FOREACH(ptree::value_type &v1, pt_child)
{
    // Match the Ignore element whose name attribute is "Test A".
    if (v1.first == "Ignore" && v1.second.get<std::string>("<xmlattr>.name") == "Test A")
    {
        ptree & ptGrandChild = v1.second;
        ptGrandChild.put<std::string>("Criteria.value", some_var);
    }
}
boost::property_tree::xml_writer_settings<std::string> settings =
    boost::property_tree::xml_writer_make_settings<std::string>('\t', 1);
write_xml(xmlPath, pt, std::locale(), settings);

How can I carry out math functions in the Ant 'ReplaceRegExp' task?

I need to increment a number in a source file from an Ant build script. I can use the ReplaceRegExp task to find the number I want to increment, but how do I then increment that number within the replace attribute?
Here's what I've got so far:
<replaceregexp file="${basedir}/src/path/to/MyFile.java"
match="MY_PROPERTY = ([0-9]{1,});"
replace="MY_PROPERTY = \1;"/>
In the replace attribute, how would I do
replace="MY_PROPERTY = (\1 + 1);"
I can't use the buildnumber task to store the value in a file since I'm already using that within the same build target. Is there another ant task that will allow me to increment a property?
You can use something like:
<propertyfile file="${version-file}">
    <entry key="revision" type="string" operation="=" value="${revision}" />
    <entry key="build" type="int" operation="+" value="1" />
</propertyfile>
So the Ant task is propertyfile.
In ant, you've always got the fallback "script" tag for little cases like this that don't quite fit into the mold. Here's a quick (messy) implementation of the above:
<property name="propertiesFile" location="test-file.txt"/>
<script language="javascript">
regex = /.*MY_PROPERTY = (\d+).*/;
t = java.io.File.createTempFile('test-file', 'txt');
w = new java.io.PrintWriter(t);
f = new java.io.File(propertiesFile);
r = new java.io.BufferedReader(new java.io.FileReader(f));
line = r.readLine();
while (line != null) {
m = regex.exec(line);
if (m) {
val = parseInt(m[1]) + 1;
line = 'MY_PROPERTY = ' + val;
}
w.println(line);
line = r.readLine();
}
r.close();
w.close();
f.delete();
t.renameTo(f);
</script>
Good question. It can be done in Perl in a similar way, but I think it's not possible in Ant, .NET and other areas. If I'm wrong, I'd really like to know, because that's a cool concept that I've used in Perl many times and could really use in situations like the one you've mentioned.