can't get xhtml <script> content with libxml++ using xpath expression


#include <libxml++/libxml++.h>
xmlpp::NodeSet xmlP(std::string xml_string, std::string xpath) {
xmlpp::DomParser doc;
// 'response' contains your HTML
doc.parse_memory(xml_string);
xmlpp::Document* document = doc.get_document();
xmlpp::Element* root = document->get_root_node();
xmlpp::NodeSet elemns = root->find(xpath);
xmlpp::Node* element = elemns[0];
std::cout << elemns.size() << std::endl;
std::cout << element->get_line() << std::endl;
//const auto nodeText = dynamic_cast<const xmlpp::TextNode*>(element);
const auto nodeText = dynamic_cast<const xmlpp::ContentNode*>(element);
if (nodeText && nodeText->is_white_space()) //Let's ignore the indenting - you don't always want to do this.
{
std::cout << nodeText->get_content() << std::endl;
}
}
The xml_string is something like this :
std::string xml_strings("
<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">
<html lang=\"en\" xml:lang=\"en\" xmlns=\"http://www.w3.org/1999/xhtml\">
<head>
<title>Demo page</title></head>
<body>
<div class=\"item\">
<div class=\"row\">
<div class=\"col-xs-8\">Item</div>
<div class=\"col-xs-4 value\">
<script type=\"text/javascript\">fruit('orange');</script>
</div></div></div>
</body></html>");
The function is called with the page and the XPath expression like this: xmlpp::NodeSet xmlNodes = xmlP(xml_strings, "/html/body/div/div/div[2]/script");
The problem is that I couldn't get the text inside the <script>. I tried dynamic_cast'ing to ContentNode, but nothing helped.
Is libxml++ worth it, or do I need to solve my problem with another XML library?
I'd appreciate any suggestions that get me the text value from the <script> tag.

I tried reproducing your issue locally and could not get root->find(xpath) to produce any nodes.
According to this issue, you need to tell XPath which namespace your nodes are under, even if it is the default namespace.
I changed the XPath string and find invocation as follows:
std::string xpath("/x:html/x:body/x:div/x:div/x:div[2]/x:script");
xmlpp::Node::PrefixNsMap nsMap = {{"x",root->get_namespace_uri()}};
xmlpp::Node::NodeSet elemns = root->find(xpath, nsMap);
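if (!elemns.empty()) {
    // the XPath selects the <script> element itself, so cast to Element
    const auto scriptElement = dynamic_cast<const xmlpp::Element*>(elemns[0]);
    // get_first_child_text() returns a null pointer when the element has no text child
    const auto text = scriptElement ? scriptElement->get_first_child_text() : nullptr;
    if (text) {
        std::cout << text->get_content() << std::endl;
    }
}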

Related

Getting error from sscanf when trying to read string using CGI

I am getting an error from sscanf when I try to read a string and extract a certain part of it. The string comes from an HTML form, and I am using CGI on the server side to read the form data.
<!DOCTYPE html>
<html>
<head>
<title>CGI-zoo</title>
<style>
body {
text-align: center;
}
#nameErrorMsg {
color: red;
}
#title {
color: orange;
}
#animalDiv {
border-style: groove;
}
</style>
</head>
<body>
<script>
//global variable for name
var name;
//function :check()
//Parameters : void
//Returns : void
//Description : This function will make sure that the user enters something for the name and does not leave the text box blank
function checkName()
{
name= name=document.getElementById("nameEntered").value;
if(name==""||name==undefined)
{
//error msg
document.getElementById("nameErrorMsg").innerHTML="This text box can not be blank";
return false;
}
else {
//hide the name text box and button
document.getElementById("nameDiv").hidden=true;
//unhides the drop list
document.getElementById("askingUserWhatAnimal").innerHTML="Hello "+name+ ", Please choose one of the six animals to learn some facts from";
document.getElementById("animalDiv").hidden=false;
//unhide the submit form button
document.getElementById("submitForm").hidden=false;
return true;
}
}
</script>
<form id="cgiZoo" action="animal.exe" method="get">
<h1 id="title"> cgi-Zoo</h1>
<hr>
<div id="nameDiv">
<p id="nameErrorMsg"></p>
<label id="nameLabel">Name:</label>
<input type="text" id="nameEntered" placeholder="Name" name="nameEntered" v>
<button type="button" onclick="checkName()" id="nameCheckBtn">Submit </button>
</div>
<!---Animal drop down list -->
<div hidden id="animalDiv">
<p id="askingUserWhatAnimal"></p>
<label>Animals</label>
<select name="animalList" id="animalList" required="required">
<option></option>
<option value="Lion">Lion</option>
<option value="Elephant">Elephant</option>
<option value="Giraffe">Giraffe</option>
<option value="Moneky">Moneky</option>
<option value="Zebra">Zebra</option>
<option value="Panda">Panda</option>
</select>
<br>
<br>
</div>
<br>
<input hidden type="submit" id="submitForm" form="cgiZoo" name="submit">
</form>
</body>
</html>
When you submit this form, it is supposed to take you to a server-side CGI program that tells the user what name they entered and which animal they picked.
But I am having trouble reading and splitting the string from the data using sscanf.
CGI code using C++:
#include <iostream>
#include <string.h>
#pragma warning(disable: 4996)
#define CGI_VARS 30
#define SEND_METHOD 19 // REQUEST_METHOD tells us if the form was sent using GET or POST
#define DATA_LENGTH 14
using namespace std;
int main()
{
cout << "Content-type:text/html\r\n\r\n";
cout << "<html>\n";
cout << "<head>\n";
cout << "<title>CGI Environment Variables</title>\n";
cout << "</head>\n";
cout << "<body>\n";
char* data;
char* name;
char *animal;
//returns pointer
data = getenv("QUERY_STRING");
if (sscanf(data, "nameEntered=%s&animalName=%s", name, animal));
cout << name;
cout << animal;
cout << "</body>";
return 0;
}
Much help would be appreciated with this. I don't know what I am doing wrong. Below this I will show what the output should look like.
Output:
name="bob"
animal="lion"

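name and animal don't point at any memory that you've allocated, so sscanf writes through uninitialized pointers. One simple fix is to make them char[]s. Also note that the <select> in your form is named animalList, so that is the key that appears in the query string, not animalName:
char name[256];
char animal[256];
char* data = getenv("QUERY_STRING");
// check that data is non-null and check the sscanf result:
if(data && sscanf(data, "nameEntered=%255[^&]&animalList=%255[^&]", name, animal) == 2)
{
    // the 255 in each conversion limits the scan to the size of the char[] - 1
    std::cout << "name=\"" << name << "\"\n\nanimal=\"" << animal << "\"\n";
}
Note %[^&] instead of %s for the scans. %s would read up to the next whitespace and swallow the rest of the query string, whereas %[^&] scans up to, but not including, the next &.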
Demo
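As a rough, self-contained sketch of the same idea, the scan can be wrapped in a small function that is easy to try outside a web server. The helper name parse_query is hypothetical, and the key animalList is assumed from the name attribute of the <select> in the form above. The values are returned as they appear in the query string, so any URL encoding (for example + for spaces) is left untouched.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the original post): pulls the two form
// fields out of a query string such as
//   nameEntered=bob&animalList=Lion&submit=Submit
// Returns true only when both fields were scanned successfully.
bool parse_query(const char* data, std::string& name, std::string& animal)
{
    char name_buf[256];
    char animal_buf[256];

    // getenv("QUERY_STRING") returns a null pointer when the variable is unset
    if (data == nullptr) {
        return false;
    }

    // %255[^&] reads at most 255 characters and stops before the next '&',
    // which leaves room for the terminating '\0' in each 256-byte buffer
    if (std::sscanf(data, "nameEntered=%255[^&]&animalList=%255[^&]",
                    name_buf, animal_buf) != 2) {
        return false;
    }

    name = name_buf;
    animal = animal_buf;
    return true;
}
```

In the CGI program, this would be called as parse_query(getenv("QUERY_STRING"), name, animal), printing the two values only when it returns true.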

Getting garbage when reading string from std::smatch

I've been trying to fix what I think is a memory issue for several hours now, and I can't figure out the cause.
I have two classes, StaticTag and ExtendsTag, that I create from the result of a std::smatch (<regex>). The method creating StaticTag always works, but the method creating ExtendsTag never does. The only difference is the regular expression, which itself always works because the substring is matched. The problem is that even printing a captured group almost always gives me garbage followed by a suffix of the correct string. Here is the code.
Execution of two methods find_static_tag and find_extends_tag (the latter does not work):
StaticTag static_tag;
if (Regex::find_static_tag(this->_content, static_tag)) {
}
ExtendsTag extends_tag;
if (Regex::find_extends_tag(this->_content, extends_tag)) {
}
Both methods presented below:
bool Regex::find_extends_tag(std::string in, ExtendsTag& tag) {
std::string pattern = "\\{%\\s*extends\\s+\"(\\S+)\"\\s*%\\}";
std::smatch match;
if (Regex::find_first(pattern, in, match)) {
std::cout << match[1] << std::endl;
tag = ExtendsTag(match[1], match.position(), match.length());
return true;
}
return false;
}
bool Regex::find_static_tag(std::string in, StaticTag& tag) {
std::string pattern = "\\{%\\s*static\\s+\"(\\S+)\"\\s*%\\}";
std::smatch match;
if (Regex::find_first(pattern, in, match)) {
std::cout << match[1] << std::endl;
tag = StaticTag(match[1], match.position(), match.length());
return true;
}
return false;
}
They need to match: {% static "style.css" %} and {% extends "master.html" %}.
I usually get something like this instead of "master.html": �0e.html.
I think it might be a memory issue, but I don't know where to start because the two methods look identical to me. It is not about creating the StaticTag and ExtendsTag objects; it already fails when printing match[1] in the find_extends_tag method.
EDIT: It looks like when I change static to extends in the second method, the result is the same. Is something wrong with <regex>, then?
Thank you for your help in advance!
The file I am trying to parse:
{% extends "master.html" %}
{% block content %}
<p>Hello ios/index.html</p>
<img src="{% static "html-css.png" %}"/>
{% endblock %}
EDIT 2:
Not sure why it started to work. I took the implementation of Regex::find_first(pattern, in, match) and put it directly into find_static_tag and find_extends_tag, and it started to work correctly.
Here is the implementation of find_extends_tag that always works:
bool Regex::find_extends_tag(std::string in, ExtendsTag& tag) {
std::smatch match;
if (std::regex_search(in, match, std::regex("\\{%\\s*extends\\s+\"(\\S+)\"\\s*%\\}"))) {
tag = ExtendsTag(match[1], match.position(), match.length());
return true;
}
return false;
}
Not sure what's the difference here.
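One plausible explanation, offered as an assumption because the body of Regex::find_first is not shown: a std::smatch does not own any characters. It only stores iterators into the string that was searched. If find_first took its input as std::string by value, the match would refer to a copy that is destroyed when find_first returns, and reading match[1] afterwards would print garbage. The inlined version works because the searched string `in` is still alive while match is used. The sketch below shows a hypothetical find_first that takes the subject by const reference, plus a helper (extends_target, also hypothetical) that copies the capture out.

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for Regex::find_first. The subject string is taken
// by const reference, so the iterators stored in `match` point into the
// caller's string, which outlives this call.
bool find_first(const std::string& pattern, const std::string& in,
                std::smatch& match)
{
    const std::regex re(pattern);
    return std::regex_search(in, match, re);
}

// Returns the file name captured from an {% extends "..." %} tag, or an
// empty string when no such tag is present. match[1].str() makes an owning
// copy, so the result stays valid after `match` goes away.
std::string extends_target(const std::string& in)
{
    std::smatch match;
    if (find_first("\\{%\\s*extends\\s+\"(\\S+)\"\\s*%\\}", in, match)) {
        return match[1].str();
    }
    return "";
}
```

Had the second parameter of find_first been declared as std::string in (by value), the same code would compile but leave match holding dangling iterators.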

Pretty printing XML in wxWidgets

I'm writing a class derived from wxStyledTextCtrl and I want it to prettify given XML without adding anything other than whitespace. I cannot find a simple working solution. I can only use wxStyledTextCtrl, wxXmlDocument and libxml2.
The result I'm aiming for is that after calling SetText with a wxString containing the following text
<!-- comment1 --> <!-- comment2 --> <node><emptynode/> <othernode>value</othernode></node>
the control should show
<!-- comment1 -->
<!-- comment2 -->
<node>
<emptynode/>
<othernode>value</othernode>
</node>
Using libxml2 I managed to almost achieve this, but it also prints the XML declaration (e.g. <?xml version="1.0" encoding="UTF-8"?>), and I don't want that.
To be clear, I'm looking for a simple and clean solution; I don't want to manually remove the first line of the formatted XML.
Is there any simple solution to this using given tools? I feel like I'm missing something.

Is there a simple solution? No. But if you want to write your own pretty-print function, you basically need to make a depth-first iteration over the XML document tree, printing it as you go. There's a slight complication in that you also need some way of knowing when to close a tag.
Here's an incomplete example of one way to do this using only the wxWidgets XML classes. Currently, it doesn't handle attributes, self-closing elements (such as '<emptynode/>' in your sample text), or any other special element types. A complete pretty printer would need to add those things.
#include <stack>
#include <set>
#include <wx/xml/xml.h>
#include <wx/sstream.h>
wxString PrettyPrint(const wxString& in)
{
wxStringInputStream string_stream(in);
wxXmlDocument doc(string_stream);
wxString pretty_print;
if (doc.IsOk())
{
std::stack<wxXmlNode*> nodes_in_progress;
std::set<wxXmlNode*> visited_nodes;
nodes_in_progress.push(doc.GetDocumentNode());
while (!nodes_in_progress.empty())
{
wxXmlNode* cur_node = nodes_in_progress.top();
nodes_in_progress.pop();
int depth = cur_node->GetDepth();
for (int i=1;i<depth;++i)
{
pretty_print << "\t";
}
if (visited_nodes.find(cur_node)!=visited_nodes.end())
{
pretty_print << "</" << cur_node->GetName() << ">\n";
}
else if ( !cur_node->GetNodeContent().IsEmpty() )
{
//If the node has content, just print it now
pretty_print << "<" << cur_node->GetName() << ">";
pretty_print << cur_node->GetNodeContent() ;
pretty_print << "</" << cur_node->GetName() << ">\n";
}
else if (cur_node==doc.GetDocumentNode())
{
std::stack<wxXmlNode *> nodes_to_add;
wxXmlNode *child = cur_node->GetChildren();
while (child)
{
nodes_to_add.push(child);
child = child->GetNext();
}
while (!nodes_to_add.empty())
{
nodes_in_progress.push(nodes_to_add.top());
nodes_to_add.pop();
}
}
else if (cur_node->GetType()==wxXML_COMMENT_NODE)
{
pretty_print << "<!-- " << cur_node->GetContent() << " -->\n";
}
//insert checks for other types of nodes with special
//printing requirements here
else
{
//otherwise, mark the node as visited and then put it back
visited_nodes.insert(cur_node);
nodes_in_progress.push(cur_node);
//If we push the children in order, they'll be popped
//in reverse order.
std::stack<wxXmlNode *> nodes_to_add;
wxXmlNode *child = cur_node->GetChildren();
while (child)
{
nodes_to_add.push(child);
child = child->GetNext();
}
while (!nodes_to_add.empty())
{
nodes_in_progress.push(nodes_to_add.top());
nodes_to_add.pop();
}
pretty_print <<"<" << cur_node->GetName() << ">\n";
}
}
}
return pretty_print;
}

Drupal7 custom menu code in template adds stray div for no reason

I am hoping someone more knowledgeable here can point out what the problem is.
I am making a custom menu for Drupal 7 for a particular theme I am working on, which uses the menu_views module. Everything works nicely until I pass the view menu entry over to menu_views to parse, at which point Drupal adds a broken <div class=">...</div> around the parent UL element of the view menu. I have gone through the code and don't see how this is even happening. If I comment out the call to the view parsing, it doesn't add this DIV, but the view parsing shouldn't be touching the parent UL element at all.
Here is how the HTML is output:
<ul class="sub-menu collapse" id="parent_">
<div class="> <li class=" first=" " expanded=" " active-trail "=" ">Por nome
<ul class="menu-content collapsed in " id=" ">
<div class="view view-nameofview view-id-nameofview etc ">
<div class="view-content ">
<div class="item-list ">
<ul class="views-summary ">
<li>Á
</li>
</ul>
</div>
</div>
</div>
</ul>
</div>
</ul>
Here is the template code that causes this:
function bstheme_menu_link__main_menu($variables) {
$element = $variables['element'];
// resolve conflict with menu_views module
if (module_exists('menu_views') && $element['#href'] == '<view>') {
return _bstheme_menu_views_menu_link($variables); //<<<< IF I COMMENT OUT THIS THE OUTPUT IS FINE
}
static $item_id = 0;
// Add an ID for easy identifying in jquery and such
$element['#attributes']['id'] = 'menu_'.str_replace(' ', '_',strtolower($element['#title']));
if(!empty($element['#original_link']['menu_name']) && $element['#original_link']['menu_name'] == 'main-menu'){
if($element['#original_link']['has_children'] == 1){
$element['#attributes']['data-target'] = "jquery_updates_this";
$element['#attributes']['data-toggle'] = "collapse";
}
// add class parent and remove leaf
$classes_count = count($element['#attributes']['class']);
for($i=0;$i<$classes_count;++$i){
if($element['#attributes']['class'][$i] == 'expanded'){
//$element['#attributes']['class'][$i] = 'collapse';
}
if($element['#original_link']['plid'] == 0){
if($element['#attributes']['class'][$i] == 'leaf'){
unset($element['#attributes']['class'][$i]);
}
}
else{
if($element['#attributes']['class'][$i] == 'leaf'){
$element['#attributes']['class'][$i] = '';
}
}
}
}
// code to add a span item for the glythicons
$switch = $element['#original_link']['has_children'];
$element['#localized_options']['html'] = TRUE;
if($switch == 1) {
$linktext = $element['#title'] . '<span class="arrow"></span>';
} else {
$linktext = $element['#title'];
}
// if there's a submenu, send the parsing to the custom function instead of the main one to wrap different classes
if ($element['#below']) {
foreach ($element['#below'] as $key => $val) {
if (is_numeric($key)) {
$element['#below'][$key]['#theme'] = 'menu_link__main_menu_inner'; // 2 lavel
}
}
$element['#below']['#theme_wrappers'][0] = 'menu_tree__main_menu_inner'; // 2 lavel
$sub_menu = drupal_render($element['#below']);
$element['#attributes']['class'][] = 'menu-toggle';
}
//$sub_menu = $element['#below'] ? drupal_render($element['#below']) : '';
$output = l($linktext, $element['#href'], $element['#localized_options']);
return '<li' . drupal_attributes($element['#attributes']) . '>' . $output . $sub_menu . '</li>'."\n";
}
function _bstheme_menu_views_menu_link(&$variables) {
// Only intercept if this menu link is a view.
$view = _menu_views_replace_menu_item($variables['element']);// <<< MENU VIEWS PARSING
if ($view !== FALSE) {
if (!empty($view)) {
$sub_menu = '';
if ($variables['element']['#below']) {
$sub_menu = render($variables['element']['#below']);
}
return '' . $view . $sub_menu . "\n"; // <<< RETURN PATH
}
return '';
}
return theme('menu_views_menu_link_default', $variables);
}
Any pointers on how to troubleshoot something like this, or if someone has encountered this problem before and has a solution, would be greatly helpful!

From your code, it's apparent you're using Drupal 7.
First things first, you may want to enable theme debug mode. This lets you see which theme function or template produced each piece of markup.
You can do so by putting the following line in your settings.php file:
$conf['theme_debug'] = TRUE;
Flush your caches after you make this change.
You will now have debug code output to your Drupal HTML source, when you view the site's source. An example of the type of output is shown below:
<!-- THEME DEBUG -->
<!-- CALL: theme('page') -->
<!-- FILE NAME SUGGESTIONS:
x page--front.tpl.php
* page--node.tpl.php
* page.tpl.php
-->
With this debug output, you should be able to see exactly which theme functions run and in which order. By working through them from start to finish, you should be able to determine which one is responsible.
At this point, if you want to follow Drupal best practices, copy the template named in the file name suggestions into a folder inside your theme folder. I usually put all template overrides in a sub-directory inside it.
In the case above, if it was page.tpl.php, I'd copy it to /themes/mytheme/templates/ and hack on it to see whether the offending div is being generated there.
Best of luck, and if you hit a dead end, I'd be happy to help point you in a direction more specific to your use case.
Best,
Karl

Find Key in XML with Boost

I am using boost for the first time within an old code base that we have
iptree children = pt.get_child(fieldName);
for (const auto& kv : children) {
boost::property_tree::iptree subtree = (boost::property_tree::iptree) kv.second ;
//Recursive call
}
My problem is that sometimes fieldName doesn't exist in the XML file, and I get an exception.
I tried:
boost::property_tree::iptree::assoc_iterator it = pt.find(fieldName);
but I don't know how to use the iterator; I can't write: if (it != null)
Any help will be appreciated.
I am using VS 2012.
If it's very complicated, is there any other way to read XML with nested nodes? I have been working on this for 3 days.
This is an example of the XML:
<?xml version="1.0" encoding="utf-8"?>
<nodeA xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<nodeA.1>This is the Adresse</nodeA.1>
<nodeA.2>
<node1>
<node1.1>
<node1.1.1>Female</node1.1.1>
<node1.1.2>23</node1.1.2>
<node1.1.3>Engineer</node1.1.3>
</node1.1>
<node1.2>
<node1.2.1>Female</node1.2.1>
<node1.2.2>35</node1.2.2>
<node1.2.3>Doctors</node1.2.3>
</node1.2>
</node1>
</nodeA.2>
<nodeA.3>Car 1</nodeA.3>
</nodeA>

Use pt.get_child_optional(...) to avoid the exception. pt.find(...) returns an iterator that compares equal to pt.not_found() on failure.
EDIT: How to use boost::optional:
boost::optional< iptree & > chl = pt.get_child_optional(fieldName);
if(chl) {
    // iterate by const reference to avoid copying each subtree
    for( const auto& a : *chl )
        std::cerr << ":" << a.first << ":" << std::endl;
}
