Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3 (test-xml.clj:6) - clojure

I'm trying to write a function in Clojure that prints out XML using clojure.xml/emit.
(ns test.xml.emit
  (:use clojure.core)
  (:require [clojure.xml :as xml]))
(defn testemit []
  (xml/emit {:tag :web-app
             :attrs {:xmlns:xsi "http://www.w3.org/2001/XMLSchema-instance"
                     :xmlns "http://java.sun.com/xml/ns/javaee"
                     :xmlns:web "http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
                     :xsi:schemaLocation "http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
                     :id "Foo"
                     :version "1.0"},
             :content [{:display-name "FooBar+"}
                       {:listener
                        {:listener-class "com.example.server.Main"}}
                       {:filter
                        {:filter-name "guiceFilter"}
                        {:filter-class "com.google.inject.servlet.GuiceFilter"}}
                       {:filter-mappings
                        {:filter-name "guiceFilter"}
                        {:url-pattern "/*"}}]}))
I know what the exception means, but I'm not sure how it relates to my code. Could someone please point me in the right direction?
The full stack trace is available at https://gist.github.com/1838248
Thank you for your time and consideration

The error for this in Clojure 1.3 is a bit more helpful: "Map literal must contain an even number of forms".
The problem is in the last two entries of the :content vector: they are literal maps containing three forms. A map literal is read as key-value pairs, so it must contain an even number of forms.
Further, the entries in :content don't look like valid data to pass to emit. Each node should be either a string or a map with :tag, :attrs, and :content keys.
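As a rough sketch of that node shape, here is part of the question's data rebuilt so that every element is a map with :tag, :attrs and :content. The node helper is hypothetical (it is not part of clojure.xml), only a few of the attributes are kept for brevity, and wrapping the call in with-out-str is just a convenient way to capture what emit prints.

```clojure
(require '[clojure.xml :as xml])

;; Hypothetical helper, not part of clojure.xml: builds a node map with
;; the three keys that xml/emit reads.
(defn node [tag attrs & content]
  {:tag tag :attrs attrs :content (vec content)})

(def web-app
  (node :web-app
        {:xmlns "http://java.sun.com/xml/ns/javaee"
         :id "Foo"
         :version "1.0"}
        (node :display-name nil "FooBar+")
        (node :listener nil
              (node :listener-class nil "com.example.server.Main"))
        (node :filter nil
              (node :filter-name nil "guiceFilter")
              (node :filter-class nil "com.google.inject.servlet.GuiceFilter"))))

;; xml/emit prints to *out*; with-out-str captures that output as a string.
(def emitted (with-out-str (xml/emit web-app)))
(println emitted)
```

Every map here has an even number of forms, and text such as "FooBar+" sits in a :content vector as a plain string rather than as a map value.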

Related

Generating from recursive definitions with Clojure Spec

Let's consider a Clojure Spec regexp for hiccup syntax
(require '[clojure.spec :as spec])
(spec/def ::hiccup
  (spec/cat :tag keyword?
            :attributes (spec/? map?)
            :content (spec/* (spec/or :terminal string?
                                      :element ::hiccup))))
which works splendidly
(spec/conform ::hiccup [:div#app [:h5 {:id "loading-message"} "Connecting..."]])
; => {:tag :div#app, :content [[:element {:tag :h5, :attributes {:id "loading-message"}, :content [[:terminal "Connecting..."]]}]]}
until you try to generate some example data for your functions from the spec
(require '[clojure.spec.gen :as gen])
(gen/generate (spec/gen ::hiccup))
; No return value but:
; 1. Unhandled java.lang.OutOfMemoryError
; GC overhead limit exceeded
Is there a way to rewrite the spec so that it produces a working generator? Or do we have to attach some simplified generator to the spec?
The intent of spec/*recursion-limit* (default 4) is to limit recursive generation such that this should work. So either that's not working properly in one of the spec implementations (spec/* or spec/or), or you are seeing rapid growth in something else (like map? or the strings). Without doing some tinkering, it's hard to know which is the problem.
This does generate (a very large example) for me:
(binding [spec/*recursion-limit* 1] (gen/generate (spec/gen ::hiccup)))
I do see several areas where the cardinalities are large even in that one example - the * and the size of the generated attributes map?. Both of those could be further constrained. It would be easiest to break these parts up further into more fine-grained specs and supply override generators where necessary (the attribute map could just be handled with map-of and :gen-max).
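A minimal sketch of that split, assuming the clojure.spec.alpha namespace (the name under which spec ships with Clojure 1.9 and later) and assuming, purely for illustration, that attribute values are strings. The ::attributes spec is a name introduced here, not one from the question.

```clojure
(require '[clojure.spec.alpha :as s])

;; Attributes get their own spec. :gen-max caps how many entries the
;; generator will produce for this map.
(s/def ::attributes (s/map-of keyword? string? :gen-max 3))

;; Same shape as the original ::hiccup, but referring to ::attributes
;; instead of the unconstrained map? predicate.
(s/def ::hiccup
  (s/cat :tag keyword?
         :attributes (s/? ::attributes)
         :content (s/* (s/or :terminal string?
                             :element ::hiccup))))

(println (s/valid? ::hiccup
                   [:div#app [:h5 {:id "loading-message"} "Connecting..."]]))
```

Generating from this with (gen/generate (s/gen ::hiccup)) additionally needs org.clojure/test.check on the classpath. I have not measured whether this alone avoids the OutOfMemoryError; the repetition in s/* is still unconstrained and may need an override generator via s/with-gen.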

Map literal must contain an even number of forms

I'm trying to make a request to a web service that requires data in JSON and a secret (:key)
(ns fdsfdsfds.core
  (:require [clj-http.client :as client])
  (:require [clojure.data.json :as json]))
(defn -main [& args]
  (client/post "https://fsdfdsfd.com/api/fdsfds"
               {:body {(json/write-str {:key "fdsfdsfdsfd"})}}))
I get this error:
Exception in thread "main" java.lang.RuntimeException: Map literal must contain an even number of forms
There is an even number of them, though.
The problem is here:
{:body { (json/write-str {:key "fdsfdsfdsfd"}) }}
         ^--- single item missing value? ----^
       ^----- this is a map too
There's nothing paired with the function call.
The value of :body is itself a map literal, but it contains only one form, the function call. If that call is meant to be a key, it has no value; if it is meant to be a value, it has no key.
You probably want to remove the outer map brackets and leave:
{:body (json/write-str {:key "fdsfdsfdsfd"})}
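You can see the same thing without any HTTP or JSON library by handing the two shapes to the reader. This is only a sketch: read-error is a hypothetical helper written for this example, and a plain string stands in for the result of json/write-str.

```clojure
;; Hypothetical helper: returns the reader's error message for a string of
;; Clojure source, or nil if the string reads cleanly.
(defn read-error [source]
  (try
    (read-string source)
    nil
    (catch Exception e
      (.getMessage e))))

;; Inner map literal holds a single form, like the code in the question.
(def broken "{:body {\"some json string\"}}")
;; Outer map only: :body is paired directly with the string.
(def fixed "{:body \"some json string\"}")

(println (read-error broken))
(println (read-error fixed))
(println (read-string fixed))
```

The first call reports the odd number of forms; the second reads cleanly because :body now has exactly one value.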
EDIT AFTER COMMENTS:
You're asking why the example on the site uses a map. Look carefully at the value being used for :body: it's a string.
(client/post "url://site.com/api"
             {:basic-auth ["user" "pass"]
              :body "{\"json\": \"input\"}"
              ;; ...
The options map is made up of key/value pairs, one per line. The first is
key = :basic-auth, value = ["user" "pass"]
The value here is a vector.
The second is
key = :body, value = "any old string"
In this case the string is JSON text with its quotes escaped, the same kind of string that json/write-str returns.

enlive: smashing vectors of nodes together

So I have finally realized that I can use selectors to limit the portions of the page nodes that enlive transforms, that way I can create vectors of non-intersecting nodes.
Lots of words to say:
(defn b-content-transform []
  (let [b-area (eh/select global-page [:.b])] ;; cuts out all irrelevant nodes
    (eh/transform b-area [:.b]
                  (eh/clone-for [i (range numberOfB)]
                    (eh/content (b-sample-content i))))))
So this returns something like:
[{:tag :div, :attrs {:class "b"}, :content ({:tag :div, :attrs {:id "b0", :class "topB"}]
Which is excellent, enlive nodes yay!
Now I have several transforms that act the same way.
My question is: how can I mash all the resultant vectors (?) together?
Well it turns out there is a knee-slappingly simple solution:
(concat transform1 transform2 transform3)
and then pass the result to enlive-html/emit*.
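A small sketch of the concat step using hand-written node maps in place of real transform results, so it runs without enlive. The three transformN values and their contents are made up for illustration.

```clojure
;; Stand-ins for the results of three separate transforms. Each is a
;; sequence of node maps; vectors and lists both work with concat.
(def transform1 [{:tag :div :attrs {:class "a"} :content ["first"]}])
(def transform2 [{:tag :div :attrs {:class "b"} :content ["second"]}])
(def transform3 '({:tag :div :attrs {:class "c"} :content ["third"]}))

;; concat returns one lazy sequence holding all nodes in order.
(def all-nodes (concat transform1 transform2 transform3))

(println (map (comp :class :attrs) all-nodes))
```

Since the result is still just a sequence of nodes, it can be handed to the emitter the same way a single transform result would be.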

Available Clojure XML parsing libraries that complement clojure-xml/parse

Are there secondary clojure xml parsing projects that could be used after or in conjunction with clojure-xml/parse, and, if so, what are they?
clojure-xml/parse works wonderfully, but the map returned by clojure-xml/parse is deeply nested, at least after parsing one of our water cuts/tampers xml files. I am wondering if a secondary library exists that would allow me to parse further.
Here is just part of our xml file deliberately folded so you do not have to scroll.
:content [{:tag :Header, :attrs nil, :content [{:tag :ExportType,
:attrs nil, :content ["Tamper Export"]}
{:tag :CurrentDateTime, :attrs nil, :content ["
Notice the vector with embedded maps.
I can certainly develop something that could be used to parse this further, but I was just wondering if a module already exists.
Thank You.
The library to "parse" the content further is clojure.core. The functions and macros there can do a very good job of transforming the data structure generated from the XML into something useful. My personal favorite technique is using the two threading macros while making use of first and the keyword functions. If I need to do more than just digging deep, I'll write a quick function I can use map on.
The data structure you get back from clojure.xml/parse is just as deep as the XML: each element is one map with three entries, the content being a vector of child elements and strings. It may look a little bit deeper, but it's just an open representation of what would be stored, say, in the Java XML objects. Its biggest advantage is that you don't need a special API to work with it; the functions you use on normal data work on the XML just as well. If anything, you write a few functions to translate into your domain and that's it.
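To make the threading-macro idea concrete, here is a sketch on a hand-written fragment shaped like the one in the question. The :Note element is hypothetical, added only so there is more than one child to look at.

```clojure
(def header
  {:tag :Header
   :attrs nil
   :content [{:tag :ExportType :attrs nil :content ["Tamper Export"]}
             ;; hypothetical second child, for illustration only
             {:tag :Note :attrs nil :content ["example"]}]})

;; Thread-first: walk down one path to the text of the first child.
(def export-type (-> header :content first :content first))

;; Thread-last: collect the tag of every child element.
(def child-tags (->> header :content (filter map?) (map :tag)))

(println export-type)
(println child-tags)
```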
Say you have something like the following (I'm leaving out attrs for brevity):
{:tag :stuff
 :content [{:tag :item
            :content [{:tag :key :content ["Key one"]}
                      {:tag :value :content ["Item one"]}]}
           {:tag :item
            :content [{:tag :key :content ["Key two"]}
                      {:tag :value :content ["Item two"]}]}]}
It's nested, but make a utility function for transforming each item into something usable.
(defn transform-item [item]
  (let [key-element (-> item :content first)
        value-element (-> item :content second)]
    [(-> key-element :content first)
     (-> value-element :content first)]))
And then map that on the content of the root element.
(defn transform-stuff [stuff-xml]
  (into {} (map transform-item (:content stuff-xml))))
And you should end up with some data which actually represents your domain.
{"Key one" "Item one", "Key two" "Item two"}
The key is to not think of it as parsing, but just as translating one data structure into another.
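Putting it together, here is an end-to-end sketch that starts from an XML string instead of a hand-written map. The XML text and the parse-string helper are assumptions made for this example; the two transform functions follow the answer's approach and are repeated so the block runs on its own.

```clojure
(require '[clojure.xml :as xml])

;; Made-up input with the same shape as the example structure.
(def xml-text
  (str "<stuff>"
       "<item><key>Key one</key><value>Item one</value></item>"
       "<item><key>Key two</key><value>Item two</value></item>"
       "</stuff>"))

;; Hypothetical helper: clojure.xml/parse accepts an InputStream, so wrap
;; the string's bytes in one.
(defn parse-string [^String s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

;; Turns one <item> element into a [key value] pair.
(defn transform-item [item]
  (let [key-element (-> item :content first)
        value-element (-> item :content second)]
    [(-> key-element :content first)
     (-> value-element :content first)]))

;; Turns the root element into a plain map of key text to value text.
(defn transform-stuff [stuff-xml]
  (into {} (map transform-item (:content stuff-xml))))

(def result (transform-stuff (parse-string xml-text)))
(println result)
```

Note that this relies on <key> always coming before <value> inside each item; if the order can vary, look the children up by :tag instead of by position.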

How to select nth element of particular type in enlive?

I am trying to scrape some data from a page with a table based layout. So, to get some of the data I need to get something like 3rd table inside 2nd table inside 5th table inside 1st table inside body. I am trying to use enlive, but cannot figure out how to use nth-of-type and other selector steps. To make matters worse, the page in question has a single top level table inside the body, but (select data [:body :> :table]) returns 6 results for some reason. What the hell am I doing wrong?
For nth-of-type, does the following example help?
user> (require '[net.cgrand.enlive-html :as html])
user> (def test-html
"<html><head></head><body><p>first</p><p>second</p><p>third</p></body></html>")
#'user/test-html
user> (html/select (html/html-resource (java.io.StringReader. test-html))
[[:p (html/nth-of-type 2)]])
({:tag :p, :attrs nil, :content ["second"]})
No idea about the second issue. Your approach seems to work with a naive test:
user> (def test-html "<html><head></head><body><div><p>in div</p></div><p>not in div</p></body></html>")
#'user/test-html
user> (html/select (html/html-resource (java.io.StringReader. test-html)) [:body :> :p])
({:tag :p, :attrs nil, :content ["not in div"]})
Any chance of looking at your actual HTML?
Update: (in response to the comment)
Here's another example where "the second <p> inside the <div> inside the second <div> inside whatever" is returned:
user> (def test-html "<html><head></head><body><div><p>this is not the one</p><p>nor this</p><div><p>or for that matter this</p><p>skip this one too</p></div></div><span><p>definitely not this one</p></span><div><p>not this one</p><p>not this one either</p><div><p>not this one, but almost</p><p>this one</p></div></div><p>certainly not this one</p></body></html>")
#'user/test-html
user> (html/select (html/html-resource (java.io.StringReader. test-html))
[[:div (html/nth-of-type 2)] :> :div :> [:p (html/nth-of-type 2)]])
({:tag :p, :attrs nil, :content ["this one"]})