I've been trying to get a simple reg-ex working in Clojure to test a string for some SQL reserved words (select, from, where etc.) but just can't get it to work:
(defn areserved? [c]
(re-find #"select|from|where|order by|group by" c))
(I split a string by spaces then go over all the words)
Help would be greatly appreciated,
Thanks!
EDIT: My first goal (after only reading some examples and basic Clojure materials) is to parse a string and return for each part of it (i.e. words) what "job" they have in the statement (a reserved word, a string etc.).
What I have so far:
(use '[clojure.string :only (join split)])
(defn isdigit? [c]
(re-find #"[0-9]" c))
(defn isletter? [c]
(re-find #"[a-zA-Z]" c))
(defn issymbol? [c]
(re-find #"[\(\)\[\]!\.+-><=\?*]" c))
(defn isstring? [c]
(re-find #"[\"']" c))
(defn areserved? [c]
(if (re-find #"select|from|where|order by|group by" c)
true
false))
(defn what-is [token]
(let [c (subs token 0 1)]
(cond
(isletter? c) :word
(areserved? c) :reserved
(isdigit? c) :number
(issymbol? c) :symbol
(isstring? c) :string)))
(defn checkr [token]
{:token token
:type (what-is token)})
(defn goparse [sql-str]
(map checkr (.split sql-str " ")))
Thanks for all the help guys! it's great to see so much support for such a relatively new language (at least for me :) )
I'm not entirely sure what you want exactly, but here's a couple of variations to coerce your first regex match to a boolean:
(defn areserved? [c]
(string?
(re-find #"select|from|where|order by|group by"c)))
(defn areserved? [c]
(if (re-find #"select|from|where|order by|group by"c)
true
false))
UPDATE in response to question edit:
Thanks for posting more code. Unfortunately there are a number of issues here that we could
try to address by patching your existing code in a simplistic and naïve fashion, but it will
only get you so far, before you hit the next problem with this single iteration approach.
#alex is correct, that your areserved? method will fail to match order by if you have already
split your string by white space. That said, a simple fix is to treat order and by as separate keywords (which they are, even though they always appear together).
The next issue is that the areserved? function will match keywords in a string, but you are dispatching it against a character in the what-is function. You nearly always get a match in your cond for isletter?, so you will everything is marked as a 'word'.
All in all, it looks like you are trying to do too much work in a single application of map.
I'm not sure if you are just doing this for fun to play with Clojure (which is admirable - keep going!), in which case, maybe it doesn't matter if you press on with this simple parsing approach... you'll definitely learn something; but if you would like to take it further and parse SQL more successfully, then I would suggest that you may find it helpful to to read a little on Lexing, Parsing and building Abstract Syntax Trees (AST).
Brian Carper has written about using the Java parser generator "ANTLR" from Clojure - it's a few years old, but might be worth looking at.
You also might be able to get some transferrable ideas from this chapter from the F# programming book on lexing and parsing SQL.
Related
Here's some code I wrote, using clojure.core.match, which performs a pretty common programmng task. A function takes some "commands" (or "objects", "records", or whatever you prefer to call them), has to do something different with each type, and has to destructure them to figure out exactly what to do, and different command types might have to be destructured differently:
(defn action->edits [g action]
"Returns vector of edits needed to perform action in graph g."
(match action
[:boost from to]
[[:add-edge from to 1.0]]
[:retract from to]
[[:remove-edge from to]]
[:normalize from to] ; a change has just been made to from->to
(map (fn [that] [:remove-edge from that])
(successors-except g from to))
[:recip-normalize to from] ; a change has just been made to from->to
[]
[:reduce-to-unofficial from to competitor]
[[:remove-edge from to] (make-competitive-edge from competitor]))
I'm mostly imitating the way people commonly use the pmatch macro in Scheme. I'd like to know what's the idiomatic way to do this in Clojure.
Here's what I like about the above code:
It's very readable.
It was effortless to write.
Here's what I don't like:
Accessing the from and to fields from anywhere but inside a match macro is extremely unreadable and error-prone. For example, to extract the from element from most of the action vectors, you write (action 1). That code will break if I ever add a new action, and it breaks right now on :recip-normalize.
The code generated by match is inefficient: it searches by repeatedly throwing and catching exceptions. It doesn't just generate a big nested if.
I experimented a little with representing the commands as maps, but it seemed to get verbose, and the name of the command doesn't stand out well, greatly reducing readability:
(match action
{:action :boost :from from :to to}
[{:edit :add-edge :from from :to to :weight 1.0}]
{:action :retract :from from :to to}
[{:edit :remove-edge :from from :to to}]
. . .)
Probably future versions of match will generate better code, but the poor code generated now (and lack of support for records) suggests that in Clojure, people have been handling this kind of thing happily for years without match. So how do you do this kind of thing in Clojure?
I would utilize clojure's build-in destructuring facilities, since I do not see a requirement for core.match here - but I might be missing something.
For example:
(defn action->edits [g [action from to]]
(condp = action
:boost "boosting"
:retract "retracting"
:normalize-ksp-style (recur g [:boost from to])
nil))
(action->edits 2 [:normalize-ksp-style 1 2])
;=> "boosting"
I am a beginner in clojure. I am trying to solve this simple problem on codechef using clojure. Below is my clojure code but this code is taking too long to run and gives TimeoutException. Can someone please help me to optimize this code and make it run faster.
(defn checkCase [str]
(let [len (count str)]
(and (> len 1) (re-matches #"[A-Z]+" str))))
(println (count (filter checkCase (.split (read-line) " "))))
Note: My program is not getting timedout due to input error. On codechef input is handled automatically (probably through input redirection. Please read the question for more details)
Thank you!
Most text finding exercises are exercizes in regexps, this one no different. It's usually pretty hard to find a more efficient way in whatever programming language that will outpace good regexp implementations.
In this case re-seq, look around regexps, repetition limiting and the multiline regexp flag (?m) are your friends
(defn find-acronyms
[s]
(re-seq #"(?m)(?<=\W|^)[A-Z]+(?=\W|$)" s))
(find-acronyms "I like coding and will participate in IOI Then there is ICPC")
=> ("IOI" "ICPC")
Let's dissect the regex:
(?m) The multiline flag: lets you match your regex over multiple lines, so no need to split into multiple strings
(?<=\W|^) The match should follow a non-word character or the beginning of the (multiline) string
[A-Z]{2,} Match concurrent capital letters, a minimum of 2
(?=\W|$) The match should be followed by a non-word character or the end of the (multiline) string
I can only guess that wherever you run this snippet of code, it doesn't feed anything to your read-line invocation. Or maybe it does, but doesn't send a newline as the last thing. So it hangs waiting.
(defn checkCase [str]
(let [len (count str)]
(and (> len 1) (re-matches #"[A-Z]+" str))))
(defn answer [str]
(println (count (filter checkCase (.split str " ")))))
So at the REPL:
=> (answer "GGG fff TTT")
;-> 2
;-> nil
The answer is being printed to the screen. But probably best to have your function return the answer rather than print it out:
(defn answer [str]
(count (filter checkCase (.split str " "))))
All I have done is replaced your (read-line) with an argument. (read-line) is expecting input from stdin and waiting for it forever - or until a timeout happens in your case.
I am not sure if this is the slow part of your code, but if it is your could try to split up the execution and safe gard the very slow regexp part by executing it when it is necessary. I think the current version with AND already does that. If it does not you can try to do something else, like this:
(defn checkCase [^String str]
(cond
(< (.length str) 2)
false
(re-matches #"[A-Z]+" str)
true
:else
false))
maybe you could try using re-seq instead of spltting the string and checking every item? So you will lose the filter, .split, and additional function call. Something like this:
(println (count (re-seq #"\b[A-Z]{2,}?\b" (read-line))))
You need to submit a Java program. You can test it on the command line before you submit it. You can but don't need to use redirection symbols (<,>). Just type the input and see that every time you do it returns the count after you have typed enter.
You will need aot compilation (Ahead Of Time, which means that .class files are included) and a main that is exported. Only then will it become a Java program.
Actually when they ask for a Java program they probably mean a .class file. You can run a .class file with the java program (which I imagine is what their test-runner does). Put it in a shell or batch file when testing, but just submit the .class file.
I am trying to make a very simple Nim game that probably isn't even considered a proper implementation of Nim, but I am only starting Clojure. Not sure why this subtraction on line four doesn't work...
1. (def nimBoard 10)
2. (println "There are" nimBoard "objects left")
3. (def in (read-line))
4. (- nimBoard in)
I can't seem to come up with a solid algorithm for asking the user if they want to remove one or two "objects" from the board until it is empty. I am coming from Java, but loops in this language just confuse me a lot. I know what I am trying to make isn't exactly the Game of Nim, but it is for practice.
I would appreciate any help:)
Since in is a string you read from standard in, you need to convert in to a number first before the subtraction. Try this:
(defn parse-int [s]
(Integer. (re-find #"\d+" s )))
(- nimBoard (parse-int in))
Background
I've written a hack for Emacs that lets me send a Clojure form from an editor buffer to a REPL buffer. It's working fine, except that if the two buffers are in different namespaces the copied text doesn't usually make sense, or, worse, it might make sense but have a different meaning to that in the editor buffer.
I want to transform the text so that it makes sense in the REPL buffer.
A Solution in Common Lisp
In Common Lisp, I could do this using the following function:
;; Common Lisp
(defun translate-text-between-packages (text from-package to-package)
(let* ((*package* from-package)
(form (read-from-string text))
(*package* to-package))
(with-output-to-string (*standard-output*)
(pprint form))))
And a sample use:
;; Common Lisp
(make-package 'editor-package)
(make-package 'repl-package)
(defvar repl-package::a)
(translate-text-between-packages "(+ repl-package::a b)"
(find-package 'editor-package)
(find-package 'repl-package))
;; => "(+ A EDITOR-PACKAGE::B)"
The package name qualifications in the input string and the output string are different—exactly what's needed to solve the problem of copying and pasting text between packages.
(BTW, there's stuff about how to run the translation code in the Common Lisp process and move stuff between the Emacs world and the Common Lisp world, but I'm ok with that and I don't particularly want to get into it here.)
A Non-Solution in Clojure
Here's a direct translation into Clojure:
;; Clojure
(defn translate-text-between-namespaces [text from-ns to-ns]
(let [*ns* from-ns
form (read-string text)
*ns* to-ns]
(with-out-str
(clojure.pprint/pprint form))))
And a sample use:
;; Clojure
(create-ns 'editor-ns)
(create-ns 'repl-ns)
(translate-text-between-namespaces "(+ repl-ns/a b)"
(find-ns 'editor-ns)
(find-ns 'repl-ns))
;; => "(+ repl-ns/a b)"
So the translation function in Clojure has done nothing. That's because symbols and packages/namespaces in Common Lisp and Clojure work differently.
In Common Lisp symbols belong to a package and the determination of a symbol's package happens at read time.
In Clojure, for good reasons, symbols do not belong to a namespace and the determination of a symbol's namespace happens at evaluation time.
Can This Be Done in Clojure?
So, finally, my question: Can I convert Clojure code from one namespace to another?
I don't understand your use case, but here is a way to transform symbols from one namespace to another.
(require 'clojure.walk 'clojure.pprint)
(defn ns-trans-form [ns1 ns2 form]
(clojure.walk/prewalk
(fn [f] (if ((every-pred symbol? #(= (namespace %) ns1)) f)
(symbol ns2 (name f))
f))
form))
(defn ns-trans-text [ns1 ns2 text]
(with-out-str
(->> text
read-string
(ns-trans-form ns1 ns2)
clojure.pprint/pprint)))
(print (ns-trans-text "editor-ns" "repl-ns" "(+ editor-ns/a b)" ))
;=> (+ repl-ns/a b)
So, editor-ns/a was transformed to repl-ns/a.
(Answering my own question...)
Given that it's not easy to refer to a namespace's non-public vars from outside the namespace, there's no simple way to do this.
Perhaps a hack is possible, based on the idea at http://christophermaier.name/blog/2011/04/30/not-so-private-clojure-functions. That would involve walking the form and creating new symbols that resolve to new vars that have the same value as vars referred to in the original form. Perhaps I'll investigate this further sometime, but not right now.
My code is a mess many long lines in this language like the following
(defn check-if-installed[x] (:exit(sh "sh" "-c" (str "command -v " x " >/dev/null 2>&1 || { echo >&2 \"\"; exit 1; }"))))
or
(def Open-Action (action :handler (fn [e] (choose-file :type :open :selection-mode :files-only :dir ListDir :success-fn (fn [fc file](setup-list file)))) :name "Open" :key "menu O" :tip "Open spelling list"))
which is terrible. I would like to format it like so
(if (= a something)
(if (= b otherthing)
(foo)))
How can I beautify the source code in a better way?
The real answer hinges on whether you're willing to insert the newlines yourself. Many systems
can indent the lines for you in an idiomatic way, once you've broken it up into lines.
If you don't want to insert them manually, Racket provides a "pretty-print" that does some of what you want:
#lang racket
(require racket/pretty)
(parameterize ([pretty-print-columns 20])
(pretty-print '(b aosentuh onethunoteh (nte huna) oehnatoe unathoe)))
==>
'(b
aosentuh
onethunoteh
(nte huna)
oehnatoe
unathoe)
... but I'd be the first to admit that inserting newlines in the right places is hard, because
the choice of line breaks has a lot to do with how you want people to read your code.
I use Clojure.pprint often for making generated code more palatable to humans.
it works well for reporting thought it is targeted at producing text. The formatting built into the clojure-mode emacs package produces very nicely formatted Clojure if you put the newlines in your self.
Now you can do it with Srefactor package.
Some demos:
Formatting whole buffer demo in Emacs Lisp (applicable in Common Lisp as well).
Transform between one line <--> Multiline demo
Available Commands:
srefactor-lisp-format-buffer: format whole buffer
srefactor-lisp-format-defun: format current defun cursor is in
srefactor-lisp-format-sexp: format the current sexp cursor is in.
srefactor-lisp-one-line: turn the current sexp of the same level into one line; with prefix argument, recursively turn all inner sexps into one line.
Scheme variants are not as polished as Emacs Lisp and Common Lisp yet but work for simple and small sexp. If there is any problem, please submit an issue report and I will be happy to fix it.