Large defrecord causes "Method code too large" - clojure

Is there a way to build a defrecord with lots of fields? It appears there is a limit of around 122 fields, as this gives a "Method code too large!" error:
(defrecord WideCsvFile
[a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 a17 a18 a19
a20 a21 a22 a23 a24 a25 a26 a27 a28 a29 a30 a31 a32 a33 a34 a35 a36 a37 a38 a39
a40 a41 a42 a43 a44 a45 a46 a47 a48 a49 a50 a51 a52 a53 a54 a55 a56 a57 a58 a59
a60 a61 a62 a63 a64 a65 a66 a67 a68 a69 a70 a71 a72 a73 a74 a75 a76 a77 a78 a79
a80 a81 a82 a83 a84 a85 a86 a87 a88 a89 a90 a91 a92 a93 a94 a95 a96 a97 a98 a99
a100 a101 a102 a103 a104 a105 a106 a107 a108 a109 a110 a111 a112 a113 a114 a115 a116 a117 a118 a119
a120 a121 a122])
while removing any of the fields allows record creation.

The JVM limits the bytecode of a single method to 64 KB (see the answers to this question for specifics). defrecord generates methods whose size grows with the number of fields the record contains.
To deal with this issue, I see two options:
macroexpand-1 your call to defrecord, copy the results, and find a way to re-write the generated methods to be smaller.
Take a different approach to storing your data, such as using Clojure's vector class.
EDIT:
Now that I know what you want to do, I am more convinced that you should use vectors. Since you want to use indexes like a101, I've written you a macro to generate them:
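(defmacro auto-index-vector [v prefix]
  ;; eval is needed because the macro only sees the symbol v, not the vector
  (let [indices (range (count (eval v)))
        definitions (map (fn [ind]
                           `(def ~(symbol (str prefix ind)) ~ind)) indices)]
    ;; unquote-splice the generated defs into a single do form
    `(do ~@definitions)))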
Let's try it out!
stack-prj.bigrecord> (def v1 (into [] (range 122)))
#'stack-prj.bigrecord/v1
stack-prj.bigrecord> (auto-index-vector v1 "a")
#'stack-prj.bigrecord/a121
stack-prj.bigrecord> (v1 a101)
101
stack-prj.bigrecord> (assoc v1 a101 "hi!")
[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94
95 96 97 98 99 100 "hi!" 102 103 104 105 106 107 108 109 110 111 112
113 114 115 116 117 118 119 120 121]
To use this: you'll read your CSV data into a vector, call auto-index-vector on it with the prefix of your choosing, and use the resulting indices to perform vector operations on your data.


Clojure: How do you apply a function to values at a specific nesting level?

I'm a beginner at Clojure, so I'll do my best to phrase this as well as I can.
I have a function that returns a list of nested lists after parsing a dataset of daily temperatures. Each nested list corresponds to the daily temperatures of a specific month (e.g. Feb 2014, Feb 2015, etc.) and is padded out to 31 items using -999 as filler to retain the dataset's structure.
raw dataset: https://www.metoffice.gov.uk/hadobs/hadcet/cetdl1772on.dat
(partition 31 (monthly-helper 2 (parse-into-list "CETdataDailyLong")))
=>((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56 -999 -999)
(0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70 -999 -999 -999)
(-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81 -999 -999 -999)
(97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66 -999 -999 -999)
(-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40 -999 -999))
I'm trying to remove the -999 values from all nested lists in the list. I need to do this after partitioning the data, to avoid having to partition the data arbitrarily by the number of days in each month.
The closest I've got is below, but it has no effect because it is only applied to the top-level list instead of the values in each nested list. How would I need to modify this to get the result I'm looking for? Or, to ask my original question:
How do you apply a function to values at a specific nesting level?
(remove #(= -999 %)(partition 31 (monthly-helper 2 (parse-into-list "CETdataDailyLong"))))
Below is the minimal code with a chunk of the results from my partitioning function. I think it's very close, but if you can show me what I'm missing I would really appreciate it. Thanks.
(remove #(= -999 %)'(((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56 -999 -999)
(0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70 -999 -999 -999)
(-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81 -999 -999 -999)
(97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66 -999 -999 -999)
(-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40 -999 -999))))
I've tried the below and loads of variations on it with map etc., but haven't got anywhere. Seeing a correct example would really help me understand where I'm going wrong.
(apply #(remove -999 %) (partition 31 (monthly-helper 2 (parse-into-list "CETdataDailyLong"))))
Exception: Wrong number of args (21) passed
So iiuc, the:
Overall list contains year lists, and the
Year lists contain month lists, and the
Month lists contain the temperatures for the days, and
The month lists are each padded w/ -999's to make them uniform in size: 31 entries long
What I see that you've tried:
You've used the remove function w/ a predicate to remove if the value equals -999. The value in this case is '((-15 7 15 -25 -5 -45 12 ...)) which does not equal -999, so you end up w/ what you started with.
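apply takes a function and a sequence of args, and calls the function with the elements of that sequence as separate arguments. Your sequence held 21 lists, so your one-argument function was called with 21 args.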
With all this probably understood, I think the easiest solution is a nested for loop. A for loop returns a list of your values, optionally modified by a function. Each value is a list, so you need to go deeper w/ another for loop.
; Remove -999's, three levels deep, with for.
(defn remove-999s [s-of-s]
  ; All data
  (for [year s-of-s]
    ; For all years
    (for [month year]
      ; For all months
      ; (filter #(not (= % -999)) month) would also work
      (remove #(= % -999) month))))
(remove-999s '(((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56 -999 -999) (0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70 -999 -999 -999) (-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81 -999 -999 -999) (97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66 -999 -999 -999)(-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40 -999 -999))))
Here's the result, without the -999's.
; (((-15 7 15 -25 -5 -45 12 47 56 28 20 40 57 38 2 5 25 -3 0 7 7 -3 -10 -10 30 85 46 77 56)
; (0 17 -28 -23 -30 5 -18 -3 -33 -23 -18 -3 -10 50 82 72 62 42 15 57 75 40 92 52 42 62 72 70)
; (-2 -12 4 28 12 0 44 27 -12 16 74 61 76 87 77 78 51 51 59 56 64 52 78 63 39 28 33 81)
; (97 58 75 103 33 46 88 101 56 47 66 36 52 47 58 42 42 37 63 77 76 43 55 85 58 57 55 66)
; (-59 19 28 55 47 30 52 49 42 50 45 25 34 70 40 54 24 13 25 54 85 29 27 38 25 73 44 50 40)))
Because Clojure doesn't allow nested #() function literals, and nesting fn's gets gross, if you want to use map like Biped suggests, you'll probably want to combine it with letfn or defn. Here's how I did it:
; Remove -999's, three levels deep, with maps.
(defn remove-999s [s-of-s]
  (letfn [(is-999 [v] (= v -999))
          (map-month [s] (remove is-999 s))
          (map-year [s] (map map-month s))]
    (map map-year s-of-s))) ; Gives the same results.
After writing this, I realized that for is like a weird map, so either can be used.
Another alternative is loop and recur, or otherwise classic recursion.
I would start with a utility function that updates nested sequences at any level.
It could look like this:
(defn update-nested [level f]
  (cond (neg? level) identity
        (zero? level) f
        :else (partial map (update-nested (dec level) f))))
user> ((update-nested 0 (partial remove #{1})) [1 1 0 1])
;;=> (0)
user> ((update-nested 1 (partial remove #{1})) [[1 1 0 1] [0 0 1 0]])
;;=> ((0) (0 0 0))
user> ((update-nested 2 (partial remove #{1})) [[[1 1] [0 1]] [[0 0] [1 0]]])
;;=> ((() (0)) ((0 0) (0)))
user> ((update-nested 3 (partial remove #{1})) [[[[1 1] [0 1]]] [[[0 0] [1 0]]]])
;;=> (((() (0))) (((0 0) (0))))
user> ((update-nested 3 reverse) [[[[1 1] [0 1]]] [[[0 0] [1 0]]]])
;;=> ((((1 1) (1 0))) (((0 0) (0 1))))
Your first exhibit is a list-of-lists. And your desired output is also a list-of-lists -- but different lists. Therefore, you want map instead of apply.
(require '[com.rpl.specter :as s])
(def data '(your list here))
(s/setval (s/walker #(= % -999)) s/NONE data)

How do I extract data as a data frame from a text file in R? The data has names in it and the middle names are messing with my method

I have a text file where strings are separated by whitespace. I can easily extract these into R as a data frame by first using the scan command and then noting that each record consists of 15 strings.
So data[1:15] is one row, data[16:30] is the next row, and so on. In each of these records, the name is composed of two strings, say FOO and BAR. But some records have names such as FOO BOR BAR or even FOO BOR BOO BAR. This obviously breaks my 15-string assumption. How can I easily extract the data into a data frame?
So my data is in my working directory called results.txt.
I use this to scan my data:
mech <- scan("results.txt", "")
Then I can make the data frames like this:
d1 <- t(data.frame(mech[1:15]))
d2 <- t(data.frame(mech[16:30]))
d3 <- t(data.frame(mech[31:45]))
My plan was to iterate this in a for loop and rbind the data into one consolidated data frame.
d1 results in something like
1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
d2 results in
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
Here, FOO and BAR are first and last names, respectively. Most records are like this. But d3:
3 FOO3 BOR BAR3 2K12/ME/03 72 83 61 75 44 88 75 165 91 30
Because of the extra middle name, I lose the final string of the text, the part right after 30. This then spills over to the next record. So row 46:60, instead of starting with 4, begins with the omitted data from the previous record.
How can I extract the data by treating the names as a single string?
EDIT: Stupid of me for not providing the data frame itself. Here is a sample.
1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93
s1 <- "1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93"
s2 <- readLines(textConnection(s1)) #read from your file here
s2 <- strsplit(s2, "\\s+") #splits by white space
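s3 <- lapply(s2, function(s) {
  n <- length(s)
  # merge all name tokens into field 2; every record should end up with 14 fields
  s[2] <- paste(s[2:(2 + (n - 14))], collapse = " ")
  # drop the surplus name tokens (assumes at least two name parts, i.e. n >= 15)
  s[-(3:(2 + (n - 14)))]
})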
DF <- do.call(rbind, s3)
DF <- as.data.frame(DF, stringsAsFactors = FALSE)
DF[] <- lapply(DF, type.convert, as.is = TRUE)
str(DF)
#'data.frame': 4 obs. of 14 variables:
# $ V1 : int 1 2 3 4
# $ V2 : chr "FOO BAR" "FOO2 BAR2" "FOO3 BOR BAR3" "FOO4 BOR BAR4"
# $ V3 : chr "2K12/ME/01" "2K12/ME/02" "2K12/ME/03" "2K12/ME/04"
# $ V4 : int 96 72 63 89
# $ V5 : int 86 83 84 88
# $ V6 : int 86 61 62 74
# $ V7 : int 92 75 62 79
# $ V8 : int 73 44 50 77
# $ V9 : int 86 88 79 83
# $ V10: int 72 75 74 68
# $ V11: int 168 165 157 182
# $ V12: int 82 91 85 82
# $ V13: int 30 30 30 30
# $ V14: num 84.9 72.6 69.1 81.9
One approach is to use a regex to enclose the names in quotes and then make a simple read.table call. This approach has the advantage of allowing for cases with any number of names.
s1 <- "1 FOO BAR 2K12/ME/01 96 86 86 92 73 86 72 168 82 30 84.93
2 FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88 75 165 91 30 72.60
3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79 74 157 85 30 69.13
4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83 68 182 82 30 81.93"
s2 <- gsub("^ *|(?<= ) | *$", "", s1, perl = T)
read.table(text=gsub("(?<=[[:digit:]] )(.*)(?= 2K12)", "'\\1'", s2, perl = T), header = F)
Which gives:
  V1            V2         V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13   V14
1  1       FOO BAR 2K12/ME/01 96 86 86 92 73 86  72 168  82  30 84.93
2  2     FOO2 BAR2 2K12/ME/02 72 83 61 75 44 88  75 165  91  30 72.60
3  3 FOO3 BOR BAR3 2K12/ME/03 63 84 62 62 50 79  74 157  85  30 69.13
4  4 FOO4 BOR BAR4 2K12/ME/04 89 88 74 79 77 83  68 182  82  30 81.93

Build a huffman tree from a code table

I am confused about how to build a Huffman tree from a code table. The code table consists of two columns: code (the binary representation, as a string) and symb (the symbol, given as a hexadecimal value).
the symbCode struct:
struct symbCode
{
char symb;
string code; //string of '0' and '1'
};
the Function:
void huffmanTree::buildTreeFromCodeTable(symbCode *table, int n)
{
//construct the Huffman tree from the code table
//n = number of symbols in the code table
}
I have googled a few websites providing tutorials on Huffman trees, but I still can't figure it out.
Should I allocate a new tree node, or do something else?
The reference table:
Num_Alphabet 96
ASCII Huffman_Code
a 011000
20 0100
21 11101110110
22 1110111010
23 11101111000
24 11101111001
25 11110100001
26 0000010000
27 0000010001
28 100100100
29 100100101
2a 000000100
2b 000000101
2c 0110010
2d 101101000
2e 1110010
2f 0000000110
30 11110101
31 11110110
32 11110111
33 11111010
34 11111011
35 11111100
36 11111101
37 11111110
38 11111111
39 0000011
3a 101101001
3b 01100110
3c 000001001
3d 00000011
3e 000000000
3f 01100111
40 0000000111
41 00011
42 1110011
43 001010
44 011010
45 01010
46 001000
47 1110110
48 1001000
49 10101
4a 0010011
4b 1000001
4c 011111
4d 100001
4e 010110
4f 00010
50 1111100
51 00000101
52 010111
53 10100
54 110000
55 110011
56 10010011
57 100000010
58 000000001
59 10110101
5a 000000010
5b 11101111100
5c 11101111101
5d 11101111110
5e 11101111111
5f 11101110010
60 11101110011
61 10001
62 100101
63 110001
64 110010
65 10011
66 101111
67 101100
68 011110
69 11010
6a 001011
6b 011011
6c 111100
6d 00001
6e 111010
6f 01110
70 101110
71 111101001
72 111000
73 11011
74 00110
75 00111
76 0010010
77 10000000
78 100000011
79 1011011
7a 1111010001
7b 1110111101
7c 11110100000
7d 1110111000
7e 11101110111
Use a binary tree in which every node has two paths: one to its left child and one to its right child. Going to the left child means 0; going to the right child means 1. Once the whole tree is built, the path from the root to a leaf node (a sequence of 1s and 0s) is the code of the value stored in that leaf.
So every code in your table is actually a path from the root to a leaf. Take the codes one by one and follow each path from the root, reading the code from the left. If the path does not exist, create all of its nodes (including the leaf); if it partially exists (the leftmost digits of the code match an existing path), create only the remaining nodes down to the leaf.
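The path-insertion idea above can be sketched as follows. This is only a minimal sketch, not the asker's huffmanTree class: Node, decode, and the free-function form of buildTreeFromCodeTable are hypothetical, std::vector replaces the raw pointer and count from the question, and it assumes C++14 and a valid prefix-free table.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Mirrors the struct from the question.
struct symbCode {
    char symb;
    std::string code;  // string of '0' and '1'
};

// Hypothetical node type; the question does not show the real one.
struct Node {
    bool isLeaf = false;
    char symb = 0;
    std::unique_ptr<Node> left;   // edge labelled '0'
    std::unique_ptr<Node> right;  // edge labelled '1'
};

// Walk each code from the root, creating only the nodes that are missing.
std::unique_ptr<Node> buildTreeFromCodeTable(const std::vector<symbCode>& table) {
    auto root = std::make_unique<Node>();
    for (const symbCode& entry : table) {
        Node* cur = root.get();
        for (char bit : entry.code) {
            assert(bit == '0' || bit == '1');
            std::unique_ptr<Node>& child = (bit == '0') ? cur->left : cur->right;
            if (!child) {
                child = std::make_unique<Node>();  // path did not exist yet
            }
            cur = child.get();
        }
        // The node reached after the last bit is the leaf for this symbol.
        cur->isLeaf = true;
        cur->symb = entry.symb;
    }
    return root;
}

// Decode a bit string by following edges and restarting at the root after each leaf.
// Stops early if the bits leave the tree.
std::string decode(const Node& root, const std::string& bits) {
    std::string out;
    const Node* cur = &root;
    for (char bit : bits) {
        cur = (bit == '0') ? cur->left.get() : cur->right.get();
        if (cur == nullptr) {
            return out;
        }
        if (cur->isLeaf) {
            out += cur->symb;
            cur = &root;
        }
    }
    return out;
}
```

With the table entries for 41, 45 and 53 (A = 00011, E = 01010, S = 10100), decoding the concatenated codes should give back "AES". Using std::unique_ptr means the tree frees itself; a class with raw pointers would need a destructor instead.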

c++ rand() % 100 [closed]

I'm trying to fill an array with 200 random numbers that can vary from 0 to 100. It gets populated, except that the last couple of numbers are very odd.
Here is my code.
for (int i = 0; i < NUM_LIST_ELEMENTS; i++)
{
int j = rand() % 100;
list[i] = j;
}
My output comes out as follows:
Original Arrays:
41 67 34 0 69 24 78 58 62 64 5 45 81 27 61 91 95 42 27 36 91 4 2 53 92 82 21 16 18
95 47 26 71 38 69 12 67 99 35 94 3 11 22 33 73 64 41 11 53 68 47 44 62 57 37 59 23 41
29 78 16 35 90 42 88 6 40 42 64 48 46 5 90 29 70 50 6 1 93 48 29 23 84 54 56 40 66
76 31 8 44 39 26 23 37 38 18 82 29 41 33 15 39 58 4 30 77 6 73 86 21 45 24 72 70 29
77 73 97 12 86 90 61 36 55 67 55 74 31 52 50 50 41 24 66 30 7 91 7 37 57 87 53 83 45
9 9 58 21 88 22 46 6 30 13 68 0 91 62 55 10 59 24 37 48 83 95 41 2 50 91 36 74 20
96 21 48 99 68 84 81 34 53 99 18 38 0 88 27 67 28 93 48 83 7 21 10 17 13 14-858993460
9 16 35 51 0 49 19 56 98 3 24 8 44 9 89 2 95 85 93 43 23 87 14 3 48 0 58 18 80
96 98 81 89 98 9 57 72 22 38 92 38 79 90 57 58 91 15 88 56 11 2 34 72 55 28 46 62 86
75 33 69 42 44 16 81 98 22 51 21 99 57 76 92 89 75 12 0 10 3 69 61 88 1 89 55 23 2
85 82 85 88 26 17 57 32 32 69 54 21 89 76 29 68 92 25 55 34 49 41 12 45 60 18 53 39 23
79 96 87 29 49 37 66 49 93 95 97 16 86 5 88 82 55 34 14 1 16 71 86 63 13 55 85 53 12
8 32 45 13 56 21 58 46 82 81 44 96 22 29 61 35 50 73 66 44 59 92 39 53 24 54 10 45 49
86 13 74 22 68 18 87 5 58 91 2 25 77 14 14 24 34 74 72 59 33 70 87 97 18 77-33686019
Notice that the last number in each array is really weird. Is there anything I can do to avoid this? By the way, these are two different arrays.
Thanks everyone that posted! I got it working!
You are reading one beyond the end of the array.
e.g. if you populate an array with 200 elements, you should write to and read from 0 to 199 not 0 to 200 or 1 to 200.
By the way - rand() % 100 will not make numbers from 0 to 100. It will make numbers from 0 to 99 only.
Also, as Randy Howard says (thanks), you can get a more even random generation by following the advice at http://www.azillionmonkeys.com/qed/random.html .
This is probably because there is something wrong with your code that prints the result. You might be looping from index 0 to 200, which has 201 items.
I counted your outputs and found there are 201 items, if the last 77-33686019 is actually two separate numbers.
If it's not that, you might have some printf/cout somewhere further down your code that actually prints some other value. To confirm this you can probably try printf ("\n"); right after your loop that outputs the array. If your negative number ends up on a different line, you'll know it's some other printf further down your code.
You might want to use int j = rand() % 101; instead so that you get 0 to 100. Your original code gives you the random range from 0 to 99.
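The two points raised in the answers (loop bounds and the range of the remainder) can be sketched together. This is a minimal sketch under assumptions: makeRandomList and allInRange are hypothetical helpers, std::vector stands in for the asker's array, and only NUM_LIST_ELEMENTS comes from the question.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

const int NUM_LIST_ELEMENTS = 200;  // array size taken from the question

// Hypothetical helper: fills a list with values in [0, 100].
std::vector<int> makeRandomList(unsigned seed) {
    std::srand(seed);
    std::vector<int> list(NUM_LIST_ELEMENTS);
    // Valid indices are 0 .. NUM_LIST_ELEMENTS - 1, so the test is '<', not '<='.
    for (int i = 0; i < NUM_LIST_ELEMENTS; i++) {
        list[i] = std::rand() % 101;  // remainder is 0..100 inclusive
    }
    assert(static_cast<int>(list.size()) == NUM_LIST_ELEMENTS);
    return list;
}

// Hypothetical helper: reads the list with the same bounds it was written with.
bool allInRange(const std::vector<int>& list) {
    for (std::size_t i = 0; i < list.size(); i++) {
        if (list[i] < 0 || list[i] > 100) {
            return false;
        }
    }
    return true;
}
```

Any loop that prints the list should use the same `i < size` condition; reading one element past the end is what produces values such as -858993460.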

Why doesn't Clojure let me define zero-padded numbers?

I'm trying to bind the following grid to a symbol
(def grid [08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48])
This yields Exception in thread "main" java.lang.NumberFormatException: Invalid number: 08 (11.clj:1). Why can't I do this in Clojure? Are there any workarounds?
Clarification
All I want to do is paste this grid somewhere and have it act as if there were no leading zeros, even if it takes a little coercion. I don't want to have to drop all of the zeros in my editor, I'd just like to paste it in there and have each number behave as if there were no leading zeros.
One other strange detail
The REPL seems to allow zero-padded numbers, but executing a .clj file with java -cp clojure.jar -i some_file.clj will throw the error.
Leading zeros imply an octal number, so 08 is not valid. Many programming languages use this convention, starting with C.
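For comparison, here is how the same convention looks in C++, one of the C-family languages mentioned above. This is only an illustrative sketch in a different language from the question; parseDecimal is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>

// A literal with a leading zero is octal: 010 is 1*8 + 0 = 8.
static_assert(010 == 8, "leading zero means octal");
static_assert(017 == 15, "octal 17 is decimal 15");
// A literal such as 08 does not compile, because 8 is not an octal digit.

// Parsing text with an explicit base of 10 ignores the leading zero.
int parseDecimal(const std::string& text) {
    return std::stoi(text, nullptr, 10);
}
```

The same split applies in Clojure: the reader rejects 08 as a literal, while parsing the string "08" as a decimal number works.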
SPOILER ALERT:
Since you're solving a Project Euler problem, you might not want to read this, even though it's only about the "how to read in the data?" part of it...
The reason this happens is as explained in the other answers. The correct solution would be to embed the input in your code as a string -- with linebreaks! -- and use something like the following:
(->> the-string
(.split #"\n")
(map #(.split #"\s+" %))
(map (partial drop-while empty?))
;; this just doesn't care about the leading 0
(mapcat (partial map #(Integer/parseInt %)))
vec)
This should produce a vector of your numbers. For a two-dimensional vector, you could replace the mapcat with a regular map and put in an extra (map vec) before the final vec.
If you prefer to put the input in a separate file and have Clojure read it from there, replace the-string and (.split #"\n") with a call to line-seq on a reader on your file.
Numbers with a leading 0 are read as if they were in base 8, so any digit outside 0-7 will not work. To fix this, you can prefix the number with 10r, as in 10r08, to specify the base explicitly.
user> 10r08
8
user> 08
; Evaluation aborted.
This messes up your nice formatting though :( Sorry about that. You could write a little macro to do this for a whole block if you want to preserve your nicely formatted code.
Regular expressions will remove leading zeros
(re-seq #"[1-9]+[0-9]*|0{2}" the-string)
The regex phrase breaks down as follows:
[1-9]+ ;; one or more repetitions of 1-9 (i.e. must start with 1-9)
[0-9]* ;; any digits, including zeros, are ok after the leading non-zero digits
|0{2} ;; or if the above can't be found, just look for two zeros
A more general expression is
#"[1-9]+[0-9]*|(?<=\s)0+(?=\s)"
which does the same thing but in the 'or' portion it uses positive lookahead and lookbehind assertions to look for a sequence of one or more zeros preceded and followed by whitespace.
With the leading zeros stripped, (map read-string (re-seq ....)) works just fine.
Since it only took about 3 minutes to remove all leading zeroes, I'll just paste the above vector with the zeroes removed in case anyone else wants to copy/paste the euler problem.
(def grid [ 8 2 22 97 38 15 0 40 0 75 4 5 7 78 52 12 50 77 91 8
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 4 56 62 0
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 3 49 13 36 65
52 70 95 23 4 60 11 42 69 24 68 56 1 32 56 71 37 2 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 3 45 2 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 2 62 12 20 95 63 94 39 63 8 40 91 66 49 94 21
24 55 58 5 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 9 75 0 76 44 20 45 35 14 0 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 3 80 4 62 16 14 9 53 56 92
16 39 5 42 96 35 31 47 55 58 88 24 0 17 54 24 36 29 85 57
86 56 0 48 35 71 89 7 5 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 5 94 47 69 28 73 92 13 86 52 17 77 4 89 55 40
4 52 8 83 97 35 99 16 7 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 3 46 33 67 46 55 12 32 63 93 53 69
4 42 16 73 38 25 39 11 24 94 72 18 8 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 4 36 16
20 73 35 29 78 31 90 1 74 31 49 71 48 86 81 16 23 57 5 54
1 70 54 71 83 51 54 69 16 92 33 48 61 43 52 1 89 19 67 48])