How to perform sub-string substitutions in LOGO?

I'd like to get a string from the user, parse it, then run the parsed commands.
The string input will be something like "F20N20E10L10", guaranteed no spaces.
This input I want to convert to LOGO commands with substitutions like these:
"F" → fd
"N" → seth 0 fd
"E" → seth 90 fd
"L" → lt 90 fd
So the string input above would be converted to these LOGO commands:
fd 20 seth 0 fd 20 seth 90 fd 10 lt 90 fd 10
All LOGO dialects can read input and interpret a string of commands.
But I can't find any with search and replace string operations. Is this possible in any dialect of LOGO? Willing to consider any.
Thank you for reading.

It's been a while since I've written any Logo, so I'm not sure if this is the easiest way, but here's one way you can do it. The general idea is you can work with strings as lists of characters, using FIRST, LAST, BUTFIRST, and BUTLAST to get at different parts of the string. (I tested this on the first two online Logo interpreters I could find -- http://www.logointerpreter.com/turtle-editor.php and http://www.calormen.com/jslogo/ -- and it ran fine on both, but you might need some small changes for other Logo dialects.)
TO RUN_COMMANDS :commands
IF (EMPTY? :commands) [STOP]
MAKE "first_command (FIRST :commands)
MAKE "rest_of_commands (BUTFIRST :commands)
IF (NOT EMPTY? :rest_of_commands) [MAKE "split (GET_NUMBER :rest_of_commands ")]
MAKE "numeric_argument (LAST :split)
MAKE "rest_of_commands (FIRST :split)
RUN_SINGLE_COMMAND :first_command :numeric_argument
RUN_COMMANDS :rest_of_commands
END
TO MERGE_STRING :word :characters
IF (NOT EMPTY? :characters) [OP (MERGE_STRING (WORD :word (FIRST :characters)) (BUTFIRST :characters))]
OP :WORD
END
TO GET_NUMBER :word :number
IF (AND (NOT (EMPTY? :word)) (IS_DIGIT (FIRST :word))) [OP (SE (GET_NUMBER (BUTFIRST :word) (LPUT (FIRST :word) :number))]
OP (SE (MERGE_STRING " :word) (MERGE_STRING " :number))
END
TO IS_DIGIT :character
OP (OR
:character = "0
:character = "1
:character = "2
:character = "3
:character = "4
:character = "5
:character = "6
:character = "7
:character = "8
:character = "9)
END
TO RUN_SINGLE_COMMAND :command :parameter
(PRINT_COMMAND :command :parameter)
IF (:command = "F) [FD :parameter]
IF (:command = "B) [BK :parameter]
IF (:command = "L) [LT 90 FD :parameter]
IF (:command = "R) [RT 90 FD :parameter]
IF (:command = "N) [SETH 0 FD :parameter]
IF (:command = "S) [SETH 180 FD :parameter]
IF (:command = "E) [SETH 90 FD :parameter]
IF (:command = "W) [SETH 270 FD :parameter]
END
TO PRINT_COMMAND :command :parameter
IF (:command = "F) [PRINT (SE "FD :parameter)]
IF (:command = "B) [PRINT (SE "BK :parameter)]
IF (:command = "L) [PRINT (SE "LT 90 "FD :parameter)]
IF (:command = "R) [PRINT (SE "RT 90 "FD :parameter)]
IF (:command = "N) [PRINT (SE "SETH 0 "FD :parameter)]
IF (:command = "S) [PRINT (SE "SETH 180 "FD :parameter)]
IF (:command = "E) [PRINT (SE "SETH 90 "FD :parameter)]
IF (:command = "W) [PRINT (SE "SETH 270 "FD :parameter)]
END
Then, try running:
RUN_COMMANDS "F20N20E10L10
This prints and executes the following:
FD 20
SETH 0 FD 20
SETH 90 FD 10
LT 90 FD 10
Some Explanation
RUN_COMMANDS is the main function. It:
Extracts the first letter from the string (I'm assuming each command is abbreviated as a single letter)
Calls GET_NUMBER which extracts a number (which could be multiple characters) from the start of the string.
Passes the single-letter abbreviated command and number to RUN_SINGLE_COMMAND
Recurses to repeat the process
IS_DIGIT is used within GET_NUMBER to check if a character is numeric (although I would bet some Logo dialects have a built-in function for this.)
MERGE_STRING is used because I had some multi-character Words ("word" is Logo-speak for a string) which I had turned into Lists of single-character Words, and I wanted to merge the list back into a single Word. This might not actually be necessary, though.
RUN_SINGLE_COMMAND executes each individual command that was parsed from the input string. I just used a big IF statement, rather than using a function which interprets the string as code as you suggested. (Some Logo dialects might have such a function, but I'm not sure of a standard one.) RUN_SINGLE_COMMAND also calls PRINT_COMMAND, which prints out the unabbreviated command as it's run.
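For comparison, here is a minimal sketch of the same letter-plus-number parse in Python rather than Logo. It only builds the expanded command string and does not drive a turtle. The names EXPANSIONS and expand are made up for this illustration, and the table simply mirrors the IF chain in RUN_SINGLE_COMMAND.

```python
import re

# Letter -> Logo command prefix, mirroring the IF chain in RUN_SINGLE_COMMAND.
# (EXPANSIONS and expand are hypothetical names used only in this sketch.)
EXPANSIONS = {
    "F": "fd", "B": "bk",
    "L": "lt 90 fd", "R": "rt 90 fd",
    "N": "seth 0 fd", "S": "seth 180 fd",
    "E": "seth 90 fd", "W": "seth 270 fd",
}

def expand(commands):
    """Expand a string such as 'F20N20' into Logo source text."""
    parts = []
    # Each token is one capital letter followed by one or more digits.
    for letter, number in re.findall(r"([A-Z])(\d+)", commands):
        parts.append(f"{EXPANSIONS[letter]} {number}")
    return " ".join(parts)

print(expand("F20N20E10L10"))
# prints: fd 20 seth 0 fd 20 seth 90 fd 10 lt 90 fd 10
```

A letter that is missing from the table raises KeyError in this sketch, whereas the Logo version silently ignores unknown commands.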
Potential Stack Overflows
I used lots of recursion because it's more idiomatic of Logo, and because Logo dialects often don't have a lot of looping constructs (other than REPEAT). But I did it in a careless way, since I was just writing this quickly to give you the general idea. In particular, I didn't worry about stack overflows (no pun intended), which I think could occur if you provided a long input string. To deal with this, you should make sure any recursive path that can be called arbitrarily many times is a tail call, which Logo will optimize away.
At a glance, it looks like RUN_COMMANDS is fine but MERGE_STRING and GET_NUMBER are reversed -- instead of IF <condition> [<recursive_call>] followed by OUTPUT <return_value>, it would be better to do IF (NOT <condition>) [OUTPUT <return_value>] followed by <recursive_call>. Testing for stack overflows and applying this fix I have left as an exercise for the reader. :)

Related

Select value in column that matches value in list (UPDATED FOR CLARITY)

If I have a column of street addresses and want to select only the address's directional, what syntax would I use to accomplish that in Excel Power Query?
For instance, how do I get "NE" from "357 Pyrite Dr NE" even if the address is incorrectly formatted as "357 NE Pyrite Dr" or "357 Pyrite NE Dr"? Likewise, how would I get "NW" from "506 Mark NW St"?
As far as I can figure out, I would hit add column > custom column and enter a syntax similar to the following...
= if List.ContainsAny([Address], {"NE", "NW", "SE", "SW"}) = TRUE then Text.Select([Address], {"NE", "NW", "SE", "SW"} else null
...except I know that's not the correct syntax since it always produces an error. The same thing happens when I replace "Text.Select" with "List.Select" in the above formula.
For greater clarification, I'm posting the query as it stands now, whittled down to one column from a table with 100 columns and 4000 rows:
let
Source = q_NMAACC,
#"Removed Other Columns" = Table.SelectColumns(Source,{"Address - Street 1", "Address - Street 2"}),
#"Merged Columns" = Table.CombineColumns(#"Removed Other Columns",{"Address - Street 1", "Address - Street 2"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Street Address"),
#"Trimmed Text" = Table.TransformColumns(#"Merged Columns",{{"Street Address", Text.Trim, type text}}),
#"Filtered Rows" = Table.SelectRows(#"Trimmed Text", each [Street Address] <> null and [Street Address] <> "")
in
#"Filtered Rows"
Here are the first 25 rows to give you some data to work off.
Street Address
PO Box 3416 Nr57 #165a
1016 Copper NE Ave Apt C
217 Garcia St NE
232 17th St SE
560 60th St NW
2935 Madeira Dr NE
9677 Eagle Ranch Rd NW Apt 415
5320 Roanoke Ave NW
17 Hwy 304
HCR 79 Box 46
6524 Camino Rojo
3518 Vail Ave SE
6412 Torreon Dr NE
6136 Flor de Rio Ct NW
1712 36th Street SE
734 Columbia Street
716 Morning Meadows Dr NE
6601 Tennyson St NE Apt 10207
Alamo - Rio Salado PO Box 804
206 Aragon Rd
6901 Verano Ct NW
6709 Siesta Pl NE
10 Meadow Hills Loop
98 Avenida Jardin
6903 Prairie Rd NE Apt 216
Try
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
List={"NE","NW","SW","SE"},
LocateTable = Table.FromList(List, null, {"Locate"}),
Find = Table.AddColumn(Source, "Found", (x) => Text.Combine(Table.SelectRows(LocateTable, each Text.Contains(x[Address],[Locate], Comparer.OrdinalIgnoreCase))[Locate],", "))
in Find
You could also keep the search criteria in a separate table (here, a query named LocateTable with a Locate column):
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Find = Table.AddColumn(Source, "Found", (x) => Text.Combine(Table.SelectRows(LocateTable, each Text.Contains(x[Address],[Locate], Comparer.OrdinalIgnoreCase))[Locate],", "))
in Find
The Comparer.OrdinalIgnoreCase argument makes the comparison case-insensitive; remove it if you want a case-sensitive match.
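One thing to watch: Text.Contains is a substring test, so with case ignored a street such as "Stone Rd" would also report "NE". A whole-word comparison avoids that. The sketch below shows the idea in Python rather than M, purely to illustrate the logic. It is a different technique from the query above, and DIRECTIONALS and find_directionals are hypothetical names.

```python
# Whole-word matching: split the address on whitespace and keep only the
# tokens that are exactly one of the directionals.
# (DIRECTIONALS and find_directionals are hypothetical names for this sketch.)
DIRECTIONALS = {"NE", "NW", "SE", "SW"}

def find_directionals(address):
    """Return the directional tokens of an address, in order of appearance."""
    return [token for token in address.upper().split() if token in DIRECTIONALS]

print(find_directionals("357 Pyrite Dr NE"))   # ['NE']
print(find_directionals("506 Mark NW St"))     # ['NW']
print(find_directionals("17 Stone Rd"))        # []
```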

Do records guarantee order when seq'd?

I'm writing some code where I need to map format over each value of a record. To save myself some duplicate writing, it would be super handy if I could rely on records having a set order. This is basically what it looks like right now:
(defrecord Pet [health max-health satiety max-satiety])
(let [{:keys [health max-health satiety max-satiety]} pet
[h mh s ms] (mapv #(format "%.3f" (double %))
[health max-health satiety max-satiety])]
...)
Ideally, I'd like to write this using vals:
(let [[h mh s ms] (mapv #(format "%.3f" (double %)) (vals pet))]
...)
But I can't find any definitive sources on if records have a guaranteed ordering when seq'd. From my testing, they seem to be ordered. I tried creating a massive record (in case records rely on a sorted collection when small):
(defrecord Order-Test [a b c d e f g h i j k l m n o p q r s t u v w x y z
aa bb cc dd ee ff gg hh ii jj kk ll mm nn oo pp qq rr ss tt uu vv ww xx yy zz])
(vals (apply ->Order-Test (range 52)))
=> (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51)
And they do seem to maintain order.
Can anyone verify this?
For this exact scenario, I suppose I could have reduce-kv'd over the record and reassociated the vals, then destructured. That would have gotten pretty bulky though. I'm also curious now since I wasn't able to find anything.
As with many things in Clojure, there is no guarantee because there is no spec. If it's not in the docstring for records, you assume it at your own risk, even if it happens to be true in the current version of Clojure.
But I'd also say: that's not really what records mean, philosophically. Record fields are supposed to have individual domain semantics, and it looks like in your record they indeed do. It is a big surprise when an operation like "take the N distinctly meaningful fields of this record, and treat them all uniformly" is the right thing to do, and it deserves to be spelled out when you do it.
You can at least do what you want with a bit less redundancy:
(let [[h mh s ms] (for [k [:health :max-health, :satiety :max-satiety]]
(format "%.3f" (double (get pet k))))]
...)
Personally I would say that you are modeling your domain wrong: you clearly have a concept of a "resource" (health and satiety) which has both a "current" and a "max" value. Those deserve to be grouped together by resource, e.g.
{:health {:current 50 :max 80}
:satiety {:current 3 :max 10}}
and having done that, I'd say that a pet's "set of resources" is really just a single map field, rather than N fields for the N resources it contains. Then this whole question of ordering of record fields doesn't come up at all.

Remove regex pattern from string and store in csv

I am trying to clean up a CSV by using regex. I have accomplished the first part which extracts the regex pattern from the address table and writes it to the street_numb field. The part I need help with is removing that same pattern from the street field so I only end up with the following (i.e., Steinway St, 31 St, 82nd Rd, and 19th St) stored in the street field. Hence these values would be removed (-78, -45, -35, -54) from the street field.
b  street_numb  street           address             zipcode
1  246          FIFTH AVE        246 FIFTH AVE       11215
2  30 -78       -78 STEINWAY ST  30 -78 STEINWAY ST  11016
3  25 -45       -45 31ST ST      25 -45 31ST ST      11102
4  123 -35      -35 82ND RD      123 -35 82ND RD     11415
5  22 -54       -54 19TH ST      22 -54 19TH ST      11105
Sample Data (above)
import csv
import re
path = '/Users/darchcruise/Desktop/bldg_zip_codes.csv'
with open(path, 'rU') as infile, open(path+'out.csv', 'w') as outfile:
    fieldnames = ['b', 'street_numb', 'street', 'address', 'zipcode']
    readablefile = csv.DictReader(infile)
    writablefile = csv.DictWriter(outfile, fieldnames=fieldnames)
    for row in readablefile:
        add = re.match(r'\d+\s*-\s*\d+', row['address'])
        if add:
            row['street_numb'] = add.group()
            # row['street'] = remove re.string (add.group()) from street field
            writablefile.writerow(row)
        else:
            writablefile.writerow(row)
What code in line 12 (# remove re.string from row['street']) could be used to resolve my issue (removing -78, -45, -35, -54 from the street field)?
You can use capturing groups with findall, like this:
re.findall(r"(\d+\s*(-\s*\d+\s+)?)((\w|\s)+)", row['address'])[0][0]  # gives the street number
re.findall(r"(\d+\s*(-\s*\d+\s+)?)((\w|\s)+)", row['address'])[0][2]  # gives the rest of the address (the street)
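Another option, if the street field already holds values like "-78 STEINWAY ST" as in the sample data, is to strip the leading "-NN" fragment with re.sub. This is a minimal sketch under that assumption. clean_row is a hypothetical helper name, and the row dictionary stands in for one row from csv.DictReader.

```python
import re

def clean_row(row):
    """Fill street_numb from address and strip a leading '-NN' from street."""
    match = re.match(r'\d+\s*-\s*\d+', row['address'])
    if match:
        row['street_numb'] = match.group()
        # Drop a leading hyphen-number fragment such as "-78 " from the street.
        row['street'] = re.sub(r'^\s*-\s*\d+\s*', '', row['street'])
    return row

# One row shaped like the sample data (values taken from row 2 of the table).
row = {'b': '2', 'street_numb': '', 'street': '-78 STEINWAY ST',
       'address': '30 -78 STEINWAY ST', 'zipcode': '11016'}
cleaned = clean_row(dict(row))
print(cleaned['street_numb'])  # 30 -78
print(cleaned['street'])       # STEINWAY ST
```

Rows whose address has no hyphenated number (such as "246 FIFTH AVE") are returned unchanged.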

Repeating Capture Groups Regex

I have a large chunk of class data that I need to run a regular expression on and get data back from. The problem is that I need a repeating capturing group in order to accomplish that.
Womn St 157A QUEERHISTORY MAKING
CCode Typ Sec Unt Instructor Time Place Max Enr Req Rstr Status
32680 LEC A 4 SHAH, P. TuTh 11:00-12:20p IAB 131 35 37 60 FULL
Womn St 171 SEX/RACE & CONQUEST
CCode Typ Sec Unt Instructor Time Place Max Enr Req Rstr Status
32710 LEC A 4 O'TOOLE, R. TuTh 2:00- 3:20p DBH 1300 52 13/45 24 OPEN
~ Same as 25610 (GlblClt 103B, Lec A); 26350 (History 169, Lec A); and
~ 60320 (Anthro 139, Lec B).
32711 DIS 1 0 MONSON, A. W 9:00- 9:50 HH 105 25 5/23 8 OPEN
O'TOOLE, R.
~ Same as 25612 (GlblClt 103B, Dis 1); 26351 (History 169, Dis 1); and
~ 60321 (Anthro 139, Dis 1).
The result I need would return two matches
Match
Group1:Womn St 157A
Group2:QUEERHISTORY MAKING
Group3:32680
Group4:LEC
Group5:A
Group6:SHAH, P.
Group7:TuTh 11:00-12:20p
Group8:IAB 131
Match
Group1:Womn St 171
Group2:SEX/RACE & CONQUEST
Group3:32710
Group4:LEC
Group5:A
Group6:O'TOOLE, R.
Group7:TuTh 2:00- 3:20p
Group8:DBH 1300
Group9:25610
Group10:26350
Group11:60320
Group12:32711
Group13:DIS
Group14:1
Group15:MONSON, A.
Group16: W 9:00- 9:50
Group17:HH 105
Group18:25612
Group19:26351
Group20:60321
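Most regex engines, Python's re module included, keep only the last value captured by a group that repeats, so a single pattern cannot return a variable number of groups per match. A common workaround is two passes: an outer pattern isolates each course block, then findall runs inside the block. The sketch below assumes every course starts with a title line followed by a line beginning "CCode ". The pattern names and parse_courses are hypothetical, and the title line is left unsplit because the sample does not show a reliable separator between course number and title.

```python
import re

SAMPLE = """\
Womn St 157A QUEERHISTORY MAKING
CCode Typ Sec Unt Instructor Time Place Max Enr Req Rstr Status
32680 LEC A 4 SHAH, P. TuTh 11:00-12:20p IAB 131 35 37 60 FULL
Womn St 171 SEX/RACE & CONQUEST
CCode Typ Sec Unt Instructor Time Place Max Enr Req Rstr Status
32710 LEC A 4 O'TOOLE, R. TuTh 2:00- 3:20p DBH 1300 52 13/45 24 OPEN
~ Same as 25610 (GlblClt 103B, Lec A); 26350 (History 169, Lec A); and
~ 60320 (Anthro 139, Lec B).
32711 DIS 1 0 MONSON, A. W 9:00- 9:50 HH 105 25 5/23 8 OPEN
O'TOOLE, R.
~ Same as 25612 (GlblClt 103B, Dis 1); 26351 (History 169, Dis 1); and
~ 60321 (Anthro 139, Dis 1).
"""

# Outer pass: a title line, the "CCode ..." header line, then every following
# line that is not itself the title line of the next course.
COURSE = re.compile(
    r"^(?P<title>.+)\nCCode .*\n"
    r"(?P<body>(?:(?!.+\nCCode ).+(?:\n|\Z))*)",
    re.MULTILINE,
)
# Inner passes: section lines start with a 5-digit code; cross-listed codes
# are 5-digit numbers followed by " (".
SECTION = re.compile(r"^(\d{5}) (\w+) (\w+)", re.MULTILINE)
SAME_AS = re.compile(r"\b(\d{5}) \(")

def parse_courses(text):
    """Return one dict per course with its title, sections and cross-listings."""
    courses = []
    for match in COURSE.finditer(text):
        body = match.group("body")
        courses.append({
            "title": match.group("title"),
            "sections": SECTION.findall(body),
            "same_as": SAME_AS.findall(body),
        })
    return courses

for course in parse_courses(SAMPLE):
    print(course)
```

The remaining columns (instructor, time, place) could be pulled out the same way by extending SECTION, but their exact layout would need to be checked against the real data.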

match elements from two files, how to write the intended format to a new file

I am trying to update my text file by matching its first column against the first column of another, updated file; where the names match, the old file's values should be updated.
Here is my oldfile:
Name Chr Pos ind1 in2 in3 ind4
foot 1 5 aa bb cc
ford 3 9 bb cc 00
fake 3 13 dd ee ff
fool 1 5 ee ff gg
fork 1 3 ff gg ee
Here is the newfile:
Name Chr Pos
foot 1 5
fool 2 5
fork 2 6
ford 3 9
fake 3 13
The updated file will be like:
Name Chr Pos ind1 in2 in3 ind4
foot 1 5 aa bb cc
fool 2 5 ee ff gg
fork 2 6 ff gg ee
ford 3 9 bb cc 00
fake 3 13 dd ee ff
Here is my code:
#!/usr/bin/env python
import sys
inputfile_1 = sys.argv[1]
inputfile_2 = sys.argv[2]
outputfile = sys.argv[3]
inputfile1 = open(inputfile_1, 'r')
inputfile2 = open(inputfile_2, 'r')
outputfile = open(outputfile, 'w')
ind = inputfile1.readlines()
cm = inputfile2.readlines()[1:]
outputfile.write(ind[0]) #add header
for i in ind:
    i = i.split()
    for j in cm:
        j = j.split()
        if j[0] == i[0]:
            outputfile.writelines(j[0:3] + i[3:])
            outputfile.write('\n')
inputfile1.close()
inputfile2.close()
outputfile.close()
When I ran it, ./compare_substitute_2files.py oldfile newfile output
the values were updated for the file, but they did not follow the order of the new file, and no space was there as indicated in the output below.
Name Chr Pos ind1 in2 in3 ind4
foot15aabbcc
ford39bbcc00
fake313ddeeff
fool25eeffgg
fork26ffggee
My question is: how do I write the rows in the exact order of the new file, with spaces between the elements of each list? Thanks!
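file.write accepts a single string as its parameter. file.writelines accepts a sequence of strings, but it writes them back to back without adding any separator, which is why your fields run together.
Join the fields with spaces before writing, in place of the writelines call:
outputfile.write(' '.join(j[0:3] + i[3:]))
To follow the order of the new file, make the loop over the new file's lines the outer loop.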
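Putting both points together, here is a minimal sketch that indexes the old file by name and then walks the new file, so the output follows the new file's order. It works on lists of lines instead of open files to stay self-contained. merge_rows is a hypothetical helper name, and the sample lines are copied from the question.

```python
def merge_rows(old_lines, new_lines):
    """Return the old header, then one merged row per new-file row, in new-file order."""
    header, *old_rows = [line.split() for line in old_lines if line.strip()]
    # Map each name to the columns that follow Name, Chr and Pos in the old file.
    extras = {row[0]: row[3:] for row in old_rows}
    merged = [' '.join(header)]
    for line in new_lines[1:]:  # skip the new file's header
        fields = line.split()
        if fields and fields[0] in extras:
            merged.append(' '.join(fields[:3] + extras[fields[0]]))
    return merged

old = ["Name Chr Pos ind1 in2 in3 ind4", "foot 1 5 aa bb cc", "ford 3 9 bb cc 00",
       "fake 3 13 dd ee ff", "fool 1 5 ee ff gg", "fork 1 3 ff gg ee"]
new = ["Name Chr Pos", "foot 1 5", "fool 2 5", "fork 2 6", "ford 3 9", "fake 3 13"]

for line in merge_rows(old, new):
    print(line)
```

The dictionary lookup also replaces the nested loop, so each file is read only once.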