Let's say I have a list of type integer [blah; blah; blah; ...] and I don't know the size of the list, and I want to pattern match and not print the first element of the list. Is there any way to do this without using an if/else or getting a syntax error?
All I'm trying to do is parse a file that looks like a/path/to/blah/blah/../file.c
and only print the path/to/blah/blah
For example, can it be done like this?
let out x = Printf.printf " %s \n" x
let _ = try
while true do
let line = input_line stdin in
...
let rec f (xpath: string list) : ( string list ) =
begin match Str.split (Str.regexp "/") xpath with
| _::rest -> out (String.concat "/" _::xpath);
| _ -> ()
end
But if I do this, I get a syntax error on the String.concat line!
String.concat "/" _::xpath doesn't mean anything, because _ is a pattern, not a value. _ can be used on the left-hand side of a match arm but not on the right-hand side.
What you want is String.concat "/" rest.
Even if _::xpath were correct, String.concat "/" _::xpath would be interpreted as (String.concat "/" _)::xpath, because function application binds tighter than ::, whereas you want it to be interpreted as String.concat "/" (_::xpath).
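A minimal sketch of the corrected match. It assumes String.split_on_char from the standard library (OCaml 4.04 or later) in place of Str.split, so that no extra library is needed; drop_first is a hypothetical name, not something from the question:

```ocaml
(* Drop the first component of a '/'-separated path.
   The wildcard _ only appears in the pattern; the right-hand side
   uses the bound name rest. *)
let drop_first (path : string) : string =
  match String.split_on_char '/' path with
  | _ :: rest -> String.concat "/" rest
  | [] -> ""

let () = print_endline (drop_first "a/path/to/blah/blah/file.c")
(* prints path/to/blah/blah/file.c *)
```

Note that this only removes the leading component; trimming the file name as well would need a second step.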
Because String.tokens is a curried function, I know I can change
String.tokens (fn c => c = #" ") "hello world";
to use a string that contains all the delimiters, but I am confused about how to actually write it.
One of the forms that I tried was:
fun splitter nil = nil
| splitter str =
let
val c = " ,.;?:!\t\n"
val s = String.tokens (fn (c:string,x:char) => c=Char.toString c x) str
in
s
end;
With c being the string of delimiters, but I know something is very wrong. If anyone could point me in the right direction, that would be greatly appreciated.
String.tokens takes two arguments: a predicate that decides whether a character is a delimiter, and a string to split. The first argument is the important part. We don't have to specify a particular character to split on, just a rule that identifies such characters.
If you turn a string containing the delimiter characters (named tokens in the code below) into a list with String.explode, it's easy to use List.exists to check whether a character is in that string.
fun splitOn(str, tokens) =
let
val tokens' = String.explode tokens
fun isToken c = List.exists (fn c' => c = c') tokens'
in
String.tokens isToken str
end;
splitOn("hello world | wooble. foo? bar!", " |.?!");
(* ["hello", "world", "wooble", "foo", "bar"] *)
As I understand it, OCaml doesn't require explicit return statements to yield a value from a function. The last line of the function is what returns something.
In that case, could someone please let me know what the following function foo is returning? It seems that it's returning a stream of data. Is it returning the lexer?
and foo ?(input = false) =
lexer
| 'x' _
-> let y = get_func lexbuf
get_text y
| ',' -> get_func lexbuf
| _ -> get_text lexbuf
I'm trying to edit the following function, bar, to return a data stream, as well, so that I can replace foo with bar in another function. However, it seems that bar has multiple lexers which is preventing this return. How can I rewrite bar to return a data stream in a similar way that foo appears to?
let bar cmd lexbuf =
let buff = Buffer.create 0 in
let quot plus =
lexer
| "<" -> if plus then Buffer.add_string b "<" quot plus lexbuf
and unquot plus =
lexer
| ">" -> if plus then Buffer.add_string b ">" unquot plus lexbuf
in
match unquot true lexbuf with
| e -> force_text cmd e
First, your code is probably using one of the old camlp4 syntax extensions; you should specify which one.
Second, foo returns the same type of value as either get_text or get_func. Without the code for those functions, it is not really possible to say more than that.
Third,
Buffer.add_string b ">" unquot plus lexbuf
is ill-typed. Are you missing parentheses:
Buffer.add_string b ">" (unquot plus lexbuf)
?
I'm trying to replace all strings which contain a substring by itself, in a list.
I've tried it by using the map function:
cleanUpChars = map(\w -> if isInfixOf "**" w then map(\c -> if c == "*" then ""; else c); else w)
To me this reads as: map elements in a list, such that if a character of a word contains * replace it with nothing
To Haskell: "Couldn't match expected type [[Char]] -> [[Char]] with actual type [Char] in the expression: w" (and the last w is underlined)
Any help is appreciated
To answer the revised question (when isInfixOf has been imported correctly):
cleanUpChars = map(\w -> if isInfixOf "**" w then map(\c -> if c == "*" then ""; else c); else w)
The most obvious thing wrong here is that c in the inner parentheses will be a Char (since it's the input to a function which is mapped over a String) - and characters use single quotes, not double quotes. This isn't just a case of a typo or wrong syntax, however - "" works fine as an empty string (and is equivalent to [] since Strings are just lists), but there is no such thing as an "empty character".
If, as it seems, your aim is to remove all *s from each string in the list that contains **, then the right tool is filter rather than map:
Prelude Data.List> cleanUpChars = map(\w -> if isInfixOf "**" w then filter (/= '*') w; else w)
Prelude Data.List> cleanUpChars ["th**is", "is", "a*", "t**es*t"]
["this","is","a*","test"]
(Note that in the example I made up, it removes all asterisks from t**es*t, even the single one. This may not be what you actually wanted, but it's what your logic in the faulty version implied - you'll have to be a little more sophisticated to only remove pairs of consecutive *'s.)
PS I would certainly never write the function like that, with the semicolon - it really doesn't gain you anything. I would also use the infix form of isInfixOf, which makes it much clearer which string you are looking for inside the other:
cleanUpChars :: [String] -> [String]
cleanUpChars = map (\w -> if "**" `isInfixOf` w then filter (/= '*') w else w)
I'm still not particularly happy with that for readability - there's probably some nice way to tidy it up that I'm overlooking for now. But even if not, it helps readability imo to give the function a local name (hopefully you can come up with a more concise name than my version!):
cleanUpChars :: [String] -> [String]
cleanUpChars = map possiblyRemoveAsterisks
where possiblyRemoveAsterisks w = if "**" `isInfixOf` w then filter (/= '*') w else w
I'd like to match a string containing any of the characters "a" through "z", or "[" or "]", but nothing else. The regexp should match
"b"
"]abc["
"ab[c"
but not these
"2"
"(abc)"
I tried this:
let content_check(s:string):bool =
Str.string_match (Str.regexp "^[a-z[\]]*$") s 0;;
content_check "]abc[";;
and got warned that the "escape" before the "]" was illegal, although I'm pretty certain that the equivalent in, say, sed or awk would work fine.
Anyhow, I tried un-escaping the bracket, but
let content_check(s:string):bool =
Str.string_match (Str.regexp "^[a-z[]]*$") s 0;;
doesn't work at all, since it should match any of a-z or "[", then the first "]" closes the "any" selection, after which there must be any number of "]"s. So it should match
[abc]]]]
but not
]]]abc[
In practice, that's not what happens at all; I get the following:
# let content_check(s:string):bool =
Str.string_match (Str.regexp "^[a-zA-Z[]]*$") s 0;;
content_check "]abc[";;
content_check "[abc]]]";;
content_check "]abc[";;
val content_check : string -> bool = <fun>
# - : bool = false
# - : bool = false
# - : bool = false
Can anyone explain/suggest an alternative?
@Tim Pietzker's suggestion sounded really good, but appears not to work:
# #load "str.cma" ;;
let content_check(s:string):bool =
Str.string_match (Str.regexp "^[a-z[\\]]*$") s 0;;
content_check "]abc[";;
# val content_check : string -> bool = <fun>
# - : bool = false
#
nor does it work when I double-escape the "[" in the pattern, just in case. :(
Indeed, here's a MWE:
#load "str.cma" ;;
let content_check(s:string):bool =
Str.string_match (Str.regexp "[\\]]") s 0;;
content_check "]";; (* should be true *)
This is not going to really answer your question, but it will solve your problem. With the re library:
let re_set = Re.(rep (* "rep" is the star *) @@ alt [
rg 'a' 'z' ; (* the range from a to z *)
set "[]" ; (* the set composed of [ and ] *)
])
(* version that matches the whole text *)
let re = Re.(compile @@
seq [ start ; re_set ; stop ])
let content_check s =
Printf.printf "%s : %b\n" s (Re.execp re s)
let () =
List.iter content_check [
"]abc[" ;
"[abc]]]" ;
"]abc[" ;
"]abc[" ;
"abc##"
]
As you noticed, str from the stdlib is awkward, to put it mildly. re is a very good alternative, and it comes with various regexp syntaxes and combinators (which I tend to use, because I think they're easier to use than regexp syntax).
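If pulling in a regex library is not an option, the same check can also be written by hand. This is a sketch using only the standard library; allowed is a hypothetical helper, and content_check here returns a bool as in the question rather than printing:

```ocaml
(* True for the characters the question allows: 'a'..'z', '[' and ']'. *)
let allowed (c : char) : bool =
  (c >= 'a' && c <= 'z') || c = '[' || c = ']'

(* True when every character of s is allowed. The empty string passes,
   just as it would with a starred regexp. *)
let content_check (s : string) : bool =
  let n = String.length s in
  let rec go i = i >= n || (allowed s.[i] && go (i + 1)) in
  go 0

let () =
  List.iter
    (fun s -> Printf.printf "%s : %b\n" s (content_check s))
    [ "b"; "]abc["; "ab[c"; "2"; "(abc)" ]
```

The first three strings are accepted and the last two rejected, which matches the examples listed in the question.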
I'm an idiot. (But perhaps the designers of Str weren't so clever either.)
From the "Str" documentation: "To include a ] character in a set, make it the first character in the set."
With this, it's not so clear how to search for "anything except a ]", since you'd have to place the "^" in front of it. Sigh.
:(
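Following that rule from the documentation, here is a sketch of the original check with ] placed first in the set. It assumes the str library is linked (for example via #load "str.cma" in the toplevel). As far as I can tell the same rule applies after a leading ^, so [^]] should match anything except a ], but treat that part as something to verify against your version of Str; not_bracket is a hypothetical name:

```ocaml
(* ']' comes first in the set, so it is taken literally;
   '[' needs no escape inside a set. *)
let content_check (s : string) : bool =
  Str.string_match (Str.regexp "^[]a-z[]*$") s 0

(* Negated set: '^' first, then ']' as the first member of the set. *)
let not_bracket (s : string) : bool =
  Str.string_match (Str.regexp "[^]]") s 0

let () =
  List.iter
    (fun s -> Printf.printf "%s : %b\n" s (content_check s))
    [ "]abc["; "ab[c"; "2"; "(abc)" ]
```

No backslashes are needed at all, which also avoids the illegal-escape warning from the OCaml string lexer.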
I have some basic ocamllex code, which was written by my professor, and seems to be fine:
{ type token = EOF | Word of string }
rule token = parse
| eof { EOF }
| [’a’-’z’ ’A’-’Z’]+ as word { Word(word) }
| _ { token lexbuf }
{
(*module StringMap = BatMap.Make(String) in *)
let lexbuf = Lexing.from_channel stdin in
let wordlist =
let rec next l = match token lexbuf with
EOF -> l
| Word(s) -> next (s :: l)
in next []
in
List.iter print_endline wordlist
}
However, running ocamllex wordcount.mll produces
File "wordcount.mll", line 4, character 3: syntax error.
This indicates that there is an error at the first [ in the regex in the fourth line here. What is going on?
You seem to have curly quotes (also called "smart quotes" -- ugh) in your text. You need regular old single quotes.
curly quote: ’
old fashioned single quote: '