I am new to the SML language and I want to do this:
ask a person "What is your full name?",
read the answer from the user's keyboard,
and write "your full name is " + name (the name the user entered).
This answer has three parts. The first part answers your only question. The second part answers a question you seem to be fishing for without asking it, and the third part addresses how to find answers to questions by your own means.
How to read string from user keyboard in SML language?
You use TextIO.inputLine TextIO.stdIn:
- val wat = TextIO.inputLine TextIO.stdIn;
Hello, World!
val wat = SOME "Hello, World!\n" : string option
Note that this doesn't actually work in my Poly/ML REPL (aka "the top-level" or "the prompt"), though it does work in both my SML/NJ and Moscow ML REPLs. It will probably work everywhere from within an .sml file that you compile or run.
Notice also that you'll get the linebreak as well. Maybe you don't want that.
Although you didn't ask, you can print a string in much the same way:
- TextIO.output (TextIO.stdOut, Option.valOf wat);
Hello, World!
val it = () : unit
The catch here is that when you read a line from the user, you might not get anything, which results in the value NONE rather than an empty string (what you'd expect in Python) or an exception (what you'd expect in Java). And when you do get something, you get SOME "...", so that you can differentiate between getting an empty response and not getting a response at all.
If you don't care about this distinction, you can also make life easier and build some helper functions:
(* helper functions *)
fun getLine () =
    Option.getOpt (TextIO.inputLine TextIO.stdIn, "")
fun putLine s =
    TextIO.output (TextIO.stdOut, s)
(* examples of use *)
val wat = getLine ()
val _ = putLine (wat ^ "!!!")
When you get around to wanting to ask similar questions, you can find some of the answers yourself by typing open TextIO; followed by Enter in your REPL. This tells you which functions are available in the TextIO module, but not necessarily what they do. So you can also look up the documentation by googling around:
https://smlfamily.github.io/Basis/text-io.html
We take the normal regex processor and pass the input text and the regex phrase to capture the desired output text.
output = the_normal_regex(
input = "12$abc##EF345",
phrase = "\d+|[a-zA-Z]+")
= ["12", "abc", "EF", "345"]
Can we invert this process, so that a tool receives both the input text and the output text and produces an adequate regex phrase, especially if the text size is limited to a practical minimum, e.g. a few dozen characters? Is any tool available for this?
phrase = the_inverse_tool(
input = "12$abc##EF345",
output=["12", "abc", "EF", "345"])
= "\d+|[a-zA-Z]+"
What you're asking appears to be whether there is some algorithm or existing library that takes an input string (like "12$abc##EF345") and a set of matches (like ["12", "abc", "EF", "345"]) and produces an "adequate" regex that would produce the matches, given the input string.
However, what does 'adequate' mean in this context? For your example, a simple answer would be: "12|abc|EF|345". However, it appears you expect something more like the generalised "\d+|[a-zA-Z]+"
Note that your generalisation makes a number of assumptions, for example that words in French, Swedish or Chinese shouldn't be matched. And numbers containing , or . are also not included.
You cannot expect a generalised algorithm to make those kinds of distinctions, as those are essentially problems requiring general AI, understanding the problem domain at an abstract level and coming up with a suitable solution.
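What can be automated is the trivial, non-generalised answer. The following is a minimal C++ sketch of that idea, assuming std::regex with its default ECMAScript grammar; the names escape_literal, trivial_regex and find_all are hypothetical helpers introduced here for illustration. It joins the escaped matches with | and lets you check the result against the one input you have:

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Escape regex metacharacters so that each match is treated as a literal.
std::string escape_literal(const std::string& s) {
    static const std::string meta = R"re(\^$.|?*+()[]{})re";
    std::string out;
    for (char c : s) {
        if (meta.find(c) != std::string::npos) out += '\\';
        out += c;
    }
    return out;
}

// Build the trivial pattern: an alternation of the literal matches.
std::string trivial_regex(const std::vector<std::string>& matches) {
    std::string pattern;
    for (const auto& m : matches) {
        if (!pattern.empty()) pattern += '|';
        pattern += escape_literal(m);
    }
    return pattern;
}

// Collect all non-overlapping matches of pattern in input, left to right.
std::vector<std::string> find_all(const std::string& input,
                                  const std::string& pattern) {
    std::regex re(pattern);
    std::vector<std::string> out;
    for (std::sregex_iterator it(input.begin(), input.end(), re), end;
         it != end; ++it) {
        out.push_back(it->str());
    }
    return out;
}
```

For the example, trivial_regex yields 12|abc|EF|345, and find_all confirms that it reproduces the four matches on that input. It says nothing about any other input, which is exactly the gap between this and the generalised pattern.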
Another way of looking at it is: your question is the same as asking if there is some function or library that automates the work of a programmer (specific to the regex language). The answer is: no, not yet anyway, and by the time there is, there won't be people on StackOverflow asking or answering these questions, because we'll all be out of a job.
However, some more optimistic viewpoints can be found here: Is it possible for a computer to "learn" a regular expression by user-provided examples?
My given file /path/file.txt contains for example the following:
Hello World!
Try to read me.
How can I read the entire content into one single string inside my code?
For this specific example, the string should look like this:
"Hello World!\nTry to read me."
If you don't want to use Core, the following works with functions from the built-in Stdlib module (formerly called Pervasives):
let read_whole_file filename =
  let ch = open_in filename in
  let s = really_input_string ch (in_channel_length ch) in
  close_in ch;
  s
For the solution below to work, you need the Core library by Jane Street; write open Core on any line above the place where you use the code below.
In_channel.read_all "./input.txt" returns the content of input.txt in the current folder as a single string.
Also useful:
In_channel.read_lines "./input.txt" returns a list of lines in the file
In_channel.fold_lines lets you "fold over" all lines in the file.
I need to split a string on commas that are not inside quotes, like:
foo, bar, "hello, user", baz
to get:
foo
bar
hello, user
baz
Using std.csv:
import std.csv;
import std.stdio;

void main()
{
    auto str = `foo,bar,"hello, user",baz`;
    foreach (row; csvReader(str))
    {
        writeln(row);
    }
}
Application output:
["foo", "bar", "hello, user", "baz"]
Note that I modified your CSV example data, because std.csv wouldn't parse it correctly due to the space ( ) before the first quote (").
You can use the following snippet to complete this task:
import std.regex;
import std.stdio;

void main()
{
    string fileFullName = `D:\code\test\example.csv`;
    auto fileContent = File(fileFullName, "r");
    // Intended to match only commas that are outside double quotes.
    auto r = regex(`(?!\B"[^"]*),(?![^"]*"\B)`);
    foreach (line; fileContent.byLine)
    {
        auto result = split(line, r);
        writeln(result);
    }
}
If you are parsing a specific file format, splitting by line and using regex often isn't correct, though it will work in many cases. I prefer to read it in character by character and keep a few flags for state (or use someone else's function where appropriate that does it for you for this format). D has std.csv: http://dlang.org/phobos/std_csv.html or my old old csv.d which is minimal but basically works too: https://github.com/adamdruppe/arsd/blob/master/csv.d (haha 5 years ago was my last change to it, but hey, it still works)
Similarly, you can kinda sorta "parse" html with regex... sometimes, but it breaks pretty quickly outside of simple cases and you are better off using an actual html parser (which probably is written to read char by char!)
Back to quoted commas, reading csv, for example, has a few rules with quoted content: first, of course, commas can appear inside quotes without going to the next field. Second, newlines can also appear inside quotes without going to the next row! Third, two quote characters in a row is an escaped quote that is in the content, not a closing quote.
foo,bar
"this item has
two lines, a comma, and a "" mark!",this is just bar
I'm not sure how to read that with regex (eyeballing, I'm pretty sure yours gets the escaped quote wrong at least), but it isn't too hard to do when reading one character at a time (my little csv reader is about fifty lines, doing it by hand). Splitting the lines ahead of time also complicates things compared to just reading the characters because you might then have to recombine lines later when you find one ends with a closing quote! And then your beautiful byLine loop suddenly isn't so beautiful.
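To make the character-by-character idea concrete, here is a minimal sketch of such a reader. It is written in C++ rather than D, it is not the code from csv.d, and the names Row and parse_csv are made up for this illustration. It assumes the three rules above and nothing more: whitespace outside quotes is kept as-is and malformed input is not reported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Parse CSV text one character at a time, tracking a single "in quotes" flag.
std::vector<Row> parse_csv(const std::string& text) {
    std::vector<Row> rows;
    Row row;
    std::string field;
    bool in_quotes = false;   // currently inside a quoted section
    bool row_started = false; // current row has some content pending

    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (in_quotes) {
            if (c == '"') {
                if (i + 1 < text.size() && text[i + 1] == '"') {
                    field += '"'; // "" inside quotes is an escaped quote
                    ++i;
                } else {
                    in_quotes = false; // closing quote
                }
            } else {
                field += c; // commas and newlines are plain content here
            }
        } else if (c == '"') {
            in_quotes = true;
            row_started = true;
        } else if (c == ',') {
            row.push_back(field); // end of field
            field.clear();
            row_started = true;
        } else if (c == '\n') {
            row.push_back(field); // end of field and end of row
            field.clear();
            rows.push_back(row);
            row.clear();
            row_started = false;
        } else if (c != '\r') { // drop carriage returns outside quotes
            field += c;
            row_started = true;
        }
    }
    if (row_started) { // last row had no trailing newline
        row.push_back(field);
        rows.push_back(row);
    }
    return rows;
}
```

Fed the two-row example above, this returns two rows, and the first field of the second row keeps its embedded newline, comma and single quote character. Because the input is never split into lines first, there is nothing to recombine afterwards.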
Besides, when looking back later, I find simple character readers and named functions to be more understandable than a regex anyway.
So, your answer is correct for the limited scope you asked about, but might be missing the big picture of other cases in the file format you are actually trying to read.
edit: one last thing I want to pontificate on, these corner cases in CSV are an example of why people often say "don't reinvent the wheel". It isn't that they are really hard to handle - look at my csv.d code, it is short, pretty simple, and works on everything I've thrown at it - but that's the rub, isn't it? "Everything I've thrown at it". To handle a file format, you need to be aware of what the corner cases are so you can handle them, at least if you want it to be generic and take arbitrary user input. Knowing these edge cases tends to come more from real world experience than just taking a quick glance. Once you know them though, writing the code again isn't terribly hard, you know what to test for! But if you don't know it, you can write beautiful code with hundreds of unittests... but miss the real world case your user just happens to try that one time it matters.
I want a variable to contain "Hey you!".
In JavaScript we can write
var str = 'Hey' + 'you!';
In a web language like PHP we can write
$str = 'Hey'.'you!';
but in C++, neither + nor . will combine them.
Any ideas? It is probably something simple, but I really have no idea how to combine these in C++. Please help.
If I understood correctly, you just need
"Hey" "you"
(no punctuation in between)
Just a note about the space:
NOTE: in all the samples the OP provided, you will get "Heyyou" with no space in between.
I just reproduced the OP's request, so adding a space in this answer would be wrong, since it would not match the requirement.
If that was not the real intention (the OP actually wanted "Hey you"), then a space after Hey or before you is required.
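For reference, a minimal sketch of both options; the function names joined_literals and join_strings are hypothetical and only used for illustration. Writing 'Hey' + 'you!' style code with two string literals fails in C++ because both literals decay to pointers, and two pointers cannot be added. Adjacent literals are merged by the compiler, and std::string provides operator+ for values that are not literals:

```cpp
#include <cassert>
#include <string>

// Adjacent string literals are merged by the compiler into one literal.
// No operator is involved, so this only works for literals, not variables.
const char* joined_literals() {
    return "Hey" "you!";
}

// For values only known at run time, std::string provides operator+.
std::string join_strings(const std::string& a, const std::string& b) {
    return a + b;
}
```

Both give "Heyyou!" for the OP's inputs; passing "Hey " (with a trailing space) as the first argument of join_strings gives "Hey you!".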
I am writing SML programs which run on SML/NJ and MLton (not interactive). When I use print statements in the .sml file, SML/NJ always adds
val it = () : unit
to the output, which clutters up the output. MLton does not do this.
Is there a way to remove this output? I have tried CM_VERBOSE=false, which did not help.
Running SML/NJ v110.73.
Without examples of the code that produces this, it is a bit hard to help; however, it seems that your "issues" are somewhat related to this question.
In summary, remember to bind all result values to something, so that the it variable doesn't get assigned the result:
val _ = print "fooo"