Ruby has a universal idea of "truthiness" and "falsiness".
Ruby does have two specific classes for Boolean objects, TrueClass and FalseClass, with singleton instances denoted by the special variables true and false, respectively.
However, truthiness and falsiness are not limited to instances of those two classes, the concept is universal and applies to every single object in Ruby. Every object is either truthy or falsy. The rules are very simple. In particular, only two objects are falsy:
nil, the singleton instance of NilClass and
false, the singleton instance of FalseClass
Every single other object is truthy. This includes even objects that are considered falsy in other programming languages, such as
the Integer 0,
the Float 0.0,
the empty String '',
the empty Array [],
the empty Hash {},
These rules are built into the language and are not user-definable. There is no to_bool implicit conversion or anything similar.
Here is a quote from the ISO Ruby Language Specification:
6.6 Boolean values
An object is classified into either a trueish object or a falseish object.
Only false and nil are falseish objects. false is the only instance of the class FalseClass (see 15.2.6), to which a false-expression evaluates (see 11.5.4.8.3). nil is the only instance of the class NilClass (see 15.2.4), to which a nil-expression evaluates (see 11.5.4.8.2).
Objects other than false and nil are classified into trueish objects. true is the only instance of the class TrueClass (see 15.2.5), to which a true-expression evaluates (see 11.5.4.8.3).
The executable Ruby/Spec seems to agree:
it "considers a non-nil and non-boolean object in expression result as true" do
if mock('x')
123
else
456
end.should == 123
end
According to those two sources, I would assume that Regexps are also truthy, but according to my tests, they aren't:
if // then 'Regexps are truthy' else 'Regexps are falsy' end
#=> 'Regexps are falsy'
I tested this on YARV 2.7.0-preview1, TruffleRuby 19.2.0.1, and JRuby 9.2.8.0. All three implementations agree with each other and disagree with the ISO Ruby Language Specification and my interpretation of the Ruby/Spec.
More precisely, Regexp objects that are the result of evaluating Regexp literals are falsy, whereas Regexp objects that are the result of some other expression are truthy:
r = //
if r then 'Regexps are truthy' else 'Regexps are falsy' end
#=> 'Regexps are truthy'
Is this a bug, or desired behavior?
This isn’t a bug. What is happening is Ruby is rewriting the code so that
if /foo/
whatever
end
effectively becomes
if /foo/ =~ $_
whatever
end
If you are running this code in a normal script (and not using the -e option) then you should see a warning:
warning: regex literal in condition
This is probably somewhat confusing most of the time, which is why the warning is given, but can be useful for one lines using the -e option. For example you can print all lines matching a given regexp from a file with
$ ruby -ne 'print if /foo/' filename
(The default argument for print is $_ as well.)
This is the result of (as far as I can tell) an undocumented feature of the ruby language, which is best explained by this spec:
it "matches against $_ (last input) in a conditional if no explicit matchee provided" do
-> {
eval <<-EOR
$_ = nil
(true if /foo/).should_not == true
$_ = "foo"
(true if /foo/).should == true
EOR
}.should complain(/regex literal in condition/)
end
You can generally think of $_ as the "last string read by gets"
To make matters even more confusing, $_ (along with $-) is not a global variable; it has local scope.
When a ruby script starts, $_ == nil.
So, the code:
// ? 'Regexps are truthy' : 'Regexps are falsey'
Is being interpreted like:
(// =~ nil) ? 'Regexps are truthy' : 'Regexps are falsey'
...Which returns falsey.
On the other hand, for a non-literal regexp (e.g. r = // or Regexp.new('')), this special interpretation does not apply.
// is truthy; just like all other object in ruby besides nil and false.
Unless running a ruby script directly on the command line (i.e. with the -e flag), the ruby parser will display a warning against such usage:
warning: regex literal in condition
You could make use of this behaviour in a script, with something like:
puts "Do you want to play again?"
gets
# (user enters e.g. 'Yes' or 'No')
/y/i ? play_again : back_to_menu
...But it would be more normal to assign a local variable to the result of gets and perform the regex check against this value explicitly.
I'm not aware of any use case for performing this check with an empty regex, especially when defined as a literal value. The result you've highlighted would indeed catch most ruby developers off-guard.
Related
I have the following code:
$DatabaseSettings = #();
$NewDatabaseSetting = "" | select DatabaseName, DataFile, LogFile, LiveBackupPath;
$NewDatabaseSetting.DatabaseName = "LiveEmployees_PD";
$NewDatabaseSetting.DataFile = "LiveEmployees_PD_Data";
$NewDatabaseSetting.LogFile = "LiveEmployees_PD_Log";
$NewDatabaseSetting.LiveBackupPath = '\\LiveServer\LiveEmployeesBackups';
$DatabaseSettings += $NewDatabaseSetting;
When I try to use one of the properties in a string execute command:
& "$SQlBackupExePath\SQLBackupC.exe" -I $InstanceName -SQL `
"RESTORE DATABASE $DatabaseSettings[0].DatabaseName FROM DISK = '$tempPath\$LatestFullBackupFile' WITH NORECOVERY, REPLACE, MOVE '$DataFileName' TO '$DataFilegroupFolder\$DataFileName.mdf', MOVE '$LogFileName' TO '$LogFilegroupFolder\$LogFileName.ldf'"
It tries to just use the value of $DatabaseSettings rather than the value of $DatabaseSettings[0].DatabaseName, which is not valid.
My workaround is to have it copied into a new variable.
How can I access the object's property directly in a double-quoted string?
When you enclose a variable name in a double-quoted string it will be replaced by that variable's value:
$foo = 2
"$foo"
becomes
"2"
If you don't want that you have to use single quotes:
$foo = 2
'$foo'
However, if you want to access properties, or use indexes on variables in a double-quoted string, you have to enclose that subexpression in $():
$foo = 1,2,3
"$foo[1]" # yields "1 2 3[1]"
"$($foo[1])" # yields "2"
$bar = "abc"
"$bar.Length" # yields "abc.Length"
"$($bar.Length)" # yields "3"
PowerShell only expands variables in those cases, nothing more. To force evaluation of more complex expressions, including indexes, properties or even complete calculations, you have to enclose those in the subexpression operator $( ) which causes the expression inside to be evaluated and embedded in the string.
#Joey has the correct answer, but just to add a bit more as to why you need to force the evaluation with $():
Your example code contains an ambiguity that points to why the makers of PowerShell may have chosen to limit expansion to mere variable references and not support access to properties as well (as an aside: string expansion is done by calling the ToString() method on the object, which can explain some "odd" results).
Your example contained at the very end of the command line:
...\$LogFileName.ldf
If properties of objects were expanded by default, the above would resolve to
...\
since the object referenced by $LogFileName would not have a property called ldf, $null (or an empty string) would be substituted for the variable.
Documentation note: Get-Help about_Quoting_Rules covers string interpolation, but, as of PSv5, not in-depth.
To complement Joey's helpful answer with a pragmatic summary of PowerShell's string expansion (string interpolation in double-quoted strings ("...", a.k.a. expandable strings), including in double-quoted here-strings):
Only references such as $foo, $global:foo (or $script:foo, ...) and $env:PATH (environment variables) can directly be embedded in a "..." string - that is, only the variable reference itself, as a whole is expanded, irrespective of what follows.
E.g., "$HOME.foo" expands to something like C:\Users\jdoe.foo, because the .foo part was interpreted literally - not as a property access.
To disambiguate a variable name from subsequent characters in the string, enclose it in { and }; e.g., ${foo}.
This is especially important if the variable name is followed by a :, as PowerShell would otherwise consider everything between the $ and the : a scope specifier, typically causing the interpolation to fail; e.g., "$HOME: where the heart is." breaks, but "${HOME}: where the heart is." works as intended.
(Alternatively, `-escape the :: "$HOME`: where the heart is.", but that only works if the character following the variable name wouldn't then accidentally form an escape sequence with a preceding `, such as `b - see the conceptual about_Special_Characters help topic).
To treat a $ or a " as a literal, prefix it with escape char. ` (a backtick); e.g.:
"`$HOME's value: $HOME"
For anything else, including using array subscripts and accessing an object variable's properties, you must enclose the expression in $(...), the subexpression operator (e.g., "PS version: $($PSVersionTable.PSVersion)" or "1st el.: $($someArray[0])")
Using $(...) even allows you to embed the output from entire commands in double-quoted strings (e.g., "Today is $((Get-Date).ToString('d')).").
Interpolation results don't necessarily look the same as the default output format (what you'd see if you printed the variable / subexpression directly to the console, for instance, which involves the default formatter; see Get-Help about_format.ps1xml):
Collections, including arrays, are converted to strings by placing a single space between the string representations of the elements (by default; a different separator can be specified by setting preference variable $OFS, though that is rarely seen in practice) E.g., "array: $(#(1, 2, 3))" yields array: 1 2 3
Instances of any other type (including elements of collections that aren't themselves collections) are stringified by either calling the IFormattable.ToString() method with the invariant culture, if the instance's type supports the IFormattable interface[1], or by calling .psobject.ToString(), which in most cases simply invokes the underlying .NET type's .ToString() method[2], which may or may not give a meaningful representation: unless a (non-primitive) type has specifically overridden the .ToString() method, all you'll get is the full type name (e.g., "hashtable: $(#{ key = 'value' })" yields hashtable: System.Collections.Hashtable).
To get the same output as in the console, use a subexpression in which you pipe to Out-String and apply .Trim() to remove any leading and trailing empty lines, if desired; e.g.,
"hashtable:`n$((#{ key = 'value' } | Out-String).Trim())" yields:
hashtable:
Name Value
---- -----
key value
[1] This perhaps surprising behavior means that, for types that support culture-sensitive representations, $obj.ToString() yields a current-culture-appropriate representation, whereas "$obj" (string interpolation) always results in a culture-invariant representation - see this answer.
[2] Notable overrides:
• The previously discussed stringification of collections (space-separated list of elements rather than something like System.Object[]).
• The hashtable-like representation of [pscustomobject] instances (explained here) rather than the empty string.
#Joey has a good answer. There is another way with a more .NET look with a String.Format equivalent, I prefer it when accessing properties on objects:
Things about a car:
$properties = #{ 'color'='red'; 'type'='sedan'; 'package'='fully loaded'; }
Create an object:
$car = New-Object -typename psobject -Property $properties
Interpolate a string:
"The {0} car is a nice {1} that is {2}" -f $car.color, $car.type, $car.package
Outputs:
# The red car is a nice sedan that is fully loaded
If you want to use properties within quotes follow as below. You have to use $ outside of the bracket to print property.
$($variable.property)
Example:
$uninstall= Get-WmiObject -ClassName Win32_Product |
Where-Object {$_.Name -like "Google Chrome"
Output:
IdentifyingNumber : {57CF5E58-9311-303D-9241-8CB73E340963}
Name : Google Chrome
Vendor : Google LLC
Version : 95.0.4638.54
Caption : Google Chrome
If you want only name property then do as below:
"$($uninstall.name) Found and triggered uninstall"
Output:
Google Chrome Found and triggered uninstall
Playing with the freedom that Ruby offers in its base features, I found rather easy to alias most operator used in the language, but the Regexp#~ unary prefix operator is trickier.
A first naïve approach would be to alias it in the Regexp class itself
class Regexp
alias hit ~# # remember that # stands for "prefix version"
# Note that a simple `alias_method :hit, :~#` will give the same result
end
As it was pointed in some answer bellow, this approach is somehow functionnal with the dot notation calling form, like /needle/.hit. However trying to execute hit /needle/ will raise undefined method hit' for main:Object (NoMethodError)`
So an other naïve approach would be to define this very method in Object, something like
class Object
def ~#(pattern)
pattern =~ $_
end
end
However, this won’t work, as the $_ global variable is in fact locally binded and won’t keep the value it has in the calling context, that is $_ is always nil in the previous snippet.
So the question is, is it possible to have the expression hit /needle/ to restitute the same result as ~ /needle/?
Works just fine for me:
class Regexp
alias_method :hit, :~ # both of them work
# alias hit ~ # both of them work
end
$_ = "input data"
/at/.hit #=> 7
~/at/ #=> 7
/at/.hit #=> 7
~/at/ #=> 7
So, as the completed question now inhibits it, the main hindrance is the narrow scope of $_. That’s where trace_var can come to the rescue:
trace_var :$_, proc { |nub|
$last_explicitly_read_line = nub
#puts "$_ is now '#{nub}'"
}
def reach(pattern)
$last_explicitly_read_line =~ pattern
end
def first
$_ = "It’s needless to despair."
end
def second
first
p reach /needle/
$_ = 'What a needlework!'
p reach /needle/
end
p reach /needle/
second
p reach /needle/
$_ = nil
p reach /needle/
So the basic idea is to stash the value of $_ each time it is changed in an other variable that will be accessible in other subsequent calling context. Here it was implemented with a an other global variable (not locally binded, unlike $_ of course), but the same result could be obtained with other implementations, like defining a class variable on Object.
One could also try to use something like binding_of_caller or binding_ninja, but my own approach of doing so failed, and also of course it comes with additional dependencies which have their own limitations.
The following code returns "Bareword found where operator expected at (eval 1) line 1, near "*,out" (Missing operator before out?)"
$val = 0;
$name = "abc";
$myStr = '$val = ($name =~ in.*,out [)';
eval($myStr);
As per my understanding, I can resolve this issue by wrapping "in.*,out [" block with '//'s.
But that "in.*,out [" can be varied. (eg: user inputs). and users may miss giving '//'s. therefore, is there any other way to handle this issue.? (eg : return 0 if eval() is trying to return that 'Bareword found where ...')
The magic of (string) eval -- and the danger -- is that it turns a heap of dummy characters into code, compiles and runs it. So can one then use '$x = ,hi'? Well, no, of course, when that string is considered code then that's a loose comma operator there, a syntax eror; and a "bareword" hi.† The string must yield valid code
In a string eval, the value of the expression (which is itself determined within scalar context) is first parsed, and if there were no errors, executed as a block within the lexical context of the current Perl program.
So that string in the question as it stands would be just (badly) invalid code, which won't compile, period. If the in.*,out [ part of the string is in quotes of some sort, then that is legitimate and the =~ operator will take it as a pattern and you have a regex. But then of course why not use regex's normal pattern delimiters, like // (or m{}, etc).
And whichever way that string gets acquired it'll be in a variable, no? So you can have /$input/ in the eval and populate that $input beforehand.
But, above all, are you certain that there is no other way? There always is. The string-eval is complex and tricky and hard to use right and nigh impossible to justify -- and dangerous. It runs arbitrary code! That can break things badly even without any bad intent.
I'd strongly suggest to consider other solutions. Also, it is unclear why there'd be need for eval in the first place -- as you only need the regex pattern as user input (not code) you can have that very regex in normal code with a pattern in a variable, which is populated earlier when the user input is supplied. (Note that taking a pattern from the user may lead to trouble as well.)
† A problem if you're into warnings, and we all are.
The following isn't valid Perl code:
$val = ($name =~ in.*,out [)
You want the following:
$val = $name =~ /in.*,out \[/
(The parens weren't harmful, but didn't help either.)
If the pattern is user-supplied, you can use the following:
$val = $name =~ /$pattern/
(No eval EXPR needed!)
Note from the correction that the pattern in the question isn't correct. You can catch such errors using eval BLOCK
eval { $val = $name =~ /$pattern/ };
die("Bad pattern \"$pattern\" provided: $#") if $#;
A note about user-provided patterns: The above won't let the user execute arbitrary code, but it won't protect you from patterns that would take longer than the lifespan of the universe to complete.
From the docs (perldoc -f m)
If ? is the delimiter, then a match-only-once rule applies, described in m?*PATTERN*? below.
The "match-only-once rule" doesn't' seem to be defined anywhere, but it seems to be a real optimization,
use Benchmark qw(:all) ;
use constant HAYSTACK => "this is a test string";
my $needle = "test";
cmpthese(-1, {
'questionmark' => sub { if ( HAYSTACK =~ m?$needle?n ) { 1 } },
'backslash' => sub { if ( HAYSTACK =~ m/$needle/n ) { 1 } },
});
With the results,
Rate backslash questionmark
backslash 9267717/s -- -57%
questionmark 21588328/s 133% --
This makes me wonder why is the behavior in m// in scalar context such that it even needs this behavior? Let's take for example the output
perl -E'say "FOOOOOO" =~ m/O/' # returns 1
If it's not even counting the O what does it do after the first match such that it's twice as slow?
The "match-only-once rule" doesn't' seem to be defined anywhere, […]
"A match-only-once rule" is a description of the rule — it's a rule saying that m?PATTERN? matches only once — not an official name that you can use to search. The text that you quote is pulled from the perlop manpage, so when it says "described in m?*PATTERN*? below", it's referring to this part of that manpage:
m?PATTERN?msixpodualngc
This is just like the m/PATTERN/ search, except that it matches only once between calls to the reset() operator. This is a useful optimization when you want to see only the first occurrence of something in each file of a set of files, for instance. Only m?? patterns local to the current package are reset.
while (<>) {
if (m?^$?) {
# blank line between header and body
}
} continue {
reset if eof; # clear m?? status for next file
}
Another example switched the first "latin1" encoding it finds to "utf8" in a pod file:
s//utf8/ if m? ^ =encoding \h+ \K latin1 ?x;
This makes me wonder why is the behavior in m// in scalar context such that it even needs this behavior?
Even in scalar context, m// or m?? may be called many times between resets, and if so then the two behave differently. (You can see this in the first snippet above. It's also the reason that your benchmarks give different performance results: the version with m?$needle?n only does a regex match the first time the function is called — it just returns 'no match' on all subsequent calls — whereas the version with m/$needle/n does a regex match every time.)
The confusion here is that "once" in "match-only-once" is in reference to the calling context of the m?? not in reference to matching once the needle inside the haystack, and ignoring subsequent matches of the needle inside the haystack. So if m?? is called many times without reset, only the first one that matches will return the match.
sub foo { return "foo" =~ m?o? };
say foo(); # 1
say foo(); # undef
reset();
say foo(); # 1
This is a markdown document example.md I have:
## New language
Raku is a new language different from Perl.
## what does it offer
+ Object-oriented programming including generics, roles and multiple dispatch
+ Functional programming primitives, lazy and eager list evaluation, junctions, autothreading and hyperoperators (vector operators)
+ Parallelism, concurrency, and asynchrony including multi-core support
+ Definable grammars for pattern matching and generalized string processing
+ Optional and gradual typing
This code will be evaluated.
```{raku evaluate=TRUE}
4/5
```
Rakudo is a compiler for raku programming language. Install it and you're all set to run raku programs!
This code will be evaluated.
```{raku evaluate=TRUE}
say "this is promising";
say $*CWD;
```
This code will **not** be evaluated.
```{raku evaluate=FALSE}
say "Hello world";
```
which I want to convert into example.md as shown below with the code and output within it.
## New language
Raku is a new language different from Perl.
## what does it offer
+ Object-oriented programming including generics, roles and multiple dispatch
+ Functional programming primitives, lazy and eager list evaluation, junctions, autothreading and hyperoperators (vector operators)
+ Parallelism, concurrency, and asynchrony including multi-core support
+ Definable grammars for pattern matching and generalized string processing
+ Optional and gradual typing
This code will be evaluated.
Code:
```{raku evaluate=TRUE}
4/5
```
Output:
```
0.8
```
Rakudo is a compiler for raku programming language. Install it and you're all set to run raku programs!
This code will be evaluated.
Code:
```{raku evaluate=TRUE}
say "this is promising";
say $*CWD;
```
Output:
```
this is promising
"C:\Users\suman".IO
```
This code will **not** be evaluated.
Code:
```{raku evaluate=FALSE}
say "Hello world";
```
What I want to accomplish is:
capture the code between backticks{raku evaluate} and backticks
execute the code if evaluate is TRUE
insert the code and output back into the document
What I tried to do:
Capture multiline code and evaluate expression
my $array= 'example.md'.IO.slurp;
#multiline capture code chunk and evaluate separately
if $array~~/\`\`\`\{raku (.*)\}(.*)\`\`\`/ {
#the first capture $0 will be evaluate
if $0~~"TRUE"{
#execute second capture which is code chunk which is captured in $1
}else {
# don't execute code
};
};
create a temp.p6 file and write code chunk $1 from above into it
my $fh="temp.p6".IO.spurt: $1;
execute the chunk if $0 is TRUE
my $output= q:x/raku temp.p6/ if $0==TRUE
integrate all this into final example.md while we create intermediate example_new.md
my $fh-out = open "example_new.md", :w; # Create a new file
# Print out next file, line by line
for "$file.tex".IO.lines -> $line {
# write output of code to example_new.md
}
$fh-out.close;
# copy
my $io = IO::Path.new("example_new.md");
$io.copy("example.md");
# clean up
unlink("example.md");
# move
$io.rename("example.md");
I am stuck in the first step. Any help?
There are two ways to execute the code and capture the output:
You can write it to a tempfile and use my $result = qqx{perl6 $filename} to spawn a separate process
You can execute the code in the same interpreter using EVAL, and use IO::Capture::Simple to capture STDOUT:
my $re = regex {
^^ # logical newline
'```{perl6 evaluate=' (TRUE|FALSE) '}'
$<code>=(.*?)
'```'
}
for $input.match(:global, $re) -> $match {
if $match[0] eq 'TRUE' {
use IO::Capture::Simple;
my $result = capture_stdout {
use MONKEY-SEE-NO-EVAL;
EVAL $match<code>;
}
# use $result now
}
}
Now you just need to switch from match to subst and return the value from that block that you want to substitute in, and then you're done.
I hope this gives you some idea how to proceed.
Code that accomplishes "What I want to accomplish"
You can run this code against your data with glot.io.
use v6;
constant $ticks = '```';
my regex Search {
$ticks '{raku evaluate=' $<evaluate>=(TRUE|FALSE) '}'
$<code>=[<!before $ticks> .]*
$ticks
}
sub Replace ($/) {
"Code:\n" ~ $ticks ~ $<code> ~ $ticks ~
($<evaluate> eq 'TRUE'
?? "\n\n" ~ 'Output:' ~ "\n" ~ $ticks ~ "\n" ~ Evaluate($<code>) ~ $ticks
!! '');
}
sub Evaluate ($code) {
my $out; my $*OUT = $*OUT but role { method print (*#args) { $out ~= #args } }
use MONKEY; my $eval-result = EVAL $code;
$out // $eval-result ~ "\n"
}
spurt
'example_new.md',
slurp('example.md')
.subst: &Search, &Replace, :g;
Explanation
Starting at the bottom and then working upwards:
The .subst method substitutes parts of its invocant string that need to be replaced and returns the revised string. .subst's first argument is a matcher; it can be a string, or, as here, a regex -- &Search1. .subst's second argument is a replacement; this can also be a string, or, as here, a Callable -- &Replace. If it's a Callable then .subst passes the match from the matcher as a match object2 as the first argument to the Callable. The :g adverb directs .subst to do the search/replace repeatedly for as many matches as there are in the invocant string.
slurp generates a string in one go from a file. No need for open, using handles, close, etc. Its result in this case becomes the invocant of the .subst explained above.
spurt does the opposite, generating a file in one go from a string, in this case the results of the slurp(...).subst... operation.
The Evaluate routine generates a string that's the output from evaluating the string of code passed to it. To capture the result of evaluation it temporarily modifies Raku's STDOUT variable $*OUT, redirecting prints (and thus also says etc.) to the internal variable $out before EVALing the code. If the EVAL results in anything being printd to $out then that is returned; if not, then the result of the EVAL is returned (coerced to a string by the ~). (A newline is appended in this second scenario but not the first because that is what's needed to get the correctly displayed result given how you've "specified" things by your example.)
The Replace routine is passed a match object from a call of the Code regex. It reconstructs the code section (without the evaluate bit) using the $<code> capture. If the $<evaluate> capture is 'TRUE' then it also appends a fresh Output: section using the Evaluate routine explained above to produce the code's output.
The Code regex matches a code section. It captures the TRUE or FALSE setting from the evaluate directive into a capture named $<evaluate> and the code into a capture named $<code>.
Footnotes
1 To pass a routine (a regex is a routine) rather than call it, it must be written with a sigil (&foo), not without (foo).
2 It does this even if the matcher was merely a string!
You can try this regex:
```{perl6 evaluate=(?<evaluate>[^}]+)}\s+(?<code>[^`]+)
You will get three results from your sample text, each result contains two named groups, the first is evaluate, containing the flag and the second one codeis the code.
Have a look at the regex-demo:
https://regex101.com/r/EsERkJ/1
Since I don't know perl, i can't help you with the implementation :(