Can anyone explain why this unit test is failing? - unit-testing

This has stumped a few of us. It's VS2013, and the code itself builds correctly, as you can see from the image. We've run this test on 2 different machines with the same results.
I originally copied/pasted the code into and back out of MS OneNote, so possibly the cause lies there. But as far as Notepad++ shows, there don't appear to be any special characters.
Ideas?
To expand on this, the following version also fails:
//Note: Why this does not pass is baffling
[TestMethod]
public void FunnyTestThatFailsForSomeReason()
{
    const string expectedErrorMessage = "Web Session Not Found.";
    var a = "Web Session Not Found.";
    string b = "Web Session Not Found.";
    Assert.AreEqual(expectedErrorMessage, a);
    //Assert.AreEqual(expectedErrorMessage, b);
    Assert.AreEqual(expectedErrorMessage.ToString(), b.ToString());
}

You're using Assert.AreEqual(Object, Object), which compares the arguments via Object.Equals rather than the string-specific overload. It's not going to work the way you want it to.
Verifies that two specified objects are equal. The assertion fails if the objects are not equal.
Use Assert.AreEqual(String, String, Boolean).
Verifies that two specified strings are equal, ignoring case or not as specified. The assertion fails if they are not equal.
Or, more simply, your strings are subtly different. Copying and pasting appears to have yielded different results:

(Here for formatting purposes; the existing answer also explains what's happened. This is just the hex dump of your question's code.)
00000000: 2020 2020 7661 7220 6120 3d20 2257 6562 c2a0 5365 7373 696f : var a = "Web..Sessio
00000018: 6ec2 a04e 6f74 c2a0 466f 756e 642e 223b 0a20 2020 2020 2020 :n..Not..Found.";.
00000030: 2020 2020 2073 7472 696e 6720 6220 3d20 2257 6562 2053 6573 : string b = "Web Ses
00000048: 7369 6f6e 204e 6f74 2046 6f75 6e64 2e22 3b0a :sion Not Found.";.
The strings aren't the same.
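A quick way to see the difference for yourself is to dump each string's bytes. Here is a minimal sketch in Python (illustrative only, not the original C# test), using \u00a0 escapes to stand in for the non-breaking spaces the paste introduced:

a = "Web\u00a0Session\u00a0Not\u00a0Found."  # non-breaking spaces, as pasted
b = "Web Session Not Found."                 # plain ASCII spaces

print(a == b)                      # False
print(a.encode("utf-8").hex(" "))  # shows c2 a0 where b has 20
print(b.encode("utf-8").hex(" "))

The UTF-8 encoding of U+00A0 (no-break space) is exactly the c2 a0 pair visible in the hex dump above.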

Related

Pandas exact str matching function?

Does pandas have a built-in string matching function for exact matches and not regex? The code below for tropical_two has a slightly higher count. Documentation tells me it does a regex search.
tropical = reviews['description'].map(lambda x: "tropical" in x).sum()
print(tropical)
tropical_two = reviews['description'].str.count("tropical").sum()
print(tropical_two)
The first way is the answer key from Kaggle, but it seems less readable and intuitive to me than a .str function. When I run the check below it returns True instead of 2, so I am a little confused about whether the answer-key method actually counts all occurrences of "tropical" rather than just the first:
def in_str(text):
    return "tropical" in text

in_str("tropical is tropical")
First 2 lines of dataframe:
0 Italy Aromas include tropical fruit, broom, brimston... Vulkà Bianco 87 NaN Sicily & Sardinia Etna NaN Kerin O’Keefe #kerinokeefe Nicosia 2013 Vulkà Bianco (Etna) White Blend Nicosia
1 Portugal This is ripe and fruity, a wine that is smooth... Avidagos 87 15.0 Douro NaN NaN Roger Voss #vossroger Quinta dos Avidagos 2011 Avidagos Red (Douro) Portuguese Red Quinta dos Avidagos
Notebook here, tropical code in cell #2
https://www.kaggle.com/mikexie0/exercise-summary-functions-and-maps
You may use str.count with word boundary markers to match the exact search term:
tropical_two = reviews['description'].str.count(r'\btropical\b').sum()
print(tropical_two)
There may be no need for a separate exact-match API, since str.count can be used for exact matches this way as well.
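As for the True-versus-2 confusion: "tropical" in x is a per-row membership test, so .map produces one boolean per row and .sum() counts matching rows, whereas .str.count counts every occurrence. A small self-contained comparison (made-up data, not the Kaggle dataset):

import pandas as pd

reviews = pd.DataFrame({"description": ["tropical is tropical", "no match here"]})

# Membership test: one True/False per row, so the sum counts matching rows.
contains = reviews["description"].map(lambda x: "tropical" in x).sum()

# str.count: counts every regex match, so a repeated word counts repeatedly.
occurrences = reviews["description"].str.count(r"\btropical\b").sum()

print(contains)     # 1  (one row contains the word)
print(occurrences)  # 2  (the word appears twice in that row)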

How to reformat this datetime without regex in Google Sheets?

In Google Sheets I want to reformat the datetime Mon, 08 Mar 2021 10:57:15 GMT into 08/03/2021.
Using RegEx I can achieve this with
=to_date(datevalue(REGEXEXTRACT("Mon, 08 Mar 2021 10:57:15 GMT","\b[0-9]{2}\s\D{3}\s[0-9]{4}\b")))
But how can I do it without RegEx? This datetime format seems like a classic one; can it really be that no built-in formula can handle it? I rather think I'm missing the right knowledge here...
Please try the following formula, then format the result as a date:
=TRIM(LEFT(INDEX(SPLIT(K13,","),,2),12))*1
SPLIT breaks the text at the comma, INDEX takes the second piece (" 08 Mar 2021 10:57:15 GMT"), LEFT keeps its first 12 characters (" 08 Mar 2021"), TRIM drops the leading space, and multiplying by 1 coerces the text into a date serial number.
(Do adjust according to your locale.)
Another option is to use a custom script.
Example:
Code:
function formatDate(date) {
  // Format the given value as a date in GMT, rendered day/month/year.
  return Utilities.formatDate(new Date(date), "GMT", "dd/MM/yyyy");
}
Formula in B1: =formatDate(A1)
Reference:
Custom Functions in Google Sheets
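For what it's worth, this RFC-1123-style timestamp also parses cleanly outside Sheets with standard date libraries. A Python sketch of the same transformation, for comparison only:

from datetime import datetime

s = "Mon, 08 Mar 2021 10:57:15 GMT"
dt = datetime.strptime(s, "%a, %d %b %Y %H:%M:%S %Z")
print(dt.strftime("%d/%m/%Y"))  # 08/03/2021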

Ansible: ios upgrade router: check "spacefree_kb" prior to image copy

I'm writing a playbook for an IOS upgrade of multiple switches and have most of the pieces working, with the exception of the free-flash check. Basically, I want to check that there is enough free flash space before copying the image.
I tried using the gather-facts module, but it is not working the way I expected.
From gather facts I see this:
"ansible_net_filesystems_info": {
"flash:": {
"spacefree_kb": 37492,
"spacetotal_kb": 56574
This is the check I want to do:
- fail:
    msg: 'This device does not have enough flash memory to proceed.'
  when: "ansible_net_filesystems_info | json_query('*.spacefree_kb') | int < new_ios_filesize | int"
From doing some research I understand that any value returned by a jinja2 template will be a string so my check is failing:
Pass integer variable to task without losing the integer type
The solution suggested in the link doesn't seem to work for me, even with Ansible 2.7.
I then resorted to storing the results of 'dir' in a register and tried using regex_search, but I can't seem to get the syntax right.
(similar to this :
Ansible regex_findall multiple strings)
"stdout_lines": [
[
"Directory of flash:/",
"",
" 2 -rwx 785 Jul 2 2019 15:39:05 +00:00 dhcp-snooping.db",
" 3 -rwx 1944 Jul 28 2018 20:05:20 +00:00 vlan.dat",
" 4 -rwx 3096 Jul 2 2019 01:03:26 +00:00 multiple-fs",
" 5 -rwx 1915 Jul 2 2019 01:03:26 +00:00 private-config.text",
" 7 -rwx 35800 Jul 2 2019 01:03:25 +00:00 config.text",
" 8 drwx 512 Apr 25 2015 00:03:16 +00:00 c2960s-universalk9-mz.150-2.SE7",
" 622 drwx 512 Apr 25 2015 00:03:17 +00:00 dc_profile_dir",
"",
"57931776 bytes total (38391808 bytes free)"
]
]
Can anyone provide some insight into this seemingly simple task? I just want '38391808' as an integer from the example above (or any other suggestion). I'm fairly new to Ansible.
Thanks in advance.
json_query wildcard expressions return a list. The tasks below
- set_fact:
    free_space: "{{ ansible_net_filesystems_info |
                    json_query('*.spacefree_kb') }}"

- debug:
    var: free_space
give the list
"free_space": [
37492
]
A list can neither be converted to an integer nor compared with an integer. This is the reason for the problem.
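You can reproduce this behaviour outside Ansible with the jmespath library that backs the json_query filter (a quick illustrative check in Python):

import jmespath  # the library behind Ansible's json_query filter

facts = {"flash:": {"spacefree_kb": 37492, "spacetotal_kb": 56574}}

# A wildcard expression always yields a list, even when it matches once.
print(jmespath.search("*.spacefree_kb", facts))     # [37492]
print(jmespath.search("*.spacefree_kb", facts)[0])  # 37492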
The solution is simple. Just take the first element of the list and the condition will start working
- fail:
    msg: 'This device does not have enough flash memory to proceed.'
  when: ansible_net_filesystems_info |
        json_query('*.spacefree_kb') |
        first | int < new_ios_filesize | int
Moreover, json_query is not necessary. The attribute spacefree_kb can be referenced directly
- fail:
    msg: 'This device does not have enough flash memory to proceed.'
  when: ansible_net_filesystems_info['flash:'].spacefree_kb | int < new_ios_filesize | int
json_query does have one advantage, though; see this example from a C9500:
[{'bootflash:': {'spacetotal_kb': 10986424.0, 'spacefree_kb': 4391116.0}}]
Yes, they changed flash: to bootflash:, so the wildcard expression keeps working where a hard-coded 'flash:' reference would not.
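Separately, if you do end up parsing the registered 'dir' output, the free-byte count can be pulled out with a capturing regex. An illustrative Python version of that idea (the same pattern translates to Ansible's regex_search filter):

import re

line = "57931776 bytes total (38391808 bytes free)"
match = re.search(r"\((\d+) bytes free\)", line)
free_bytes = int(match.group(1)) if match else None
print(free_bytes)  # 38391808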

Forvalues dropping leading 0's, how to fix?

I am attempting to create a loop to save me having to type out the code many times. Essentially, I have 60 csv files that I need to alter and save. My code looks as follows:
forvalues i = 0203 0206 : 1112 {
    cd "C:\Users\User\Desktop\Data\"
    import delimited `i'.csv, varnames(1)
    gen time = `i'
    keep rssd9017 rssd9010 bhck4074 bhck4079 bhck4093 bhck2170 time
    save `i'.dta, replace
}
However, I am getting the error "203.csv" does not exist. It seems to be dropping the leading 0; is there any way to fix this?
You are asking for a numlist, but in this context 0203, with nothing else said, just looks to Stata like a quirky but acceptable way to write 203: hence your problem.
But do you really have a numlist that is 0203 0206 : 1112?
Try it:
numlist "0203 0206 : 1112"
ret li
The list starts 203 206 209 212 215 218 221 224 227 230 233 236 ...
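In other words, 0203 0206 : 1112 is read as an arithmetic numlist whose step is 206 - 203 = 3; roughly the Python equivalent of:

# Stata's "203 206 : 1112" numlist: start at 203, step by 3, stop at 1112.
print(list(range(203, 1113, 3))[:12])
# [203, 206, 209, 212, 215, 218, 221, 224, 227, 230, 233, 236]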
My wild guess is that you have one file for each quarter over a period, labelled 0203 for March 2002 through to 1112 for December 2011. In fact you do say that you have times, even though my guess implies 40 files, not 60. If so, you won't have a file labelled 0215, so this is the wrong way to think in any case.
Here is a better approach. First take the cd out of the loop: you need only do that once!
cd "C:\Users\User\Desktop\Data"
Now find the files that are ????.csv. You need only install fs once.
ssc inst fs
fs ????.csv
foreach f in `r(files)' {
    // the 4-character time label comes from the filename, keeping leading zeros
    local t = substr("`f'", 1, 4)
    import delimited `f', varnames(1)
    gen time = "`t'"
    keep rssd9017 rssd9010 bhck4074 bhck4079 bhck4093 bhck2170 time
    save `t'.dta, replace
}
On my guess, you still need to fix the time to something civilised and you would be better off appending the files, but one problem at a time.
Note that insisting on leading zeros, which you think is the problem here but is probably a red herring, is written up here.
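The same pattern (iterate over the files you actually have and derive the label from the filename) carries over to other tools as well; a Python sketch for comparison, with made-up output names:

import glob
import pandas as pd

cols = ["rssd9017", "rssd9010", "bhck4074", "bhck4079", "bhck4093", "bhck2170"]

for path in sorted(glob.glob("????.csv")):
    label = path[:4]               # time label from the filename, e.g. "0203"
    df = pd.read_csv(path)[cols]
    df["time"] = label             # stored as a string, so the leading zero survives
    df.to_csv(label + "_out.csv", index=False)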

R: need to replace invisible/accented characters with regex

I'm working with a file generated by several different machines that had different locale settings, so I ended up with a data-frame column containing different spellings of the same word:
CÓRDOBA
CÓRDOBA
CÒRDOBA
I'd like to convert all those to CORDOBA. I've tried doing
t<-gsub("Ó|Ó|Ã’|°|°|Ò","O",t,ignore.case = T) # t is the vector of names
Which works until it finds some "invisible" characters:
As you can see, I'm not able to see, in R, the additional character that lies between Ã and \ (if I copy-paste into MS Word, Word shows it as an empty rectangle). I've tried to dput the vector, but it prints exactly as on screen (without the "invisible" character).
I ran Encoding(t), and it returns unknown for all values.
My system configuration follows:
> sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)
locale:
[1] LC_COLLATE=Spanish_Colombia.1252 LC_CTYPE=Spanish_Colombia.1252 LC_MONETARY=Spanish_Colombia.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Colombia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zoo_1.7-12 dplyr_0.4.2 data.table_1.9.4
loaded via a namespace (and not attached):
[1] R6_2.1.0 assertthat_0.1 magrittr_1.5 plyr_1.8.3 parallel_3.2.1 DBI_0.3.1 tools_3.2.1 reshape2_1.4.1 Rcpp_0.11.6 stringi_0.5-5
[11] grid_3.2.1 stringr_1.0.0 chron_2.3-47 lattice_0.20-31
I've saved (with saveRDS) a file containing a data frame of actual and expected toy values, which can be loaded with readRDS from here. I'm not absolutely sure it will load with the same problems I have (depending on your locale), but I hope it does, so you can provide some help.
In the end, I'd like to convert all those special characters to unaccented ones (Ó to O, etc.), hopefully without having to manually enter each special one into a regex; in other words, I'd like, if possible, some sort of gsub("[:weird:]","[:equivalentToWeird:]",t). If that's not possible, I'd at least like to be able to find (and replace) those "invisible" characters.
Thanks,
############## EDIT TO ADD ###################
If I run the following code:
library(stringi)
d <- readRDS("c:/path/to(downloaded/Dropbox/file/inv_char.Rdata")
stri_escape_unicode(d$actual)
This is what I get:
[1] "\\u00c3\\u201cN N\\u00c2\\u00b0 08 \\\"CACIQUE CALARC\\u00c3\\u0081\\\" - ARMENIA"
[2] "\\u00d3N N\\u00b0 08 \\\"CACIQUE CALARC\\u00c1\\\" - ARMENIA"
[3] "\\u00d3N N\\u00b0 08 \\\"CACIQUE CALARC\\u00c1\\\" - ARMENIA(ALTERNO)"
Normal output is:
> d$actual
[1] ÓN N° 08 "CACIQUE CALARCÃ" - ARMENIA ÓN N° 08 "CACIQUE CALARCÁ" - ARMENIA ÓN N° 08 "CACIQUE CALARCÁ" - ARMENIA(ALTERNO)
With the help of @hadley, who pointed me towards stringi, I ended up discovering the offending characters and replacing them. This was my initial attempt:
unweird <- function(t){
    t <- stri_escape_unicode(t)
    t <- gsub("\\\\u00c3\\\\u0081|\\\\u00c1", "A", t)
    t <- gsub("\\\\u00c3\\\\u02c6|\\\\u00c3\\\\u2030|\\\\u00c9|\\\\u00c8", "E", t)
    t <- gsub("\\\\u00c3\\\\u0152|\\\\u00c3\\\\u008d|\\\\u00cd|\\\\u00cc", "I", t)
    t <- gsub("\\\\u00c3\\\\u2019|\\\\u00c3\\\\u201c|\\\\u00c2\\\\u00b0|\\\\u00d3|\\\\u00b0|\\\\u00d2|\\\\u00ba|\\\\u00c2\\\\u00ba", "O", t)
    t <- gsub("\\\\u00c3\\\\u2018|\\\\u00d1", "N", t)
    t <- gsub("\\u00a0|\\u00c2\\u00a0", "", t)
    t <- gsub("\\\\u00f3", "o", t)
    t <- stri_unescape_unicode(t)
}
which produced the expected result. I was a little curious about other stringi functions, so I wondered whether its substitution function could be faster on my 3.3 million rows. I then tried stri_replace_all_regex like this:
stri_unweird <- function(t){
    stri_unescape_unicode(stri_replace_all_regex(stri_escape_unicode(t),
        c("\\\\u00c3\\\\u0081|\\\\u00c1",
          "\\\\u00c3\\\\u02c6|\\\\u00c3\\\\u2030|\\\\u00c9|\\\\u00c8",
          "\\\\u00c3\\\\u0152|\\\\u00c3\\\\u008d|\\\\u00cd|\\\\u00cc",
          "\\\\u00c3\\\\u2019|\\\\u00c3\\\\u201c|\\\\u00c2\\\\u00b0|\\\\u00d3|\\\\u00b0|\\\\u00d2|\\\\u00ba|\\\\u00c2\\\\u00ba",
          "\\\\u00c3\\\\u2018|\\\\u00d1",
          "\\u00a0|\\u00c2\\u00a0",
          "\\\\u00f3"),
        c("A", "E", "I", "O", "N", "", "o"),
        vectorize_all = FALSE))
}
As a side note, I ran microbenchmark on both methods; these are the results:
g<-microbenchmark(unweird(t),stri_unweird(t),times = 100L)
summary(g)
             expr      min       lq     mean   median       uq      max neval cld
1      unweird(t) 423.0083 425.6400 431.9609 428.1031 432.6295 490.7658   100   b
2 stri_unweird(t) 118.5831 119.5057 121.2378 120.3550 121.8602 138.3111   100   a
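As a final note, for genuinely accented (rather than double-encoded) text, the generic "strip accents without enumerating them" transform asked for above does exist: stringi ships stri_trans_general(t, "Latin-ASCII"). For comparison, here is an illustrative Python sketch covering both halves of the problem, the mojibake repair and the accent stripping; it assumes the garbling came from UTF-8 bytes mis-decoded as cp1252:

import unicodedata

def fix_mojibake(s):
    # UTF-8 text mis-decoded as cp1252 yields strings like "CÃ“RDOBA";
    # reversing the mis-decode recovers the intended characters.
    try:
        return s.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return s  # leave strings that do not round-trip untouched

def unaccent(s):
    # Decompose accented letters (NFKD), then drop the combining marks.
    return "".join(c for c in unicodedata.normalize("NFKD", s)
                   if not unicodedata.combining(c))

print(unaccent(fix_mojibake("CÃ“RDOBA")))  # CORDOBA
print(unaccent("CÒRDOBA"))                 # CORDOBA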