Can Gomega equal with ginkgo print full strings? - unit-testing

Example error print of a unit test:
Expected
<string>: "...up - Finish..."
to equal |
<string>: "...up - Vault ..."
Is there a way to increase the print limit? This is just not practical at all...
At least to something like 100 characters...
Edit:
I might not have given enough information:
Vault ...
Finish...
are not the only parts of the string that differ, and it is very hard to read without more context when an error occurs. There should be a way to allow full comparison prints, no? Similar to how it works in Node.js Chai.

I do not believe this is possible without modifying the underlying Ginkgo code, at least from my reading and experimentation. Ginkgo uses its own reporting framework, and you can leverage that to customize the output...within limits.
One way to see the raw report output is to dump it as JSON with the --json-report <PATH> flag for ginkgo:
$ ginkgo --json-report=spec-out.json
I created a simple spec that compared two really long strings (just the English alphabet repeated, separated by spaces, a bunch of times), differing only in the replacement of a single alphabet block with "foobar", and the contents of the report relevant to what you'd see in the normal output were limited to:
"Failure": {
"Message": "Expected\n \u003cstring\u003e: \"...wxyz abcdef...\"\nto equal |\n \u003cstring\u003e: \"...wxyz foobar...\"",
Then I changed the compared string so that the diff extended for a much longer stretch, and the message was identical - still truncated to the initial point of mismatch.
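For reference, the spec was along these lines (a sketch reconstructed from the description above; the exact repeat counts don't matter):
package utils_test

import (
	"strings"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

var _ = Describe("Utils", func() {
	It("Compares really long strings", func() {
		block := "abcdefghijklmnopqrstuvwxyz "
		left := strings.Repeat(block, 40)
		// Replace one alphabet block partway through with "foobar " so the
		// mismatch starts well inside the string.
		right := left[:10*len(block)] + "foobar " + left[11*len(block):]
		Expect(left).To(Equal(right))
	})
})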
You can access the underlying reporting framework through Ginkgo itself, for example:
ReportAfterEach(func(report SpecReport) {
	fmt.Fprintf(os.Stderr, "SPEC REPORT: %s | %s\nFAILURE MESSAGE: %s\n",
		report.State, report.FullText(), report.FailureMessage())
})
This also returned the truncated strings in the message, which would indicate to me that there is no easily accessible mechanism for getting a longer string to output - since this is presumably generated at a lower level (I'm thinking in the code for the Equal() matcher, but I haven't looked yet):
SPEC REPORT: failed | Utils Compares really long strings
FAILURE MESSAGE: Expected
<string>: "...wxyz abcdef..."
to equal |
<string>: "...wxyz foobar..."
For reference see the ginkgo reporting docs and relevant portion of the ginkgo godoc.
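One more avenue that might be worth experimenting with, since the truncation presumably happens in the matcher/formatting layer rather than in the reporter: Gomega's format package exposes package-level settings that control how strings are truncated. The following sketch is based on my reading of gomega/format (names and behaviour should be verified against the version you have installed), set from the suite bootstrap before the specs run:
package utils_test

import (
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
	"github.com/onsi/gomega/format"
)

func TestUtils(t *testing.T) {
	// Package-level knobs in gomega/format; whether they fully remove the "..."
	// excerpt for Equal() on long strings is something to verify for your version.
	format.TruncatedDiff = false // print full strings instead of a truncated diff
	format.MaxLength = 0         // 0 disables Gomega's overall length cap for formatted objects

	RegisterFailHandler(Fail)
	RunSpecs(t, "Utils Suite") // hypothetical suite name
}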

Related

OpenModelica SimulationOptions 'variableFilter' not working with '^' exceptions

To reduce the size of my simulation output files, I want to give variable-name exceptions, instead of a list of many specific variables, to the simulationsOptions/outputFilter (cf. OpenModelica Users Guide / Output) of my model. I found the regexp operator "^" to fulfill my needs, but it didn't work as expected. So I think that something is wrong with the interpretation of connected character strings when negated.
Example:
When I have any derivatives der(...) in my model and use variableFilter=der.*, the output file will contain all the filtered derivatives. Since there are no other variables beginning with the character d, the same happens with variableFilter=d.*. For testing I also tried variableFilter=rde.* to confirm that every variable is filtered.
When I now try to exclude them with variableFilter=^der.*, =^rde.* or =^d.*, I get exactly the same result as without using ^. So the operator seems to be ignored in this notation.
When I instead use variableFilter=[^der].*, =[^rde].* or even =[^d].*, all wanted derivative variables are filtered from the output, but there is no difference between those three expressions. To me it seems that every character is interpreted standalone and not as a connected string.
Did I understand and use the regexp right, or could this be a code bug?
Side/follow-up question: Where can I officially report this for software revision?
OpenModelica v.1.19.2 (64-bit)
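For context on the regexp side (not OpenModelica-specific): ^ outside a character class is an anchor, not negation, and [^der] negates single characters rather than the string "der". Here is a small Go sketch illustrating both effects (anchoring each whole variable name is an assumption about how the filter is applied):
package main

import (
	"fmt"
	"regexp"
)

func main() {
	vars := []string{"der(x)", "der(v)", "x", "v", "energy", "rpm"}

	for _, pat := range []string{"der.*", "^der.*", "[^der].*"} {
		re := regexp.MustCompile("^(?:" + pat + ")$")
		var kept []string
		for _, v := range vars {
			if re.MatchString(v) {
				kept = append(kept, v)
			}
		}
		// der.* and ^der.* select the same names; [^der].* drops anything whose
		// first character is d, e, or r, not just names starting with "der".
		fmt.Printf("%-10s -> %v\n", pat, kept)
	}
}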

AWS Signature Version 2 Example not reproducible

Like the asker of this question (AWS Signature Version 2 - can't reproduce signature from example), I can't reproduce the example of AWS Signature Version 2 (https://docs.aws.amazon.com/general/latest/gr/signature-version-2.html).
We have the string:
GET\nelasticmapreduce.amazonaws.com\n/\nAWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Action=DescribeJobFlows&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2011-10-03T15%3A19%3A30&Version=2009-03-31
and the sample secret key
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
To be independent of any programming language, let's use an online tool for the hash, which is calculated with HmacSHA256: https://www.liavaag.org/English/SHA-Generator/HMAC/
But I get the following hash value:
xgbYI2xegVYMVTvnhoqc8/opbN0v/5Pn+8i9usAQAjk=
which is sadly not the expected value (not URL-encoded here):
i91nKc4PWAt0JJIdXwz9HxZCJDdiy6cf/Mj6vPxyYIs=
What did I do wrong? Why is my calculation of the hash value not correct? Is the initial string correct? If you manage to get the right result with the online tool, please let me know how it was done.
TLDR: It's the newlines
Although some tools and programming languages, particularly those based on C or originating on Unix where C was heavily used, treat \n as a notation for a newline, that webpage does not. If I enter the string from your question in the webpage's 'text' mode, it computes the HMAC of a value containing a literal backslash followed by a lowercase letter 'n', not a newline as required by the AWS spec.
If I enter the correct input (containing newlines) in hex as
4745540a656c61737469636d61707265647563652e616d617a6f6e6177732e636f6d0a2f0a4157534163636573734b657949643d414b4941494f53464f444e4e374558414d504c4526416374696f6e3d44657363726962654a6f62466c6f7773265369676e61747572654d6574686f643d486d6163534841323536265369676e617475726556657273696f6e3d322654696d657374616d703d323031312d31302d3033543135253341313925334133302656657273696f6e3d323030392d30332d3331
or in base64 as
R0VUCmVsYXN0aWNtYXByZWR1Y2UuYW1hem9uYXdzLmNvbQovCkFXU0FjY2Vzc0tleUlkPUFLSUFJT1NGT0ROTjdFWEFNUExFJkFjdGlvbj1EZXNjcmliZUpvYkZsb3dzJlNpZ25hdHVyZU1ldGhvZD1IbWFjU0hBMjU2JlNpZ25hdHVyZVZlcnNpb249MiZUaW1lc3RhbXA9MjAxMS0xMC0wM1QxNSUzQTE5JTNBMzAmVmVyc2lvbj0yMDA5LTAzLTMx
then I get the correct result (and you should too).
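To check this outside the browser, here is a minimal Go sketch; the string to sign and the key are the ones from the AWS docs, and in a Go string literal \n is a real newline byte, which is exactly what the spec requires:
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

func main() {
	stringToSign := "GET\n" +
		"elasticmapreduce.amazonaws.com\n" +
		"/\n" +
		"AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Action=DescribeJobFlows&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2011-10-03T15%3A19%3A30&Version=2009-03-31"
	secretKey := "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

	mac := hmac.New(sha256.New, []byte(secretKey))
	mac.Write([]byte(stringToSign))

	// Prints i91nKc4PWAt0JJIdXwz9HxZCJDdiy6cf/Mj6vPxyYIs= , the expected value
	// from the docs (before URL encoding).
	fmt.Println(base64.StdEncoding.EncodeToString(mac.Sum(nil)))
}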

RGoogleAnalytics replacing unexpectedly escaped characters with gsub

I'm using RGoogleAnalytics; I'm just at the learning stage at the moment.
I'm following the code in the tutorial here: https://code.google.com/p/r-google-analytics/
But when I try to run
ga.goals <- conf$GetGoals()
ga.goals
I get an error message telling me there is an unexpected escaped character '\.' at pos 7
I get a similar message for the next two lines of code (GetSegments)
This question deals with a similar problem in the Facebook Graph API:
How to replace "unexpected escaped character" in R
I've tried using a similar bit of code
confGoalsSub <- gsub('\\.', ' ', conf$GetGoals())
to remove the escaped characters, but I get another error:
cannot coerce type 'closure' to vector of type 'character'
Out of desperation I have tried confGoalsSub <- gsub('\\.', ' ', conf) which returns a character vector that is just garbage (it's just the code for conf with the decimal points stripped out).
Can anyone suggest a better expression than gsub that will return a useful object?
EDIT: As per the suggestion below, I've now added the brackets at the end of the function call, but I still get the same error message about unexpected escape characters. I get the same error when I try to call other, similar functions such as $GetSegments().
I saw in a video over the weekend that this package was broken for a long time, although the speaker did not provide details as to why. Perhaps I should give up and try one of the other Google Analytics packages for R.
Seems odd, given that this one is supposed to be Google-supported.
I think this error arises when the RJSON library isn't able to parse the Google Analytics Data Feed properly and convert it into a nested list. The updated version of RGoogleAnalytics (http://cran.r-project.org/web/packages/RGoogleAnalytics/index.html) fixes this problem. Currently, you won't be able to retrieve Goals and Segments from your Google Analytics account using the library, but beyond that it supports the full range of dimensions and metrics.

eval failing to match regex after some time

I get the first input from the user, which is a tree of nodes (having significant height and depth). Each node contains a regex and modifiers. This tree gets saved in memory. It is taken only once, at application startup.
The second input is a value which is matched starting at the root node of the tree until an exactly matching leaf node is found (depth-first search). The match is determined as follows:
my $evalstr = <<EOEVAL;
if(\$input_value =~ /\$node_regex/$node_modifiers){
    1;
}else{
    -1;
}
EOEVAL
no strict 'refs';
my $return_value = eval "no strict;$evalstr";
The second input is provided continuously throughout the application's life time by a source.
Problem:
The above code works very well for some time (approx. 10 hours), but after continuous input for that long, the eval starts failing continuously and I get -1 in $return_value. All other features of the application work fine, including other comparison statements. If I restart the application, matching starts working again and gives proper results.
Observations:
1) I get the deep recursion warning many times, but I read somewhere that this is normal, as the stack depth for me would often exceed 100, considering the size of the input tree.
2) If I use simple logic for the regex match, without the eval above, I don't get any issue for any continuous run of the application:
if($input_value =~ /$node_regex/){
    $return_value = 1;
}else{
    $return_value = -1;
}
but then I have to sacrifice dynamic modifiers, as per Dynamic Modifiers
Checks:
1) I checked $@ (the eval error) but it is empty.
2) I also printed the respective values of $input_value, $node_regex and $node_modifiers; they are correct, and the value should have matched the regex at the failure point.
3) I checked memory usage, but it's fairly constant over time for the Perl process.
4) I was using Perl 5.8.8, then updated to 5.12, but I still face the same issue.
Question :
What could be the cause of the above issue? Why does it fail after some time, but work well when the application is restarted?
A definitive answer would require more knowledge of Perl internals than I have. But given what you are doing, continuous parsing of large trees, it seems safe to assume that some limit is being reached or some resource is exhausted. I would take a close look and make sure that all resources are being released between iterations of a parse. I would be especially concerned with circular references in the complex structures, and would make sure that there are none.

RegEx to parse or validate Base64 data

Is it possible to use a RegEx to validate or sanitize Base64 data? That's the simple question, but the factors that drive this question are what make it difficult.
I have a Base64 decoder that cannot fully rely on the input data following the RFC specs. So the issues I face are things like Base64 data that may not be broken up into 78-character lines (I think it's 78; I'd have to double-check the RFC, so don't ding me if the exact number is wrong), or lines that may not end in CRLF, in that they may have only a CR, or an LF, or maybe neither.
So, I've had a hell of a time parsing Base64 data formatted as such. Due to this, examples like the following become impossible to decode reliably. I will only display partial MIME headers for brevity.
Content-Transfer-Encoding: base64
VGhpcyBpcyBzaW1wbGUgQVNDSUkgQmFzZTY0IGZvciBTdGFja092ZXJmbG93IGV4YW1wbGUu
Ok, so parsing that is no problem, and it is exactly the result we would expect. And in 99% of cases, using any code to at least verify that each char in the buffer is a valid base64 char works perfectly. But the next example throws a wrench into the mix.
Content-Transfer-Encoding: base64
http://www.stackoverflow.com
VGhpcyBpcyBzaW1wbGUgQVNDSUkgQmFzZTY0IGZvciBTdGFja092ZXJmbG93IGV4YW1wbGUu
This is a version of Base64 encoding that I have seen in some viruses and other things that attempt to take advantage of some mail readers' desire to parse MIME at all costs, versus ones that go strictly by the book, or rather the RFC, if you will.
My Base64 decoder decodes the second example to the following data stream. And keep in mind here, the original stream is all ASCII data!
[0x]86DB69FFFC30C2CB5A724A2F7AB7E5A307289951A1A5CC81A5CC81CDA5B5C1B19481054D0D
2524810985CD94D8D08199BDC8814DD1858DAD3DD995C999B1BDDC8195E1B585C1B194B8
Anyone have a good way to solve both problems at once? I'm not sure it's even possible, outside of doing two transforms on the data with different rules applied, and comparing the results. However if you took that approach, which output do you trust? It seems that ASCII heuristics is about the best solution, but how much more code, execution time, and complexity would that add to something as complicated as a virus scanner, which this code is actually involved in? How would you train the heuristics engine to learn what is acceptable Base64, and what isn't?
UPDATE:
Due to the number of views this question continues to get, I've decided to post the simple RegEx that I've been using in a C# application for 3 years now, with hundreds of thousands of transactions. Honestly, I like the answer given by Gumbo the best, which is why I picked it as the selected answer. But to anyone using C# and looking for a very quick way to at least detect whether a string or byte[] contains valid Base64 data or not, I've found the following to work very well for me.
[^-A-Za-z0-9+/=]|=[^=]|={3,}$
And yes, this is just for a STRING of Base64 data, NOT a properly formatted RFC1341 message. So, if you are dealing with data of this type, please take that into account before attempting to use the above RegEx. If you are dealing with Base16, Base32, Radix, or even Base64 for other purposes (URLs, file names, XML encoding, etc.), then it is highly recommended that you read RFC4648, which Gumbo mentioned in his answer, as you need to be well aware of the charset and terminators used by the implementation before attempting to use the suggestions in this question/answer set.
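Since the pattern uses no lookarounds, the same quick check ports unchanged to other regex engines; here is a small Go sketch (my reading of the pattern is that a match means the input contains something that cannot appear in padded Base64, so "no match" is the pass condition):
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// A match flags a character outside the allowed set, an '=' that is not part
	// of trailing padding, or three or more '=' at the end.
	invalid := regexp.MustCompile(`[^-A-Za-z0-9+/=]|=[^=]|={3,}$`)

	for _, s := range []string{
		"VGhpcyBpcyBzaW1wbGUgQVNDSUkgQmFzZTY0IGZvciBTdGFja092ZXJmbG93IGV4YW1wbGUu",
		"http://www.stackoverflow.com", // the stray line from the second example above
	} {
		fmt.Printf("%q passes the check: %v\n", s, !invalid.MatchString(s))
	}
}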
From RFC 4648:
Base encoding of data is used in many situations to store or transfer data in environments that, perhaps for legacy reasons, are restricted to US-ASCII data.
So whether the data should be considered dangerous depends on the purpose for which the encoded data is used.
But if you’re just looking for a regular expression to match Base64 encoded words, you can use the following:
^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$
This one is good, but it will match an empty string.
This one does not match the empty string:
^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{4})$
The answers presented so far fail to check that the Base64 string has all pad bits set to 0, as required for it to be the canonical representation of Base64 (which is important in some environments; see https://www.rfc-editor.org/rfc/rfc4648#section-3.5), and therefore they allow aliases, i.e. different encodings of the same binary string. This could be a security problem in some applications.
Here is the regexp that verifies that the given string is not just valid base64, but also the canonical base64 string for the binary data:
^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/][AQgw]==|[A-Za-z0-9+/]{2}[AEIMQUYcgkosw048]=)?$
The cited RFC considers the empty string as valid (see https://www.rfc-editor.org/rfc/rfc4648#section-10) therefore the above regex also does.
The equivalent regular expression for base64url (again, refer to the above RFC) is:
^(?:[A-Za-z0-9_-]{4})*(?:[A-Za-z0-9_-][AQgw]==|[A-Za-z0-9_-]{2}[AEIMQUYcgkosw048]=)?$
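To make the difference concrete, here is a small Go sketch comparing the plain validity regex from the earlier answer with the canonical one above (both avoid lookarounds, so they compile unchanged under Go's RE2-based regexp package):
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Any padded Base64 (empty string allowed).
	valid := regexp.MustCompile(`^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$`)
	// Canonical Base64 only: the characters before the padding must leave all pad bits zero.
	canonical := regexp.MustCompile(`^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/][AQgw]==|[A-Za-z0-9+/]{2}[AEIMQUYcgkosw048]=)?$`)

	for _, s := range []string{
		"YW55IGNhcm5hbCBwbGVhcw==", // canonical encoding of "any carnal pleas"
		"YW55IGNhcm5hbCBwbGVhcx==", // same alphabet and length, but non-zero pad bits
	} {
		fmt.Printf("%-26s valid=%v canonical=%v\n", s, valid.MatchString(s), canonical.MatchString(s))
	}
}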
Neither a ":" nor a "." will show up in valid Base64, so I think you can unambiguously throw away the http://www.stackoverflow.com line. In Perl, say, something like
use MIME::Base64 qw(decode_base64);
use feature 'say';
my $sanitized_str = join q{}, grep { !/[^A-Za-z0-9+\/=]/ } split /\n/, $str;
say decode_base64($sanitized_str);
might be what you want. It produces
This is simple ASCII Base64 for StackOverflow example.
The best regexp I could find up till now is here:
https://www.npmjs.com/package/base64-regex
which in the current version looks like:
module.exports = function (opts) {
  opts = opts || {};
  var regex = '(?:[A-Za-z0-9+\/]{4}\\n?)*(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)';

  return opts.exact ? new RegExp('(?:^' + regex + '$)') :
    new RegExp('(?:^|\\s)' + regex, 'g');
};
Here's an alternative regular expression:
^(?=(.{4})*$)[A-Za-z0-9+/]*={0,2}$
It satisfies the following conditions:
The string length must be a multiple of four - (?=(.{4})*$)
The content must be alphanumeric characters or + or / - [A-Za-z0-9+/]*
It can have up to two padding (=) characters on the end - ={0,2}
It accepts empty strings
To validate a base64 image we can use this regex:
/^data:image\/(?:gif|png|jpeg|bmp|webp)(?:;charset=utf-8)?;base64,(?:[A-Za-z0-9]|[+/])+={0,2}/
private validBase64Image(base64Image: string): boolean {
  const regex = /^data:image\/(?:gif|png|jpeg|bmp|webp|svg\+xml)(?:;charset=utf-8)?;base64,(?:[A-Za-z0-9]|[+/])+={0,2}/;
  return !!base64Image && regex.test(base64Image);
}
The shortest regex to check RFC 4648 compliance while enforcing canonical encoding (i.e. all pad bits set to 0):
^(?=(.{4})*$)[A-Za-z0-9+/]*([AQgw]==|[AEIMQUYcgkosw048]=)?$
Actually, this is a mix of the length-lookahead answer and the canonical-form answer above.
I found a solution that works very well (note: it uses the (?1) subroutine reference, so it needs a regex engine that supports it, such as PCRE):
^(?:([a-z0-9A-Z+\/]){4})*(?1)(?:(?1)==|(?1){2}=|(?1){3})$
It will match the following strings
VGhpcyBpcyBzaW1wbGUgQVNDSUkgQmFzZTY0IGZvciBTdGFja092ZXJmbG93IGV4YW1wbGUu
YW55IGNhcm5hbCBwbGVhcw==
YW55IGNhcm5hbCBwbGVhc3U=
YW55IGNhcm5hbCBwbGVhc3Vy
while it won't match any of these invalid ones:
YW5#IGNhcm5hbCBwbGVhcw==
YW55IGNhc=5hbCBwbGVhcw==
YW55%%%%IGNhcm5hbCBwbGVhc3V
YW55IGNhcm5hbCBwbGVhc3
YW55IGNhcm5hbCBwbGVhc
YW***55IGNhcm5hbCBwbGVh=
YW55IGNhcm5hbCBwbGVhc==
YW55IGNhcm5hbCBwbGVhc===
My simplified version of Base64 regex:
^[A-Za-z0-9+/]*={0,2}$
The simplification is that it doesn't check that the length is a multiple of 4. If you need that, use the other answers; mine focuses on simplicity.
To test it: https://regex101.com/r/zdtGSH/1
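For completeness, a quick Go check of the trade-off; the second string below is one of the invalid examples from the longer answer above, and it still passes here because the length check is omitted:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	simple := regexp.MustCompile(`^[A-Za-z0-9+/]*={0,2}$`)

	fmt.Println(simple.MatchString("YW55IGNhcm5hbCBwbGVhcw==")) // true: valid Base64
	fmt.Println(simple.MatchString("YW55IGNhcm5hbCBwbGVhc"))    // true, even though the length is not a multiple of 4
	fmt.Println(simple.MatchString("YW5#IGNhcm5hbCBwbGVhcw==")) // false: '#' is outside the alphabet
}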