It's easy to compare (flat) structs in Go, and it's pretty common to have unit tests that verify equality of structs. Now, I'd like to give developers proper feedback on what exactly went wrong. I know I could implement VisualizeStructDifference somehow, but I'd be surprised if it hasn't been done before. I'd be even more surprised if it weren't easily accessible from Go's standard testing tools.
Can you give me some pointers?
An example:
func TestStruct(t *testing.T) {
cases := []struct {
name string
input InputStruct
want OutputStruct
}{
{
name: "Case A",
input: InputStruct{
A: "Input A",
B: "Input B",
},
want: OutputStruct{
One: "Output One",
Two: "Output Two",
},
},
// ...
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
result := unitUnderTest(tc.input)
if result != tc.want {
t.Errorf("Test %v failed: %v", tc.name, VisualizeStructDifference(result, tc.want))
}
})
}
}
I found a solution for my use case: stretchr/testify provides
assert.EqualValuesf(t, tc.want, result, "%v failed", tc.name)
Output:
Diff:
--- Expected
+++ Actual
@@ -3,3 +3,3 @@
One: (string) (len=10) "Output One",
- Two: (string) (len=10) "Output Two",
+ Two: (string) (len=9) "Output Tw",
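If adding a dependency is not an option, a field-by-field comparison can be sketched with the standard reflect package. This is only a minimal sketch: diffFields is a hypothetical helper name (it is not part of testing or testify), and it assumes both values are flat structs of the same type.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// OutputStruct mirrors the struct used in the question.
type OutputStruct struct {
	One string
	Two string
}

// diffFields is a hypothetical helper, not part of testing or testify.
// It compares two values of the same struct type field by field and
// returns one line per exported field whose values differ.
func diffFields(want, got any) string {
	w, g := reflect.ValueOf(want), reflect.ValueOf(got)
	if w.Kind() != reflect.Struct || g.Kind() != reflect.Struct || w.Type() != g.Type() {
		return fmt.Sprintf("type mismatch: want %T, got %T", want, got)
	}
	var lines []string
	for i := 0; i < w.NumField(); i++ {
		f := w.Type().Field(i)
		if !f.IsExported() {
			continue // Interface() would panic on unexported fields
		}
		wf, gf := w.Field(i).Interface(), g.Field(i).Interface()
		if !reflect.DeepEqual(wf, gf) {
			lines = append(lines, fmt.Sprintf("%s: want %#v, got %#v", f.Name, wf, gf))
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	want := OutputStruct{One: "Output One", Two: "Output Two"}
	got := OutputStruct{One: "Output One", Two: "Output Tw"}
	fmt.Println(diffFields(want, got))
}
```

In a test this could be called as t.Errorf("mismatch:\n%s", diffFields(tc.want, result)). For nested structs, slices, or maps, a dedicated diff library is probably the better choice.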
I have two lists. The first contains original product data as following:
data class InputProductData (val optionFamilyInput: String?, val optionCodeInput: String?, val optionDescriptionInput: String?)
val inputProductData = mutableListOf(
InputProductData("AAA", "111","Chimney with red bricks"),
InputProductData(null,"222","Two wide windows in the main floor"),
InputProductData("CCCC",null,"Beautiful door in green color"),
InputProductData("DDDD",null,"House with area 120 square meters"),
InputProductData(null,"555","Old wood windows")
)
Second list consists of customizing data. The list can have many identical option ids (first column).
data class CustomizingProductOption(val id: Int, val optionName: String, val optionCategory: String, val optionFamily: String?, val optionCode: String?, val searchPattern: String?, val outputValue: String)
val customizingProductOptions = mutableListOf(
CustomizingProductOption(10001, "Chimney", "Additional options", "^AAA$", "", "^Chimney with", "Available"),
CustomizingProductOption(10002, "Windows", "Basic options", "", "^222$", "^Two wide windows", "Available"),
CustomizingProductOption(10002, "Windows", "Basic options", "", "^555$", "wood windows$", "Available"),
CustomizingProductOption(10003, "Door color", "Basic options", "^CCCC$", "", "door in green color$", "green"),
CustomizingProductOption(10004, "House area", "Basic options", "^DDD", "", "120 square meters", "120")
)
The goal is to check the product input data and identify the matching product options. Within the following loop this is done using business logic. Two different constellations can occur:
Option family + regex within option description
Option code + regex within option description
data class IndicatedOptions(val id: Int, val output: String)
val indicatedOptions: MutableList<IndicatedOptions> = mutableListOf()
for (i in 0 until inputProductData.size) {
for (k in 0 until customizingProductOptions.size) {
if(inputProductData[i].optionFamilyInput.toString().contains(Regex(customizingProductOptions[k].optionFamily.toString())) == true &&
inputProductData[i].optionDescriptionInput.toString().contains(Regex(customizingProductOptions[k].searchPattern.toString())) == true ||
inputProductData[i].optionCodeInput.toString().contains(Regex(customizingProductOptions[k].optionCode.toString())) == true &&
inputProductData[i].optionDescriptionInput.toString().contains(Regex(customizingProductOptions[k].searchPattern.toString())) == true) {
indicatedOptions.add(IndicatedOptions(customizingProductOptions[k].id, customizingProductOptions[k].outputValue))
}
}
}
println("\n--- ALL INDICATED OPTIONS ---")
indicatedOptions.forEach { println(it) }
val indicatedOptionsUnique = indicatedOptions.distinct().sortedBy { it.id }
println("\n--- UNIQUE INDICATED OPTIONS ---")
indicatedOptionsUnique.forEach {println(it)}
QUESTION: Do you see any way to optimize this code to make it faster?
First, the "regex" code looks broken. Why do you test whether a String contains a Regex? This is the wrong way around; you would normally ask the Regex whether it matches the target string.
Ideas for performance
Precompile your Regex in the constructor of CustomizingProductOption
Your if condition combines four tests with && and ||. A logical expression is evaluated left to right and stops as soon as the result is known, so put the most selective test first (i.e. the one with the fewest matches).
Ideas for readability
use proper streams, e.g. inputProductData.map { customizingProductOptions.filter { LOGIC } }...
Stop using unnecessary toString() on something that is already a String
Stop testing if a boolean expression ==true
Now with sample code:
// Use the Regex class here
data class CustomizingProductOption(
val id: Int, val optionName: String, val optionCategory: String,
val optionFamily: Regex?, val optionCode: Regex?, val searchPattern: String?,
val outputValue: String,
)
// Instantiate like this:
CustomizingProductOption(
10001, "Chimney", "Additional options", Regex("^AAA$"),
null, "^Chimney with", "Available",
),
// main code
val indicatedOptions: List<IndicatedOptions> = inputProductData.map { productData ->
customizingProductOptions.filter { option -> // this filter will only return matching options to product data
productData.optionFamilyInput != null && option.optionFamily?.containsMatchIn(productData.optionFamilyInput) ?: false
//&& other conditions
}
.map {option -> // transform to your desired output
IndicatedOptions(
option.id,
option.outputValue,
)
}
}.flatten() // you need this to flatten List<List<IndicatedOptions>>
I have a struct representing sizes of computer objects. Objects of this struct are constructed from string values input by users; e.g. "50KB" would be tokenised into an int value of "50" and the string value "KB".
type SizeUnit string
const (
B = "B"
KB = "KB"
MB = "MB"
GB = "GB"
TB = "TB"
)
type ObjectSize struct {
NumberOfUnits int
Unit SizeUnit
}
func NewObjectSizeFromString(input_str string) (*ObjectSize, error)
In the body of this function, I first check if the input value is in the valid format; i.e. any number of digits, followed by any one of "B", "KB", "MB", "GB" or "TB". I then extract the int and string components separately and return a pointer to a struct.
In order to do these three things though, I'm having to compile the regex three times.
The first time to check the format of the input string
rg, err := regexp.Compile(`^[0-9]+B$|KB$|MB$|GB$|TB$`)
And then compile again to fetch the int component:
rg, err := regexp.Compile(`^[0-9]+`)
rg.FindString(input_str)
And then compile again to fetch the string/units component:
rg, err := regexp.Compile(`B$|KB$|MB$|GB$|TB$`)
rg.FindString(input_str)
Is there any way to get the two components from the input string with a single regex compilation?
The full code can be found on the Go Playground.
I should point out that this is an academic question as I'm experimenting with Go's regex library. For a simple use-case of this sort, I would probably use a simple for loop to parse the input string.
You can capture both the values with a single expression using regexp.FindStringSubmatch:
func NewObjectSizeFromString(input_str string) (*ObjectSize, error) {
var defaultReturn *ObjectSize = nil
full_search_pattern := `^([0-9]+)([KMGT]?B)$`
rg, err := regexp.Compile(full_search_pattern)
if err != nil {
return defaultReturn, errors.New("Could not compile search expression")
}
matched := rg.FindStringSubmatch(input_str)
if matched == nil {
return defaultReturn, errors.New("Not in valid format")
}
i, err := strconv.ParseInt(matched[1], 10, 32)
if err != nil {
return defaultReturn, errors.New("Number of units is out of range")
}
return &ObjectSize{int(i), SizeUnit(matched[2])}, nil
}
See the playground.
The ^([0-9]+)([KMGT]?B)$ regex matches
^ - start of string
([0-9]+) - Group 1 (this value will be held in matched[1]): one or more digits
([KMGT]?B) - Group 2 (it will be in matched[2]): an optional K, M, G, T letter, and then a B letter
$ - end of string.
Note that matched[0] will hold the whole match.
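As a compact illustration of the capture groups described above, here is a minimal sketch that compiles the pattern once at package level and returns both components. parseSize and sizeRe are hypothetical names chosen for this example; they do not appear in the original code.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// sizeRe is compiled once at package initialisation instead of on every call.
var sizeRe = regexp.MustCompile(`^([0-9]+)([KMGT]?B)$`)

// parseSize is a hypothetical helper returning the number and the unit.
func parseSize(s string) (int, string, error) {
	m := sizeRe.FindStringSubmatch(s)
	if m == nil {
		return 0, "", fmt.Errorf("%q is not in a valid format", s)
	}
	// m[0] is the whole match, m[1] the digits, m[2] the unit.
	n, err := strconv.Atoi(m[1])
	if err != nil { // e.g. more digits than fit in an int
		return 0, "", err
	}
	return n, m[2], nil
}

func main() {
	for _, in := range []string{"50KB", "7B", "50XB"} {
		n, unit, err := parseSize(in)
		fmt.Println(n, unit, err)
	}
}
```

Using regexp.MustCompile for a constant pattern also removes the need to handle a compile error at run time, since an invalid pattern would panic at start-up.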
Does Kotlin provide a mutation function to split a list when a specific predicate is true?
In the following example the list should be split wherever the element is ".".
The result should be of the type List<List<String>>.
// input list
val list = listOf(
"This is", "the", "first sentence", ".",
"And", "now there is", "a second", "one", ".",
"Nice", "."
)
// the following should be the result of the transformation
listOf(
listOf("This is", "the", "first sentence"),
listOf("And", "now there is", "a second", "one"),
listOf("Nice")
)
I need something like list.splitWhen { it == "." }
Does Kotlin provide a mutation function to split a list when a
specific predicate is true?
The closest one I have heard of is partition(), however I don't think it will work in your case.
I have written and briefly tested 3 higher-order extension functions, all of which give the expected output.
Solution 1: Straightforward approach
inline fun List<String>.splitWhen(predicate: (String)->Boolean):List<List<String>> {
val list = mutableListOf<MutableList<String>>()
var needNewList = false
forEach {
string->
if(!predicate(string)){
if(needNewList||list.isEmpty()){
list.add(mutableListOf(string))
needNewList= false
}
else {
list.last().add(string)
}
}
else {
/* When a delimiter is found */
needNewList = true
}
}
return list
}
Solution 2: Pair based approach
inline fun List<String>.splitWhen(predicate: (String)->Boolean):List<List<String>> {
val list = mutableListOf<List<String>>()
withIndex()
.filter { indexedValue -> predicate(indexedValue.value) || indexedValue.index==0 || indexedValue.index==size-1} // Just getting the delimiters with their index; Include 0 and last -- so to not ignore it while pairing later on
.zipWithNext() // zip the IndexValue with the adjacent one so to later remove continuous delimiters; Example: Indices : 0,1,2,5,7 -> (0,1),(1,2),(2,5),(5,7)
.filter { pair-> pair.first.index + 1 != pair.second.index } // Getting rid of continuous delimiters; Example: (".",".") will be removed, where "." is the delimiter
.forEach{pair->
val startIndex = if(predicate(pair.first.value)) pair.first.index+1 else pair.first.index // Trying to not consider delimiters
val endIndex = if(!predicate(pair.second.value) && pair.second.index==size-1) pair.second.index+1 else pair.second.index // subList() endIndex is exclusive
list.add(subList(startIndex,endIndex)) // Adding the relevant sub-list
}
return list
}
Solution 3: Check next value if delimiter found approach
inline fun List<String>.splitWhen(predicate: (String)-> Boolean):List<List<String>> =
foldIndexed(mutableListOf<MutableList<String>>(),{index, list, string->
when {
predicate(string) -> if(index<size-1 && !predicate(get(index+1))) list.add(mutableListOf()) // Adds a new List within the output List; To prevent continuous delimiters -- !predicate(get(index+1))
list.isNotEmpty() -> list.last().add(string) // Just adding it to lastly added sub-list, as the string is not a delimiter
else -> list.add(mutableListOf(string)) // Happens for the first String
}
list})
Simply call list.splitWhen{it=="delimiter"}. Solution 3 is the most concise of the three. Beyond that, you can run a performance test to check which one performs best.
Note: I have done some brief tests which you can have a look via Kotlin Playground or via Github gist.
Is there a function in Go, or a regular expression, that removes only the articles from a string?
I have tried the code below, but it also removes parts of other words from the string:
removalString := "This is a string"
stringToRemove := []string{"a", "an", "the", "is"}
for _, wordToRemove := range stringToRemove {
removalString = strings.Replace(removalString, wordToRemove, "", -1)
}
space := regexp.MustCompile(`\s+`)
trimedExtraSpaces := space.ReplaceAllString(removalString, " ")
spacesCovertedtoDashes := strings.Replace(trimedExtraSpaces, " ", "-", -1)
slug := strings.ToLower(spacesCovertedtoDashes)
fmt.Println(slug)
Edited
Play link
This also removes the "is" that is part of "This".
The expected output is this-string
You can use strings.Split and strings.Join plus a loop for filtering and then building it together again:
removalString := "This is a string"
stringToRemove := []string{"a", "an", "the", "is"}
filteredStrings := make([]string, 0)
for _, w := range strings.Split(removalString, " ") {
shouldAppend := true
lowered := strings.ToLower(w)
for _, w2 := range stringToRemove {
if lowered == w2 {
shouldAppend = false
break
}
}
if shouldAppend {
filteredStrings = append(filteredStrings, lowered)
}
}
resultString := strings.Join(filteredStrings, "-")
fmt.Println(resultString)
Output:
this-string
Program exited.
Here you have the live example
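A variation on the loop above, as a minimal sketch: strings.Fields tolerates repeated spaces, and a map gives a constant-time lookup instead of the inner loop. removeWords is a hypothetical helper name introduced for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// removeWords is a hypothetical helper: it lower-cases the input, drops
// every word found in stop, and joins the remaining words with dashes.
func removeWords(s string, stop map[string]bool) string {
	var kept []string
	// strings.Fields splits on any run of whitespace, so double spaces
	// do not produce empty words.
	for _, w := range strings.Fields(s) {
		lw := strings.ToLower(w)
		if !stop[lw] { // a missing key yields false
			kept = append(kept, lw)
		}
	}
	return strings.Join(kept, "-")
}

func main() {
	stop := map[string]bool{"a": true, "an": true, "the": true, "is": true}
	fmt.Println(removeWords("This is a string", stop))
}
```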
My version, using only regexp:
Construct a regexp of the form '\ba\b|\ban\b|\bthe\b|\bis\b|', which finds
the words in the list that have word boundaries on both sides, so the "is" in "This" is not matched.
A second regexp then replaces each run of spaces with a single dash.
package main
import (
"bytes"
"fmt"
"regexp"
)
func main() {
removalString := "This is a strange string"
stringToRemove := []string{"a", "an", "the", "is"}
var reg bytes.Buffer
for _, x := range stringToRemove {
reg.WriteString(`\b`) // word boundary
reg.WriteString(x)
reg.WriteString(`\b`)
reg.WriteString(`|`) // alternation operator
}
regx := regexp.MustCompile(reg.String())
slug := regx.ReplaceAllString(removalString, "")
regx2 := regexp.MustCompile(` +`)
slug = regx2.ReplaceAllString(slug, "-")
fmt.Println(slug)
}
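The same word-boundary idea can be sketched a little more compactly by joining the words into one non-capturing group, which also avoids the trailing empty alternative. The helper name slugify is hypothetical, and the case-insensitive flag and the final lower-casing are assumptions based on the expected output this-string from the question.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// slugify is a hypothetical helper: it removes the given words (whole
// words only, ignoring case) and joins what is left with dashes.
func slugify(s string, stopWords []string) string {
	quoted := make([]string, len(stopWords))
	for i, w := range stopWords {
		quoted[i] = regexp.QuoteMeta(w) // treat each word literally
	}
	// (?i) = case-insensitive, \b = word boundary, (?:...) = grouping only.
	re := regexp.MustCompile(`(?i)\b(?:` + strings.Join(quoted, "|") + `)\b`)
	cleaned := re.ReplaceAllString(s, " ")
	// strings.Fields drops the empty pieces left behind by removed words.
	return strings.ToLower(strings.Join(strings.Fields(cleaned), "-"))
}

func main() {
	words := []string{"a", "an", "the", "is"}
	fmt.Println(slugify("This is a string", words))
}
```

If the word list is fixed, the regexp could be compiled once outside the function rather than on every call.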
I'm parsing a web page to get some values inside labels, but I'm not interested in the label, only in the content.
I'm using regexp.FindAll to get all the matching expressions (including the label) and then ReplaceAll to replace every subexpression, removing the label. Running the regexp twice takes twice as long, of course, and I'd like to avoid that.
Is there a way to apply both functions simultaneously, or an equivalent regexp?
Of course, I could write a function to remove the label, but in some cases that could be more complex because of variable-length labels (like ), and a regexp can take care of this.
A simple example of my code is here (it won't run in the playground): http://play.golang.org/p/uGKjzmylSY
func main() {
res, err := http.Get("http://www.elpais.es")
if err != nil {
panic(err)
}
body, err := ioutil.ReadAll(res.Body)
fmt.Println("body: ", len(body), cap(body))
res.Body.Close()
if err != nil {
panic(err)
}
r := regexp.MustCompile("<li>(.+)</li>")
// Find all subexpressions, containing the label <li>
out := r.FindAll(body, -1)
for i, v := range out[:10] {
fmt.Printf("%d: %s\n", i, v)
}
//Replace to remove the label.
out2 := make([][]byte, len(out))
for i, v := range out {
out2[i] = r.ReplaceAll(v, []byte("$1"))
}
for i, v := range out2[:10] {
fmt.Printf("%d: %s\n", i, v)
}
}
By the way, I understand that regex cannot be used to parse HTML. I'm only interested in some of the innermost labels, not in the structure or nestings, so I suppose it is OK :)
Recommendation: use goquery for that task; it is very simple to use and shortens your code considerably.
Example:
doc, _ := goquery.NewDocument("http://www.elpais.es")
text := doc.Find("li").Slice(10, -1).Text()
Regarding your question, use FindAllSubmatch to extract the match directly:
r := regexp.MustCompile("<li>(.+)</li>")
// Find all subexpressions, containing the label <li>
out := r.FindAllSubmatch(body, -1)
for i, v := range out[:10] {
fmt.Printf("%d: %s\n", i, v[1])
}
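To try FindAllSubmatch without a network request, here is a minimal sketch that runs against an in-memory snippet. The helper listItems and the sample HTML are made up for illustration, and the non-greedy (.+?) is an adjustment to the pattern above so that several items on one line are not merged into a single match.

```go
package main

import (
	"fmt"
	"regexp"
)

// liRe uses the non-greedy (.+?); this differs from the pattern in the
// question, where (.+) would span from the first <li> to the last </li>
// on a line.
var liRe = regexp.MustCompile(`<li>(.+?)</li>`)

// listItems is a hypothetical helper returning the text inside each <li>.
func listItems(body []byte) []string {
	var items []string
	for _, m := range liRe.FindAllSubmatch(body, -1) {
		// m[0] is the whole match including the tags, m[1] is group 1.
		items = append(items, string(m[1]))
	}
	return items
}

func main() {
	body := []byte("<ul><li>one</li><li>two</li>\n<li>three</li></ul>")
	fmt.Println(listItems(body))
}
```

Note that slicing the result with out[:10], as in the code above, panics when fewer than ten matches are found, so ranging over the whole result is the safer default.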