How to write a DRY and effective if-statement in Go?

I have the following code:
func main() {
	// counts := make(map[string]int)
	files := os.Args[1:]
	if len(files) == 0 {
		counts := make(map[string]int)
		countLines(os.Stdin, counts)
		fmt.Println("os.Stdin")
		printCounts(counts)
	} else {
		for _, arg := range files {
			counts := make(map[string]int)
			f, err := os.Open(arg)
			if err != nil {
				fmt.Fprintf(os.Stderr, "dup2: %v\n", err)
				continue
			}
			countLines(f, counts)
			f.Close()
			// print counts of each file
			printCounts(counts)
		}
	}
}

func printCounts(counts map[string]int) {
	//...
}

func countLines(f *os.File, counts map[string]int) {
	//...
}
Here I repeat myself in the if-else statement by initializing the counts map twice
(counts := make(map[string]int)), once in the if branch and once in the else branch.
My question is: what is the idiomatic Go way of writing this?
Is it better to do the allocation outside the if-else statement with new and do the initialization in every block?

I don't see much repetition in your code. You could merge the if and else branches somehow, but I'm not a fan of that.
A simple refactor is to move the counts initialization into your countLines function and have it return the map:
func countLines(f *os.File, counts map[string]int)
->
func countLines(f *os.File) map[string]int
And don't worry much about allocations until you are doing a lot of them (say, at least 100K), and profile your code before making small optimizations. Maps allocate memory not only on make but also when you insert into them and their hash table has to grow.
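A minimal sketch of that refactor is below. The question elides the bodies of countLines and printCounts, so the ones here are assumptions for illustration only. Accepting an io.Reader instead of *os.File is also an assumption (an *os.File satisfies io.Reader), and countText is a hypothetical helper added only so the function can be tried on an in-memory string.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// countLines reads r line by line and returns how often each line occurs.
// The map is created here, so callers no longer allocate it themselves.
func countLines(r io.Reader) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		counts[sc.Text()]++
	}
	return counts
}

// countText is a hypothetical helper: it runs countLines on a string.
func countText(s string) map[string]int {
	return countLines(strings.NewReader(s))
}

// printCounts is an assumed body: it prints every line seen more than once.
func printCounts(counts map[string]int) {
	for line, n := range counts {
		if n > 1 {
			fmt.Printf("%d\t%s\n", n, line)
		}
	}
}

func main() {
	files := os.Args[1:]
	if len(files) == 0 {
		fmt.Println("os.Stdin")
		printCounts(countLines(os.Stdin))
		return
	}
	for _, arg := range files {
		f, err := os.Open(arg)
		if err != nil {
			fmt.Fprintf(os.Stderr, "dup2: %v\n", err)
			continue
		}
		counts := countLines(f)
		f.Close()
		printCounts(counts)
	}
}
```

With the map created inside countLines, neither branch of main has to call make, and the early return removes the need for an else block.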

Related

Error handling of the lines where regex doesn't match while reading a file

I am trying to read a log file and match a pattern in each line. Right now, if a line doesn't match, the program exits with an error because len(res) == 0, and it stops reading any further lines. I want the program to continue reading the next lines even when the regex doesn't match somewhere in between.
func analyzeLog(s string) (*time.Time, bool) {
	res := regexp.MustCompile(LogLineRegex).FindAllStringSubmatch(s, 1)
	if len(res) == 0 {
		panic("Not Matching")
	}
	timeString := res[0][1]
	description := res[0][2]
	t, err := time.Parse(TimeFormat, timeString)
	check(err)
	return &t, strings.HasPrefix(description, ErrorTerm)
}

func readLogFile(offset int64) (*ErrorMetrics, int64, error) {
	...
	...
	...
	for {
		line, _, err := r.ReadLine()
		if err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		t, hasError := analyzeLog(string(line))
		if hasError {
			em.Count += 1
		}
	}
	...
	...
	...
}
What would be a good way to move ahead?
Don’t panic() and your program won’t exit.
You can use log.Println(err), for example, and from there decide more precisely when you want to log and when you want to panic().
“The panic built-in function stops normal execution of the current goroutine. When a function F calls panic, normal execution of F stops immediately. Any functions whose execution was deferred by F are run in the usual way, and then F returns to its caller. To the caller G, the invocation of F then behaves like a call to panic, terminating G's execution and running any deferred functions. This continues until all functions in the executing goroutine have stopped, in reverse order. At that point, the program is terminated with a non-zero exit code. This termination sequence is called panicking and can be controlled by the built-in function recover.”
You can also recover from the panic, but then you would have to recover and resume reading your logs from the next line, and I don't think that's what you want. You are using panic() in a context where you don't want a panic. Use panic only where further execution is not possible.
Hope this helps.
I now return nil if the result slice is empty. The function looks like this:
func analyzeLog(s string) (*time.Time, bool) {
	res := regexp.MustCompile(LogLineRegex).FindAllStringSubmatch(s, 1)
	if len(res) == 0 {
		return nil, false
	}
	timeString := res[0][1]
	description := res[0][2]
	t, err := time.Parse(TimeFormat, timeString)
	check(err)
	return &t, strings.HasPrefix(description, ErrorTerm)
}
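One possible variation on this, sketched below, reports "no match" through an extra ok result so the caller can skip the line and keep reading, and compiles the regular expression once instead of on every call. The values of LogLineRegex, TimeFormat and ErrorTerm are not shown in the question, so the ones here are assumptions, and countErrors is a hypothetical stand-in for the loop in readLogFile.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// Assumed values for illustration; the question does not show the real ones.
const (
	LogLineRegex = `^\[(.+?)\] (.*)$`
	TimeFormat   = time.RFC3339
	ErrorTerm    = "ERROR"
)

// Compiled once at package level rather than on every analyzeLog call.
var logLineRe = regexp.MustCompile(LogLineRegex)

// analyzeLog returns the timestamp, whether the line is an error line,
// and ok=false when the line does not match or its time cannot be parsed.
func analyzeLog(s string) (time.Time, bool, bool) {
	m := logLineRe.FindStringSubmatch(s)
	if m == nil {
		return time.Time{}, false, false
	}
	ts, err := time.Parse(TimeFormat, m[1])
	if err != nil {
		return time.Time{}, false, false
	}
	return ts, strings.HasPrefix(m[2], ErrorTerm), true
}

// countErrors is a hypothetical stand-in for the read loop: it skips
// lines that cannot be analyzed instead of stopping.
func countErrors(lines []string) (errCount, skipped int) {
	for _, line := range lines {
		_, hasError, ok := analyzeLog(line)
		if !ok {
			skipped++
			continue
		}
		if hasError {
			errCount++
		}
	}
	return errCount, skipped
}

func main() {
	lines := []string{
		"[2021-03-04T05:06:07Z] ERROR disk full",
		"this line does not match",
		"[2021-03-04T05:06:08Z] INFO all good",
	}
	errCount, skipped := countErrors(lines)
	fmt.Println("errors:", errCount, "skipped:", skipped)
}
```

Returning an explicit ok value (or an error) keeps "line did not match" distinct from "line matched and is not an error", which the nil/false pair cannot express once the caller ignores the time.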

Learning to write unit tests

I am trying to learn how to write tests for my code in order to write better code, but I just seem to have the hardest time figuring out how to actually test some code I have written. I have read so many tutorials, most of which seem to only cover functions that add two numbers or mock some database or server.
I have a simple function I wrote below that takes a text template and a CSV file as input and executes the template using the values of the CSV. I have "tested" the code by trial and error, passing files, and printing values, but I would like to learn how to write proper tests for it. I feel that learning to test my own code will help me understand and learn faster and better. Any help is appreciated.
// generateCmds generates configuration commands from a text template using
// the values from a CSV file. Multiple commands in the text template must
// be delimited by a semicolon. The first row of the CSV file is assumed to
// be the header row and the header values are used for key access in the
// text template.
func generateCmds(cmdTmpl string, filename string) ([]string, error) {
	t, err := template.New("cmds").Parse(cmdTmpl)
	if err != nil {
		return nil, fmt.Errorf("parsing template: %v", err)
	}
	f, err := os.Open(filename)
	if err != nil {
		return nil, fmt.Errorf("reading file: %v", err)
	}
	defer f.Close()
	records, err := csv.NewReader(f).ReadAll()
	if err != nil {
		return nil, fmt.Errorf("reading records: %v", err)
	}
	if len(records) == 0 {
		return nil, errors.New("no records to process")
	}
	var (
		b    bytes.Buffer
		cmds []string
		keys = records[0]
		vals = make(map[string]string, len(keys))
	)
	for _, rec := range records[1:] {
		for k, v := range rec {
			vals[keys[k]] = v
		}
		if err := t.Execute(&b, vals); err != nil {
			return nil, fmt.Errorf("executing template: %v", err)
		}
		for _, s := range strings.Split(b.String(), ";") {
			if cmd := strings.TrimSpace(s); cmd != "" {
				cmds = append(cmds, cmd)
			}
		}
		b.Reset()
	}
	return cmds, nil
}
Edit: Thanks for all the suggestions so far! My question was flagged as being too broad, so I have some specific questions regarding my example.
Would a test table be useful in a function like this? And, if so, would the test struct need to include the returned cmds string slice and the value of err? For example:
type tmplTest struct {
	name     string   // test name
	tmpl     string   // the text template
	filename string   // CSV file with template values
	expected []string // expected configuration commands
	err      error    // expected error
}
How do you handle errors that are supposed to be returned for specific test cases? For example, os.Open() returns an error of type *PathError if an error is encountered. How do I initialize a *PathError that is equivalent to the one returned by os.Open()? Same idea for template.Parse(), template.Execute(), etc.
Edit 2: Below is a test function I came up with. My two questions from the first edit still stand.
package cmd

import (
	"path/filepath"
	"strings"
	"testing"
)

type tmplTest struct {
	name     string   // test name
	tmpl     string   // text template to execute
	filename string   // CSV containing template text values
	cmds     []string // expected configuration commands
}

var tests = []tmplTest{
	{"empty_error", ``, "", nil},
	{"file_error", ``, "fake_file.csv", nil},
	{"file_empty_error", ``, "empty.csv", nil},
	{"file_fmt_error", ``, "fmt_err.csv", nil},
	{"template_fmt_error", `{{ }{{`, "test_values.csv", nil},
	{"template_key_error", `{{.InvalidKey}}`, "test_values.csv", nil},
}

func TestGenerateCmds(t *testing.T) {
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			cmds, err := generateCmds(tc.tmpl, filepath.Join("testdata", tc.filename))
			if err != nil {
				// Unexpected error. Fail the test.
				if !strings.Contains(tc.name, "error") {
					t.Fatal(err)
				}
				// TODO: Otherwise, check that the function failed at the expected point.
			}
			if tc.cmds == nil && cmds != nil {
				t.Errorf("expected no commands; got %d", len(cmds))
			}
			// Fatalf stops the subtest here, so the loop below cannot
			// index past the end of tc.cmds when the lengths differ.
			if len(cmds) != len(tc.cmds) {
				t.Fatalf("expected %d commands; got %d", len(tc.cmds), len(cmds))
			}
			for i := range cmds {
				if cmds[i] != tc.cmds[i] {
					t.Errorf("expected %q; got %q", tc.cmds[i], cmds[i])
				}
			}
		})
	}
}
You basically need some sample files with the contents you want to test. Then, in your test code, you call generateCmds with the template string and the files, and verify that the results are what you expect.
It is not that different from the examples you probably saw for simpler cases.
You can place the files under a testdata folder inside the same package (testdata is a special name that the Go tools will ignore during build).
Then you can do something like:
func TestCSVProcessing(t *testing.T) {
	templateStr := `<your template here>`
	testFile := "testdata/yourtestfile.csv"
	result, err := generateCmds(templateStr, testFile)
	if err != nil {
		// fail the test here, unless you expected an error with this file
	}
	// compare the "result" contents with what you expected
	// failing the test if it does not match
}
EDIT
About the specific questions you added later:
Would a test table be useful in a function like this? And, if so, would the test struct need to include the returned cmds string slice and the value of err?
Yes, it'd make sense to include both the expected strings to be returned as well as the expected error (if any).
How do you handle errors that are supposed to be returned for specific test cases? For example, os.Open() returns an error of type *PathError if an error is encountered. How do I initialize a *PathError that is equivalent to the one returned by os.Open()?
I don't think you'll be able to "initialize" an equivalent error for each case. Sometimes the libraries might use internal types for their errors making this impossible. Easiest would be to "initialize" a regular error with the same value returned in its Error() method, then just compare the returned error's Error() value with the expected one.
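As an alternative to comparing Error() strings, and assuming Go 1.13 or later, a test can inspect the kind of the returned error with errors.As or errors.Is instead of constructing an equal value. This only works when the function wraps the cause with %w rather than %v, which generateCmds as posted does not do. The sketch below uses a hypothetical openWrapped function as a stand-in for the os.Open step; the file name is made up.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// openWrapped is a hypothetical stand-in for the os.Open step of
// generateCmds. It wraps with %w so the underlying error stays inspectable.
func openWrapped(filename string) error {
	f, err := os.Open(filename)
	if err != nil {
		return fmt.Errorf("reading file: %w", err)
	}
	return f.Close()
}

// isPathError reports whether err is, or wraps, an *os.PathError.
func isPathError(err error) bool {
	var pe *os.PathError
	return errors.As(err, &pe)
}

func main() {
	err := openWrapped("testdata/does_not_exist.csv")
	fmt.Println("path error:", isPathError(err))
	fmt.Println("not exist:", errors.Is(err, os.ErrNotExist))
}
```

In a test table, the expected-error column could then hold a predicate such as isPathError, or a sentinel like os.ErrNotExist to pass to errors.Is, rather than a fully constructed error value.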

golang string channel send/receive inconsistency

New to go. I'm using 1.5.1. I'm trying to accumulate a word list based on an incoming channel. However, my input channel (wdCh) is sometimes getting the empty string ("") during testing. I'm perplexed. I'd rather not have a test for the empty string before I add its accumulated count in my map. Feels like a hack to me.
package accumulator

import (
	"fmt"
	"github.com/stretchr/testify/assert"
	"testing"
)

var words map[string]int

func Accumulate(wdCh chan string, closeCh chan bool) {
	words = make(map[string]int)
	for {
		select {
		case word := <-wdCh:
			fmt.Printf("word = %s\n", word)
			words[word]++
		case <-closeCh:
			return
		}
	}
}

func pushWords(w []string, wdCh chan string) {
	for _, value := range w {
		fmt.Printf("sending word = %s\n", value)
		wdCh <- value
	}
	close(wdCh)
}

func TestAccumulate(t *testing.T) {
	sendWords := []string{"one", "two", "three", "two"}
	wMap := make(map[string]int)
	wMap["one"] = 1
	wMap["two"] = 2
	wMap["three"] = 1
	wdCh := make(chan string)
	closeCh := make(chan bool)
	go Accumulate(wdCh, closeCh)
	pushWords(sendWords, wdCh)
	closeCh <- true
	close(closeCh)
	assert.Equal(t, wMap, words)
}
Check out this article about channel axioms. It looks like there's a race between closing wdCh and sending true on the closeCh channel.
So the outcome depends on what gets scheduled first after pushWords returns: TestAccumulate or Accumulate.
If TestAccumulate runs first and tries to send true on closeCh, then when Accumulate runs, select picks either of the two cases at random, since both are ready: pushWords has already closed wdCh.
A receive from a closed channel returns the zero value immediately.
Until closeCh is signaled, Accumulate will randomly put one or more empty "" words in the map.
If Accumulate runs first, it's likely to put many empty strings in the word map, because it keeps looping until TestAccumulate runs and finally sends the signal on closeCh.
An easy fix would be to move
close(wdCh)
after sending true on closeCh. That way wdCh can't return the zero value until after you've signaled on closeCh. Additionally, closeCh <- true blocks because closeCh is unbuffered, so wdCh won't get closed until Accumulate has received the signal and stopped looping.
I think the reason is that when you close a channel, select can still receive from it.
So when you close wdCh in pushWords, the loop in Accumulate keeps receiving from <-wdCh.
Maybe you should add some code to handle the case where the channel has been closed:
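for {
	select {
	case word, ok := <-wdCh:
		if !ok {
			fmt.Println("channel wdCh is closed!")
			// A nil channel is never ready in a select, so this case
			// stops firing instead of spinning on the closed channel.
			wdCh = nil
			continue
		}
		fmt.Printf("word = %s\n", word)
		words[word]++
	case <-closeCh:
		return
	}
}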
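Another way to avoid the race altogether, sketched here as one option rather than the only fix, is to drop the second channel: the accumulator ranges over wdCh until it is closed and then hands the finished map back on a result channel. The names accumulate and countWords are hypothetical, and the package-level words variable is replaced by a local map.

```go
package main

import "fmt"

// accumulate counts words until wdCh is closed, then sends the map once.
func accumulate(wdCh <-chan string, result chan<- map[string]int) {
	words := make(map[string]int)
	for word := range wdCh { // ends when wdCh is closed and drained
		words[word]++
	}
	result <- words
}

// countWords is a hypothetical driver that plays the role of the test:
// it sends every word, closes the channel, and waits for the result.
func countWords(in []string) map[string]int {
	wdCh := make(chan string)
	result := make(chan map[string]int)
	go accumulate(wdCh, result)
	for _, w := range in {
		wdCh <- w
	}
	close(wdCh) // the only "done" signal the accumulator needs
	return <-result
}

func main() {
	fmt.Println(countWords([]string{"one", "two", "three", "two"}))
}
```

Because the range loop never yields the zero value for a closed channel, no empty string reaches the map, and receiving from result guarantees the map is complete before it is read.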

Go app hangs when testing a function that contains a lock

This is a function I wrote that adds a request to a request queue:
func (self *RequestQueue) addRequest(request *Request) {
	self.requestLock.Lock()
	self.queue[request.NormalizedUrl()] = request.ResponseChannel
	self.requestLock.Unlock()
}
and this is one of its tests:
func TestAddRequest(t *testing.T) {
	before := len(rq.queue)
	r := SampleRequests(1)[0]
	rq.addRequest(&r)
	if (len(rq.queue) - 1) != before {
		t.Errorf("Failed to add request to queue")
	}
}
When I run this test, the application hangs. If I comment out this test, everything works fine.
I think the problem is the locking inside the function. Is there something that I'm doing wrong?
Thanks for your help!
The problem was an infinite loop in the SampleRequests() function:
func SampleRequests(num int) []Request {
	requests := make([]Request, num, num+10)
	for i := 0; i < len(requests); i++ {
		r := NewRequest("GET", "http://api.openweathermap.org/data/2.5/weather", nil)
		r.Params.Set("lat", "35")
		r.Params.Add("lon", "139")
		r.Params.Add("units", "metric")
		requests = append(requests, r)
	}
	return requests
}
I was checking whether i was less than the length of the slice in the loop condition of the for loop. Each iteration appended an item to the slice, so the length kept growing and the loop never terminated.
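A corrected version might look like the sketch below. The Request type and the newRequest constructor are hypothetical stand-ins, since the real ones are not shown. Note also that make([]Request, num, num+10) already creates num zero-valued elements, so even with a terminating loop, index 0 would hold an empty Request; starting from length 0 avoids that.

```go
package main

import (
	"fmt"
	"net/url"
)

// Request is a hypothetical, minimal stand-in for the asker's type.
type Request struct {
	Method string
	URL    string
	Params url.Values
}

// newRequest is a hypothetical constructor with an initialized Params map.
func newRequest(method, rawURL string) Request {
	return Request{Method: method, URL: rawURL, Params: url.Values{}}
}

// SampleRequests returns exactly num populated requests.
func SampleRequests(num int) []Request {
	// Length 0 and capacity num: append fills the slice from index 0.
	requests := make([]Request, 0, num)
	// The bound is num, which does not change as the slice grows.
	for i := 0; i < num; i++ {
		r := newRequest("GET", "http://api.openweathermap.org/data/2.5/weather")
		r.Params.Set("lat", "35")
		r.Params.Add("lon", "139")
		r.Params.Add("units", "metric")
		requests = append(requests, r)
	}
	return requests
}

func main() {
	reqs := SampleRequests(3)
	fmt.Println(len(reqs), reqs[0].Params.Get("lat"))
}
```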

Multiple concurrency in golang

I'm trying to port a simple synchronous bit of PHP to Go, but am having a hard time getting my head around how concurrency works with regards to channels. The PHP script makes a request to get a list of media library sections, then makes requests to get the items within each of these sections. If the section is a list of TV Shows, it then makes a request for each show to get all the seasons and then another to get the episodes within each season.
I've tried writing in pidgin Go what I expected to work, but I'm not having any luck. I've tried various channel guides online, but normally end up with deadlock warnings. Currently this example fails with an error about item := <-ch used as value, and it doesn't look like it's waiting on the goroutines to return. Does anyone have any ideas what I can do?
package main

import (
	"fmt"
	"time"
)

// Get all items for all sections
func main() {
	ch := make(chan string)
	sections := getSections()
	for _, section := range sections {
		go getItemsInSection(section, ch)
	}
	items := make([]string, 0)
	for item := <- ch {
		items = append(items, item)
	}
	fmt.Println(items)
}

// Return a list of the various library sections
func getSections() []string {
	return []string{"HD Movies", "Movies", "TV Shows"}
}

// Get items within the given section, note that some items may spawn sub-items
func getItemsInSection(name string, ch chan string) {
	time.Sleep(1 * time.Second)
	switch name {
	case "HD Movies":
		ch <- "Avatar"
		ch <- "Avengers"
	case "Movies":
		ch <- "Aliens"
		ch <- "Abyss"
	case "TV Shows":
		go getSubItemsForItem("24", ch)
		go getSubItemsForItem("Breaking Bad", ch)
	}
}

// Get sub-items for a given parent
func getSubItemsForItem(name string, ch chan string) {
	time.Sleep(1 * time.Second)
	ch <- name + ": S01E01"
	ch <- name + ": S01E02"
}
First, that code doesn't compile because for item := <- ch should be for item := range ch.
Now the problem is that you either have to close the channel or run your loop forever inside a goroutine:
go func() {
	for {
		item, ok := <-ch
		if !ok {
			break
		}
		fmt.Println(item)
		items = append(items, item)
	}
}()
time.Sleep(time.Second)
fmt.Println(items)
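For the "close the channel" option, one common pattern is to track every producer goroutine with a sync.WaitGroup and close the channel once they have all finished, so the collecting loop can simply range over it. The sketch below is one way to do this under that assumption: the extra wg parameters and the collectItems function are not in the original code, the time.Sleep calls are left out to keep it short, and the result is sorted only to make the output order stable.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

func getSections() []string {
	return []string{"HD Movies", "Movies", "TV Shows"}
}

// getItemsInSection sends the items of one section and may spawn sub-producers.
func getItemsInSection(name string, ch chan<- string, wg *sync.WaitGroup) {
	defer wg.Done()
	switch name {
	case "HD Movies":
		ch <- "Avatar"
		ch <- "Avengers"
	case "Movies":
		ch <- "Aliens"
		ch <- "Abyss"
	case "TV Shows":
		// Register the sub-goroutines before this goroutine calls Done,
		// so the counter cannot drop to zero too early.
		wg.Add(2)
		go getSubItemsForItem("24", ch, wg)
		go getSubItemsForItem("Breaking Bad", ch, wg)
	}
}

func getSubItemsForItem(name string, ch chan<- string, wg *sync.WaitGroup) {
	defer wg.Done()
	ch <- name + ": S01E01"
	ch <- name + ": S01E02"
}

// collectItems is a hypothetical replacement for the body of main.
func collectItems() []string {
	ch := make(chan string)
	var wg sync.WaitGroup
	for _, section := range getSections() {
		wg.Add(1)
		go getItemsInSection(section, ch, &wg)
	}
	// Close ch once every producer is done; this ends the range loop below.
	go func() {
		wg.Wait()
		close(ch)
	}()
	var items []string
	for item := range ch {
		items = append(items, item)
	}
	sort.Strings(items)
	return items
}

func main() {
	fmt.Println(collectItems())
}
```

Only the main goroutine appends to items, so no lock is needed, and no sleep is required to guess when the producers are finished.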