I'm trying to understand the problem outlined on this slide:
http://talks.golang.org/2013/bestpractices.slide#27
Copying the code in case the URL dies:
func sendMsg(msg, addr string) error {
conn, err := net.Dial("tcp", addr)
if err != nil {
return err
}
defer conn.Close()
_, err = fmt.Fprint(conn, msg)
return err
}
func broadcastMsg(msg string, addrs []string) error {
errc := make(chan error)
for _, addr := range addrs {
go func(addr string) {
errc <- sendMsg(msg, addr)
fmt.Println("done")
}(addr)
}
for _ = range addrs {
if err := <-errc; err != nil {
return err
}
}
return nil
}
func main() {
addr := []string{"localhost:8080", "http://google.com"}
err := broadcastMsg("hi", addr)
time.Sleep(time.Second)
if err != nil {
fmt.Println(err)
return
}
fmt.Println("everything went fine")
}
And the comments:
the goroutine is blocked on the chan write
the goroutine holds a reference to the chan
the chan will never be garbage collected
I'm not sure I understand why the chan never gets collected or which goroutine is keeping a reference to the chan. Your time is appreciated!
The Go Programming Language Specification
Function literals
A function literal represents an anonymous function.
FunctionLit = "func" Function .
func(a, b int, z float64) bool { return a*b < int(z) }
A function literal can be assigned to a variable or invoked directly.
f := func(x, y int) int { return x + y }
func(ch chan int) { ch <- ACK }(replyChan)
Function literals are closures: they may refer to variables defined in
a surrounding function. Those variables are then shared between the
surrounding function and the function literal, and they survive as
long as they are accessible.
Send statements
A send statement sends a value on a channel. The channel expression
must be of channel type, the channel direction must permit send
operations, and the type of the value to be sent must be assignable to
the channel's element type.
SendStmt = Channel "<-" Expression .
Channel = Expression .
Both the channel and the value expression are evaluated before
communication begins. Communication blocks until the send can proceed.
A send on an unbuffered channel can proceed if a receiver is ready. A
send on a buffered channel can proceed if there is room in the buffer.
A send on a closed channel proceeds by causing a run-time panic. A
send on a nil channel blocks forever.
There is only one go statement, go func(addr string), and it's a closure over the channel variable errc.
func broadcastMsg(msg string, addrs []string) error {
errc := make(chan error)
for _, addr := range addrs {
go func(addr string) {
errc <- sendMsg(msg, addr)
fmt.Println("done")
}(addr)
}
for _ = range addrs {
if err := <-errc; err != nil {
return err
}
}
return nil
}
Two goroutines are started since len(addrs) == 2. Because of a premature exit when err != nil on the first receive on channel errc, only one goroutine completes. The second goroutine is blocked on the send (write) to the unbuffered channel errc; it never completes. Therefore, there is still a reference to errc, so it's never garbage collected. The second goroutine is eventually abandoned when the program exits.
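One common way to avoid the stuck goroutine is to give the channel one buffer slot per sender, so every send can complete even when the receiver returns early. The following is a minimal sketch of that idea, not the slide's exact code: send is a hypothetical stand-in for sendMsg so that it runs without a network, and broadcast is an assumed name.

```go
package main

import (
	"errors"
	"fmt"
)

// send is a hypothetical stand-in for sendMsg; it fails on an empty address.
func send(msg, addr string) error {
	if addr == "" {
		return errors.New("empty address")
	}
	return nil
}

func broadcast(msg string, addrs []string) error {
	// One buffer slot per goroutine: every send completes even if the
	// receiving loop below returns on the first error.
	errc := make(chan error, len(addrs))
	for _, addr := range addrs {
		go func(addr string) {
			errc <- send(msg, addr)
		}(addr)
	}
	for range addrs {
		if err := <-errc; err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println(broadcast("hi", []string{"", "localhost:8080"}))
}
```

With the buffer in place no goroutine stays blocked on the send, so once they have all finished nothing references errc and it can be collected.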
Related
This example is taken from http://blog.golang.org/pipelines. It runs and gives the correct answer, but then it fails with the following runtime error: "fatal error: all goroutines are asleep - deadlock!". Could anyone help me understand why this is happening?
package main
import (
"fmt"
)
func gen(nums ...int) <- chan int {
out := make(chan int)
go func() {
for _, n := range nums {
out <- n
}
}()
return out
}
func sq(in <- chan int) <- chan int {
out := make(chan int)
go func() {
for n := range in {
out <- n * n
}
close(out)
}()
return out
}
func main() {
for n := range sq(gen(2,3)) {
fmt.Println(n)
}
}
However the following modification doesn't.
func main() {
// Set up the pipeline.
c := gen(2, 3)
out := sq(c)
// Consume the output.
fmt.Println(<-out) // 4
fmt.Println(<-out) // 9
}
The for n := range in loop in sq() never exits: after reading 2 values it blocks, because gen() never closes its channel.
Adding close(out) to the go func of gen() would make it work: see playground.
With a channel, the receiver blocks until it receives a value.
The range keyword, when used with a channel, keeps receiving from the channel until it is closed.
The goroutine in sq() is blocked, which means close(out) is never called, and in turn main() blocks on range sq() (since the channel returned by sq() is never closed).
In your second example, main() itself exits after two receives, so even though the goroutine in sq() is still blocked, the program stops.
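As a sketch of that fix, here is the same pipeline with gen() closing its channel once all values are sent. The only change to the question's code is the defer close(out) in gen(); everything else keeps the original names.

```go
package main

import "fmt"

func gen(nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		// Closing out lets the range loop in sq terminate.
		defer close(out)
		for _, n := range nums {
			out <- n
		}
	}()
	return out
}

func sq(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			out <- n * n
		}
	}()
	return out
}

func main() {
	for n := range sq(gen(2, 3)) {
		fmt.Println(n)
	}
}
```

Each stage closes the channel it owns, so the close propagates down the pipeline and the range in main() ends cleanly.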
I'm wondering how I can drain / close the buffered channels so that I don't get into a deadlock. I'm using range to loop through the channels, but it seems that although they are "read" they don't get closed like the non-buffered channels do.
package main
func main() {
cp := 2
ch := make(chan string, cp)
for i := 0; i < cp; i++ {
go send(ch)
}
go send(ch)
for lc := range ch {
print(lc)
}
}
func send(ch chan string) {
ch <- "hello\n"
}
You can close channels using the close() builtin. This has to be called after all of your concurrent processing is done. How you do that depends on what you want to do.
In your current architecture it seems that you have to establish some shared state that tracks all your goroutines and determines when the last one has finished. Such state can be provided by a sync.WaitGroup, for example.
func send(c chan string, wg *sync.WaitGroup) {
defer wg.Done()
// ...
}
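wg := &sync.WaitGroup{}
for i := 0; i < cp; i++ {
wg.Add(1)
go send(ch, wg)
}
wg.Add(1)
go send(ch, wg)
// Wait and close in a separate goroutine so that main can drain ch in
// the meantime. With three senders and a buffer of two, calling
// wg.Wait() here directly would block the third sender forever.
go func() {
wg.Wait()
close(ch)
}()
for e := range ch {
// ...
}
Note that once the channel is closed, ranging over it yields only the elements that are still queued in the channel. No goroutine can put a value in the channel anymore (a send on a closed channel panics), which is why close is called only after every sender has finished.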
I have concurrent goroutines which want to append a (pointer to a) struct to the same slice.
How do you write that in Go to make it concurrency-safe?
This would be my concurrency-unsafe code, using a wait group:
var wg sync.WaitGroup
MySlice = make([]*MyStruct, 0)
for _, param := range params {
wg.Add(1)
go func(param string) {
defer wg.Done()
OneOfMyStructs := getMyStruct(param)
MySlice = append(MySlice, &OneOfMyStructs)
}(param)
}
wg.Wait()
I guess you would need to use go channels for concurrency-safety. Can anyone contribute with an example?
There is nothing wrong with guarding the MySlice = append(MySlice, &OneOfMyStructs) with a sync.Mutex. But of course you can also have a result channel with buffer size len(params): all goroutines send their answers to it, and once the work is finished you collect the results from this channel.
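A minimal sketch of the mutex variant could look like this. MyStruct's field, the getMyStruct body, and the collect wrapper are assumptions made only so that the example is self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// MyStruct and getMyStruct are hypothetical stand-ins for the question's types.
type MyStruct struct{ Name string }

func getMyStruct(param string) MyStruct { return MyStruct{Name: param} }

func collect(params []string) []*MyStruct {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []*MyStruct
	)
	for _, param := range params {
		wg.Add(1)
		go func(param string) {
			defer wg.Done()
			s := getMyStruct(param) // do the slow work outside the lock
			mu.Lock()
			out = append(out, &s) // only the append is serialized
			mu.Unlock()
		}(param)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(len(collect([]string{"a", "b", "c"})))
}
```

Keeping the call to getMyStruct outside the critical section means the goroutines only contend for the short append itself. The order of the results is not deterministic.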
If your params has a fixed size:
MySlice = make([]*MyStruct, len(params))
for i, param := range params {
wg.Add(1)
go func(i int, param string) {
defer wg.Done()
OneOfMyStructs := getMyStruct(param)
MySlice[i] = &OneOfMyStructs
}(i, param)
}
wg.Wait() // wait for all writes before reading MySlice
As all goroutines write to different memory this isn't racy.
The answer posted by @jimt is not quite right: it misses the last value sent on the channel, and the last defer wg.Done() is never called. The snippet below has the corrections.
https://play.golang.org/p/7N4sxD-Bai
package main
import "fmt"
import "sync"
type T int
func main() {
var slice []T
var wg sync.WaitGroup
queue := make(chan T, 1)
// Create our data and send it into the queue.
wg.Add(100)
for i := 0; i < 100; i++ {
go func(i int) {
// defer wg.Done() <- will result in the last int to be missed in the receiving channel
queue <- T(i)
}(i)
}
go func() {
// defer wg.Done() <- Never gets called since the 100 `Done()` calls are made above, resulting in the `Wait()` to continue on before this is executed
for t := range queue {
slice = append(slice, t)
wg.Done() // ** move the `Done()` call here
}
}()
wg.Wait()
// now prints off all 100 int values
fmt.Println(slice)
}
I wanted to add that since you know how many values you are expecting from the channel, you may not need any additional synchronization primitives such as a WaitGroup. Just read as many values from the channel as you are expecting and leave it alone:
Borrowing @chris' answer:
package main
import "fmt"
type T int
func main() {
var slice []T
queue := make(chan T)
// Create our data and send it into the queue.
for i := 0; i < 100; i++ {
go func(i int) {
queue <- T(i)
}(i)
}
for i := 0; i < 100; i++ {
select {
case t := <-queue:
slice = append(slice, t)
}
}
// now prints off all 100 int values
fmt.Println(slice)
}
The select blocks until the channel receives some data (with a single case it behaves the same as a plain receive, t := <-queue), so we can rely on this behaviour to just read from the channel 100 times before exiting.
In your case, you can just do:
package main
func main() {
MySlice = []*MyStruct{}
queue := make(chan *MyStruct)
for _, param := range params {
go func(param string) {
OneOfMyStructs := getMyStruct(param)
queue <- &OneOfMyStructs
}(param)
}
for range params {
select {
case OneOfMyStructs := <-queue:
MySlice = append(MySlice, OneOfMyStructs)
}
}
}
What is the idiomatic way to cast multiple return values in Go?
Can you do it in a single line, or do you need to use temporary variables such as I've done in my example below?
package main
import "fmt"
func oneRet() interface{} {
return "Hello"
}
func twoRet() (interface{}, error) {
return "Hejsan", nil
}
func main() {
// With one return value, you can simply do this
str1 := oneRet().(string)
fmt.Println("String 1: " + str1)
// It is not as easy with two return values
//str2, err := twoRet().(string) // Not possible
// Do I really have to use a temp variable instead?
temp, err := twoRet()
str2 := temp.(string)
fmt.Println("String 2: " + str2 )
if err != nil {
panic("unreachable")
}
}
By the way, is it called casting when it comes to interfaces?
i := interface.(int)
You can't do it in a single line.
Your temporary variable approach is the way to go.
By the way, is it called casting when it comes to interfaces?
It is actually called a type assertion.
A type conversion is different:
var a int
var b int64
a = 5
b = int64(a)
func silly() (interface{}, error) {
return "silly", nil
}
v, err := silly()
if err != nil {
// handle error
}
s, ok := v.(string)
if !ok {
// the assertion failed.
}
but more likely what you actually want is to use a type switch, like-a-this:
switch t := v.(type) {
case string:
// t is a string
case int:
// t is an int
default:
// t is some other type that we didn't name.
}
Go is really more about correctness than it is about terseness.
Or just in a single if:
if v, ok := value.(migrater); ok {
v.migrate()
}
Go performs the type assertion inside the if clause; when it succeeds, v has the asserted type within the block, so you can call its methods directly.
template.Must is the standard library's approach for returning only the first return value in one statement. Could be done similarly for your case:
func must(v interface{}, err error) interface{} {
if err != nil {
panic(err)
}
return v
}
// Usage:
str2 := must(twoRet()).(string)
By using must you basically say that there should never be an error, and if there is, then the program can't (or at least shouldn't) keep operating, and will panic instead.
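If panicking is too drastic, the same trick of passing both return values straight into a helper also works for a version that returns an error. The following is only a sketch: asString is a hypothetical helper name, and twoRet is copied from the question.

```go
package main

import "fmt"

func twoRet() (interface{}, error) {
	return "Hejsan", nil
}

// asString is a hypothetical helper: it forwards a non-nil error and
// otherwise uses the comma-ok form so a wrong type does not panic.
func asString(v interface{}, err error) (string, error) {
	if err != nil {
		return "", err
	}
	s, ok := v.(string)
	if !ok {
		return "", fmt.Errorf("expected string, got %T", v)
	}
	return s, nil
}

func main() {
	// A call with two results can be passed directly to a function
	// that takes two matching parameters.
	str2, err := asString(twoRet())
	fmt.Println(str2, err)
}
```

The call site stays a single line, and the caller still decides how to handle both a returned error and an unexpected type.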
What I would like to do is have a set of producer goroutines (of which some may or may not complete) and a consumer routine. The issue is with that caveat in parentheses - we don't know the total number that will return an answer.
So what I want to do is this:
package main
import (
"fmt"
"math/rand"
)
func producer(c chan int) {
// May or may not produce.
success := rand.Float32() > 0.5
if success {
c <- rand.Int()
}
}
func main() {
c := make(chan int, 10)
for i := 0; i < 10; i++ {
go producer(c)
}
// If we include a close, then that's WRONG. Chan will be closed
// but a producer will try to write to it. Runtime error.
close(c)
// If we don't close, then that's WRONG. All goroutines will
// deadlock, since the range keyword will look for a close.
for num := range c {
fmt.Printf("Producer produced: %d\n", num)
}
fmt.Println("All done.")
}
So the issue is, if I close it's wrong, if I don't close - it's still wrong (see comments in code).
Now, the solution would be an out-of-band signal channel, that ALL producers write to:
package main
import (
"fmt"
"math/rand"
)
func producer(c chan int, signal chan bool) {
success := rand.Float32() > 0.5
if success {
c <- rand.Int()
}
signal <- true
}
func main() {
c := make(chan int, 10)
signal := make(chan bool, 10)
for i := 0; i < 10; i++ {
go producer(c, signal)
}
// This is basically a 'join'.
num_done := 0
for num_done < 10 {
<- signal
num_done++
}
close(c)
for num := range c {
fmt.Printf("Producer produced: %d\n", num)
}
fmt.Println("All done.")
}
And that totally does what I want! But to me it seems like a mouthful. My question is: Is there any idiom/trick that lets me do something similar in an easier way?
I had a look here: http://golang.org/doc/codewalk/sharemem/
And it seems like the complete chan (initialised at the start of main) is used in a range but never closed. I do not understand how.
If anyone has any insights, I would greatly appreciate it. Cheers!
Edit: fls0815 has the answer, and has also answered the question of how the close-less channel range works.
My code above modified to work (done before fls0815 kindly supplied code):
package main
import (
"fmt"
"math/rand"
"sync"
)
var wg_prod sync.WaitGroup
var wg_cons sync.WaitGroup
func producer(c chan int) {
success := rand.Float32() > 0.5
if success {
c <- rand.Int()
}
wg_prod.Done()
}
func main() {
c := make(chan int, 10)
wg_prod.Add(10)
for i := 0; i < 10; i++ {
go producer(c)
}
wg_cons.Add(1)
go func() {
for num := range c {
fmt.Printf("Producer produced: %d\n", num)
}
wg_cons.Done()
}()
wg_prod.Wait()
close(c)
wg_cons.Wait()
fmt.Println("All done.")
}
Only producers should close channels. You could achieve your goal by starting consumer(s) that iterate (range) over the resulting channel once your producers have been started. In your main goroutine you wait (see sync.WaitGroup) until your consumers/producers have finished their work. After the producers have finished, you close the resulting channel, which forces your consumers to exit (range exits when the channel is closed and no buffered item is left).
Example code:
package main
import (
"log"
"sync"
"time"
"math/rand"
"runtime"
)
func consumer() {
defer consumer_wg.Done()
for item := range resultingChannel {
log.Println("Consumed:", item)
}
}
func producer() {
defer producer_wg.Done()
success := rand.Float32() > 0.5
if success {
resultingChannel <- rand.Int()
}
}
var resultingChannel = make(chan int)
var producer_wg sync.WaitGroup
var consumer_wg sync.WaitGroup
func main() {
rand.Seed(time.Now().Unix())
for c := 0; c < runtime.NumCPU(); c++ {
producer_wg.Add(1)
go producer()
}
for c := 0; c < runtime.NumCPU(); c++ {
consumer_wg.Add(1)
go consumer()
}
producer_wg.Wait()
close(resultingChannel)
consumer_wg.Wait()
}
The reason I put the close statement into the main function is that we have more than one producer. Closing the channel in one producer in the example above would lead to the problem you already ran into (writing on closed channels; the reason is that there could be one producer left that still produces data). Channels should only be closed when there is no producer left (hence my suggestion that only the producer side closes the channel). This is how channels are constructed in Go. Here you'll find some more information on closing channels.
Related to the sharemem example: AFAICS this example runs endlessly by re-queuing the Resources again and again (from pending -> complete -> pending -> complete... and so on). This is what the iteration at the end of the main func does. It receives the completed Resources and re-queues them to pending using Resource.Sleep(). When there is no completed Resource, it blocks and waits for new Resources to be completed. Therefore there is no need to close the channels, because they are in use all the time.
There are always lots of ways to solve these problems. Here's a solution using the simple synchronous channels that are fundamental in Go. No buffered channels, no closing channels, no WaitGroups.
It's really not that far from your "mouthful" solution, and--sorry to disappoint--not that much smaller. It does put the consumer in its own goroutine, so that the consumer can consume numbers as the producer produces them. It also makes the distinction that a production "try" can end in either success or failure. If production fails, the try is done immediately. If it succeeds, the try is not done until the number is consumed.
package main
import (
"fmt"
"math/rand"
)
func producer(c chan int, fail chan bool) {
if success := rand.Float32() > 0.5; success {
c <- rand.Int()
} else {
fail <- true
}
}
func consumer(c chan int, success chan bool) {
for {
num := <-c
fmt.Printf("Producer produced: %d\n", num)
success <- true
}
}
func main() {
const nTries = 10
c := make(chan int)
done := make(chan bool)
for i := 0; i < nTries; i++ {
go producer(c, done)
}
go consumer(c, done)
for i := 0; i < nTries; i++ {
<-done
}
fmt.Println("All done.")
}
I'm adding this because the extant answers don't make a couple things clear. First, the range loop in the codewalk example is just an infinite event loop, there to keep re-checking and updating the same url list forever.
Next, a channel, all by itself, already is the idiomatic consumer-producer queue in Go. The size of the async buffer backing the channel determines how much producers can produce before getting backpressure. Set N = 0 below to see lock-step producer consumer without anyone racing ahead or getting behind. As it is, N = 10 will let the producer produce up to 10 products before blocking.
Last, there are some nice idioms for writing communicating sequential processes in Go (e.g. functions that start goroutines for you, and using the for/select pattern to communicate and accept control commands). I think of WaitGroups as clumsy, and would like to see idiomatic examples instead.
package main
import (
"fmt"
"time"
)
type control int
const (
sleep control = iota
die // receiver will close the control chan in response to die, to ack.
)
func (cmd control) String() string {
switch cmd {
case sleep: return "sleep"
case die: return "die"
}
return fmt.Sprintf("%d",cmd)
}
func ProduceTo(writechan chan<- int, ctrl chan control, done chan bool) {
var product int
go func() {
for {
select {
case writechan <- product:
fmt.Printf("Producer produced %v\n", product)
product++
case cmd:= <- ctrl:
fmt.Printf("Producer got control cmd: %v\n", cmd)
switch cmd {
case sleep:
fmt.Printf("Producer sleeping 2 sec.\n")
time.Sleep(2000 * time.Millisecond)
case die:
fmt.Printf("Producer dies.\n")
close(done)
return
}
}
}
}()
}
func ConsumeFrom(readchan <-chan int, ctrl chan control, done chan bool) {
go func() {
var product int
for {
select {
case product = <-readchan:
fmt.Printf("Consumer consumed %v\n", product)
case cmd:= <- ctrl:
fmt.Printf("Consumer got control cmd: %v\n", cmd)
switch cmd {
case sleep:
fmt.Printf("Consumer sleeping 2 sec.\n")
time.Sleep(2000 * time.Millisecond)
case die:
fmt.Printf("Consumer dies.\n")
close(done)
return
}
}
}
}()
}
func main() {
N := 10
q := make(chan int, N)
prodCtrl := make(chan control)
consCtrl := make(chan control)
prodDone := make(chan bool)
consDone := make(chan bool)
ProduceTo(q, prodCtrl, prodDone)
ConsumeFrom(q, consCtrl, consDone)
// wait for a moment, to let them produce and consume
timer := time.NewTimer(10 * time.Millisecond)
<-timer.C
// tell producer to pause
fmt.Printf("telling producer to pause\n")
prodCtrl <- sleep
// wait for a second
timer = time.NewTimer(1 * time.Second)
<-timer.C
// tell consumer to pause
fmt.Printf("telling consumer to pause\n")
consCtrl <- sleep
// tell them both to finish
prodCtrl <- die
consCtrl <- die
// wait for that to actually happen
<-prodDone
<-consDone
}
You can use simple unbuffered channels without wait groups if you use the generator pattern with a fanIn function.
In the generator pattern, each producer returns a channel and is responsible for closing it. A fanIn function then iterates over these channels and forwards the values returned on them down a single channel that it returns.
The problem, of course, is that the fanIn function forwards the zero value of the channel's element type (int) whenever a channel is closed without having produced a value.
You can work around it by using the zero value of your channel type as a sentinel value and only using the results from the fanIn channel if they are not the zero value.
Here's an example:
package main
import (
"fmt"
"math/rand"
)
const offset = 1
func producer() chan int {
cout := make(chan int)
go func() {
defer close(cout)
// May or may not produce.
success := rand.Float32() > 0.5
if success {
cout <- rand.Int() + offset
}
}()
return cout
}
func fanIn(cin []chan int) chan int {
cout := make(chan int)
go func() {
defer close(cout)
for _, c := range cin {
cout <- <-c
}
}()
return cout
}
func main() {
chans := make([]chan int, 0)
for i := 0; i < 10; i++ {
chans = append(chans, producer())
}
for num := range fanIn(chans) {
if num >= offset { // real products are at least offset; 0 is the sentinel
fmt.Printf("Producer produced: %d\n", num)
}
}
fmt.Println("All done.")
}
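The sentinel can be avoided if fanIn ranges over each input channel instead of doing a single receive, because a range loop simply ends when the channel is closed and forwards nothing for it. This is a sketch under assumed names: producer here takes an explicit value and an ok flag in place of the random choice, so that the result is deterministic.

```go
package main

import "fmt"

// producer is a hypothetical deterministic variant: it emits v only
// when ok is true, and always closes its channel.
func producer(v int, ok bool) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		if ok {
			out <- v
		}
	}()
	return out
}

func fanIn(cin []<-chan int) <-chan int {
	cout := make(chan int)
	go func() {
		defer close(cout)
		for _, c := range cin {
			// The inner range ends when c is closed, so a channel that
			// produced nothing forwards nothing.
			for v := range c {
				cout <- v
			}
		}
	}()
	return cout
}

func main() {
	chans := []<-chan int{producer(0, true), producer(7, false), producer(3, true)}
	for num := range fanIn(chans) {
		fmt.Printf("Producer produced: %d\n", num)
	}
	fmt.Println("All done.")
}
```

Here a legitimately produced 0 is kept, while the producer that declined to produce contributes nothing. As in the example above, the inputs are drained one after another rather than concurrently.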
Producer-consumer is such a common pattern that I wrote a library, prosumer, to make careful chan communication more convenient. E.g.:
func main() {
maxLoop := 10
var wg sync.WaitGroup
wg.Add(maxLoop)
defer wg.Wait()
consumer := func(ls []interface{}) error {
fmt.Printf("get %+v \n", ls)
wg.Add(-len(ls))
return nil
}
conf := prosumer.DefaultConfig(prosumer.Consumer(consumer))
c := prosumer.NewCoordinator(conf)
c.Start()
defer c.Close(true)
for i := 0; i < maxLoop; i++ {
fmt.Printf("try put %v\n", i)
discarded, err := c.Put(i)
if err != nil {
fmt.Printf("discarded elements %+v for err %v\n", discarded, err)
wg.Add(-len(discarded))
}
time.Sleep(time.Second)
}
}
Close has a param called graceful, which indicates whether to drain the underlying chan.