I wrote a small program to benchmark Go channel throughput, but it always deadlocks. I have tried hard but cannot understand why:
package main
import (
"fmt"
"runtime"
)
const CONCURRENCY = 32
const WORK_PER_WORKER = 100
const TOTAL_WORK = CONCURRENCY * WORK_PER_WORKER
func work() {
sum := 0
for i := 0; i < 10000000; i++ {
sum *= i
}
}
type WorkItem struct {
Done chan int
}
func main() {
runtime.GOMAXPROCS(CONCURRENCY)
var workQueue [CONCURRENCY]chan *WorkItem
// initialize workers
for i := 0; i < CONCURRENCY; i++ {
workQueue[i] = make(chan *WorkItem)
}
// start workers
for i := 0; i < CONCURRENCY; i++ {
go func(i int) {
anItem := <-workQueue[i]
work()
anItem.Done <- 1
}(i)
}
completed := make(chan bool, TOTAL_WORK)
for i := 0; i < TOTAL_WORK; i++ {
go func(i int) {
// send work to queues
workToDo := &WorkItem{Done: make(chan int)}
workQueue[i/WORK_PER_WORKER] <- workToDo // !! DEADLOCK
// wait until the work is done
<-workToDo.Done
completed <- true
}(i)
}
fmt.Println("Waiting")
for i := 0; i < TOTAL_WORK; i++ {
<-completed
}
}
Your code go func(i int) { anItem := <-workQueue[i]; ... } removes just one item from workQueue[i], but you are trying to stuff WORK_PER_WORKER items into it. Only CONCURRENCY items get processed; after that all reading goroutines have terminated, every further send blocks, and you have your deadlock.
Looping in the worker goroutines "solves" your deadlock: http://play.golang.org/p/j2pavqnBDv
It only "solves" it because these worker goroutines will never terminate. You could experiment with closing your channels to notify the worker goroutines that nothing more will be sent.
Your workers process only one task each and then exit. Thus only CONCURRENCY items proceed, and then workQueue[i/WORK_PER_WORKER] <- workToDo blocks indefinitely. As a result the completed chan never receives enough values, and main also blocks forever.
Your workers should do their work in a loop, like this:
for i := 0; i < CONCURRENCY; i++ {
go func(i int) {
for anItem := range workQueue[i] {
work()
anItem.Done <- 1
}
}(i)
}
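Combining both suggestions above (looping in the workers and closing the channels), a minimal sketch might look like the following. The names runWorkers, queues and done are made up for this illustration and do not appear in the question; the heavy work() call is left out so only the channel handling remains.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts one goroutine per queue, sends perWorker items to every
// queue, then closes the queues so the workers' range loops end.
// It returns the number of items that were processed.
func runWorkers(workers, perWorker int) int {
	queues := make([]chan int, workers)
	for i := range queues {
		queues[i] = make(chan int)
	}

	done := make(chan int) // one message per processed item
	var wg sync.WaitGroup
	for i := range queues {
		wg.Add(1)
		go func(q <-chan int) {
			defer wg.Done()
			for item := range q { // ends once q is closed and drained
				done <- item
			}
		}(queues[i])
	}

	// Producer: feed every queue, then close it to signal "no more work".
	go func() {
		for _, q := range queues {
			for n := 0; n < perWorker; n++ {
				q <- n
			}
			close(q)
		}
	}()

	// Close done once every worker has returned, so the loop below ends.
	go func() {
		wg.Wait()
		close(done)
	}()

	processed := 0
	for range done {
		processed++
	}
	return processed
}

func main() {
	fmt.Println(runWorkers(4, 100)) // 400
}
```

Because every queue is closed by the side that sends on it, no worker is left blocked on a receive when main returns.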
This is a follow up from my previous question.
I am trying to build a prototype for a webcrawler and I want to use a chan to block the execution until all the jobs are done, just as in
func main() {
go func() {
do_stuff()
stop <- true
}()
fmt.Println(<-stop)
}
There is a queue function that dispatches the jobs to the workers. When all jobs are finished, the function will also close the channel and send a signal.
type Job int
//simulating a worker that processes a html page and returns some more links
func worker(in chan Job, out chan Job, num int) {
for element := range in {
if element%2 == 0 {
out <- 100*element + 5
out <- 100*element + 3
out <- 100*element + 1
}
}
}
func queue(toWorkers chan<- Job, fromWorkers <-chan Job, init Job, stop chan bool) {
var list []Job
var currentJobs int
currentJobs = 0
list = append(list, init)
done := make(map[Job]bool)
for {
var send chan<- Job
var item Job
if len(list) > 0 {
send = toWorkers
item = list[0]
} else if currentJobs == 0 {
close(toWorkers)
// this messes up everything!
stop <- true
return
}
select {
case send <- item:
currentJobs += 1
// We sent an item, remove it
list = list[1:]
case thing := <-fromWorkers:
currentJobs -= 1
// Got a new thing
if !done[thing] {
list = append(list, thing)
done[thing] = true
}
}
}
}
func main() {
in := make(chan Job, 1)
out := make(chan Job, 1)
stop := make(chan bool)
// dispatches jobs to workers
go queue(in, out, 0, stop)
for i := 0; i < max_workers; i++ {
go worker(in, out, i)
}
duration := time.Second
time.Sleep(duration)
// this causes a deadlock
fmt.Println(<-stop)
}
If I understand correctly, the problem is with the stop channel: when the workers still have jobs, Go thinks that no one will ever send to that channel and declares a deadlock. The function queue will both close the toWorkers channel and send a signal to stop, but not while there are outstanding jobs.
What am I missing?
Use sync.WaitGroup to wait for all the goroutines to end.
http://golang.org/pkg/sync/#WaitGroup
http://blog.golang.org/pipelines
I made a small example here: http://play.golang.org/p/P30LdV0Gfe
package main
import (
"fmt"
"sync"
)
func main() {
var wg sync.WaitGroup
routinesNo := 10
wg.Add(routinesNo)
for i := 0; i < routinesNo; i++ {
go func(n int) {
fmt.Printf("%d ", n)
wg.Done()
}(i)
}
wg.Wait()
fmt.Println("\nThe end!")
}
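For the crawler in the question, the same WaitGroup idea could be used to count outstanding jobs instead of goroutines. The sketch below is only one way to do it, under the assumption that a job is "done" once its links have been registered; the names links, crawl, pending and pool are invented for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

type Job int

// links mimics the worker from the question: even jobs yield three new jobs.
func links(j Job) []Job {
	if j%2 != 0 {
		return nil
	}
	return []Job{100*j + 5, 100*j + 3, 100*j + 1}
}

// crawl processes every job reachable from start exactly once and returns
// how many distinct jobs were handled.
func crawl(start Job, workers int) int {
	jobs := make(chan Job)
	var pending sync.WaitGroup // jobs that are queued or in progress
	var mu sync.Mutex          // guards seen
	seen := map[Job]bool{start: true}

	var pool sync.WaitGroup // running workers
	for i := 0; i < workers; i++ {
		pool.Add(1)
		go func() {
			defer pool.Done()
			for j := range jobs {
				for _, next := range links(j) {
					mu.Lock()
					isNew := !seen[next]
					seen[next] = true
					mu.Unlock()
					if isNew {
						pending.Add(1) // register before the parent is marked done
						go func(n Job) { jobs <- n }(next)
					}
				}
				pending.Done()
			}
		}()
	}

	pending.Add(1)
	jobs <- start
	pending.Wait() // nothing is queued or in progress any more
	close(jobs)    // lets the workers' range loops end
	pool.Wait()
	return len(seen)
}

func main() {
	fmt.Println(crawl(0, 4)) // 4 (jobs 0, 5, 3 and 1)
}
```

New jobs are sent from their own short-lived goroutines so that a worker never blocks on sending to the channel it is also supposed to read from.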
func main() {
jobs := []*Job{job1, job2, job3}
numOfJobs := len(jobs)
resultsChan := make(chan *Result, numOfJobs)
jobChan := make(chan *Job, numOfJobs)
go consume(numOfJobs, jobChan, resultsChan)
for i := 0; i < numOfJobs; i++ {
jobChan <- jobs[i]
}
close(jobChan)
for i := 0; i < numOfJobs; i++ {
<-resultsChan
}
close(resultsChan)
}
func (b *Blockchain) consume(num int, jobChan chan *Job, resultsChan chan *Result) {
for i := 0; i < num; i++ {
go func() {
job := <-jobChan
resultsChan <- doJob(job)
}()
}
}
In the above example, jobs are pushed into jobChan; goroutines pull them off jobChan, execute them concurrently, and push the results into resultsChan. We then pull the results out of resultsChan.
Question 1:
In my code, the results are not serialized/linearized. Although jobs go in in the order job1, job2, job3, the results might come out as job3, job1, job2, depending on which one takes the longest.
I would still like to execute the jobs concurrently; however, I need to make sure that results come out of resultsChan in the same order in which the jobs went in.
Question 2:
I have approximately 300k jobs, which means the code will generate up to 300k goroutines. Is it efficient to have so many goroutines, or would I be better off grouping the jobs into slices of 100 or so and having each goroutine go through 100 jobs rather than 1?
Here's a way I've handled serialization (and also setting a limited number of workers). I set some worker objects with input and output fields and synchronization channels, then I go round-robin through them, picking up any work they've done and giving them a new job. Then I make one final pass through them to pick up any completed jobs that are left over. Note you might want the worker count to exceed your core count somewhat, so that you can keep all resources busy for a bit even when there's one unusually long job. Code is at http://play.golang.org/p/PM9y4ieMxw and below.
This is hairy (hairier than I remember it being before sitting down to write an example!)--would love to see what anyone else has, either just better implementations or a whole different way to accomplish your goal.
package main
import (
"fmt"
"math/rand"
"runtime"
"time"
)
type Worker struct {
in int
out int
inited bool
jobReady chan bool
done chan bool
}
func (w *Worker) work() {
time.Sleep(time.Duration(rand.Float32() * float32(time.Second)))
w.out = w.in + 1000
}
func (w *Worker) listen() {
for <-w.jobReady {
w.work()
w.done <- true
}
}
func doSerialJobs(in chan int, out chan int) {
concurrency := 23
workers := make([]Worker, concurrency)
i := 0
// feed in and get out items
for workItem := range in {
w := &workers[i%concurrency]
if w.inited {
<-w.done
out <- w.out
} else {
w.jobReady = make(chan bool)
w.done = make(chan bool)
w.inited = true
go w.listen()
}
w.in = workItem
w.jobReady <- true
i++
}
// get out any job results left over after we ran out of input
for n := 0; n < concurrency; n++ {
w := &workers[i%concurrency]
if w.inited {
<-w.done
out <- w.out
}
close(w.jobReady)
i++
}
close(out)
}
func main() {
runtime.GOMAXPROCS(10)
in, out := make(chan int), make(chan int)
allFinished := make(chan bool)
go doSerialJobs(in, out)
go func() {
for result := range out {
fmt.Println(result)
}
allFinished <- true
}()
for i := 0; i < 100; i++ {
in <- i
}
close(in)
<-allFinished
}
Note that only in and out in this example carry actual data--all the other channels are just for synchronization.
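When all jobs are known up front, as with the slice of jobs in the question, a simpler alternative to the round-robin scheme may be to hand out indexes and let each worker write its result into the matching slot. This is a different technique from the answer above, sketched here with plain ints; processInOrder and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// processInOrder applies f to every input using a fixed number of workers.
// Each worker writes to results[i] for the index it received, so the output
// order matches the input order regardless of which job finishes first.
func processInOrder(inputs []int, workers int, f func(int) int) []int {
	results := make([]int, len(inputs))
	indexes := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range indexes {
				results[i] = f(inputs[i]) // distinct index per job, so no race
			}
		}()
	}

	for i := range inputs {
		indexes <- i
	}
	close(indexes) // no more work: the workers' range loops end
	wg.Wait()
	return results
}

func main() {
	out := processInOrder([]int{1, 2, 3, 4, 5}, 3, func(n int) int { return n + 1000 })
	fmt.Println(out) // [1001 1002 1003 1004 1005]
}
```

The worker count also bounds the number of goroutines, which touches on the second question: 300k jobs do not need 300k goroutines.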
I have concurrent goroutines which want to append a (pointer to a) struct to the same slice.
How do you write that in Go to make it concurrency-safe?
This would be my concurrency-unsafe code, using a wait group:
var wg sync.WaitGroup
MySlice = make([]*MyStruct, 0)
for _, param := range params {
wg.Add(1)
go func(param string) {
defer wg.Done()
OneOfMyStructs := getMyStruct(param)
MySlice = append(MySlice, &OneOfMyStructs)
}(param)
}
wg.Wait()
I guess you would need to use Go channels for concurrency-safety. Can anyone contribute an example?
There is nothing wrong with guarding the MySlice = append(MySlice, &OneOfMyStructs) with a sync.Mutex. But of course you can also have a result channel with buffer size len(params): all goroutines send their answers to it, and once the work is finished you collect from this result channel.
If your params has a fixed size:
MySlice = make([]*MyStruct, len(params))
for i, param := range params {
wg.Add(1)
go func(i int, param string) {
defer wg.Done()
OneOfMyStructs := getMyStruct(param)
MySlice[i] = &OneOfMyStructs
}(i, param)
}
As every goroutine writes to a different slice element, this isn't racy.
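The mutex-guarded append mentioned at the start of this answer could look roughly like this. MyStruct and getMyStruct are stand-ins for the question's types, and collect is a name chosen only for this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// MyStruct and getMyStruct are hypothetical stand-ins for the question's code.
type MyStruct struct{ Name string }

func getMyStruct(param string) MyStruct { return MyStruct{Name: param} }

// collect builds one *MyStruct per param concurrently and appends each one
// to a shared slice while holding a mutex.
func collect(params []string) []*MyStruct {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		result []*MyStruct
	)
	for _, param := range params {
		wg.Add(1)
		go func(param string) {
			defer wg.Done()
			s := getMyStruct(param) // the slow part runs outside the lock
			mu.Lock()
			result = append(result, &s)
			mu.Unlock()
		}(param)
	}
	wg.Wait()
	return result
}

func main() {
	fmt.Println(len(collect([]string{"a", "b", "c"}))) // 3
}
```

The order of the elements depends on scheduling; if the order matters, the index-based variant above is the better fit.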
The answer posted by @jimt is not quite right: it misses the last value sent on the channel, and the last defer wg.Done() is never called. The snippet below has the corrections.
https://play.golang.org/p/7N4sxD-Bai
package main
import "fmt"
import "sync"
type T int
func main() {
var slice []T
var wg sync.WaitGroup
queue := make(chan T, 1)
// Create our data and send it into the queue.
wg.Add(100)
for i := 0; i < 100; i++ {
go func(i int) {
// defer wg.Done() <- will result in the last int to be missed in the receiving channel
queue <- T(i)
}(i)
}
go func() {
// defer wg.Done() <- Never gets called since the 100 `Done()` calls are made above, resulting in the `Wait()` to continue on before this is executed
for t := range queue {
slice = append(slice, t)
wg.Done() // ** move the `Done()` call here
}
}()
wg.Wait()
// now prints off all 100 int values
fmt.Println(slice)
}
I wanted to add that since you know how many values you are expecting from the channel, you may not need to make use of any synchronization primitives. Just read from the channel as much data as you are expecting and leave it alone:
Borrowing from @chris' answer:
package main
import "fmt"
type T int
func main() {
var slice []T
queue := make(chan T)
// Create our data and send it into the queue.
for i := 0; i < 100; i++ {
go func(i int) {
queue <- T(i)
}(i)
}
for i := 0; i < 100; i++ {
select {
case t := <-queue:
slice = append(slice, t)
}
}
// now prints off all 100 int values
fmt.Println(slice)
}
The select blocks until the channel receives some data, so we can rely on this behaviour to read from the channel exactly 100 times before exiting.
In your case, you can just do:
package main
func main() {
MySlice = []*MyStruct{}
queue := make(chan *MyStruct)
for _, param := range params {
go func(param string) {
OneOfMyStructs := getMyStruct(param)
queue <- &OneOfMyStructs
}(param)
}
for range params {
select {
case OneOfMyStructs := <-queue:
MySlice = append(MySlice, OneOfMyStructs)
}
}
}
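A select with a single case behaves like a plain receive, so the collecting loop can arguably be written without it. A minimal sketch, with gather as a hypothetical helper name:

```go
package main

import "fmt"

// gather starts n goroutines that each send one value and collects exactly n
// values with a plain receive, which blocks just like a single-case select.
func gather(n int) []int {
	queue := make(chan int)
	for i := 0; i < n; i++ {
		go func(i int) { queue <- i }(i)
	}
	values := make([]int, 0, n)
	for i := 0; i < n; i++ {
		values = append(values, <-queue)
	}
	return values
}

func main() {
	fmt.Println(len(gather(100))) // 100
}
```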
I have a test program that gives different results when executing more than one goroutine on more than one Cpu (Goroutines = Cpus). The "test" is about syncing goroutines using channels, and the program itself counts occurrences of chars in strings. It produces consistent results on one Cpu / one goroutine.
See code example on playground (Note: Run on local machine to execute on multi core, and watch the resulting numbers vary): http://play.golang.org/p/PT5jeCKgBv .
Code summary: The program counts occurrences of 4 different chars (A,T, G,C) in (DNA) strings.
Problem: The result (n occurrences of chars) varies when executed on multiple Cpus (goroutines). Why?
Description:
A goroutine spawns work (SpawnWork) as strings to Workers. It sets up artificial string input data (hardcoded strings are copied n times).
Goroutine Workers (Worker) are created, equal in number to the Cpus.
Each Worker checks each char in a string, counts A's and T's and sends the sum into one channel, and sends the G,C count to another channel.
SpawnWork closes the work-string channel to control the Workers (which consume strings using range, which quits when the input channel is closed by SpawnWork).
When a Worker has consumed its range (of chars) it sends a quit signal on the quit channel (quit <- true). These "pulses" will occur Cpu number of times (Cpu count = goroutine count).
The main (select) loop quits when it has received Cpu-count quit signals.
The main func prints a summary of occurrences of chars (A,T's, G,C's).
Simplified code:
1. "Worker" (goroutines) counting chars in lines:
func Worker(inCh chan *[]byte, resA chan<- *int, resB chan<- *int, quit chan bool) {
//for p_ch := range inCh {
for {
p_ch, ok := <-inCh // similar to range
if ok {
ch := *p_ch
for i := 0; i < len(ch); i++ {
if ch[i] == 'A' || ch[i] == 'T' { // Count A:s and T:s
at++
} else if ch[i] == 'G' || ch[i] == 'C' { // Count G:s and C:s
gc++
}
}
resA <- &at // Send line results on separate channels
resB <- &gc // Send line results on separate channels
} else {
quit <- true // Indicate that we're all done
break
}
}
}
2. Spawn work (strings) to workers:
func SpawnWork(inStr chan<- *[]byte, quit chan bool) {
// Artificial input data
StringData :=
"NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n" +
"NTGAGAAATATGCTTTCTACTTTTTTGTTTAATTTGAACTTGAAAACAAAACACACACAA\n" +
"... etc\n" +
// ...
for scanner.Scan() {
s := scanner.Bytes()
if len(s) == 0 || s[0] == '>' {
continue
} else {
i++
inStr <- &s
}
}
close(inStr) // Indicate (to Workers) that there's no more strings coming.
}
3. Main routine:
func main() {
// Count Cpus, and count down in final select clause
CpuCnt := runtime.NumCPU()
runtime.GOMAXPROCS(CpuCnt)
// Make channels
resChA := make(chan *int)
resChB := make(chan *int)
quit := make(chan bool)
inStr := make(chan *[]byte)
// Set up Workers ( n = Cpu )
for i := 0; i < CpuCnt; i++ {
go Worker(inStr, resChA, resChB, quit)
}
// Send lines to Workers
go SpawnWork(inStr, quit)
// Count the number of "A","T" & "G","C" per line
// (comes in here as ints per row, on separate channels (at and gc))
for {
select {
case tmp_at := <-resChA:
tmp_gc := <-resChB // Ch A and B go in pairs anyway
A += *tmp_at // sum of A's and T's
B += *tmp_gc // sum of G's and C's
case <-quit:
// Each goroutine sends "quit" signals when it's done. Since
// the number of goroutines equals the Cpu counter, we count
// down each time a goroutine tells us it's done (quit at 0):
CpuCnt--
if CpuCnt == 0 { // When all goroutines are done then we're done.
goto out
}
}
}
out:
// Print report to screen
}
Why does this code count consistently only when executed on a single cpu/goroutine? That is, the channels don't seem to sync, or the main loop quits forcefully before all goroutines are done? Scratching head.
(Again: See/run the full code at the playground: http://play.golang.org/p/PT5jeCKgBv )
// Rolf Lampa
Here is a working version which consistently produces the same results no matter how many cpus are used.
Here is what I did
remove passing of *int - very racy to pass in a channel!
remove passing of *[]byte - pointless as slices are reference types anyway
copy the slice before putting it in the channel - the slice points to the same memory causing a race
fix initialisation of at and gc in Worker - they were in the wrong place - this was the major cause of the difference in results
use sync.WaitGroup for synchronisation and channel close()
I used the -race parameter of go build to find and fix the data races.
package main
import (
"bufio"
"fmt"
"runtime"
"strings"
"sync"
)
func Worker(inCh chan []byte, resA chan<- int, resB chan<- int, wg *sync.WaitGroup) {
defer wg.Done()
fmt.Println("Worker started...")
for ch := range inCh {
at := 0
gc := 0
for i := 0; i < len(ch); i++ {
if ch[i] == 'A' || ch[i] == 'T' {
at++
} else if ch[i] == 'G' || ch[i] == 'C' {
gc++
}
}
resA <- at
resB <- gc
}
}
func SpawnWork(inStr chan<- []byte) {
fmt.Println("Spawning work:")
// An artificial input source.
StringData :=
"NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n" +
"NTGAGAAATATGCTTTCTACTTTTTTGTTTAATTTGAACTTGAAAACAAAACACACACAA\n" +
"CTTCCCAATTGGATTAGACTATTAACATTTCAGAAAGGATGTAAGAAAGGACTAGAGAGA\n" +
"TATACTTAATGTTTTTAGTTTTTTAAACTTTACAAACTTAATACTGTCATTCTGTTGTTC\n" +
"AGTTAACATCCCTGAATCCTAAATTTCTTCAGATTCTAAAACAAAAAGTTCCAGATGATT\n" +
"TTATATTACACTATTTACTTAATGGTACTTAAATCCTCATTNNNNNNNNCAGTACGGTTG\n" +
"TTAAATANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN\n" +
"NNNNNNNCTTCAGAAATAAGTATACTGCAATCTGATTCCGGGAAATATTTAGGTTCATAA\n"
// Expand data n times
tmp := StringData
for n := 0; n < 1000; n++ {
StringData = StringData + tmp
}
scanner := bufio.NewScanner(strings.NewReader(StringData))
scanner.Split(bufio.ScanLines)
var i int
for scanner.Scan() {
s := scanner.Bytes()
if len(s) == 0 || s[0] == '>' {
continue
} else {
i++
s_copy := append([]byte(nil), s...)
inStr <- s_copy
}
}
close(inStr)
}
func main() {
CpuCnt := runtime.NumCPU() // Count down in select clause
CpuOut := CpuCnt // Save for print report
runtime.GOMAXPROCS(CpuCnt)
fmt.Printf("Processors: %d\n", CpuCnt)
resChA := make(chan int)
resChB := make(chan int)
inStr := make(chan []byte)
fmt.Println("Spawning workers:")
var wg sync.WaitGroup
for i := 0; i < CpuCnt; i++ {
wg.Add(1)
go Worker(inStr, resChA, resChB, &wg)
}
fmt.Println("Spawning work:")
go func() {
SpawnWork(inStr)
wg.Wait()
close(resChA)
close(resChB)
}()
A := 0
B := 0
LineCnt := 0
for tmp_at := range resChA {
tmp_gc := <-resChB // These go together anyway
A += tmp_at
B += tmp_gc
LineCnt++
}
if !(A+B > 0) {
fmt.Println("No A/B was found!")
} else {
ABFraction := float32(B) / float32(A+B)
fmt.Println("\n----------------------------")
fmt.Printf("Cpu's : %d\n", CpuOut)
fmt.Printf("Lines : %d\n", LineCnt)
fmt.Printf("A+B : %d\n", A+B)
fmt.Printf("A : %d\n", A)
fmt.Printf("B : %d\n", B)
fmt.Printf("AB frac: %v\n", ABFraction*100)
fmt.Println("----------------------------")
}
}
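The third fix in the list above (copying the slice before sending it) matters because scanner.Bytes() returns a slice into the scanner's internal buffer, which a later call to Scan may overwrite. The stripped-down sketch below isolates just that step; readLines is a name invented here, and running it with go run -race is one way to check for remaining races.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readLines sends a private copy of every line over a channel and returns
// what the receiving side collected.
func readLines(data string) []string {
	lines := make(chan []byte)
	go func() {
		defer close(lines) // the sender closes once there is no more input
		scanner := bufio.NewScanner(strings.NewReader(data))
		for scanner.Scan() {
			// Copy: the slice from Bytes() may be overwritten by the next Scan.
			line := append([]byte(nil), scanner.Bytes()...)
			lines <- line
		}
	}()

	var out []string
	for line := range lines {
		out = append(out, string(line))
	}
	return out
}

func main() {
	fmt.Println(readLines("AT\nGC\n")) // [AT GC]
}
```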
What I would like to do is have a set of producer goroutines (of which some may or may not complete) and a consumer routine. The issue is with that caveat in parentheses - we don't know the total number that will return an answer.
So what I want to do is this:
package main
import (
"fmt"
"math/rand"
)
func producer(c chan int) {
// May or may not produce.
success := rand.Float32() > 0.5
if success {
c <- rand.Int()
}
}
func main() {
c := make(chan int, 10)
for i := 0; i < 10; i++ {
go producer(c)
}
// If we include a close, then that's WRONG. Chan will be closed
// but a producer will try to write to it. Runtime error.
close(c)
// If we don't close, then that's WRONG. All goroutines will
// deadlock, since the range keyword will look for a close.
for num := range c {
fmt.Printf("Producer produced: %d\n", num)
}
fmt.Println("All done.")
}
So the issue is, if I close it's wrong, if I don't close - it's still wrong (see comments in code).
Now, the solution would be an out-of-band signal channel, that ALL producers write to:
package main
import (
"fmt"
"math/rand"
)
func producer(c chan int, signal chan bool) {
success := rand.Float32() > 0.5
if success {
c <- rand.Int()
}
signal <- true
}
func main() {
c := make(chan int, 10)
signal := make(chan bool, 10)
for i := 0; i < 10; i++ {
go producer(c, signal)
}
// This is basically a 'join'.
num_done := 0
for num_done < 10 {
<- signal
num_done++
}
close(c)
for num := range c {
fmt.Printf("Producer produced: %d\n", num)
}
fmt.Println("All done.")
}
And that totally does what I want! But to me it seems like a mouthful. My question is: Is there any idiom/trick that lets me do something similar in an easier way?
I had a look here: http://golang.org/doc/codewalk/sharemem/
And it seems like the complete chan (initialised at the start of main) is used in a range but never closed. I do not understand how.
If anyone has any insights, I would greatly appreciate it. Cheers!
Edit: fls0815 has the answer, and has also answered the question of how the close-less channel range works.
My code above, modified to work (done before fls0815 kindly supplied code):
package main
import (
"fmt"
"math/rand"
"sync"
)
var wg_prod sync.WaitGroup
var wg_cons sync.WaitGroup
func producer(c chan int) {
success := rand.Float32() > 0.5
if success {
c <- rand.Int()
}
wg_prod.Done()
}
func main() {
c := make(chan int, 10)
wg_prod.Add(10)
for i := 0; i < 10; i++ {
go producer(c)
}
wg_cons.Add(1)
go func() {
for num := range c {
fmt.Printf("Producer produced: %d\n", num)
}
wg_cons.Done()
} ()
wg_prod.Wait()
close(c)
wg_cons.Wait()
fmt.Println("All done.")
}
Only producers should close channels. You can achieve your goal by starting one or more consumers that iterate (range) over the resulting channel once your producers have been started. In your main thread you wait (see sync.WaitGroup) until the producers have finished their work, then close the resulting channel, which causes your consumers to exit (range ends when the channel is closed and no buffered item is left). Finally you wait for the consumers to finish.
Example code:
package main
import (
"log"
"sync"
"time"
"math/rand"
"runtime"
)
func consumer() {
defer consumer_wg.Done()
for item := range resultingChannel {
log.Println("Consumed:", item)
}
}
func producer() {
defer producer_wg.Done()
success := rand.Float32() > 0.5
if success {
resultingChannel <- rand.Int()
}
}
var resultingChannel = make(chan int)
var producer_wg sync.WaitGroup
var consumer_wg sync.WaitGroup
func main() {
rand.Seed(time.Now().Unix())
for c := 0; c < runtime.NumCPU(); c++ {
producer_wg.Add(1)
go producer()
}
for c := 0; c < runtime.NumCPU(); c++ {
consumer_wg.Add(1)
go consumer()
}
producer_wg.Wait()
close(resultingChannel)
consumer_wg.Wait()
}
The reason I put the close statement into the main function is that we have more than one producer. Closing the channel in one producer in the example above would lead to the problem you already ran into (writing to a closed channel, because there could be one producer left that still produces data). Channels should only be closed when there is no producer left (hence my suggestion that only the producer side closes the channel). This is how channels are designed in Go. Here you'll find some more information on closing channels.
Regarding the sharemem example: AFAICS this example runs endlessly by re-queuing the Resources again and again (from pending -> complete -> pending -> complete... and so on). This is what the iteration at the end of the main func does: it receives the completed Resources and re-queues them to pending using Resource.Sleep(). When there is no completed Resource, it blocks and waits for new Resources to be completed. Therefore there is no need to close the channels, because they are in use all the time.
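The "wait for all producers, then close" step can also be moved into a small goroutine of its own, so that the caller only has to range over the channel. This is a sketch of that common arrangement, not code from the answer; produceAll is a hypothetical name, and the random success check is replaced by "even numbers succeed" to keep the result predictable.

```go
package main

import (
	"fmt"
	"sync"
)

// produceAll starts n producers. Producer i sends i only when i is even,
// standing in for "may or may not produce". The returned channel is closed
// once every producer has returned.
func produceAll(n int) <-chan int {
	c := make(chan int)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if i%2 == 0 {
				c <- i
			}
		}(i)
	}
	// Closing happens on the producer side, after the last producer is done.
	go func() {
		wg.Wait()
		close(c)
	}()
	return c
}

func main() {
	count := 0
	for num := range produceAll(10) {
		fmt.Printf("Producer produced: %d\n", num)
		count++
	}
	fmt.Println("All done:", count) // All done: 5
}
```

With this shape the channel can stay unbuffered, and no second WaitGroup is needed because main itself is the consumer.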
There are always lots of ways to solve these problems. Here's a solution using the simple synchronous channels that are fundamental in Go. No buffered channels, no closing channels, no WaitGroups.
It's really not that far from your "mouthful" solution, and--sorry to disappoint--not that much smaller. It does put the consumer in its own goroutine, so that the consumer can consume numbers as the producer produces them. It also makes the distinction that a production "try" can end in either success or failure. If production fails, the try is done immediately. If it succeeds, the try is not done until the number is consumed.
package main
import (
"fmt"
"math/rand"
)
func producer(c chan int, fail chan bool) {
if success := rand.Float32() > 0.5; success {
c <- rand.Int()
} else {
fail <- true
}
}
func consumer(c chan int, success chan bool) {
for {
num := <-c
fmt.Printf("Producer produced: %d\n", num)
success <- true
}
}
func main() {
const nTries = 10
c := make(chan int)
done := make(chan bool)
for i := 0; i < nTries; i++ {
go producer(c, done)
}
go consumer(c, done)
for i := 0; i < nTries; i++ {
<-done
}
fmt.Println("All done.")
}
I'm adding this because the extant answers don't make a couple things clear. First, the range loop in the codewalk example is just an infinite event loop, there to keep re-checking and updating the same url list forever.
Next, a channel, all by itself, already is the idiomatic consumer-producer queue in Go. The size of the async buffer backing the channel determines how much producers can produce before getting backpressure. Set N = 0 below to see lock-step producer consumer without anyone racing ahead or getting behind. As it is, N = 10 will let the producer produce up to 10 products before blocking.
Last, there are some nice idioms for writing communicating sequential processes in Go (e.g. functions that start goroutines for you, and using the for/select pattern to communicate and accept control commands). I think of WaitGroups as clumsy, and would like to see idiomatic examples instead.
package main
import (
"fmt"
"time"
)
type control int
const (
sleep control = iota
die // receiver will close the control chan in response to die, to ack.
)
func (cmd control) String() string {
switch cmd {
case sleep: return "sleep"
case die: return "die"
}
return fmt.Sprintf("%d",cmd)
}
func ProduceTo(writechan chan<- int, ctrl chan control, done chan bool) {
var product int
go func() {
for {
select {
case writechan <- product:
fmt.Printf("Producer produced %v\n", product)
product++
case cmd:= <- ctrl:
fmt.Printf("Producer got control cmd: %v\n", cmd)
switch cmd {
case sleep:
fmt.Printf("Producer sleeping 2 sec.\n")
time.Sleep(2000 * time.Millisecond)
case die:
fmt.Printf("Producer dies.\n")
close(done)
return
}
}
}
}()
}
func ConsumeFrom(readchan <-chan int, ctrl chan control, done chan bool) {
go func() {
var product int
for {
select {
case product = <-readchan:
fmt.Printf("Consumer consumed %v\n", product)
case cmd:= <- ctrl:
fmt.Printf("Consumer got control cmd: %v\n", cmd)
switch cmd {
case sleep:
fmt.Printf("Consumer sleeping 2 sec.\n")
time.Sleep(2000 * time.Millisecond)
case die:
fmt.Printf("Consumer dies.\n")
close(done)
return
}
}
}
}()
}
func main() {
N := 10
q := make(chan int, N)
prodCtrl := make(chan control)
consCtrl := make(chan control)
prodDone := make(chan bool)
consDone := make(chan bool)
ProduceTo(q, prodCtrl, prodDone)
ConsumeFrom(q, consCtrl, consDone)
// wait for a moment, to let them produce and consume
timer := time.NewTimer(10 * time.Millisecond)
<-timer.C
// tell producer to pause
fmt.Printf("telling producer to pause\n")
prodCtrl <- sleep
// wait for a second
timer = time.NewTimer(1 * time.Second)
<-timer.C
// tell consumer to pause
fmt.Printf("telling consumer to pause\n")
consCtrl <- sleep
// tell them both to finish
prodCtrl <- die
consCtrl <- die
// wait for that to actually happen
<-prodDone
<-consDone
}
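The backpressure point made above (the buffer size decides how far a producer can run ahead) can be checked in isolation. The sketch below counts how many sends succeed when nobody is receiving; sendsBeforeBlocking is a name invented for this illustration.

```go
package main

import "fmt"

// sendsBeforeBlocking reports how many values fit into a channel of capacity
// n when nobody is receiving. The select with a default case turns "would
// block" into a return instead of an actual block.
func sendsBeforeBlocking(n int) int {
	q := make(chan int, n)
	sent := 0
	for {
		select {
		case q <- sent:
			sent++
		default:
			return sent // the next send would block: backpressure
		}
	}
}

func main() {
	fmt.Println(sendsBeforeBlocking(10)) // 10
	fmt.Println(sendsBeforeBlocking(0))  // 0
}
```

With a capacity of 0 not even the first send succeeds without a receiver, which is the lock-step behaviour described for N = 0.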
You can use simple unbuffered channels without wait groups if you use the generator pattern with a fanIn function.
In the generator pattern, each producer returns a channel and is responsible for closing it. A fanIn function then iterates over these channels and forwards the values returned on them down a single channel that it returns.
The problem of course, is that the fanIn function forwards the zero value of the channel type (int) when each channel is closed.
You can work around it by using the zero value of your channel type as a sentinel value and only using the results from the fanIn channel if they are not the zero value.
Here's an example:
package main
import (
"fmt"
"math/rand"
)
const offset = 1
func producer() chan int {
cout := make(chan int)
go func() {
defer close(cout)
// May or may not produce.
success := rand.Float32() > 0.5
if success {
cout <- rand.Int() + offset
}
}()
return cout
}
func fanIn(cin []chan int) chan int {
cout := make(chan int)
go func() {
defer close(cout)
for _, c := range cin {
cout <- <-c
}
}()
return cout
}
func main() {
chans := make([]chan int, 0)
for i := 0; i < 10; i++ {
chans = append(chans, producer())
}
for num := range fanIn(chans) {
if num >= offset {
fmt.Printf("Producer produced: %d\n", num)
}
}
fmt.Println("All done.")
}
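An alternative to the sentinel value is the two-value receive: its second result is false when the channel is closed and empty, so the zero value never has to be forwarded. The sketch below is a variation on the code above, not the original answer; the random success check is replaced by an explicit ok flag so the output is predictable.

```go
package main

import "fmt"

// producer returns a channel that yields v if ok is true and is then closed.
func producer(v int, ok bool) <-chan int {
	cout := make(chan int)
	go func() {
		defer close(cout)
		if ok {
			cout <- v
		}
	}()
	return cout
}

// fanIn forwards only real values. The second result of the receive is false
// when the channel was closed without a value, so no sentinel is needed.
func fanIn(cin []<-chan int) <-chan int {
	cout := make(chan int)
	go func() {
		defer close(cout)
		for _, c := range cin {
			if v, open := <-c; open {
				cout <- v
			}
		}
	}()
	return cout
}

func main() {
	chans := []<-chan int{producer(0, true), producer(7, false), producer(9, true)}
	for num := range fanIn(chans) {
		fmt.Println("Producer produced:", num) // prints 0, then 9
	}
	fmt.Println("All done.")
}
```

Note that a produced 0 is delivered correctly here, which a zero-value sentinel could not distinguish from a closed channel.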
Producer-consumer is such a common pattern that I wrote a library, prosumer, to make it convenient to handle chan communication carefully. E.g.:
func main() {
maxLoop := 10
var wg sync.WaitGroup
wg.Add(maxLoop)
defer wg.Wait()
consumer := func(ls []interface{}) error {
fmt.Printf("get %+v \n", ls)
wg.Add(-len(ls))
return nil
}
conf := prosumer.DefaultConfig(prosumer.Consumer(consumer))
c := prosumer.NewCoordinator(conf)
c.Start()
defer c.Close(true)
for i := 0; i < maxLoop; i++ {
fmt.Printf("try put %v\n", i)
discarded, err := c.Put(i)
if err != nil {
fmt.Printf("discarded elements %+v for err %v\n", discarded, err)
wg.Add(-len(discarded))
}
time.Sleep(time.Second)
}
}
Close has a parameter called graceful, which indicates whether to drain the underlying chan.