golang string channel send/receive inconsistency

golang string channel send/receive inconsistency - unit-testing

New to go. I'm using 1.5.1. I'm trying to accumulate a word list based on an incoming channel. However, my input channel (wdCh) is sometimes getting the empty string ("") during testing. I'm perplexed. I'd rather not have a test for the empty string before I add its accumulated count in my map. Feels like a hack to me.
package accumulator
import (
"fmt"
"github.com/stretchr/testify/assert"
"testing"
)
var words map[string]int
func Accumulate(wdCh chan string, closeCh chan bool) {
words = make(map[string]int)
for {
select {
case word := <-wdCh:
fmt.Printf("word = %s\n", word)
words[word]++
case <-closeCh:
return
}
}
}
func pushWords(w []string, wdCh chan string) {
for _, value := range w {
fmt.Printf("sending word = %s\n", value)
wdCh <- value
}
close(wdCh)
}
func TestAccumulate(t *testing.T) {
sendWords := []string{"one", "two", "three", "two"}
wMap := make(map[string]int)
wMap["one"] = 1
wMap["two"] = 2
wMap["three"] = 1
wdCh := make(chan string)
closeCh := make(chan bool)
go Accumulate(wdCh, closeCh)
pushWords(sendWords, wdCh)
closeCh <- true
close(closeCh)
assert.Equal(t, wMap, words)
}

Check out this article about channel-axioms. Looks like there's a race between closing wdCh and sending true on the closeCh channel.
So the outcome depends on what gets scheduled first between pushWords returning and Accumulate.
If TestAccumulate runs first, sending true on closeCh, then when Accumulate runs it picks either of the two channels since they can both be run because pushWords closed wdCh.
A receive from a closed channel returns the zero value immediately.
Until closedCh is signaled, Accumulate will randomly put one or more empty "" words in the map.
If Accumulate runs first then it's likely to put many empty strings in the word map as it loops until TestAccumulate runs and finally it sends a signal on closeCh.
An easy fix would be to move
close(wdCh)
after sending true on the closeCh. That way wdCh can't return the zero value until after you've signaled on the closeCh. Additionally, closeCh <- true blocks because closeCh doesn't have a buffer size, so wdCh won't get closed until after you've guaranteed that Accumulate has finished looping forever.

I think the reason is when you close the channle, "select" will although receive the signal.
So when you close "wdCh" in "func pushWords", the loop in Accumulate will receive signal from "<-wdCh".
May be you should add some code to test the action after channel is closed!
for {
select {
case word, ok := <-wdCh:
if !ok {
fmt.Println("channel wdCh is closed!")
continue
}
fmt.Printf("word = %s\n", word)
words[word]++
case <-closeCh:
return
}
}

Related

Multiple concurrency in golang

I'm trying to port a simple synchronous bit of PHP to Go, but am having a hard time getting my head around how concurrency works with regards to channels. The PHP script makes a request to get a list of media library sections, then makes requests to get the items within each of these sections. If the section is a list of TV Shows, it then makes a request for each show to get all the seasons and then another to get the episodes within each season.
I've trying writing in pidgeon-go what I expected to work, but I'm not having any luck. I've tried various channel guides online, but normally end up with deadlock warnings. Currently this example warns about item := <-ch used as value and doesn't look like it's waiting on the goroutines to return. Does anyone have any ideas what I can do?
package main
import (
"fmt"
"time"
)
// Get all items for all sections
func main() {
ch := make(chan string)
sections := getSections()
for _, section := range sections {
go getItemsInSection(section, ch)
}
items := make([]string, 0)
for item := <- ch {
items = append(items, item)
}
fmt.Println(items)
}
// Return a list of the various library sections
func getSections() []string {
return []string{"HD Movies", "Movies", "TV Shows"}
}
// Get items within the given section, note that some items may spawn sub-items
func getItemsInSection(name string, ch chan string) {
time.Sleep(1 * time.Second)
switch name {
case "HD Movies":
ch <- "Avatar"
ch <- "Avengers"
case "Movies":
ch <- "Aliens"
ch <- "Abyss"
case "TV Shows":
go getSubItemsForItem("24", ch)
go getSubItemsForItem("Breaking Bad", ch)
}
}
// Get sub-items for a given parent
func getSubItemsForItem(name string, ch chan string) {
time.Sleep(1 * time.Second)
ch <- name + ": S01E01"
ch <- name + ": S01E02"
}

First, that code doesn't compile because for item := <- ch should be for item := range ch
Now the problem is you either have to close the channel or run your loop forever inside a goroutine.
go func() {
for {
item, ok := <-ch
if !ok {
break
}
fmt.Println(item)
items = append(items, item)
}
}()
time.Sleep(time.Second)
fmt.Println(items)
playground

Idiomatic variable-size worker pool in Go

I'm trying to implement a pool of workers in Go. The go-wiki (and Effective Go in the Channels section) feature excellent examples of bounding resource use. Simply make a channel with a buffer that's as large as the worker pool. Then fill that channel with workers, and send them back into the channel when they're done. Receiving from the channel blocks until a worker is available. So the channel and a loop is the entire implementation -- very cool!
Alternatively one could block on sending into the channel, but same idea.
My question is about changing the size of the worker pool while it's running. I don't believe there's a way to change the size of a channel. I have some ideas, but most of them seem way too complicated. This page actually implements a semaphore using a channel and empty structs in much the same way, but it has the same problem (these things come up all the time while Googling for "golang semaphore".

I would do it the other way round. Instead of spawning many goroutines (which still require a considerable amount of memory) and use a channel to block them, I would model the workers as goroutines and use a channel to distribute the work. Something like this:
package main
import (
"fmt"
"sync"
)
type Task string
func worker(tasks <-chan Task, quit <-chan bool, wg *sync.WaitGroup) {
defer wg.Done()
for {
select {
case task, ok := <-tasks:
if !ok {
return
}
fmt.Println("processing task", task)
case <-quit:
return
}
}
}
func main() {
tasks := make(chan Task, 128)
quit := make(chan bool)
var wg sync.WaitGroup
// spawn 5 workers
for i := 0; i < 5; i++ {
wg.Add(1)
go worker(tasks, quit, &wg)
}
// distribute some tasks
tasks <- Task("foo")
tasks <- Task("bar")
// remove two workers
quit <- true
quit <- true
// add three more workers
for i := 0; i < 3; i++ {
wg.Add(1)
go worker(tasks, quit, &wg)
}
// distribute more tasks
for i := 0; i < 20; i++ {
tasks <- Task(fmt.Sprintf("additional_%d", i+1))
}
// end of tasks. the workers should quit afterwards
close(tasks)
// use "close(quit)", if you do not want to wait for the remaining tasks
// wait for all workers to shut down properly
wg.Wait()
}
It might be a good idea to create a separate WorkerPool type with some convenient methods. Also, instead of type Task string it is quite common to use a struct that also contains a done channel that is used to signal that the task had been executed successfully.
Edit: I've played around a bit more and came up with the following: http://play.golang.org/p/VlEirPRk8V. It's basically the same example, with a nicer API.

A simple change that can think is to have a channel that controls how big is the semaphore.
The relevant part is the select statements. If there is more work from the queue process it with the current semaphore. If there is a request to change the size of the semaphore change it and continue processing the req queue with the new semaphore. Note that the old one is going to be garbage collected.
package main
import "time"
import "fmt"
type Request struct{ num int }
var quit chan struct{} = make(chan struct{})
func Serve(queue chan *Request, resize chan int, semsize int) {
for {
sem := make(chan struct{}, semsize)
var req *Request
select {
case semsize = <-resize:
{
sem = make(chan struct{}, semsize)
fmt.Println("changing semaphore size to ", semsize)
}
case req = <-queue:
{
sem <- struct{}{} // Block until there's capacity to process a request.
go handle(req, sem) // Don't wait for handle to finish.
}
case <-quit:
return
}
}
}
func process(r *Request) {
fmt.Println("Handled Request", r.num)
}
func handle(r *Request, sem chan struct{}) {
process(r) // May take a long time & use a lot of memory or CPU
<-sem // Done; enable next request to run.
}
func main() {
workq := make(chan *Request, 1)
ctrlq := make(chan int)
go func() {
for i := 0; i < 20; i += 1 {
<-time.After(100 * time.Millisecond)
workq <- &Request{i}
}
<-time.After(500 * time.Millisecond)
quit <- struct{}{}
}()
go func() {
<-time.After(500 * time.Millisecond)
ctrlq <- 10
}()
Serve(workq, ctrlq, 1)
}
http://play.golang.org/p/AHOLlAv2LH

Accessing a channel from a different goroutine

I currently have the following code:
package main
import (
"fmt"
"math/rand"
"time"
)
var channel = make(chan []float32, 1)
func main() {
aMap := initMap()
for key := range aMap {
fmt.Print("the iteration number is: ", key)
theSlice := aMap[key]
channel <- theSlice
go transition(channel)
time.Sleep(2 * time.Second)
}
}
func transition(channel chan []float32) {
newSlice := make([]float32,5)
newSlice = <- channel
fmt.Println(newSlice)
//need to retrieve the last element of the other slice and put it on this newSlice
}
func initMap() map[int][]float32 {
aMap := make(map[int][]float32)
for i := 0; i < 2; i++ {
aMap[i] = []float32{rand.Float32(), rand.Float32(), rand.Float32(), rand.Float32(), rand.Float32()}
}
return aMap
}
We have a map with two slices and what i'm trying to do is take the last element of slice1 and make it the last element of slice2 and vice versa.
I'm actually working on a bigger project that involves simulating the division and differentiation of cells. Since its a simulation, all the divisions and differentiations are going on at the same time. The issue is with differentiation, where a cell of typeA is transformed into a cell of typeB. Each type of cell is stored in a different data structure and i figured i could use one channel for each different data structure and use the channels to change the data concurrently.
Hope this makes sense. Any suggestions?
Thanks,
CJ
EDIT: it seems what i'm asking isn't clear. My question for the example above is, how would i go about inter-changing the last elements of each of the two slices concurrently?

Channels are two way, and are about communication, not sharing memory. So perhaps instead of sending the entire slice, just sending the last item in each slice one way down the channel and then sending the second value back the other way.
func main() {
...
// send only the last item
theSlice := aMap[key]
channel <- theSlice[len(theSlice)-1]
// waiting for the response
theSlice[len(theSlice)-1] = <- channel
}
func transition(channel chan []float32) {
newSlice := make([]float32,5)
channel <- newSlice[len(newSlice)-1]
newSlice[len(newSlice)-1] = <- channel
fmt.Println(newSlice)
}
The above code does assume that the channel declaration has been changed away from a slice to just a float32 :
var channel = make(chan float32, 1)

processing jobs from a neverending queue with a fixed number of workers

This is doing my head in, I cant figure out how to solve it;
I want to have a fixed number N of goroutines running in parallell
From a never-ending queue I will fetch X msg about jobs to process
I want to let the N goroutines process these X jobs, and as soon as one of the routines have nothing more to do, I want to fetch another X jobs from the neverending queue
The code in the answer below (see url) works brilliantly to process the tasks, but the workers will die once that tasks list is empty, I want them to stay alive and somehow notify the main code that they are out of work so I can fetch more jobs to fill the tasks list with tasks
How would you define a pool of goroutines to be executed at once in Golang?
Using user:Jsor example code from below, I try to create a simple program, but I am confused.
import (
"fmt"
"strconv"
)
//workChan - read only that delivers work
//requestChan - ??? what is this
func Worker(myid string, workChan <- chan string, requestChan chan<- struct{}) {
for {
select {
case work := <-workChan:
fmt.Println("Channel: " + myid + " do some work: " + work)
case requestChan <- struct{}{}:
//hm? how is the requestChan used?
}
}
}
func Logic(){
workChan := make(chan string)
requestChan := make(chan struct{})
//Create the workers
for i:=1; i < 5; i++ {
Worker( strconv.Itoa( i), workChan, requestChan)
}
//Give the workers some work
for i:=100; i < 115; i++ {
workChan<- "workid"+strconv.Itoa( i)
}
}

This is what the select statement is for.
func Worker(workChan chan<- Work, requestChan chan<- struct{}) {
for {
select {
case work := <-workChan:
// Do work
case requestChan <- struct{}{}:
}
}
}
This worker will run forever and ever. If work is available, it will pull it from the worker channel. If there's nothing left it will send a request.
Not that since it runs forever and ever, if you want to be able to kill a worker you need to do something else. One possibility is to always check ok with workChan and if that channel is closed quit the function. Another option is to use an individual quit channel for each worker.

Compared to the other solution you posted, you just need (first) not to close the channel, and just keep feeding items to it.
Then you need to answer the following question: is it absolutely necessary that (a) you fetch the next X items from your queue only once one of the workers has “nothing more to do” (or, what is the same, once the first X items are either fully processed, or assigned to a worker); or (b) is it okay if you keep the second set of X items in memory, and go feeding them to the workers as new work items are needed?
As I understand it, only (a) needs the requestChan you’re wondering about (see below). For (b), something as simple as the following would suffice:
# B version
type WorkItem int
const (
N = 5 // Number of workers
X = 15 // Number of work items to get from the infinite queue at once
)
func Worker(id int, workChan <-chan WorkItem) {
for {
item := <-workChan
doWork(item)
fmt.Printf("Worker %d processes item #%v\n", id, item)
}
}
func Dispatch(workChan chan<- WorkItem) {
for {
items := GetNextFromQueue(X)
for _, item := range items {
workChan <- item
fmt.Printf("Dispatched item #%v\n", item)
}
}
}
func main() {
workChan := make(chan WorkItem) // Shared amongst all workers; could make it buffered if GetNextFromQueue() is slow.
// Start N workers.
for i := 0; i < N; i++ {
go Worker(i, workChan)
}
// Dispatch items to the workers.
go Dispatch(workChan)
time.Sleep(20 * time.Second) // Ensure main(), and our program, finish.
}
(I’ve uploaded to the Playground a full working solution for (b).)
As for (a), the workers change to say: do work, or if there’s no more work, tell the dispatcher to get more via the reqChan communication channel. That “or” is implemented via select. Then, the dispatcher waits on reqChan before making another call to GetNextFromQueue(). It’s more code, but ensures the semantics that you might be interested in. (The previous version is overall simpler, though.)
# A version
func Worker(id int, workChan <-chan WorkItem, reqChan chan<- int) {
for {
select {
case item := <-workChan:
doWork(item)
fmt.Printf("Worker %d processes item #%v\n", id, item)
case reqChan <- id:
fmt.Printf("Worker %d thinks they requested more work\n", id)
}
}
}
func Dispatch(workChan chan<- WorkItem, reqChan <-chan int) {
for {
items := GetNextFromQueue(X)
for _, item := range items {
workChan <- item
fmt.Printf("Dispatched item #%v\n", item)
}
id := <-reqChan
fmt.Printf("Polling the queue in Dispatch() at the request of worker %d\n", id)
}
}
(I’ve also uploaded to the Playground a full working solution for (a).)

Func will not run; increment channel

I'm writing a function where I'm trying to increment a channel. In a much larger program, this is not working and it actually hangs on a line that looks like:
current = <-channel
The go funcs are running, but the program seems to halt on this line.
I tried to write a smaller SSCCE, but now I'm having a different problem. Here it is:
package main
import (
"fmt"
)
func main() {
count := make(chan int)
go func(count chan int) {
current := 0
for {
current = <-count
current++
count <- current
fmt.Println(count)
}
}(count)
}
However, in the above the go func does not actually seem to be called at all. If I put a fmt.Println statement before for {, it does not print out. If I put fmt.Println statements before or after they go func block, they will both print out.
Why does the self-calling block in the above example not seem to run at all?
If it were running, why would it block on current = <-count? How could I properly increment the channel?

I can't answer the first one issue without more info. The code you did show has two issues. First, the program exits after the goroutine is started. The second issue is that the goroutine is waiting for something to be sent to count, if you receive from the count channel it will not deadlock.
Here is an example showing the deadlock (http://play.golang.org/p/cRgjZt7U2A):
package main
import (
"fmt"
)
func main() {
count := make(chan int)
go func() {
current := 0
for {
current = <-count
current++
count <- current
fmt.Println(count)
}
}()
fmt.Println(<-count)
}
Here is an example of it working the way I think you are expecting (http://play.golang.org/p/QQnRpCDODu)
package main
import (
"fmt"
)
func main() {
count := make(chan int)
go func() {
current := 0
for {
current = <-count
current++
count <- current
fmt.Println(count)
}
}()
count <- 1
fmt.Println(<-count)
}

Channel :- Channel is something that can't store the value. It can only buffer the value so the basic usage is it can send and receive the value. So when you declare count := make(chan int) it does not contain any value. So the statement current = <-count will give you error that all go routines are asleep. Basically channel was design to work as communicator for different go routines which are running on different process and your main function is running on different process.
So your answer to first question is:-
1.Why does the self-calling block in the above example not seem to run at all?
Answer- See the main function you are running has one process and the go routine is running on another process so if your main function gets its execution completed before your go-routine than you will never get result form go-routine because your main thread gets dead after the execution is complete. So i am providing you a web example which is related to your example of incrementing the counter. In this example you will create a server and listen on port 8000.First of all run this example and go in your web browser and type localhost:8000 and it will so you the incrementing counter that channel stores in every buffer. This example will provide you an idea of how channel works.
2.If it were running, why would it block on current = <-count? How could I properly increment the channel?
Answer-You are receiving from the channel but channel does not have anything in its buffer so you will get an error "All go-routines are asleep". First you should transfer value into the channel and correspondingly receive it otherwise it will again go to deadlock.
package main
import (
"fmt"
"http"
)
type webCounter struct {
count chan int
}
func NewCounter() *webCounter {
counter := new(webCounter)
counter.count = make(chan int, 1)
go func() {
for i:=1 ;; i++ { counter.count <- i }
}()
return counter
}
func (w *webCounter) ServeHTTP(r http.ResponseWriter, rq *http.Request) {
if rq.URL.Path != "/" {
r.WriteHeader(http.StatusNotFound)
return
}
fmt.Fprintf(r, "You are visitor %d", <-w.count)
}
func main() {
http.ListenAndServe(":8000", NewCounter());
}

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

golang string channel send/receive inconsistency - unit-testing

Related

Multiple concurrency in golang

Idiomatic variable-size worker pool in Go

Accessing a channel from a different goroutine

processing jobs from a neverending queue with a fixed number of workers

Func will not run; increment channel

Categories

Resources