understand the code - Go concurrency pattern: Daisy Chain - concurrency

I was studying Go concurrency patterns.
One pattern I am not sure about is the Daisy Chain: https://talks.golang.org/2012/concurrency.slide#39
I find it very hard to understand the control flow of the code.
Can someone explain it to me?
package main
import (
"fmt"
)
func f(left, right chan int) {
left <- 1 + <-right
}
func main() {
const n = 10000
leftmost := make(chan int)
right := leftmost //point B: what do these do?
left := leftmost
for i := 0; i < n; i++ {
right = make(chan int)
go f(left, right)
left = right //point A
}
go func(c chan int) { c <- 1 }(right)
fmt.Println(<-leftmost)
}
Conclusion:
Values flow through the channels from right to left. It is good practice to write
func f(left chan<- int, right <-chan int) rather than the original function signature above.
The 'chain reaction' does not start until c <- 1: once the value 1 is sent to the rightmost channel,
the reaction runs all the way to the leftmost end, and the program prints 10001.
The reason is that a receive on a Go channel blocks until a value is sent on that channel.
@Rick-777 shows how to use an array-like structure for easier understanding.
Since each goroutine starts out only a few kilobytes in size, creating 10k goroutines and channels is not a bad idea.
I cleaned up some code around Point B, for channel initialization.
Here is the source code:
http://play.golang.org/p/1kFYPypr0l

VonC has already given a direct answer. Here are some further remarks.
A slightly tidied-up version is in the playground, the difference being that the channels passed as parameters have their direction specified explicitly, i.e. <-chan and chan<-. It's good practice to do this because the compiler can then catch more mistakes for you.
An alternative and equivalent program that has a daisy-chain of n goroutines can be written using an array of channels instead. This allocates the same total number of channels using fewer lines of code. See playground:
package main
import (
"fmt"
)
func f(left chan<- int, right <-chan int) {
left <- 1 + <-right
}
func main() {
const n = 10000
// first we construct an array of n+1 channels each being a 'chan int'
var channels [n+1]chan int
for i := range channels {
channels[i] = make(chan int)
}
// now we wire n goroutines in a chain
for i := 0; i < n; i++ {
go f(channels[i], channels[i+1])
}
// insert a value into the right-hand end
go func(c chan<- int) { c <- 1 }(channels[n])
// pick up the value emerging from the left-hand end
fmt.Println(<-channels[0])
}
I hope you can see now how the original program is equivalent to this program. There is one minor difference: the original program does not create any channel array, so uses just a little less memory.

It illustrates you can generate a large number of goroutines.
Here, each go f(left, right) blocks: left <- 1 + <-right blocks because it waits for right to get a value. See "do golang channels maintain order".
All channels created here are unbuffered channels.
All 10000 goroutines are created.
Point B: right and left are declared, using the short variable declaration.
right is initialized to leftmost, but it doesn't matter, because it will be reassigned to a new channel in the for loop (right = make(chan int)).
Another way to declare right would have been:
var right chan int
left is initialized with leftmost, the very first channel created.
Point A: once that goroutine starts waiting (left <- 1 + <-right), the for loop sets left to right and creates a new right: that is how the daisy chain is built
left <- (new) right (now left) <- (new) right (now left) <- ...
Then, one value is sent to the last right channel created: {c <- 1 }(right)
And you wait for the first leftmost channel created to receive its value (incremented 10000 times).
Since receivers always block until there is data to receive, the main() function itself doesn't exit before leftmost finally receives its value.
If main() exited too soon, the daisy chain wouldn't have time to complete.

I found that dry-running this program is really helpful for understanding it.
At first, after executing
leftmost := make(chan int)
right := leftmost
left := leftmost
leftmost, left, and right are all referring to the same chan int
[chan int]
|
left, leftmost, right
Let's run some iterations for the for-loop.
i = 0
When we just enter the for loop,
[chan int]
|
left, leftmost, right
after executing right = make(chan int) and go f(left, right).
[chan int] <-(+1)- [chan int]
|                  |
left, leftmost     right
after executing left = right
[chan int] <-(+1)- [chan int]
|                  |
leftmost           left, right
i = 1
When we just enter the for loop,
[chan int] <-(+1)- [chan int]
|                  |
leftmost           left, right
after executing right = make(chan int) and go f(left, right).
[chan int] <-(+1)- [chan int] <-(+1)- [chan int]
|                  |                  |
leftmost           left               right
after executing left = right
[chan int] <-(+1)- [chan int] <-(+1)- [chan int]
|                                     |
leftmost                              left, right
I feel like two loops are enough to see the pattern:
Every loop we create a new chan int and append it at the end of the "linked list of chan int".
So after n = 10000 loops, we have created 10000 new chan int, and the number of chan int in the "linked list of chan int" will be 10001.
10001 chan int means 10000 gaps between adjacent pairs of chan int, and each gap means one +1.
Until a value is passed in, every goroutine in the chain is blocked receiving from its right-hand chan int, so the whole chain just waits.
After the for loop, we execute go func(c chan int) { c <- 1 }(right); the 1 is passed into the "linked list of chan int" and +1 is performed on the value 10000 times, so the final result at leftmost will be 10001.
This is what it looks like when we pass 1 into the "linked list of chan int":
[chan int] <-(+1)- [chan int] <-(+1)- ...... <-(+1)- [chan int] <- 1
|                                                    |
leftmost                                             left, right
I created a leetcode playground holding all the code. You could try it here (https://leetcode.com/playground/gAa59fh3).

Related

range function in sml with a step parameter

I am very new to SML and functional programming.
I have searched the site but was unable to find an answer to my question.
I am trying to write a basic range function with a start, stop, and step parameter.
For example, range(2, 12, 3) should return the list [2,5,8,11].
I don't get any errors, but when I try running range(2,12,3); the cursor advances to the next line and nothing happens, I can't even type anything into the smlnj app.
Here is my code:
fun range(start, stop, step) =
if start = stop then nil
else start::range(start+step, stop, step);
Which outputs this:
val range = fn : int * int * int -> int list
What changes do I need to make to my code so that when I run range(2,12,3) I get [2,5,8,11] ?
Thank you
Your condition (start = stop) for ending the recursion is wrong. In your example you'll perform recursive calls with start=2, start=5, start=8, start=11, start=14, ... leading to an infinite loop. It's not that nothing happens, it's just that smlnj keeps computing...
You have an infinite recursion whenever stop - start is not a multiple of the step size (or zero).
Look at range(11,12,3), which should generate your last number:
if 11 = 12 then nil
else 11::range(11+3, 12, 3)
This will calculate range(14,12,3):
if 14 = 12 then nil
else 14::range(14+3, 12, 3)
and then range(17,12,3), and so on, ad infinitum.
You need to replace = with either > or >=, depending on whether stop should be included or not.

Haskell runs incredibly slow compared to C++

I have an assignment that I need to solve in Haskell.
It was first solved in C++, then I rewrote the code in Haskell.
The algorithm works correctly, but the Haskell version runs slower than the C++ version.
For example, for the input:
110110100011010101010101
010101101001000101010100
Haskell (with GHCI): 20 sec
Haskell (compiled GHC): 3 sec
C++: <1 sec
With a difference this much, I think, I am doing something wrong.
Problem description: We are given 2 arrays (strings) of the same length, containing 0s and 1s. Our task is to find the minimal number of switches (switch = 0->1 or 1->0) needed to make the source array identical to the target. There is a rule for switching: we can only change the state of i if i+1 is 1 AND i+2->n are 0, except for the last one (which can always be switched).
C++ code:
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
//flips the i-th char in s to match the i-th char in t
int flip2(int i, string& s, char t){
if(i>=s.length() || s[i]==t) return 0; //if matches, or non-existent index, returns 0 steps
int c=1; // 1 step is switching the character
c+=flip2(i+1,s,'1'); //in order to switch i, i+1 have to be 1
for(int j=i+2;j<s.length();j++) //in order to switch i, i+2->n have to be 0
c+=flip2(j,s,'0');
s[i]=t;
return c;
}
//returns the minimum number of switch steps to make s=t
int ultimateFlip( string s, string t){
int c=0;
for(int i=0;i<s.length();i++){ // switches every character in s to match t
c+=flip2(i,s,t[i]); //adds up the steps
}
return c;
}
int main()
{
string s; // source array (made up of 0s and 1s)
getline(cin, s);
string t; //target array (made up of 0s and 1s)
getline(cin, t);
cout<<ultimateFlip(s,t);
}
Haskell code:
import System.IO
import Control.Monad
main :: IO ()
main = do
    hSetBuffering stdout NoBuffering
    s <- getLine -- source string
    t <- getLine -- target string
    let sol = ultimateFlip s t
    putStrLn $ show sol
    return ()
--returns the minimum number of switch steps to make s=t
ultimateFlip :: [Char] -> [Char] -> Int
ultimateFlip [] [] = 0
ultimateFlip (s:sx) (t:tx) = k + ultimateFlip sxn tx
    where
        (k,sxn)=flip2 s t sx --sxn = new (after) version of sx (the rest of the string)
--flips the s to match t, sx= rest of the source string after s
flip2 :: Char->Char->[Char]->(Int,[Char])
flip2 s t sx
    | s == t = (0,sx) --if matches, no switch needed
    | null sx = (1,sx) --if last one, one switch needed
    | otherwise = (k2+k1+1,'1':nsx2)
    where
        (sxh:sxt) = sx
        (k1,nsx1) = flip2 sxh '1' sxt --switch next to 1
        (k2,nsx2) = zeroOut nsx1 --switch everything after the next to 0
--switch everything to 0
zeroOut :: [Char] -> (Int, [Char])
zeroOut [] = (0,[])
zeroOut (s:sx) = (k1+k2,'0':nsx2)
    where
        (k1,nsx1) = flip2 s '0' sx
        (k2,nsx2) = zeroOut nsx1
For Haskell I am using: GHC, version 8.10.2
For C++ I am using: gcc (GCC) 10.2.0
You are spending an awful lot of time allocating and immediately destructuring pairs. That's pretty unnecessary, because you always know what [Char] you're going to get back in the second half of the tuple. Here's one way to eliminate that problem:
ultimateFlip :: [Char] -> [Char] -> Int
ultimateFlip [] [] = 0
ultimateFlip (s:sx) (t:tx)
    | s == t = ultimateFlip sx tx
    | null sx = 1
    | otherwise = ultimateFlip sx tx' + 1 + ultimateFlip tx' tx where
    tx' = '1' : ('0'<$drop 1 tx)
With this change, the Haskell performs pretty much the same as the C++ on my machine -- sometimes a few ms faster, sometimes a few ms slower, for inputs slightly longer than the one you proposed.
Of course, as usual, switching to a better algorithm blows microoptimizations like this one out of the water in terms of gains. The following implementation takes less time than the reporting precision of time even for much longer strings.
import Data.Bits
main :: IO ()
main = do
    s <- getLine
    t <- getLine
    print (ultimateFlip s t)
ultimateFlip :: [Char] -> [Char] -> Int
ultimateFlip [] [] = 0
ultimateFlip (s:sx) (t:tx)
    | s == t = ultimateFlip sx tx
    | otherwise = go '1' pow sx + 1 + go '1' pow tx where
    pow = 2^(length sx-1)
    go _ _ [] = 0
    go s pow (t:tx) = go s' pow' tx + n where
        pow' = shiftR pow 1
        (s', n) = if s == t then ('0', 0) else ('1', pow)
It also smoothly upgrades to using arbitrary-sized integers for those longer inputs just by switching Int to Integer in the type signature of ultimateFlip.
The biggest problem you're having is with a lack of "strictness". Haskell's lazy evaluation means that even simple calculations like k2+k1+1 generally won't be evaluated until the answer is needed. With recursive functions performing a series of additions like this, you can sometimes end up building an enormous unevaluated expression that takes up tons of memory before it finally gets evaluated at the end.
Here, by adding a language extension at the top:
{-# LANGUAGE BangPatterns #-}
and adding a single strictness "!" annotation in flip2's "where" clause:
(!k1,nsx1) = flip2 sxh '1' sxt
 ^
this drops the runtime on my machine from 800ms to 80ms (again, compiled with ghc -O2). That's still slower than the C++ version (20ms), but it's in the right ballpark.
The annotation here has the effect of forcing the expression to be evaluated. Figuring out where strictness annotations are needed is a bit of a dark art. In this case, I suspected your counting was causing the problem, so I threw in "!" before all the places that a count was being returned, and then I deleted them until I found the one that made most of the difference.
The remaining speed difference is probably a result of using a lot of list processing in Haskell (versus arrays in C++), so you could likely do better, though I'm not sure it's worth the trouble.

What's Swift's "fromAfter" call in array slices?

Swift 3 has upTo and through
which are noninclusive, inclusive respectively
func prefix(upTo: Int)
Returns a subsequence from the start of the collection up to, but not including, the specified position.
.
func prefix(through: Int)
Returns a subsequence from the start of the collection through the specified position.
for the other end it has from
func suffix(from: Int)
Returns a subsequence from the specified position to the end of the collection.
which seems to be inclusive
What's the non-inclusive call at the far end??
// sum the numbers before, and after, an index i...
let lo = A.prefix(upTo: i).reduce(0,+) // means noninclusive
let hi = A.suffix(from: i+1).reduce(0,+) // 'from' seems to mean inclusive
what's the call I don't know? It sucks to have to write from with +1.
There is currently no non-inclusive suffix method for Collection types in the stdlib, but for this use case, you can readily implement your own by combining suffix(from:) with dropFirst(_:) (which, imho, shows intent better than from: idx+1), e.g.
extension Collection where SubSequence == SubSequence.SubSequence {
public func suffix(after start: Index) -> SubSequence {
return suffix(from: start).dropFirst(1)
}
}
Applied to your example (separately sum numbers before and after a given partitioning number (or, index of), not including the partitioning one):
/* in this example, invalid indices will yield a full-array sum into
lo or hi, depending on high or low index out of bounds, respectively */
func splitSum(of arr: [Int], at: Int) -> (Int, Int) {
guard at < arr.count else { return (arr.reduce(0, +), 0) }
guard at >= 0 else { return (0, arr.reduce(0, +)) }
let lo = arr.prefix(upTo: at).reduce(0, +)
let hi = arr.suffix(after: at).reduce(0, +)
return (lo, hi)
}
// example usage
let arr = [Int](repeating: 1, count: 10)
print(splitSum(of: arr, at: 4)) // (4, 5)
Leaving the subject of a non-inclusive suffix method, an alternative approach to your split sum calculation would be to use one of the split(...) methods for Collection types:
func splitSum(of arr: [Int], at: Int) -> (Int, Int) {
guard at < arr.count else { return (arr.reduce(0, +), 0) }
guard at >= 0 else { return (0, arr.reduce(0, +)) }
let sums = arr.enumerated()
.split (omittingEmptySubsequences: false) { $0.0 == at }
.map { $0.reduce(0) { $0 + $1.1 } }
guard let lo = sums.first, let hi = sums.last else { fatalError() }
return (lo, hi)
}
// example: same as above
I believe the split version is a bit more verbose, however, and also semantically poorer at showing the intent of the code.

How do goroutines work?

I was following the Go Tour and I am a bit stuck when it comes to goroutines. I understand that they are very lightweight and that every time a goroutine blocks, another one will start but I can't get my head around how this example actually works:
package main
import (
"fmt"
"time"
)
func say(s string) {
for i := 0; i < 5; i++ {
time.Sleep(1000 * time.Millisecond)
fmt.Println(s)
}
}
func main() {
go say("world")
say("hello")
}
Playground
I understand that a goroutine is started for the say function with the argument "world", but as far as I understand that should print "world" five times and "hello" once. However I don't understand why the output is as it is:
hello
world
hello
world
hello
world
hello
world
hello
From my limited understanding of threads from other languages the output should have been something like this:
hello
world
world
world
world
world
or like this:
world
world
world
hello
world
world
Why does the second line execute five times as well? Does anything below a go statement classify as part of the go routine?
Also the next slide shows something I can't get my head round again:
package main
import "fmt"
func sum(a []int, c chan int) {
sum := 0
for _, v := range a {
sum += v
}
c <- sum // send sum to c
}
func main() {
a := []int{7, 2, 8, -9, 4, 0}
c := make(chan int)
go sum(a[:len(a)/2], c)
go sum(a[len(a)/2:], c)
x, y := <-c, <-c // receive from c
fmt.Println(x, y, x+y)
}
Playground
A goroutine is started for the second half of the slice and then another one for the first part of the slice, however the values x and y have been assigned two different values. The way I see it the sum function will send its sum to channel c and then the next sum will send its sum to the same channel c so how can the two variables be assigned two different values? Shouldn't channel c have one single sum value in there?
I appreciate that this is quite a long question but I wasn't able to find the answer to these questions.
Why does the second line execute 5 times as well?
The second line prints hello once per second, five times, in the main goroutine.
But concurrently, the first line, go say("world"), also prints world once per second, five times, in a separate goroutine.
The Sleep ensures that each goroutine yields, allowing the other to resume.
Hence the output:
hello
world
hello
world
hello
world
hello
world
hello
The way I see it the sum function will send its sum to channel c and then the next sum will send its sum to the same channel c so how can the two variables be assigned two different values?
Because each send will block on c until channel c is read.
Since there are two writes to c, you need two reads:
x, y := <-c, <-c // receive from c twice.
The Assignment section of the Go spec allows a tuple assignment, in which:
the number of operands on the left must equal the number of expressions on the right, each of which must be single-valued, and the nth expression on the right is assigned to the nth operand on the left.
For the first function you should see values in the style VonC presented. The reason hello prints 5 times as well is that the function say prints its argument 5 times. Just imagine the program without the goroutine. I think it doesn't guarantee that you will get hello and world perfectly interspersed but I may be wrong.
The reason the channel works is:
Golang lets you do multiple assignment, as VonC mentions
Channels empty out, i.e. when you receive from c into x it removes the first sum that was sent on the channel, and when you receive from c into y it removes the second (again, I think the order is not guaranteed: x could hold the first-half sum and y the second-half sum, or vice versa).
If you imagine channels as a sort of queue I think it makes more sense. The summing goroutines push values onto the queue and the assignments pop the values sequentially.

Reading from multiple channels simultaneously in Golang

I am new to Golang. Right now I am trying to figure out how to make an any-to-one channel in Golang, where the setup is as follows:
say I have two goroutines numgen1 and numgen2 executing concurrently and writing numbers to channels num1 resp. num2. I would like to add the numbers sent from numgen1 and numgen2 in a new process, addnum. I have tried something like this:
func addnum(num1, num2, sum chan int) {
done := make(chan bool)
go func() {
n1 := <- num1
done <- true
}()
n2 := <- num2
<- done
sum <- n1 + n2
}
but this seems sadly incorrect. Could someone please give me some ideas?
Thank you very much for your help.
Depending on your requirements, you may need to read both of the channels for every iteration (i.e. a sort-of 'zip' function). You can do this with a select, similarly to user860302's answer:
func main() {
c1 := make(chan int)
c2 := make(chan int)
out := make(chan int)
go func(in1, in2 <-chan int, out chan<- int) {
for {
sum := 0
select {
case sum = <-in1:
sum += <-in2
case sum = <-in2:
sum += <-in1
}
out <- sum
}
}(c1, c2, out)
}
This runs forever. My preferred way to terminate goroutines like this one is to close the input channels. In this case you would need to wait for both to close, then close(out) before terminating.
Tip: note the use of directional channels as goroutine formal parameters. The compiler catches more mistakes when you write it this way. Happiness!
The simplest answer would be
func addnum(num1, num2, sum chan int) {
n1 := <- num1
n2 := <- num2
sum <- n1 + n2
}
Since you need both num1 and num2 to do the calculation, it makes no sense to do it otherwise. After all, there are two possible execution orders:
num1 generates a number, followed by num2
num2 generates a number, followed by num1
In the first case, our channel reads correspond exactly to the execution order. In the second case, our first read will block until num1 has finally produced a number; the second read will then complete near-instantaneously because the sender on num2 is already waiting to hand over its number.
If you want to know more about channels in Go, I'd suggest having a look at http://godoc.org/github.com/thomas11/csp -- this is a collection of Hoare's CSP examples written in Go.
To answer the question "Reading from multiple channels simultaneously"
There is a way to listen to multiple channels simultaneously :
func main() {
c1 := make(chan string)
c2 := make(chan string)
...
go func() {
for {
select {
case msg1 := <- c1:
fmt.Println(msg1)
case msg2 := <- c2:
fmt.Println(msg2)
}
}
}()
In this example, I create two channels, c1 and c2.
Then I start a goroutine with an infinite loop. In this loop, I listen to both c1 and c2.
This lets you read from multiple channels simultaneously and process the messages when they arrive.
In order to avoid leaks, I should probably add another channel to stop the goroutine.