Clojure: architecture advice needed - clojure

I'm writing a little clojure pub/sub interface. It's very barebones, only two methods that will actually be used: do-pub and sub-listen. sub-listen takes a string (a sub name) and do-pub takes two strings (a sub name and a value).
I'm still fairly new at clojure and am having some trouble coming up with a workable way to do this. My first thought (and indeed my first implementation) uses a single agent which holds a hash:
{ subname (promise1 promise2 etc) }
When a thread wants to sub it conj's a promise object to the list associated with the sub it wants, then immediately tries to de-reference that promise (therefore blocking).
When a pub happens it goes through every item in the list for the sub and delivers the value to that item (the promise). It then dissoc's that subname from the map and returns it to the agent.
In this way I got a simple pub sub implementation working. However, the problem comes when someone subs, doesn't receive a pub for a certain amount of time, then gets killed due to timeout. In this scenario there will be a worthless promise in the agent that doesn't need to be, and moreover this will be a source of a memory leak if that sub never gets pub'd.
Does anyone have any thoughts on how to solve this? Or if there is a better way to do what I'm trying to do overall (I'm trying to avoid using any external pre-cooked pubsub libraries, this is a pet project not a work one)?

You can do something like this:
Create an atom
publish function will update the atom value by the passed in value to the function
Subscribers can use add-watch on the atom to be notified of when the atom value changes i.e due to call to publish function
Use remove-watch to remove the subscription.
This way you will have a very basic pub-sub system.

I have marked Ankur's answer as the solution but I wanted to expand on it a bit. What I ended up doing is having a central atom that all client threads do an add-watch on. When a pub is done the atom's value is changed to a vector containing the name of the sub and the value being pub'd.
The function the clients pass to add-watch is a partial function which looks like
(partial (fn [prom sub key ref _old new] ...) sub prom)
where prom is a promise previously generated. The client then blocks while waiting on that promise. The partial function checks if the sub in new is the same as sub, if so it removes the watch and delivers on the promise with the value from new.

Related

Camunda, how to inject subprocess with specific parameters from main process

I have a Camunda flow with a Call Activity (sequential), the call Activity calls several subflows based on a list of process keys (ids) in a certain order.
For instance I get a list of ["flow-1", "flow-2"], then flow-1.bpmn and flow-2.bpmn are executed.
But, also in the scope is flow specific data, added to the scope in "Read LOT Configuration". For instance [{"name", "flow-1", "identifier" : "some-data"}, {name: "flow-2", "identifier" : "some other data"}].
I would like the call activity to determine that for flow-1, I need to send the flow-1 related object along.
I do not want to send the entire collection, but only the flow specific data.
How can I achieve this?
Some ideas:
a) use the element variable from the call activity settings as key to extract the correct data element in a data mapping
b) surround the call activity with a multi-instance embedded sub process. In this scope you will have the element variable (processId), which can then be used to perform delegate variable mapping (https://docs.camunda.org/manual/7.16/reference/bpmn20/subprocesses/call-activity/#delegation-of-variable-mapping)
c) pass the processID as data and fetch the configuration for the particular process inside its sub process implementation only

Understanding the point of supply blocks (on-demand supplies)

I'm having trouble getting my head around the purpose of supply {…} blocks/the on-demand supplies that they create.
Live supplies (that is, the types that come from a Supplier and get new values whenever that Supplier emits a value) make sense to me – they're a version of asynchronous streams that I can use to broadcast a message from one or more senders to one or more receivers. It's easy to see use cases for responding to a live stream of messages: I might want to take an action every time I get a UI event from a GUI interface, or every time a chat application broadcasts that it has received a new message.
But on-demand supplies don't make a similar amount of sense. The docs say that
An on-demand broadcast is like Netflix: everyone who starts streaming a movie (taps a supply), always starts it from the beginning (gets all the values), regardless of how many people are watching it right now.
Ok, fair enough. But why/when would I want those semantics?
The examples also leave me scratching my head a bit. The Concurancy page currently provides three examples of a supply block, but two of them just emit the values from a for loop. The third is a bit more detailed:
my $bread-supplier = Supplier.new;
my $vegetable-supplier = Supplier.new;
my $supply = supply {
whenever $bread-supplier.Supply {
emit("We've got bread: " ~ $_);
};
whenever $vegetable-supplier.Supply {
emit("We've got a vegetable: " ~ $_);
};
}
$supply.tap( -> $v { say "$v" });
$vegetable-supplier.emit("Radish"); # OUTPUT: «We've got a vegetable: Radish␤»
$bread-supplier.emit("Thick sliced"); # OUTPUT: «We've got bread: Thick sliced␤»
$vegetable-supplier.emit("Lettuce"); # OUTPUT: «We've got a vegetable: Lettuce␤»
There, the supply block is doing something. Specifically, it's reacting to the input of two different (live) Suppliers and then merging them into a single Supply. That does seem fairly useful.
… except that if I want to transform the output of two Suppliers and merge their output into a single combined stream, I can just use
my $supply = Supply.merge:
$bread-supplier.Supply.map( { "We've got bread: $_" }),
$vegetable-supplier.Supply.map({ "We've got a vegetable: $_" });
And, indeed, if I replace the supply block in that example with the map/merge above, I get exactly the same output. Further, neither the supply block version nor the map/merge version produce any output if the tap is moved below the calls to .emit, which shows that the "on-demand" aspect of supply blocks doesn't really come into play here.
At a more general level, I don't believe the Raku (or Cro) docs provide any examples of a supply block that isn't either in some way transforming the output of a live Supply or emitting values based on a for loop or Supply.interval. None of those seem like especially compelling use cases, other than as a different way to transform Supplys.
Given all of the above, I'm tempted to mostly write off the supply block as a construct that isn't all that useful, other than as a possible alternate syntax for certain Supply combinators. However, I have it on fairly good authority that
while Supplier is often reached for, many times one would be better off writing a supply block that emits the values.
Given that, I'm willing to hazard a pretty confident guess that I'm missing something about supply blocks. I'd appreciate any insight into what that might be.
Given you mentioned Supply.merge, let's start with that. Imagine it wasn't in the Raku standard library, and we had to implement it. What would we have to take care of in order to reach a correct implementation? At least:
Produce a Supply result that, when tapped, will...
Tap (that is, subscribe to) all of the input supplies.
When one of the input supplies emits a value, emit it to our tapper...
...but make sure we follow the serial supply rule, which is that we only emit one message at a time; it's possible that two of our input supplies will emit values at the same time from different threads, so this isn't an automatic property.
When all of our supplies have sent their done event, send the done event also.
If any of the input supplies we tapped sends a quit event, relay it, and also close the taps of all of the other input supplies.
Make very sure we don't have any odd races that will lead to breaking the supply grammar emit* [done|quit].
When a tap on the resulting Supply we produce is closed, be sure to close the tap on all (still active) input supplies we tapped.
Good luck!
So how does the standard library do it? Like this:
method merge(*#s) {
#s.unshift(self) if self.DEFINITE; # add if instance method
# [I elided optimizations for when there are 0 or 1 things to merge]
supply {
for #s {
whenever $_ -> \value { emit(value) }
}
}
}
The point of supply blocks is to greatly ease correctly implementing reusable operations over one or more Supplys. The key risks it aims to remove are:
Not correctly handling concurrently arriving messages in the case that we have tapped more than one Supply, potentially leading us to corrupt state (since many supply combinators we might wish to write will have state too; merge is so simple as not to). A supply block promises us that we'll only be processing one message at a time, removing that danger.
Losing track of subscriptions, and thus leaking resources, which will become a problem in any longer-running program.
The second is easy to overlook, especially when working in a garbage-collected language like Raku. Indeed, if I start iterating some Seq and then stop doing so before reaching the end of it, the iterator becomes unreachable and the GC eats it in a while. If I'm iterating over lines of a file and there's an implicit file handle there, I risk the file not being closed in a very timely way and might run out of handles if I'm unlucky, but at least there's some path to it getting closed and the resources released.
Not so with reactive programming: the references point from producer to consumer, so if a consumer "stops caring" but hasn't closed the tap, then the producer will retain its reference to the consumer (thus causing a memory leak) and keep sending it messages (thus doing throwaway work). This can eventually bring down an application. The Cro chat example that was linked is an example:
my $chat = Supplier.new;
get -> 'chat' {
web-socket -> $incoming {
supply {
whenever $incoming -> $message {
$chat.emit(await $message.body-text);
}
whenever $chat -> $text {
emit $text;
}
}
}
}
What happens when a WebSocket client disconnects? The tap on the Supply we returned using the supply block is closed, causing an implicit close of the taps of the incoming WebSocket messages and also of $chat. Without this, the subscriber list of the $chat Supplier would grow without bound, and in turn keep alive an object graph of some size for each previous connection too.
Thus, even in this case where a live Supply is very directly involved, we'll often have subscriptions to it that come and go over time. On-demand supplies are primarily about resource acquisition and release; sometimes, that resource will be a subscription to a live Supply.
A fair question is if we could have written this example without a supply block. And yes, we can; this probably works:
my $chat = Supplier.new;
get -> 'chat' {
web-socket -> $incoming {
my $emit-and-discard = $incoming.map(-> $message {
$chat.emit(await $message.body-text);
Supply.from-list()
}).flat;
Supply.merge($chat, $emit-and-discard)
}
}
Noting it's some effort in Supply-space to map into nothing. I personally find that less readable - and this didn't even avoid a supply block, it's just hidden inside the implementation of merge. Trickier still are cases where the number of supplies that are tapped changes over time, such as in recursive file watching where new directories to watch may appear. I don't really know how'd I'd express that in terms of combinators that appear in the standard library.
I spent some time teaching reactive programming (not with Raku, but with .Net). Things were easy with one asynchronous stream, but got more difficult when we started getting to cases with multiple of them. Some things fit naturally into combinators like "merge" or "zip" or "combine latest". Others can be bashed into those kinds of shapes with enough creativity - but it often felt contorted to me rather than expressive. And what happens when the problem can't be expressed in the combinators? In Raku terms, one creates output Suppliers, taps input supplies, writes logic that emits things from the inputs into the outputs, and so forth. Subscription management, error propagation, completion propagation, and concurrency control have to be taken care of each time - and it's oh so easy to mess it up.
Of course, the existence of supply blocks doesn't stop being taking the fragile path in Raku too. This is what I meant when I said:
while Supplier is often reached for, many times one would be better off writing a supply block that emits the values
I wasn't thinking here about the publish/subscribe case, where we really do want to broadcast values and are at the entrypoint to a reactive chain. I was thinking about the cases where we tap one or more Supply, take the values, do something, and then emit things into another Supplier. Here is an example where I migrated such code towards a supply block; here is another example that came a little later on in the same codebase. Hopefully these examples clear up what I had in mind.

SFDC Apex Code: Access class level static variable from "Future" method

I need to do a callout to webservice from my ApexController class. To do this, I have an asycn method with attribute #future (callout=true). The webservice call needs to refeence an object that gets populated in save call from VF page.
Since, static (future) calls does not all objects to be passed in as method argument, I was planning to add the data in a static Map and access that in my static method to do a webservice call out. However, the static Map object is getting re-initalized and is null in the static method.
I will really appreciate if anyone can give me some pointeres on how to address this issue.
Thanks!
Here is the code snipped:
private static Map<String, WidgetModels.LeadInformation> leadsMap;
....
......
public PageReference save() {
if(leadsMap == null){
leadsMap = new Map<String, WidgetModels.LeadInformation>();
}
leadsMap.put(guid,widgetLead);
}
//make async call to Widegt Webservice
saveWidgetCallInformation(guid)
//async call to widge webserivce
#future (callout=true)
public static void saveWidgetCallInformation(String guid) {
WidgetModels.LeadInformation cachedLeadInfo =
(WidgetModels.LeadInformation)leadsMap.get(guid);
.....
//call websevice
}
#future is totally separate execution context. It won't have access to any history of how it was called (meaning all static variables are reset, you start with fresh governor limits etc. Like a new action initiated by the user).
The only thing it will "know" is the method parameters that were passed to it. And you can't pass whole objects, you need to pass primitives (Integer, String, DateTime etc) or collections of primitives (List, Set, Map).
If you can access all the info you need from the database - just pass a List<Id> for example and query it.
If you can't - you can cheat by serializing your objects and passing them as List<String>. Check the documentation around JSON class or these 2 handy posts:
https://developer.salesforce.com/blogs/developer-relations/2013/06/passing-objects-to-future-annotated-methods.html
https://gist.github.com/kevinohara80/1790817
Side note - can you rethink your flow? If the starting point is Visualforce you can skip the #future step. Do the callout first and then the DML (if needed). That way the usual "you have uncommitted work pending" error won't be triggered. This thing is there not only to annoy developers ;) It's there to make you rethink your design. You're asking the application to have open transaction & lock on the table(s) for up to 2 minutes. And you're giving yourself extra work - will you rollback your changes correctly when the insert went OK but callout failed?
By reversing the order of operations (callout first, then the DML) you're making it simpler - there was no save attempt to DB so there's nothing to roll back if the save fails.

Creating AKKA actor from string class names

I have a List (e.g. the output of a database query) variable, which I use to create actors (they could be many and they are varied). I use the following code (in TestedActor preStart()), the actor qualified name is from the List variable as an example):
Class<?> classobject = Class.forName("com.java.anything.actor.MyActor"); //create class from name string
ActorRef actref = getContext().actorOf(Props.create(classobject), actorname); //creation
the code was tested:
#Test
public void testPreStart() throws Exception {
final Props props = Props.create(TestedActor.class);
final TestActorRef<TestedActor > ref = TestActorRef.create(system, props, "testA");
#SuppressWarnings("unused")
final TestedActor actor = ref.underlyingActor();
}
EDIT : it is working fine (contrary to the previous post, where I have seen a timeout error, it turned out as an unrelated alarm).
I have googled some posts related to this issue (e.g. suggesting the usage of newInstance), however I am still confused as these were superseded by mentioning it as a bad pattern. So, I am looking for a solution in java, which is also safe from the akka point of view (or the confirmation of the above pattern).
Maybe if you would write us why you need to create those actors this way it would help to find the solution.
Actually most people will tell you that using reflection is not the best idea. Sometimes it's the only option but you should avoid it.
Maybe this would be a solution for you:
Since actors are really cheap you can create all of them upfront. How many of them do you have?
Now the query could return you a path to the actor, not the class. Select it with actorSelection and send messages to it.
If your actors does a long running job you can use a router or if you want to a Proxy Actor that will spawn other actors as needed. Other option is to create futures from a single actor.
It really depends on the case, because you may need to create multiple execution context's not to starve any of the actors (of futures).

Asynchronous network calls

I made a class that has an asynchronous OpenWebPage() function. Once you call OpenWebPage(someUrl), a handler gets called - OnPageLoad(reply). I have been using a global variable called lastAction to take care of stuff once a page is loaded - handler checks what is the lastAction and calls an appropriate function. For example:
this->lastAction == "homepage";
this->OpenWebPage("http://www.hardwarebase.net");
void OnPageLoad(reply)
{
if(this->lastAction == "homepage")
{
this->lastAction = "login";
this->Login(); // POSTs a form and OnPageLoad gets called again
}
else if(this->lastAction == "login")
{
this->PostLogin(); // Checks did we log in properly, sets lastAction as new topic and goes to new topic URL
}
else if(this->lastAction == "new topic")
{
this->WriteTopic(); // Does some more stuff ... you get the point
}
}
Now, this is rather hard to write and keep track of when we have a large number of "actions". When I was doing stuff in Python (synchronously) it was much easier, like:
OpenWebPage("http://hardwarebase.net") // Stores the loaded page HTML in self.page
OpenWebpage("http://hardwarebase.net/login", {"user": username, "pw": password}) // POSTs a form
if(self.page == ...): // now do some more checks etc.
// do something more
Imagine now that I have a queue class which holds the actions: homepage, login, new topic. How am I supposed to execute all those actions (in proper order, one after one!) via the asynchronous callback? The first example is totally hard-coded obviously.
I hope you understand my question, because frankly I fear this is the worst question ever written :x
P.S. All this is done in Qt.
You are inviting all manner of bugs if you try and use a single member variable to maintain state for an arbitrary number of asynchronous operations, which is what you describe above. There is no way for you to determine the order that the OpenWebPage calls complete, so there's also no way to associate the value of lastAction at any given time with any specific operation.
There are a number of ways to solve this, e.g.:
Encapsulate web page loading in an immutable class that processes one page per instance
Return an object from OpenWebPage which tracks progress and stores the operation's state
Fire a signal when an operation completes and attach the operation's context to the signal
You need to add "return" statement in the end of every "if" branch: in your code, all "if" branches are executed in the first OnPageLoad call.
Generally, asynchronous state mamangment is always more complicated that synchronous. Consider replacing lastAction type with enumeration. Also, if OnPageLoad thread context is arbitrary, you need to synchronize access to global variables.