Akka Remote Performance issue

I am facing a performance issue with Akka remoting. I have two actors, Actor1 and Actor2. The messaging between the actors is a synchronous ask request from Actor1 to Actor2 and the response back from Actor2 to Actor1. Below are sample code snippets and the config for my actors:
Actor1.scala:
object Actor1 extends App {
val conf = ConfigFactory.load()
val system = ActorSystem("testSystem1", conf.getConfig("remote1"))
val actor = system.actorOf(Props[Actor1].withDispatcher("my-dispatcher"), "actor1")
implicit val timeOut: Timeout = Timeout(10 seconds)
class Actor1 extends Actor {
var value = 0
var actorRef: ActorRef = null
override def preStart(): Unit = {
println(self.path)
}
override def receive: Receive = {
case "register" =>
actorRef = sender()
println("Registering the actor")
val time = System.currentTimeMillis()
(1 to 300000).foreach(value => {
if (value % 10000 == 0) {
println("message count -- " + value + " --- time taken - " + (System.currentTimeMillis() - time))
}
Await.result(actorRef ? value, 10 seconds)
})
val totalTime = System.currentTimeMillis() - time
println("Total Time - " + totalTime)
}
}
}
Actor2.scala:
object Actor2 extends App {
val conf = ConfigFactory.load()
val system = ActorSystem("testSystem1", conf.getConfig("remote2"))
val actor = system.actorOf(Props[Actor2].withDispatcher("my-dispatcher"), "actor2")
implicit val timeOut: Timeout = Timeout(10 seconds)
actor ! "send"
class Actor2 extends Actor {
var value = 0
var actorSelection: ActorSelection = context.actorSelection("akka://testSystem1@127.0.0.1:6061/user/actor1")
override def receive: Receive = {
case "send" =>
actorSelection ! "register"
case int: Int => {
sender() ! 1
}
}
}
}
application.conf:
remote1 {
my-dispatcher {
executor = "thread-pool-executor"
type = PinnedDispatcher
}
akka {
actor {
provider = remote
}
remote {
artery {
transport = tcp # See Selecting a transport below
canonical.hostname = "127.0.0.1"
canonical.port = 6061
}
}
}
}
remote2 {
my-dispatcher {
executor = "thread-pool-executor"
type = PinnedDispatcher
}
akka {
actor {
provider = remote
}
remote {
artery {
transport = tcp # See Selecting a transport below
canonical.hostname = "127.0.0.1"
canonical.port = 6062
}
}
}
}
Output:
message count -- 10000 --- time taken - 5871
message count -- 20000 --- time taken - 9043
message count -- 30000 --- time taken - 12198
message count -- 40000 --- time taken - 15363
message count -- 50000 --- time taken - 18649
message count -- 60000 --- time taken - 22074
message count -- 70000 --- time taken - 25487
message count -- 80000 --- time taken - 28820
message count -- 90000 --- time taken - 32118
message count -- 100000 --- time taken - 35634
message count -- 110000 --- time taken - 39146
message count -- 120000 --- time taken - 42539
message count -- 130000 --- time taken - 45997
message count -- 140000 --- time taken - 50013
message count -- 150000 --- time taken - 53466
message count -- 160000 --- time taken - 57117
message count -- 170000 --- time taken - 61246
message count -- 180000 --- time taken - 65051
message count -- 190000 --- time taken - 68809
message count -- 200000 --- time taken - 72908
message count -- 210000 --- time taken - 77091
message count -- 220000 --- time taken - 80855
message count -- 230000 --- time taken - 84679
message count -- 240000 --- time taken - 89089
message count -- 250000 --- time taken - 93132
message count -- 260000 --- time taken - 97360
message count -- 270000 --- time taken - 101442
message count -- 280000 --- time taken - 105656
message count -- 290000 --- time taken - 109665
message count -- 300000 --- time taken - 113706
Total Time - 113707
Am I doing anything wrong here? Any observations or suggestions to improve the performance?

The main issue I see with the code is Await.result(). That is a blocking operation and will most likely hurt performance.
I suggest sending the requests without blocking, collecting the results in a fixed array/list, counting responses with an integer, and considering the run complete when the expected number of responses has been received.
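The suggested pattern can be sketched outside of Akka with plain Java futures. This is an illustrative stand-in, not the original Scala code: CompletableFuture plays the role of the remote ask, and all names are made up. Requests are fired without blocking; an integer counter plus a latch signal completion:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class NonBlockingCollect {
    // Stand-in for the remote ask: replies asynchronously on a worker thread.
    static CompletableFuture<Integer> ask(ExecutorService worker, int value) {
        return CompletableFuture.supplyAsync(() -> value + 1, worker);
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        int total = 1000;
        AtomicInteger received = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(total);

        for (int i = 0; i < total; i++) {
            // Fire every request immediately instead of Await-ing each reply.
            ask(worker, i).thenAccept(reply -> {
                received.incrementAndGet(); // count the response
                done.countDown();
            });
        }
        done.await(); // complete once the expected number of responses arrived
        System.out.println("received=" + received.get()); // received=1000
        worker.shutdown();
    }
}
```

In Akka itself the same idea is usually expressed by piping the ask's future back to the requesting actor (pipeTo), which keeps the actor's thread unblocked between request and response.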

Related

Groovy list in a map not showing a loop count properly

I have this code in Groovy:
def execution = []
def executor =[:]
for(loopcount=1;loopcount<4;loopcount++){
executor.executor = 'jmeter'
executor.scenario = 'scenario' + loopcount
println executor.scenario
executor.concurrency = 2
execution.add(executor)
}
execution.each{
println executor.scenario
}
It is a list of three maps, all the same apart from the scenario suffix increments. I am expecting:
scenario1
scenario2
scenario3
scenario1
scenario2
scenario3
But I get:
scenario1
scenario2
scenario3
scenario3
scenario3
scenario3
It's definitely adding three different maps in the list because the .each command is returning three values. And they're definitely different values in executor.scenario because the println in the loop is giving the correct '1, 2, 3' count. But why don't they stay as different values in the list?
I've also tried execution.push(executor) but that gives the same results. For context, this yaml is what I'm aiming for eventually;
---
execution:
- executor: "jmeter"
scenario: "scenario1"
concurrency: 2
- executor: "jmeter"
scenario: "scenario2"
concurrency: 2
- executor: "jmeter"
scenario: "scenario3"
concurrency: 2
And apart from the scenario count the rest of it works fine.
The problem:
def execution = []
def executor =[:]
for(loopcount=1;loopcount<4;loopcount++){
execution.add(executor) // <<-- this line adds the same map object to the list three times
}
To fix this, declare executor inside the for loop:
def execution = []
for(loopcount=1;loopcount<4;loopcount++){
def executor =[:] // <<-- creates a new object in a loop
execution.add(executor) // <<-- adds new object to a list
}
To make it clearer, here is what [] and [:] mean:
def execution = new ArrayList()
for(loopcount=1;loopcount<4;loopcount++){
def executor = new LinkedHashMap()
execution.add(executor)
}
Alternatively, you can declare the variable before the loop, but then you have to assign a new object to it inside the loop:
def execution = []
def executor
for(loopcount=1;loopcount<4;loopcount++){
executor = [:]
execution.add(executor)
}
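The same aliasing pitfall exists on the JVM generally, not just in Groovy. A minimal Java sketch (illustrative names) of both the bug and the fix:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapAliasing {
    public static void main(String[] args) {
        // Bug: one map object, mutated and added three times.
        List<Map<String, String>> buggy = new ArrayList<>();
        Map<String, String> shared = new HashMap<>();
        for (int i = 1; i < 4; i++) {
            shared.put("scenario", "scenario" + i);
            buggy.add(shared);          // same reference each time
        }
        // All three entries show the last value written.
        System.out.println(buggy.get(0).get("scenario")); // scenario3

        // Fix: create a fresh map on every iteration.
        List<Map<String, String>> fixed = new ArrayList<>();
        for (int i = 1; i < 4; i++) {
            Map<String, String> executor = new HashMap<>();
            executor.put("scenario", "scenario" + i);
            fixed.add(executor);        // distinct object each time
        }
        System.out.println(fixed.get(0).get("scenario")); // scenario1
    }
}
```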

Calculate Device seconds On using Kinesis Analytics

I'm experimenting with Kinesis Analytics and have solved many problems with it, but I'm stuck on the following:
I have a stream with records that reflect when a device is turned on and off, like:
device_id | timestamp | reading
1 | 2011/09/01 22:30 | 1
1 | 2011/09/01 23:00 | 0
1 | 2011/09/02 03:30 | 1
1 | 2011/09/02 03:31 | 0
I'm using 1 for on and 0 for off in the reading field.
What I'm trying to accomplish is to create a PUMP that writes the number of seconds a device has been on during every 5-minute window to another stream, looking like:
device_id | timestamp | reading
1 | 2011/09/01 22:35 | 300
1 | 2011/09/01 22:40 | 300
1 | 2011/09/01 22:45 | 300
1 | 2011/09/01 22:50 | 300
1 | 2011/09/01 22:55 | 300
1 | 2011/09/01 23:00 | 300
1 | 2011/09/01 23:05 | 0
1 | 2011/09/01 23:10 | 0
...
Not sure if this is something that can be accomplished with Kinesis Analytics. I can do it by querying a SQL table, but I'm stuck on the fact that this is streaming data.
This is possible with Drools Kinesis Analytics (a managed service on Amazon):
Types:
package com.text;
import java.util.Deque;
declare EventA
#role( event )
id: int;
timestamp: long;
on: boolean;
//not part of the message
seen: boolean;
end
declare Session
id: int #key;
events: Deque;
end
declare Report
id: int #key;
timestamp: long #key;
onInLast5Mins: int;
end
Rules:
package com.text;
import java.util.Deque;
import java.util.ArrayDeque;
declare enum Constants
// 20 seconds - faster to test
WINDOW_SIZE(20*1000);
value: int;
end
rule "Reporter"
// 20 seconds - faster to test
timer(cron:0/20 * * ? * * *)
when
$s: Session()
then
long now = System.currentTimeMillis();
int on = 0; //how long was on
int off = 0; //how long was off
int toPersist = 0; //last interesting event
for (EventA a : (Deque<EventA>)$s.getEvents()) {
toPersist ++;
boolean stop = false;
// time elapsed since the reading till now
int delta = (int)(now - a.getTimestamp());
if (delta >= Constants.WINDOW_SIZE.getValue()) {
delta = Constants.WINDOW_SIZE.getValue();
stop = true;
}
// remove time already counted
delta -= (on+off);
if (a.isOn())
on += delta;
else
off += delta;
if (stop)
break;
}
int toRemove = $s.getEvents().size() - toPersist;
while (toRemove > 0) {
// this event is out of window of interest - delete
delete($s.getEvents().removeLast());
toRemove --;
}
insertLogical(new Report($s.getId(), now, on));
end
rule "SessionCreate"
when
// for every new EventA
EventA(!seen, $id: id) from entry-point events
// check there is no session
not (exists(Session(id == $id)))
then
insert(new Session($id, new ArrayDeque()));
end
rule "SessionJoin"
when
// for every new EventA
$a : EventA(!seen) from entry-point events
// get event's session
$g: Session(id == $a.id)
then
$g.getEvents().push($a);
modify($a) {
setSeen(true),
setTimestamp(System.currentTimeMillis())
};
end
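Setting the rules engine aside, the windowed on-time computation itself is just interval arithmetic over the on/off history. A minimal Java sketch of that core calculation (all names are illustrative; timestamps are in epoch seconds and events are assumed sorted):

```java
import java.util.List;

public class OnTimeWindow {
    // One on/off reading: timestamp in epoch seconds, on == true for reading 1.
    record Event(long ts, boolean on) {}

    // Seconds the device was on inside [windowStart, windowEnd),
    // given the full sorted event history for the device.
    static long secondsOn(List<Event> events, long windowStart, long windowEnd) {
        long total = 0;
        boolean on = false;     // assumed off before the first event
        long since = windowStart;
        for (Event e : events) {
            // Clamp the event time to the window boundaries.
            long t = Math.max(windowStart, Math.min(e.ts, windowEnd));
            if (on) total += t - since;
            on = e.on;
            since = t;
        }
        if (on) total += windowEnd - since; // still on at window end
        return total;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event(0, true),     // on at t=0
            new Event(1800, false)  // off 30 minutes later
        );
        // A 5-minute window fully inside the on period: 300 seconds on.
        System.out.println(secondsOn(events, 300, 600));   // 300
        // A window after the device turned off: 0 seconds on.
        System.out.println(secondsOn(events, 1800, 2100)); // 0
    }
}
```

A streaming job would evaluate something like secondsOn once per device at each 5-minute window boundary, pruning events that fall entirely before the window.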
You can do this using SQL with the Stride HTTP API. You can chain together networks of continuous SQL queries and subscribe to streams of changes, as well as fire realtime webhooks if you want to take some kind of arbitrary action when this happens. See the Stride API docs for more info on this.

Siddhi query for to calculate a new value using current event's value and last event's value

Whenever an event arrives, there needs to be a query that calculates a new value using the current event's value and the last event's value and inserts it into a new stream.
For an example:
event [1] : speed = 0 timestamp = 1410513924817 Calculated value(Acceleration) : 0
event [2] : speed = 5 timestamp = 1410513924818 Calculated value(Acceleration) : ( 5 - 0)/1 = 5
event [3] : speed = 10 timestamp = 1410513924819 Calculated value(Acceleration) : (10- 5)/1 = 5
event [4] : speed = 13 timestamp = 1410513924820 Calculated value(Acceleration) : (13-10)/1 = 3
event [5] : speed = 14 timestamp = 1410513924821 Calculated value(Acceleration) : (14-13)/1 = 1
event [6](current) : speed = 15 timestamp = 1410513924822 Calculated value(Acceleration) : (15-14)/1 = 1
Using #window.lengthBatch(2) only calculates the acceleration once per two events, which does not fulfil the requirement.
Any thoughts?
You can use a sequence, e.g.:
from a=SpeedStream,b=SpeedStream
select b.speed-a.speed as acceleration, b.speed as currentSpeed
insert into AccelerationStream
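For intuition, the pairwise computation that the sequence performs (each event joined with its predecessor) can be sketched in plain Java. Names are illustrative; each reading is a (speed, timestamp-in-ms) pair:

```java
import java.util.ArrayList;
import java.util.List;

public class Acceleration {
    // Compute acceleration between consecutive (speed, timestampMillis) readings,
    // mirroring what the a=SpeedStream, b=SpeedStream sequence emits per event pair.
    static List<Double> accelerations(List<double[]> readings) {
        List<Double> out = new ArrayList<>();
        for (int i = 1; i < readings.size(); i++) {
            double dv = readings.get(i)[0] - readings.get(i - 1)[0]; // speed delta
            double dt = readings.get(i)[1] - readings.get(i - 1)[1]; // time delta (ms)
            out.add(dv / dt);
        }
        return out;
    }

    public static void main(String[] args) {
        List<double[]> readings = List.of(
            new double[]{0, 1410513924817.0},
            new double[]{5, 1410513924818.0},
            new double[]{10, 1410513924819.0},
            new double[]{13, 1410513924820.0});
        System.out.println(accelerations(readings)); // [5.0, 5.0, 3.0]
    }
}
```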

Compare regular select and prepared select performance

I am trying to use a prepared select to get data from MySQL, because I think it is faster than a regular select.
This is the select syntax:
char *sql = "select id,d1,d2,d3,d4,d5 from pricelist where d1 > ? limit 1000000";
where id, d2, and d3 are of type unsigned int and the others are __int64.
I wrote my prepared-statement code like below:
stmt = mysql_stmt_init(conn);
mysql_stmt_prepare(stmt, sql, strlen(sql));
// Select
param[0].buffer_type = MYSQL_TYPE_LONG;
param[0].buffer = (void *) &myId;
param[0].is_unsigned = 1;
param[0].is_null = 0;
param[0].length = 0;
// Result
result[0].buffer_type = MYSQL_TYPE_LONG;
result[0].buffer = (void *) &id;
result[0].is_unsigned = 1;
result[0].is_null = &is_null[0];
result[0].length = 0;
result[1].buffer_type = MYSQL_TYPE_LONGLONG;
result[1].buffer = (void *) &d1;
result[1].is_unsigned = 1;
result[1].is_null = &is_null[1];
result[1].length = 0;
result[2].buffer_type = MYSQL_TYPE_LONG;
result[2].buffer = (void *) &d2;
result[2].is_unsigned = 1;
result[2].is_null = &is_null[2];
result[2].length = 0;
result[3].buffer_type = MYSQL_TYPE_LONG;
result[3].buffer = (void *) &d3;
result[3].is_unsigned = 1;
result[3].is_null = &is_null[3];
result[3].length = 0;
result[4].buffer_type = MYSQL_TYPE_LONGLONG;
result[4].buffer = (void *) &d4;
result[4].is_unsigned = 1;
result[4].is_null = &is_null[4];
result[4].length = 0;
result[5].buffer_type = MYSQL_TYPE_LONGLONG;
result[5].buffer = (void *) &d5;
result[5].is_unsigned = 1;
result[5].is_null = &is_null[5];
result[5].length = 0;
mysql_stmt_bind_param(stmt, param);
mysql_stmt_bind_result(stmt, result);
mysql_stmt_execute(stmt);
mysql_stmt_store_result(stmt);
while(mysql_stmt_fetch (stmt) == 0){
}
And my code for the regular select is below:
mysql_query(conn,"select id ,d1,d2,d3,d4,d5 from pricebook where us > 12 limit 1000000")
result = mysql_use_result(conn);
while (mysql_fetch_row(result)){
}
I run these two functions from a remote PC and measure the time for each one; the duration for both of them is the same, about 6 seconds.
And when I check the pcap files, I see that the volume sent for the prepared query is about the same as for the regular query, even though I expected the prepared (binary) protocol to transfer more compact data.
$ capinfos prepared.pcap regular.pcap
File name: prepared.pcap
File type: Wireshark - pcapng
File encapsulation: Ethernet
Packet size limit: file hdr: (not set)
Number of packets: 40 k
File size: 53 MB
Data size: 52 MB
Capture duration: 6 seconds
Start time: Thu Aug 22 09:41:54 2013
End time: Thu Aug 22 09:42:00 2013
Data byte rate: 8820 kBps
Data bit rate: 70 Mbps
Average packet size: 1278.63 bytes
Average packet rate: 6898 packets/sec
SHA1: 959e589b090e3354d275f122a6fe6fbcac2351df
RIPEMD160: 7db6a437535d78023579cf3426c4d88d8ff3ddc3
MD5: 888729dc4c09baf736df22ef34bffeda
Strict time order: True
File name: regular.pcap
File type: Wireshark - pcapng
File encapsulation: Ethernet
Packet size limit: file hdr: (not set)
Number of packets: 38 k
File size: 50 MB
Data size: 49 MB
Capture duration: 6 seconds
Start time: Thu Aug 22 09:41:05 2013
End time: Thu Aug 22 09:41:11 2013
Data byte rate: 7740 kBps
Data bit rate: 61 Mbps
Average packet size: 1268.65 bytes
Average packet rate: 6101 packets/sec
SHA1: badf2040d826e6b0cca089211ee559a7c8a29181
RIPEMD160: 68f3bb5d4fcfd640f2da9764ff8e9891745d4800
MD5: 4ab73a02889472dfe04ed7901976a48c
Strict time order: True
Is it expected that the duration is the same, or am I not using the prepared select correctly?
How can I improve it?
Thanks.
The database server executes prepared statements and regular statements with the same speed. The performance difference comes when you execute the same query with different parameters: a prepared statement is parsed and prepared for execution once and then can be executed cheaply with different parameters, while a regular statement has to be parsed every time you want to execute it.
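The parse-once/execute-many idea can be illustrated outside SQL. This is only an analogy, not MySQL code: a regex Pattern is "prepared" once and then executed many times, just as a prepared statement is parsed once and re-executed with new inputs:

```java
import java.util.regex.Pattern;

public class PrepareOnce {
    public static void main(String[] args) {
        String[] inputs = {"id=1", "id=2", "id=3"};

        // Like a regular statement: the template is re-parsed on every use.
        int slow = 0;
        for (String s : inputs) {
            if (Pattern.compile("id=(\\d+)").matcher(s).matches()) slow++;
        }

        // Like a prepared statement: parse once, execute many times cheaply.
        Pattern prepared = Pattern.compile("id=(\\d+)");
        int fast = 0;
        for (String s : inputs) {
            if (prepared.matcher(s).matches()) fast++;
        }

        System.out.println(slow + " " + fast); // 3 3 -- same answer, less parsing
    }
}
```

In the question's benchmark the statement is executed only once, so the parse is paid once either way, and the transfer of a million rows dominates both runs.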

Trouble getting scala actors to work over a range in parallel

For a homework assignment in a computer networking software-development class, the prof has us building a port scanner for ports 1-1024 to be run against the local host. The point of the exercise is to demonstrate task-level parallelism using actors. The prof provided code that scans each port in sequence. We are to create a version that does this in parallel, with an actor for each processor or hyperthread available to the system. The goal is to measure the time to complete a full scan of ports 1-1024 and compare the results of a parallel scan against a serial scan. Here's my code for the parallel scan:
import java.net.Socket
import scala.actors._
import Actor._
import scala.collection.mutable.ArrayBuffer
object LowPortScanner {
var lastPort = 0
var openPorts = ArrayBuffer[Int]()
var longestRunTime = 00.00
var results = List[Tuple3[Int, Range, Double]]()
val host = "localhost"
val numProcs = 1 to Runtime.getRuntime().availableProcessors()
val portsPerProc = 1024 / numProcs.size
val caller = self
def main(args: Array[String]): Unit = {
//spawn an actor for each processor that scans a given port range
numProcs.foreach { proc =>
actor {
val portRange: Range = (lastPort + 1) to (lastPort + portsPerProc)
lastPort = lastPort + portsPerProc
caller ! scan(proc, portRange)
}
}
//catch results from the processor actors above
def act {
loop {
reactWithin(100) {
//update the list of results returned from scan
case scanResult: Tuple3[Int, Range, Double] =>
results = results ::: List(scanResult)
//check if all results have been returned for each actor
case TIMEOUT =>
if (results.size == numProcs.size) wrapUp
case _ =>
println("got back something weird from one of the port scan actors!")
wrapUp
}
}
}
//Attempt to open a socket on each port in the given range
//returns a Tuple3[procID: Int, ports: Range, time: Double]
def scan(proc: Int, ports: Range): (Int, Range, Double) = {
val startTime = System.nanoTime()
ports.foreach { n =>
try {
println("Processor " + proc + "is checking port " + n)
val socket = new Socket(host, n)
//println("Found open port: " + n)
openPorts += n
socket.close
} catch {
case e: Exception =>
//println("While scanning port " + n + " caught Exception: " + e)
}
}
(proc, ports, (System.nanoTime() - startTime).toDouble)
}
//output results and kill the main actor
def wrapUp {
println("These are the open ports in the range 1-1024:")
openPorts.foreach { port => println(port) }
results.foreach { result => if (result._3 > longestRunTime) { longestRunTime = result._3} }
println("Time to scan ports 1 through 1024 is: %3.3f".format(longestRunTime / 1000))
caller ! exit
}
}
}
I have a quad-core i7, so my numProcs = 8. On this hardware platform, each proc actor should scan 128 ports (1024/8 = 128). My intention is for the proc1 actor to scan ports 1-128, proc2 to scan 129-256, etc. However, this isn't what's happening. Some of the actors end up working on the same range as other actors. The output sample below illustrates the issue:
Processor 2 is checking port 1
Processor 7 is checking port 385
Processor 1 is checking port 1
Processor 5 is checking port 1
Processor 4 is checking port 1
Processor 8 is checking port 129
Processor 3 is checking port 1
Processor 6 is checking port 257
Processor 1 is checking port 2
Processor 5 is checking port 2
Processor 1 is checking port 3
Processor 3 is checking port 2
Processor 5 is checking port 3
Processor 1 is checking port 4
EDIT
Final "working" code:
import java.net.Socket
import scala.actors._
import Actor._
import scala.collection.mutable.ArrayBuffer
object LowPortScanner {
var lastPort = 0
var openPorts = ArrayBuffer[Int]()
var longestRunTime = 00.00
var results = List[Tuple3[Int, Range, Double]]()
val host = "localhost"
val numProcs = 1 to Runtime.getRuntime().availableProcessors()
val portsPerProc = 1024 / numProcs.size
val caller = self
val procPortRanges = numProcs.foldLeft(List[Tuple2[Int, Range]]()) { (portRanges, proc) =>
val tuple2 = (proc.toInt, (lastPort + 1) to (lastPort + portsPerProc))
lastPort += portsPerProc
tuple2 :: portRanges
}
def main(args: Array[String]): Unit = {
//spawn an actor for each processor that scans a given port range
procPortRanges.foreach { proc =>
actor {
caller ! scan(proc._1, proc._2)
}
}
//catch results from the processor actors above
def act {
loop {
reactWithin(100) {
//update the list of results returned from scan
case scanResult: Tuple3[Int, Range, Double] =>
results = results ::: List(scanResult)
//check if results have been returned for each actor
case TIMEOUT =>
if (results.size == numProcs.size) wrapUp
case _ =>
println("got back something weird from one of the port scan actors!")
wrapUp
}
}
}
//Attempt to open a socket on each port in the given range
//returns a Tuple3[procID: Int, ports: Range, time: Double]
def scan(proc: Int, ports: Range): (Int, Range, Double) = {
val startTime = System.nanoTime()
ports.foreach { n =>
try {
println("Processor " + proc + "is checking port " + n)
val socket = new Socket(host, n)
//println("Found open port: " + n)
openPorts += n
socket.close
} catch {
case e: Exception =>
//println("While scanning port " + n + " caught Exception: " + e)
}
}
(proc, ports, (System.nanoTime() - startTime).toDouble)
}
//output results and kill the main actor
def wrapUp {
println("These are the open ports in the range 1-1024:")
openPorts.foreach { port => println(port) }
results.foreach { result => if (result._3 > longestRunTime) { longestRunTime = result._3} }
println("Time to scan ports 1 through 1024 is: %3.3f".format(longestRunTime / 1000))
caller ! exit
}
}
}
On this hardware platform, each proc actor should scan 128 ports (1024/8 = 128).
Except you have
val portsPerProc = numProcs.size / 1024
and 8/1024 is 0. Note that you also have an off-by-one error which causes every actor to scan 1 more port than portsPerProc, it should scan either lastPort to (lastPort + portsPerProc) - 1 or (lastPort + 1) to (lastPort + portsPerProc).
For the future, if you have a different question, you should ask it separately :) But here you have a very obvious race condition: all actors are trying to execute
val portRange: Range = (lastPort + 1) to (lastPort + portsPerProc)
lastPort = lastPort + portsPerProc
concurrently. Think about what happens when (for example) actors 1 and 2 both execute the first line before either gets to the second one.
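The race, and the usual fix of making the range claim a single atomic step, can be demonstrated with plain Java threads. This is a sketch with illustrative names, not the assignment's Scala code:

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class RangeAssignment {
    public static void main(String[] args) throws InterruptedException {
        int workers = 8, perWorker = 128;

        // Fix for the race: claim each range atomically, so no two workers
        // can ever observe the same lastPort value.
        AtomicInteger lastPort = new AtomicInteger(0);
        int[] starts = new int[workers];
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            final int w = i;
            threads[w] = new Thread(() -> {
                // getAndAdd is one indivisible read-modify-write, unlike the
                // separate read and write of the shared var in the Scala code.
                starts[w] = lastPort.getAndAdd(perWorker) + 1;
            });
            threads[w].start();
        }
        for (Thread t : threads) t.join();

        Arrays.sort(starts);
        // Every worker claimed a distinct, non-overlapping 128-port range.
        System.out.println(Arrays.toString(starts));
        // [1, 129, 257, 385, 513, 641, 769, 897]
    }
}
```

The edited code in the question avoids the race differently, by computing all the ranges up front in a single-threaded foldLeft before any actor starts; either approach removes the concurrent read-modify-write on lastPort.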