Corda - problem while executing flow with multiple output states - blockchain

I'm trying to execute a Corda flow with 3000 output states (Java), but I get the following error:
[Thread-8 (ActiveMQ-IO-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$4#6a8da5c5)] impl.JournalImpl.run - appendAddRecord::java.lang.IllegalArgumentException: Record is too large to store 18603342 {}
java.lang.IllegalArgumentException: Record is too large to store 18603342
at org.apache.activemq.artemis.core.journal.impl.JournalImpl.switchFileIfNecessary(JournalImpl.java:2915) ~[artemis-journal-2.2.0.jar:2.2.0]
at org.apache.activemq.artemis.core.journal.impl.JournalImpl.appendRecord(JournalImpl.java:2640) ~[artemis-journal-2.2.0.jar:2.2.0]
at org.apache.activemq.artemis.core.journal.impl.JournalImpl.access$200(JournalImpl.java:88) ~[artemis-journal-2.2.0.jar:2.2.0]
at org.apache.activemq.artemis.core.journal.impl.JournalImpl$1.run(JournalImpl.java:778) [artemis-journal-2.2.0.jar:2.2.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:42) [artemis-commons-2.2.0.jar:2.2.0]
at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31) [artemis-commons-2.2.0.jar:2.2.0]
at org.apache.activemq.artemis.utils.actors.ProcessorBase$ExecutorTask.run(ProcessorBase.java:53) [artemis-commons-2.2.0.jar:2.2.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_181]
at java.lang.Thread.run(Unknown Source) [?:1.8.0_181]
To avoid this problem I split the execution of the flow into several steps, calling it n times (6 in this case) and processing 500 output states in each execution.
This solution works, but is there a better or more efficient way to solve this problem?
Thank you in advance.

This error indicates that a message you are trying to send exceeds the network's max message size.
As of Corda 3.x, this max message size is hardcoded to 10MB (10,485,760 bytes).
In a future version of Corda, the network operator will be able to configure the max message size for the network as part of the network parameters.
The purpose of setting a max message size is to prevent large nodes from bullying smaller nodes by forcing them to process excessively large messages.
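For reference, here is a rough Java sketch of the batching workaround described in the question: start the flow once per batch so that no single transaction (and hence no single Artemis message) has to carry all 3000 output states. The flow class and its constructor arguments are placeholders, not taken from the original code.

import net.corda.core.flows.FlowLogic;
import net.corda.core.identity.Party;
import net.corda.core.messaging.CordaRPCOps;

public class BatchedFlowRunner {
    // Starts the given flow once per batch of output states. The flow class is
    // assumed (hypothetically) to take (int numberOfStates, Party counterparty)
    // as constructor arguments; adapt the varargs to your own flow.
    public static <T> void runInBatches(CordaRPCOps rpc,
                                        Class<? extends FlowLogic<T>> flowClass,
                                        Party counterparty,
                                        int totalStates,
                                        int batchSize) throws Exception {
        for (int start = 0; start < totalStates; start += batchSize) {
            int count = Math.min(batchSize, totalStates - start);
            rpc.startFlowDynamic(flowClass, count, counterparty)
               .getReturnValue()
               .get(); // wait for each batch to finalise before starting the next
        }
    }
}

With the figures from the question this would be called with totalStates = 3000 and batchSize = 500, i.e. six flow executions of 500 output states each.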

Related

Collect one cell from pyspark Dataframe failed [duplicate]

I get the following error when I add --conf spark.driver.maxResultSize=2050 to my spark-submit command.
17/12/27 18:33:19 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from /XXX.XX.XXX.XX:36245 is closed
17/12/27 18:33:19 WARN Executor: Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92)
at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:726)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply$mcV$sp(Executor.scala:755)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:755)
at org.apache.spark.executor.Executor$$anon$2$$anonfun$run$1.apply(Executor.scala:755)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1954)
at org.apache.spark.executor.Executor$$anon$2.run(Executor.scala:755)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Connection from /XXX.XX.XXX.XX:36245 closed
at org.apache.spark.network.client.TransportResponseHandler.channelInactive(TransportResponseHandler.java:146)
The reason for adding this configuration was the error:
py4j.protocol.Py4JJavaError: An error occurred while calling o171.collectToPython.
: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 16 tasks (1048.5 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
Therefore, I increased maxResultSize to 2.5 GB, but the Spark job fails anyway (with the error shown above).
How can I solve this issue?
It seems like the problem is that the amount of data you are trying to pull back to your driver is too large. Most likely you are using the collect method to retrieve all values from a DataFrame/RDD. The driver is a single process, and by collecting a DataFrame you are pulling all of the data you had distributed across the cluster back to one node. This defeats the purpose of distributing it! It only makes sense to do this after you have reduced the data down to a manageable amount.
You have two options:
If you really need to work with all that data, then you should keep it out on the executors. Use HDFS and Parquet to save the data in a distributed manner and use Spark methods to work with the data on the cluster instead of trying to collect it all back to one place.
If you really need to get the data back to the driver, you should examine whether you really need ALL of the data or not. If you only need summary statistics, then compute them on the executors before calling collect. Or if you only need the top 100 results, then only collect the top 100 (see the sketch below).
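As an illustration, here is a minimal Java sketch of that second option: reduce the data on the cluster first and only collect the small result. The input path and column names are placeholders, not taken from the original question.

import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;
import static org.apache.spark.sql.functions.max;

import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ReduceBeforeCollect {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder().appName("reduce-before-collect").getOrCreate();

        // Placeholder: a large, distributed DataFrame.
        Dataset<Row> df = spark.read().parquet("hdfs:///data/events");

        // Summary statistics are computed on the executors; only a single row
        // travels back to the driver.
        List<Row> stats = df.agg(count("*"), avg("value"), max("value")).collectAsList();

        // Likewise, ship only the top 100 rows instead of the whole DataFrame.
        List<Row> top100 = df.orderBy(col("value").desc()).limit(100).collectAsList();

        System.out.println(stats);
        System.out.println(top100.size());

        spark.stop();
    }
}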
Update:
There is another, less obvious reason you can run into this error. Spark will try to send data back to the driver beyond just when you explicitly call collect. It will also send back accumulator results for each task if you are using accumulators, data for broadcast joins, and some small status data about each task. If you have LOTS of partitions (20k+ in my experience) you can sometimes see this error. This is a known issue, with some improvements made and more in the works.
The options for getting past it, if this is your issue, are (a configuration sketch follows the list):
Increase spark.driver.maxResultSize or set it to 0 for unlimited
If broadcast joins are the culprit, you can reduce spark.sql.autoBroadcastJoinThreshold to limit the size of broadcast join data
Reduce the number of partitions
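Here is a minimal Java sketch of how the first two options might be applied in a SparkSession-based application; the values shown are arbitrary examples, not recommendations.

import org.apache.spark.sql.SparkSession;

public class DriverResultSizeConfig {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("driver-result-size-demo")
                // Raise the cap on serialized results sent back to the driver
                // ("0" would mean unlimited).
                .config("spark.driver.maxResultSize", "4g")
                // Shrink the broadcast-join threshold (or set it to -1 to disable
                // broadcast joins) so less broadcast data flows through the driver.
                .config("spark.sql.autoBroadcastJoinThreshold", 1 * 1024 * 1024)
                .getOrCreate();

        // ... build and run the job; repartition/coalesce can reduce the number of
        // partitions and hence the per-task status data sent back to the driver ...

        spark.stop();
    }
}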
Cause: actions such as RDD's collect() that send a large chunk of data back to the driver.
Solution: increase spark.driver.maxResultSize in one of the following ways:
via SparkConf: conf.set("spark.driver.maxResultSize", "4g")
OR
via spark-defaults.conf: spark.driver.maxResultSize 4g
OR
via spark-submit: --conf spark.driver.maxResultSize=4g

How to solve stability problems in Google Dataflow

I have a Dataflow job that had been running stably for several months.
For the last 3 days or so I've had problems with the job: it gets stuck after a certain amount of time, and the only thing I can do is stop the job and start a new one. This happened after 2, 6 and 24 hours of processing. Here is the latest exception:
java.lang.ExceptionInInitializerError
at org.apache.beam.runners.dataflow.worker.options.StreamingDataflowWorkerOptions$WindmillServerStubFactory.create (StreamingDataflowWorkerOptions.java:183)
at org.apache.beam.runners.dataflow.worker.options.StreamingDataflowWorkerOptions$WindmillServerStubFactory.create (StreamingDataflowWorkerOptions.java:169)
at org.apache.beam.sdk.options.ProxyInvocationHandler.returnDefaultHelper (ProxyInvocationHandler.java:592)
at org.apache.beam.sdk.options.ProxyInvocationHandler.getDefault (ProxyInvocationHandler.java:533)
at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke (ProxyInvocationHandler.java:158)
at com.sun.proxy.$Proxy54.getWindmillServerStub (Unknown Source)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.<init> (StreamingDataflowWorker.java:677)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.fromDataflowWorkerHarnessOptions (StreamingDataflowWorker.java:562)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.main (StreamingDataflowWorker.java:274)
Caused by: java.lang.RuntimeException: Loading windmill_service failed:
at org.apache.beam.runners.dataflow.worker.windmill.WindmillServer.<clinit> (WindmillServer.java:42)
Caused by: java.io.IOException: No space left on device
at sun.nio.ch.FileDispatcherImpl.write0 (Native Method)
at sun.nio.ch.FileDispatcherImpl.write (FileDispatcherImpl.java:60)
at sun.nio.ch.IOUtil.writeFromNativeBuffer (IOUtil.java:93)
at sun.nio.ch.IOUtil.write (IOUtil.java:65)
at sun.nio.ch.FileChannelImpl.write (FileChannelImpl.java:211)
at java.nio.channels.Channels.writeFullyImpl (Channels.java:78)
at java.nio.channels.Channels.writeFully (Channels.java:101)
at java.nio.channels.Channels.access$000 (Channels.java:61)
at java.nio.channels.Channels$1.write (Channels.java:174)
at java.nio.file.Files.copy (Files.java:2909)
at java.nio.file.Files.copy (Files.java:3027)
at org.apache.beam.runners.dataflow.worker.windmill.WindmillServer.<clinit> (WindmillServer.java:39)
It seems like there is no space left on a device, but shouldn't this be managed by Google? Or is this somehow caused by an error in my job?
UPDATE:
The workflow is as follows:
Reading mass data from PubSub (up to 1,500 messages/s)
Filtering some messages
Keying the data and grouping it by key within a session window
Sorting the data and doing calculations
Outputting the data to another PubSub topic
You can increase the storage capacity through a parameter of your pipeline. Look at diskSizeGb on this page.
In addition, the more data you keep in memory, the more memory you need. This is the case for windows: if you never close them, or if you allow late data for too long, you need a lot of memory to keep all of that data around.
Tune either your pipeline, or your machine type. Or both!
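As a minimal Java sketch of both kinds of tuning, assuming a Beam pipeline run on the Dataflow runner; the disk size, session gap and allowed lateness below are arbitrary example values, not recommendations.

import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.windowing.Sessions;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.joda.time.Duration;

public class TuningSketch {
    public static void main(String[] args) {
        DataflowPipelineOptions options =
                PipelineOptionsFactory.fromArgs(args).as(DataflowPipelineOptions.class);
        // Give each worker a larger persistent disk (same as passing --diskSizeGb=100).
        options.setDiskSizeGb(100);

        // Bound how long late data is accepted so the runner does not have to keep
        // session-window state around indefinitely.
        Window<String> windowing = Window.<String>into(
                        Sessions.withGapDuration(Duration.standardMinutes(10)))
                .withAllowedLateness(Duration.standardMinutes(5));

        // Pipeline p = Pipeline.create(options);
        // ... read from PubSub, apply `windowing`, group, sort, write to PubSub ...
    }
}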

Wowza Transcoder Add-on throws ArrayIndexOutOfBoundsException

I'm running Wowza 3.6.2 on Windows 8.1 (64-bit) and have enabled the Transcoder add-on. I'm using the transcoder to take JPEG snapshots from the live stream. I've built a custom HTTPProvider, similar to what is described here. This works fine and I can get JPEG snapshots from the stream through my HTTPProvider.
The problem is that since I enabled the transcoder, I get irritating error messages in my server log on onPublish and onUnPublish of any stream.
As a transcoder template I used the default transrate.xml that comes with the Wowza installation, without any modifications made to it.
When I publish to a stream asdf I get errors similar to this:
ERROR server comment - TranscoderSessionDestination.init[livereceiver/_definst_/asdf]: [asdf_160p]:java.lang.ArrayIndexOutOfBoundsException: 1
java.lang.ArrayIndexOutOfBoundsException: 1
at com.foo.wms.module.IncomingStreamEventHandler.getQueryStringMap(IncomingStreamEventHandler.java:191)
at com.foo.wms.module.IncomingStreamEventHandler.onPublish(IncomingStreamEventHandler.java:83)
at com.wowza.wms.stream.MediaStream.notifyActionPublish(Unknown Source)
at com.wowza.wms.stream.publish.Publisher.publish(Unknown Source)
at com.wowza.wms.stream.publish.Publisher.publish(Unknown Source)
at com.wowza.wms.transcoder.model.TranscoderSessionDestination.init(Unknown Source)
at com.wowza.wms.transcoder.model.TranscoderSession.a(Unknown Source)
at com.wowza.wms.transcoder.model.TranscoderSession.handleOnMetadata(Unknown Source)
at com.wowza.wms.transcoder.model.LiveStreamTranscoder.handleOnMetadata(Unknown Source)
at com.wowza.wms.stream.live.LiveStreamTranscoderRunner.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
And when I unpublish the stream I get this:
ERROR server comment - TranscoderSessionDestination.shutdown: [asdf_160p]:java.lang.NullPointerException
java.lang.NullPointerException
at com.foo.wms.module.IncomingStreamEventHandler.onUnPublish(IncomingStreamEventHandler.java:166)
at com.wowza.wms.stream.MediaStream.notifyActionUnPublish(Unknown Source)
at com.wowza.wms.stream.publish.Publisher.publish(Unknown Source)
at com.wowza.wms.stream.publish.Publisher.unpublish(Unknown Source)
at com.wowza.wms.transcoder.model.TranscoderSessionDestination.shutdown(Unknown Source)
at com.wowza.wms.transcoder.model.TranscoderSession.c(Unknown Source)
at com.wowza.wms.transcoder.model.TranscoderSession.shutdown(Unknown Source)
at com.wowza.wms.transcoder.model.LiveStreamTranscoder.shutdown(Unknown Source)
at com.wowza.wms.stream.live.LiveStreamTranscoderRunner.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
I get three of each exception when I publish/unpublish (one for each Encode block that is enabled in the transrate.xml file).
Does anyone have an idea on what might be causing this?
Here is a better way to determine whether the stream is a transcoded stream and not a source stream:
if(stream.isTranscodeResult()) return;
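As a rough sketch of where that guard would go, here is how the listener's onPublish/onUnPublish might look. The class body is a placeholder, and the method signatures shown are the usual Wowza stream-action ones; check them against your own handler.

import com.wowza.wms.stream.IMediaStream;

// Placeholder for the stream listener that appears in the stack traces above.
public class IncomingStreamEventHandler {

    public void onPublish(IMediaStream stream, String streamName,
                          boolean isRecord, boolean isAppend) {
        // Ignore streams produced by the transcoder (e.g. asdf_160p); only the
        // source stream carries the query-string parameters we want to read.
        if (stream.isTranscodeResult())
            return;

        // ... original onPublish logic, e.g. parsing query-string parameters ...
    }

    public void onUnPublish(IMediaStream stream, String streamName,
                            boolean isRecord, boolean isAppend) {
        if (stream.isTranscodeResult())
            return;

        // ... original onUnPublish logic ...
    }
}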
After posting my question I had another look at the stack trace and realized what the problem was - I had been looking in the wrong direction the whole time. Since the problem appeared when I enabled the transcoder, I concluded that's where the problem must be. What I didn't realize was that onPublish and onUnPublish fire multiple times when you use the transcoder - once for the incoming stream, and once for every transcoded stream.
Within the onPublish and onUnPublish methods of my module I do things like reading query-string parameters, which are not present on the transcoded streams. That is why the exceptions are thrown when the onPublish and onUnPublish methods are called for the transcoded streams.
To remedy this, I added two lines at the beginning of the onPublish and onUnPublish methods:
if (streamName.contains("_"))
return;
A somewhat ugly solution, but I am in control of all stream names and do not allow underscore in them anyway, so in my case this works fine.
Update:
#flux has provided a much nicer solution for how to check if the stream is the result of a transcode operation. See his answer for more info.

Couchbase - ElasticSearch Java Heap memory

We have a Couchbase instance running on an Amazon Web Services server, and an Elasticsearch instance running on the same server.
The connection between the two of them works, and replication runs fine until...
Out of the blue, we got the following error log from Elasticsearch:
[2013-08-29 21:27:34,947][WARN ][cluster.metadata ] [01-Thor] failed to dynamically update the mapping in cluster_state from shard
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:343)
at org.elasticsearch.common.io.FastByteArrayOutputStream.write(FastByteArrayOutputStream.java:103)
at org.elasticsearch.common.jackson.core.json.UTF8JsonGenerator._flushBuffer(UTF8JsonGenerator.java:1848)
at org.elasticsearch.common.jackson.core.json.UTF8JsonGenerator.writeString(UTF8JsonGenerator.java:436)
at org.elasticsearch.common.xcontent.json.JsonXContentGenerator.writeString(JsonXContentGenerator.java:84)
at org.elasticsearch.common.xcontent.XContentBuilder.field(XContentBuilder.java:314)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.doXContentBody(AbstractFieldMapper.java:601)
at org.elasticsearch.index.mapper.core.NumberFieldMapper.doXContentBody(NumberFieldMapper.java:286)
at org.elasticsearch.index.mapper.core.LongFieldMapper.doXContentBody(LongFieldMapper.java:338)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.toXContent(AbstractFieldMapper.java:595)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:920)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:852)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:920)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:852)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:920)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:852)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:920)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:852)
at org.elasticsearch.index.mapper.object.ObjectMapper.toXContent(ObjectMapper.java:920)
at org.elasticsearch.index.mapper.DocumentMapper.toXContent(DocumentMapper.java:700)
at org.elasticsearch.index.mapper.DocumentMapper.refreshSource(DocumentMapper.java:682)
at org.elasticsearch.index.mapper.DocumentMapper.<init>(DocumentMapper.java:342)
at org.elasticsearch.index.mapper.DocumentMapper$Builder.build(DocumentMapper.java:224)
at org.elasticsearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:231)
at org.elasticsearch.index.mapper.MapperService.parse(MapperService.java:380)
at org.elasticsearch.index.mapper.MapperService.merge(MapperService.java:190)
at org.elasticsearch.cluster.metadata.MetaDataMappingService$2.execute(MetaDataMappingService.java:185)
at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:229)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:95)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
[2013-08-29 21:27:56,948][WARN ][indices.ttl ] [01-Thor] failed to execute ttl purge
java.lang.OutOfMemoryError: Java heap space
at org.apache.lucene.util.ByteBlockPool$Allocator.getByteBlock(ByteBlockPool.java:66)
at org.apache.lucene.util.ByteBlockPool.nextBuffer(ByteBlockPool.java:202)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:319)
at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:274)
at org.apache.lucene.search.ConstantScoreAutoRewrite$CutOffTermCollector.collect(ConstantScoreAutoRewrite.java:131)
at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:79)
at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:639)
at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:686)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
at org.elasticsearch.indices.ttl.IndicesTTLService.purgeShards(IndicesTTLService.java:186)
at org.elasticsearch.indices.ttl.IndicesTTLService.access$000(IndicesTTLService.java:65)
at org.elasticsearch.indices.ttl.IndicesTTLService$PurgerThread.run(IndicesTTLService.java:122)
[2013-08-29 21:29:23,919][WARN ][indices.ttl ] [01-Thor] failed to execute ttl purge
java.lang.OutOfMemoryError: Java heap space
We tried changing several memory values, but we can't seem to get it right.
Has anyone experienced the same issue?
A few troubleshooting tips:
It's generally smart to dedicate one AWS instance solely to Elasticsearch, for predictable performance and ease of debugging.
Monitor your memory usage using the Bigdesk plugin. This will show you whether your memory bottleneck really comes from Elasticsearch - it might come from the OS, from simultaneous heavy querying and indexing, or from something else unexpected.
Elasticsearch's Java heap should be set to around 50% of your box's total memory (for example via the ES_HEAP_SIZE environment variable).
This gist from Shay Banon offers several solutions to memory problems in Elasticsearch.

com.ctc.wstx.exc.WstxParsingException: Text size limit

I am sending a big attachment to a CXF webservice and I get the following exception:
Caused by: javax.xml.bind.UnmarshalException
- with linked exception:
[com.ctc.wstx.exc.WstxParsingException: Text size limit (134217728) exceeded
at [row,col {unknown-source}]: [1,134855131]]
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.handleStreamException(UnmarshallerImpl.java:426)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:362)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(UnmarshallerImpl.java:339)
at org.apache.cxf.jaxb.JAXBEncoderDecoder.doUnmarshal(JAXBEncoderDecoder.java:769)
at org.apache.cxf.jaxb.JAXBEncoderDecoder.access$100(JAXBEncoderDecoder.java:94)
at org.apache.cxf.jaxb.JAXBEncoderDecoder$1.run(JAXBEncoderDecoder.java:797)
at java.security.AccessController.doPrivileged(Native Method)
at org.apache.cxf.jaxb.JAXBEncoderDecoder.unmarshall(JAXBEncoderDecoder.java:795)
... 25 more
The issue seems to come from the Woodstox library, which says:
Text size limit (134217728) exceeded
Does someone know if it is possible to increase this limit? If yes, how?
If it's coming from Woodstox like that, then you aren't sending it as an MTOM attachment. My first suggestion would be to flip it to MTOM so it can be handled outside the XML parsing. That is much more efficient, as we can keep it as an InputStream or similar and not hold it in memory.
If you want to keep it in the XML, you can set the property "org.apache.cxf.stax.maxTextLength" to some larger value. Keep in mind that values coming from the StAX parser like this are held in memory as either a String or a byte[] and will thus consume memory.
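For reference, a rough Java sketch of one way that property might be set programmatically on the server side. The factory is assumed to be configured elsewhere (service class, address, and so on), the 256MB value is an arbitrary example (the 134217728 bytes in the error corresponds to 128MB), and depending on your setup the same entry may instead be supplied via Spring configuration or on the client's request context.

import java.util.HashMap;
import java.util.Map;

import org.apache.cxf.jaxws.JaxWsServerFactoryBean;

public class MaxTextLengthConfig {
    // Attach the StAX text-size property to a CXF server factory that is otherwise
    // configured elsewhere, raising Woodstox's text size limit for this endpoint.
    public static void raiseTextLimit(JaxWsServerFactoryBean factory) {
        Map<String, Object> properties = new HashMap<>();
        properties.put("org.apache.cxf.stax.maxTextLength", 256 * 1024 * 1024);
        factory.setProperties(properties);
    }
}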