Connection was closed in Ignite MapReduce tasks
Environment:
Ignite server:
CentOS 6.5 with kernel 2.6.32-431.el6.x86_64
Ignite version 1.9
Hadoop version 2.6.2
3 server nodes, each started with '-Xms16g -Xmx16g -server -XX:+AggressiveOpts -XX:MaxMetaspaceSize=256m'
I ran a test job with Ignite MapReduce. The job simply computes the average value for each person. The data looks like this:
Jack 0.35
Tom 0.78
Lily 0.92
Jack 0.28
Tom 0.18
...
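For reference, a minimal sketch of what such an averaging job looks like with the Hadoop mapreduce API (hypothetical class and variable names, not the author's actual code):
import java.io.IOException;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class AverageJob {

    // Emits (name, value) for each input line like "Jack 0.35"
    public static class AvgMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] parts = value.toString().trim().split("\\s+");
            if (parts.length == 2) {
                ctx.write(new Text(parts[0]), new DoubleWritable(Double.parseDouble(parts[1])));
            }
        }
    }

    // Averages all values seen for one name
    public static class AvgReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text key, Iterable<DoubleWritable> values, Context ctx)
                throws IOException, InterruptedException {
            double sum = 0;
            long count = 0;
            for (DoubleWritable v : values) {
                sum += v.get();
                count++;
            }
            if (count > 0) {
                ctx.write(key, new DoubleWritable(sum / count));
            }
        }
    }
}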
At first, I generated a data set of 100 million lines, about 2.53 GB. The job finished correctly in about 30 s. Then I generated a data set of 1 billion lines, about 25.3 GB. That job always failed with exceptions; I tried several times with the same result.
The Ignite server node threw the exception below:
[15:06:56,804][ERROR][sys-#2740%null%][GridTcpRestProtocol] Failed to process client request [ses=GridSelectorNioSessionImpl [worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=549 lim=549 cap=8192], super=AbstractNioClientWorker [selector=sun.nio.ch.EPollSelectorImpl#1cba0431, idx=3, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-rest-3, gridName=null, finished=false, isCancelled=false, hashCode=906881587, interrupted=false, runner=grid-nio-worker-tcp-rest-3-#50%null%]]], writeBuf=null, readBuf=null, inRecovery=null, outRecovery=null, super=GridNioSessionImpl [locAddr=/172.31.68.204:11211, rmtAddr=/172.31.68.202:39473, createTime=1493967985751, closeTime=1493968009502, bytesSent=2715, bytesRcvd=2641, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1493968016794, lastSndTime=1493967998303, lastRcvTime=1493968009502, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=GridTcpRestParser [jdkMarshaller=JdkMarshaller [], routerClient=false], directMode=false]], accepted=true]], msg=GridClientTaskRequest [taskName=o.a.i.i.processors.hadoop.proto.HadoopProtocolJobStatusTask, arg=HadoopProtocolTaskArguments []]]
class org.apache.ignite.IgniteCheckedException: Failed to send message (connection was closed): GridSelectorNioSessionImpl [worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=549 lim=549 cap=8192], super=AbstractNioClientWorker [selector=sun.nio.ch.EPollSelectorImpl#1cba0431, idx=3, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-rest-3, gridName=null, finished=false, isCancelled=false, hashCode=906881587, interrupted=false, runner=grid-nio-worker-tcp-rest-3-#50%null%]]], writeBuf=null, readBuf=null, inRecovery=null, outRecovery=null, super=GridNioSessionImpl [locAddr=/172.31.68.204:11211, rmtAddr=/172.31.68.202:39473, createTime=1493967985751, closeTime=1493968009502, bytesSent=2715, bytesRcvd=2641, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1493968016794, lastSndTime=1493967998303, lastRcvTime=1493968009502, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=GridTcpRestParser [jdkMarshaller=JdkMarshaller [], routerClient=false], directMode=false]], accepted=true]]
at org.apache.ignite.internal.util.IgniteUtils.cast(IgniteUtils.java:7239)
at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:170)
at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:119)
at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1$1.apply(GridTcpRestNioListener.java:264)
at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1$1.apply(GridTcpRestNioListener.java:261)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:271)
at org.apache.ignite.internal.util.future.GridFutureAdapter.listen(GridFutureAdapter.java:228)
at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1.apply(GridTcpRestNioListener.java:261)
at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1.apply(GridTcpRestNioListener.java:229)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:271)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:259)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:389)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:355)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:332)
at org.apache.ignite.internal.processors.rest.GridRestProcessor$2$1.apply(GridRestProcessor.java:158)
at org.apache.ignite.internal.processors.rest.GridRestProcessor$2$1.apply(GridRestProcessor.java:155)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:271)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:259)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:389)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:355)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:332)
at org.apache.ignite.internal.util.future.GridFutureChainListener.applyCallback(GridFutureChainListener.java:78)
at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:70)
at org.apache.ignite.internal.util.future.GridFutureChainListener.apply(GridFutureChainListener.java:30)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:271)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:259)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:389)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:355)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:332)
at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler$2.apply(GridTaskCommandHandler.java:294)
at org.apache.ignite.internal.processors.rest.handlers.task.GridTaskCommandHandler$2.apply(GridTaskCommandHandler.java:257)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:271)
at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListeners(GridFutureAdapter.java:259)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:389)
at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:355)
at org.apache.ignite.internal.processors.task.GridTaskWorker.finishTask(GridTaskWorker.java:1579)
at org.apache.ignite.internal.processors.task.GridTaskWorker.finishTask(GridTaskWorker.java:1547)
at org.apache.ignite.internal.processors.task.GridTaskWorker.reduce(GridTaskWorker.java:1157)
at org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:942)
at org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:996)
at org.apache.ignite.internal.processors.task.GridTaskProcessor$JobMessageListener.onMessage(GridTaskProcessor.java:1221)
at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1222)
at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:850)
at org.apache.ignite.internal.managers.communication.GridIoManager.access$2100(GridIoManager.java:108)
at org.apache.ignite.internal.managers.communication.GridIoManager$7.run(GridIoManager.java:790)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed to send message (connection was closed): GridSelectorNioSessionImpl [worker=ByteBufferNioClientWorker [readBuf=java.nio.HeapByteBuffer[pos=549 lim=549 cap=8192], super=AbstractNioClientWorker [selector=sun.nio.ch.EPollSelectorImpl#1cba0431, idx=3, bytesRcvd=0, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-rest-3, gridName=null, finished=false, isCancelled=false, hashCode=906881587, interrupted=false, runner=grid-nio-worker-tcp-rest-3-#50%null%]]], writeBuf=null, readBuf=null, inRecovery=null, outRecovery=null, super=GridNioSessionImpl [locAddr=/172.31.68.204:11211, rmtAddr=/172.31.68.202:39473, createTime=1493967985751, closeTime=1493968009502, bytesSent=2715, bytesRcvd=2641, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1493968016794, lastSndTime=1493967998303, lastRcvTime=1493968009502, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=GridTcpRestParser [jdkMarshaller=JdkMarshaller [], routerClient=false], directMode=false]], accepted=true]]
at org.apache.ignite.internal.util.nio.GridNioServer.send0(GridNioServer.java:554)
at org.apache.ignite.internal.util.nio.GridNioServer.send(GridNioServer.java:494)
at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionWrite(GridNioServer.java:3036)
at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:118)
at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionWrite(GridNioCodecFilter.java:94)
at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionWrite(GridNioFilterAdapter.java:118)
at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionWrite(GridNioFilterChain.java:264)
at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionWrite(GridNioFilterChain.java:189)
at org.apache.ignite.internal.util.nio.GridNioSessionImpl.send(GridNioSessionImpl.java:108)
at org.apache.ignite.internal.processors.rest.protocols.tcp.GridTcpRestNioListener$1.apply(GridTcpRestNioListener.java:258)
... 40 more
The job client threw the exception below:
java.io.IOException: Failed to get job status: job_1fbf9083-9a44-4be9-9199-695a97652dc2_0002
at org.apache.ignite.internal.processors.hadoop.impl.proto.HadoopClientProtocol.getJobStatus(HadoopClientProtocol.java:197)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:326)
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:323)
at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:611)
at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1357)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1318)
at com.tscloud.sdk.test.ignite.MRTest.run(MRTest.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at com.tscloud.sdk.test.ignite.MRTest.main(MRTest.java:53)
Caused by: class org.apache.ignite.internal.client.impl.connection.GridClientConnectionResetException: Failed to perform request (connection failed): /172.31.68.204:11211
at org.apache.ignite.internal.client.impl.connection.GridClientConnection.getCloseReasonAsException(GridClientConnection.java:491)
at org.apache.ignite.internal.client.impl.connection.GridClientNioTcpConnection.close(GridClientNioTcpConnection.java:339)
at org.apache.ignite.internal.client.impl.connection.GridClientNioTcpConnection.close(GridClientNioTcpConnection.java:299)
at org.apache.ignite.internal.client.impl.connection.GridClientConnectionManagerAdapter$NioListener.onDisconnected(GridClientConnectionManagerAdapter.java:630)
at org.apache.ignite.internal.util.nio.GridNioFilterChain$TailFilter.onSessionClosed(GridNioFilterChain.java:253)
at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClosed(GridNioFilterAdapter.java:93)
at org.apache.ignite.internal.util.nio.GridNioCodecFilter.onSessionClosed(GridNioCodecFilter.java:70)
at org.apache.ignite.internal.util.nio.GridNioFilterAdapter.proceedSessionClosed(GridNioFilterAdapter.java:93)
at org.apache.ignite.internal.util.nio.GridNioServer$HeadFilter.onSessionClosed(GridNioServer.java:3005)
at org.apache.ignite.internal.util.nio.GridNioFilterChain.onSessionClosed(GridNioFilterChain.java:147)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.close(GridNioServer.java:2306)
at org.apache.ignite.internal.util.nio.GridNioServer$ByteBufferNioClientWorker.processRead(GridNioServer.java:929)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2026)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:1863)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1568)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
at java.lang.Thread.run(Thread.java:745)
The job configuration is below:
// Use Ignite's Hadoop client protocol implementation as the MapReduce framework
Configuration configuration = new Configuration();
configuration.set(MRConfig.FRAMEWORK_NAME, IgniteHadoopClientProtocolProvider.FRAMEWORK_NAME);
// Ignite node that accepts job submissions (connector port 11211)
configuration.set(MRConfig.MASTER_ADDRESS, "172.31.68.202:11211");
// Route file system calls through IGFS
configuration.set("fs.igfs.impl", "org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem");
configuration.set("fs.default.name", "igfs://igfs@172.31.68.202/");
I checked node status after the job failed using ignitevisorcmd.sh. All server nodes were OK, but sometimes one of the server nodes was down. I do not know why it behaved like this.
Any help is appreciated.
Edit (2017-05-16):
I changed the Hadoop core-site.xml and added the hadoop.tmp.dir property as below:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data/hadoop/hadoop-2.6.2/tmp</value>
</property>
Then I reformatted HDFS and uploaded the 25.3 GB data file, and the job ran successfully. It turns out something was wrong with my HDFS; reformatting the namenode solved the problem.
Before the above steps, I had tried checking JVM heap usage with VisualVM.
[Screenshot: VisualVM monitor of one of the server nodes]
If your node was down temporarily, it is reasonable that a job would fail over to another node, or fail altogether if there are no other nodes. I would check that your network was not reset or down. Also, check for any software firewalls between your nodes (it is best to disable operating system firewalls).
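As a quick sanity check (my suggestion, not part of the original answer), you can probe the connector port of each node from every other host with a plain socket; the addresses below are the ones that appear in the logs above:
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {
    public static void main(String[] args) {
        // Node addresses and the connector port taken from the logs above
        String[] hosts = {"172.31.68.202", "172.31.68.204"};
        for (String host : hosts) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, 11211), 3000); // 3 s timeout
                System.out.println(host + ":11211 is reachable");
            } catch (Exception e) {
                System.out.println(host + ":11211 is NOT reachable: " + e);
            }
        }
    }
}
If this fails intermittently from some hosts, a flaky network or a firewall is the likely culprit rather than Ignite itself.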
Related
Migrating data from Teradata to BigQuery
I'm running Teradata Express on VMware Player. In order to migrate the data to BigQuery, I'm trying to follow the official documentation for the BigQuery transfer job: https://cloud.google.com/bigquery-transfer/docs/teradata-migration. I have downloaded the required jars and am trying to start the migration process after successful initialization. Below are the config file contents generated after successful initialization:
TDExpress1620_Sles11:~/Desktop # cat testpath/bq.config
{ "agent-id": "84f7cf04-2133-4c5a-ae43-dd22a30281ba", "transfer-configuration": { "project-id": "106730138445", "location": "us", "id": "5e1c7eda-0000-249c-aab6-94eb2c06245a" }, "source-type": "teradata", "console-log": false, "silent": false, "teradata-config": { "connection": { "host": "localhost" }, "local-processing-space": "testpath", "database-credentials-file-path": "", "max-local-storage": "200GB", "gcs-upload-chunk-size": "32MB", "use-tpt": false, "max-sessions": 0, "spool-mode": "NoSpool", "max-parallel-upload": 1, "max-parallel-extract-threads": 1, "session-charset": "UTF8", "max-unload-file-size": "2GB" }
After running the command to start the migration process, I come across the error below:
TDExpress1620_Sles11:~/Desktop # java -cp terajdbc4.jar:mirroring-agent.jar com.google.cloud.bigquery.dms.Agent --configuration-file=testpath/bq.config
Reading data from gs://data_transfer_agent/latest/version
WARNING: Failed to get the latest released agent version with error: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Exception in thread "main" com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:69) at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72) at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60) at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97) at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68) at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1349) at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:398) at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1024) at com.google.common.util.concurrent.AbstractFuture.addListener(AbstractFuture.java:670) at com.google.common.util.concurrent.ForwardingListenableFuture.addListener(ForwardingListenableFuture.java:45) at com.google.api.core.ApiFutureToListenableFuture.addListener(ApiFutureToListenableFuture.java:52) at com.google.common.util.concurrent.Futures.addCallback(Futures.java:1330) at com.google.api.core.ApiFutures.addCallback(ApiFutures.java:63) at com.google.api.gax.grpc.GrpcExceptionCallable.futureCall(GrpcExceptionCallable.java:67) at com.google.api.gax.rpc.AttemptCallable.call(AttemptCallable.java:86) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Suppressed: com.google.api.gax.rpc.AsyncTaskException: Asynchronous task failed at com.google.api.gax.rpc.ApiExceptions.callAndTranslateApiException(ApiExceptions.java:57) at com.google.api.gax.rpc.UnaryCallable.call(UnaryCallable.java:112) at com.google.cloud.bigquery.datatransfer.v1.DataTransferServiceClient.getTransferConfig(DataTransferServiceClient.java:741) at com.google.cloud.bigquery.datatransfer.v1.DataTransferServiceClient.getTransferConfig(DataTransferServiceClient.java:718) at com.google.cloud.bigquery.dms.gcloud.TransferServiceClient.getTransferConfig(TransferServiceClient.java:62) at com.google.cloud.bigquery.dms.gcloud.TransferServiceClient.getDestinationBucketName(TransferServiceClient.java:94) at com.google.cloud.bigquery.dms.Agent.main(Agent.java:235) Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception at io.grpc.Status.asRuntimeException(Status.java:533) at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:490) at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700) at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39) at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23) at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40) at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399) at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:500) at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:65) at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:592) at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:508) at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:632) at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) ... 
7 more Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: null: bigquerydatatransfer.googleapis.com/2404:6800:4009:810:0:0:0:200a:443 at io.grpc.netty.shaded.io.netty.channel.unix.Errors.throwConnectException(Errors.java:104) at io.grpc.netty.shaded.io.netty.channel.unix.Socket.connect(Socket.java:257) at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel.doConnect0(AbstractEpollChannel.java:730) at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel.doConnect(AbstractEpollChannel.java:715) at io.grpc.netty.shaded.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.connect(AbstractEpollChannel.java:557) at io.grpc.netty.shaded.io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1340) at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:532) at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:517) at io.grpc.netty.shaded.io.netty.channel.ChannelDuplexHandler.connect(ChannelDuplexHandler.java:50) at io.grpc.netty.shaded.io.grpc.netty.WriteBufferingAndExceptionHandler.connect(WriteBufferingAndExceptionHandler.java:136) at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:532) at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext.access$1000(AbstractChannelHandlerContext.java:38) at io.grpc.netty.shaded.io.netty.channel.AbstractChannelHandlerContext$9.run(AbstractChannelHandlerContext.java:522) at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404) at io.grpc.netty.shaded.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:333) at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905) at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ... 1 more Caused by: java.net.NoRouteToHostException ... 19 more Please let me know if somebody has successfully implemented the same or able to figure out the exact issue being faced by me.
Issues while configuring the API Subscription workflow in WSO2 BPS
So, I have WSO2 BPS 3.6.0 configured to support SSL and a custom hostname, i.e. mydomain.domain.com:9445, and I'm trying to implement the API Subscription Workflow by following this documentation. I've performed the following steps:
- set the offset of WSO2 BPS to 2; it is running fine on port 9445
- edited the wsa:Address tag in both SubscriptionService.epr and SubscriptionCallbackService.epr located in API-M_HOME/business-processes/epr, as the BPS server had a customized hostname instead of localhost (not sure if performing this step was right)
- copy-pasted the epr folder from API-M_HOME/business-processes/epr to BPS_HOME/repository/conf/epr
- added the required BPEL package and human task accordingly
- navigated to the carbon console from APIM and edited workflow-extensions.xml
- set the TaskCoordinationEnabled tag of b4p-cordination-config.xml, located in BPS_HOME\repository\conf, to true
Other required configurations at the API Manager end: the site.json file located at APIM_Home\repository\deployment\server\jaggeryapps\admin\site\conf:
{
  "theme": {
    "base": "wso2",
    "subtheme": "modern"
  },
  "context": "/admin",
  "request_url": "READ_FROM_REQUEST",
  "tasksPerPage": 10,
  "allowedPermission": "/permission/admin/manage/apim_admin",
  "workflows": {
    "workFlowServerURL": "https://mydomain.domain.com:9445/services/",
  },
  "ssoConfiguration": {
    "enabled": "false",
    "issuer": "API_WORKFLOW_ADMIN",
    "identityProviderURL": "https://localhost:9443/samlsso",
    "keyStorePassword": "",
    "identityAlias": "",
    "keyStoreName": "",
    "verifyAssertionValidityPeriod": "true",
    "audienceRestrictionsEnabled": "true",
    "responseSigningEnabled": "true",
    "assertionSigningEnabled": "true",
    "assertionEncryptionEnabled": "false",
    "idpInit" : "false",
    "idpInitSSOURL" : "https://localhost:9443/samlsso?spEntityID=API_WORKFLOW_ADMIN",
    "externalLogoutPage" : "https://localhost:9443/samlsso?slo=true"
  },
  "reverseProxy": {
    "enabled": false, // values true , false , "auto" - will look for X-Forwarded-* headers
    "host": "sample.proxydomain.com", // If reverse proxy do not have a domain name use IP
    "context": "" //"regContext":"" // Use only if different path is used for registry
  }
}
Also relevant: the workflow configuration in api-manager.xml (at the API Manager end) and carbon.xml (at the BPS end).
Issue: whenever a user navigates to the APIM Store and subscribes to any API, the subscription request is listed in the APIM Admin console. When I select APPROVE from the provided dropdown and click the COMPLETE button, the record vanishes.
However, this is the error that i see at WSO2's CMD windows: APIM's cmd window [2017-11-09 00:13:17,022] INFO - TimeoutHandler This engine will expire all cal lbacks after GLOBAL_TIMEOUT: 120 seconds, irrespective of the timeout action, af ter the specified or optional timeout [2017-11-09 00:13:17,164] ERROR - TargetHandler I/O error: Host name verificatio n failed for host : localhost javax.net.ssl.SSLException: Host name verification failed for host : localhost at org.apache.synapse.transport.http.conn.ClientSSLSetupHandler.verify(C lientSSLSetupHandler.java:171) at org.apache.http.nio.reactor.ssl.SSLIOSession.doHandshake(SSLIOSession .java:308) at org.apache.http.nio.reactor.ssl.SSLIOSession.isAppInputReady(SSLIOSes sion.java:410) at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(Abstra ctIODispatch.java:119) at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor .java:159) at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(Abstr actIOReactor.java:338) at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(Abst ractIOReactor.java:316) at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIO Reactor.java:277) at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor. java:105) at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker. run(AbstractMultiworkerIOReactor.java:586) at java.lang.Thread.run(Thread.java:745) [2017-11-09 00:13:17,188] WARN - EndpointContext Endpoint : AnonymousEndpoint w ith address https://localhost:9443/store/site/blocks/workflow/workflow-listener/ ajax/workflow-listener.jag will be marked SUSPENDED as it failed [2017-11-09 00:13:17,193] WARN - EndpointContext Suspending endpoint : Anonymou sEndpoint with address https://localhost:9443/store/site/blocks/workflow/workflo w-listener/ajax/workflow-listener.jag - current suspend duration is : 30000ms - Next retry after : Thu Nov 09 00:13:47 EST 2017 [2017-11-0900:13:17,201] INFO - LogMediator STATUS = Executing default 'fault' sequence, ERROR_CODE = 101500, ERROR_MESSAGE = Error in Sender [2017-11-09 00:14:17,238] INFO - SourceHandler Writer null when calling informW riterError [2017-11-09 00:14:17,238] WARN - SourceHandler Connection time out after reques t is read: http-incoming-1 Socket Timeout : 60000 Remote Address : /10.10.30.130 :49249 [2017-11-09 00:14:24,671] ERROR - AxisEngine The endpoint reference (EPR) for th e Operation not found is /services/WorkflowCallbackService and the WSA Action = null. If this EPR was previously reachable, please contact the server administra tor. org.apache.axis2.AxisFault: The endpoint reference (EPR) for the Operation not f ound is /services/WorkflowCallbackService and the WSA Action = null. If this EPR was previously reachable, please contact the server administrator. at org.apache.axis2.engine.DispatchPhase.checkPostConditions(DispatchPha se.java:102) at org.apache.axis2.engine.Phase.invoke(Phase.java:329) at org.apache.axis2.engine.AxisEngine.invoke(AxisEngine.java:261) at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:167) at org.apache.synapse.transport.passthru.ServerWorker.processNonEntityEn closingRESTHandler(ServerWorker.java:325) at org.apache.synapse.transport.passthru.ServerWorker.run(ServerWorker.j ava:158) at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(Native WorkerPool.java:172) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor. 
java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor .java:617) at java.lang.Thread.run(Thread.java:745) [2017-11-09 00:14:24,673] ERROR - ServerWorker Error processing GET request for : /services/WorkflowCallbackService org.apache.axis2.AxisFault: The endpoint reference (EPR) for the Operation not f ound is /services/WorkflowCallbackService and the WSA Action = null. If this EPR was previously reachable, please contact the server administrator. at org.apache.axis2.engine.DispatchPhase.checkPostConditions(DispatchPha se.java:102) at org.apache.axis2.engine.Phase.invoke(Phase.java:329) at org.apache.axis2.engine.AxisEngine.invoke(AxisEngine.java:261) at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:167) at org.apache.synapse.transport.passthru.ServerWorker.processNonEntityEn closingRESTHandler(ServerWorker.java:325) at org.apache.synapse.transport.passthru.ServerWorker.run(ServerWorker.j ava:158) at org.apache.axis2.transport.base.threads.NativeWorkerPool$1.run(Native WorkerPool.java:172) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor. java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor .java:617) at java.lang.Thread.run(Thread.java:745) BPS's cmd window: [2017-11-09 00:14:16,738] ERROR {org.wso2.carbon.bpel.core.ode.integration.Partn erService} - Error sending message to Axis2 for ODE mex {PartnerRoleMex#hqejbhc nphrcr2a32g83oh [PID {http://workflow.subscription.apimgt.carbon.wso2.org}Subscr iptionApprovalWorkFlowProcess-1] calling org.apache.ode.bpel.epr.WSAEndpoint#705 fc38f.resumeEvent(...) Status REQUEST} org.apache.axis2.AxisFault: Read timed out at org.apache.axis2.AxisFault.makeFault(AxisFault.java:430) at org.apache.axis2.transport.http.HTTPSender.sendViaPost(HTTPSender.jav a:199) at org.apache.axis2.transport.http.HTTPSender.send(HTTPSender.java:77) at org.apache.axis2.transport.http.CommonsHTTPTransportSender.writeMessa geWithCommons(CommonsHTTPTransportSender.java:451) at org.apache.axis2.transport.http.CommonsHTTPTransportSender.invoke(Com monsHTTPTransportSender.java:278) at org.apache.axis2.engine.AxisEngine.send(AxisEngine.java:442) at org.apache.axis2.description.OutOnlyAxisOperationClient.executeImpl(O utOnlyAxisOperation.java:297) at org.apache.axis2.client.OperationClient.execute(OperationClient.java: 149) at org.wso2.carbon.bpel.core.ode.integration.utils.AxisServiceUtils.invo keService(AxisServiceUtils.java:323) at org.wso2.carbon.bpel.core.ode.integration.PartnerService.invoke(Partn erService.java:333) at org.wso2.carbon.bpel.core.ode.integration.BPELMessageExchangeContextI mpl.invokePartner(BPELMessageExchangeContextImpl.java:43) at org.apache.ode.bpel.engine.BpelRuntimeContextImpl.invoke(BpelRuntimeC ontextImpl.java:897) at org.apache.ode.bpel.runtime.INVOKE.run(INVOKE.java:130) at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces sorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at org.apache.ode.jacob.vpu.JacobVPU$JacobThreadImpl.run(JacobVPU.java:4 51) at org.apache.ode.jacob.vpu.JacobVPU.execute(JacobVPU.java:139) at org.apache.ode.bpel.engine.BpelRuntimeContextImpl.execute(BpelRuntime ContextImpl.java:1002) at org.apache.ode.bpel.engine.PartnerLinkMyRoleImpl.invokeInstance(Partn erLinkMyRoleImpl.java:250) at org.apache.ode.bpel.engine.BpelProcess$1.invoke(BpelProcess.java:288) at org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java :224) at 
org.apache.ode.bpel.engine.BpelProcess.invokeProcess(BpelProcess.java :279) at org.apache.ode.bpel.engine.BpelProcess.handleJobDetails(BpelProcess.j ava:434) at org.apache.ode.bpel.engine.BpelEngineImpl.onScheduledJob(BpelEngineIm pl.java:558) at org.apache.ode.bpel.engine.BpelServerImpl.onScheduledJob(BpelServerIm pl.java:467) at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob$1.call(SimpleS cheduler.java:633) at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob$1.call(SimpleS cheduler.java:627) at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(Simpl eScheduler.java:298) at org.apache.ode.scheduler.simple.SimpleScheduler.execTransaction(Simpl eScheduler.java:253) at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob.call(SimpleSch eduler.java:627) at org.apache.ode.scheduler.simple.SimpleScheduler$RunJob.call(SimpleSch eduler.java:611) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor. java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor .java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.read(SocketInputStream.java:150) at java.net.SocketInputStream.read(SocketInputStream.java:121) at sun.security.ssl.InputRecord.readFully(InputRecord.java:465) at sun.security.ssl.InputRecord.read(InputRecord.java:503) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:961) at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:918) at sun.security.ssl.AppInputStream.read(AppInputStream.java:105) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read(BufferedInputStream.java:265) at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java: 78) at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106 ) at org.apache.commons.httpclient.HttpConnection.readLine(HttpConnection. java:1116) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$Http ConnectionAdapter.readLine(MultiThreadedHttpConnectionManager.java:1413) at org.apache.commons.httpclient.HttpMethodBase.readStatusLine(HttpMetho dBase.java:1973) at org.apache.commons.httpclient.HttpMethodBase.readResponse(HttpMethodB ase.java:1735) at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.j ava:1098) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(Htt pMethodDirector.java:398) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMe thodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.jav a:397) at org.apache.axis2.transport.http.AbstractHTTPSender.executeMethod(Abst ractHTTPSender.java:659) at org.apache.axis2.transport.http.HTTPSender.sendViaPost(HTTPSender.jav a:195) ... 34 more What could be the issue here? Any idea? do let me know. Thanks Note that the bps workflow for API STATE CHANGE works just fine with the same configurations
Please note that you are making HTTPS calls with specific domain names:
Host name verification failed for host : localhost at org.apache.synapse.transport.http.conn.ClientSSLSetupHandler.verify(ClientSSLSetupHandler.java:171) at org.apache.http.nio.reactor.ssl.SSLIOSession.doHandshake(SSLIOSession.java:308)
The certificate provided is CN=localhost, so the host name verification indeed fails. What you can do about it:
- simplest way: switch to HTTP when on a secure network (behind a firewall, VPN, ...)
- update the SSL certificates of BPS and APIM to match their hostnames; they also have to trust each other's certificate (or the certificate issuer)
- disable SSL hostname validation in axis2.xml (I do not recommend it; fine for DEV, VERY BAD for PROD) by setting <parameter name="HostnameVerifier">AllowAll</parameter>
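If it helps to confirm the mismatch, here is a small diagnostic sketch (my addition, with placeholder host/port) that prints the subject of the certificate a server actually presents. Note that it will fail earlier with a handshake error if the JVM does not already trust the certificate:
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

public class ShowPeerCert {
    public static void main(String[] args) throws Exception {
        String host = "mydomain.domain.com"; // placeholder: the BPS/APIM host
        int port = 9445;                     // placeholder: the HTTPS port
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        try (SSLSocket socket = (SSLSocket) factory.createSocket(host, port)) {
            socket.startHandshake();
            X509Certificate cert =
                    (X509Certificate) socket.getSession().getPeerCertificates()[0];
            // The CN here must match the hostname used in the endpoint URLs
            System.out.println("Peer subject: " + cert.getSubjectX500Principal());
        }
    }
}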
Runtime exception while executing sqoop job
I am trying to execute a sqoop job in BigInsights, importing data from an Oracle DB to HDFS. Below is the sqoop command, which starts executing the mapper and stops after some time:
sudo sqoop import --connect "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = ***.**.**.***)(PORT = 1908)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = GPRSDB1) ) )" --username **** --password ***** --query "select distinct pr_con_id from s_org_ext where \$CONDITIONS" --where "PAR_DIVN_ID ='1-1POG9V' AND rownum< 100" --target-dir /user/bigsql/crm/s_contact_mh --fields-terminated-by ',' --m 1
Below is the error:
Error: java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLException: The Network Adapter could not establish the connection at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:167) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:746) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: java.lang.RuntimeException: java.sql.SQLException: The Network Adapter could not establish the connection at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:220) at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:165) ... 9 more Caused by: java.sql.SQLException: The Network Adapter could not establish the connection at oracle.jdbc.driver.SQLStateMapping.newSQLException(SQLStateMapping.java:70) at oracle.jdbc.driver.DatabaseError.newSQLException(DatabaseError.java:133) at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:199) at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:480) at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:413) at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:508) at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:203) at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:33) at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:510) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:215) at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:302) at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:213) ... 10 more Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:328) at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:421) at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:630) at oracle.net.ns.NSProtocol.connect(NSProtocol.java:206) at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:966) at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:292) ... 
18 more Caused by: java.net.ConnectException: Connection timed out at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:208) at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:127) at oracle.net.nt.ConnOption.connect(ConnOption.java:126) at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:306) ... 23 more Please help me out to solve this issue. Thanks in advance.
Sqoop eval just runs the query against the RDBMS and returns the result set to you; Hadoop does not come into the picture. Sqoop import tries to import the data from the RDBMS and load it into HDFS. For sqoop import to work as desired, the DB server must be reachable from all nodes in the cluster.
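One way to verify that reachability (a sketch, not part of the original answer): run a trivial JDBC connect test on each data node, with the Oracle driver jar on the classpath and the same listener endpoint the sqoop command uses (host masked here as in the question):
import java.sql.Connection;
import java.sql.DriverManager;

public class OracleConnTest {
    public static void main(String[] args) throws Exception {
        // Same host/port/service as the sqoop --connect string; fill in real values
        String url = "jdbc:oracle:thin:@//***.**.**.***:1908/GPRSDB1";
        try (Connection c = DriverManager.getConnection(url, "username", "password")) {
            System.out.println("Connected successfully: " + !c.isClosed());
        }
    }
}
If this times out on some nodes but not others, the problem is network routing or a firewall, not sqoop.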
WSO2 BAM 2.4 connecting to external Cassandra fails with AuthenticationException
I think I have followed all the examples I could find; I am working on BAM 2.4.1, and the Apache Cassandra is 2.1.2. I have configured Cassandra to disable security by setting in cassandra.yaml:
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
Then I followed the BAM 2.4.1 documentation to connect to external Cassandra and started WSO2 BAM with:
wso2server.bat -Ddisable.cassandra.server.startup=true
Almost everything works except authentication:
[2014-11-25 20:39:41,295] ERROR {org.wso2.carbon.bam.notification.task.NotificationDispatchTask} - Error executing notification dispatch task: Access denied for user wso2bam to login TCP,192.168.1.7:7611,TCP,192.168.1.7:7711 org.wso2.carbon.databridge.commons.exception.AuthenticationException: Access denied for user wso2bam to login TCP,192.168.1.7:7611,TCP,192.168.1.7:7711 at org.wso2.carbon.databridge.agent.thrift.internal.publisher.authenticator.AgentAuthenticator.connect(AgentAuthenticator.java:54) at org.wso2.carbon.databridge.agent.thrift.DataPublisher.start(DataPublisher.java:273) at org.wso2.carbon.databridge.agent.thrift.DataPublisher.<init>(DataPublisher.java:211) at org.wso2.carbon.bam.notification.task.NotificationDispatchTask.initPublisherKS(NotificationDispatchTask.java:100) at org.wso2.carbon.bam.notification.task.NotificationDispatchTask.execute(NotificationDispatchTask.java:210) at org.wso2.carbon.ntask.core.impl.TaskQuartzJobAdapter.execute(TaskQuartzJobAdapter.java:67) at org.quartz.core.JobRunShell.run(JobRunShell.java:213) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) Caused by: org.wso2.carbon.databridge.agent.thrift.exception.AgentAuthenticatorException: Thrift exception at org.wso2.carbon.databridge.agent.thrift.internal.publisher.authenticator.ThriftAgentAuthenticator.connect(ThriftAgentAuthenticator.java:51) at org.wso2.carbon.databridge.agent.thrift.internal.publisher.authenticator.AgentAuthenticator.connect(AgentAuthenticator.java:51) ... 12 more
I found some more exceptions like:
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection closed by remote host at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:147) at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163) at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91) at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) at org.wso2.carbon.databridge.commons.thrift.service.secure.ThriftSecureEventTransmissionService$Client.send_connect(ThriftSecureEventTransmissionService.java:82) at org.wso2.carbon.databridge.commons.thrift.service.secure.ThriftSecureEventTransmissionService$Client.connect(ThriftSecureEventTransmissionService.java:73) at org.wso2.carbon.databridge.agent.thrift.internal.publisher.authenticator.ThriftAgentAuthenticator.connect(ThriftAgentAuthenticator.java:47) ... 
13 more Caused by: java.net.SocketException: Connection closed by remote host at sun.security.ssl.SSLSocketImpl.checkWrite(SSLSocketImpl.java:1506) at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:70) at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTranspo rt.java:145) ... 19 more
java.io.IOException: Broken pipe when increasing the number of mappers/reducers
I am running a MapReduce job on a Hadoop cluster of 6 nodes with 4 map tasks and 10 reduce tasks configured. The mapper/reducer fails a lot when I increase the number of map/reduce tasks; I encounter the following errors.
stderr logs:
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 143 at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362) at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576) at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:130) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249)
and this:
syslog logs:
2014-03-01 15:11:30,118 WARN org.apache.hadoop.streaming.PipeMapRed: java.io.IOException: Broken pipe at java.io.FileOutputStream.writeBytes(Native Method) at java.io.FileOutputStream.write(FileOutputStream.java:260) at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65) at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123) at java.io.DataOutputStream.flush(DataOutputStream.java:106) at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:569) at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:130) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2014-03-01 15:11:30,118 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed failed! 2014-03-01 15:11:30,121 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1 2014-03-01 15:11:30,146 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-03-01 15:11:30,146 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName hduser for UID 1001 from the native implementation 2014-03-01 15:11:30,147 WARN org.apache.hadoop.mapred.Child: Error running child java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 143 at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362) at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576) at org.apache.hadoop.streaming.PipeReducer.reduce(PipeReducer.java:130) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121) at org.apache.hadoop.mapred.Child.main(Child.java:249) 2014-03-01 15:11:30,149 WARN org.apache.hadoop.mapred.Task: Parent died. Exiting attempt_201402281751_0042_r_000004_0 2014-03-01 15:11:31,252 INFO org.apache.hadoop.streaming.PipeMapRed: Records R/W=983976/1957694
Even the simplest program is creating this problem:
mapper.py
#!/usr/bin/env python
import sys
for line in sys.stdin:
    if line:
        print "%s\n%s"%(line, line)
reducer.py
#!/usr/bin/env python
import sys
for line in sys.stdin:
    if line:
        print "%s"%(line)
I am using the following command to run hadoop:
hadoop jar /usr/local/hadoop/contrib/streaming/hadoop-streaming-1.0.3.jar -D mapred.reduce.tasks=10 -file /home/hduser/code/K1D/code1/mapper2.py -mapper mapper2.py -file /home/hduser/code/K1D/code1/reducer2.py -reducer reducer2.py -input /user/hduser/data-out/part-00000 -output /user/hduser/data-out1 -partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
Can you suggest anything?
I ran into something similar and realized that I didn't have the exact version of Python installed on one of my data nodes. I had:
#!/usr/bin/env python
which I changed to:
#!/usr/bin/env python2.7