How can I solve this problem? I am trying to run an Akka cluster on minikube, but the cluster fails to form.
17:46:49.093 [appka-akka.actor.default-dispatcher-12] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-3.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-3.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
My config is:
akka {
  actor {
    provider = cluster
  }
  cluster {
    shutdown-after-unsuccessful-join-seed-nodes = 60s
  }
  coordinated-shutdown.exit-jvm = on
  management {
    cluster.bootstrap {
      contact-point-discovery {
        discovery-method = kubernetes-api
      }
    }
  }
}
My YAML:
kind: Deployment
metadata:
  labels:
    app: appka
  name: appka
spec:
  replicas: 2
  selector:
    matchLabels:
      app: appka
  template:
    metadata:
      labels:
        app: appka
    spec:
      containers:
        - name: appka
          image: akkacluster:latest
          imagePullPolicy: Never
          readinessProbe:
            httpGet:
              path: /ready
              port: management
            periodSeconds: 10
            failureThreshold: 10
            initialDelaySeconds: 20
          livenessProbe:
            httpGet:
              path: /alive
              port: management
            periodSeconds: 10
            failureThreshold: 10
            initialDelaySeconds: 20
          ports:
            - name: management
              containerPort: 8558
              protocol: TCP
            - name: http
              containerPort: 8080
              protocol: TCP
            - name: remoting
              containerPort: 25520
              protocol: TCP
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
subjects:
  - kind: User
    name: system:serviceaccount:default:default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
Unfortunately, my cluster is not forming:
kubectl logs pod/appka-7c4b7df7f7-5v7cc
17:46:32.026 [appka-akka.actor.default-dispatcher-3] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
SLF4J: A number (1) of logging calls during the initialization phase have been intercepted and are
SLF4J: now being replayed. These are subject to the filtering rules of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#replay
17:46:33.644 [appka-akka.actor.default-dispatcher-3] INFO akka.remote.artery.tcp.ArteryTcpTransport - Remoting started with transport [Artery tcp]; listening on address [akka://appka#172.17.0.4:25520] with UID [-8421566647681174079]
17:46:33.811 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - Starting up, Akka version [2.6.14] ...
17:46:34.491 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - Registered cluster JMX MBean [akka:type=Cluster]
17:46:34.512 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - Started up successfully
17:46:34.883 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - No downing-provider-class configured, manual cluster downing required, see https://doc.akka.io/docs/akka/current/typed/cluster.html#downing
17:46:34.884 [appka-akka.actor.default-dispatcher-3] INFO akka.cluster.Cluster - Cluster Node [akka://appka#172.17.0.4:25520] - No seed nodes found in configuration, relying on Cluster Bootstrap for joining
17:46:39.084 [appka-akka.actor.default-dispatcher-11] INFO akka.management.internal.HealthChecksImpl - Loading readiness checks [(cluster-membership,akka.management.cluster.scaladsl.ClusterMembershipCheck), (sharding,akka.cluster.sharding.ClusterShardingHealthCheck)]
17:46:39.090 [appka-akka.actor.default-dispatcher-11] INFO akka.management.internal.HealthChecksImpl - Loading liveness checks []
17:46:39.104 [appka-akka.actor.default-dispatcher-3] INFO ClusterListenerActor$ - started actor akka://appka/user - (class akka.actor.typed.internal.adapter.ActorRefAdapter)
17:46:39.888 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Binding Akka Management (HTTP) endpoint to: 172.17.0.4:8558
17:46:40.525 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Including HTTP management routes for ClusterHttpManagementRouteProvider
17:46:40.806 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Including HTTP management routes for ClusterBootstrap
17:46:40.821 [appka-akka.actor.default-dispatcher-3] INFO akka.management.cluster.bootstrap.ClusterBootstrap - Using self contact point address: http://172.17.0.4:8558
17:46:40.914 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Including HTTP management routes for HealthCheckRoutes
17:46:44.198 [appka-akka.actor.default-dispatcher-3] INFO akka.management.cluster.bootstrap.ClusterBootstrap - Initiating bootstrap procedure using kubernetes-api method...
17:46:44.200 [appka-akka.actor.default-dispatcher-3] INFO akka.management.cluster.bootstrap.ClusterBootstrap - Bootstrap using `akka.discovery` method: kubernetes-api
17:46:44.226 [appka-akka.actor.default-dispatcher-3] INFO akka.management.scaladsl.AkkaManagement - Bound Akka Management (HTTP) endpoint to: 172.17.0.4:8558
17:46:44.487 [appka-akka.actor.default-dispatcher-6] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Locating service members. Using discovery [akka.discovery.kubernetes.KubernetesApiServiceDiscovery], join decider [akka.management.cluster.bootstrap.LowestAddressJoinDecider], scheme [http]
17:46:44.490 [appka-akka.actor.default-dispatcher-6] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Looking up [Lookup(appka,None,Some(tcp))]
17:46:44.493 [appka-akka.actor.default-dispatcher-6] INFO akka.discovery.kubernetes.KubernetesApiServiceDiscovery - Querying for pods with label selector: [app=appka]. Namespace: [default]. Port: [None]
17:46:45.626 [appka-akka.actor.default-dispatcher-12] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Looking up [Lookup(appka,None,Some(tcp))]
17:46:45.627 [appka-akka.actor.default-dispatcher-12] INFO akka.discovery.kubernetes.KubernetesApiServiceDiscovery - Querying for pods with label selector: [app=appka]. Namespace: [default]. Port: [None]
17:46:48.428 [appka-akka.actor.default-dispatcher-13] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Located service members based on: [Lookup(appka,None,Some(tcp))]: [ResolvedTarget(172-17-0-4.default.pod.cluster.local,None,Some(/172.17.0.4)), ResolvedTarget(172-17-0-3.default.pod.cluster.local,None,Some(/172.17.0.3))], filtered to [172-17-0-4.default.pod.cluster.local:0, 172-17-0-3.default.pod.cluster.local:0]
17:46:48.485 [appka-akka.actor.default-dispatcher-22] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Located service members based on: [Lookup(appka,None,Some(tcp))]: [ResolvedTarget(172-17-0-4.default.pod.cluster.local,None,Some(/172.17.0.4)), ResolvedTarget(172-17-0-3.default.pod.cluster.local,None,Some(/172.17.0.3))], filtered to [172-17-0-4.default.pod.cluster.local:0, 172-17-0-3.default.pod.cluster.local:0]
17:46:48.586 [appka-akka.actor.default-dispatcher-12] INFO akka.management.cluster.bootstrap.LowestAddressJoinDecider - Discovered [2] contact points, confirmed [0], which is less than the required [2], retrying
17:46:49.092 [appka-akka.actor.default-dispatcher-12] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-4.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-4.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
17:46:49.093 [appka-akka.actor.default-dispatcher-12] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-3.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-3.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
17:46:49.603 [appka-akka.actor.default-dispatcher-22] INFO akka.management.cluster.bootstrap.LowestAddressJoinDecider - Discovered [2] contact points, confirmed [0], which is less than the required [2], retrying
17:46:49.682 [appka-akka.actor.default-dispatcher-21] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Looking up [Lookup(appka,None,Some(tcp))]
17:46:49.683 [appka-akka.actor.default-dispatcher-21] INFO akka.discovery.kubernetes.KubernetesApiServiceDiscovery - Querying for pods with label selector: [app=appka]. Namespace: [default]. Port: [None]
17:46:49.726 [appka-akka.actor.default-dispatcher-12] INFO akka.management.cluster.bootstrap.internal.BootstrapCoordinator - Located service members based on: [Lookup(appka,None,Some(tcp))]: [ResolvedTarget(172-17-0-4.default.pod.cluster.local,None,Some(/172.17.0.4)), ResolvedTarget(172-17-0-3.default.pod.cluster.local,None,Some(/172.17.0.3))], filtered to [172-17-0-4.default.pod.cluster.local:0, 172-17-0-3.default.pod.cluster.local:0]
17:46:50.349 [appka-akka.actor.default-dispatcher-21] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-3.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-3.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
17:46:50.504 [appka-akka.actor.default-dispatcher-11] WARN akka.management.cluster.bootstrap.internal.HttpContactPointBootstrap - Probing [http://172-17-0-4.default.pod.cluster.local:8558/bootstrap/seed-nodes] failed due to: Tcp command [Connect(172-17-0-4.default.pod.cluster.local:8558,None,List(),Some(10 seconds),true)] failed because of java.net.ConnectException: Connection refused
You are missing the akka.remote settings block. Something like:
akka {
  actor {
    # provider=remote is possible, but prefer cluster
    provider = cluster
  }
  remote {
    artery {
      transport = tcp # See Selecting a transport below
      canonical.hostname = "127.0.0.1"
      canonical.port = 25520
    }
  }
}
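As a hedged side note (an assumption on my part, not something from the question): on Kubernetes the canonical hostname usually has to be an address the other pods can reach, so a hard-coded 127.0.0.1 is typically replaced by the pod IP, for example injected through the Downward API and picked up via an optional substitution. The POD_IP variable name below is just an illustrative choice:

# In the container spec of the Deployment (Downward API injection):
#   env:
#     - name: POD_IP
#       valueFrom:
#         fieldRef:
#           fieldPath: status.podIP
akka.remote.artery.canonical.hostname = ${?POD_IP}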
I created an image classifier model built on ResNet-50 to identify dog breeds. I created it in SageMaker Studio. Tuning and training are done and I deployed it, but when I try to predict with it, it fails. I believe this is related to the worker PID, because that is the first warning I see.
The CloudWatch log output below says the worker PID is not available yet, and soon after, the worker dies.
timestamp,message,logStreamName
1648240674535,"2022-03-25 20:37:54,107 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...",AllTraffic/i-055c5d00e53e84b93
1648240674535,"2022-03-25 20:37:54,188 [INFO ] main org.pytorch.serve.ModelServer - ",AllTraffic/i-055c5d00e53e84b93
1648240674535,Torchserve version: 0.4.0,AllTraffic/i-055c5d00e53e84b93
1648240674535,TS Home: /opt/conda/lib/python3.6/site-packages,AllTraffic/i-055c5d00e53e84b93
1648240674535,Current directory: /,AllTraffic/i-055c5d00e53e84b93
1648240674535,Temp directory: /home/model-server/tmp,AllTraffic/i-055c5d00e53e84b93
1648240674535,Number of GPUs: 0,AllTraffic/i-055c5d00e53e84b93
1648240674535,Number of CPUs: 1,AllTraffic/i-055c5d00e53e84b93
1648240674535,Max heap size: 6838 M,AllTraffic/i-055c5d00e53e84b93
1648240674535,Python executable: /opt/conda/bin/python3.6,AllTraffic/i-055c5d00e53e84b93
1648240674535,Config file: /etc/sagemaker-ts.properties,AllTraffic/i-055c5d00e53e84b93
1648240674535,Inference address: http://0.0.0.0:8080,AllTraffic/i-055c5d00e53e84b93
1648240674535,Management address: http://0.0.0.0:8080,AllTraffic/i-055c5d00e53e84b93
1648240674535,Metrics address: http://127.0.0.1:8082,AllTraffic/i-055c5d00e53e84b93
1648240674535,Model Store: /.sagemaker/ts/models,AllTraffic/i-055c5d00e53e84b93
1648240674535,Initial Models: model.mar,AllTraffic/i-055c5d00e53e84b93
1648240674535,Log dir: /logs,AllTraffic/i-055c5d00e53e84b93
1648240674535,Metrics dir: /logs,AllTraffic/i-055c5d00e53e84b93
1648240674535,Netty threads: 0,AllTraffic/i-055c5d00e53e84b93
1648240674535,Netty client threads: 0,AllTraffic/i-055c5d00e53e84b93
1648240674535,Default workers per model: 1,AllTraffic/i-055c5d00e53e84b93
1648240674535,Blacklist Regex: N/A,AllTraffic/i-055c5d00e53e84b93
1648240674535,Maximum Response Size: 6553500,AllTraffic/i-055c5d00e53e84b93
1648240674536,Maximum Request Size: 6553500,AllTraffic/i-055c5d00e53e84b93
1648240674536,Prefer direct buffer: false,AllTraffic/i-055c5d00e53e84b93
1648240674536,Allowed Urls: [file://.*|http(s)?://.*],AllTraffic/i-055c5d00e53e84b93
1648240674536,Custom python dependency for model allowed: false,AllTraffic/i-055c5d00e53e84b93
1648240674536,Metrics report format: prometheus,AllTraffic/i-055c5d00e53e84b93
1648240674536,Enable metrics API: true,AllTraffic/i-055c5d00e53e84b93
1648240674536,Workflow Store: /.sagemaker/ts/models,AllTraffic/i-055c5d00e53e84b93
1648240674536,"2022-03-25 20:37:54,195 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...",AllTraffic/i-055c5d00e53e84b93
1648240675536,"2022-03-25 20:37:54,217 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: model.mar",AllTraffic/i-055c5d00e53e84b93
1648240675536,"2022-03-25 20:37:55,505 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model model loaded.",AllTraffic/i-055c5d00e53e84b93
1648240675786,"2022-03-25 20:37:55,515 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.",AllTraffic/i-055c5d00e53e84b93
1648240675786,"2022-03-25 20:37:55,569 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080",AllTraffic/i-055c5d00e53e84b93
1648240675786,"2022-03-25 20:37:55,569 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.",AllTraffic/i-055c5d00e53e84b93
1648240675786,"2022-03-25 20:37:55,569 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082",AllTraffic/i-055c5d00e53e84b93
1648240675786,Model server started.,AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,727 [WARN ] pool-2-thread-1 org.pytorch.serve.metrics.MetricCollector - worker pid is not available yet.",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,812 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:100.0|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,813 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:38.02598190307617|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,813 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:12.715518951416016|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,814 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:25.1|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,815 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:29583.98046875|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,815 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:1355.765625|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,816 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:5.7|#Level:Host|#hostname:container-0.local,timestamp:1648240675",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,994 [INFO ] W-9000-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,994 [INFO ] W-9000-model_1-stdout MODEL_LOG - [PID]48",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,994 [INFO ] W-9000-model_1-stdout MODEL_LOG - Torch worker started.",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,994 [INFO ] W-9000-model_1-stdout MODEL_LOG - Python runtime: 3.6.13",AllTraffic/i-055c5d00e53e84b93
1648240676036,"2022-03-25 20:37:55,999 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,006 [INFO ] W-9000-model_1-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,111 [INFO ] W-9000-model_1-stdout MODEL_LOG - Backend worker process died.",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,111 [INFO ] W-9000-model_1-stdout MODEL_LOG - Traceback (most recent call last):",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,111 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 182, in <module>",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,111 [INFO ] W-9000-model_1-stdout MODEL_LOG - worker.run_server()",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,111 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 154, in run_server",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,111 [INFO ] W-9000-model_1-stdout MODEL_LOG - self.handle_connection(cl_socket)",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 116, in handle_connection",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] W-9000-model_1-stdout MODEL_LOG - service, result, code = self.load_model(msg)",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] epollEventLoopGroup-5-1 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 89, in load_model",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] W-9000-model_1-stdout MODEL_LOG - service = model_loader.load(model_name, model_dir, handler, gpu, batch_size, envelope)",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_loader.py"", line 110, in load",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,112 [INFO ] W-9000-model_1-stdout MODEL_LOG - initialize_fn(service.context)",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.BatchAggregator - Load model failed: model, error: Worker died.",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/home/model-server/tmp/models/23b30361031647d08792d32672910688/handler_service.py"", line 51, in initialize",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [INFO ] W-9000-model_1-stdout MODEL_LOG - super().initialize(context)",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-model_1-stderr",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-model_1-stdout",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/sagemaker_inference/default_handler_service.py"", line 66, in initialize",AllTraffic/i-055c5d00e53e84b93
1648240676286,"2022-03-25 20:37:56,113 [INFO ] W-9000-model_1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-model_1-stdout",AllTraffic/i-055c5d00e53e84b93
1648240676536,"2022-03-25 20:37:56,114 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.",AllTraffic/i-055c5d00e53e84b93
1648240676536,"2022-03-25 20:37:56,416 [INFO ] W-9000-model_1-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-model_1-stderr",AllTraffic/i-055c5d00e53e84b93
1648240676536,"2022-03-25 20:37:56,461 [INFO ] W-9000-model_1 ACCESS_LOG - /169.254.178.2:39848 ""GET /ping HTTP/1.1"" 200 9",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:56,461 [INFO ] W-9000-model_1 TS_METRICS - Requests2XX.Count:1|#Level:Host|#hostname:container-0.local,timestamp:null",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,567 [INFO ] W-9000-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,568 [INFO ] W-9000-model_1-stdout MODEL_LOG - [PID]86",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,568 [INFO ] W-9000-model_1-stdout MODEL_LOG - Torch worker started.",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,568 [INFO ] W-9000-model_1-stdout MODEL_LOG - Python runtime: 3.6.13",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,568 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,569 [INFO ] W-9000-model_1-stdout MODEL_LOG - Connection accepted: /home/model-server/tmp/.ts.sock.9000.",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,642 [INFO ] W-9000-model_1-stdout MODEL_LOG - Backend worker process died.",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,642 [INFO ] W-9000-model_1-stdout MODEL_LOG - Traceback (most recent call last):",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,642 [INFO ] epollEventLoopGroup-5-2 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,642 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 182, in <module>",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout MODEL_LOG - worker.run_server()",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 154, in run_server",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.BatchAggregator - Load model failed: model, error: Worker died.",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout MODEL_LOG - self.handle_connection(cl_socket)",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 116, in handle_connection",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-model_1-stderr",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout MODEL_LOG - service, result, code = self.load_model(msg)",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [WARN ] W-9000-model_1 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-model_1-stdout",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout MODEL_LOG - File ""/opt/conda/lib/python3.6/site-packages/ts/model_service_worker.py"", line 89, in load_model",AllTraffic/i-055c5d00e53e84b93
1648240677787,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-model_1-stdout",AllTraffic/i-055c5d00e53e84b93
1648240678037,"2022-03-25 20:37:57,643 [INFO ] W-9000-model_1 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.",AllTraffic/i-055c5d00e53e84b93
1648240679288,"2022-03-25 20:37:57,991 [INFO ] W-9000-model_1-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-model_1-stderr",AllTraffic/i-055c5d00e53e84b93
1648240679288,"2022-03-25 20:37:59,096 [INFO ] W-9000-model_1-stdout MODEL_LOG - Listening on port: /home/model-server/tmp/.ts.sock.9000",AllTraffic/i-055c5d00e53e84b93
1648240679288,"2022-03-25 20:37:59,097 [INFO ] W-9000-model_1-stdout MODEL_LOG - [PID]114",AllTraffic/i-055c5d00e53e84b93
Model tuning and training came out alright, so I'm not sure why it won't predict if that part is fine. Someone mentioned to me that it might be due to the entry point script, but I don't know what would cause it to fail at prediction after deployment if it can predict fine during training.
Entry point script:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.models as models
import torchvision.transforms as transforms
import json
import copy
import argparse
import os
import logging
import sys
from tqdm import tqdm
from PIL import ImageFile
import smdebug.pytorch as smd

ImageFile.LOAD_TRUNCATED_IMAGES = True

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(sys.stdout))

def test(model, test_loader, criterion, hook):
    model.eval()
    running_loss = 0
    running_corrects = 0
    hook.set_mode(smd.modes.EVAL)
    for inputs, labels in test_loader:
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        _, preds = torch.max(outputs, 1)
        running_loss += loss.item() * inputs.size(0)
        running_corrects += torch.sum(preds == labels.data)
    ##total_loss = running_loss // len(test_loader)
    ##total_acc = running_corrects.double() // len(test_loader)
    ##logger.info(f"Testing Loss: {total_loss}")
    ##logger.info(f"Testing Accuracy: {total_acc}")
    logger.info("New test acc")
    logger.info(f'Test set: Accuracy: {running_corrects}/{len(test_loader.dataset)} = {100*(running_corrects/len(test_loader.dataset))}%)')

def train(model, train_loader, validation_loader, criterion, optimizer, hook):
    epochs = 50
    best_loss = 1e6
    image_dataset = {'train': train_loader, 'valid': validation_loader}
    loss_counter = 0
    hook.set_mode(smd.modes.TRAIN)
    for epoch in range(epochs):
        logger.info(f"Epoch: {epoch}")
        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()
                logger.info("Model Trained")
            else:
                model.eval()
            running_loss = 0.0
            running_corrects = 0
            for inputs, labels in image_dataset[phase]:
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                if phase == 'train':
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()
                    logger.info("Model Optimized")
                _, preds = torch.max(outputs, 1)
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            epoch_loss = running_loss // len(image_dataset[phase])
            epoch_acc = running_corrects // len(image_dataset[phase])
            if phase == 'valid':
                logger.info("Model Validating")
                if epoch_loss < best_loss:
                    best_loss = epoch_loss
                else:
                    loss_counter += 1
                logger.info(loss_counter)
            '''logger.info('{} loss: {:.4f}, acc: {:.4f}, best loss: {:.4f}'.format(phase,
                                                                                    epoch_loss,
                                                                                    epoch_acc,
                                                                                    best_loss))'''
            if phase == "train":
                logger.info("New epoch acc for Train:")
                logger.info(f"Epoch {epoch}: Loss {loss_counter/len(train_loader.dataset)}, Accuracy {100*(running_corrects/len(train_loader.dataset))}%")
            if phase == "valid":
                logger.info("New epoch acc for Valid:")
                logger.info(f"Epoch {epoch}: Loss {loss_counter/len(train_loader.dataset)}, Accuracy {100*(running_corrects/len(train_loader.dataset))}%")
        ##if loss_counter==1:
        ##    break
        ##if epoch==0:
        ##    break
    return model

def net():
    model = models.resnet50(pretrained=True)
    for param in model.parameters():
        param.requires_grad = False
    model.fc = nn.Sequential(
        nn.Linear(2048, 128),
        nn.ReLU(inplace=True),
        nn.Linear(128, 133))
    return model

def create_data_loaders(data, batch_size):
    train_data_path = os.path.join(data, 'train')
    test_data_path = os.path.join(data, 'test')
    validation_data_path = os.path.join(data, 'valid')
    train_transform = transforms.Compose([
        transforms.RandomResizedCrop((224, 224)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ])
    test_transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    train_data = torchvision.datasets.ImageFolder(root=train_data_path, transform=train_transform)
    train_data_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True)
    test_data = torchvision.datasets.ImageFolder(root=test_data_path, transform=test_transform)
    test_data_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=True)
    validation_data = torchvision.datasets.ImageFolder(root=validation_data_path, transform=test_transform)
    validation_data_loader = torch.utils.data.DataLoader(validation_data, batch_size=batch_size, shuffle=True)
    return train_data_loader, test_data_loader, validation_data_loader

def main(args):
    logger.info(f'Hyperparameters are LR: {args.lr}, Batch Size: {args.batch_size}')
    logger.info(f'Data Paths: {args.data}')
    train_loader, test_loader, validation_loader = create_data_loaders(args.data, args.batch_size)
    model = net()
    hook = smd.Hook.create_from_json_file()
    hook.register_hook(model)
    criterion = nn.CrossEntropyLoss(ignore_index=133)
    optimizer = optim.Adam(model.fc.parameters(), lr=args.lr)
    logger.info("Starting Model Training")
    model = train(model, train_loader, validation_loader, criterion, optimizer, hook)
    logger.info("Testing Model")
    test(model, test_loader, criterion, hook)
    logger.info("Saving Model")
    torch.save(model.cpu().state_dict(), os.path.join(args.model_dir, "model.pth"))

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    '''
    TODO: Specify any training args that you might need
    '''
    parser.add_argument(
        "--batch-size",
        type=int,
        default=64,
        metavar="N",
        help="input batch size for training (default: 64)",
    )
    parser.add_argument(
        "--test-batch-size",
        type=int,
        default=1000,
        metavar="N",
        help="input batch size for testing (default: 1000)",
    )
    parser.add_argument(
        "--epochs",
        type=int,
        default=5,
        metavar="N",
        help="number of epochs to train (default: 10)",
    )
    parser.add_argument(
        "--lr", type=float, default=0.01, metavar="LR", help="learning rate (default: 0.01)"
    )
    parser.add_argument(
        "--momentum", type=float, default=0.5, metavar="M", help="SGD momentum (default: 0.5)"
    )
    # Container environment
    parser.add_argument("--hosts", type=list, default=json.loads(os.environ["SM_HOSTS"]))
    parser.add_argument("--current-host", type=str, default=os.environ["SM_CURRENT_HOST"])
    parser.add_argument("--model-dir", type=str, default=os.environ["SM_MODEL_DIR"])
    parser.add_argument("--data", type=str, default=os.environ["SM_CHANNEL_TRAINING"])
    parser.add_argument("--num-gpus", type=int, default=os.environ["SM_NUM_GPUS"])

    args = parser.parse_args()
    main(args)
To test the model on the endpoint, I sent over an image using the following code:
from sagemaker.serializers import IdentitySerializer
import base64

predictor.serializer = IdentitySerializer("image/png")
with open("Akita_00282.jpg", "rb") as f:
    payload = f.read()

response = predictor.predict(payload)
The model serving workers are dying either because they cannot load your model or because they cannot deserialize the payload you are sending to them.
Note that you have to provide a model_fn implementation. Please read these docs or this blog to learn more about how to adapt inference scripts for SageMaker deployment. If you do not want to override the input_fn, predict_fn, and/or output_fn handlers, you can find their default implementations, for example, here.
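For illustration only, a minimal sketch of such an inference entry point is below. It is an assumption based on the training script in the question (not the poster's actual deployment code): it presumes the weights were saved as a state_dict named model.pth for the same net() architecture, which is what the training script does.

# inference.py -- minimal model_fn sketch for the SageMaker PyTorch serving container
import os

import torch
import torch.nn as nn
import torchvision.models as models


def model_fn(model_dir):
    # Rebuild the same architecture used in training ...
    model = models.resnet50(pretrained=False)
    model.fc = nn.Sequential(
        nn.Linear(2048, 128),
        nn.ReLU(inplace=True),
        nn.Linear(128, 133))
    # ... and load the weights the training script saved as model.pth.
    with open(os.path.join(model_dir, "model.pth"), "rb") as f:
        model.load_state_dict(torch.load(f, map_location="cpu"))
    model.eval()
    return model

If you keep sending raw image bytes with IdentitySerializer("image/png"), you would typically also need an input_fn that decodes the image payload, because the default input_fn handles formats such as JSON, CSV, and NumPy rather than raw images.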
I have an EKS cluster with an EBS storage class/volume.
I am able to deploy the HDFS namenode and datanode images (bde2020/hadoop-xxx) successfully using StatefulSets.
When I try to put a file to HDFS from my machine using hdfs://:, it reports success, but the file does not get written to the datanodes.
In the namenode log, I see the error below.
Could it be something to do with the EBS volume? I cannot even upload/download files from the namenode GUI. Could it be because the datanode hostname hdfs-data-X.hdfs-data.pulse.svc.cluster.local is not resolvable from my local machine?
Please help.
2020-05-12 17:38:51,360 INFO hdfs.StateChange: BLOCK* allocate blk_1073741825_1001, replicas=10.8.29.112:9866, 10.8.29.176:9866, 10.8.29.188:9866 for /vault/a.json
2020-05-12 17:39:13,036 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2020-05-12 17:39:13,036 WARN protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2020-05-12 17:39:13,036 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2020-05-12 17:39:13,036 INFO hdfs.StateChange: BLOCK* allocate blk_1073741826_1002, replicas=10.8.29.176:9866, 10.8.29.188:9866 for /vault/a.json
2020-05-12 17:39:34,607 INFO namenode.FSEditLog: Number of transactions: 11 Total time for transactions(ms): 23 Number of transactions batched in Syncs: 3 Number of syncs: 8 SyncTimes(ms): 23
2020-05-12 17:39:35,146 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 2 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2020-05-12 17:39:35,146 WARN protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 2 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2020-05-12 17:39:35,146 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 2 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2020-05-12 17:39:35,147 INFO hdfs.StateChange: BLOCK* allocate blk_1073741827_1003, replicas=10.8.29.188:9866 for /vault/a.json
2020-05-12 17:39:57,319 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2020-05-12 17:39:57,319 WARN protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 3 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2020-05-12 17:39:57,319 WARN blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2020-05-12 17:39:57,320 INFO ipc.Server: IPC Server handler 5 on default port 8020, call Call#12 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.254.40.95:59328
java.io.IOException: File /vault/a.json could only be written to 0 of the 1 minReplication nodes. There are 3 datanode(s) running and 3 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2219)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2789)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:892)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:574)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
My namenode web page shows the following:
Node                                                 Http Address                                                Last contact  Last Block Report  Capacity  Blocks  Block pool used  Version
hdfs-data-0.hdfs-data.pulse.svc.cluster.local:9866   http://hdfs-data-0.hdfs-data.pulse.svc.cluster.local:9864   1s            0m                 975.9 MB  0       24 KB (0%)       3.2.1
hdfs-data-1.hdfs-data.pulse.svc.cluster.local:9866   http://hdfs-data-1.hdfs-data.pulse.svc.cluster.local:9864   2s            0m                 975.9 MB  0       24 KB (0%)       3.2.1
hdfs-data-2.hdfs-data.pulse.svc.cluster.local:9866   http://hdfs-data-2.hdfs-data.pulse.svc.cluster.local:9864   1s            0m                 975.9 MB  0       24 KB (0%)       3.2.1
My deployment:
NameNode:
#clusterIP service of namenode
apiVersion: v1
kind: Service
metadata:
  name: hdfs-name
  namespace: pulse
  labels:
    component: hdfs-name
spec:
  ports:
    - port: 8020
      protocol: TCP
      name: nn-rpc
    - port: 9870
      protocol: TCP
      name: nn-web
  selector:
    component: hdfs-name
  type: ClusterIP
---
#namenode stateful deployment
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-name
  namespace: pulse
  labels:
    component: hdfs-name
spec:
  serviceName: hdfs-name
  replicas: 1
  selector:
    matchLabels:
      component: hdfs-name
  template:
    metadata:
      labels:
        component: hdfs-name
    spec:
      initContainers:
        - name: delete-lost-found
          image: busybox
          command: ["sh", "-c", "rm -rf /hadoop/dfs/name/lost+found"]
          volumeMounts:
            - name: hdfs-name-pv-claim
              mountPath: /hadoop/dfs/name
      containers:
        - name: hdfs-name
          image: bde2020/hadoop-namenode
          env:
            - name: CLUSTER_NAME
              value: hdfs-k8s
            - name: HDFS_CONF_dfs_permissions_enabled
              value: "false"
          ports:
            - containerPort: 8020
              name: nn-rpc
            - containerPort: 9870
              name: nn-web
          volumeMounts:
            - name: hdfs-name-pv-claim
              mountPath: /hadoop/dfs/name
              #subPath: data #subPath required as on root level, lost+found folder is created which does not cause to run namenode --format
  volumeClaimTemplates:
    - metadata:
        name: hdfs-name-pv-claim
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: ebs
        resources:
          requests:
            storage: 1Gi
Datanode:
#headless service of datanode
apiVersion: v1
kind: Service
metadata:
  name: hdfs-data
  namespace: pulse
  labels:
    component: hdfs-data
spec:
  ports:
    - port: 80
      protocol: TCP
  selector:
    component: hdfs-data
  clusterIP: None
  type: ClusterIP
---
#datanode stateful deployment
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-data
  namespace: pulse
  labels:
    component: hdfs-data
spec:
  serviceName: hdfs-data
  replicas: 3
  selector:
    matchLabels:
      component: hdfs-data
  template:
    metadata:
      labels:
        component: hdfs-data
    spec:
      containers:
        - name: hdfs-data
          image: bde2020/hadoop-datanode
          env:
            - name: CORE_CONF_fs_defaultFS
              value: hdfs://hdfs-name:8020
          volumeMounts:
            - name: hdfs-data-pv-claim
              mountPath: /hadoop/dfs/data
  volumeClaimTemplates:
    - metadata:
        name: hdfs-data-pv-claim
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: ebs
        resources:
          requests:
            storage: 1Gi
It turned out to be an issue with the datanodes not being reachable over the RPC port from my client machine.
The datanodes' HTTP port was reachable from my client machine, so after adding mappings of the datanode pod names to their IPs in my hosts file and using webhdfs:// (instead of hdfs://), it worked.
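For illustration, the workaround described above would look roughly like the sketch below. The datanode IPs are taken from the namenode log earlier and would have to match the current pod IPs, and <namenode-host> stands in for the elided host in the original hdfs:// URL:

# /etc/hosts on the client machine (datanode pod name -> pod IP)
10.8.29.112  hdfs-data-0.hdfs-data.pulse.svc.cluster.local
10.8.29.176  hdfs-data-1.hdfs-data.pulse.svc.cluster.local
10.8.29.188  hdfs-data-2.hdfs-data.pulse.svc.cluster.local

# write via WebHDFS (HTTP, namenode web port 9870) instead of the native RPC protocol
hdfs dfs -put a.json webhdfs://<namenode-host>:9870/vault/a.json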
I have written a sample MVC application using the Spring framework and deployed it to Bluemix.
When I open the deployed URL, I receive the following error:
The application or context root for this request has not been found
What am I doing wrong? Does anything need to be changed in web.xml?
Log messages:
[AUDIT ] CWWKE0001I: The server defaultServer has been launched.
[AUDIT ] CWWKG0028A: Processing included configuration resource:
/home/vcap/app/wlp/usr/servers/defaultServer/runtime-vars.xml
[INFO ] CWWKE0002I: The kernel started after 10.005 seconds
[INFO ] CWWKF0007I: Feature update started.
[INFO ] CWWKO0219I: TCP Channel httpEndpoint-179 has been started
and is now listening for requests on host * (IPv6) port 61031.
[INFO ] CWWKO0219I: TCP Channel defaultHttpEndpoint has been
started and is now listening for requests on host localhost (IPv4:
127.0.0.1) port 9080.
[INFO ] CWSCX0122I: Register management Bean provider:
com.ibm.ws.cloudoe.management.client.provider.dump.JavaDumpBeanProvider#c68ae63e.
[INFO ] CWSCX0122I: Register management Bean provider:
com.ibm.ws.cloudoe.management.client.provider.logging.LibertyLoggingBeanProvider#f0d6d754.
[INFO ] SRVE0169I: Loading Web Module:
com.ibm.ws.cloudoe.management.client.liberty.connector.
[INFO ] SRVE0250I: Web Module
com.ibm.ws.cloudoe.management.client.liberty.connector has been bound
to default_host.
[AUDIT ] CWWKT0016I: Web application available (default_host):
http://localhost:9080/IBMMGMTRest/
[INFO ] CWWKZ0018I: Starting application myapp.
[INFO ] SRVE0169I: Loading Web Module: TaxBillReminder.
[INFO ] SRVE0250I: Web Module TaxBillReminder has been bound to
default_host.
[AUDIT ] CWWKT0016I: Web application available (default_host):
http://localhost:9080/
[AUDIT ] CWWKZ0001I: Application myapp started in 2.113 seconds.
[AUDIT ] CWWKF0012I: The server installed the following features:
[json-1.0, jpa-2.0, icap:managementConnector-1.0, beanValidation-1.0,
jdbc-4.0, managedBeans-1.0, jsf-2.0, jsp-2.2, servlet-3.0, jaxrs-1.1,
jndi-1.0, appState-1.0, ejbLite-3.1, cdi-1.0].
[INFO ] CWWKF0008I: Feature update completed in 9.472 seconds.
[AUDIT ] CWWKF0011I: The server defaultServer is ready to run a
smarter planet.
[INFO ] SESN8501I: The session manager did not find a persistent
storage location; HttpSession objects will be stored in the local
application server's memory.
[INFO ] SESN0176I: A new session context will be created for
application key default_host/
[INFO ] SESN0172I: The session manager is using the Java default
SecureRandom implementation for session ID generation.
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[INFO ] FFDC1015I: An FFDC Incident has been created:
"java.util.ServiceConfigurationError:
javax.servlet.ServletContainerInitializer: Provider
org.cloudfoundry.reconfiguration.spring.AutoReconfigurationServletContainerInitializer
could not be instantiated
com.ibm.ws.webcontainer.osgi.DynamicVirtualHost startWebApp" at
ffdc_15.05.22_06.28.59.0.log TaxBillReminder.mybluemix.net -
[22/05/2015:06:28:58 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.42:31418 x_forwarded_for:"-" vcap_request_id:430a380b-a68e-4123-6ff8-c87348c535a3
response_time:0.813611619 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:00 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.46:42514 x_forwarded_for:"-" vcap_request_id:c54dff7f-908f-4cc1-49d9-de6d8bd04fe7
response_time:0.127545436 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:01 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.43:29980 x_forwarded_for:"-" vcap_request_id:23bc66ac-c78e-42ab-5a07-60f99ffc492b
response_time:0.117255613 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". [WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:03 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.43:23392 x_forwarded_for:"-" vcap_request_id:c255a3fb-5eb1-44f5-4c08-b22222a4c8b7
response_time:0.111495485 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:04 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.46:41130
x_forwarded_for:"-"
vcap_request_id:0c009c84-f0c0-46e9-7b6d-da8e3ff91a55
response_time:0.115888617 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. [INFO ] SESN0175I: An existing session context
will be used for application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:05 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.46:52243 x_forwarded_for:"-" vcap_request_id:c4c29b52-ff3a-48b6-47e4-7e1fce0c3f74
response_time:0.187145593 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:06 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.42:11225
x_forwarded_for:"-"
vcap_request_id:54e0e021-826e-443b-6a7a-5f6bbc28a926
response_time:0.132534560 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp.
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:08 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.43:32255
x_forwarded_for:"-"
vcap_request_id:0ac50be0-e2e9-436c-4e97-d854f78e1f49
response_time:0.089186493 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp.
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:09 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.46:39103
x_forwarded_for:"-"
vcap_request_id:ddc4754a-cf0f-494c-78de-26fcd61ba1af
response_time:0.102293236 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp.
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:10 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.42:30749
x_forwarded_for:"-"
vcap_request_id:fa6ba947-4b8c-474b-4b48-ace26fc3274e
response_time:0.091226461 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp.
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:11 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.46:46353
x_forwarded_for:"-"
vcap_request_id:dfc99308-11c0-4ea7-48ca-b4061b3b4c6f
response_time:0.096913693 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp.
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:12 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.46:57429 x_forwarded_for:"-" vcap_request_id:4f7e9876-cf5d-46c2-6cb1-19f00329e029
response_time:0.100562784 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:13 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.43:52701 x_forwarded_for:"-" vcap_request_id:fd13c364-d65a-4ca6-66b1-9bc49c1ea427
response_time:0.098537113 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:15 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.42:10951 x_forwarded_for:"-" vcap_request_id:883eb6fc-cdb4-45c6-41f6-cc65970ef256
response_time:0.095498510 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:16 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.42:30830 x_forwarded_for:"-" vcap_request_id:fc251ebf-da3a-48ae-4312-5218bd83808b
response_time:0.134904531 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15". TaxBillReminder.mybluemix.net - [22/05/2015:06:29:17 +0000] "GET
/ HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.42:54827
x_forwarded_for:"-"
vcap_request_id:e09e1926-860b-481e-4b48-ed5a66330580
response_time:0.084558083 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. [INFO ] SESN0175I: An existing session context
will be used for application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:18 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.42:31009 x_forwarded_for:"-" vcap_request_id:a9c3a69f-ae27-4c72-7422-608fe01451fd
response_time:0.092770319 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:19 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.46:55458 x_forwarded_for:"-" vcap_request_id:20ebe389-2371-455a-5832-71c85f48c46d
response_time:0.083255059 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:21 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.46:44171 x_forwarded_for:"-" vcap_request_id:14081f78-3959-462f-5602-dd474718094c
response_time:0.104446356 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. TaxBillReminder.mybluemix.net -
[22/05/2015:06:29:22 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0"
75.126.70.43:21091 x_forwarded_for:"-" vcap_request_id:930a620b-e6a2-4bdb-6b72-36c072eea29b
response_time:0.100104583 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. taxbillreminder.mybluemix.net -
[22/05/2015:06:29:23 +0000] "GET / HTTP/1.1" 404 217 "-" "Mozilla/5.0
(compatible; MSIE 10.0; Windows NT 6.1; Win64; x64; Trident/6.0)"
75.126.70.43:45588 x_forwarded_for:"-" vcap_request_id:cd805473-5b36-423c-441f-4a013e0c91c3
response_time:0.092833842 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
[INFO ] SESN0175I: An existing session context will be used for
application key default_host/
[INFO ] JSPG8502I: The value of the JSP attribute jdkSourceLevel is
"15".
[WARNING ] SRVE0274W: Error while adding servlet mapping for
path-->/forms/, wrapper-->ServletWrapper[dispatcher:[/forms/]],
application-->myapp. taxbillreminder.mybluemix.net -
[22/05/2015:06:30:31 +0000] "GET / HTTP/1.1" 404 217 "-" "Mozilla/5.0
(Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0"
75.126.70.43:54400 x_forwarded_for:"-" vcap_request_id:7ca062d7-13ff-4ae2-5441-265d3c2194b5
response_time:0.424214609 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
You need to look at the context-root or contextRoot that might be defined in your server.xml or web.xml. If no context-root or contextRoot is defined, then the name of the Liberty application is used; see here for the rules. The route to your app running on Liberty will normally be something like this:
http://your_bluemix_app.mybluemix.net/the_liberty_app_name
The deployed URL that Bluemix reports is the base URL for the application, which in this case is a Liberty server, so you need to append your context-root (or Liberty app name) to it.
You can imagine pushing two or more Liberty apps packaged in one Liberty server to Bluemix. In that case you have one Bluemix app with two web applications running within it, which can be accessed like this:
http://your_bluemix_app.mybluemix.net/the_liberty_app_name_1
http://your_bluemix_app.mybluemix.net/the_liberty_app_name_2
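For example, a context root can be set explicitly in the Liberty server.xml; this is only a sketch, and the application name and war file name below are placeholders, not taken from your deployment:

<server>
    <!-- contextRoot overrides the default, which is the application name -->
    <webApplication id="myapp" name="myapp" location="myapp.war" contextRoot="/myapp" />
</server>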
I had a similar issue. The solution described at http://developer.ibm.com/answers/answers/185697/view.html worked for me.
Looks like the application failed to initialize because of the following:
[INFO ] FFDC1015I: An FFDC Incident has been created: "java.util.ServiceConfigurationError: javax.servlet.ServletContainerInitializer: Provider org.cloudfoundry.reconfiguration.spring.AutoReconfigurationServletContainerInitializer could not be instantiated com.ibm.ws.webcontainer.osgi.DynamicVirtualHost startWebApp" at ffdc_15.05.22_06.28.59.0.log TaxBillReminder.mybluemix.net - [22/05/2015:06:28:58 +0000] "GET / HTTP/1.1" 404 217 "-" "Java/1.8.0" 75.126.70.42:31418 x_forwarded_for:"-" vcap_request_id:430a380b-a68e-4123-6ff8-c87348c535a3 response_time:0.813611619 app_id:70683a0f-06f4-4ad9-93b7-b37dc8241211
Your application is a Spring application, and the auto-reconfiguration is causing problems.
With the latest Liberty buildpack, you can set the JBP_CONFIG_SPRINGAUTORECONFIGURATION environment variable to '[enabled: false]' to disable Spring auto-reconfiguration. I think in your case the Spring auto-reconfiguration is the cause of this problem. Using the cf client, execute the following and then restage your application:
$ cf set-env myApplication JBP_CONFIG_SPRINGAUTORECONFIGURATION '[enabled: false]'
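Then restage so the new setting takes effect:
$ cf restage myApplication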