Error when trying to start Slurm on nodes - CentOS 7

I'm building a cluster out of personal machines. I installed CentOS 7 on the server, and I'm trying to start the Slurm clients, but when I run this command:
pdsh -w n[00-09] systemctl start slurmd
I had this error:
n07: Job for slurmd.service failed because the control process exited with error code. See "systemctl status slurmd.service" and "journalctl -xe" for details.
pdsh@localhost: n07: ssh exited with exit code 1
I had that message for all the nodes.
[root@localhost ~]# systemctl status slurmd.service -l
● slurmd.service - Slurm node daemon
Loaded: loaded (/usr/lib/systemd/system/slurmd.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2020-12-22 18:27:30 CST; 27min ago
Process: 1589 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=203/EXEC)
Dec 22 18:27:30 localhost.localdomain systemd[1]: Starting Slurm node daemon...
Dec 22 18:27:30 localhost.localdomain systemd[1]: slurmd.service: control process exited, code=exited status=203
Dec 22 18:27:30 localhost.localdomain systemd[1]: Failed to start Slurm node daemon.
Dec 22 18:27:30 localhost.localdomain systemd[1]: Unit slurmd.service entered failed state.
Dec 22 18:27:30 localhost.localdomain systemd[1]: slurmd.service failed.
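Exit status 203/EXEC from systemd means the control process could not execute the ExecStart binary at all, typically because the file is missing or not executable on that node. A quick way to check every node at once, as a sketch using the stock path from the unit file above:
# Confirm the slurmd binary exists and is executable on every node
pdsh -w n[00-09] 'ls -l /usr/sbin/slurmd'
# Pull each node's local journal for the real error text
pdsh -w n[00-09] 'journalctl -u slurmd --no-pager -n 5'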
This is the slurm.conf file:
ClusterName=linux
ControlMachine=localhost
#ControlAddr=
#BackupController=
#BackupAddr=
#
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
StateSaveLocation=/var/spool/slurm/ctld
SlurmdSpoolDir=/var/spool/slurm/d
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
ProctrackType=proctrack/pgid
#PluginDir=
#FirstJobId=
#MaxJobCount=
#PlugStackConfig=
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#Prolog=
#Epilog=
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
#TaskPlugin=
#TrackWCKey=no
#TreeWidth=50
#TmpFS=
#UsePAM=
#
#TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
#SCHEDULING
SchedulerType=sched/backfill
#SchedulerAuth=
#SelectType=select/linear
FastSchedule=1
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
#PriorityWeightFairshare=100000
#PriorityWeightAge=1000
#PriorityWeightPartition=10000
#PriorityWeightJobSize=1000
#PriorityMaxAge=1-0
#
#LOGGING
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurmd.log
JobCompType=jobcomp/none
#JobCompLoc=
#
#ACCOUNTING
#JobAcctGatherType=jobacct_gather/linux
#JobAcctGatherFrequency=30
#
#AccountingStorageType=accounting_storage/slurmdbd
#AccountingStorageHost=
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStorageUser=
#
#COMPUTE NODES
# OpenHPC default configuration
TaskPlugin=task/affinity
PropagateResourceLimitsExcept=MEMLOCK
AccountingStorageType=accounting_storage/filetxt
Epilog=/etc/slurm/slurm.epilog.clean
NodeName=n[00-09] Sockets=1 CoresPerSocket=6 ThreadsPerCore=2 State=UNKNOWN
PartitionName=normal Nodes=n[00-09] Default=YES MaxTime=24:00:00 State=UP
ReturnToService=1
The ControlMachine value was set from the output of hostname -s.
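One more check worth recording here: once slurmd can start, running it with -C on a compute node prints the hardware Slurm detects, which should agree with the NodeName line in the config above. A sketch:
# On a compute node: print the node's resources as slurmd detects them
slurmd -C
# The output is a NodeName=... line, e.g. (shape only, values depend on the machine):
# NodeName=n07 CPUs=12 Boards=1 SocketsPerBoard=1 CoresPerSocket=6 ThreadsPerCore=2 RealMemory=...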

Related

Cannot run Puma with systemd on Debian while the same start command in a terminal works (fe_sendauth: no password supplied error)

I'm trying to manage the Puma server for my Ruby on Rails website with systemd. Puma cannot start, failing with the error PG::ConnectionBad: fe_sendauth: no password supplied. When I start Puma myself in a terminal with the same start command as in systemd, it runs correctly. Please help.
I use RoR 4.2.11.1 and PostgreSQL 11.2 on Debian 9.12, which runs on VirtualBox 6.0.
Website file structure:
/mytarifs/current - symlink to the latest release
/mytarifs/releases - releases
/mytarifs/shared - shared files like database connections
I successfully start Puma in a terminal with the following command:
root@mt-staging-1:/mytarifs/current# bundle exec puma -C config/puma.production.rb
DATABASE_URL environment variable:
DATABASE_URL=postgresql://login_name:password@localhost:5432/db_tarif
With this database url I can connect to my db with psql
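For instance, a test along these lines (same URL as above) succeeds:
psql "postgresql://login_name:password@localhost:5432/db_tarif" -c "SELECT 1;"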
error log:
Mar 07 02:20:39 mt-staging-1 systemd[1]: Started puma for mytarifs (production).
Mar 07 02:20:40 mt-staging-1 puma[12237]: [12237] Puma starting in cluster mode...
Mar 07 02:20:40 mt-staging-1 puma[12237]: [12237] * Version 4.3.3 (ruby 2.3.8-p459), codename: Mysterious Traveller
Mar 07 02:20:40 mt-staging-1 puma[12237]: [12237] * Min threads: 0, max threads: 5
Mar 07 02:20:40 mt-staging-1 puma[12237]: [12237] * Environment: production
Mar 07 02:20:40 mt-staging-1 puma[12237]: [12237] * Process workers: 1
Mar 07 02:20:40 mt-staging-1 puma[12237]: [12237] * Preloading application
Mar 07 02:20:47 mt-staging-1 puma[12237]: The PGconn, PGresult, and PGError constants are deprecated, and will be
Mar 07 02:20:47 mt-staging-1 puma[12237]: removed as of version 1.0.
Mar 07 02:20:47 mt-staging-1 puma[12237]: You should use PG::Connection, PG::Result, and PG::Error instead, respectively.
Mar 07 02:20:47 mt-staging-1 puma[12237]: Called from /mytarifs/releases/20200306184828/vendor/bundle/ruby/2.3.0/gems/activesupport-4.2.11.1/lib/active_support/dependencies.rb:240:in `load_dependency'
/mytarifs/current/config/puma.production.rb
threads Integer(ENV['MIN_THREADS'] || 0), Integer(ENV['MAX_THREADS'] || 5)
workers Integer(ENV['PUMA_WORKERS'] || 1)
preload_app!
bind 'unix:///mytarifs/shared/tmp/sockets/puma.sock'
pidfile '/mytarifs/shared/tmp/pids/puma.production.pid'
state_path '/mytarifs/shared/tmp/pids/puma.state'
rackup DefaultRackup
environment ENV['RACK_ENV'] || 'production'
on_worker_boot do
  ActiveSupport.on_load(:active_record) do
    ActiveRecord::Base.establish_connection
  end
end
/mytarifs/current/config/database.yml
default: &default
  adapter: postgresql
  encoding: unicode
  pool: 125
  username: <%= ENV["PG_USERNAME"] %>
  password: <%= ENV["PG_PASSWORD"] %>
  host: localhost
  template: template0
  reconnect: true
production:
  <<: *default
  url: <%= ENV["DATABASE_URL"] %>
/etc/systemd/system/puma.service
[Unit]
Description=puma for mytarifs (production)
After=network.target
[Service]
Type=simple
Environment=RAILS_ENV=production
Environment=PUMA_DEBUG=1
WorkingDirectory=/mytarifs/current
ExecStart=/root/.rbenv/shims/bundle exec puma -e production -C config/puma.production.rb
ExecReload=/bin/kill -TSTP $MAINPID
ExecStop=/bin/kill -TERM $MAINPID
User=root
Group=root
RestartSec=1
Restart=on-failure
SyslogIdentifier=puma
[Install]
WantedBy=multi-user.target
OK, I found the reason for the mistake. It is because environment variables are not available (they evaluate to "") when systemd executes the service.
I do not know how to get environment variables from the shell's memory, but systemd can take them from a file with the directive EnvironmentFile=/absolute/path/to/environment/file
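A minimal sketch of that approach, with a hypothetical file path /mytarifs/shared/puma.env (systemd expects plain KEY=value lines, no "export"):
# /mytarifs/shared/puma.env (hypothetical path)
DATABASE_URL=postgresql://login_name:password@localhost:5432/db_tarif
PG_USERNAME=login_name
PG_PASSWORD=password
and in the [Service] section of /etc/systemd/system/puma.service:
[Service]
EnvironmentFile=/mytarifs/shared/puma.env
followed by systemctl daemon-reload and systemctl restart puma.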

Adding content_type to render_to_response in Django's views.py causing 'Server Error (500)'

In Django 1.11, in views.py, I am using the render_to_response function as follows:
return render_to_response(domainObject.template_path, context_dict, context)
This works fine. Now I am trying to specify the content_type for this response as 'txt/html'. So I switch to
content_type = 'txt/html'
return render_to_response(domainObject.template_path, context_dict, context, content_type)
But with this setup the server returns a
Server Error (500)
Following the documentation at https://docs.djangoproject.com/en/1.8/topics/http/shortcuts/#render-to-response I think I am providing the variables in the right order...
Here is the full 'def' block for reference:
def myview(request):
    context = RequestContext(request)
    if request.homepage:
        migrationObject = calltomigration()
    else:
        try:
            integrationObject = Integration.objects.filter(subdomain_slug=request.subdomain).get()
        except ObjectDoesNotExist:
            logger.warning(ObjectDoesNotExist)
            raise Http404
    sectionContent = None
    if not request.homepage:
        sectionContent = getLeafpageSectionContent(referenceObject)
    context_dict = {
        'reference': referenceObject,
        'sectionContent': sectionContent,
        'is_homepage': request.homepage
    }
    # content_type = 'txt/html'
    return render_to_response(domainObject.template_path, context_dict, context)
Here is the NGINX status:
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-01-17 16:34:15 UTC; 40s ago
Docs: man:nginx(8)
Process: 14517 ExecStop=/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid (code=exited, status=2)
Process: 14558 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 14546 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 14562 (nginx)
Tasks: 2 (limit: 1152)
CGroup: /system.slice/nginx.service
├─14562 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
└─14564 nginx: worker process
Jan 17 16:34:15 ip-172-31-8-232 systemd[1]: nginx.service: Failed with result 'timeout'.
Jan 17 16:34:15 ip-172-31-8-232 systemd[1]: Stopped A high performance web server and a reverse proxy server.
Jan 17 16:34:15 ip-172-31-8-232 systemd[1]: Starting A high performance web server and a reverse proxy server...
Jan 17 16:34:15 ip-172-31-8-232 systemd[1]: nginx.service: Failed to parse PID from file /run/nginx.pid: Invalid argument
Jan 17 16:34:15 ip-172-31-8-232 systemd[1]: Started A high performance web server and a reverse proxy server.
[1]+ Done sudo systemctl restart nginx
Today I fixed the issue. I found out that in render_to_response, the MIME type has to be specified in the third position (at least in the setup I am working on). Most OS/browser combinations figured out the malformed MIME type, with the exception of Edge on PC. Fixed now!
The standard Django shortcuts function 'render' provides the same functionality as 'render_to_response'. Django's 'render_to_response' function was deprecated in 2.2 and officially removed from Django in 3.0. You can check the release notes here:
https://docs.djangoproject.com/en/3.0/releases/3.0/
Check out the official documentation for the render function here:
https://docs.djangoproject.com/en/3.0/topics/http/shortcuts/
Also, the content_type should be 'text/html' instead of 'txt/html'.
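For illustration, a minimal sketch of the render-based replacement, reusing names from the question (domainObject and context_dict are assumed to be built as in the view above); passing content_type by keyword sidesteps the positional-order problem entirely:
from django.shortcuts import render

def myview(request):
    # ... build domainObject and context_dict exactly as in the question ...
    return render(request, domainObject.template_path, context_dict,
                  content_type='text/html')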

Celery - Permission Problem - Create folder

I use Celery (a job manager) in prod mode for a website (Django) on a CentOS 7 server.
My problem is that my function does not create a folder inside a Celery task (see my_function below).
the function
def my_function():
    parent_folder = THE_PARENT_PATH
    if not os.path.exists(parent_folder):
        os.makedirs(parent_folder)
    # The folder THE_PARENT_PATH is created
    celery_task(parent_folder)
the celery task
@app.task(name='a task')
def celery_task(parent_folder):
    import getpass; print("permission : ", getpass.getuser())
    # permission : apache
    path_1 = os.path.join(parent_folder, "toto")
    if not os.path.exists(path_1):
        os.makedirs(path_1)
    # The folder path_1 is NOT created
    # ..... some other instructions ...
    # Singularity image run (needs the path_1 folder)
I use Supervisord to daemonize Celery.
celery.init
[program:sitecelery]
command=/etc/supervisord.d/celery.sh
directory=/mnt/site/
user=apache
numprocs=1
stdout_logfile=/var/log/celery/worker.log
stderr_logfile=/var/log/celery/worker.log
autostart=true
autorestart=true
priority=999
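For reference, after any edit to this config, supervisor has to re-read it before the user= setting takes effect, e.g.:
sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl restart sitecelery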
The folder path_1 is created when user=root, but I want it to be created by the apache user, not root.
celery.sh
#!/bin/bash
cd /mnt/site/
exec ../myenv/bin/python3 -m celery -A site.celery_settings worker -l info --autoscale 20
sudo systemctl status supervisord
● supervisord.service - Process Monitoring and Control Daemon
Loaded: loaded (/usr/lib/systemd/system/supervisord.service; disabled; vendor preset: disabled)
Active: active (running) since lun. 2018-10-15 09:09:05 CEST; 4min 59s ago
Process: 61477 ExecStart=/usr/bin/supervisord -c /etc/supervisord.conf (code=exited, status=0/SUCCESS)
Main PID: 61480 (supervisord)
CGroup: /system.slice/supervisord.service
├─61480 /usr/bin/python /usr/bin/supervisord -c /etc/supervisord.conf
└─61491 ../myenv/bin/python3 -m celery -A Site_CNR.celery_settings worker -l info --autoscale 20
oct. 15 09:09:05 web01 systemd[1]: Starting Process Monitoring and Control Daemon...
oct. 15 09:09:05 web01 systemd[1]: Started Process Monitoring and Control Daemon.
oct. 15 09:09:17 web01 Singularity[61669]: action-suid (U=48,P=61669)> Home directory is not owned by calling user: /usr/share/httpd
oct. 15 09:09:17 web01 Singularity[61669]: action-suid (U=48,P=61669)> Retval = 255
oct. 15 09:09:17 web01 Singularity[61678]: action-suid (U=48,P=61678)> Home directory is not owned by calling user: /usr/share/httpd
oct. 15 09:09:17 web01 Singularity[61678]: action-suid (U=48,P=61678)> Retval = 255
EDIT 1: os.makedirs
In the celery task:
if not os.path.exists(path_1):
    print("test")
    # test
    print(os.makedirs(path_1))
    # None
    os.makedirs(path_1)
The os.makedirs call returns None :/
I don't know why, but a correction from a post about a similar error, sudo chown -R apache:apache /usr/share/httpd/, resolved this problem oO
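That lines up with the Singularity messages in the status output above: /usr/share/httpd is the apache account's home directory, and Singularity refuses to run when the calling user does not own its home ("Home directory is not owned by calling user"). A quick way to check before and after the chown, as a sketch:
# Show the apache account's home directory and its current owner
getent passwd apache
ls -ld /usr/share/httpd
# Hand it to apache, then confirm apache can write there
sudo chown -R apache:apache /usr/share/httpd/
sudo -u apache touch /usr/share/httpd/.write_test && echo "apache can write"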

Sphinx installation on CentOS 7

I just updated Sphinx to the latest version on a dedicated server running CentOS 7, but after hours of searching I can't find the problem.
The Sphinx index was created fine, but I can't start the search daemon. I get these messages every time:
systemctl status searchd.service
searchd.service - SphinxSearch Search Engine
Loaded: loaded (/usr/lib/systemd/system/searchd.service; disabled; vendor preset: disabled)
Active: failed (Result: timeout) since Sat 2018-03-24 21:14:09 CET; 3min 4s ago
Process: 17865 ExecStartPre=/bin/chown sphinx.sphinx /var/run/sphinx (code=exited, status=0/SUCCESS)
Process: 17863 ExecStartPre=/bin/mkdir -p /var/run/sphinx (code=killed, signal=TERM)
Mar 24 21:14:09 systemd[1]: Starting SphinxSearch Search Engine...
Mar 24 21:14:09 systemd[1]: searchd.service start-pre operation timed out. Terminating.
Mar 24 21:14:09 systemd[1]: Failed to start SphinxSearch Search Engine.
Mar 24 21:14:09 systemd[1]: Unit searchd.service entered failed state.
Mar 24 21:14:09 systemd[1]: searchd.service failed.
I have really no idea where this problem comes from.
In your systemd service file (mine is in /usr/lib/systemd/system/searchd.service), comment out the two ExecStartPre lines:
/bin/mkdir -p /var/run/sphinx
/bin/chown sphinx.sphinx /var/run/sphinx
(you can run these commands manually if that hasn't been done yet).
Then change from
Type=forking
to
Type=simple
Then do systemctl daemon-reload and you can start/stop/status the service:
[root@server ~]# cat /usr/lib/systemd/system/searchd.service
[Unit]
Description=SphinxSearch Search Engine
After=network.target remote-fs.target nss-lookup.target
After=syslog.target
[Service]
Type=simple
User=sphinx
Group=sphinx
# Run ExecStartPre with root-permissions
PermissionsStartOnly=true
#ExecStartPre=/bin/mkdir -p /var/run/sphinx
#ExecStartPre=/bin/chown sphinx.sphinx /var/run/sphinx
# Run ExecStart with User=sphinx / Group=sphinx
ExecStart=/usr/bin/searchd --config /etc/sphinx/sphinx.conf
ExecStop=/usr/bin/searchd --config /etc/sphinx/sphinx.conf --stopwait
KillMode=process
KillSignal=SIGTERM
SendSIGKILL=no
LimitNOFILE=infinity
TimeoutStartSec=infinity
PIDFile=/var/run/sphinx/searchd.pid
[Install]
WantedBy=multi-user.target
Alias=sphinx.service
Alias=sphinxsearch.service
[root@server ~]# systemctl start searchd
[root@server ~]# systemctl status searchd
● searchd.service - SphinxSearch Search Engine
Loaded: loaded (/usr/lib/systemd/system/searchd.service; disabled; vendor preset: disabled)
Active: active (running) since Sun 2018-03-25 10:41:24 EDT; 4s ago
Process: 111091 ExecStop=/usr/bin/searchd --config /etc/sphinx/sphinx.conf --stopwait (code=exited, status=1/FAILURE)
Main PID: 112030 (searchd)
CGroup: /system.slice/searchd.service
├─112029 /usr/bin/searchd --config /etc/sphinx/sphinx.conf
└─112030 /usr/bin/searchd --config /etc/sphinx/sphinx.conf
Mar 25 10:41:24 server.domain.com searchd[112026]: Sphinx 2.3.2-id64-beta (4409612)
Mar 25 10:41:24 server.domain.com searchd[112026]: Copyright (c) 2001-2016, Andrew Aksyonoff
Mar 25 10:41:24 server.domain.com searchd[112026]: Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Mar 25 10:41:24 server.domain.com searchd[112026]: Sphinx 2.3.2-id64-beta (4409612)
Mar 25 10:41:24 server.domain.com searchd[112026]: Copyright (c) 2001-2016, Andrew Aksyonoff
Mar 25 10:41:24 server.domain.com searchd[112026]: Copyright (c) 2008-2016, Sphinx Technologies Inc (http://sphinxsearch.com)
Mar 25 10:41:24 server.domain.com searchd[112026]: precaching index 'test1'
Mar 25 10:41:24 server.domain.com searchd[112026]: WARNING: index 'test1': prealloc: failed to open /var/lib/sphinx/test1.sph: No such file or directory...T SERVING
Mar 25 10:41:24 server.domain.com searchd[112026]: precaching index 'testrt'
Mar 25 10:41:24 server.domain.com systemd[1]: searchd.service: Supervising process 112030 which is not our child. We'll most likely not notice when it exits.
Hint: Some lines were ellipsized, use -l to show in full.
[root@server ~]# systemctl stop searchd
[root@server ~]# systemctl status searchd
● searchd.service - SphinxSearch Search Engine
Loaded: loaded (/usr/lib/systemd/system/searchd.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Sun 2018-03-25 10:41:36 EDT; 1s ago
Process: 112468 ExecStop=/usr/bin/searchd --config /etc/sphinx/sphinx.conf --stopwait (code=exited, status=1/FAILURE)
Main PID: 112030
Mar 25 10:41:24 server.domain.com searchd[112026]: WARNING: index 'test1': prealloc: failed to open /var/lib/sphinx/test1.sph: No such file or directory...T SERVING
Mar 25 10:41:24 server.domain.com searchd[112026]: precaching index 'testrt'
Mar 25 10:41:24 server.domain.com systemd[1]: searchd.service: Supervising process 112030 which is not our child. We'll most likely not notice when it exits.
Mar 25 10:41:33 server.domain.com systemd[1]: Stopping SphinxSearch Search Engine...
Mar 25 10:41:33 server.domain.com searchd[112468]: [Sun Mar 25 10:41:33.183 2018] [112468] using config file '/etc/sphinx/sphinx.conf'...
Mar 25 10:41:33 server.domain.com searchd[112468]: [Sun Mar 25 10:41:33.183 2018] [112468] stop: successfully sent SIGTERM to pid 112030
Mar 25 10:41:36 server.domain.com systemd[1]: searchd.service: control process exited, code=exited status=1
Mar 25 10:41:36 server.domain.com systemd[1]: Stopped SphinxSearch Search Engine.
Mar 25 10:41:36 server.domain.com systemd[1]: Unit searchd.service entered failed state.
Mar 25 10:41:36 server.domain.com systemd[1]: searchd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
I had the same problem and finally found the solution that worked for me.
I have edited my "/etc/systemd/system/sphinx.service" to look like this:
[Unit]
Description=SphinxSearch Search Engine
After=network.target remote-fs.target nss-lookup.target
After=syslog.target
[Service]
User=sphinx
Group=sphinx
RuntimeDirectory=sphinxsearch
RuntimeDirectoryMode=0775
# Run ExecStart with User=sphinx / Group=sphinx
ExecStart=/usr/bin/searchd --config /etc/sphinx/sphinx.conf
ExecStop=/usr/bin/searchd --config /etc/sphinx/sphinx.conf --stopwait
KillMode=process
KillSignal=SIGTERM
SendSIGKILL=no
LimitNOFILE=infinity
TimeoutStartSec=infinity
#PIDFile=/var/run/sphinx/searchd.pid
PIDFile=/var/run/sphinxsearch/searchd.pid
[Install]
WantedBy=multi-user.target
Alias=sphinx.service
Alias=sphinxsearch.service
With that unit, my searchd is able to survive a reboot. In my case, the solution from the previous post had the problem that the runtime directory under /var/run is deleted on reboot, so searchd failed to start again; RuntimeDirectory= makes systemd recreate /var/run/sphinxsearch each time the service starts.
The catch is that RHEL (CentOS) 7 does not accept the "infinity" value for the "TimeoutStartSec" parameter. You must set a numeric value, for example TimeoutStartSec=600.
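Whichever variant of the unit is used, the edits only take effect after a reload; a quick cycle to verify (unit name as in the files above):
# Pick up the edited unit file, then restart and inspect
sudo systemctl daemon-reload
sudo systemctl restart searchd
systemctl status searchd --no-pager
# Confirm the runtime directory was (re)created, e.g. by RuntimeDirectory=
ls -ld /var/run/sphinxsearch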

Perl regex dotall problems

I'm trying to fetch certain values from a file that I've created with a system command. The file is in order, and the regex works up until I reach a "newline". I've tried to get it to grab the other value in multiple ways, but I can't seem to figure it out. Where am I going wrong?
Here is the code
sub servicechoise2 {
    my $sys_com = "Servicestatus.txt";
    print "type status you would like to see status of: ";
    my $service = <>;
    chomp $service;
    system( "systemctl status $service > $sys_com" );
    open( my $fh2, "<", $sys_com );
    my @services;
    while ( my $line = <$fh2> ) {
        if ( $line =~ /([a-z]+.service)\s-.*(running|dead)/s ) {
            my %hash2 = (
                "servicename"   => $1,
                "servicestatus" => $2
            );
            push( @services, \%hash2 );
        }
    }
    return \@services;
}
and here is the file I'm parsing
sshd.service - OpenSSH server daemon Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled) Active: active (running) since Fri 2015-08-21 18:20:06 CEST; 1h 32min ago Main PID: 1297 (sshd) CGroup: /system.slice/sshd.service
└─1297 /usr/sbin/sshd -D
Aug 21 18:20:06 Thomas-PC systemd[1]: Started OpenSSH server daemon. Aug 21 18:20:07 Thomas-PC sshd[1297]: Server listening on 0.0.0.0 port
22. Aug 21 18:20:07 Thomas-PC sshd[1297]: Server listening on :: port 22.
cups.service - CUPS Printing Service Loaded: loaded (/usr/lib/systemd/system/cups.service; enabled) Active: active (running) since Fri 2015-08-21 18:20:33 CEST; 1h 32min ago Main PID: 3657 (cupsd) CGroup: /system.slice/cups.service
└─3657 /usr/sbin/cupsd -f
Aug 21 18:20:33 Thomas-PC systemd[1]: Started CUPS Printing Service.
ntpd.service - Network Time Service Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled) Active: inactive (dead)
named.service - Berkeley Internet Name Domain (DNS) Loaded: loaded (/usr/lib/systemd/system/named.service; enabled) Active: active (running) since Fri 2015-08-21 18:20:10 CEST; 1h 32min ago Process: 2477 ExecStart=/usr/sbin/named -u named $OPTIONS (code=exited, status=0/SUCCESS) Process: 1302 ExecStartPre=/usr/sbin/named-checkconf -z /etc/named.conf (code=exited, status=0/SUCCESS) Main PID: 2502 (named) CGroup: /system.slice/named.service
└─2502 /usr/sbin/named -u named
Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.biz/A/IN': 2001:503:7bbb:ffff:ffff:ffff:ffff:ff7e#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.biz/AAAA/IN': 2001:503:7bbb:ffff:ffff:ffff:ffff:ff7e#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.biz/A/IN': 2001:500:3682::12#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.biz/AAAA/IN': 2001:500:3682::12#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'ns2.isc.ultradns.net/A/IN': 2001:502:4612::e8#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.com/AAAA/IN': 2001:502:f3ff::e8#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.com/AAAA/IN': 2610:a1:1016::e8#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.co.uk/AAAA/IN': 2610:a1:1017::e8#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.co.uk/A/IN': 2610:a1:1017::e8#53 Aug 21 19:20:11 Thomas-PC named[2502]: error (network unreachable) resolving 'pdns196.ultradns.biz/A/IN': 2610:a1:1015::e8#53
postfix.service - Postfix Mail Transport Agent Loaded: loaded (/usr/lib/systemd/system/postfix.service; enabled) Active: active (running) since Fri 2015-08-21 18:20:10 CEST; 1h 32min ago Process: 1335 ExecStart=/usr/sbin/postfix start (code=exited, status=0/SUCCESS) Process: 1328 ExecStartPre=/usr/libexec/postfix/chroot-update (code=exited, status=0/SUCCESS) Process: 1298 ExecStartPre=/usr/libexec/postfix/aliasesdb (code=exited, status=0/SUCCESS) Main PID: 2531 (master) CGroup: /system.slice/postfix.service
├─2531 /usr/libexec/postfix/master -w
├─2534 pickup -l -t unix -u
└─2535 qmgr -l -t unix -u
Aug 21 18:20:06 Thomas-PC systemd[1]: Starting Postfix Mail Transport Agent... Aug 21 18:20:09 Thomas-PC postfix/postfix-script[2510]: warning: group or other writable: /etc/postfix/./main.cf Aug 21 18:20:10 Thomas-PC postfix/postfix-script[2529]: starting the Postfix mail system Aug 21 18:20:10 Thomas-PC postfix/master[2531]: daemon started -- version 2.10.1, configuration /etc/postfix Aug 21 18:20:10 Thomas-PC systemd[1]: Started Postfix Mail Transport Agent. Aug 21 18:23:08 Thomas-PC postfix/smtpd[4293]: connect from localhost[127.0.0.1] Aug 21 18:23:08 Thomas-PC postfix/smtpd[4293]: NOQUEUE: reject: RCPT from localhost[127.0.0.1]: 550 5.1.1 <a14thona@localhost>: Recipient address rejected: User unknown in local recipient table; from=<admin@localhost> to=<a14thona@localhost> proto=ESMTP helo=<localhost.localdomain> Aug 21 18:23:08 Thomas-PC postfix/smtpd[4293]: lost connection after RCPT from localhost[127.0.0.1] Aug 21 18:23:08 Thomas-PC postfix/smtpd[4293]: disconnect from localhost[127.0.0.1]
The subroutine should return this array of hashes:
[
{ servicename => "sshd.service", servicestatus => "running" },
{ servicename => "cups.service", servicestatus => "running" },
{ servicename => "ntpd.service", servicestatus => "dead" },
{ servicename => "named.service", servicestatus => "running" },
{ servicename => "postfix.service", servicestatus => "running" },
]
I would read the whole file into a variable and then process the records using split (it seems there are empty lines between records), something like:
open( my $fh, "<", $sys_com ) || die "cannot open $sys_com: $!";
my $in;
{ local $/; $in = <$fh>; }    # slurp the whole file
foreach my $line ( split( /\n\n/, $in ) ) {
    if ( $line =~ /([a-z]+.service)\s-.*(running|dead)/s ) {
        ......
    }
}
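For what it's worth, an equivalent approach is Perl's paragraph mode: with $/ set to the empty string, the readline operator returns one blank-line-separated record at a time, so no separate slurp-and-split pass is needed. A minimal sketch reusing the question's filehandle ($fh2), @services array, and regex:
{
    local $/ = "";    # paragraph mode: one blank-line-separated record per read
    while ( my $record = <$fh2> ) {
        if ( $record =~ /([a-z]+.service)\s-.*(running|dead)/s ) {
            push @services, { servicename => $1, servicestatus => $2 };
        }
    }
}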