dtrace: use pid provider on multiple libraries simultaneously

This dtrace script will fire every time any function is called in libx by process 12345.
dtrace -q -n 'pid12345:libx::entry { printf("probe fired"); }'
But what I really want is to detect function calls in several libraries, say libx, liby, and libz... something like:
dtrace -q -n 'pid12345:libx,liby,libz::entry { printf("probe fired"); }'
Does anyone know if this is possible using the pid provider - or any other provider?
Thanks!

You can use globbing in the probe description, e.g.
dtrace -q -n 'pid12345:lib[xyz].so::entry { printf("probe fired"); }'
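To see which probes a glob will match before enabling them, you can list them first with -l. Note that on many systems the module name carries a version suffix such as libx.so.1, so a trailing * in the glob is safer:
dtrace -l -n 'pid12345:lib[xyz].so*::entry'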

I think I found the answer to my own question, but would welcome any other suggestions. The solution I found is to add multiple comma-separated probes:
dtrace -q -n 'pid12345:libx::entry, pid12345:liby::entry, pid12345:libz::entry { printf("probe fired"); }'

Related

How to stop grep after matching

In Windows, I would have searched for a folder's name using findstr. Similarly, I want to get a specific folder using grep.
In Windows, I'm using svnlook tree -t [repos_path] | findstr (13\.9\.[0-9]+\/)
On an EC2 machine (Linux): svnlook tree /var/www/svn/ILS | grep -Eo '(13\.9\.[0-9]+\/)'
and I get the folders that I need:
13.9.4/
13.9.5/
13.9.6/
13.9.7/
My problem is that the grep command on Linux doesn't stop (exit); it keeps running.
How can I stop it after matching?
You can specify -m NUM: the maximum number of matches. After the specified number of matching lines, grep stops reading.
After ^Z the svnlook is paused. You can kill the program (^C), send it to the background (bg), or continue it (fg).
When you want to interrupt, you can use ^C, or start the grep with the -m option.
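For example, to stop after the first four matches (the count of 4 here is arbitrary):
svnlook tree /var/www/svn/ILS | grep -m 4 -Eo '(13\.9\.[0-9]+\/)'
Once grep exits, the next write by svnlook fails with SIGPIPE, so the whole pipeline terminates instead of running on.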

Find if a port is available to use in Linux using C++

I am working on a C++ project. To fulfill one of the requirements, I need to check at any time whether a port is available for use in my application. I have come up with the following solution:
#include <iostream>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <stdio.h>
std::string _executeShellCommand(std::string command) {
    char buffer[256];
    std::string result = "";
    const char *cmd = command.c_str();
    FILE *pipe = popen(cmd, "r");
    if (!pipe) throw std::runtime_error("popen() failed!");
    try {
        while (!feof(pipe))
            if (fgets(buffer, sizeof(buffer), pipe) != NULL)
                result += buffer;
    } catch (...) {
        pclose(pipe);
        throw;
    }
    pclose(pipe);
    return result;
}
bool _isAvailablePort(unsigned short usPort) {
    char shellCommand[256], pcPort[6];
    sprintf(shellCommand, "netstat -lntu | awk '{print $4}' | grep ':' | cut -d \":\" -f 2 | sort | uniq | grep %hu", usPort);
    sprintf(pcPort, "%hu", usPort);
    std::string output = _executeShellCommand(std::string(shellCommand));
    if (output.find(std::string(pcPort)) != std::string::npos)
        return false;
    else
        return true;
}
int main() {
    bool res = _isAvailablePort(5678);
    return 0;
}
Here, basically, the _executeShellCommand function can execute any shell command and return its stdout output as a string.
And I am executing the following shell command in that function:
netstat -lntu | awk '{print $4}' | grep ':' | cut -d \":\" -f 2 | sort | uniq | grep portToCheck
So, if the port is already in use, _executeShellCommand will return the port value itself; otherwise it will return an empty string. By checking the returned string, I can decide.
So far so good.
Now I want to make my project completely crash-proof. So, before firing the netstat command, I want to make sure it really exists, and I want help with that. I know it's kind of stupid to doubt the availability of the netstat command on a Linux machine; I am just thinking of some user who removed the netstat binary from his machine for some reason.
N.B.: I don't want to make a bind() call to check if the port is available or not. Also, it would be best if I could check whether the netstat command is available without calling _executeShellCommand a second time (i.e. without executing another shell command).
An even better idea is to make your code work completely without netstat altogether.
On Linux, all that netstat does (for your use case) is read the contents of /proc/net/tcp, which enumerates all ports in use.
All you have to do is open /proc/net/tcp yourself, and parse it. This becomes just an ordinary, boring, file parsing code. Can't get much more "crash-proof" than that.
You will find the documentation of the format of /proc/net/tcp in Linux manual pages.
In the unlikely event that you need to check UDP ports, this would be /proc/net/udp.
Of course, there is a race window between the time you check /proc/net/tcp and the time you use the port, in which someone else can grab it. But that's equally true with netstat, and since netstat is a much slower process, doing the check yourself is actually an improvement that shrinks the race window significantly.
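A minimal sketch of such a parser (a hypothetical helper; it assumes the documented layout where the second field of each line is the local address as hex IP:PORT, and a real version would also scan /proc/net/tcp6 and /proc/net/udp, with error handling added):
#include <fstream>
#include <sstream>
#include <string>
// Returns true if usPort appears as a local port in /proc/net/tcp.
bool _isPortInProcNetTcp(unsigned short usPort) {
    std::ifstream f("/proc/net/tcp");
    std::string line;
    std::getline(f, line);                        // skip the header line
    while (std::getline(f, line)) {
        std::istringstream iss(line);
        std::string sl, local;
        iss >> sl >> local;                       // fields: "sl", "local_address"
        std::size_t colon = local.find(':');
        if (colon == std::string::npos) continue;
        // the port is the hex string after the colon
        if (std::stoul(local.substr(colon + 1), nullptr, 16) == usPort)
            return true;
    }
    return false;
}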
Since you're asking for a way to check whether the netstat command is available, I won't try to suggest other ways in C++. The shell way is to check the return code of the following command:
command -v netstat
If the netstat binary is available in $PATH, the command returns 0. In Bash it usually looks like this:
command -v netstat
if [ $? -eq 0 ]; then
    netstat # ...
else
    echo >&2 "Error: netstat is not available"
fi
Or simply
command -v netstat >/dev/null && netstat # ...

Disk Space Usage Profiling in Linux

I am running a C++ program on Linux [Red Hat]. This program creates temporary files on the hard disk to compute its result. I need to know how much disk space the program uses while it is running. I am not able to change the source code to keep the files, so I cannot simply subtract the size of the program folder before and after producing the results. Is there any profiling tool or command line I can use in this situation?
You can use du -h /path/to/dir before and after the program runs from the terminal, I suppose.
du - estimate file space usage
-h, --human-readable
print sizes in human readable format (e.g., 1K 234M 2G)
If you need further options, just take a look at man du.
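Since the program deletes its temporary files before it exits, a plain before/after comparison can miss the peak. One workaround is to sample du in a loop while the program runs (a sketch: /path/to/dir stands for the directory the program writes to, and $pid for its process ID; -b is GNU-specific):
max=0
while kill -0 "$pid" 2>/dev/null; do
    cur=$(du -sb /path/to/dir | cut -f1)   # -s: summarize, -b: bytes
    [ "$cur" -gt "$max" ] && max=$cur
    sleep 1
done
echo "peak disk usage: $max bytes"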
You may make use of the stats maintained in /proc/[pid]/io.
From the proc man page:
This file contains I/O statistics for the process, for
example:
# cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
write_bytes: bytes written
Attempt to count the number of bytes which this process
caused to be sent to the storage layer.
You may write a script something like this:
#!/bin/bash
echo $1
rm -f /tmp/writtenBytes.txt
psout=`ps aux | grep "$1" | grep -v $0 | grep -v grep`
while [[ "$psout" != "" ]]
do
    pid=`echo $psout | awk '{print $2}'`
    cat /proc/$pid/io | grep write | grep -v cancelled >> /tmp/writtenBytes.txt
    psout=`ps aux | grep "$1" | grep -v $0 | grep -v grep`
    echo $psout
    sleep 1
done
Run this script as
bash -x getIO.sh "postgres: stats collector"
This script will create the file /tmp/writtenBytes.txt, containing the number of bytes written by processes named "postgres: stats collector".
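Note that write_bytes in /proc/[pid]/io is cumulative over the life of the process, so for a given process the last sample appended is its running total; for instance, to see the latest sample:
tail -n 1 /tmp/writtenBytes.txt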

Hide cat prompt errors

I would like to set up a script in order to continuously parse a specific marker in an XML file.
The script contains the following while loop:
function scan_t()
{
    INPUT_FILE=${1}
    while : ; do
        if [[ -f "$INPUT_FILE" ]]
        then
            ret=`cat ${INPUT_FILE} | grep "<data>" | awk -F"=|>" '{print $2}' | awk -F"=|<" '{print $1}'`
            if [[ "$ret" -ne 0 ]] && [[ -n "$ret" ]]
            then
                ...
            fi
        fi
    done
}
scan_t "/tmp/test.xml"
The line format is:
<data>0</data> or <data>1</data> <data>2</data> ..
Even though the condition if [[ -f "$INPUT_FILE" ]] has been added to the script, sometimes I get:
cat: /tmp/test.xml: No such file or directory.
Indeed, $INPUT_FILE is normally consumed by another process, which is charged to suppress the file after reading it.
This while loop is only used for testing. The cat error doesn't matter, but I would like to hide this message because it pollutes the terminal a lot.
If some other process can also read and remove the file before this script sees it, you've designed your system with a race condition. (I assume that "charged to suppress" means "designed to unlink"...)
If it's optional for this script to see every input file, then just redirect stderr to /dev/null (i.e. ignore errors when the race condition bites). If it's not optional, then have this script rename the input file to something else, and have the other process watch for that. Check for that file existing before you do the rename, to make sure you don't overwrite a file the other process hasn't read yet.
Your loop has a horrible design. First, you're busy-waiting (with no sleep at all) on the file coming into existence. Second, you're running 4 programs when the input exists, instead of 1.
The busy-wait can be avoided by using inotifywait to watch the directory for changes. So the if [[ -f $INPUT_FILE ]] loop body only runs after a modification to the directory, rather than as fast as a CPU core can run it.
The second is simpler to address: never cat file | something. Either something file, or something < file if something doesn't take filenames on its command line, or behaves differently. cat is only useful if you have multiple files to concatenate. For reading a file into a shell variable, use foo=$(<file).
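Concretely, the question's cat | grep | awk | awk pipeline can collapse to a single awk call (a sketch; the field splitting assumes the <data>N</data> format shown above):
ret=$(awk -F'[<>]' '/<data>/ {printf "%s ",$3} END {print ""}' "$INPUT_FILE")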
I see from comments you've already managed to turn your whole pipeline into a single command. So write
INPUT_FILE=foo
inotifywait -m -e close_write -e moved_to --format %f . |
  while IFS= read -r event_file; do
    [[ $event_file == $INPUT_FILE ]] &&
      awk -F '[<,>]' '/data/ {printf "%s ",$3} END {print ""}' "$INPUT_FILE" 2>/dev/null
    # echo "$event_file" &&
    # date;
  done
# tested and working with the commented-out echo/date commands
Note that I'm waiting for close_write and moved_to, rather than other events, to avoid jumping the gun and reading a file that's not finished being written. Put $INPUT_FILE in its own directory, so you don't get false-positive events waking up your loop for other filenames.
To also implement the rename-to-input-for-next-stage suggestion, you'd put a while [[ -e $INPUT2 ]]; do sleep 0.2; done; mv -n "$INPUT_FILE" "$INPUT2" busy-wait loop after the awk.
An alternative would be to run inotifywait once per loop iteration, but that has the potential for you to get stuck with $INPUT_FILE created before inotifywait started watching. So the producer would be waiting for the consumer to consume, and the consumer wouldn't see the event.
# Race condition with an asynchronous producer, DON'T USE
while inotifywait -qq -e close_write -e moved_to .; do
    [[ -e $INPUT_FILE ]] &&
        awk -F '[<,>]' '/data/ {printf "%s ",$3} END {print ""}' "$INPUT_FILE" 2>/dev/null
done
There doesn't seem to be a way to specify the name of a file that doesn't exist yet, even as a filter, so the loop body needs to test for the specific file existing in the dir before using.
If you don't have inotifywait available, you could just put a sleep into the loop. GNU sleep supports fractional seconds, like sleep 0.5. Busybox probably doesn't. You might want to write a tiny C program anyway, which keeps trying to open(2) the file in a loop that includes a usleep or nanosleep. When open succeeds, redirect stdin from it and exec your awk program. That way, no race is possible between a stat and an open.
#include <unistd.h>     // for usleep/dup2/execv
#include <sys/types.h>  // for open
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>      // for perror
void waitloop(const char *path)
{
    // argv[0] must be the program name and the array must be NULL-terminated.
    // No filename argument: awk reads the already-opened file on stdin, so an
    // unlink between open and exec can't hurt us.
    const char *const awk_args[] = { "awk", "-F", "[<,>]",
        "/data/ {printf \"%s \",$3} END {print \"\"}",
        NULL
    };
    while (42) {
        int fd = open(path, O_RDONLY);
        if (-1 != fd) {
            // if you fork() here, you can avoid the shell loop too.
            dup2(fd, 0);  // redirect stdin from fd. In theory should check for errors here, too.
            close(fd);    // and do this in the parent after fork
            // cast needed: execv takes char *const[] even though it doesn't modify the strings
            execv("/usr/bin/awk", (char *const *)awk_args);
        } else if (errno != ENOENT) {
            perror("opening the file");
        } // else ignore ENOENT
        usleep(10000);  // 10 milliseconds
    }
}
// optional TODO: error-check *all* the system calls.
This compiles, but I haven't tested it. Looping inside a single process doing open / usleep is much lighter weight than running a whole process to do sleep 0.01 from a shell.
Even better would be to use inotify to watch for directory events to detect the file appearing, instead of usleep. To avoid a race, after setting up the inotify watch, do another check for the file existing, in case it got created after your last check, but before the inotify watch became active.
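A minimal sketch of that setup, using a hypothetical wait_for_file helper (error handling omitted; the watch mask mirrors the inotifywait flags used earlier):
#include <sys/inotify.h>  // for inotify_init / inotify_add_watch
#include <limits.h>       // for NAME_MAX
#include <fcntl.h>        // for open
#include <unistd.h>       // for read / close
// Block until `path` exists in the current directory, without polling.
// Returns an O_RDONLY fd for the file.
int wait_for_file(const char *path)
{
    int ifd = inotify_init();
    inotify_add_watch(ifd, ".", IN_CLOSE_WRITE | IN_MOVED_TO);
    for (;;) {
        // check *after* the watch is in place, so a file created just before
        // the watch started can't be missed
        int fd = open(path, O_RDONLY);
        if (fd != -1) { close(ifd); return fd; }
        char buf[sizeof(struct inotify_event) + NAME_MAX + 1];
        read(ifd, buf, sizeof(buf));  // blocks until something changes in "."
    }
}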

fetching an HTTP process user using bash

What's the best way to fetch the web server process user (apache|nginx|www-data) for bash script usage?
In my case it's for setting up folder permissions and changing to the proper owner.
Currently I'm using:
ps aux | grep -E "(www-data|apache|nginx).*(httpd|apache2|nginx)" \
| grep -o "^[a-z\-]*" | head -n1
inside a bash script to fetch the owner of the http process.
Any hints on a smarter solution or a better regex would be great.
Your solution will really depend on your operating system. One option might be to check whether likely candidates exist in your password file:
user=$(awk -F: '/www|http/{print $1;exit}' /etc/passwd)
If you really want to look for the owner of running processes, remember that Apache often launches a root-owned "master" process, then launches children as the web user. So perhaps something like this:
user=$(ps aux|awk '$1=="root"{next} /www|http|apache/{print $1;exit}')
But you should also be able to determine things based on OS detection, since things tend to follow standards:
case "`uname -s`" in
  Darwin)  user=_www; uid=70 ;;
  FreeBSD) user=www; uid=80 ;;
  Linux)
    if grep Ubuntu /etc/lsb-release; then
      user=www-data; uid=$(id -u www-data)
    elif [ -f /etc/debian_version ]; then
      user=www-data; uid=$(id -u www-data)
    elif etc
      etc
    fi
    ;;
esac
I'm not up on the best ways to detect different Linux distros, so that may require a bit of additional research for you.