boost::pool real memory allocations - c++

I'm trying to use the boost::pool memory pool from #include "boost/pool/pool.hpp".
I want to check how much memory boost::pool allocates, so I run system("ps aux | grep myProgramExe | grep -v grep | awk '{print $5}'");, which prints the VSZ column. From the ps man page: VSZ - virtual memory size of the process in KiB (1024-byte units). Device mappings are currently excluded; this is subject to change. (alias vsize).
I'm getting something strange:
#include <cstdlib>              // system
#include <boost/pool/pool.hpp>  // boost::pool

int main()
{
    {
        boost::pool<> pool(4, 1);
        system("ps aux | grep boostHash | grep -v grep | awk '{print \"1. \" $5}'");
        void *a = pool.malloc();
        pool.free(a);
        system("ps aux | grep boostHash | grep -v grep | awk '{print \"2. \" $5}'");
    }
    system("ps aux | grep boostHash | grep -v grep | awk '{print \"3. \" $5}'");
}
The output is:
1. 18908
2. 19040
3. 19040
Which is strange, because:
a. I wanted to allocate only 4 bytes (the next allocation should be for only 1 chunk).
b. The memory isn't freed when the block ends and the pool is destroyed.
Now I want chunks of size 128, with 1024 of them requested in the next allocation:
int main()
{
    {
        boost::pool<> pool(128, 1024);
        system("ps aux | grep boostHash | grep -v grep | awk '{print \"4. \" $5}'");
        void *a = pool.malloc();
        pool.free(a);
        system("ps aux | grep boostHash | grep -v grep | awk '{print \"5. \" $5}'");
    }
    system("ps aux | grep boostHash | grep -v grep | awk '{print \"6. \" $5}'");
}
Output:
4. 18908
5. 19040
6. 18908
Which is fine, because:
a. I wanted to allocate 128 * 1024 = 131072 bytes, and got 19040 - 18908 = 132 KiB = 135168 bytes. 135168 - 131072 = 4096 bytes (that's just the pool overhead, I think).
b. When the block ended, the memory was free'd.
Now calling the destructor explicitly, through new and delete:
int main() {
    {
        boost::pool<> *pool = new boost::pool<>(128, 1024);
        system("ps aux | grep boostHash | grep -v grep | awk '{print \"7. \" $5}'");
        void *a = pool->malloc();
        pool->free(a);
        delete pool;
        system("ps aux | grep boostHash | grep -v grep | awk '{print \"8. \" $5}'");
    }
    system("ps aux | grep boostHash | grep -v grep | awk '{print \"9. \" $5}'");
}
Output:
7. 19040
8. 19040
9. 19040
This is strange:
a. For some reason, the memory is already allocated (before I called pool->malloc()).
b. The memory isn't freed by delete.
Is this explainable?
Do I need to use another tool instead of ps to see the memory used by the program?

Is this explainable?
Yes.
Do I need to use another tool instead of ps to see the memory used by the program?
You are seeing the memory used by the program.
What you didn't take into account: memory allocation routines are heavily optimized. Libraries (like libc) use various strategies to make allocation and reallocation faster in different scenarios. Here are some common memory management strategies:
requesting memory from the operating system preemptively; this lets the application perform subsequent internal allocations of the same type without the cost of requesting more memory from the OS;
caching released memory; this lets an application reuse memory (already received from the OS) for subsequent allocations, again avoiding the overhead of talking to the OS.


stop condition for emulating "grep -oE" with awk

I'm trying to emulate GNU grep -Eo with a standard awk call.
What the man page says about the -o option is:
-o --only-matching
     Print only the matched (non-empty) parts of matching lines, with each such part on a separate output line.
For now I have this code:
#!/bin/sh
regextract() {
    [ "$#" -ge 2 ] || return 1
    __regextract_ere=$1
    shift
    awk -v FS='^$' -v ERE="$__regextract_ere" '
        {
            while ( match($0,ERE) && RLENGTH > 0 ) {
                print substr($0,RSTART,RLENGTH)
                $0 = substr($0,RSTART+1)
            }
        }
    ' "$@"
}
My question is: In the case that the matching part is 0-length, do I need to continue trying to match the rest of the line or should I move to the next line (like I already do)? I can't find a sample of input+regex that would need the former but I feel like it might exist. Any idea?
Here's a POSIX awk version, which works with a* (or any POSIX awk regex):
echo abcaaaca |
awk -v regex='a*' '
{
    while (match($0, regex)) {
        if (RLENGTH) print substr($0, RSTART, RLENGTH)
        $0 = substr($0, RSTART + (RLENGTH > 0 ? RLENGTH : 1))
        if ($0 == "") break
    }
}'
Prints:
a
aaa
a
POSIX awk and grep -E use POSIX extended regular expressions, except that awk allows C escapes (like \t) but grep -E does not. If you wanted strict compatibility you'd have to deal with that.
If you can consider a gnu-awk solution, then using RS and RT may give behavior identical to grep -Eo.
# input data
cat file
FOO:TEST3:11
BAR:TEST2:39
BAZ:TEST0:20
Using grep -Eo:
grep -Eo '[[:alnum:]]+' file
FOO
TEST3
11
BAR
TEST2
39
BAZ
TEST0
20
Using gnu-awk with RS and RT using same regex:
awk -v RS='[[:alnum:]]+' 'RT != "" {print RT}' file
FOO
TEST3
11
BAR
TEST2
39
BAZ
TEST0
20
More examples:
grep -Eo '\<[[:digit:]]+' file
11
39
20
awk -v RS='\\<[[:digit:]]+' 'RT != "" {print RT}' file
11
39
20
Thanks to the various comments and answers, I think I now have working, robust, and (maybe) efficient code:
Tested on AIX/Solaris/FreeBSD/macOS/Linux.
#!/bin/sh
regextract() {
    [ "$#" -ge 1 ] || return 1
    [ "$#" -eq 1 ] && set -- "$1" -
    awk -v FS='^$' '
        BEGIN {
            ere = ARGV[1]
            delete ARGV[1]
        }
        {
            tail = $0
            while ( tail != "" && match(tail,ere) ) {
                if (RLENGTH) {
                    print substr(tail,RSTART,RLENGTH)
                    tail = substr(tail,RSTART+RLENGTH)
                } else
                    tail = substr(tail,RSTART+1)
            }
        }
    ' "$@"
}
regextract "$@"
Notes:
I pass the ERE string along with the file arguments so that awk doesn't pre-process it (thanks @anubhava for pointing that out); C-style escape sequences will still be translated by awk's regex engine, though (thanks @dan for pointing that out).
Because assigning $0 resets the values of all fields, I chose FS='^$' to limit the overhead.
Copying $0 into a separate variable avoids the overhead of assigning $0 in the while loop (thanks @EdMorton for pointing that out).
a few examples:
# Multiple matches in a single line:
echo XfooXXbarXXX | regextract 'X*'
X
XX
XXX
# Passing the regex string to awk as a parameter versus a file argument:
echo '[a]' | regextract_as_awk_param '\[a]'
a
echo '[a]' | regextract '\[a]'
[a]
# The regex engine of awk translates C-style escape sequences:
printf '%s\n' '\t' | regextract '\t'
printf '%s\n' '\t' | regextract '\\t'
\t
Your code will malfunction for a regex that can match zero characters. Consider a simple example: let the content of file.txt be
1A2A3
then
grep -Eo 'A*' file.txt
gives the output
A
A
Your while condition is match($0,ERE) && RLENGTH > 0. Here the first part is true, but the second is false, because the match found is the zero-length one before the first character (RSTART is set to 1). The body of the while loop therefore never runs and nothing is printed.
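The difference between stopping at an empty match and stepping past it can be checked directly. The sketch below is a minimal comparison, assuming a POSIX awk. The function names extract_stop_on_empty and extract_skip_empty are made up for this illustration, and the regex is passed with -v, so awk will process C-style escapes in it.

```shell
#!/bin/sh
# Hypothetical helper: uses the stop condition from the question, so it
# abandons the line as soon as match() reports a zero-length match.
extract_stop_on_empty() {
    awk -v ere="$1" '{
        tail = $0
        while (match(tail, ere) && RLENGTH > 0) {
            print substr(tail, RSTART, RLENGTH)
            tail = substr(tail, RSTART + RLENGTH)
        }
    }'
}

# Hypothetical helper: steps one character past a zero-length match
# instead of abandoning the line.
extract_skip_empty() {
    awk -v ere="$1" '{
        tail = $0
        while (tail != "" && match(tail, ere)) {
            if (RLENGTH > 0) {
                print substr(tail, RSTART, RLENGTH)
                tail = substr(tail, RSTART + RLENGTH)
            } else {
                tail = substr(tail, RSTART + 1)
            }
        }
    }'
}

printf '1A2A3\n' | extract_stop_on_empty 'A*'   # prints nothing
printf '1A2A3\n' | extract_skip_empty 'A*'      # prints A, then A
```

The second function gives the same two lines as grep -Eo 'A*' for this input, which suggests that continuing after an empty match is the behavior to emulate.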

awk: splitting with a regex

I'm trying to parse lines with fields separated by "|" and space padding. I thought it would be as simple as this:
$ echo "1 a | 2 b | 3 c " | awk -F' *| *' '{ print "-->" $2 "<--" }'
However, what I get is
-->a<--
instead of the expected
-->2 b<--
I'm using GNU Awk 4.0.1.
When you use ' *| *', awk interprets it as "zero or more spaces OR zero or more spaces". Hence the output you get is correct. If you need | as the delimiter, just escape it.
$ echo "1 a | 2 b | 3 c " | awk -F' *\\| *' '{ print "-->" $2 "<--" }'
-->2 b<--
Notice that you have to escape it twice: awk first processes the -F value as a string, which turns \| into a plain |, and the regex engine would then interpret that | as alternation again.
Because of this, it is very common to put such special characters inside a bracket expression [] instead.
$ echo "1 a | 2 b | 3 c " | awk -F' *[|] *' '{ print "-->" $2 "<--" }'
-->2 b<--
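Either separator still leaves padding on the outer edges of the record: with the sample input, the last field keeps its trailing space. Here is a minimal sketch of one way to handle that, assuming the padding consists of spaces only. pipe_field is a hypothetical helper name, not something from the question.

```shell
#!/bin/sh
# Hypothetical helper: print field number $1 of each "|"-separated input
# line, with the space padding removed from both ends of the field.
pipe_field() {
    awk -v n="$1" -F' *[|] *' '{
        field = $n
        gsub(/^ +| +$/, "", field)   # strip leading and trailing spaces
        print field
    }'
}

echo "1 a | 2 b | 3 c " | pipe_field 3   # prints: 3 c
```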
echo "1 a | 2 b | 3 c " | awk -F '|' '{print $2}' | tr -d ' '
prints "2b" for me (tr -d ' ' deletes every space, including the one inside the field)

Replace similar strings in a file in place

I have a file with the following types of pairs of strings:
Call Stack: [UniqueObject1] | [UnOb2] | [SuspectedObject1] | [SuspectedObject2] | [SuspectedObject3] | [UnOb3] | [UnOb4] | [UnOb5] | ... end till unique objects
Call Stack: [UniqueObject1] | [UnOb2] | 0x28798765 | 0x18793765 | 0x48792767 | [UnOb3] | [UnOb4] | [UnOb5] | ... end till unique objects
There are many such pairs that occur in the file.
What characterizes a pair is that the first line contains "SuspectedObject1", "SuspectedObject2" and so on, while in the second line these are replaced by the hex values of the addresses of those objects.
What I want to do is remove the second line of every pair.
Please note the pairs do not occur in any specific order and might be separated by many lines in between.
I plan to iterate through each line of this file, if I see a hex-string given as an address instead of a suspected object, I would want to start comparing the following regex
Call Stack: [UniqueObject1] | [UnOb2] | * | * | * | [UnOb3] | [UnOb4] | [UnOb5] | ... end till unique objects
in the whole file and if a string does match, I want to remove this specific line from the file.
Can someone suggest a shell way to do this?
If I have understood your question correctly, you may need to use awk. Run like:
awk -f script.awk file file
Contents of script.awk:
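BEGIN {
    FS=" \\| "
}
# First pass: count each line with fields 3-5 blanked out.
FNR==NR {
    $3=$4=$5=""
    a[$0]++
    next
}
# Second pass: a hex-address line is printed only if it has no twin.
$3 ~ /^0x[0-9A-Fa-f]{8}$/ {
    r=$0
    $3=$4=$5=""
    if (a[$0]<2) {
        print r
    }
    next
}1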
Alternatively, here's the one-liner:
awk -F ' \\| ' 'FNR==NR { $3=$4=$5=""; a[$0]++; next } $3 ~ /^0x[0-9A-Fa-f]{8}$/ { r=$0; $3=$4=$5=""; if (a[$0]<2) print r; next }1' file{,}

sorting array in bash and regex

I have an array that holds a CPU core name and a number for each core.
The array is called totals.
How can I sort
totals=( CPU0=12345 CPU1=23456 CPU3=01234)
by the numbers and get back the core numbers in sorted order, in bash? For example, (3,0,1) means core 3 has the minimum and core 1 the maximum. I then want to assign (3,0,1) to an array.
Try this for sorting:
echo ${totals[*]} | tr ' ' '\n' | sort -n -t= -k2
To store only the CPU numbers in a new array, try:
sorted_cpu_numbers=( $(echo ${totals[*]} | tr ' ' '\n' | sort -n -t= -k2 | awk -F= '{print substr($1, 4, length($1))}') )
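The same pipeline can be wrapped in a function that avoids the unquoted echo and awk's substr arithmetic. This is only a sketch under the assumption that every entry has the form CPU<number>=<value>; sorted_cores is a hypothetical name introduced here.

```shell
#!/bin/sh
# Hypothetical helper: takes NAME=VALUE words such as CPU0=12345 and prints
# the core numbers ordered by ascending value, one per line.
sorted_cores() {
    printf '%s\n' "$@" |
        sort -t= -k2,2n |
        sed -e 's/=.*$//' -e 's/^CPU//'
}

sorted_cores CPU0=12345 CPU1=23456 CPU3=01234   # prints 3, 0, 1 on separate lines

# In bash, the result can be collected into an array (bash-only syntax):
#   totals=( CPU0=12345 CPU1=23456 CPU3=01234 )
#   sorted_cpu_numbers=( $(sorted_cores "${totals[@]}") )
```

The numeric sort key treats 01234 as 1234, which is why core 3 comes first.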

how to store the output of system() call?

I am using system(3) on Linux in C++ programs. Now I need to store the output of system(3) in an array or sequence. How can I store the output of system(3)?
I am using the following:
system("grep -A1 \"<weakObject>\" file_name | grep \"name\" | grep -Po \"xoc.[^<]*\" | cut -d \".\" -f5 ");
which gives output:
changin
fdjgjkds
dglfvk
dxkfjl
I need to store this output to an array of strings or Sequence of string.
Thanks in advance
system spawns a new shell process that isn't connected to the parent through a pipe or something else.
You need to use the popen library function instead. Then read the output and push each string into your array as you encounter newline characters.
FILE *fp = popen("grep -A1 \"<weakObject>\" file_name | grep \"name\" | "
                 "grep -Po \"xoc.[^<]*\" | cut -d \".\" -f5 ", "r");
char buf[1024];
while (fp && fgets(buf, sizeof buf, fp)) {
    /* do something with buf */
}
if (fp)
    pclose(fp);  /* a stream opened with popen must be closed with pclose */
You should use popen to read the command's standard output. So you'd do something like:
FILE *pPipe;
pPipe = popen("grep -A1 \"<weakObject>\" file_name | grep \"name\" | grep -Po \"xoc.[^<]*\" | cut -d \".\" -f5 ", "r");
to open it for reading (POSIX popen accepts "r" or "w"; the "rt" text mode belongs to Windows' _popen), and then use fgets or something similar to read from the pipe:
fgets(psBuffer, 128, pPipe)
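Another suggestion was to redirect std::cout into a stringstream. Note that this does not capture the command's output: swapping the stream buffer only affects what this process writes through std::cout, while the child started by system() writes straight to the inherited standard output.
std::stringstream result_stream;
std::streambuf *backup = std::cout.rdbuf( result_stream.rdbuf() );
int res = system("grep -A1 \"<weakObject>\" file_name | grep \"name\" | "
                 "grep -Po \"xoc.[^<]*\" | cut -d \".\" -f5 ");
std::cout.rdbuf(backup);
std::cout << result_stream.str();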