Longest common subsequence: multiprocess version errors but sequential version works - c++

I've been trying to solve the longest common subsequence problem using multiprocessing and multithreading. I have implemented a multiprocess version using the usual dynamic programming approach: generate a score matrix in which each element depends on the elements to its left, to its north-west, and directly above it.
In my multiprocess approach, I propagate the wavefront along the anti-diagonals of the score matrix. To make life easier, I apply a shear transform to the score matrix so that each anti-diagonal becomes horizontal (this improves memory access).
The following is my code (admittedly rather long, because it includes some set-up):
#include <algorithm>
#include <atomic>
#include <cstring>
#include <fcntl.h>
#include <fstream>
#include <iostream>
#include <string>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
#include <x86intrin.h>
/* Number of logical cores on system: indirectly determines number of processes run */
#define LOGICAL_CORES (int) sysconf(_SC_NPROCESSORS_CONF)
/* Maximum amount of work for each worker process */
#define MAX_WORK_SIZE 256
/* Number of anti-diagonals to process */
#define NUM_ANTIDIAGS (X + Y + 1)
/* Length of each anti-diagonal of the score matrix */
#define ANTIDIAG_SIZE std::max(X, Y)
/* Length of each anti-diagonal in memory: a multiple of MAX_WORK_SIZE */
#define ANTIDIAG_REAL_SIZE ((ANTIDIAG_SIZE + MAX_WORK_SIZE - 1) & -MAX_WORK_SIZE)
/* Total number of worker processes */
#define NUM_WORKERS (ANTIDIAG_REAL_SIZE / MAX_WORK_SIZE)
// The sizes of the input strings
u_int32_t X, Y;
u_int32_t *back; /* The back anti-diagonal, read from */
u_int32_t *middle; /* The middle antidiagonal, read from */
u_int32_t *front; /* The front antidiagonal, written to */
struct sync_container {
pthread_barrier_t barrier; /* A barrier, to ensure all threads are synchronised */
pthread_barrierattr_t barrierattr; /* Barrier attributes */
std::vector<pid_t> pids; /* A list of process IDs of the worker processes */
};
u_int32_t *data;
sync_container *sync_data;
std::string seq_1;
std::string seq_2;
void read_files(std::ifstream f1, std::ifstream f2)
{
if (f1.fail() || f2.fail())
{
std::cout << "Error reading files; exiting." << std::endl;
exit(EXIT_FAILURE);
}
f1 >> X >> seq_1;
f1.close();
f2 >> Y >> seq_2;
f2.close();
}
void shm_setup()
{
data = reinterpret_cast<u_int32_t *>(mmap(nullptr, 4 * ANTIDIAG_REAL_SIZE * sizeof(uint32_t),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0));
back = reinterpret_cast<u_int32_t *>(data);
middle = back + ANTIDIAG_REAL_SIZE;
front = middle + ANTIDIAG_REAL_SIZE;
memset(back, 0, ANTIDIAG_REAL_SIZE * sizeof(u_int32_t));
memset(middle, 0, ANTIDIAG_REAL_SIZE * sizeof(u_int32_t));
memset(front, 0, ANTIDIAG_REAL_SIZE * sizeof(u_int32_t));
sync_data = static_cast<sync_container *>(mmap(nullptr, sizeof(sync_container),
PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANON, 0, 0));
}
void cleanup()
{
munmap(data, 3 * ANTIDIAG_REAL_SIZE * sizeof(u_int32_t));
munmap(sync_data, sizeof(sync_container));
exit(0);
}
int main(int argc, char **argv)
{
if (argc != 3)
{
std::cout << "Usage: [executable] [file 1] [file 2]" << std::endl;
return 1;
}
read_files(std::ifstream(argv[1], std::ifstream::in),
std::ifstream(argv[2], std::ifstream::in));
// Initialise shared memory and arrays
shm_setup();
// Initialise barrier
pthread_barrierattr_init(&sync_data->barrierattr);
pthread_barrierattr_setpshared(&sync_data->barrierattr, PTHREAD_PROCESS_SHARED);
pthread_barrier_init(&sync_data->barrier, &sync_data->barrierattr, NUM_WORKERS + 1);
int pid = 0;
int worker_id = 0;
for (; worker_id < NUM_WORKERS; ++worker_id)
{
pid = fork();
if (pid) sync_data->pids[worker_id] = pid;
else
break;
}
pthread_barrier_wait(&sync_data->barrier);
for (int antidiag_idx = 2; antidiag_idx < NUM_ANTIDIAGS; ++antidiag_idx)
{
pthread_barrier_wait(&sync_data->barrier);
if (!pid) // worker processes go here
{
for (int element = MAX_WORK_SIZE * worker_id; element < (antidiag_idx * worker_id) + MAX_WORK_SIZE; ++element)
{
if (!element || element >= ANTIDIAG_SIZE) continue;
char vert = seq_1[antidiag_idx - 1 - element];
char horz = seq_2[element - 1];
front[element] = horz == vert ? back[element - 1] + 1
: std::max(middle[element - 1], middle[element]);
}
}
if (pid) // parent process moves pointers
{
back = middle;
middle = front;
front = back;
}
pthread_barrier_wait(&sync_data->barrier);
}
if (!pid) exit(0);
std::cout << middle[ANTIDIAG_SIZE] << std::endl;
cleanup();
}
Now, this code does not work. This is strange, because with a small input size (specifically, < 256), this code only spawns one worker process and one parent process to manage it, and it still fails.
However, when the fork(), various pthread_barrier_wait() calls, and if (pid) control flow paths are removed in the for loop, the code executes perfectly and returns the correct expected length of the LCS between two strings specified in the input files. In other words, it degenerates into effectively a single-threaded, single-process version of the dynamic programming solution, but with the shear transform thing.
There is clearly an issue with my synchronisation, and I can't figure out where it is. I've tried several permutations of adding more pthread_barrier_wait()s, but this hasn't led anywhere.
Where is the synch issue, and how may I fix it?
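As a reference point, the wavefront scheme described above can be sketched sequentially. This is only a minimal sketch, not the asker's program: the function name lcs_wavefront is hypothetical, it indexes each anti-diagonal by column, and it uses std::vector instead of shared memory. It does highlight one thing worth checking in the posted loop: the sequence back = middle; middle = front; front = back; leaves front pointing at the old middle rather than the old back, and after fork() each process holds its own copy of those three pointers, so a rotation done only in the parent is not seen by the workers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Length of the longest common subsequence of a and b, computed one
// anti-diagonal at a time. Diagonal d holds the cells (i, j) with i + j == d,
// stored at index j, so three rows of Y + 1 entries are enough.
std::uint32_t lcs_wavefront(const std::string& a, const std::string& b)
{
    const std::size_t X = a.size(), Y = b.size();
    std::vector<std::uint32_t> back(Y + 1, 0), middle(Y + 1, 0), front(Y + 1, 0);

    for (std::size_t d = 0; d <= X + Y; ++d) {
        const std::size_t j_lo = d > X ? d - X : 0;
        const std::size_t j_hi = std::min(d, Y);
        // Every j in this range is independent of the others, which is what
        // makes a diagonal a candidate for splitting across workers.
        for (std::size_t j = j_lo; j <= j_hi; ++j) {
            const std::size_t i = d - j;
            if (i == 0 || j == 0)
                front[j] = 0;                                   // border of the matrix
            else if (a[i - 1] == b[j - 1])
                front[j] = back[j - 1] + 1;                     // north-west cell + 1
            else
                front[j] = std::max(middle[j - 1], middle[j]);  // left or above
        }
        // Three-way rotation: the old back buffer is recycled as the new front.
        std::swap(back, middle);   // back <- old middle, middle <- old back
        std::swap(middle, front);  // middle <- old front, front <- old back
    }
    return middle[Y];  // cell (X, Y) was the last one written before rotating
}
```

Comparing the multiprocess output against a function like this for small inputs may help narrow down whether the problem is in the barrier placement or in the buffer handling.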

Related

Using mmap memory for a circular buffer with very low overhead

I have a debugging tool that records its acquired data in a data structure called DiskPool (code follows). At start-up, this data structure mmaps a certain amount of memory (backed by a file on disk). Clients can allocate memory via a simple bump-pointer mechanism (implemented using std::atomic<size_t>).
As the volume of acquired data is massive, I have decided to keep a window over a time period instead of recording and keeping all the data. For that purpose I have to change the disk pool into a circular buffer, but this must not impose considerable overhead, because the overhead affects the measurement.
Does anybody have an idea how to do this (for example, using the atomic interface of the standard library)?
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <atomic>
#include <memory>
#include <signal.h>
#include <chrono>
#include <thread>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <string>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
class DiskPool {
char* addr_; // Initialized by mmap()
size_t len_; // Given by the user as many as memory pages as needed
std::atomic<size_t> top_; // Offset from address_
int fd_;
public:
DiskPool(size_t l, const char* file) : len_(l), top_(0),fd_(-1)
{
struct stat st;
fd_= open(file, O_CREAT|O_RDWR, S_IREAD | S_IWRITE);
if (fd_ == -1)
handle_error("open");
if (ftruncate(fd_, len_* sysconf(_SC_PAGE_SIZE)) != 0)
handle_error("ftruncate() error");
else {
fstat(fd_, &st);
printf("the file has %ld bytes\n", (long) st.st_size);
}
addr_ = static_cast<char*>( mmap(NULL, (len_* sysconf(_SC_PAGE_SIZE)),
PROT_READ | PROT_WRITE, MAP_SHARED|MAP_NORESERVE, fd_,0));
if (addr_ == MAP_FAILED)
handle_error("mmap failed.");
}
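~DiskPool()
{
close(fd_);
// munmap() takes a length in bytes, whereas len_ counts pages
if (munmap(addr_, len_ * sysconf(_SC_PAGE_SIZE)) < 0)
handle_error("Could not unmap file"); // handle_error() already exits
std::cout << "Successfully unmapped the file. " << std::endl;
}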
void* allocate(size_t s)
{
size_t t = std::atomic_fetch_add(&top_, s);
return addr_+t;
}
void flush() {madvise(addr_, len_ * sysconf(_SC_PAGE_SIZE), MADV_DONTNEED);} // length in bytes, not pages
};
As an example, I created sample code that uses this disk pool to record data at the creation and destruction of an object (AutomaticLifetimeCollector).
static const std::string RECORD_FILE = "Data.txt";
static const size_t DISK_POOL_NUMBER_OF_PAGES = 10000;
static std::shared_ptr<DiskPool> diskPool =
std::shared_ptr <DiskPool> (new DiskPool(DISK_POOL_NUMBER_OF_PAGES,RECORD_FILE.c_str()));
struct TaskRecord
{
uint64_t tid; // Thread id
uint64_t tag; // User-given identifier (“f1”)
uint64_t start_time; // nanoseconds
uint64_t stop_time;
uint64_t cpu_time;
TaskRecord(int depth, size_t tag, uint64_t start_time) :
tid(pthread_self()), tag(tag),
start_time(start_time), stop_time(0), cpu_time(0) {}
};
class AutomaticLifetimeCollector
{
TaskRecord* record_;
public:
AutomaticLifetimeCollector(size_t tag) :
record_(new(diskPool->allocate(sizeof(TaskRecord)))
TaskRecord(2, tag, (uint64_t)1000000004L))
{
}
~AutomaticLifetimeCollector() {
record_->stop_time = (uint64_t)1000000000L;
record_->cpu_time = (uint64_t)1000000002L;
}
};
inline void DelayMilSec(unsigned int pduration)
{
std::this_thread::sleep_until(std::chrono::system_clock::now() +
std::chrono::milliseconds(pduration));
}
std::atomic<bool> LoopsRunFlag {true};
void sigIntHappened(int signal)
{
std::cout<< "Application was terminated.";
LoopsRunFlag.store(false, std::memory_order_release);
}
int main()
{
signal(SIGINT, sigIntHappened);
unsigned int i = 0;
while(LoopsRunFlag)
{
AutomaticLifetimeCollector alc(i++);
DelayMilSec(2);
}
diskPool->flush();
return(0);
}
Considering only the handing out of variable-sized slices of the buffer, I believe a compare-and-swap loop should work.
The basic idea is to read a value atomically, do some computation with it, and then write the new value only if the original did not change since it was read. If it did change (because of another thread or process), the computation must be redone with the new value.
Since you have variable-sized objects, I think simply slicing the buffer into n array elements and advancing with (i + 1) % n won't work: with (i + item_len) % capacity, an allocation could be split between the end and the start of the buffer. That can be made correct, but I suspect it is not what you want. Avoiding the split requires a condition, but I think the CPU should predict it pretty well.
#include <atomic>
#include <cstddef>
#include <exception>
#include <iostream>
std::atomic<size_t> next_index{0};
const size_t len = 100; // small for demo purpose
size_t alloc(size_t required_size)
{
if (required_size > len) std::terminate(); // do something, would cause a buffer overflow
size_t i, ret_index, new_index;
i = next_index.load();
do
{
auto space = len - i;
ret_index = required_size <= space ? i : 0; // Wrap if needed
new_index = ret_index + required_size;
} while (!next_index.compare_exchange_weak(i, new_index)); // retry if next_index changed; on failure i is reloaded
return ret_index;
}
int main()
{
std::cout << alloc(4) << std::endl; // 0 - 3
std::cout << alloc(8) << std::endl; // 4 - 11
std::cout << alloc(32) << std::endl; // 12 - 43
std::cout << alloc(32) << std::endl; // 44 - 75
std::cout << alloc(32) << std::endl; // 0 - 31 (76 - 107 would overflow)
std::cout << alloc(32) << std::endl; // 32 - 63
std::cout << alloc(32) << std::endl; // 64 - 95
std::cout << alloc(32) << std::endl; // 0 - 31 (96 - 127 would overflow)
}
This should be fairly simple to plug into your class:
void* allocate(size_t s)
{
if (s > len_ * sysconf(_SC_PAGE_SIZE)) std::terminate(); // do something, would cause a buffer overflow
size_t i, ret_index, new_index;
i = top_.load();
do
{
auto space = len_ * sysconf(_SC_PAGE_SIZE) - i;
ret_index = s <= space ? i : 0; // Wrap if needed
new_index = ret_index + s;
} while (!top_.compare_exchange_weak(i, new_index)); // retry if top_ changed; on failure i is reloaded
return addr_+ ret_index;
}
len_ * sysconf(_SC_PAGE_SIZE) appears in several places, so it might be more useful to store the length in bytes in len_ itself.

Vanilla and chocolate cake producing and consuming by waiters in C++ using semaphores and mutexes on Ubuntu

In my problem there are two chefs: chef x produces chocolate cakes and chef y produces vanilla cakes. There is a queue, queue0, with 5 slots in which to put cakes. When the slots are full, the chefs take a break. Chef z takes cakes from these slots and puts vanilla cakes in queue1 and chocolate cakes in queue2. Waiter 1 takes cakes from queue1 and waiter 2 takes cakes from queue2. When queue0 is no longer full, chef x and chef y resume making cakes. I wrote the code below, but it shows chef y producing 4 vanilla cakes, after which it gets stuck and no further output appears.
Here is my code:
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include<semaphore.h>
//#include </usr/include/semaphore.h>
// for sleep
#include <unistd.h>
#define BUFF_SIZE 5 /* total number of slots */
typedef struct //Producing queue
{
char buf1[BUFF_SIZE]; /* shared var */
int in1; /* buf[in%BUFF_SIZE] is the first empty slot */
int out1; /* buf[out%BUFF_SIZE] is the first full slot */
sem_t full; /* keep track of the number of full spots */
sem_t empty; /* keep track of the number of empty spots */
// use correct type here
pthread_mutex_t mutex; /* enforce mutual exclusion to shared data */
} sbuf_t1;
typedef struct //Chokolate Cake queue
{
char buf2[BUFF_SIZE]; /* shared var */
int in2; /* buf[in%BUFF_SIZE] is the first empty slot */
int out2; /* buf[out%BUFF_SIZE] is the first full slot */
sem_t full; /* keep track of the number of full spots */
sem_t empty; /* keep track of the number of empty spots */
// use correct type here
pthread_mutex_t mutex; /* enforce mutual exclusion to shared data */
} sbuf_t2;
typedef struct //Vanilla Cake queue
{
char buf3[BUFF_SIZE]; /* shared var */
int in3; /* buf[in%BUFF_SIZE] is the first empty slot */
int out3; /* buf[out%BUFF_SIZE] is the first full slot */
sem_t full; /* keep track of the number of full spots */
sem_t empty; /* keep track of the number of empty spots */
// use correct type here
pthread_mutex_t mutex; /* enforce mutual exclusion to shared data */
} sbuf_t3;
sbuf_t1 shared1;
sbuf_t2 shared2;
sbuf_t3 shared3;
void *Producer1(void *arg)
{
int i, index;
char item;
index = (int)arg;
for(i=0; i<10; ++i)
{
sem_wait(&shared1.empty);
pthread_mutex_lock(&shared1.mutex);
sleep(1);
shared1.buf1[shared1.in1] = 'C';
shared1.in1 = (shared1.in1+1)%BUFF_SIZE;
printf("[P%d] Producing Chokolate Cake...\n", index);
fflush(stdout);
/* Release the buffer */
pthread_mutex_unlock(&shared1.mutex);
/* Increment the number of full slots */
sem_post(&shared1.full);
//if (i % 2 == 1) sleep(1);
}
return NULL;
}
void *Producer2(void *arg)
{
int i, index;
char item;
index = (int)arg;
for(i=0; i<10; ++i)
{
item = i;
sem_wait(&shared1.empty);
pthread_mutex_lock(&shared1.mutex);
sleep(1);
shared1.buf1[shared1.in1] = 'V';
shared1.in1 = (shared1.in1+1)%BUFF_SIZE;
printf("[P%d] Producing Vanilla Cake...\n", index);
fflush(stdout);
/* Release the buffer */
pthread_mutex_unlock(&shared1.mutex);
/* Increment the number of full slots */
sem_post(&shared1.full);
//if (i % 2 == 1) sleep(1);
}
return NULL;
}
void *Chef_Z(void *arg)
{
int i, index;
char item;
index = (int)arg;
for (i=10; i > 0; i--) {
sem_wait(&shared1.full);
pthread_mutex_lock(&shared1.mutex);
sleep(1);
item=shared1.buf1[shared1.out1];
if(item == 'C') // Chokolate Cake queue
{
sem_wait(&shared2.full);
pthread_mutex_lock(&shared2.mutex);
shared2.buf2[shared2.in2]=item;
shared2.in2 = (shared2.in2+1)%BUFF_SIZE;
printf("[C_Z] Consuming Chokolate Cake and stored it in Chokolate queue ...\n");
pthread_mutex_unlock(&shared2.mutex);
/* Increment the number of full slots */
sem_post(&shared2.empty);
}
else if(item == 'V') // Vanilla Cake queue
{
sem_wait(&shared3.full);
pthread_mutex_lock(&shared3.mutex);
shared3.buf3[shared3.in3]=item;
shared3.in3 = (shared3.in3+1)%BUFF_SIZE;
printf("[C_Z] Consuming Vanilla Cake and stored it in Vanilla queue ...\n");
pthread_mutex_unlock(&shared3.mutex);
/* Increment the number of full slots */
sem_post(&shared3.empty);
}
shared1.out1 = (shared1.out1+1)%BUFF_SIZE;
fflush(stdout);
/* Release the buffer */
pthread_mutex_unlock(&shared1.mutex);
/* Increment the number of full slots */
sem_post(&shared1.empty);
/* Interleave producer and consumer execution */
//if (i % 2 == 1) sleep(1);
}
return NULL;
}
void *Waiter1(void *arg) //Chokolate cake waiter
{
int i, index;
char item;
index = (int)arg;
for (i=10; i > 0; i--) {
sem_wait(&shared2.full);
pthread_mutex_lock(&shared2.mutex);
sleep(1);
item=shared2.buf2[shared2.out2];
shared2.out2 = (shared2.out2+1)%BUFF_SIZE;
printf("[W%d] Consuming Chokolate Cake ...\n", index);
fflush(stdout);
/* Release the buffer */
pthread_mutex_unlock(&shared2.mutex);
/* Increment the number of full slots */
sem_post(&shared2.empty);
/* Interleave producer and consumer execution */
//if (i % 2 == 1) sleep(1);
}
return NULL;
}
void *Waiter2(void *arg) // Vanilla cake waiter
{
int i, index;
char item;
index = (int)arg;
for (i=10; i > 0; i--) {
sem_wait(&shared3.full);
pthread_mutex_lock(&shared3.mutex);
sleep(1);
item=shared3.buf3[shared3.out3];
shared3.out3 = (shared3.out3+1)%BUFF_SIZE;
printf("[W%d] Consuming Vanilla Cake ...\n", index);
fflush(stdout);
/* Release the buffer */
pthread_mutex_unlock(&shared3.mutex);
/* Increment the number of full slots */
sem_post(&shared3.empty);
/* Interleave producer and consumer execution */
//if (i % 2 == 1) sleep(1);
}
return NULL;
}
int main()
{
//pthread_t idP, idC;
pthread_t thread1,thread2,thread3,thread4,thread5;
int index;
void *producer1End;
void *producer2End;
void *chef_zEnd;
void *waiter1End;
void *waiter2End;
sem_init(&shared1.full, 0, 0);
sem_init(&shared1.empty, 0, BUFF_SIZE);
pthread_mutex_init(&shared1.mutex, NULL);
sem_init(&shared2.full, 0, 0);
sem_init(&shared2.empty, 0, BUFF_SIZE);
pthread_mutex_init(&shared2.mutex, NULL);
sem_init(&shared3.full, 0, 0);
sem_init(&shared3.empty, 0, BUFF_SIZE);
pthread_mutex_init(&shared3.mutex, NULL);
pthread_create(&thread1, NULL, Producer1, (void*)1 );
pthread_create(&thread2, NULL, Producer2, (void*)2 );
pthread_create(&thread3, NULL, Chef_Z, (void*)1 );
pthread_create(&thread4, NULL, Waiter1, (void*)1);
pthread_create(&thread5, NULL, Waiter2, (void*)2);
pthread_join(thread1,&producer1End);
pthread_join(thread2,&producer2End);
pthread_join(thread3,&chef_zEnd);
pthread_join(thread4,&waiter1End);
pthread_join(thread5,&waiter2End);
pthread_exit(NULL);
}
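One thing that stands out in the posted Chef_Z is that it waits on shared2.full and shared3.full and posts shared2.empty and shared3.empty. That is the consumer's side of the protocol, and since full starts at 0, the wait can block forever while shared1.mutex is still held. The following is a minimal sketch of the intended hand-off, assuming that is indeed the cause of the hang: the names CakeQueue and sort_one are hypothetical, and the sleeps, printing and thread creation from the original are left out.

```cpp
#include <pthread.h>
#include <semaphore.h>

constexpr int kSlots = 5;  // same capacity as BUFF_SIZE in the question

// A bounded queue guarded the classic way: `empty` counts free slots,
// `full` counts filled slots, and the mutex protects the indices.
struct CakeQueue {
    char buf[kSlots];
    int in = 0;
    int out = 0;
    sem_t full;
    sem_t empty;
    pthread_mutex_t mutex;

    CakeQueue()
    {
        sem_init(&full, 0, 0);
        sem_init(&empty, 0, kSlots);
        pthread_mutex_init(&mutex, nullptr);
    }
    ~CakeQueue()
    {
        pthread_mutex_destroy(&mutex);
        sem_destroy(&empty);
        sem_destroy(&full);
    }
    CakeQueue(const CakeQueue&) = delete;
    CakeQueue& operator=(const CakeQueue&) = delete;

    // Producer side: wait for a free slot, then announce a filled one.
    void put(char cake)
    {
        sem_wait(&empty);
        pthread_mutex_lock(&mutex);
        buf[in] = cake;
        in = (in + 1) % kSlots;
        pthread_mutex_unlock(&mutex);
        sem_post(&full);
    }

    // Consumer side: wait for a filled slot, then announce a free one.
    char take()
    {
        sem_wait(&full);
        pthread_mutex_lock(&mutex);
        char cake = buf[out];
        out = (out + 1) % kSlots;
        pthread_mutex_unlock(&mutex);
        sem_post(&empty);
        return cake;
    }
};

// One step of chef z: a consumer of queue0 and a producer for the flavour
// queues. No lock on queue0 is held while waiting for room downstream.
void sort_one(CakeQueue& queue0, CakeQueue& chocolate, CakeQueue& vanilla)
{
    char cake = queue0.take();
    (cake == 'C' ? chocolate : vanilla).put(cake);
}
```

Note also that the two producers make 20 cakes in total while Chef_Z loops only 10 times, so the loop counts may need adjusting as well for every thread to finish.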

Why is MD5Sum so fast

I've been studying hashing in C/C++ and tried to replicate the md5sum command in Linux. After analysing the source code, it seems that md5sum relies on the md5 library's md5_stream. I've approximated the md5_stream function from the md5.h library into the code below, and it runs in ~13-14 seconds. I've tried to call the md5_stream function directly and got ~13-14 seconds. The md5sum runs in 4 seconds. What have the GNU people done to get the speed out of the code?
The md5.h/md5.c code is available in the CoreUtils source code.
#include <QtCore/QCoreApplication>
#include <QtCore/QDebug>
#include <iostream>
#include <iomanip>
#include <fstream>
#include "md5.h"
#define BLOCKSIZE 32784
int main()
{
FILE *fpinput, *fpoutput;
if ((fpinput = fopen("/dev/sdb", "rb")) == 0) {
throw std::runtime_error("input file doesn't exist");
}
struct md5_ctx ctx;
size_t sum;
char *buffer = (char*)malloc (BLOCKSIZE + 72);
unsigned char *resblock = (unsigned char*)malloc (16);
if (!buffer)
return 1;
md5_init_ctx (&ctx);
size_t n;
sum = 0;
while (!ferror(fpinput) && !feof(fpinput)) {
n = fread (buffer + sum, 1, BLOCKSIZE - sum, fpinput);
if (n == 0){
break;
}
sum += n;
if (sum == BLOCKSIZE) {
md5_process_block (buffer, BLOCKSIZE, &ctx);
sum = 0;
}
}
if (n == 0 && ferror (fpinput)) {
free (buffer);
return 1;
}
/* Process any remaining bytes. */
if (sum > 0){
md5_process_bytes (buffer, sum, &ctx);
}
/* Construct result in desired memory. */
md5_finish_ctx (&ctx, resblock);
free (buffer);
for (int x = 0; x < 16; ++x){
std::cout << std::setfill('0') << std::setw(2) << std::hex << static_cast<uint16_t>(resblock[x]);
std::cout << " ";
}
std::cout << std::endl;
free(resblock);
return 0;
}
EDIT: Was a default mkspec problem in Fedora 19 64-bit.
fread() is convenient, but it is not the best choice if you care about performance. fread() typically copies data from the OS into a libc buffer and then into your buffer. This extra copying costs CPU cycles and cache space.
For better performance, use open() and then read() to avoid the extra copy. Make sure the size of each read() call is a multiple of the block size, but smaller than your CPU cache size.
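A minimal sketch of the read() variant, for illustration only: the helper name read_block is hypothetical, and the hashing call that would consume each filled block is left out. Because read() may return fewer bytes than requested, the helper loops until the block is full or the input ends.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Reads from fd until `want` bytes are in buf or end of file is reached.
// Returns the number of bytes stored (0 at end of file), or -1 on error.
ssize_t read_block(int fd, char* buf, std::size_t want)
{
    std::size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, buf + got, want - got);
        if (n == 0)
            break;                 // end of file
        if (n < 0) {
            if (errno == EINTR)
                continue;          // interrupted by a signal: retry
            return -1;
        }
        got += static_cast<std::size_t>(n);
    }
    return static_cast<ssize_t>(got);
}
```

A full block would then go to md5_process_block, and a final short block to md5_process_bytes, as in the question's loop.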
For the best performance, use mmap() to map the disk directly into memory.
If you try something like the code below, it should go faster.
// compile gcc mmap_md5.c -lgcrypt
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <gcrypt.h>
#include <linux/fs.h> // BLKGETSIZE64
#include <sys/ioctl.h> // ioctl
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
int main(int argc, char *argv[])
{
char *addr;
int fd;
struct stat sb;
off_t offset, pa_offset;
size_t length;
ssize_t s;
unsigned char digest[16];
char digest_ascii[32+1] = {0,};
int digest_length = gcry_md_get_algo_dlen (GCRY_MD_MD5);
int i;
if (argc < 3 || argc > 4) {
fprintf(stderr, "%s file offset [length]\n", argv[0]);
exit(EXIT_FAILURE);
}
fd = open(argv[1], O_RDONLY);
if (fd == -1)
handle_error("open");
if (fstat(fd, &sb) == -1) /* To obtain file size */
handle_error("fstat");
offset = atoi(argv[2]);
pa_offset = offset & ~(sysconf(_SC_PAGE_SIZE) - 1);
if (S_ISBLK(sb.st_mode)) {
// block device. use ioctl to find length
ioctl(fd, BLKGETSIZE64, &length);
} else {
/* offset for mmap() must be page aligned */
if (offset >= sb.st_size) {
fprintf(stderr, "offset is past end of file size=%zd, offset=%d\n", sb.st_size, (int) offset);
exit(EXIT_FAILURE);
}
if (argc == 4) {
length = atoi(argv[3]);
if (offset + length > sb.st_size)
length = sb.st_size - offset;
/* Can't display bytes past end of file */
} else { /* No length arg ==> display to end of file */
length = sb.st_size - offset;
}
}
printf("length= %zd\n", length);
addr = mmap(NULL, length + offset - pa_offset, PROT_READ,
MAP_PRIVATE, fd, pa_offset);
if (addr == MAP_FAILED)
handle_error("mmap");
gcry_md_hash_buffer(GCRY_MD_MD5, digest, addr + offset - pa_offset, length);
for (i=0; i < digest_length; i++) {
sprintf(digest_ascii+(i*2), "%02x", digest[i]);
}
printf("hash=%s\n", digest_ascii);
exit(EXIT_SUCCESS);
}
It turned out to be an error in the Qt mkspecs regarding an optimization flag not being set properly.

I am trying to clean my data file from special characters with some conditions, but those conditions are not met?

Here is my code.
It tries to remove special characters such as ", ', {, }, ( and ) from a .txt file and replace them with blank spaces.
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <iostream>
#include <time.h>
#include <fstream>
using namespace std;
int main(int argc, char *argv[])
{
int fd;
int i;
int j;
int len;
int count = 0;
int countcoma = 0;
int countquote = 0;
char buf[10];
char spec[] = {',','"',':','{','}','(',')','\''};
fd = open(argv[1],O_RDWR,0777);
while (read(fd,buf,10) != 0) {
len = strlen(buf);
for (i=0;i<len;i++) {
for (j=0;j<8;j++) {
if (buf[i]==spec[j]) {
count =1;
countquote=0;
if (j==1) {
if (countcoma == 0) {
countcoma++;
}
if (countcoma == 1) {
countcoma--;
}
}
if ((j==7) && (countcoma ==1)) {
countquote = 1;
}
break;
}
}
//cout<<countquote;
if ((count != 0) && (countquote == 0)) {
buf[i] = ' ';
}
count = 0;
}
lseek(fd, -sizeof(buf), SEEK_CUR);
write(fd,buf,sizeof(buf));
memset(buf,' ',10);
}
return 0;
}
Now I want the single quotes that are inside double quotes in my file to remain untouched, while all the other special characters are replaced with a space as in the code above.
I want single quotes like the one in "what's" to remain untouched, but after I run the program it becomes what s instead of what's.
Have a look at regex and other libraries. (When on UNIX type man regex.) You don't have to code this anymore nowadays, there are a zillion libraries that can do this for you.
Ok, so the problem with your code is that you are doing one thing, that you then undo in the next section. In particular:
if (countcoma == 0) {
countcoma++;
}
if (countcoma == 1) {
countcoma--;
}
Follow the logic: We come in with countcoma as zero. So the first if is true, and it gets incremented. It is now 1. Next if says if (countcoma == 1) so it is now true, and we decrement it.
I replaced it with countcoma = !countcoma; which is a much simpler way to say "if it's 0, make it 1; if it's 1, make it 0". You could also put an else after the first if (turning the second if into an else if) to get the same effect.
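The toggle can be seen in isolation in the following sketch, which works on a string in memory rather than on the file. The function name blank_specials is hypothetical; the set of special characters is the one from the question's spec array.

```cpp
#include <string>

// Replaces each special character with a space, except for a single quote
// that appears between a pair of double quotes.
std::string blank_specials(const std::string& text)
{
    const std::string specials = ",\":{}()'";
    std::string out = text;
    bool in_double_quotes = false;

    for (char& c : out) {
        if (specials.find(c) == std::string::npos)
            continue;                              // ordinary character
        if (c == '"')
            in_double_quotes = !in_double_quotes;  // the toggle
        if (c == '\'' && in_double_quotes)
            continue;                              // keep what's intact
        c = ' ';
    }
    return out;
}
```

With this, the input "what's" (including its double quotes) keeps its apostrophe while the double quotes themselves are blanked. When processing a file in chunks, the flag would have to live outside the per-chunk loop so that it survives from one read to the next.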
There are also a whole bunch of stylistic things: For example hard-coded constants, writing back into the original file (means that if there is a bug, you lose the original file - good thing I didn't close the editor window with my sample file...), including half the universe in header files, and figuring which of the spec characters it is based on the index.
It seems to me that your code is suffering from a more general flaw than what has been pointed out before:
char buf[10]; /* Buffer is un-initialized here!! */
while (read(fd,buf,10) != 0) { /* read up to 10 bytes */
len = strlen(buf); /* What happens here if no \0 byte was read? */
...
lseek(fd, -sizeof(buf), SEEK_CUR); /* skip sizeof(buf) = 10 bytes anyway */
write(fd,buf,sizeof(buf)); /* write sizeof(buf) = 10 bytes anyway */
memset(buf,' ',10); /* initialize buf to contain all spaces
but no \0, so strlen will still result in
reading past the array bounds */

fork exec and mmap issues

For the application I'm developing (under Linux, but I'm trying to maintain portability) I need to switch to shared memory for sharing data across different processes (and threads inside processes). There is a parent process that spawns several children.
For example, I need every process to be able to increment a shared counter, protected by a named semaphore.
In this case everything is ok:
#include <sys/mman.h>
#include <sys/wait.h>
#include <semaphore.h>
#include <fcntl.h>
#include <iostream>
#include <stdlib.h>
#include <string.h>
using namespace std;
#define SEM_NAME "/mysem"
#define SM_NAME "tmp_sm.txt"
int main(){
int fd, nloop, counter_reset;
int *smo;
sem_t *mutex;
nloop = 100;
counter_reset = 1000;
if (fork() == 0) {
/* child */
/* create, initialize, and unlink semaphore */
mutex = sem_open(SEM_NAME, O_CREAT, 0777, 1);
//sem_unlink(SEM_NAME);
/* open file, initialize to 0, map into memory */
fd = open(SM_NAME, O_RDWR | O_CREAT);
smo = (int *) mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);
/* INCREMENT */
for (int i = 0; i < nloop; i++) {
sem_wait(mutex);
cout << "child: " << (*smo)++ << endl;
if(*smo>=counter_reset){
(*smo)=0;
}
sem_post(mutex);
}
exit(0);
}
/* parent */
/* create, initialize, and unlink semaphore */
mutex = sem_open(SEM_NAME, O_CREAT, 0777, 1);
sem_unlink(SEM_NAME);
/* open file, initialize to 0, map into memory */
fd = open(SM_NAME, O_RDWR | O_CREAT);
smo = (int *) mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);
/* INCREMENT */
for (int i = 0; i < nloop; i++) {
sem_wait(mutex);
cout << "parent: " << (*smo)++ << endl;
if(*smo>=counter_reset){
(*smo)=0;
}
sem_post(mutex);
}
exit(0);
}
So far so good: both semaphore and shared counter are ok (same address in memory) and increment and reset work fine.
The program fails simply by moving child source code into a new source file invoked by exec. Shared memory and named semaphore addresses are different therefore increment fails.
Any suggestion? I used named semaphores and named shared memory (using a file) to try to get the same pointer values.
UPDATE:
As requested by Joachim Pileborg, these are the "server side" changes relative to the original code above:
...
if (fork() == 0) {
/* child */
/*spawn child by execl*/
char cmd[] = "/path_to_bin/client";
execl(cmd, cmd, (char *)0);
cerr << "error while istantiating new process" << endl;
exit(EXIT_FAILURE);
}
...
And this is the "client" source code:
#include <sys/mman.h>
#include <sys/wait.h>
#include <semaphore.h>
#include <fcntl.h>
#include <iostream>
#include <stdlib.h>
using namespace std;
#define SEM_NAME "/mysem"
#define SM_NAME "tmp_ssm.txt"
int main(){
int nloop, counter_reset;
int *smo;
sem_t *mutex;
/* create, initialize, and unlink semaphore */
mutex = sem_open(SEM_NAME, O_CREAT, 0777, 1);
//sem_unlink(SEM_NAME);
/* open file, initialize to 0, map into memory */
int fd = open(SM_NAME, O_RDWR | O_CREAT);
smo = (int *) mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
close(fd);
nloop=100;
counter_reset=1000;
/* INCREMENT */
for (int i = 0; i < nloop; i++) {
sem_wait(mutex);
cout << "child: " << (*smo)++ << endl;
if(*smo>=counter_reset){
(*smo)=0;
}
sem_post(mutex);
}
exit(0);
}
Executing this code causes the process to block (deadlock) and wait forever. Looking at the addresses, they are typically found to be:
father semaphore: 0x7f2fe1813000
child semaphore: 0x7f0c4c793000
father shared memory: 0x7f2fe1811000
child shared memory: 0x7ffd175cb000
Removing 'sem_post' and 'sem_wait' makes everything work, but I need mutual exclusion while incrementing...
Don't unlink the semaphore; doing so actually removes it.
From the sem_unlink manual page:
sem_unlink() removes the named semaphore referred to by name. The semaphore name is removed immediately. The semaphore is destroyed once all other processes that have the semaphore open close it.
This means that once you've created the semaphore in the parent process, you immediately remove it. The child process then will not be able to find the semaphore, and instead creates a new one.
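Note also that the differing addresses listed in the question are not a symptom by themselves: each process has its own virtual address space, so the same shared object can be mapped at a different address in each one. What matters is that both processes open the same name, and the posted client opens "tmp_ssm.txt" whereas the parent opens "tmp_sm.txt". The following sketch illustrates the address point without the semaphore; it uses fork() without exec, has the child map the file by name itself, and the helper names map_counter and counter_after_child are hypothetical.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Maps the first int of the file at `path` as shared memory.
// Returns nullptr on failure.
int* map_counter(const char* path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);  // O_CREAT needs a mode argument
    if (fd == -1)
        return nullptr;
    if (ftruncate(fd, sizeof(int)) == -1) {       // make sure the page is backed by the file
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                                    // the mapping stays valid after close()
    return p == MAP_FAILED ? nullptr : static_cast<int*>(p);
}

// The parent and the child each create their own mapping of the same file.
// Returns the value the parent reads after the child has incremented the
// counter once, or -1 on error.
int counter_after_child(const char* path)
{
    int* parent_view = map_counter(path);
    if (!parent_view)
        return -1;
    *parent_view = 0;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        int* child_view = map_counter(path);      // separate mapping, likely another address
        if (!child_view)
            _exit(1);
        ++*child_view;
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    int value = *parent_view;                     // the child's write is visible here
    munmap(parent_view, sizeof(int));
    return value;
}
```

The same reasoning applies to the pointer returned by sem_open(): two processes can hold different pointer values and still refer to the same named semaphore, as long as the name still exists when the second process opens it.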