I tried to implement a lock-free trie, but I am stuck on inserting nodes. At first I believed it would be easy (my trie has no delete methods), but even swapping one pointer atomically can be tricky.
I want to atomically swap a pointer so that it points to a structure (TrieNode), but only if it was nullptr, so that I can be sure I do not lose nodes that another thread could have inserted in between.
struct TrieNode{
    int t =0;
    std::shared_ptr<TrieNode> child{nullptr};
};
std::shared_ptr<TrieNode> root;
auto p = std::atomic_load(&root);
auto node = std::make_shared<TrieNode>();
node->t=1;
auto tmp = std::shared_ptr<TrieNode>{nullptr};
std::cout<<std::atomic_compare_exchange_strong( &(p->child), &tmp,node)<<std::endl;
std::cout<<node->t;
With this code I get exit code -1073741819 (0xC0000005).
EDIT: Thank you for all your comments. Maybe I did not specify my problem clearly, so I want to address that now. After around 10 hours of coding yesterday I changed a few things. Now I use ordinary pointers and for now it is working. I have not yet tested whether it is race-free with multiple threads inserting words. I plan to do that today.
const int ALPHABET_SIZE =4;
enum Alphabet {A,T,G,C,END};
class LFTrie{
private:
    struct TrieNode{
        std::atomic<TrieNode*> children[ALPHABET_SIZE+1];
    };
    std::atomic<TrieNode*> root = new TrieNode();
public:
    void Insert(std::string word){
        auto p =root.load();
        int index;
        for(int i=0; i<=word.size();i++){
            if(i==word.size())
                index = END;
            else
                index = WhatIndex(word[i]);
            auto expected = p->children[index].load();
            if(!expected){
                auto node = new TrieNode();
                if(! p->children[index].compare_exchange_strong(expected,node))
                    delete node;
            }
            p = p->children[index];
        }
    }
};
Now I believe it will work with many threads inserting different words. And yes, in this solution I discard the new node if the next pointer is not null. Sorry for the trouble (I am not a native speaker).
The CAS pattern should be something like:
auto expected = p->child;
while( !expected ){
    if (success at CAS(&p->child, &expected, make_null_replace() ))
        break;
}
If you aren't paying attention to the return value and to expected, and aren't testing that what you are replacing is the null you stored locally, you are in trouble.
On failure, you need to throw away the new node you made.
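As a rough sketch of that pattern with raw atomics (the Node type and the get_or_create_child name are made up for illustration, and a single child slot stands in for the real array of children):

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type for illustration; one child slot keeps the sketch short.
struct Node {
    std::atomic<Node*> child{nullptr};
};

// Returns the node that ends up in parent.child, whether this thread
// or another thread installed it.
Node* get_or_create_child(Node& parent) {
    Node* expected = parent.child.load(std::memory_order_acquire);
    if (expected != nullptr)
        return expected;              // a child was already installed

    Node* fresh = new Node();
    // Succeeds only if child is still nullptr. On failure, `expected` is
    // overwritten with the pointer that another thread installed.
    if (parent.child.compare_exchange_strong(expected, fresh,
                                             std::memory_order_acq_rel,
                                             std::memory_order_acquire))
        return fresh;

    delete fresh;                     // lost the race: throw our node away
    return expected;                  // and carry on with the winner's node
}
```

Calling it twice on the same parent should return the same pointer both times. The memory orders here are a conservative guess on my part, not something checked with a model checker.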
I have a simple linked list. There is no danger of the ABA problem, I'm happy with the blocking category, and I don't care whether my list is FIFO, LIFO or randomized, as long as an insert succeeds without making others fail.
The code for that looks something like this:
class Class {
    std::atomic<Node*> m_list;
    ...
};
void Class::add(Node* node)
{
    node->next = m_list.load(std::memory_order_acquire);
    while (!m_list.compare_exchange_weak(node->next, node, std::memory_order_acq_rel, std::memory_order_acquire));
}
where I more or less randomly filled in the memory orders.
What are the right memory orders to use here?
I've seen people use std::memory_order_relaxed in all places, one guy on SO used that too, but then std::memory_order_release for the success case of compare_exchange_weak -- and the genmc project uses memory_order_acquire / twice memory_order_acq_rel in a comparable situation, but I can't get genmc to work for a test case :(.
Using genmc, the excellent tool from Michalis Kokologiannakis, I was able to verify the required memory orders with the following test code. Unfortunately, genmc currently requires C code, but of course that doesn't matter for figuring out what the memory orders need to be.
// Install https://github.com/MPI-SWS/genmc
//
// Then test with:
//
// genmc -unroll 5 -- genmc_sll_test.c
// These header files are replaced by genmc (see /usr/local/include/genmc):
#include <pthread.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#define PRODUCER_THREADS 3
#define CONSUMER_THREADS 2
struct Node
{
    struct Node* next;
};
struct Node* const deleted = (struct Node*)0xd31373d;
_Atomic(struct Node*) list;
void* producer_thread(void* node_)
{
    struct Node* node = (struct Node*)node_;
    // Insert node at beginning of the list.
    node->next = atomic_load_explicit(&list, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(&list, &node->next,
               node, memory_order_release, memory_order_relaxed))
        ;
    return NULL;
}
void* consumer_thread(void* param)
{
    // Replace the whole list with an empty list.
    struct Node* head = atomic_exchange_explicit(&list, NULL, memory_order_acquire);
    // Delete each node that was in the list.
    while (head)
    {
        struct Node* orphan = head;
        head = orphan->next;
        // Mark the node as deleted.
        assert(orphan->next != deleted);
        orphan->next = deleted;
    }
    return NULL;
}
pthread_t t[PRODUCER_THREADS + CONSUMER_THREADS];
struct Node n[PRODUCER_THREADS]; // Initially filled with zeroes -->
                                 // none of the Node's is marked as deleted.
int main()
{
    // Start PRODUCER_THREADS threads that each append one node to the queue.
    for (int i = 0; i < PRODUCER_THREADS; ++i)
        if (pthread_create(&t[i], NULL, producer_thread, &n[i]))
            abort();
    // Start CONSUMER_THREAD threads that each delete all nodes that were added so far.
    for (int i = 0; i < CONSUMER_THREADS; ++i)
        if (pthread_create(&t[PRODUCER_THREADS + i], NULL, consumer_thread, NULL))
            abort();
    // Wait till all threads finished.
    for (int i = 0; i < PRODUCER_THREADS + CONSUMER_THREADS; ++i)
        if (pthread_join(t[i], NULL))
            abort();
    // Count number of elements still in the list.
    struct Node* l = list;
    int count = 0;
    while (l)
    {
        ++count;
        l = l->next;
    }
    // Count the number of deleted elements.
    int del_count = 0;
    for (int i = 0; i < PRODUCER_THREADS; ++i)
        if (n[i].next == deleted)
            ++del_count;
    assert(count + del_count == PRODUCER_THREADS);
    //printf("count = %d; deleted = %d\n", count, del_count);
    return 0;
}
The output of which is
$ genmc -unroll 5 -- genmc_sll_test.c
Number of complete executions explored: 6384
Total wall-clock time: 1.26s
Replacing either the memory_order_release or the memory_order_acquire with memory_order_relaxed causes an assertion failure.
In fact, it can be checked that using memory_order_relaxed exclusively is sufficient when just inserting nodes: they all end up cleanly in the list (although in a 'random' order; nothing is sequentially consistent, so the order in which they are added is not necessarily the order in which the threads tried to add them, if such a correlation exists for other reasons).
However, the memory_order_release is required so that when head is read with memory_order_acquire we can be certain that all non-atomic next pointers are visible in the "consumer" thread.
Note that there is no ABA problem here, because values used for head and next cannot be "reused" before they are deleted by the consumer_thread function, which is the only place where these nodes are allowed to be deleted. This implies that there can be only one consumer thread (the test code does NOT check for the ABA problem, so it also works with 2 CONSUMER_THREADS).
The actual code is a garbage collection mechanism where multiple "producer" threads add pointers to a singly linked list once those can be deleted, but where it is only safe to actually delete them in one specific thread (so in that case there is only one "consumer" thread, which performs this garbage collection at a well-known place in a main loop).
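Translated back to C++, the verified orders would look roughly like the sketch below. The class name PendingList and the method take_all are made up for this illustration; only the memory orders are taken from the genmc test above.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    Node* next = nullptr;
};

// Hypothetical wrapper around the list head; many producers, one consumer.
class PendingList {
public:
    // Producer side: push a node at the front of the list.
    void add(Node* node) {
        node->next = head_.load(std::memory_order_relaxed);
        // release on success publishes the write to node->next;
        // on failure node->next is refreshed with the current head.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Consumer side: detach the whole list; acquire pairs with the release above.
    Node* take_all() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<Node*> head_{nullptr};
};
```

The consumer then walks the returned chain through the plain next pointers, which is safe only because of the release/acquire pairing.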
I've stumbled upon a problem while doing my DSA (Data Structures and Algorithms) homework. I was asked to implement a B-Tree with insertion and search algorithms. So far the search is working correctly, but I'm having trouble implementing the insertion function, specifically the logic behind the B-Tree node-splitting algorithm. The pseudocode/C-style code I came up with is the following:
#include <stddef.h>
#define D 2
#define DD (2*D)
typedef struct node btreenode; //declared first so that btree can refer to it
typedef btreenode* btree;
struct node
{
    int keys[DD]; //D == 2 and DD == 2*D;
    btree pointers[DD+1];
    int index; //used to iterate through the "keys" array
};
void splitNode(btree* parent, btree* child1, btree* child2)
{
    //Copies the content from the split node to the children
    (*child1)->keys[0] = (*parent)->keys[0];
    (*child1)->keys[1] = (*parent)->keys[1];
    (*child2)->keys[0] = (*parent)->keys[2];
    (*child2)->keys[1] = (*parent)->keys[3];
    (*child1)->index = 1;
    (*child2)->index = 1;
    //"Clears" the parent node from any data
    for(int i = 0; i<DD; i++) (*parent)->keys[i] = -1;
    for(int i = 0; i<DD+1; i++) (*parent)->pointers[i] = NULL;
    //Fixes the pointers to the children
    (*parent)->index = 0;
    //the line below was taken out for creating a new node that didn't have to be there.
    //(*parent)->keys[(*parent)->index] = newNode(); // The newNode() function allocs and inserts the new key that I need to insert.
    (*parent)->pointers[(*parent)->index] = (*child1);
    (*parent)->pointers[(*parent)->index+1] = (*child2);
}
I'm almost sure that I'm messing up something with the pointers, but I'm not sure what. Any help is appreciated. Maybe I need a little bit more study on the B-Tree subject? I must add that while I can use basic input/output from C++, I need to use C-style structs.
You don't need to create a new node here. You've apparently already created the two new child nodes. All you have to do here after populating the children is make the parent now point to the two children, via a copy of the first key in each of them, and adjust its key count to two. You don't need to set the parent keys to -1 either.
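A minimal sketch of what that could look like, under a few assumptions of mine: the node being split is a full leaf, both children are already allocated, and the field count (not in the question, which uses index) holds the number of keys in use.

```cpp
#include <cassert>
#include <cstddef>

const int D = 2;
const int DD = 2 * D;

// C-style struct, as required by the assignment.
struct btreenode {
    int keys[DD];
    btreenode* pointers[DD + 1];
    int count;   // assumption: number of keys currently in use
};

// Splits a full leaf `parent` into two already-allocated children.
void splitNode(btreenode* parent, btreenode* child1, btreenode* child2) {
    // Lower half of the keys goes to child1, upper half to child2.
    for (int i = 0; i < D; i++) {
        child1->keys[i] = parent->keys[i];
        child2->keys[i] = parent->keys[D + i];
    }
    child1->count = D;
    child2->count = D;
    for (int i = 0; i <= DD; i++) {
        child1->pointers[i] = NULL;   // children of a split leaf are leaves
        child2->pointers[i] = NULL;
        parent->pointers[i] = NULL;
    }
    // The parent keeps a copy of the first key of each child and
    // points to the two children; its key count becomes two.
    parent->keys[0] = child1->keys[0];
    parent->keys[1] = child2->keys[0];
    parent->pointers[0] = child1;
    parent->pointers[1] = child2;
    parent->count = 2;
}
```

Splitting an internal node would also have to hand the old child pointers over to the new children, which this sketch leaves out.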
[EDIT] Fixed my code. It is while(temp != NULL), not while(temp->next != NULL). Sorry for posting the wrong code.
Today I took part in an online programming test. The interviewer used Codility to evaluate my code and that of the other interviewees.
At some point a question about linked lists came up. It asked to count how many items a linked list has.
I took the only possible approach to this, AFAIK:
//This is struct declaration
struct SomeStruct
{
    int value;
    SomeStruct* next;
};
int elementCount(SomeStruct* list)
{
    int count = 0;
    if(list != NULL)
    {
        SomeStruct* temp = list;
        while(temp != NULL)
        {
            count++;
            temp = temp->next;
        }
    }
    return count;
}
I remember that when I sent this code as the answer to this question, Codility pointed out that the solution was wrong because it takes too much time to execute the task.
In my head, and according to this thread on SO, there's no simple way to get the size of a linked list without traversing it.
Is there a problem with Codility when it says this solution is wrong? Or are there other approaches?
PS: the test allowed the use of the STL.
Your solution is incorrect, since it returns 1 less than the actual count. Just try applying it to a list with 1 element.
Why did you come up with this strange two-tiered structure with an if and a loop that checks temp->next? Why not just
unsigned elementCount(const SomeStruct *list)
{
    unsigned count = 0;
    for (const SomeStruct *temp = list; temp != NULL; temp = temp->next)
        ++count;
    return count;
}
I suspect that you decided to treat the element pointed to by the list as an unused, reserved "header" element. Indeed, sometimes it might make sense to implement lists that way. But I see nothing like that stated in your post. Did they tell you to treat it that way specifically?
Well, you don't have to evaluate the indirection temp->next twice for each iteration.
You can simply do
int count( SomeStruct const* pNode )
{
    int result = 0;
    while( pNode != 0 )
    {
        ++result;
        pNode = pNode->next;
    }
    return result;
}
Also, as WhozCraig notes, your code was logically wrong (yielding an off by one result), not just potentially inefficient.
Codility may be using a circularly linked list as a test case. In that case, your code will never terminate.
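If that is what the checker does, one way to stay safe is to look for a cycle first. Below is a sketch using Floyd's tortoise-and-hare check; the function names and the convention of returning -1 for a cyclic list are my own choices, not anything Codility specifies.

```cpp
#include <cassert>
#include <cstddef>

struct SomeStruct {
    int value;
    SomeStruct* next;
};

// Floyd's cycle detection: true if following `next` never reaches NULL.
bool hasCycle(const SomeStruct* list) {
    const SomeStruct* slow = list;
    const SomeStruct* fast = list;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          // one step
        fast = fast->next->next;    // two steps
        if (slow == fast)
            return true;            // the pointers met inside a cycle
    }
    return false;                   // fast ran off the end of the list
}

// Counts the elements, or returns -1 if the list is circular
// (a hypothetical convention chosen for this sketch).
int elementCountSafe(const SomeStruct* list) {
    if (hasCycle(list))
        return -1;
    int count = 0;
    for (const SomeStruct* temp = list; temp != NULL; temp = temp->next)
        ++count;
    return count;
}
```

This is still O(n) time and O(1) extra space; it just walks the list twice.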
Using the STL trivializes this, though, as it has std::list<> with a size() method.
I'm trying to deep copy a linked list. I need an algorithm that executes in linear time, O(n). This is what I have for now, but I'm not able to figure out what's going wrong with it. My application crashes, and I suspect a memory leak that I haven't been able to pin down yet. This is what I have right now:
struct node {
    struct node *next;
    struct node *ref;
};
struct node *copy(struct node *root) {
    struct node *i, *j, *new_root = NULL;
    for (i = root, j = NULL; i; j = i, i = i->next) {
        struct node *new_node;
        if (!new_node)
        {
            abort();
        }
        if (j)
        {
            j->next = new_node;
        }
        else
        {
            new_root = new_node;
        }
        new_node->ref = i->ref;
        i->ref = new_node;
    }
    if (j)
    {
        j->next = NULL;
    }
    for (i = root, j = new_root; i; i = i->next, j = j->next)
        j->ref =i->next->ref;
    return new_root;
}
Can anyone point out where I'm going wrong with this?
This piece alone:
struct node *new_node;
if (!new_node)
{
    abort();
}
Seems good for a random abort() happening. new_node is not assigned and will contain a random value. The !new_node expression could already be fatal (on some systems).
As a general hint, you should only require one for-loop, plus some code up front to establish the new_root.
But a truly deep copy would also require cloning whatever ref is pointing to. It seems to me the second loop assigns something from the original into the copy. But I'm not sure: what is ref?
One thing I immediately noticed was that you never allocate space for new_node. Since auto variables are not guaranteed to be initialized, new_node will be set to whatever value was in that memory before. You should probably start with something like:
struct node *new_node = (struct node *) malloc(sizeof(struct node));
in C, or if you're using C++:
node* new_node = new node;
Copying the list is simple enough to do. However, the requirement that the ref pointers point to the same nodes in the new list relative to the source list is going to be difficult to meet in any sort of efficient manner. First, you need some way to identify which node relative to the source list they point to. You could put some kind of identifier in each node, say an int which is set to 0 in the first node, 1 in the second, etc. Then after you've copied the list you could make another pass over the list to set up the ref pointers. The problem with this approach (other than adding another variable to each node) is that it will make the time complexity of the algorithm jump from O(n) to O(n^2).
This is possible, but it takes some work. I'll assume C++, and omit the struct keyword in struct node.
You will need to do some bookkeeping to keep track of the "ref" pointers. Here, I'm converting them to numerical indices into the original list and then back to pointers into the new list.
node *copy_list(node const *head)
{
    // maps "ref" pointers in old list to indices
    std::map<node const *, size_t> ptr_index;
    // maps indices into new list to pointers
    std::map<size_t, node *> index_ptr;
    size_t length = 0;
    node *curn; // ptr into new list
    node const *curo; // ptr into old list
    node *copy = NULL;
    for (curo = head; curo != NULL; curo = curo->next) {
        ptr_index[curo] = length;
        length++;
        // construct copy, disregarding ref for now
        curn = new node;
        curn->next = copy;
        copy = curn;
    }
    curn = copy;
    for (size_t i=0; i < length; i++, curn = curn->next)
        index_ptr[i] = curn;
    // set ref pointers in copy
    for (curo = head, curn = copy; curo != NULL; ) {
        curn->ref = index_ptr[ptr_index[curo->ref]];
        curo = curo->next;
        curn = curn->next;
    }
    return copy;
}
This algorithm runs in O(n lg n) because it stores all n list elements in an std::map, which has O(lg n) insert and retrieval complexity. It can be made linear by using a hash table instead.
NOTE: not tested, may contain bugs.
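For the hash table variant, a sketch along these lines should run in expected linear time. It maps each old node directly to its copy instead of going through indices. The name copy_list_linear is made up, and leaving ref as NULL when the original ref is NULL or points outside the list is an assumption on my part.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

struct node {
    node* next;
    node* ref;
};

node* copy_list_linear(const node* head) {
    // maps each node of the old list to its twin in the new list
    std::unordered_map<const node*, node*> twin;
    node* new_head = NULL;
    node* tail = NULL;

    // Pass 1: copy the chain in order and remember old -> new.
    for (const node* cur = head; cur != NULL; cur = cur->next) {
        node* n = new node;
        n->next = NULL;
        n->ref = NULL;
        twin[cur] = n;
        if (tail != NULL)
            tail->next = n;
        else
            new_head = n;
        tail = n;
    }

    // Pass 2: translate every ref through the map.
    for (const node* cur = head; cur != NULL; cur = cur->next) {
        if (cur->ref == NULL)
            continue;                       // nothing to translate
        std::unordered_map<const node*, node*>::const_iterator it =
            twin.find(cur->ref);
        if (it != twin.end())               // refs outside the list stay NULL
            twin[cur]->ref = it->second;
    }
    return new_head;
}
```

Each insert and lookup in std::unordered_map is O(1) on average, so both passes together are O(n) expected; the worst case with heavy hash collisions is still quadratic.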
I am working on a query processor that reads in long lists of document id's from memory and looks for matching id's. When it finds one, it creates a DOC struct containing the docid (an int) and the document's rank (a double) and pushes it on to a priority queue. My problem is that when the word(s) searched for has a long list, when I try to push the DOC on to the queue, I get the following exception:
Unhandled exception at 0x7c812afb in QueryProcessor.exe: Microsoft C++ exception: std::bad_alloc at memory location 0x0012ee88..
When the word has a short list, it works fine. I tried pushing DOC's onto the queue in several places in my code, and they all work until a certain line; after that, I get the above error. I am completely at a loss as to what is wrong because the longest list read in is less than 1 MB and I free all memory that I allocate. Why should there suddenly be a bad_alloc exception when I try to push a DOC onto a queue that has a capacity to hold it (I used a vector with enough space reserved as the underlying data structure for the priority queue)?
I know that questions like this are almost impossible to answer without seeing all the code, but it's too long to post here. I'm putting as much as I can and am anxiously hoping that someone can give me an answer, because I am at my wits' end.
The NextGEQ function reads a list of compressed blocks of docids block by block. That is, if it sees that the lastdocid in the block (in a separate list) is larger than the docid passed in, it decompresses the block and searches until it finds the right one. Each list starts with metadata about the list with the lengths of each compressed chunk and the last docid in the chunk. data.iquery points to the beginning of the metadata; data.metapointer points to wherever in the metadata the function currently is; and data.blockpointer points to the beginning of the block of uncompressed docids, if there is one. If it sees that it was already decompressed, it just searches. Below, when I call the function the first time, it decompresses a block and finds the docid; the push onto the queue after that works. The second time, it doesn't even need to decompress; that is, no new memory is allocated, but after that time, pushing on to the queue gives a bad_alloc error.
Edit: I cleaned up my code some more so that it should compile. I also added in the OpenList() and NextGEQ functions, although the latter is long, because I think the problem is caused by a heap corruption somewhere in it. Thanks a lot!
struct DOC{
    long int docid;
    long double rank;
public:
    DOC()
    {
        docid = 0;
        rank = 0.0;
    }
    DOC(int num, double ranking)
    {
        docid = num;
        rank = ranking;
    }
    bool operator>( const DOC & d ) const {
        return rank > d.rank;
    }
    bool operator<( const DOC & d ) const {
        return rank < d.rank;
    }
};
struct listnode{
    int* metapointer;
    int* blockpointer;
    int docposition;
    int frequency;
    int numberdocs;
    int* iquery;
    listnode* nextnode;
};
void QUERYMANAGER::SubmitQuery(char *query){
    listnode* startlist;
    vector<DOC> docvec;
    docvec.reserve(20);
    DOC doct;
    //create a priority queue to use as a min-heap to store the documents and rankings;
    priority_queue<DOC, vector<DOC>,std::greater<DOC>> q(docvec.begin(), docvec.end());
    q.push(doct);
    //do some processing here; startlist is a pointer to a listnode struct that starts the //linked list
    //point the linked list start pointer to the node returned by the OpenList method
    startlist = &OpenList(value);
    listnode* minpointer;
    q.push(doct);
    //start by finding the first docid in the shortest list
    int i = 0;
    q.push(doct);
    num = NextGEQ(0, *startlist);
    q.push(doct);
    while(num != -1)
    {
        q.push(doct);
        //this is where the problem starts - every previous q.push(doct) works; the one after
        //NextGEQ(num +1, *startlist) gives the bad_alloc error
        num = NextGEQ(num + 1, *startlist);
        //this is where the exception is thrown
        q.push(doct);
    }
}
//takes a word and returns a listnode struct with a pointer to the beginning of the list
//and metadata about the list
listnode QUERYMANAGER::OpenList(char* word)
{
    long int numdocs;
    //create a new node in the linked list and initialize its variables
    listnode n;
    n.iquery = cache -> GetiList(word, &numdocs);
    n.docposition = 0;
    n.frequency = 0;
    n.numberdocs = numdocs;
    //an int pointer to point to where in the metadata you are
    n.metapointer = n.iquery;
    n.nextnode = NULL;
    //an int pointer to point to the uncompressed block of data, if there is one
    n.blockpointer = NULL;
    return n;
}
int QUERYMANAGER::NextGEQ(int value, listnode& data)
{
    int lengthdocids;
    int lengthfreqs;
    int lengthpos;
    int* temp;
    int lastdocid;
    lastdocid = *(data.metapointer + 2);
    while(true)
    {
        //if it's not the first chunk in the list, the blockpointer will be pointing to the
        //most recently opened block and docpos to the current position in the block
        if( data.blockpointer && lastdocid >= value)
        {
            //if the last docid in the chunk is >= the docid we're looking for,
            //go through the chunk to look for a match
            //the last docid in the block is in lastdocid; keep going until you hit it
            while(*(data.blockpointer + data.docposition) <= lastdocid)
            {
                //compare each docid with the docid passed in; if it's greater than or equal to it, return a pointer to the docid
                if(*(data.blockpointer + data.docposition ) >= value)
                {
                    //return the next greater than or equal docid
                    return *(data.blockpointer + data.docposition);
                }
                else
                {
                    ++data.docposition;
                }
            }
            //read through the whole block; couldn't find matching docid; increment metapointer to the next block;
            //free the block's memory
            data.metapointer += 3;
            lastdocid = *(data.metapointer + 3);
            free(data.blockpointer);
            data.blockpointer = NULL;
        }
        //reached the end of a block; check the metadata to find where the next block begins and ends and whether
        //the last docid in the block is smaller or larger than the value being searched for
        //first make sure that you haven't reached the end of the list
        //if the last docid in the chunk is still smaller than the value passed in, move the metadata pointer
        //to the beginning of the next chunk's metadata; read in the new metadata
        while(true)
        // while(*(metapointers[index]) != 0 )
        {
            if(lastdocid < value && *(data.metapointer) !=0)
            {
                data.metapointer += 3;
                lastdocid = *(data.metapointer + 2);
            }
            else if(*(data.metapointer) == 0)
            {
                return -1;
            }
            else
            //we must have hit a chunk whose lastdocid is >= value; read it in
            {
                //read in the metadata
                //the length of the chunk of docid's is cumulative, so subtract the end of the last chunk
                //from the end of this chunk to get the length
                //find the end of the metadata
                temp = data.metapointer;
                while(*temp != 0)
                {
                    temp += 3;
                }
                temp += 2;
                //temp is now pointing to the beginning of the list of compressed data; use the location of metapointer
                //to calculate where to start reading and how much to read
                //if it's the first chunk in the list,the corresponding metapointer is pointing to the beginning of the query
                //so the number of bytes of docid's is just the first integer in the metadata
                if( data.metapointer == data.iquery)
                {
                    lengthdocids = *data.metapointer;
                }
                else
                {
                    //start reading from the offset of the end of the last chunk (saved in metapointers[index] - 3)
                    //plus 1 = the beginning of this chunk
                    lengthdocids = *(data.metapointer) - (*(data.metapointer - 3));
                    temp += (*(data.metapointer - 3)) / sizeof(int);
                }
                //allocate memory for an array of integers - the block of docid's uncompressed
                int* docblock = (int*)malloc(lengthdocids * 5 );
                //decompress docid's into the block of memory allocated
                s9decompress((int*)temp, lengthdocids /4, (int*) docblock, true);
                //set the blockpointer to point to the beginning of the block
                //and docpositions[index] to 0
                data.blockpointer = docblock;
                data.docposition = 0;
                break;
            }
        }
    }
}
Thank you very much, bsg.
QUERYMANAGER::OpenList returns a listnode by value. In startlist = &OpenList(value); you then proceed to take the address of the temporary object that's returned. When the temporary goes away, you may be able to access the data for a time and then it's overwritten. Could you just declare a non-pointer listnode startlist on the stack and assign it the return value directly? Then remove the * in front of other uses and see if that fixes the problem.
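To make that concrete, here is a stripped-down sketch. The listnode below keeps only two fields, OpenList takes the metadata pointer directly, and Advance and firstTwoSum are made-up stand-ins for NextGEQ and SubmitQuery; none of these are the real signatures from the question.

```cpp
#include <cassert>
#include <cstddef>

// Trimmed-down stand-in for the question's listnode.
struct listnode {
    int* metapointer;
    int docposition;
};

// Returns a listnode by value, like the question's OpenList
// (hypothetical simplified signature).
listnode OpenList(int* metadata) {
    listnode n;
    n.metapointer = metadata;
    n.docposition = 0;
    return n;
}

// Stand-in for NextGEQ: reads and advances through a reference.
int Advance(listnode& data) {
    return data.metapointer[data.docposition++];
}

int firstTwoSum(int* metadata) {
    // Keep our own copy of the returned object. It lives until the end of
    // this function, unlike the temporary whose address &OpenList(...) takes.
    listnode startlist = OpenList(metadata);
    int a = Advance(startlist);   // no '*' needed any more
    int b = Advance(startlist);
    return a + b;
}
```

Because startlist is a named local object, every later call sees the same docposition and metapointer, and nothing refers to memory that has already gone away.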
Another thing you can try is replacing all pointers with smart pointers, specifically something like boost::shared_ptr<>, depending on how much code this really is and how much you're comfortable automating the task. Smart pointers aren't the answer to everything, but they're at least safer than raw pointers.
Assuming you have heap corruption and are not in fact exhausting memory, the commonest way a heap can get corrupted is by deleting (or freeing) the same pointer twice. You can quite easily find out if this is the issue by simply commenting out all your calls to delete (or free). This will cause your program to leak like a sieve, but if it doesn't actually crash you have probably identified the problem.
The other common cause of a corrupt heap is deleting (or freeing) a pointer that wasn't ever allocated on the heap. Differentiating between the two causes of corruption is not always easy, but your first priority should be to find out if corruption is actually the problem.
Note this approach won't work too well if the things you are deleting have destructors which if not called break the semantics of your program.
Thanks for all your help. You were right, Neil - I must have managed to corrupt my heap. I'm still not sure what was causing it, but when I changed the malloc(numdocids * 5) to malloc(256) it magically stopped crashing. I suppose I should have checked whether or not my mallocs were actually succeeding! Thanks again!
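For what it's worth, checking the allocation could look something like the sketch below. The helper name allocInts is made up, and sizing the block in ints rather than raw bytes is my assumption about what the buffer is meant to hold; it does not by itself explain the original crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>

// Hypothetical helper: allocates room for `count` ints with malloc and
// refuses to hand back a null or undersized block.
int* allocInts(std::size_t count) {
    // Reject zero-sized requests and sizes whose byte count would overflow.
    if (count == 0 ||
        count > std::numeric_limits<std::size_t>::max() / sizeof(int))
        throw std::bad_alloc();
    void* p = std::malloc(count * sizeof(int));
    if (p == NULL)
        throw std::bad_alloc();   // malloc failed: report it right here
    return static_cast<int*>(p);
}
```

The block is still released with free(), exactly as in NextGEQ.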
Bsg