RocksDB background threads are still working after deleting the DB instance - C++

My startup method:
vector<ColumnFamilyDescriptor> columnFamilies = ...
DBOptions dbOptions(options);
std::vector<int32_t> ttls = ...
DBWithTTL* _db;
std::vector<ColumnFamilyHandle*> _handles;
Status status = DBWithTTL::Open(dbOptions, WORKING_DIRECTORY, columnFamilies, &_handles, &_db, ttls, false);
My shutdown method:
for (auto handle : _handles) {
delete handle;
}
delete _db->GetBaseDB();
But after shutdown completes, I'm still getting merge requests with a stack under rocksdb::DBImpl::BGWorkCompaction(void* arg), which of course fail because all of the column family handles were already disposed of.
How can I tell any pending compaction or flush to stop? Deleting the DB instance doesn't seem to be enough.
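One approach worth trying (a sketch only, not verified against your RocksDB version): cancel the background jobs before tearing anything down, using rocksdb::CancelAllBackgroundWork() from rocksdb/convenience.h with wait = true, so in-flight compactions and flushes drain before the handles go away:
#include <rocksdb/convenience.h>
// Stop scheduling new compactions/flushes and wait for the running ones to
// finish *before* deleting the column family handles and the DB itself.
rocksdb::CancelAllBackgroundWork(_db->GetBaseDB(), /* wait = */ true);
for (auto handle : _handles) {
delete handle;
}
// Depending on the RocksDB version, the DBWithTTL wrapper may own the base DB;
// deleting only GetBaseDB() can leave the wrapper alive, so deleting the
// wrapper itself is the safer order.
delete _db;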

Related

DDBLockClient - Non-leaders won't compete for leadership after failing the first time

I am using the dynamodb-lock-client to choose the leader host.
Expectation: if the leader host is dead, a new host should become leader within the next 12 seconds.
Followed this:
https://aws.amazon.com/blogs/database/building-distributed-locks-with-the-dynamodb-lock-client/
https://github.com/amazon-archives/dynamodb-lock-client
Reality: if the leader host is dead, the other (non-leader) hosts won't try to become leader. How do I set up the DDB lock so that if the leader is dead, another host becomes leader within the next 10 seconds?
@Provides
public AmazonDynamoDBLockClient getLeaderSelectionLockClient(AmazonDynamoDB dynamoDB) {
final AmazonDynamoDBLockClient client = new AmazonDynamoDBLockClient(
AmazonDynamoDBLockClientOptions.builder(dynamoDB, LEADER_SELECTION_DDB_TABLE)
.withTimeUnit(TimeUnit.SECONDS)
.withLeaseDuration(10L)
.withHeartbeatPeriod(3L)
.withOwnerName(EC2MetadataUtils.getInstanceId())
.withCreateHeartbeatBackgroundThread(true) // if true, the holder keeps heartbeating and stays leader until it dies
.build());
return client;
}
@Provides
public AcquireLockOptions getAcquireLockOptionsForEnchanter() {
final String keyToLock = "LEADER";
AcquireLockOptions acquireLockOptions = AcquireLockOptions
.builder(keyToLock)
.withRefreshPeriod(11L)
.withTimeUnit(TimeUnit.SECONDS)
.build();
return acquireLockOptions;
}
public void competeForLeadership() {
final String lockSuccessMessage =
"Acquired lock! If I die, my lock will expire in 10 seconds. Otherwise, I will hold it until I stop "
+ "heartbeating. " + EC2MetadataUtils.getInstanceId();
try {
final Optional<LockItem> lockItem = dynamoDBLockClient.tryAcquireLock(acquireLockOptions);
if (lockItem.isPresent()) {
log.info(lockSuccessMessage);
} else {
log.error("Failed to acquire lock!");
}
} catch (Exception e) {
log.error("Leader Selector is down");
}
}
As per the documentation (https://www.mvndoc.com/c/com.amazonaws/dynamodb-lock-client/com/amazonaws/services/dynamodbv2/AcquireLockOptions.AcquireLockOptionsBuilder.html#withAdditionalTimeToWaitForLock-java.lang.Long-), withRefreshPeriod should have solved this problem, but it is not.
How do I set this up correctly?
It looks like tryAcquireLock is a non-blocking call, since control comes back to my code after it fails to get the lock with the message "Failed to acquire lock!".
Is it possible to keep retrying until it gets the lock?
Thanks
Jk
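One way to approximate "keep retrying until the lock is acquired": the refresh period is only used while the client is actively waiting for a lock, and tryAcquireLock without additional wait time returns immediately. A minimal sketch (untested; it reuses the dynamoDBLockClient and acquireLockOptions shown above and assumes a single-threaded scheduler is acceptable):
ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
scheduler.scheduleWithFixedDelay(() -> {
try {
// Retry on a fixed schedule; once the dead leader's lease expires,
// one of the waiting hosts wins the lock and becomes leader.
Optional<LockItem> lockItem = dynamoDBLockClient.tryAcquireLock(acquireLockOptions);
lockItem.ifPresent(item -> log.info("Acquired leadership: " + item.getOwnerName()));
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}, 0L, 3L, TimeUnit.SECONDS);
Alternatively, the blocking acquireLock(...) combined with withAdditionalTimeToWaitForLock(...) on the AcquireLockOptions builder makes the client wait (polling every refresh period) instead of returning straight away.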

Understanding the Siddhi snapshot concept

I have a query and an execution plan, and I want to take a snapshot of it so that I can restore it on the receiver side and start executing it again.
What format should be sent to the receiver?
How do I restore it on the receiver side?
Following is some code I have taken from the Siddhi repository.
SiddhiManager siddhiManager = new SiddhiManager();
String query =
"define stream inStream(meta_roomNumber int,meta_temperature double);" +
"from inStream#window(10)[meta_temperature > 50]\n" +
"select *" +
"insert into outStream;";
ExecutionPlanRuntime executionPlanRuntime = siddhiManager.createExecutionPlanRuntime(query);
executionPlanRuntime.start();
SiddhiContext siddhicontext = new SiddhiContext();
context.setSiddhiContext(siddhicontext); // note: 'context' is never declared above - this snippet does not compile as-is
context.setSnapshotService(new SnapshotService(context));
executionPlanRuntime.snapshot();
You can use a PersistenceStore to persist the state (snapshot) of the execution plan and restore it later. Please refer to Siddhi's PersistenceTestCase to get an idea of its usage, i.e.:
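A persistence store must be registered on the SiddhiManager before persist() is called; the InMemoryPersistenceStore below is the one Siddhi's own PersistenceTestCase uses (a file- or database-backed store works the same way):
// Register a persistence store up front; persist() writes snapshots to it
// and restoreLastRevision() reads them back
PersistenceStore persistenceStore = new InMemoryPersistenceStore();
siddhiManager.setPersistenceStore(persistenceStore);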
// Create executionPlanRuntime
ExecutionPlanRuntime executionPlanRuntime = siddhiManager.createExecutionPlanRuntime(executionPlan);
// Register Callbacks, InputHandlers
executionPlanRuntime.addCallback("query1", queryCallback);
stream1 = executionPlanRuntime.getInputHandler("Stream1");
// Start executionPlanRuntime
executionPlanRuntime.start();
// Send events
stream1.send(new Object[]{"WSO2", 25.6f, 100});
Thread.sleep(100);
stream1.send(new Object[]{"GOOG", 47.6f, 100});
Thread.sleep(100);
// Persist the state
executionPlanRuntime.persist();
// Shutdown the running execution plan
executionPlanRuntime.shutdown();
// Create new executionPlanRuntime
executionPlanRuntime = siddhiManager.createExecutionPlanRuntime(executionPlan);
// Register Callbacks, InputHandlers
executionPlanRuntime.addCallback("query1", queryCallback);
stream1 = executionPlanRuntime.getInputHandler("Stream1");
// Start executionPlanRuntime
executionPlanRuntime.start();
// Restore to previously persisted state
executionPlanRuntime.restoreLastRevision();

MySQL Asynchronous?

I'm basically facing a blocking problem.
I have my server coded based on C++ Boost.Asio, using 8 threads since the server has 8 logical cores.
My problem is that a thread may block for 0.2~1.5 seconds on a MySQL query, and I honestly don't know how to get around that, since MySQL Connector/C++ does not support asynchronous queries and I don't know how to design the server "correctly" to use multiple threads for the queries.
This is where I'm asking for opinions on what to do in this case.
Should I create 100 threads just for asynchronous SQL queries?
Could I have an opinion from experts about this?
Okay, the proper solution to this would be to extend Asio and write a mysql_service implementation to integrate this. I was almost going to find out how this is done right away, but I wanted to get started using an "emulation".
The idea is to have
your business processes using an io_service (as you are already doing)
a database "facade" interface that dispatches async queries into a different queue (io_service) and posts the completion handler back onto the business_process io_service
A subtle tweak is needed here: you need to keep the io_service on the business process side from shutting down as soon as its job queue is empty, since it might still be awaiting a response from the database layer.
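In isolation, that keep-alive trick is just a work object held in a boost::optional (a minimal sketch; the demo below uses the same idiom):
boost::asio::io_service app;
// While the work object is alive, app.run() keeps running even when the
// handler queue is momentarily empty.
boost::optional<boost::asio::io_service::work> keep_alive = boost::asio::io_service::work(app);
// ... async operations get posted from elsewhere ...
keep_alive.reset(); // release the work; run() returns once the queue drains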
So, modeling this into a quick demo:
namespace database
{
// data types
struct sql_statement { std::string dml; };
struct sql_response { std::string echo_dml; }; // TODO cover response codes, resultset data etc.
I hope you will forgive my gross simplifications :/
struct service
{
service(unsigned max_concurrent_requests = 10)
: work(io_service::work(service_)),
latency(mt19937(), uniform_int<int>(200, 1500)) // random 0.2 ~ 1.5s
{
for (unsigned i = 0; i < max_concurrent_requests; ++i)
svc_threads.create_thread(boost::bind(&io_service::run, &service_));
}
friend struct connection;
private:
void async_query(io_service& external, sql_statement query, boost::function<void(sql_response response)> completion_handler)
{
service_.post(bind(&service::do_async_query, this, ref(external), std::move(query), completion_handler));
}
void do_async_query(io_service& external, sql_statement q, boost::function<void(sql_response response)> completion_handler)
{
this_thread::sleep_for(chrono::milliseconds(latency())); // simulate the latency of a db-roundtrip
external.post(bind(completion_handler, sql_response { q.dml }));
}
io_service service_;
thread_group svc_threads; // note the order of declaration
optional<io_service::work> work;
// for random delay
random::variate_generator<mt19937, uniform_int<int> > latency;
};
The service is what coordinates a maximum number of concurrent requests (on the "database io_service" side) and ping/pongs the completion back onto another io_service (the async_query/do_async_query combo). This stub implementation emulates latencies of 0.2~1.5s in the obvious way :)
Now comes the client "facade"
struct connection
{
connection(int connection_id, io_service& external, service& svc)
: connection_id(connection_id),
external_(external),
db_service_(svc)
{ }
void async_query(sql_statement query, boost::function<void(sql_response response)> completion_handler)
{
db_service_.async_query(external_, std::move(query), completion_handler);
}
private:
int connection_id;
io_service& external_;
service& db_service_;
};
connection is really only a convenience so we don't have to explicitly deal with various queues on the calling site.
Now, let's implement a demo business process in good old Asio style:
namespace domain
{
struct business_process : id_generator
{
business_process(io_service& app_service, database::service& db_service_)
: id(generate_id()), phase(0),
in_progress(io_service::work(app_service)),
db(id, app_service, db_service_)
{
app_service.post([=] { start_select(); });
}
private:
int id, phase;
optional<io_service::work> in_progress;
database::connection db;
void start_select() {
db.async_query({ "select * from tasks where completed = false" }, [=] (database::sql_response r) { handle_db_response(r); });
}
void handle_db_response(database::sql_response r) {
if (phase++ < 4)
{
if ((id + phase) % 3 == 0) // vary the behaviour slightly
{
db.async_query({ "insert into tasks (text, completed) values ('hello', false)" }, [=] (database::sql_response r) { handle_db_response(r); });
} else
{
db.async_query({ "update * tasks set text = 'update' where id = 123" }, [=] (database::sql_response r) { handle_db_response(r); });
}
} else
{
in_progress.reset();
lock_guard<mutex> lk(console_mx);
std::cout << "business_process " << id << " has completed its work\n";
}
}
};
}
This business process starts by posting itself on the app service. It then does a number of DB queries in succession and eventually exits (by doing in_progress.reset(), the app service is made aware of this).
A demonstration main, starting 10 business processes on a single thread:
int main()
{
io_service app;
database::service db;
ptr_vector<domain::business_process> bps;
for (int i = 0; i < 10; ++i)
{
bps.push_back(new domain::business_process(app, db));
}
app.run();
}
In my sample, business_processes don't do any CPU-intensive work, so there's no use in scheduling them across CPUs, but if you wanted to, you could easily achieve this by replacing the app.run() line with:
thread_group g;
for (unsigned i = 0; i < thread::hardware_concurrency(); ++i)
g.create_thread(boost::bind(&io_service::run, &app));
g.join_all();
See the demo running Live On Coliru
I'm not a MySQL guru, but the following is generic multithreading advice.
Having NumberOfThreads == NumberOfCores is appropriate when none of the threads ever block and you are just splitting the load over all CPUs.
A common pattern is to have multiple threads per CPU, so one is executing while another is waiting on something.
In your case, I'd be inclined to set NumberOfThreads = n * NumberOfCores, where 'n' is read from a config file, a registry entry, or some other user-settable value. You can test the system with different values of 'n' to find the optimum. I'd suggest somewhere around 3 for a first guess.
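A sketch of the config-driven variant (the DB_THREADS_PER_CORE environment variable and the default of 3 are made up for illustration):
#include <cstdlib>
// Threads per core from the environment, defaulting to the suggested 3.
unsigned n = 3;
if (const char* env = std::getenv("DB_THREADS_PER_CORE"))
n = static_cast<unsigned>(std::atoi(env));
thread_group g;
for (unsigned i = 0; i < n * thread::hardware_concurrency(); ++i)
g.create_thread(boost::bind(&io_service::run, &app));
g.join_all();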

Deadlock possibility in Entity Framework

I have this web method (called from an Android app):
[WebMethod]
public bool addVotes(string username, string password, int votes)
{
bool success = false;
if (Membership.ValidateUser(username, password))
{
// DbContext/AppUsers stand in for the app's derived context and entity types;
// dispose the context when the unit of work completes
using (var context = new DbContext())
{
AppUsers user = context.AppUsers.Where(x => x.Username.Equals(username)).FirstOrDefault();
if (user != null)
{
user.Votat += votes;
context.SaveChanges();
success = true;
}
}
}
return success;
}
This web service will be called by (probably) 80 users in the same period of time (within two or three hours). I am afraid that a deadlock can occur while reading or updating data in the database. Could you tell me whether there is a possibility of a deadlock, and if there is, how I can prevent it with EF, SQL, or whatever?
With this code: you can't.
AppUsers user = context.AppUsers.Where(x => x.Username.Equals(username)).FirstOrDefault();
This line will wait for a read lock, but eventually it will acquire one, so no deadlock is possible.
context.SaveChanges();
This line will try to update your user table. It will wait for a write lock, but it will eventually get one and then move on.
You can usually only get a deadlock when inserting/deleting/... across multiple tables, and it typically happens during a cursor iteration. I have yet to bump into a situation where EF ends up in a deadlock, so I wouldn't worry about it too much.
Maybe you'll find this article useful: http://blogs.msdn.com/b/diego/archive/2012/04/01/tips-to-avoid-deadlocks-in-entity-framework-applications.aspx
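That said, if you want a belt-and-braces guard, a common pattern is to retry the whole unit of work when SQL Server picks it as a deadlock victim. A sketch only (1205 is SQL Server's deadlock-victim error number; DbContext/AppUsers mirror the stand-in names from the question):
// using System.Data.SqlClient; using System.Threading;
const int maxRetries = 3;
for (int attempt = 0; attempt < maxRetries; attempt++)
{
try
{
using (var context = new DbContext())
{
AppUsers user = context.AppUsers.Where(x => x.Username.Equals(username)).FirstOrDefault();
if (user != null)
{
user.Votat += votes;
context.SaveChanges();
}
}
break; // unit of work committed; stop retrying
}
catch (Exception ex)
{
// EF wraps provider errors, so inspect the innermost exception.
var sql = ex.GetBaseException() as SqlException;
if (sql == null || sql.Number != 1205 || attempt == maxRetries - 1)
throw; // not a deadlock, or out of retries
Thread.Sleep(100 * (attempt + 1)); // brief backoff before retrying
}
}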

AppFabric Cache concurrency issue?

While stress testing a prototype of our brand new primary system, I ran into a concurrency issue with AppFabric Cache. When concurrently calling DataCache.Get() and Put() many times with the same cacheKey, where I attempt to store a relatively large object, I receive "ErrorCode:SubStatus:There is a temporary failure. Please retry later." It is reproducible with the following code:
var dcfc = new DataCacheFactoryConfiguration
{
Servers = new[] {new DataCacheServerEndpoint("localhost", 22233)},
SecurityProperties = new DataCacheSecurity(DataCacheSecurityMode.None, DataCacheProtectionLevel.None),
};
var dcf = new DataCacheFactory(dcfc);
var dc = dcf.GetDefaultCache();
const string key = "a";
var value = new int[256 * 1024]; // 1 MB
for (int i = 0; i < 300; i++)
{
var putT = new Thread(() => dc.Put(key, value));
putT.Start();
var getT = new Thread(() => dc.Get(key));
getT.Start();
}
When Get() is called with a different key, or access to the DataCache is synchronized, this issue does not appear. Obtaining a new DataCache from the DataCacheFactory for each call (DataCache is supposed to be thread-safe) or prolonging the timeouts has no effect; the error is still received.
It seems very strange to me that MS would leave such a bug. Has anybody faced a similar issue?
I also see the same behavior and my understanding is that this is by design. The cache contains two concurrency models:
Optimistic Concurrency Model methods: Get, Put, ...
Pessimistic Concurrency Model: GetAndLock, PutAndLock, Unlock
If you use optimistic concurrency model methods like Get then you have to be ready to get DataCacheErrorCode.RetryLater and handle that appropriately - I also use a retry approach.
You might find more information at MSDN: Concurrency Models
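For the cases where last-writer-wins Put is not acceptable, the pessimistic model serializes writers per key. A rough sketch (it reuses dc and key from the repro above; note the TimeSpan is the lock lifetime, not a wait time):
DataCacheLockHandle lockHandle;
// The lock auto-expires after the TimeSpan. A concurrent GetAndLock on an
// already-locked key fails with ErrorCode ObjectLocked, so callers must retry.
object current = dc.GetAndLock(key, TimeSpan.FromSeconds(10), out lockHandle);
try
{
var updated = new int[256 * 1024]; // compute the new value here
dc.PutAndUnlock(key, updated, lockHandle); // write and release in one call
}
catch
{
dc.Unlock(key, lockHandle); // don't leak the lock if the update fails
throw;
}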
We have seen this problem as well in our code. We solved it by overloading the Get method to catch exceptions and then retry the call N times before falling back to a direct request to SQL.
Here is the code that we use to get data from the cache:
private static bool TryGetFromCache(string cacheKey, string region, out GetMappingValuesToCacheResult cacheResult, int counter = 0)
{
cacheResult = new GetMappingValuesToCacheResult();
try
{
// use as instead of cast, as this will return null instead of exception caused by casting.
if (_cache == null) return false;
cacheResult = _cache.Get(cacheKey, region) as GetMappingValuesToCacheResult;
return cacheResult != null;
}
catch (DataCacheException dataCacheException)
{
switch (dataCacheException.ErrorCode)
{
case DataCacheErrorCode.KeyDoesNotExist:
case DataCacheErrorCode.RegionDoesNotExist:
return false;
case DataCacheErrorCode.Timeout:
case DataCacheErrorCode.RetryLater:
if (counter > 9) return false; // we tried 10 times, so we will give up.
counter++;
Thread.Sleep(100);
return TryGetFromCache(cacheKey, region, out cacheResult, counter);
default:
EventLog.WriteEntry(EventViewerSource, "TryGetFromCache: DataCacheException caught:\n" +
dataCacheException.Message, EventLogEntryType.Error);
return false;
}
}
}
Then when we need to get something from the cache we do:
TryGetFromCache(key, region, out cachedMapping)
This allows us to use Try methods that encapsulate the exceptions. If it returns false, we know something is wrong with the cache and we can access SQL directly.