Reading the state just after writing into it in the same chaincode - blockchain

I have set up Fabric as per the instructions using Docker and everything works fine. I have a chaincode which stores a value in the world state, which I can read afterwards using a query method.
My scenario is something like this: I submit multiple separate requests within a short period of time to store different data in the world state. Within each request I need to read the data that was submitted just previously. However, I am unable to read the most recently submitted data.
My understanding is that this might be because that data has not been stored in the blockchain yet and hence cannot be read. With this understanding, I introduced a sleep of a few seconds to give the previously submitted data enough time to be included in the blockchain. However, this approach was not successful.
So I am wondering if there is any way to read the previous data just after storing the subsequent data.
Thanks,
Ripul

Waiting a few seconds in the chaincode would not be sufficient. Data that is 'written' in chaincode is not yet committed to the database; it is only a proposal to write something to the database at that point. Only committed data is read back in chaincode. Therefore, after you make an update in chaincode and get the proposal response, you must submit the transaction to ordering. It may take a few seconds for the orderer to cut the block, distribute it to peers, and have the peers commit the data. Only then can the data be read back in chaincode.
If you must read the data that you just wrote within the same chaincode function, then you will need to keep a map of the data that has been written and retrieve the value from the map rather than from the committed database.
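As a rough illustration of that pattern (Fabric chaincode is normally written in Go, Node.js, or Java; the Python below is only a sketch of the idea, and put_state/get_state are hypothetical stand-ins for the stub's PutState/GetState):

    # Sketch only: not real Fabric chaincode. The "stub" is a hypothetical
    # key-value interface; the point is the write-through map, not the API.
    class WriteThroughState:
        def __init__(self, stub):
            self.stub = stub      # hypothetical object with get_state/put_state
            self.pending = {}     # values written during this invocation

        def put(self, key, value):
            self.stub.put_state(key, value)   # still only part of the proposal's write set
            self.pending[key] = value         # remember it locally

        def get(self, key):
            # Prefer a value written earlier in this same invocation,
            # otherwise fall back to the committed world state.
            if key in self.pending:
                return self.pending[key]
            return self.stub.get_state(key)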

Related

Is Redis atomic when multiple clients attempt to read/write an item at the same time?

Let's say that I have several AWS Lambda functions that make up my API. One of the functions reads a specific value from a specific key on a single Redis node. The business logic goes as follows:
if the key exists:
    serve the value of that key to the client
if the key does not exist:
    get the most recent item from DynamoDB
    insert that item as the value for that key, and set an expiration time
    delete that item from DynamoDB, so that it only gets read into memory once
    serve the value of that key to the client
The idea is that every time a client makes a request, they get the value they need. If the key has expired, then lambda needs to first get the item from the database and put it back into Redis.
But what happens if 2 clients make an API call to Lambda simultaneously? Will both Lambda processes read that there is no key, and will both take an item from the database?
My goal is to implement a queue where a certain item lives in memory for only X amount of time, and as soon as that item expires, the next item should be pulled from the database, and when it is pulled, it should also be deleted so that it won't be pulled again.
I'm trying to see if there's a way to do this without having a separate EC2 process that's just keeping track of timing.
Is redis+lambda+dynamoDB a good setup for what I'm trying to accomplish, or are there better ways?
A Redis server will execute commands (or transactions, or scripts) atomically. But a sequence of operations involving separate services (e.g. Redis and DynamoDB) will not be atomic.
One approach is to make them atomic by adding some kind of lock around your business logic. This can be done with Redis, for example.
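For what it's worth, a minimal sketch of that locking approach with redis-py's built-in lock helper (the lock name and timeouts are placeholders):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def refresh_current_item():
        # timeout: auto-release so a crashed Lambda can't hold the lock forever;
        # blocking_timeout: how long other callers wait before giving up.
        with r.lock("refresh-current-item", timeout=10, blocking_timeout=5):
            # read from DynamoDB, write to Redis, delete from DynamoDB;
            # serialized, since only one caller holds the lock at a time
            pass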
However, that's a costly and rather cumbersome solution, so if possible it's better to simply design your business logic to be resilient in the face of concurrent operations. To do that you have to look at the steps and imagine what can happen if multiple clients are running at the same time.
In your case, the flaw I can see is that two values can be read and deleted from DynamoDB, one writing over the other in Redis. That can be avoided by using Redis's SETNX (SET if Not eXists) command. Something like this:
1. GET the key from Redis
2. If the value exists:
   Serve the value to the client
3. If the value does not exist:
   Get the most recent item from DynamoDB
   Insert that item into Redis with SETNX
   If the key already exists, go back to step 1
   Set an expiration time with EXPIRE
   Delete that item from DynamoDB
   Serve the value to the client
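A rough Python sketch of those steps with redis-py and boto3 (the cache key, table name, and the "most recent item" query are assumptions about your schema; note that redis-py's set(..., nx=True, ex=...) combines the SETNX and EXPIRE steps into one atomic call):

    import boto3
    import redis

    r = redis.Redis(host="localhost", port=6379)
    table = boto3.resource("dynamodb").Table("queue")   # placeholder table name

    CACHE_KEY = "current_item"
    TTL_SECONDS = 60

    def get_current_item():
        while True:
            value = r.get(CACHE_KEY)          # step 1: GET the key
            if value is not None:
                return value                  # step 2: serve it

            # step 3: fetch the most recent item (the real query depends on your schema)
            items = table.scan(Limit=1).get("Items")
            if not items:
                return None
            item = items[0]

            # SETNX + EXPIRE in one call: only the first concurrent caller wins
            if r.set(CACHE_KEY, item["value"], nx=True, ex=TTL_SECONDS):
                table.delete_item(Key={"id": item["id"]})   # safe: we won the race
                return item["value"]
            # another caller set the key first; go back to step 1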

How to deal with a failed chain on Ethereum?

I am building a decentralized application which copies data from the blockchain into a MySQL database.
I'm not sure, but I guess it is possible that one part of the Ethereum network accepts a newly mined transaction X and another part accepts a mined transaction Y. Some time later, one of these transactions should be accepted by the full chain and the other should fail.
If my node ends up on the wrong chain, I will have incorrect data in my MySQL database, and it will be hard to revert the database.
How do I deal correctly with these types of conflicts? Should I grab data only after a certain number of confirmations (for example 5 or 10)? Or is there another approach?
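To make the "number of confirmations" idea concrete, a check like this (sketched with web3.py; the RPC endpoint and the threshold of 10 are placeholders) is what I have in mind before mirroring a transaction into MySQL:

    from web3 import Web3
    from web3.exceptions import TransactionNotFound

    w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))   # your node's RPC endpoint
    CONFIRMATIONS = 10

    def is_confirmed(tx_hash):
        try:
            receipt = w3.eth.get_transaction_receipt(tx_hash)
        except TransactionNotFound:
            return False                  # not mined yet, or dropped
        if receipt.blockNumber is None:
            return False                  # still pending
        depth = w3.eth.block_number - receipt.blockNumber + 1
        return depth >= CONFIRMATIONS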

What happens to a batch having a settlementState of settlementError?

In the Authorize.net API, when getSettledBatchList returns a settlementState of settlementError, is that the final state for the batch? What should I expect to happen to the batched transactions?
Is the same batch processed again the following day, using the same batch id, possibly resulting in a settlementState of settledSuccessfully? Or are the affected transactions automatically included in a new batch with a new batch id?
If the transactions are included in a new batch, would they then be included in multiple batches? If transactions are included in multiple batches, would getTransactionList for each of these batches return the exact same transactionStatus for transactions that were included in multiple batches, regardless of which batch id was used to make the getTransactionList request?
Question was originally asked at https://community.developer.authorize.net/t5/Integration-and-Testing/What-happens-to-a-batch-having-a-settlementState-of/td-p/58993. If the question is answered there, I'll also add the answer here.
Here's the answer posted in the Authorize.Net community for those who did not follow the link in the question:
A batch status of "settlement error" means that the batch failed. There are different reasons a batch could fail, depending on the processor the merchant is using, and different causes of failure. A failed batch needs to be reset, which means that the merchant will need to contact Authorize.Net to request a batch reset. It is important to note that batches over 30 days old cannot be reset. When resetting a batch, the merchant first needs to confirm with their MSP (Merchant Service Provider) that the batch was not funded and that the error that failed the batch has been fixed, before submitting a ticket for the batch to be reset.
Resetting a batch doesn't really modify the batch; what it does is take the transactions from the batch and put them back into an unsettled state so that they settle with the next batch. The transactions that were in the failed batch will still have the original submit date.
Authorize.Net just sends the batch to your MSP; you'll have to contact your MSP to have them do a three-way call with Authorize.Net to sort it out.

Ordering of streaming data with Kinesis stream and Firehose

I have an architecture dilemma for my current project, which is for near-realtime processing of a big amount of data. Here is a diagram of the current architecture:
Here is an explanation of my idea which led me to that picture:
When the API Gateway receives a request, it is put into the stream (this is because of the "fire and forget" nature of my application); that's how I came to that conclusion. The input data is separated into shards based on a specific request attribute, which guarantees me the correct order.
Then I have a Lambda which takes care of validating the input and anomaly detection. It is an abstraction which keeps the data clean for the next layer, the data enrichment. This Lambda sends the data to a Kinesis Firehose because it can back up the "raw" data (something which I definitely want to have) and also attach a transformation Lambda which does the enrichment, so I won't have to care about saving the data in S3; it comes out of the box. So everything is great until the moment where I need the ordering of the received data to be preserved (the enricher is doing sessionization), which is lost in Firehose, because there is no data separation there as there is in Kinesis streams.
The only thing I could think of is to move the sessionization into the first Lambda, which would break my abstraction, because it would start caring about data enrichment, and the bigger drawback is that the backup data would then contain enriched data, which also breaks the architecture. And all this is happening because of the missing sharding concept in Firehose.
So can someone think of a solution to this problem without losing the out-of-the-box features which AWS provides?
I think that sessionization and data enrichment are two different abstractions and will need to be split between the Lambdas.
A session is a time-bound, strictly ordered flow of events bounded by a purpose or task. You only have that information at the first Lambda stage (from the Kinesis stream categorization), and should label flows with session context at the source, where sessions can be bounded.
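As a hedged sketch (Python/boto3; the delivery stream name and payload fields are assumptions), the first Lambda can attach the session context it already has from the shard before handing records to Firehose, so the enricher can restore per-session ordering later:

    import base64
    import json
    import boto3

    firehose = boto3.client("firehose")
    DELIVERY_STREAM = "enrichment-firehose"   # placeholder name

    def handler(event, context):
        records = []
        for rec in event["Records"]:          # Kinesis stream records, ordered per shard
            payload = json.loads(base64.b64decode(rec["kinesis"]["data"]))
            # ... validation / anomaly detection would happen here ...
            payload["session_key"] = rec["kinesis"]["partitionKey"]   # label at the source
            payload["sequence"] = rec["kinesis"]["sequenceNumber"]    # lets the enricher re-order
            records.append({"Data": (json.dumps(payload) + "\n").encode("utf-8")})
        if records:
            firehose.put_record_batch(DeliveryStreamName=DELIVERY_STREAM, Records=records)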
If storing session information in a backup is a problem, it may be that the definition of a session is not well specified or subject to redefinition. If sessions are subject to future recasting, the session data already calculated can be ignored, provided enough additional data to inform the unpredictable future concepts of possible sessions has also been recorded with enough detail.
Additional enrichment providing business context (aka externally identifiable data) should process the sessions transactionally within the previously recorded boundaries.
If sessions aren't transactional at the business level, then the definition of a session is over- or under-specified. If that is the case, you are out of the stream-processing business and into batch processing, where you will need to scale state to the number of possible simultaneous interleaved sessions and their maximum durations, querying the entire corpus of events to bracket sessions of hopefully manageable durations.

Django database: how to download huge data in CSV format

I have set up my database in Django, in which I have a huge amount of data. The task is to download all the data at once in CSV format. The problem I am facing is that when the data size (in number of table rows) is up to 2000, I am able to download it, but when the number of rows reaches more than 5k, it throws a "Gateway timeout" error. How do I handle this issue? There is no table indexing as of now.
Also, when there are 2K rows available, it takes around 18 seconds to download. How can this be optimized?
First, make sure the code that is generating the CSV is as optimized as possible.
Next, the gateway timeout is coming from your front-end proxy, so simply increase the timeout there.
However, this is a temporary reprieve - as your data set grows, this timeout will be exhausted and you'll keep getting these errors.
The permanent solution is to trigger a separate process to generate the CSV in the background, and then download it once it's finished. You can do this by using celery or rq, which are both ways to queue tasks for execution (and then collect the results at a later time).
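A minimal sketch of that background approach with Celery (the task, model, and output path are placeholders): the view enqueues the export and the worker writes the CSV somewhere it can be downloaded once finished.

    import csv

    from celery import shared_task

    from myapp.models import Record   # placeholder model

    @shared_task
    def export_records_csv(path):
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["id", "name", "created"])   # placeholder columns
            for row in Record.objects.values_list("id", "name", "created").iterator():
                writer.writerow(row)
        return path

    # From a view: export_records_csv.delay("/tmp/export.csv"), then notify the
    # user (or let them poll) and serve the file when the task has finished.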
If you are currently using HttpResponse from django.http then you could try using StreamingHttpResponse instead.
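For example, the streaming pattern from the Django documentation, adapted as a sketch (the model and column names are placeholders): rows are written through a no-op buffer so the CSV is generated lazily instead of being built entirely in memory before the gateway timeout hits.

    import csv

    from django.http import StreamingHttpResponse

    from myapp.models import Record   # placeholder model

    class Echo:
        # A file-like object that just returns whatever is written to it.
        def write(self, value):
            return value

    def export_csv(request):
        writer = csv.writer(Echo())
        rows = Record.objects.values_list("id", "name", "created").iterator()
        response = StreamingHttpResponse(
            (writer.writerow(row) for row in rows),
            content_type="text/csv",
        )
        response["Content-Disposition"] = 'attachment; filename="export.csv"'
        return response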
Failing that, you could try querying the database directly. For example, if you use the MySQL database backend, these answers might help you:
dump-a-mysql-database-to-a-plaintext-csv-backup-from-the-command-line
As for the speed of the transaction, you could experiment with other database backends. However, if you need to do this often enough for the speed to be a major issue then there may be something else in the larger process which should be optimized instead.