I am developing a system using Django + PostgreSQL. It's my first time with PostgreSQL, but I chose it because I needed its transaction and foreign-key features.
In a certain view I have to lock my tables with AccessExclusiveLock to prevent any reads or writes while the view runs, because I do some checks on the whole data set before I save/update my entities.
I noticed an intermittent error that happens from time to time. It is caused by a SELECT statement that runs directly after the LOCK statement and requests an AccessShareLock. I read on the PostgreSQL website that AccessShareLock conflicts with AccessExclusiveLock.
What I can't understand is why this happens in the first place. Why would PostgreSQL ask for an implicit lock if it already holds an explicit lock that covers it? The second thing I can't understand is why this view runs in two different PostgreSQL processes. Aren't the statements supposed to be collected into a single transaction?
Thanks in advance.
In PostgreSQL, instead of acquiring exclusive access locks, I would recommend setting the appropriate transaction isolation level for your session. So, before running your "update", send the following commands to your database:
begin;
set transaction isolation level repeatable read;
-- your SQL commands here
commit;
According to your description, you need the REPEATABLE READ isolation level.
I want to do an insert and an update at the same time in Redshift. For this I insert the data into a temporary table, remove the updated entries from the original table, and then insert all the new and updated entries. Since Redshift runs things concurrently, entries are sometimes duplicated, because the delete starts before the insert has finished. With a very long sleep between operations this does not happen, but then the script is very slow. Is it possible to run queries in parallel in Redshift?
Hope someone can help me, thanks in advance!
You should read up on MVCC (multi-version concurrency control) and transactions. Redshift can only run one query at a time (for a session), but that is not the issue. You want to COMMIT both changes at the same time (COMMIT is the action that makes changes visible to others). You do this by wrapping your SQL statements in a transaction (BEGIN ... COMMIT) and executing them in the same session (it's not clear whether you are using multiple sessions). All changes made within the transaction are only visible to the session making them UNTIL COMMIT, when ALL the changes made by the transaction become visible to everyone at the same moment.
A few things to watch out for: if your connection is in AUTOCOMMIT mode, you may break out of your transaction early and COMMIT partial results. Also, while you are inside a transaction, the source-table data you see stays unchanged, so you work with a consistent snapshot for the duration of the transaction. This means that if you have multiple sessions changing table data, you need to be careful about the order in which they COMMIT so that each one sees the right version of the data.
begin transaction;
-- your queries here (they run one after another within the same session)
end transaction;
In this specific case do this:
create temp table stage (like target);
insert into stage
select * from source
where source.filter = 'filter_expression';
begin transaction;
delete from target
using stage
where target.primarykey = stage.primarykey;
insert into target
select * from stage;
end transaction;
drop table stage;
See:
https://docs.aws.amazon.com/redshift/latest/dg/c_best-practices-upsert.html
https://docs.aws.amazon.com/redshift/latest/dg/t_updating-inserting-using-staging-tables-.html
Does anyone know how to disable auto-commit on a Sybase ASE database through Python? I am using sybpydb and there doesn't appear to be an option to do so. I've researched quite a bit online but can't see a way around it. Thanks.
You will need to change your set-up from unchained mode to chained mode.
Chained mode implicitly begins a transaction before any data-retrieval or modification statement: delete, insert, open, fetch, select, and update. You must still explicitly end the transaction with commit transaction or rollback transaction.
You can set this mode for your current session by turning on the chained option of the set statement: set chained on. However, you cannot execute the set chained command within a transaction. To return to the unchained transaction mode, set the chained option to off.
From the Sybase Documentation.
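A rough sketch of the pattern (my_table, my_other_table and the column names are placeholders; through sybpydb you would send each statement with something like cursor.execute()):
set chained on

/* in chained mode, the first data statement implicitly begins a transaction */
update my_table set my_col = 1 where id = 42
delete from my_other_table where id = 42

/* you must still end the transaction explicitly */
commit transaction

/* optionally return to unchained (auto-commit) mode */
set chained off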
I want to use different Berkeley DB databases to store different classes of objects in my application. Transactions on a single DB can be committed atomically using DbTxn::commit. However, if I'm using multiple databases, I have to create multiple transactions (one for each database), right? In that case, if committing the first succeeds but the second fails, is there a way to roll back the already committed first transaction? (As far as I understand, DbTxn::abort can no longer be used after the transaction has been committed.)
Is there some way to achieve atomic transactions across multiple databases?
If you are using multiple databases then you DON'T have to create multiple transactions. By using a single transaction, you can operate on multiple DBs.
Please see the Berkeley DB documentation for Db::open().
It has a 'DbTxn *txnid' parameter. You can pass a transaction handle returned by the DB_ENV->txn_begin() API, so a transaction should be begun before opening the DBs. Carefully read the note under the 'txnid' parameter in that documentation.
Please note that you should NOT specify the DB_AUTO_COMMIT flag in the Db::open() call. Instead, pass the same transaction handle as the 'txnid' parameter for all the DBs that you want to operate on. In this way you can achieve atomic transactions across multiple databases.
In general, you need something like a distributed transaction manager; the full answer fills books. See "The Berkeley DB Book", chapter 9, "Distributed Transactions and Data-Distribution Strategies", ISBN-10: 1-59059-672-2.
I have an update query that is based on the result of a select, typically returning more than 1000 rows.
If some of these rows are updated by other queries before this update can touch them, could that cause a problem with the records? For example, could they get out of sync with the original query?
If so, would it be better to select and update individual rows rather than in a batch?
If it makes a difference, the query is being run on Microsoft SQL Server 2008 R2.
Thanks.
No.
A table cannot be updated while something else is in the process of updating it.
Databases use concurrency control and have ACID properties to prevent exactly this type of problem.
I would recommend reading up on isolation levels. The default in SQL Server is READ COMMITTED, which means that other transactions cannot read data that has been updated but not committed by a given transaction.
This means that data returned by your select/update statement will be an accurate reflection of the database at a moment in time.
If you were to change your database to READ UNCOMMITTED, then you could get into a situation where the data from your select/update is out of sync.
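As a minimal sketch (the table and column names are placeholders), making the isolation level and the transaction explicit would look something like this:
-- READ COMMITTED is already the default; stating it here just makes the intent explicit
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

BEGIN TRANSACTION;

UPDATE t
SET    t.value = s.value
FROM   dbo.TargetTable AS t
JOIN   dbo.SourceTable AS s
       ON s.id = t.id;

COMMIT TRANSACTION;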
If you're selecting first, then updating, you can use a transaction
BEGIN TRAN
-- your SELECT (without the NOLOCK hint)
-- your update based upon select
COMMIT TRAN
However, if you're updating directly from a select, there's no need to worry about it. A single transaction is implied.
UPDATE mytable
SET value = mot.value
FROM myOtherTable mot
WHERE mytable.id = mot.id -- join on your key column(s)
BUT... do NOT do the following, otherwise you'll run into a deadlock
UPDATE mytable
SET value = mot.value
FROM myOtherTable mot WITH (NOLOCK)
WHERE mytable.id = mot.id
I am currently adding unit tests to a rather large quantity of PostgreSQL stored procedures, using pgTap.
Some of the procedures perform operations which lock rows explicitly. These locks are critical to the application.
How do I write tests that check that the rows that need to be locked have been, and that rows which shouldn't be locked aren't?
The only "clue" I have at the moment is the pgrowlocks extension, which allows a transaction to check for rows locked by another transaction. However, the current transaction doesn't seem to see its own locks, so I'd have to use something to synchronise two transaction, and unless I am quite mistaken, there's no way to do that using pgTap.
(note: using PostgreSQL 9.1)
If you can identify the ctid of the rows in question, and you know which transaction should have the rows locked, maybe you could use the pageinspect extension and look at the tuple info flags and xmax? The info flags should indicate that the row is locked, and xmax should be set to the id of the transaction holding it.
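A rough sketch of such a check (assuming the row of interest sits in block 0 of a table named my_table; both are placeholders for your own schema):
-- pageinspect must be installed by a superuser
create extension pageinspect;

-- dump the tuple headers of block 0 of my_table;
-- a non-zero t_xmax is typically the xid that holds (or last held) the row lock,
-- and the lock bits in t_infomask (see src/include/access/htup.h) tell you which kind
select lp, t_ctid, t_xmin, t_xmax, t_infomask
from heap_page_items(get_raw_page('my_table', 0))
where t_xmax <> 0;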
How do I write tests that check that the rows that need to be locked have been, and that rows which shouldn't be locked aren't?
Open a separate transaction, try to lock the same row with NOWAIT, and catch the exception.
PostgreSQL has no support for autonomous transactions, so to open a separate transaction from within a pgTAP test you will have to resort to dblink or a similar extension.
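A rough sketch of such a test (assuming dblink is installed, the connection string works for your setup, and my_table / id = 1 stand in for the row the procedure under test is expected to have locked):
select throws_ok(
    $$
    select *
    from dblink(
        'dbname=' || current_database(),
        'select id from my_table where id = 1 for update nowait'
    ) as t(id int)
    $$,
    '55P03',   -- lock_not_available: the row is already locked by our transaction
    null,
    'row 1 of my_table should be locked by the procedure under test'
);
For rows that should not be locked, the mirror-image assertion is lives_ok() with the same FOR UPDATE NOWAIT query.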
PS. I found a post where Robert Haas explains why row-level locks are not tracked in pg_locks:
(...) ungranted tuple locks show up in pg_locks, but they disappear once granted. (PostgreSQL would run out of lock table space on even a medium-sized SELECT FOR UPDATE query if we didn't do this.)
On the other hand, I don't quite understand why you want to test for lock existence: it's guaranteed after a successful LOCK command.