I have a strange scenario in which the auto identity int column in my SQL Server 2012 database is not incrementing properly.
Say I have a table which uses an int auto identity as a primary key; it is sporadically skipping increments, for example:
1, 2, 3, 4, 5, 1004, 1005
This is happening on random tables at seemingly random times; I cannot replicate it to find any pattern.
How is this happening?
Is there a way to make it stop?
This is all perfectly normal. Microsoft added sequences in SQL Server 2012 (finally, I might add) and changed the way identity values are generated. Have a look here for some explanation.
If you want to have the old behaviour, you can:
use trace flag 272 - this will cause a log record to be generated for each generated identity value. The performance of identity generation may be impacted by turning on this trace flag.
use a sequence generator with the NO CACHE setting (http://msdn.microsoft.com/en-us/library/ff878091.aspx)
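For the second option, a minimal sketch (the sequence and table names here are made up for illustration; note that trace flag 272 typically has to be set as a startup parameter, -T272, rather than with DBCC TRACEON):

-- Create a sequence that logs every value instead of caching a range,
-- so a restart cannot lose a cached block of values.
CREATE SEQUENCE dbo.seqOrderId
    START WITH 1
    INCREMENT BY 1
    NO CACHE;

-- Use it as the default for the key column in place of IDENTITY.
CREATE TABLE dbo.Orders (
    OrderId int NOT NULL
        CONSTRAINT DF_Orders_OrderId DEFAULT (NEXT VALUE FOR dbo.seqOrderId),
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderId)
);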
Got the same problem and found the following bug report for SQL Server 2012.
If it's still relevant, see the conditions that cause the issue; there are some workarounds there as well (I didn't try them, though):
Failover or Restart Results in Reseed of Identity
While trace flag 272 may work for many, it definitely won't work for hosted SQL Server Express installations. So, I created an identity table and use it through an INSTEAD OF trigger. I'm hoping this helps someone else, and/or gives others an opportunity to improve my solution. The last line of the trigger returns the last identity value added. Since I typically use this to insert a single row, it works to return the identity of that single inserted row.
The identity table:
CREATE TABLE [dbo].[tblsysIdentities](
[intTableId] [int] NOT NULL,
[intIdentityLast] [int] NOT NULL,
[strTable] [varchar](100) NOT NULL,
[tsConcurrency] [timestamp] NULL,
CONSTRAINT [PK_tblsysIdentities] PRIMARY KEY CLUSTERED
(
[intTableId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and the insert trigger:
-- INSERT --
IF OBJECT_ID ('dbo.trgtblsysTrackerMessagesIdentity', 'TR') IS NOT NULL
DROP TRIGGER dbo.trgtblsysTrackerMessagesIdentity;
GO
CREATE TRIGGER trgtblsysTrackerMessagesIdentity
ON dbo.tblsysTrackerMessages
INSTEAD OF INSERT AS
BEGIN
    DECLARE @intTrackerMessageId INT
    DECLARE @intRowCount INT
    -- Note: this read-then-update assumes inserts are serialized; under
    -- concurrent inserts an atomic UPDATE ... OUTPUT would be safer.
    SET @intRowCount = (SELECT COUNT(*) FROM INSERTED)
    SET @intTrackerMessageId = (SELECT intIdentityLast FROM tblsysIdentities WHERE intTableId = 1)
    UPDATE tblsysIdentities SET intIdentityLast = @intTrackerMessageId + @intRowCount WHERE intTableId = 1
    INSERT INTO tblsysTrackerMessages(
        [intTrackerMessageId],
        [intTrackerId],
        [strMessage],
        [intTrackerMessageTypeId],
        [datCreated],
        [strCreatedBy])
    SELECT @intTrackerMessageId + ROW_NUMBER() OVER (ORDER BY [datCreated]) AS [intTrackerMessageId],
        [intTrackerId],
        [strMessage],
        [intTrackerMessageTypeId],
        [datCreated],
        [strCreatedBy] FROM INSERTED;
    -- Return the identity assigned to the (single) inserted row.
    SELECT TOP 1 @intTrackerMessageId + @intRowCount FROM INSERTED;
END
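Note that the trigger assumes tblsysIdentities already contains a seed row for the target table; a minimal sketch of that seeding (the values are illustrative):

-- Seed row for tblsysTrackerMessages; the trigger reads and bumps intIdentityLast.
INSERT INTO dbo.tblsysIdentities (intTableId, intIdentityLast, strTable)
VALUES (1, 0, 'tblsysTrackerMessages');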
Currently I'm loading data from Google Storage to stage_table_orders using WRITE_APPEND. Since this loads both new and existing orders, the same order can appear with more than one version; the field etl_timestamp tells which row is the most recent one.
Then I WRITE_TRUNCATE my production_table_orders with a query like:
SELECT ...
FROM (
  SELECT *, ROW_NUMBER() OVER
    (PARTITION BY date_purchased, orderid ORDER BY etl_timestamp DESC) AS rn
  FROM `warehouse.stage_table_orders`
)
WHERE rn = 1
Then production_table_orders always contains the most recent version of each order.
This process is supposed to run every 3 minutes.
I'm wondering if this is the best practice.
I have around 20M rows. It seems not smart to WRITE_TRUNCATE 20M rows every 3 minutes.
Suggestion?
We are doing the same. To help improve performance though, try to partition the table by date_purchased and cluster by orderid.
Use a CTAS statement (to the table itself), as you cannot add partitioning after the fact.
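A minimal sketch of that, assuming date_purchased is a TIMESTAMP column (BigQuery may refuse to replace a table in place with a different partitioning spec, in which case write to a new table and swap):

-- Hypothetical new table name; copies the data into a partitioned, clustered layout.
CREATE TABLE `warehouse.stage_table_orders_v2`
PARTITION BY DATE(date_purchased)
CLUSTER BY orderid AS
SELECT * FROM `warehouse.stage_table_orders`;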
EDIT: use 2 tables and MERGE
Depending on your particular use case, i.e. the number of fields that could change between the old and new version, you could use 2 tables, e.g. stage_table_orders for the imported records and final_table_orders as the destination table, and do a MERGE like so:
MERGE final_table_orders F
USING stage_table_orders S
ON F.orderid = S.orderid AND
F.date_purchased = S.date_purchased
WHEN MATCHED THEN
UPDATE SET field_that_change = S.field_that_change
WHEN NOT MATCHED THEN
INSERT (field1, field2, ...) VALUES(S.field1, S.field2, ...)
Pro: efficient if only a few rows are "upserted" rather than millions (although not tested), and partition pruning should work.
Con: you have to explicitly list the fields in the UPDATE and INSERT clauses. A one-time effort if the schema is pretty much fixed.
There are many ways to de-duplicate and there is no one-size-fits-all. Search SO for similar requests using ARRAY_AGG, EXISTS with DELETE, UNION ALL, ... Try them out and see which performs better for YOUR dataset; the ARRAY_AGG variant, for instance, is sketched below.
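A sketch of that ARRAY_AGG variant against the staging table from the question (keeps only the newest version of each order):

SELECT event.*
FROM (
  -- For each order, keep the single row with the latest etl_timestamp.
  SELECT ARRAY_AGG(t ORDER BY etl_timestamp DESC LIMIT 1)[OFFSET(0)] AS event
  FROM `warehouse.stage_table_orders` AS t
  GROUP BY date_purchased, orderid
)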
I am using C++ code with embedded Pro*C (Version: 11.2.0.3.0) for Oracle DB. I am running a bulk insert clause as below:
insert into TBL1 (col1, col2)
select a.col1, b.col2 from TBL2 a, TBL3 b
where a.col1 = :v and a.col2 = b.col2
I run this query for a set of records to be inserted, binding a value for :v each time.
However, while some records could be inserted, some failed with
ORA-01403: no data found
From sqlca.sqlerrd[2] I can see the number of rows that were inserted, so I know M out of N records made it. Now I would like to know which records failed, i.e. I need a list of all the a.col1 values that caused the failure.
Is there any way out? Any clue or direction would be very helpful.
This is a bit long for a comment.
The error you are referencing is a PL/SQL error, documented here. This is not an error that an insert would normally produce.
My one guess is that the table has an insert trigger and this trigger is causing the problem.
It is also possible that your code is in a larger block, and something else in the block is causing the error.
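If you want to rule the trigger theory in or out, one quick check (a sketch; ALL_TRIGGERS is Oracle's data dictionary view, and TBL1 is the table from the question):

-- List any triggers defined on the target table.
SELECT trigger_name, trigger_type, status
FROM all_triggers
WHERE table_name = 'TBL1';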
I am using C++ 4.8 (4.9 available) and the pqxx driver ver. 4.0.1.
PostgreSQL is the latest stable release.
My problem is all about complexity and resource balance:
I need to execute an INSERT against the database (optionally getting a pqxx::result back), and the id in that table is based on nextval(table_seq_id).
Is it possible to get the id of the inserted row as a result? A workaround is to ask the db for the current value of the sequence and just run the insert with currentvalue+1 (or +n), but that requires an "insert and ask" chain.
The db should be able to store more than 6K large requests per second, so I would like to ask for the id as infrequently as possible. Bulk insert is not an option.
As documented here, you can add a RETURNING clause to the INSERT query, to return values from the inserted row(s). They give an example similar to what you want, returning an ID:
INSERT INTO distributors (did, dname) VALUES (DEFAULT, 'XYZ Widgets')
RETURNING did;
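If you ever do fall back to the "insert and ask" pattern from the question, note that currval is the safe way to ask: it returns the last value nextval produced in your own session, so concurrent writers cannot interfere. A sketch (the table and column are made up; the sequence name is from the question):

-- Assumes the table's id column defaults to nextval('table_seq_id').
INSERT INTO mytable (payload) VALUES ('...');
SELECT currval('table_seq_id');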
Alright I am in a rather difficult situation, or at least I think so anyway. I have been doing some research on how to fix my problem but have really come up empty handed.
I need to be able to renumber the rowids of my table after I delete a row, so that whenever I update or access a row by its rowid, I am addressing the correct one.
Now, for those of you asking why: basically I am interfacing with a "homebrewed" db that was programmed in C and is really just a bunch of memory locations all accessed as if they were a db table. They can look up a row by searching for a value in the table, or by simply saying "I want row 6". Lastly, the table could consist of anything, with any values, which means they don't create a column to serve as an index; to my knowledge, the rowid is ultimately the only thing that lets me address their rows by row number.
I have found that VACUUM would do what I need, but it appears the system the database lives on isn't giving sqlite write privileges, so when VACUUM is run it comes back with an error (ERROR 14, "Unable to open the database file"). (I also know that my db is open, so that isn't the issue; not having write privileges is the only explanation I can come up with.) I have also read some things about AUTOINCREMENT, but I didn't really understand it or think it was going to fix my problem.
Any suggestions or ideas from the sqlite or database geniuses out there would be appreciated.
Not sure if I have completely understood your problem, but if you can use SQL code, maybe you can write a query that renumbers the IDs so they are dense again.
You can use a query like this:
UPDATE t1
SET id = (SELECT rank
          FROM (SELECT id,
                       (SELECT count() + 1
                        FROM (SELECT DISTINCT id
                              FROM t1 AS t
                              WHERE t.id < t1.id)
                       ) AS rank
                FROM t1
               ) AS sub
          WHERE sub.id = t1.id);
You can check my demo on SQL Fiddle. In the demo you will see the result of the DELETE and UPDATE statements (simulating your case) if you run all the queries together.
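For reference, a minimal local reproduction of that demo (table name and sample values assumed):

-- Build a table, knock holes in the id sequence, then run the UPDATE above.
CREATE TABLE t1 (id INTEGER, name TEXT);
INSERT INTO t1 (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
DELETE FROM t1 WHERE id IN (2, 4);  -- leaves gaps: 1, 3, 5
-- ...UPDATE from above...
SELECT * FROM t1 ORDER BY id;       -- ids are dense again: 1, 2, 3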
I have been searching for a while on how to get the generated auto-increment ID from an "INSERT INTO ... (...) VALUES (...)". Even on stackoverflow, I only find the answer of using a "SELECT LAST_INSERT_ID()" in a subsequent query. I find this solution unsatisfactory for a number of reasons:
1) This will effectively double the queries sent to the database, especially since it is mostly handling inserts.
2) What will happen if more than one thread accesses the database at the same time? What if more than one application accesses the database at the same time? It seems to me the values are bound to become erroneous.
It's hard for me to believe that the MySQL C++ Connector wouldn't offer the feature that the Java Connector as well as the PHP Connector offer.
An example taken from http://forums.mysql.com/read.php?167,294960,295250
sql::Statement* stmt = conn->createStatement();
sql::ResultSet* res = stmt->executeQuery("SELECT @@identity AS id");
res->next();
my_ulong retVal = res->getInt64("id");
In a nutshell, if your ID column is an auto_increment column, you can just as well use
SELECT @@identity AS id
EDIT:
Not sure what you mean by a second query/round trip. At first I thought you were looking for a different way to get the ID of the last inserted row, but it looks like you are more interested in whether you can save the round trip.
If that's the case, then I completely agree with @WhozCraig; you can punch in both your queries as a single statement, like insert into tab values ...; select last_insert_id(), which will be a single call, as sketched below.
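A sketch of that batch (the table and columns are made up, and the connection must have multi-statement execution enabled):

-- Both statements travel in one round trip; the second returns the new id.
INSERT INTO mytab (col1, col2) VALUES ('a', 'b');
SELECT LAST_INSERT_ID();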
OR
you can have a stored procedure like the one below to do the same and save the round trip
DELIMITER //
CREATE PROCEDURE myproc()
BEGIN
    INSERT INTO mytab VALUES ...;
    SELECT LAST_INSERT_ID();
END //
DELIMITER ;
Let me know if this is not what you are trying to achieve.