The differences between GeneratedValue strategies - doctrine-orm

In the Doctrine docs they mention that there are a few different strategies available for the @GeneratedValue annotation:
AUTO
SEQUENCE
TABLE
IDENTITY
UUID
CUSTOM
NONE
Would someone please explain the differences between all these strategies?

Check the latest Doctrine documentation.
Here is a summary of the possible generation strategies:
AUTO (default): Tells Doctrine to pick the strategy that is preferred by the used database platform. The preferred strategies are IDENTITY for MySQL, SQLite and MsSQL and SEQUENCE for Oracle and PostgreSQL. This strategy provides full portability.
SEQUENCE: Tells Doctrine to use a database sequence for ID generation. This strategy does not currently provide full portability. Sequences are supported by Oracle, PostgreSQL and SQL Anywhere.
IDENTITY: Tells Doctrine to use special identity columns in the database that generate a value on insertion of a row. This strategy does not currently provide full portability and is supported by the following platforms:
MySQL/SQLite/SQL Anywhere => AUTO_INCREMENT
MSSQL => IDENTITY
PostgreSQL => SERIAL
TABLE: Tells Doctrine to use a separate table for ID generation. This strategy provides full portability. This strategy is not yet implemented!
NONE: Tells Doctrine that the identifiers are assigned, and thus generated, by your code. The assignment must take place before a new entity is passed to EntityManager#persist. NONE is the same as leaving off the @GeneratedValue entirely.
SINCE VERSION 2.3:
UUID: Tells Doctrine to use the built-in Universally Unique Identifier generator. This strategy provides full portability.

Of course the accepted answer is correct, but it needs a minor update as follows:
According to the Annotation section of the documentation:
This annotation is optional and only has meaning when used in conjunction with @Id.
If this annotation is not specified with @Id, the NONE strategy is used by default.
The strategy attribute is optional.
According to the Basic Mapping section of the documentation:
SEQUENCE: Tells Doctrine to use a database sequence for ID generation. This strategy does not currently provide full portability. Sequences are supported by Oracle, PostgreSQL and SQL Anywhere.
IDENTITY: Tells Doctrine to use special identity columns in the database that generate a value on insertion of a row. This strategy does not currently provide full portability and is supported by the following platforms:
MySQL/SQLite/SQL Anywhere (AUTO_INCREMENT)
MSSQL (IDENTITY)
PostgreSQL (SERIAL).
Downvote
Regarding the downvote given by someone, it should be noted that SQL Anywhere has been added and the accepted answer needs a minor update.

From the perspective of a programmer, they all achieve the same result: providing a UNIQUE value for the primary key field. Strictly speaking, two further conditions are also met, namely that the key must also be mandatory and not null.
The only differences lie in the internal implementations which provide the primary key value. In addition, there are performance and database-compatibility factors which also need to be considered. Different databases support different strategies.
The easiest one to understand is SEQUENCE and this is generally also the one which yields the best performance advantage. Here, the database maintains an internal sequence whose nextval is accessed by an additional SQL call as illustrated below:
SELECT nextval ('hibernate_sequence')
The next value is allocated during insertion of each new row. Despite the additional SQL call, there is negligible performance impact. With SEQUENCE, it is possible to specify the initial value (default is 1) and also the allocation size (default=50) using the @SequenceGenerator annotation:
@SequenceGenerator(name="seq", initialValue=1, allocationSize=100)
The IDENTITY strategy relies on the database to generate the primary key by maintaining an additional column in the table whose next value is automatically generated whenever a new row is inserted. A separate identity generator is required for each type hierarchy.
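As an illustration, a minimal JPA-style mapping using the IDENTITY strategy could look like the sketch below (the entity and field names are hypothetical):

import jakarta.persistence.*;   // or javax.persistence.* on older JPA versions

@Entity
public class Book {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)   // the database assigns the key on INSERT
    private Long id;

    // ...
}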
The TABLE strategy relies on a separate table to store and update the sequence with each new row insertion. It uses pessimistic locks to maintain the sequence and as a result is the slowest strategy of all these options. It may be worth noting that a @TableGenerator annotation can be used to specify the generator name, table name and schema for this strategy:
@TableGenerator(name="book_generator", table="id_generator", schema="bookstore")
With the UUID option, the persistence provider (e.g. Hibernate) generates a universally unique ID of the form '8dd5f315-9788-4d00-87bb-10eed9eff566'. To select this option, simply apply the @GeneratedValue annotation above a field declaration whose data type is UUID, e.g.:
@Entity
public class UUIDDemo {

    @Id
    @GeneratedValue
    private UUID uuid;

    // ...
}
Finally, the AUTO strategy is the default and with this option, the persistence provider selects the optimal strategy for the database being used.

Related

DynamoDB 1 big table or multiple small tables?

I'm currently facing some questions regarding my database design. I'm developing an API which lets users do the following:
Create an Account (1 User owns 1 Account)
Create a Profile (1 Account owns 1-n Profiles)
Let a profile upload 2 types of items (1 Profile owns 0-n Items; the items differ in type and purpose)
Calling the API methods triggers AWS Lambda to perform the requested operations in the DynamoDB tables.
My current plan looks like this:
It should be possible to query items by specifying a time frame and the Profile ID. But I think my design completely defeats the purpose of DynamoDB: AWS documentation says that a well-designed product only requires one table.
What would be a good way to realise this architecture in one table?
Are there any drawbacks on using the current design?
What would you specify as Primary/Partition/sort key/secondary indexes in both the current design and a one-table-approach?
I’m going to give this answer assuming that you need to be able to do the following queries.
Given an Account, find all profiles
Given a Profile, find all Items
Given a Profile and a specific ItemType, find all Items
Given an Item, find the owning Profile
Given a Profile, find the owning account
One of the beauties of DynamoDB (and also a bane, perhaps) is that it is mostly schema-less. You need to have the mandatory Primary Key attributes for every item in the table, but all of the other attributes can be anything you like. In order to have a DynamoDB design with only one table, you usually need to get used to the idea of having mixed types of objects in the same table.
That being said, here’s a possible schema for your use case. My suggestion assumes that you are using something like UUIDs for your identifiers.
The partition key is a field that is simply called pkey (or whatever you want). We’ll also call the sort key skey (but again, it doesn’t really matter). Now, for an Account, the value of pkey is Account-{{uuid}} and the value of skey would be the same. For a Profile, the pkey value is also Account-{{uuid}}, but the skey value is Profile-{{uuid}}. Finally, for an Item, the pkey is Profile-{{uuid}} and the skey is Item-{{type}}-{{uuid}}. For all of the attributes of an item, don’t worry about it, just use whatever attributes you want to use.
Since the "parent" object is always the partition key, you can get any of the "child" objects simply by querying for the ID of the parent. For example, your key condition expression to get all the 'ItemType2's for a Profile would be
pkey = "Profile-{{uuid}}" AND begins_with(skey, "Item-Type2")
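As a rough sketch of that query using the AWS SDK for Java v2 (the table name, key names, and the literal Profile id below are assumptions carried over from the schema above):

import java.util.Map;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.QueryRequest;
import software.amazon.awssdk.services.dynamodb.model.QueryResponse;

public class AdjacencyListQueryDemo {
    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        // Fetch every Item of type "Type2" belonging to one Profile.
        QueryRequest request = QueryRequest.builder()
                .tableName("app-data")
                .keyConditionExpression("pkey = :p AND begins_with(skey, :s)")
                .expressionAttributeValues(Map.of(
                        ":p", AttributeValue.builder().s("Profile-11111111-2222-3333-4444-555555555555").build(),
                        ":s", AttributeValue.builder().s("Item-Type2").build()))
                .build();

        QueryResponse response = ddb.query(request);
        System.out.println("Matching items: " + response.count());
    }
}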
In this schema, your GSI has the same keys as the table, but reversed. You can query the GSI for 'Item-{{type}}-{{uuid}}' to get the owning Profile, and similarly query by Profile ID to get the owning Account.
What I have illustrated here is the adjacency list pattern. DynamoDB also has an article describing how to use composite sort keys for hierarchical data, which would also be suitable for your data, and depending on your expected queries, it might be more suitable than using the adjacency list.
You don’t have to put everything in a single table. Yes, DynamoDB recommends it, but it is far more important to make sure that your application is correct and maintainable. If having multiple tables means it’s easier to write a defect free application, then use multiple tables.

Change schema attribute cardinality based on another attribute

I have the following pseudo schemas:
A)
-- Cost-schedule: FRE494
-- Periodic: false
-- Type: Fixed
-- Value: 70.00
-- CCY: GBP
B)
-- Cost-schedule: GHK999
-- Periodic: true
-- Period start: 01/04/2015
-- Period end: 30/04/2015
-- Type: Promise
-- Filled: false
-- Value: 0.00
-- CCY: GBP
I am trying to avoid any kind of nasty hierarchy with a superclass "Cost-Schedule" and subclasses "Periodic" and "One-off". Firstly, I am using Clojure, which is not OO. I also don't want to fall into the Liskov Substitution trap.
So, as a newbie to Datomic, is there a way to dynamically change the schema so that an attribute's cardinality is modified based on another attribute's value? In this case, if Periodic is "false" we don't need to have Period-Start and Period-End. If Periodic is "true" then we need to enforce having values for these attributes.
My gut says, this is not possible. If not, how can I enforce this in the DB? It appears to me that if I have to explicitly validate the transaction before submitting it to the transactor then I am really just defining a schema outside of the constraints of Datomic which doesn't appear to be wise, given that many micro-systems will be writing/reading from the DB and coordinating humans to write 'correct' code is difficult!
Any help on how to overcome this challenge very gratefully received.
I see two sub-answers to your question.
The first is that Datomic does not define "objects". It is really closer to a plain map. Your entity B has 3 fields that entity A does not. That is fine and is not controlled in any way by Datomic. Each attribute-value pair can be added to any entity independently from any other entity. Just because one map has 4 entries, it has no relationship to another map having 7 entries, even if all of the keys in map A are also in map B.
The 2nd sub-answer is that your app must do all validation & integrity checking - Datomic won't. There is no analogue to "UNIQUE NOT NULL" in SQL, etc. However, Datomic does support Database Functions which have a chance to abort any transaction that fails a user-supplied test. So, this is one way of enforcing data integrity checks.
Please also check out Tupelo Datomic, a library I wrote to make using Datomic easier.

How to generate unique keys for caching items in ColdFusion

I posted a similar question over on the Adobe Community forums, but it was suggested to ask over here as well.
I'm trying to cache distinct queries associated with a particular database, and need to be able to flush all of the queries for that database while leaving other cached queries intact. So I figured I'd take advantage of ColdFusion's ehcache capabilities. I created a specific cache region to use for queries from this particular database, so I can use cacheRemoveAll(myRegionName) to flush those stored queries.
Since I need each distinct query to be cached and retrievable easily, I figured I'd hash the query parameters into a unique string that I would use for the cache key for each query. Here's the approach I've tried so far:
Create a Struct containing key value pairs of the parameters (parameter name, parameter value).
Convert the Struct to a String using SerializeJSON().
Hash the String using Hash().
Does this approach make sense? I'm wondering how others have approached cache key generation. Also, is the "MD5" algorithm adequate for this purpose, and will it guarantee unique key generation, or do I need to use "SHA"?
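To make the approach concrete, here is the same idea sketched in Java (ColdFusion runs on the JVM, and the CFML functions map onto similar calls; the class, method and parameter names below are purely illustrative). Sorting the parameter names first keeps the serialized form, and therefore the hash, deterministic:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

public class CacheKeyDemo {

    // Build a deterministic cache key: sort the parameters by name, serialize, then hash.
    static String cacheKey(Map<String, String> params) throws Exception {
        String serialized = new TreeMap<>(params).toString();            // stable, sorted order
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(serialized.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));                        // hex-encode the digest
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(cacheKey(Map.of("customerId", "42", "region", "EU")));
    }
}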
UPDATE: use the cacheRegion attribute introduced in CF10!
http://help.adobe.com/en_US/ColdFusion/10.0/CFMLRef/WSc3ff6d0ea77859461172e0811cbec22c24-7fae.html
Then all you need to do is specify cachedAfter or cachedWithin, and forget about how to generate unique keys. CF will do it for you by 'hashing':
query "Name"
SQL statement
Datasource
Username and
password
DBTYPE
reference: http://www.coldfusionmuse.com/index.cfm/2010/9/19/safe.caching
I think this would be the easiest, unless you really need to fetch a specific query by a key; then you can feed your own hash using cacheID, another new attribute introduced in CF10.

UUIDs for DynamoDB?

Is it possible to get DynamoDB to automatically generate unique IDs when adding new items to a table?
I noticed the Java API mentions @DynamoDBAutoGeneratedKey so I'm assuming there's a way to get this working with PHP as well.
If so, does the application code generate these IDs or is it done on the DynamoDB side?
Good question - while conceptually possible, this doesn't seem to be currently available as a DynamoDB API-level feature, insofar as neither CreateTable nor PutItem refer to such functionality.
The @DynamoDBAutoGeneratedKey notation you have noticed is indeed a Java annotation, i.e. syntactic sugar offered by the Java SDK:
An annotation, in the Java computer programming language, is a special
form of syntactic metadata that can be added to Java source code.
As such @DynamoDBAutoGeneratedKey is one of the Amazon DynamoDB Annotations offered as part of the Object Persistence Model within the Java SDK's high-level API (see Using the Object Persistence Model with Amazon DynamoDB):
Marks a hash key or range key property as being auto-generated. The
Object Persistence Model will generate a random UUID when saving these
attributes. Only String properties can be marked as auto-generated
keys.
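A minimal sketch of what that looks like with DynamoDBMapper in the Java SDK (the class, table and attribute names here are hypothetical):

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAutoGeneratedKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

@DynamoDBTable(tableName = "Orders")
public class Order {

    private String id;

    // The mapper generates a random UUID for this property when an object
    // with a null id is saved (only String properties can be auto-generated keys).
    @DynamoDBHashKey(attributeName = "id")
    @DynamoDBAutoGeneratedKey
    public String getId() { return id; }

    public void setId(String id) { this.id = id; }
}

Calling mapper.save(order) on an instance whose id is null populates the id before the write; the UUID is generated client-side by the SDK, not by DynamoDB itself.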
While working with DynamoDB in JavaScript with Node.js, I use the npm module uuid to generate a unique key.
Example:
id = uuid.v1();
Refer to: uuid npm
By using the schema-based AWS DynamoDB data mapper library in Node.js, the hash key (id) will be generated automatically. Auto-generated ids are based on UUID v4.
For more details, have a look at the following AWS packages:
Data Mapper with annotation
Data Mapper package for Javascript
Sample snippet:
@table('my_table')
class MyDomainClass {
    @autoGeneratedHashKey()
    id: string;

    @rangeKey({defaultProvider: () => new Date()})
    createdAt: Date;
}
The client can create a (for all intents and purposes) unique ID either by picking a long random id (DynamoDB supports 128-bit integers, for example), or by picking an ID which contains the client's IP address, CPU number, and current time - or something along these lines.
The UUID standard even includes a standard way to do this (and you have libraries in various languages to create such UUIDs on the client side), but you don't really need to use a standard.
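In Java, for instance, the standard library already ships a version 4 (random) UUID generator, so the client-side key is a one-liner which you would then set on the item before calling PutItem:

// Generate a random (version 4) UUID client-side and use it as the item's key.
String id = java.util.UUID.randomUUID().toString();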
An interesting question is how you plan to find these items if they have random keys. Or are you planning to use a secondary index?
The 2022 answer is here:
https://dev.to/prabusah_53/aws-lambda-in-built-uuid-382f
External libraries are no longer needed.
Here is another good method, taken from mkyong:
http://www.mkyong.com/java/how-to-get-current-timestamps-in-java/
I adjusted his method to get the milliseconds instead of the actual date:
java.util.Date date = new java.util.Date();
System.out.println(new java.sql.Timestamp(date.getTime()).getTime());
The approach I'm taking is to use the current timestamp for the hash-key (or the range-key, if using a range-key too). Store the timestamp as an integer, representing the number of milliseconds since the start of the "UNIX epoch" (in the UTC timezone). Many date/time libraries can produce this number for you.
This has the advantage that if you want to have a "creation time" field in your table, your UUID already stores this information. Just call another method in your date/time library to convert the timestamp to a readable format.
(Be sure to handle the exception which will occur if a second item is created in the same table with the same millisecond timestamp; just fall back and retry the operation in that case, with a slightly later, current timestamp.)
For example:
User table
hash-key only: userID (timestamp of the creation of this user).
WidgetAttributes table
hash-key plus range-key.
hash-key: userID (use the userID from the User table of the user to whom the widget belongs).
range-key: attribID (use the timestamp of the creation of this widget-attribute).
Now you can run "query" operations on the WidgetAttributes table to get all widget-attributes for a certain user; by using "greater-than-zero" as the query-parameter for the range-key.
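A hedged sketch of the collision handling described above, using a conditional write so that a second item created in the same millisecond fails and is retried (the table and attribute names are assumptions, and this uses the AWS SDK for Java v2):

import java.util.Map;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.ConditionalCheckFailedException;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

public class TimestampKeyDemo {
    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        while (true) {
            // Milliseconds since the UNIX epoch (UTC), stored as a DynamoDB number.
            String userId = Long.toString(System.currentTimeMillis());
            try {
                ddb.putItem(PutItemRequest.builder()
                        .tableName("User")
                        .item(Map.of("userID", AttributeValue.builder().n(userId).build()))
                        // Reject the write if an item with this key already exists.
                        .conditionExpression("attribute_not_exists(userID)")
                        .build());
                break; // success
            } catch (ConditionalCheckFailedException e) {
                // Same-millisecond collision: fall through and retry with a later timestamp.
            }
        }
    }
}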

Borland C++ Builder Oracle question

I have a Borland C++ Builder 6 application calling an Oracle 10g database, operating over a LAN. When the application in question makes a simple DB select, e.g.
select table_name from element_tablenames where element_id = 10023842
the following is recorded as happening in Oracle (from the performance logs):
select table_name
from element_tablenames
where element_id = 10023842
then immediately (and not from C++ source code but perhaps deeper)
select table_name, element_tablenames.ROWID
from element_tablenames
where element_id = 10023842
The select statement is only called once in the TADODbQuery object, yet two queries are being performed - one to parse and the other adding the ROWID for execution.
Over a WAN and many, many queries this is obviously a problem to the user.
Does anyone know why this might be happening, can someone suggest a solution?
Agree with Robert.
The ROWID uniquely identifies a row in a table so that the returned record can be applied back to the database with any changes (or as a DELETE).
Is there a way to identify a particular column (or set of columns) as a primary key, so that it can be used to identify a row without using a ROWID?
I don't know exactly where the ROWID is coming from; it could be either the TAdoQuery implementation or the Oracle driver. But I am sure I found the reason.
From the Oracle docs:
If the database table does not contain a primary key, the ROWID must be selected explicitly when populating DataTable.
So I suspect your table does not have a primary key; either add one or add the ROWID.
Either way this will solve the duplicate query problem.
Since you are concerned about performance, in general:
Using TAdoQuery you can set the CursorType to optimize different behaviors for performance. This article covers this from a TAdoQuery perspective. MSDN also has an article that covers it from a general ADO perspective. Finally, the specifications from the Oracle driver can be useful.
I would recommend setting the cursor to either of the following, as they are the only ones supported by Oracle:
ctStatic - Bi-directional query produced.
ctOpenForwardOnly - Unidirectional query produced, fastest but can't call Prior
You can also play with CursorLocation to see how it affects your speed.