Why should Athena table names be globally unique? - amazon-athena

When I create a table in Athena, it says that table names must be globally unique.
I created another database, database2, with the same tables as in database1. What's the difference? I'm able to create tables with the same table names.
It's not restricting me the way S3 does. Can someone explain?

Where did you read that table names must be globally unique? There is no such limitation. Table names must be unique within a database, but not globally. Depending on how you define "globally", even database names do not need to be globally unique, just unique within an AWS region.
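
To illustrate the scoping, here is a minimal sketch that builds the DDL for the same table name in two databases. The bucket name, column list, and the helper `create_table_ddl` are hypothetical; only the database and table names come from the question.

```python
def create_table_ddl(database: str, table: str, location: str) -> str:
    """Build an Athena CREATE EXTERNAL TABLE statement qualified by database."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n"
        "  id int,\n"
        "  name string\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'"
    )


# Hypothetical S3 locations: the bucket name is the part that S3 requires
# to be globally unique, not the table name.
ddl_1 = create_table_ddl(
    "database1", "same_table_name", "s3://example-bucket/database1/same_table_name/"
)
ddl_2 = create_table_ddl(
    "database2", "same_table_name", "s3://example-bucket/database2/same_table_name/"
)

print(ddl_1)
print(ddl_2)
```

Both statements use the same table name, but each is qualified by a different database, so they should not conflict.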

Related

Give access to BigQuery tables with specific table names, to be created in the future, across all datasets in a GCP project?

I've searched the documentation a lot, but couldn't find anything that allows me to do the following:
Allow creating a role which allows full table access to tables with certain table names only (ex.: "table1", etc.) that'll be created in future. This should work across all available datasets in a GCP project, and also the ones that'll be created in future.
Is this possible? If not directly, indirectly maybe?
Thanks..
The simplest way to do that would be to create a dataset for housing such tables, and set the access appropriate to what you need. Tables requiring a different set of policies should be housed in other datasets.
More information here: https://cloud.google.com/bigquery/docs/dataset-access-controls

Are GSIs on a DynamoDB global table replicated automatically?

I have a global table in the usw2 region that is configured to replicate automatically to use2. I have a GSI defined on the table in usw2. Will the index be replicated automatically, or do I need to create it manually in the other region too?
There are two ways to add a region to a global table. In the old way - which was the usual way until November 2019 - you would need to create the same table yourself, and indeed you would also need to create the same indexes yourself in the other region too. You would then use UpdateGlobalTable. Quoting this operation's documentation:
If global secondary indexes are specified, then the following conditions must also be met:
The global secondary indexes must have the same name.
The global secondary indexes must have the same hash key and sort key (if present).
The global secondary indexes must have the same provisioned and maximum write capacity units.
The new (November 2019) way to replicate to another region is to use UpdateTable with the ReplicaUpdates parameter. This way does not require you to create the table manually in the other region. Amazon did not seem to document how that table is created, or whether the same indexes are also created on it, but given the above information, I don't see any reason why it wouldn't create the same indexes, since matching indexes were always the requirement.
Of course, the best thing for you to do is to just try it, and report back your findings :-)
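
One way to try this is sketched below. It only builds the UpdateTable request and reads index names out of a DescribeTable-shaped response; the table name `my_table`, the region `us-east-2` (assuming that is what "use2" means), and both helper functions are assumptions for illustration. The actual boto3 calls are left as comments because they need credentials and a real table.

```python
def build_add_replica_request(table_name: str, replica_region: str) -> dict:
    """Build UpdateTable arguments that add a replica via ReplicaUpdates."""
    return {
        "TableName": table_name,
        "ReplicaUpdates": [{"Create": {"RegionName": replica_region}}],
    }


def list_index_names(table_description: dict) -> list:
    """Extract the GSI names from a DescribeTable response dictionary."""
    table = table_description.get("Table", {})
    return sorted(gsi["IndexName"] for gsi in table.get("GlobalSecondaryIndexes", []))


# Hypothetical table name and region.
request = build_add_replica_request("my_table", "us-east-2")
print(request)

# With boto3 installed and credentials configured, roughly:
#   import boto3
#   boto3.client("dynamodb", region_name="us-west-2").update_table(**request)
#   # once the replica is ACTIVE, inspect it in the other region:
#   desc = boto3.client("dynamodb", region_name="us-east-2").describe_table(
#       TableName="my_table")
#   print(list_index_names(desc))
```

Comparing the output of `list_index_names` for both regions would show whether the index was created on the replica.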

Hash distribution on identity column

Creating a table in Azure SQL Data Warehouse, I would like to make a hash distribution on an identity column, but get an error that
Cannot insert explicit value for identity column in table 'Table_ff4d8c5d544f4e26a31dbe71b44851cb_11' when IDENTITY_INSERT is set to OFF.
Is this not possible? And if not, why? And is there a work-around? (And where does this odd table name come from?)
Thanks!
You cannot use an IDENTITY column as the hash distributed column in your table.
https://learn.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-identity#limitations
In SQLDW the name you give to your table is its logical name not its physical name. Logical metadata such as table names is maintained centrally on the control node so that operations such as table renames are quick and painless. However, SQLDW is still bound by the rules of table creation - we need to make sure the table name is unique both now and in the future. Therefore the physical names contain guids to deliver that uniqueness.
That said, the error you are getting here is not ideal. It would be helpful if you could post a repro so that we can improve the experience for you.
You are also welcome to post a feature request on our uservoice channel for hash distribution on the IDENTITY column. https://feedback.azure.com/forums/307516-sql-data-warehouse

AWS DataPipeline: RedshiftCopyActivity OVERWRITE_EXISTING not enforcing primary key

I have a DataPipeline that exports data from a local DB to Redshift via S3 (very similar to Incremental copy of RDS MySQL table to Redshift template). I have defined primary key and set insertMode to "OVERWRITE_EXISTING" in pipeline definition, however, I noticed that some rows eventually were duplicated. In what cases does it happen and how do I prevent it?
Redshift does not enforce primary keys, so a primary key will not prevent duplicate values.
We load the incremental data into a temp table, then upsert (merge) it into the target table by checking whether each record already exists.
You can avoid duplicates this way.
Thanks!!
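
The staging-table pattern described above can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 as a stand-in for Redshift, the table and column names are made up, and it does delete-then-insert inside one transaction rather than a MERGE statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The target has no enforced key here, mirroring Redshift's behaviour.
conn.execute("CREATE TABLE target (id INTEGER, value TEXT)")
conn.execute("CREATE TABLE staging (id INTEGER, value TEXT)")

# Existing rows in the target, and an incremental batch in staging.
conn.executemany("INSERT INTO target VALUES (?, ?)", [(1, "old"), (2, "keep")])
conn.executemany("INSERT INTO staging VALUES (?, ?)", [(1, "new"), (3, "added")])

with conn:  # commits on success, rolls back on error
    # Remove target rows that the batch is about to replace.
    conn.execute("DELETE FROM target WHERE id IN (SELECT id FROM staging)")
    # Insert every staged row, both updated and brand new.
    conn.execute("INSERT INTO target SELECT id, value FROM staging")
    # Empty the staging table for the next batch.
    conn.execute("DELETE FROM staging")

rows = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(rows)  # [(1, 'new'), (2, 'keep'), (3, 'added')]
```

Because the delete and the insert run in the same transaction, readers never see the target with the old rows removed but the new ones missing.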
Just found this post after several years, adding an answer in case it helps someone else:
In addition to primary keys Redshift also uses distkeys to determine which lines to overwrite. So in my case an updated value in distkey column forced Redshift to create a duplicate row, although the primary key remained unchanged.

Can you add a global secondary index to DynamoDB after the table has been created?

With an existing DynamoDB table, is it possible to modify the table to add a global secondary index? From the DynamoDB control panel, it looks like I have to delete the table and create a new one with the global index.
Edit (January 2015):
Yes, you can add a global secondary index to a DynamoDB table after its creation; see here, under "Global Secondary Indexes on the Fly".
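
As a rough sketch of what such a request might look like, the helper below builds UpdateTable arguments with a GlobalSecondaryIndexUpdates "Create" entry. The function name, table name, index name, attribute name, and capacity values are all hypothetical, and the boto3 call is left as a comment since it needs credentials and a real table.

```python
def build_add_gsi_request(
    table_name: str,
    index_name: str,
    hash_key: str,
    hash_key_type: str = "S",
    read_units: int = 5,
    write_units: int = 5,
) -> dict:
    """Build UpdateTable arguments that add one GSI to an existing table."""
    return {
        "TableName": table_name,
        # Attributes used in the new index key must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": hash_key_type}
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
                    "Projection": {"ProjectionType": "ALL"},
                    # Applies to tables in provisioned capacity mode.
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


# Hypothetical names, for illustration only.
gsi_request = build_add_gsi_request("my_table", "email-index", "email")
print(gsi_request)

# With boto3 installed and credentials configured, roughly:
#   import boto3
#   boto3.client("dynamodb").update_table(**gsi_request)
```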
Old Answer (no longer strictly correct):
No, the hash key, range key, and indexes of the table cannot be modified after the table has been created. You can easily add elements that are not hash keys, range keys, or indexed elements after table creation, though.
From the UpdateTable API docs:
You cannot add, modify or delete indexes using UpdateTable. Indexes can only be defined at table creation time.
To the extent possible, you should really try to anticipate current and future query requirements and design the table and indexes accordingly.
You could always migrate the data to a new table if need be.
Just got an email from Amazon:
Dear Amazon DynamoDB Customer,
Global Secondary Indexes (GSI) enable you to perform more efficient
queries. Now, you can add or delete GSIs from your table at any time,
instead of just during table creation. GSIs can be added via the
DynamoDB console or a simple API call. While the GSI is being added or
deleted, the DynamoDB table can still handle live traffic and provide
continuous service at the provisioned throughput level. To learn more
about Online Indexing, please read our blog or visit the documentation
page for more technical and operational details.
If you have any questions or feedback about Online Indexing, please
email us.
Sincerely, The Amazon DynamoDB Team
According to the latest news from AWS, GSI support for existing tables will be added soon.
Official statement on AWS forum