AWS DMS Removing LOB Columns - amazon-web-services

I'm trying to set up a PostgreSQL migration using DMS with S3 as the target, but after running it I noticed that some tables were missing columns.
After checking the logs I found this message:
Column 'column_name' was removed from table definition 'schema.table': the column data type is LOB and the table has no primary key or unique index
In the migration task settings I tried increasing the LOB limit by setting Maximum LOB size to 2000000, but I still get the same result.
Does anyone know a workaround for this problem?

I guess the problem is that you do not have a primary key on your table.
From AWS documentation:
Currently, a table must have a primary key for AWS DMS to capture LOB changes. If a table that contains LOBs doesn't have a primary key, there are several actions you can take to capture LOB changes:
Add a primary key to the table. This can be as simple as adding an ID column and populating it with a sequence using a trigger.
Create a materialized view of the table that includes a system-generated ID as the primary key and migrate the materialized view rather than the table.
Create a logical standby, add a primary key to the table, and migrate from the logical standby.
It is also important that the primary key is of a simple type, not a LOB:
In FULL LOB or LIMITED LOB mode, AWS DMS doesn't support replication of primary keys that are LOB data types.
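If adding a surrogate key on the source is acceptable, here is a minimal sketch of the first suggestion above (a primary key populated from a sequence), assuming a PostgreSQL source and hypothetical schema, table, and column names:

import psycopg2

# Hypothetical connection string and object names; adjust for your source database.
conn = psycopg2.connect("dbname=mydb user=postgres host=source-host")
with conn, conn.cursor() as cur:
    # bigserial attaches a sequence, backfills existing rows with generated values,
    # and the PRIMARY KEY clause gives DMS the key it needs to keep the LOB columns.
    cur.execute(
        "ALTER TABLE my_schema.my_table "
        "ADD COLUMN dms_row_id bigserial PRIMARY KEY"
    )
conn.close()

Note that adding the column rewrites and locks the table, so plan this for a maintenance window on large tables.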

Related

How do I add a value for a new field in AWS DynamoDB?

My team added a new field in our DynamoDB table and I need to add an integer for all entries that do not currently contain the field.
I've tried using the Update example here; however, I am getting a ValidationException: Where clause does not contain a mandatory equality on all key attributes, as my WHERE clause does not include the primary key. However, I do not need to filter by primary key, as I only care about entries that do not have a value for the new column.
The query I'm running in PartiQL is:
UPDATE <table-name>
SET <new-field>=0
WHERE NOT <new-field>=0;
The PartiQL UPDATE WHERE clause must identify a single DynamoDB item. As the docs say:
You can only update one item at a time; you cannot issue a single DynamoDB PartiQL statement that updates multiple items. For information on updating multiple items, see Performing Transactions with PartiQL for DynamoDB or Running Batch Operations with PartiQL for DynamoDB.
What this means in practice is that you (1) scan for all the target records and then (2) issue an UPDATE statement for each record, ideally batched together.
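As a rough sketch of that two-step approach with boto3 (the table, key, and attribute names here are hypothetical; adjust to your schema):

import boto3

table = boto3.resource("dynamodb").Table("my-table")  # hypothetical table name

# 1) Scan for items that do not yet have the new field.
scan_kwargs = {
    "FilterExpression": "attribute_not_exists(new_field)",  # hypothetical attribute
    "ProjectionExpression": "pk",                            # hypothetical key name
}
items = []
while True:
    resp = table.scan(**scan_kwargs)
    items.extend(resp["Items"])
    if "LastEvaluatedKey" not in resp:
        break
    scan_kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]

# 2) Issue one update per item, keyed on its full primary key.
for item in items:
    table.update_item(
        Key={"pk": item["pk"]},
        UpdateExpression="SET new_field = :zero",
        ExpressionAttributeValues={":zero": 0},
    )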

DynamoDB will not allow data to be inserted into a table unless the item contains the primary key set during table creation?

DynamoDB will not allow data to be inserted into a table unless the item contains the primary key set during table creation.
DynamoDB table:
id (primary key)
device_id
temperature_value
I am sending data from the IoT Core rules engine into DynamoDB (using the "Split message into multiple columns of a DynamoDB table (DynamoDBv2)" action). However, data does not arrive in the DynamoDB table if the message is missing the id attribute.
Is there any way to set primary key to be auto incrementing every time a new data point arrives?
DynamoDB does not support auto-incrementing keys the way a relational database might.
Instead, the key will need to be generated by you at the time of inserting the record into DynamoDB.
There are a few options to generate:
Use a primary key composed of a partition key (referencing your sensor id) and a sort key (such as an event time, or a randomly generated string).
Generate a random string instead and insert this.
Use a separate data store such as a relational database or Redis, where you auto-increment a value and use this. This is really not ideal.
Use a separate DynamoDB table to hold this value, ensuring you use a transactional write to lock the row and increment it, and a strongly consistent read to get the latest value. Again, this is not ideal.
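For example, the second option (a client-generated random key) might look roughly like this with boto3; the attribute names follow the question, the table name is hypothetical, and a UUID is just one way to generate a collision-resistant key:

import time
import uuid
from decimal import Decimal

import boto3

table = boto3.resource("dynamodb").Table("sensor-readings")  # hypothetical table name

def put_reading(device_id, temperature_value):
    # Generate the key client-side instead of relying on auto-increment.
    table.put_item(
        Item={
            "id": str(uuid.uuid4()),                # random, collision-resistant key
            "event_time": int(time.time() * 1000),  # could also serve as a sort key
            "device_id": device_id,
            # DynamoDB numbers must be Decimal, not float, with the resource API.
            "temperature_value": Decimal(str(temperature_value)),
        }
    )

put_reading("sensor-42", 21.5)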

AWS DMS with AWS MSK(Kafka) CDC transactional changes

I'm going to use AWS Database Migration Service (DMS) with AWS MSK (Kafka).
I'd like to send all changes within the same transaction into the same partition of a Kafka topic, in order to guarantee correct message order (referential integrity).
For this purpose I'm going to enable the following property:
IncludeTransactionDetails – Provides detailed transaction information from the source database. This information includes a commit timestamp, a log position, and values for transaction_id, previous_transaction_id, and transaction_record_id (the record offset within a transaction). The default is false. https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Kafka.html
Also, as I may see from the same documentation:
AWS DMS supports the following two forms for partition keys:
1. SchemaName.TableName: A combination of the schema and table name.
2. ${AttributeName}: The value of one of the fields in the JSON, or the primary key of the table in the source database.
I have a question: in the case of 'IncludeTransactionDetails = true', will I be able to use 'transaction_id' from the event JSON as the partition key for the MSK (Kafka) migration topic?
The documentation says you can define a partition key to group the data:
"You also define a partition key for each table, which Apache Kafka uses to group the data into its partitions"

informatica powercenter express pass variable to multiple mappings

Background: I am new to Informatica. Informatica powercenter express Version: 9.6.1 HotFix 2
In my etl project I have several mappings to load different dimension and fact tables in a data mart. The ETL will run daily, one requirement is to add a audit key as a column to each of these tables. The audit key is an integer and is generated from a audit table (next value from the audit key column (primary key)). So everyday the audit key is increased by 1 etc. So after each etl load, all the new or updated rows in all tables (dimension/fact) will have this audit key in a column. The purpose is the ability to trace when or how each row is inserted/updated etc.
Now the question is how to generate such key and pass on to all the mappings? The key should be from the next value from auditkey column of audit table.
You could build a mapplet that generates/maintains the key you want and use it in all your workflows
If you have an RDBMS source, I would suggest creating an Oracle sequence in the DB and an Oracle function to get the next value.
Call the newly created Oracle function in the SQL Override and use the next-value sequence number in all the mappings.
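A minimal sketch of the database-side piece of that suggestion, assuming an Oracle source and hypothetical object and connection names (shown here run through python-oracledb, though any SQL client would do):

import oracledb

# Hypothetical connection details.
conn = oracledb.connect(user="etl_user", password="***", dsn="dbhost/ORCLPDB1")
cur = conn.cursor()

# Sequence that backs the audit key.
cur.execute("CREATE SEQUENCE audit_key_seq START WITH 1 INCREMENT BY 1")

# Function the mappings can call in the SQL Override to fetch the next key.
cur.execute("""
CREATE OR REPLACE FUNCTION get_next_audit_key RETURN NUMBER IS
  v_key NUMBER;
BEGIN
  SELECT audit_key_seq.NEXTVAL INTO v_key FROM dual;
  RETURN v_key;
END;
""")
conn.close()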

How can I update a column name in DynamoDB?

I am using DynamoDB for my backend database operations. One of my tables contains a column named 'Region'. When I scan this table with a filter on Region, DynamoDB throws an error, because 'Region' is a reserved keyword in DynamoDB.
How can I change the column name Region to State?
You don't really have to change the column name. You can do this using placeholders:
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ExpressionPlaceholders.html
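For example, with boto3 the scan could look something like this (the table name and value are hypothetical); '#r' is a placeholder that stands in for the reserved word Region:

import boto3

table = boto3.resource("dynamodb").Table("my-table")  # hypothetical table name

response = table.scan(
    FilterExpression="#r = :v",
    ExpressionAttributeNames={"#r": "Region"},       # placeholder for the reserved word
    ExpressionAttributeValues={":v": "us-east-1"},   # hypothetical value
)
print(response["Items"])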