I am trying to solve an Informatica problem.
I have two tables, Table A and Table B, with the following structure:
Table A
A_Key
A_Name
A_Address
A_PostalCode
A_Country
A_Latitude
A_Longitude
Table B
B_Key
B_Name
B_PostalCode
B_Latitude
B_Longitude
I need to combine A and B into one output table that contains all the attributes of both.
Since I am new to the Informatica Data Quality tool, I am trying to work out the logic for implementing this.
Does anyone have a better solution?
You can use a Joiner Transformation to do this.
It has two groups, Master and Detail. Ideally, you should connect the table with fewer rows to the Master group and the table with more rows to the Detail group.
Ensure both inputs are sorted on the join key before connecting them to the Joiner. Also, enable the Sorted Input option in the advanced section of the Joiner Transformation.
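To make the Master/Detail idea concrete, here is a minimal plain-Python sketch of the general approach: keep the smaller (master) side in memory and stream the larger (detail) side past it. It is not Informatica code and it ignores the Sorted Input optimization. The sample rows are made up, and joining on postal code is only an assumption, since the question does not say which columns relate A to B.

```python
# Hypothetical sample rows. The join key (postal code) is an assumption:
# the question does not state which columns relate Table A to Table B.
table_a = [
    {"A_Key": 1, "A_Name": "Alpha", "A_PostalCode": "1000"},
    {"A_Key": 2, "A_Name": "Beta", "A_PostalCode": "2000"},
    {"A_Key": 3, "A_Name": "Gamma", "A_PostalCode": "3000"},
]
table_b = [
    {"B_Key": 10, "B_Name": "Alpha Ltd", "B_PostalCode": "1000"},
    {"B_Key": 20, "B_Name": "Gamma Ltd", "B_PostalCode": "3000"},
]


def join_master_detail(master, detail, master_key, detail_key):
    """Inner-join two lists of dicts by caching the master side."""
    # Cache the (smaller) master rows, grouped by their join key.
    cache = {}
    for row in master:
        cache.setdefault(row[master_key], []).append(row)

    # Stream the detail rows and emit one combined row per match.
    joined = []
    for row in detail:
        for match in cache.get(row[detail_key], []):
            joined.append({**row, **match})
    return joined


# Table B is smaller here, so it plays the master role.
result = join_master_detail(table_b, table_a, "B_PostalCode", "A_PostalCode")
for row in result:
    print(row)
```

Each output row carries the columns of both tables, which is what the question asks for. Rows of A without a matching B row are dropped here; an outer join would keep them with empty B columns.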
Again, for PowerCenter, this scenario sounds more like a union to me, with the columns missing from group B set to null.
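If the union reading is the right one, the logic looks roughly like the sketch below: both tables are mapped onto one shared column layout, and anything Table B lacks (address and country) becomes null. This is plain Python to show the idea, not PowerCenter configuration. The unprefixed output column names and the sample rows are my own assumptions.

```python
# Shared output layout. Dropping the A_/B_ prefixes is an assumption
# made for this sketch; the question does not name the output columns.
OUTPUT_COLUMNS = ["Key", "Name", "Address", "PostalCode",
                  "Country", "Latitude", "Longitude"]


def align(row, prefix):
    """Rename prefixed columns to the shared layout; missing ones become None."""
    return {col: row.get(prefix + col) for col in OUTPUT_COLUMNS}


def union_tables(rows_a, rows_b):
    """Stack the rows of both tables into one list with identical columns."""
    return [align(r, "A_") for r in rows_a] + [align(r, "B_") for r in rows_b]


# Hypothetical sample rows.
rows_a = [{"A_Key": 1, "A_Name": "Alpha", "A_Address": "1 Main St",
           "A_PostalCode": "1000", "A_Country": "BE",
           "A_Latitude": 50.85, "A_Longitude": 4.35}]
rows_b = [{"B_Key": 10, "B_Name": "Gamma", "B_PostalCode": "3000",
           "B_Latitude": 50.88, "B_Longitude": 4.70}]

combined = union_tables(rows_a, rows_b)
for row in combined:
    print(row)
```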
Related
I need help with this issue as I don't have any experience with Power BI. I want to join two tables in Power BI that share the same column, Part_Number. How can I match these two tables by Part Number and return the values?
Recon Table
Inventory Table
I would like to have Part Number, Part Name, QTY and Total Quantity as the result. I hope I can get the clarification I need. Thanks a lot!
For this case you simply must merge the tables. It doesn't look like you have done a lot of research on the matter though, so it's hard to understand exactly what you need help with.
To merge your two tables in Power Query, I would right click in the left hand side menu and select Merge Queries as New.
After that you simply follow the on-screen instructions and select your two tables and their respective key columns. After merging you can choose to disable load of your two original tables to save space in your data model, but this depends on your requirements.
If this were my data model, I would think about why joining these tables is necessary, instead of using these two tables as fact tables and creating a third table to handle the part number dimension with associated part metadata.
Read the docs: Merge queries in Power Query
I am new to Informatica PowerCenter. My company has an ETL implemented in Informatica. What's the best (easiest) way to find how the source tables and fields map to the target tables and fields? The ETL logic is rather complicated and involves a multi-tier architecture:
E.g.
mapping 1: table a, table b - table c
mapping 2: table c, table d - table e
Now I need to find where the fields in e ORIGINALLY come from. They should come from tables a, b and d, since c is an intermediate table. I will also need to work out a mapping of the fields in e to the fields in the original tables.
I know this could be done manually by looking at the mappings in mapping designer, but the example here is simplified, the real ETL is much more complicated. And the task is to analyze all target tables in a database.
You may try using the app I once created: XMLAnalyzer for PowerCenter. It can perform source-to-target analysis for individual mappings as well as complete workflows.
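Whatever tool extracts the per-mapping field links, resolving them back to the original tables is a transitive walk through the intermediate tables. Here is a minimal sketch of that walk. The lineage dictionary is hand-written and the field names are hypothetical; in practice it would have to be filled from your exported mappings, which this sketch does not attempt.

```python
# Hypothetical field-level lineage: target field -> fields it is built from.
# Tables a, b, d are original sources; c is the intermediate table.
LINEAGE = {
    "c.total": ["a.amount", "b.tax"],
    "c.customer": ["a.customer"],
    "e.total": ["c.total"],
    "e.customer": ["c.customer"],
    "e.region": ["d.region"],
}


def original_sources(field, lineage, _seen=None):
    """Return the fields with no upstream mapping that feed `field`."""
    seen = set() if _seen is None else _seen
    if field in seen:
        # Already visited: avoids infinite loops on circular mappings.
        return set()
    seen.add(field)

    parents = lineage.get(field)
    if not parents:
        # Nothing feeds this field, so it is an original source.
        return {field}

    sources = set()
    for parent in parents:
        sources |= original_sources(parent, lineage, seen)
    return sources


for target in ("e.total", "e.customer", "e.region"):
    print(target, "<-", sorted(original_sources(target, LINEAGE)))
```

Running the same function over every field of every target table gives the full source-to-target matrix, with intermediate tables such as c skipped automatically.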
So I have two source tables, let's call them table1 and table2, and a destination table, table3. Information needs to be extracted from columns of one table and columns of the other, and then combined to produce the column entries of the new table.
Think of it as a complex transformation; for example:
partial text in column1 extracted from table1 and complete text in column1 of table2 combined into 4 rows of column1 (depending on the JSON of column1 in table1) in new transformed table.
So it's not a 1 to 1 mapping between one table and another, but a 1 to many mapping, where one source record is a mix of one row from each of the two source tables and translates to many rows in the new destination table.
Is this something that Glue jobs can accomplish? Or am I better off just writing a throwaway Python script? You can assume that the size of the table is not of any concern.
Provided you plan to run this process at some frequency, this is a perfect use case for Glue. If this is just a one off, Glue is also a fine choice, but Glue is primarily designed for repeated use.
In your Glue script I expect you will end up joining the two tables, and then selecting new result columns and rows by combining your existing columns. Typically the pattern to follow would be to convert the dynamic frames (created by Glue) into PySpark data frames, work with PySpark from there, and convert back to a dynamic frame before outputting to the database.
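To show the shape of that transformation without a Glue environment, here is a plain-Python sketch of "join, then expand one joined row into many". In a real Glue job the same steps would be done on PySpark data frames rather than lists. The join key `id`, the JSON layout with a `parts` array, and the way the two values are combined are all assumptions, since the question gives no schema.

```python
import json

# Hypothetical rows: table1.column1 holds JSON, table2.column1 holds text.
table1 = [{"id": 1, "column1": json.dumps({"parts": ["n", "s", "e", "w"]})}]
table2 = [{"id": 1, "column1": "sector"}]


def expand(table1_rows, table2_rows):
    """Join on id, then emit one output row per JSON array element."""
    by_id = {row["id"]: row for row in table2_rows}
    out = []
    for row in table1_rows:
        other = by_id.get(row["id"])
        if other is None:
            continue  # inner join: skip table1 rows without a partner
        for part in json.loads(row["column1"])["parts"]:
            # Combine the full text from table2 with one piece from table1.
            out.append({"id": row["id"],
                        "column1": f"{other['column1']}-{part}"})
    return out


table3 = expand(table1, table2)
for row in table3:
    print(row)
```

Here one pair of source rows becomes four destination rows, which matches the "1 to many" case described in the question.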
Note that, depending on your design, you may not need to add rows. It of course depends on the outcome you are seeking, but Dynamo does have support for some nifty hierarchical approaches that may remove your need for multiple rows.
If you have more specific examples of schema and the outcomes you are seeking, I could show you a bit of example code.
I wanted to know what would be the best approach for creating the dim tables. Should I maintain a single table with all fields and use them as required, or create separate dim tables and use them individually?
Can someone please help me out here?
PS: I'm a beginner here.
Creating one table per dimension is the best practice. In data warehousing, you will come across four types of schema:
Star Schema
Snowflake Schema
Galaxy Schema
Combined Schema
People select one of the above based on the type and nature of their data, their requirements and other parameters. But in all cases, there is a single table per dimension. This is easy to maintain and gives better performance.
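As a rough illustration of "one table per dimension", here is a tiny star schema built in an in-memory SQLite database from Python. The table names, columns and values are all made up for the example; only the layout (one fact table pointing at separate dimension tables) reflects the advice above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per dimension, plus a fact table that references them.
# All names and values here are hypothetical.
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_region  VALUES (1, 'North'), (2, 'South');
    INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.5);
""")

# A report only joins the dimensions it needs.
rows = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(rows)
```

The query touches only dim_product, so the region dimension can change or grow without affecting it. With a single wide table holding every attribute, that separation is lost.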
I'm attempting to create multiple joins with two tables (Table A + Table B) using the same key from Table A. The key on table A is "Name" and there are multiple columns in Table B that I need to join this with. Any ideas on the best way to do this?
Just go ahead and do it. Power BI allows multiple relations between tables, but only one of them will be "active":
Multiple Relationships Between Tables
To use inactive relations, you will have to refer to them in DAX using the function called
USERELATIONSHIP
Alternatively, you can replicate your table A as many times as you need, and set up regular relations. In my opinion, it's a better data model: it's more intuitive and easier to use.