Best way to compare date from source to target - informatica

I'm looking to sync Salesforce to a SQL table with Informatica Cloud. I need to compare the date in the Salesforce object against a date in the SQL table before updating the Salesforce record. If the createdDate of the Salesforce record is within 10 days of the startdate in SQL, then update Salesforce with the SQL record; otherwise ignore it. I've done some syncing in the past but never needed criteria that span both source and target; I usually just handle the query in SQL. I'm hoping to find a relatively uncomplicated way to do this.

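Use Informatica's DATE_DIFF function to get the difference between the two dates in days, and only pass rows to the update when that difference is within 10. Date arithmetic is described here: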
https://docs.informatica.com/data-integration/powercenter/10-1/_transformation-language-reference_powercenter_10-1_ditamap/dates/understanding_date_arithmetic.html
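As a rough illustration of the rule itself, here is a minimal Python sketch of the 10-day check, written outside Informatica. The function name `within_window` and the sample dates are hypothetical, and it assumes "within 10 days" means in either direction; in a mapping the same comparison would be expressed with DATE_DIFF in a filter or expression.

```python
from datetime import datetime


def within_window(created_date: datetime, start_date: datetime, max_days: int = 10) -> bool:
    """Return True when the two dates are at most max_days apart, in either direction."""
    delta = created_date - start_date
    return abs(delta.total_seconds()) <= max_days * 86400


# Hypothetical sample values: a Salesforce createdDate and a SQL startdate.
created = datetime(2020, 3, 12)
start = datetime(2020, 3, 5)
should_update = within_window(created, start)
```

Rows for which the check is false would simply be filtered out before the Salesforce update step.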

Related

Power BI develop query with cached dataset

When developing a query in Power BI against a database data source, any change causes the query editor to 'start from scratch' and re-query the database.
Is there a workaround that lets you develop a query without repeated long waits, e.g. by downloading a temporary local flat file of the full dataset, developing the query against it offline, and then swapping it out for the live database connection once you are happy with it?
Importing the data once, exporting it as a CSV from a Power BI table visualisation and re-importing it as a new data source would work, but maybe there's a simpler way?
Thanks

There are two approaches you can use.
If your database supports query folding, make the first step take just the top 200 records while you develop your query. Once you're happy with it, remove the firstN filter.
Alternatively, load the entire table into the model, export it to a CSV using DAX Studio, develop your query against the CSV, and then switch back to the database once you're happy with it.

Is there a way to add the date time stamp for the data refresh in AWS QuickSight dashboard?

Is there a field or formula I can add to a narrative in QuickSight that will allow me to show the date/time the backing data was most recently refreshed automatically?

If your data set is SQL-based (whether SPICE or direct query), I would use custom SQL and add a column holding the date and time at which the query was run.
This can be done with something like the NOW() function in PostgreSQL.
Other than this workaround, I don't think it is currently possible to get what you are looking for in QuickSight.
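A minimal sketch of that idea, using Python's built-in sqlite3 module so it runs without a database server. The table `sales`, its columns, and the alias `refreshed_at` are hypothetical, and SQLite's CURRENT_TIMESTAMP stands in for NOW(); in PostgreSQL custom SQL the select list would use NOW() instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

# Custom SQL: every returned row carries the time at which the query ran.
# SQLite's CURRENT_TIMESTAMP stands in for PostgreSQL's NOW() here.
custom_sql = "SELECT id, amount, CURRENT_TIMESTAMP AS refreshed_at FROM sales"
rows = conn.execute(custom_sql).fetchall()
conn.close()
```

With SPICE the column would reflect when the data was last imported, while with direct query it would reflect when the query was last executed.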

Dataset shows only 5 event tables after re-linking Firebase with another Google Analytics account

I recently unlinked a Firebase project and re-linked it with a different Google Analytics account.
The BigQuery integration, configured to export GA data, created a new dataset and data started populating it.
The old dataset, corresponding to the unlinked "default" GA account and containing ~2 years of data, is still accessible in the BigQuery UI. However, only the 5 most recent event_ tables (5 days' worth of event data) are visible in it.
Is it possible to extract historical data from the old, unlinked dataset?

What I would suggest is running some queries to further validate the data you have in your BigQuery dataset.
In this case, I would start by getting the dates for each table, to see how many days of data the dataset contains.
SELECT event_date
FROM `firebase-public-project.analytics_153293282.events_*`
GROUP BY event_date ORDER BY event_date
EDIT
A better way to do this, and to get all the tables within the dataset, is to use the bq command-line tool; see reference here.
bq ls firebase-public-project:analytics_153293282
You could also do a COUNT(event_date) to see how many records you have per day, and compare this with the data you have or can see in your Firebase project.
SELECT event_date, COUNT(event_date) ...
If data is missing, you could use table decorators to try to recover it; see example here.
As for table expiration, see this. In short, a default expiration time can be set at the dataset level, and it applies only to new tables (existing tables need their expiration time updated manually, one by one); an expiration time can also be set when a table is created. To see whether the expiration time was changed, you could search your logs for protoPayload.methodName="tableservice.update" and check whether an expireTime was set, as follows:
tableUpdateRequest: {
  resource: {
    expireTime: "2020-12-31T00:00:00Z"
    ...
  }
}
Besides this, if you have a GCP support plan, you could contact support for further assistance on what could have happened to the tables in that dataset. Otherwise, you could open an issue in the issue tracker. Keep in mind that Firebase doesn't delete your data when you unlink a Firebase project from BigQuery, so in theory the data should still be there.
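Once you have the list of table names from bq ls, a small script can show which daily tables are missing from a date range. The following is a minimal Python sketch; the function name `missing_event_dates` and the sample table ids are hypothetical, and it assumes the daily export tables are named events_YYYYMMDD.

```python
from datetime import date, datetime, timedelta


def missing_event_dates(table_ids, first_day, last_day):
    """List the days in [first_day, last_day] that have no events_YYYYMMDD table."""
    present = set()
    for table_id in table_ids:
        prefix, _, suffix = table_id.partition("_")
        if prefix != "events" or len(suffix) != 8 or not suffix.isdigit():
            continue  # skip e.g. events_intraday_* or unrelated tables
        present.add(datetime.strptime(suffix, "%Y%m%d").date())
    total_days = (last_day - first_day).days + 1
    expected = (first_day + timedelta(days=n) for n in range(total_days))
    return [day for day in expected if day not in present]


# Hypothetical table ids, as they might appear in the output of `bq ls`.
tables = ["events_20200101", "events_20200102", "events_20200104", "events_intraday_20200105"]
gaps = missing_event_dates(tables, date(2020, 1, 1), date(2020, 1, 5))
```

Here `gaps` holds the days without a daily table, which are the dates worth checking against the logs or raising with support.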

Problems loading data in to Analysis Services Model

I’m building a model in Azure Analysis Services. The model should contain only data for the last 3 months and is processed every day.
I have a separate date dimension that is related to a fact table through a datekey. I’m using a Power Query query to load only the last 3 months into the date dimension. In the Power Query query that loads the fact table, I used Table.NestedJoin to load only the rows that have a matching value in the date table.
When I do this, processing the model takes forever. After some troubleshooting I saw that the query Analysis Services uses to retrieve data from the SQL database retrieves all rows. So, am I correct in saying that AS loads all the data before it merges the rows? Is there a way to change this? Or is there a better way to achieve my solution?
Kind regards,

Joins are super slow in Power Query. You should avoid them if you can do the join in the data source, or use normal relationships in the data model instead.
Also, you can set up the date dimension in DAX and populate it dynamically so that it contains only dates present in the fact table.
As for all the data being loaded, it could be because the data is fetched as is, and only then does Power Query apply the transformations (the join).

You can modify the query in the Power Query Editor / Advanced Editor to add a WHERE clause directly in the source query.
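To make the WHERE clause idea concrete, here is a minimal Python sketch that works out the 3-month cutoff and builds the filtered source query. The table name `dbo.FactSales`, the column `DateKey`, and the assumption that the key is a yyyymmdd integer are all hypothetical; in practice the filter would live in the source query or a view so that the database does the filtering.

```python
from datetime import date


def months_ago(today: date, months: int = 3) -> date:
    """Return the same day-of-month `months` earlier, clamped to the month's end."""
    month_index = today.year * 12 + (today.month - 1) - months
    year, month = divmod(month_index, 12)
    month += 1
    # Days in the target month: first of the next month minus first of this one.
    next_first = date(year + month // 12, month % 12 + 1, 1)
    last_day = (next_first - date(year, month, 1)).days
    return date(year, month, min(today.day, last_day))


def fact_query(today: date) -> str:
    """Build a source query that keeps only fact rows from the last 3 months."""
    # Assumes a hypothetical integer DateKey in yyyymmdd form.
    cutoff_key = int(months_ago(today).strftime("%Y%m%d"))
    return f"SELECT * FROM dbo.FactSales WHERE DateKey >= {cutoff_key}"


query = fact_query(date(2020, 5, 31))
```

Because the filter is part of the statement sent to the database, only the matching rows are returned, rather than the full table being fetched and then joined.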

BigQuery Table multiple partition a day

I am trying to load some Avro-format data into BigQuery through the API and I need some partitioning. According to the documentation here
https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TimePartitioning
ingestion-time partitioning, which uses the _PARTITIONTIME column, creates only one partition a day. Is it possible to create multiple partitions a day by using a timestamp field?
Another option I could think of was range partitioning, documented here
https://cloud.google.com/bigquery/docs/reference/rest/v2/JobConfiguration#RangePartitioning
However, it is marked as experimental, so I'm not sure it is suitable for production use.
