Power BI: incremental data load using an OData feed

Is there any way to save the previous data before it gets overwritten by a refresh?
Steps I have done:
Created a table and appended it to table A.
Created a column called DateTime with the function
DateTime.LocalNow()
Now my problem is how to save the previous data before the refresh phase. I need to preserve both the previous data and its timestamps.
Example:
Before refreshing:

Table A:
+--------------+------------------+-----+
| Columnname x | DateTime         | ... |
+--------------+------------------+-----+
| value        | 23.03.2016 23:00 |     |
+--------------+------------------+-----+

New Table:
+--------------+------------------+-----+
| Columnname x | DateTime         | ... |
+--------------+------------------+-----+
| value        | 23.03.2016 23:00 |     |
+--------------+------------------+-----+

After refreshing:

Table A:
+--------------+------------------+-----+
| Columnname x | DateTime         | ... |
+--------------+------------------+-----+
| value        | 23.03.2016 23:00 |     |
| value 2      | 23.03.2016 23:01 |     |
+--------------+------------------+-----+

New Table:
+--------------+------------------+-----+
| Columnname x | DateTime         | ... |
+--------------+------------------+-----+
| value        | 23.03.2016 23:00 |     |
| value 2      | 23.03.2016 23:01 |     |
+--------------+------------------+-----+

Incremental refresh in the Power BI Service and Power BI Desktop isn't currently supported, but please vote for the feature. (Update: see that link for info on a preview feature that does this.)
If you need this behavior, you need to load these rows into a database and then incrementally load the database; a sketch of that pattern is below. The load into Power BI will still be a full load of the table(s).
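For illustration, here is a minimal Python sketch of that staging pattern. The feed URL, the ColumnnameX field, and the SQLite database are all placeholders, not anything from the original question; each run appends the current feed rows with a load timestamp, so earlier snapshots survive later refreshes:

import sqlite3
from datetime import datetime, timezone

import requests

FEED_URL = "https://example.com/odata/TableA"  # hypothetical OData endpoint

def snapshot_feed(db_path="history.db"):
    # OData v4 feeds return the rows under the "value" key.
    rows = requests.get(FEED_URL, timeout=30).json()["value"]
    loaded_at = datetime.now(timezone.utc).isoformat()

    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS table_a_history (
               columnname_x TEXT,
               loaded_at    TEXT
           )"""
    )
    # Append, never overwrite, so previous rows keep their timestamps.
    con.executemany(
        "INSERT INTO table_a_history (columnname_x, loaded_at) VALUES (?, ?)",
        [(row["ColumnnameX"], loaded_at) for row in rows],  # "ColumnnameX" is a placeholder field
    )
    con.commit()
    con.close()

if __name__ == "__main__":
    snapshot_feed()

Power BI then points at table_a_history; each refresh is still a full load of that table, but the accumulated history is preserved in the database.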

This is now available in Power BI Premium.
From the docs:
Incremental refresh enables very large datasets in the Power BI Premium service with the following benefits:
Refreshes are faster. Only data that has changed needs to be refreshed. For example, refresh only the last 5 days of a 10-year dataset.
Refreshes are more reliable. For example, it is not necessary to maintain long-running connections to volatile source systems.
Resource consumption is reduced. Less data to refresh reduces overall consumption of memory and other resources.

AWS DynamoDB: High read latency using query on GSI

I have a DynamoDB table which contains Date, City, and other attributes as the columns. I have configured a GSI with Date as the hash key. The table records 27 attributes for 350 cities daily.
+------------+---------+------------+-----+-------------+
| Date       | City    | Attribute1 | ... | Attribute27 |
+------------+---------+------------+-----+-------------+
| 25-06-2020 | Boston  | someValue  | ... | someValue   |
| 25-06-2020 | NY      | someValue  | ... | someValue   |
| 25-06-2020 | Chicago | someValue  | ... | someValue   |
+------------+---------+------------+-----+-------------+
I have a Lambda proxy integration set up in API Gateway. The Lambda function receives a 7-day date range in the request. Each date in this range is used to query DynamoDB (via a query input) to get all the items for that day. The results for each day are consolidated for the week and sent back as a JSON response.
The latency seen in Postman is around 1.5 s, even after increasing the Lambda memory to 1024 MB (although only 76 MB is being consumed).
Is there any way to improve the performance? The DynamoDB table is already running in on-demand capacity mode.
You don't say whether you are running the per-day queries in parallel. If not, do so: the seven queries are independent, so issuing them concurrently cuts the wall-clock time to roughly that of the slowest single query (see the sketch below).
You also don't say what CloudWatch shows for query latency; as mentioned by Marcin, DAX can help reduce that.
Nor do you mention what CloudWatch shows for the Lambda execution itself; there are various articles about optimizing Lambda.
Whatever is left is networking, and there's not much you can do about that. One thing to consider is reusing DB connections in your Lambda by creating the client outside the handler.
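For illustration, a minimal sketch of the parallel-query idea in a Python Lambda. The table name CityMetrics, the GSI name DateIndex, and the event shape are assumptions, not taken from the question:

from concurrent.futures import ThreadPoolExecutor

import boto3
from boto3.dynamodb.conditions import Key

# Created once, outside the handler, so warm invocations reuse the connection.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("CityMetrics")  # hypothetical table name

def query_one_day(date_str):
    """Fetch all items for one day from the Date-keyed GSI, following pagination."""
    items = []
    kwargs = {
        "IndexName": "DateIndex",  # hypothetical GSI name
        "KeyConditionExpression": Key("Date").eq(date_str),
    }
    while True:
        resp = table.query(**kwargs)
        items.extend(resp["Items"])
        if "LastEvaluatedKey" not in resp:
            return items
        kwargs["ExclusiveStartKey"] = resp["LastEvaluatedKey"]

def lambda_handler(event, context):
    dates = event["dates"]  # e.g. ["25-06-2020", ..., "01-07-2020"]
    # The per-day queries are independent, so run them concurrently.
    with ThreadPoolExecutor(max_workers=len(dates)) as pool:
        per_day = list(pool.map(query_one_day, dates))
    # NOTE: DynamoDB numbers come back as Decimal; convert them before
    # JSON-serializing the real response.
    return {"days": dict(zip(dates, per_day))}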

Add condition to an existing column

I have the following columns in my dataset:
+-----+------------+------------+----------+----------+-----------------------+-----+
| ... | Start Date | Start Time | End Date | End Time | Production Start Date | ... |
+-----+------------+------------+----------+----------+-----------------------+-----+
| ... | 01022020   | 180000     | 02022020 | 190000   | 01022020              | ... |
+-----+------------+------------+----------+----------+-----------------------+-----+
Sometimes the Start Date, Start Time, etc. values are blank, but the Production Start Date values are always populated.
When the Start Date is empty (null), for example, I want my dataset (or ideally, my graph) to read the Production Start Date instead.
How can I achieve this in Power BI?
I know I can make a conditional column and, within that, determine which column to read data from, but is there any way to add a condition to the existing Start Date column? I couldn't see such an option in the context menu or the subsequent ribbon options.
Is my only option to create a custom conditional column instead?
As @Andrey Nikolov mentioned in the comments, the only ways you can achieve this are to:
1. Create a calculated DAX column.
2. Create a custom column in query mode (M).
3. Edit the original source table.

Using Dataprep to write to just a date partition in a date partitioned table

I'm using a BigQuery view to fetch yesterday's data from a BigQuery table and then trying to write it into a date-partitioned table using Dataprep.
My first issue was that Dataprep would not correctly pick up DATE type columns, but converting them to TIMESTAMP works (thanks Elliot).
However, when you set an output BigQuery table in Dataprep, you have only three options: Append, Truncate, or Drop the existing table. If the table is date-partitioned and you use Truncate, it removes all existing data, not just the data in that partition.
Is there another way to do this that I should be using? My alternative is to use Dataprep to overwrite a table and then use Cloud Composer to run some SQL pushing this data into a date-partitioned table. Ideally, I'd want to do this with Dataprep alone, but that doesn't seem possible right now.
BigQuery table schema and partition details: (screenshots omitted; the table is partitioned on the date column).
The data I'm ingesting is simple. In one flow:
+------------+--------+
| date | name |
+------------+--------+
| 2018-08-08 | Josh1 |
| 2018-08-08 | Josh2 |
+------------+--------+
In the other flow:
+------------+--------+
| date | name |
+------------+--------+
| 2018-08-09 | Josh1 |
| 2018-08-09 | Josh2 |
+------------+--------+
It overwrites the data in both cases.
You can create a partitioned table based on DATE. Data written to a partitioned table is automatically delivered to the appropriate partition based on the date value (expressed in UTC) in the partitioning column.
Use Append so that new data is added to the matching partitions rather than wiping the table.
You can create the table using the bq command:
bq mk --table --expiration [INTEGER] --schema [SCHEMA] --time_partitioning_field date [DATASET].[TABLE]
--time_partitioning_field defines which field will be used for the partitions.
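For illustration, a rough equivalent with the google-cloud-bigquery Python client; the project, dataset, and table names here are made up. Rows appended to the table land in the partition matching their date value, so one day's load doesn't overwrite another's:

from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.names_by_date"  # hypothetical names

# Create a table partitioned on the DATE column "date".
table = bigquery.Table(
    table_id,
    schema=[
        bigquery.SchemaField("date", "DATE"),
        bigquery.SchemaField("name", "STRING"),
    ],
)
table.time_partitioning = bigquery.TimePartitioning(field="date")
client.create_table(table, exists_ok=True)

# Append rows; each lands in its own date partition, leaving other
# partitions (e.g. 2018-08-08) untouched.
rows = [
    {"date": "2018-08-09", "name": "Josh1"},
    {"date": "2018-08-09", "name": "Josh2"},
]
job_config = bigquery.LoadJobConfig(write_disposition="WRITE_APPEND")
client.load_table_from_json(rows, table_id, job_config=job_config).result()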

Better design for a multi-column search - Django/PostgreSQL

I have a table of products (productsmaster).
For each product in the table, I have four columns showing the months in which quarterly checks are conducted (product1: Q1-Jan, Q2-Apr; product2: Q1-Dec, Q2-Mar, ...).
I am developing an app in Django/PostgreSQL that, for a specified month, picks all products reporting in that month, i.e. products that have the month in column Q1, Q2, Q3, or Q4.
The products are set up once but are read much more often for reports and processing. I'm sorry I don't have any code yet, as I am looking to clarify the design before I start coding. Can anyone advise how I can improve this? Database design if possible, but I'm willing to use Django solutions too.
You should have a dedicated model for this; call it ProductCheck:

+---------+---------+-------+
| product | quarter | month |
+---------+---------+-------+
| prod1   | Q1      | Jan   |
| prod1   | Q2      | Apr   |
| prod2   | Q1      | Dec   |
| prod2   | Q2      | Mar   |
+---------+---------+-------+
Then your query looks like this:
ProductCheck.objects.filter(month="Jan")
Remember to put indexes on the fields that are most commonly searched on, to speed up the query. You can also add a unique constraint to avoid duplicate data. A sketch of such a model is below.
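A minimal sketch of that model, assuming a hypothetical Product model in a productsmaster app:

from django.db import models

class ProductCheck(models.Model):
    """One row per product per quarterly check."""
    product = models.ForeignKey(
        "productsmaster.Product",  # hypothetical products model
        on_delete=models.CASCADE,
        related_name="checks",
    )
    quarter = models.CharField(max_length=2)                # "Q1" .. "Q4"
    month = models.CharField(max_length=3, db_index=True)   # "Jan" .. "Dec"

    class Meta:
        constraints = [
            # Each product reports at most once per quarter.
            models.UniqueConstraint(
                fields=["product", "quarter"], name="unique_product_quarter"
            ),
        ]

The query ProductCheck.objects.filter(month="Jan") then hits the index on month, and select_related("product") avoids an extra query per row when building the report.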

Sync Framework: map a single table into multiple tables

I have two tables like the following:
On server:

| Orders Table | OrderDetails Table |
+--------------+--------------------+
| Id           | Id                 |
| OrderDate    | OrderId            |
| ServerName   | Product            |
|              | Quantity           |

On client:

| Orders Table | OrderDetails Table |
+--------------+--------------------+
| Id           | Id                 |
| OrderDate    | OrderId            |
|              | Product            |
|              | Quantity           |
|              | ClientName         |
I need to sync [Server].[Orders Table].[ServerName] to [Client].[OrderDetails Table].[ClientName].
The question:
What is the correct and efficient way of doing this?
I know that deprovisioning and provisioning with a different config is one way of doing it, so I just want to know the recommended approach.
EDIT:
The other columns of each table should sync normally ([Server].[Orders Table].[Id] to [Client].[Orders Table].[Id], and so on).
Also, the mapping strategy sometimes changes based on the row of data (i.e. which side is sending/receiving).
Sync Framework is not an ETL tool; simply put, its database sync is per table.
If you really want to force it to do what you want, you can intercept the ChangesSelected event for the OrderDetails table, look up the extra column from the other table, and then dynamically add the column to the dataset before it gets applied on the other side.
See this link for how to manipulate the change dataset.