Do I have to apply changes before refresh in Power BI?

I have made changes to a huge table in the query editor. Afterwards I just used Close instead of Close & Apply. Do I have to press Apply changes and then Refresh to get a full refresh, or does Refresh alone suffice? Does Refresh perform both Apply changes and a table refresh?
With a huge table, applying changes takes a few minutes and the refresh takes another few minutes.
What are some tips for making changes to huge fact tables in the query editor to minimize the load (or double load) while designing the model?

When you only press Refresh, it just fetches your data again.
(Close &) Apply applies the query changes to your model and then refreshes the data.
You can even run into a situation where a change in the query (model) alters the constraints on your data in a way that conflicts with the data already loaded. This is annoying because you can no longer Apply the change; you first need to filter/correct your data on the query side before you apply the changes.
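On the question of minimizing the load while designing: one common trick is to add a temporary row filter as an early step of the query and remove it once the model is finished, so Apply/Refresh cycles stay fast. A minimal Power Query (M) sketch, assuming an illustrative Sales table with a Date column (the source and all names here are made up):

let
    // Illustrative source - replace with your actual connection
    Source = Sql.Database("my-server", "my-db"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // Temporary design-time filter that keeps only a small slice of the fact table;
    // delete this step before publishing the final model
    DesignSample = Table.SelectRows(Sales, each [Date] >= #date(2023, 1, 1))
in
    DesignSample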

Related

Which one is more performant in Redshift - Truncate followed by Insert Into, or Drop and Create Table As?

I have been working on AWS Redshift and am curious about which full-reload data loading method is more performant.
Approach 1 (Using Truncate):
Truncate the existing table
Load the data using Insert Into Select statement
Approach 2 (Using Drop and Create):
Drop the existing table
Load the data using Create Table As Select statement
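For reference, a rough sketch of the two patterns (sales_fact and staging_sales_fact are made-up names):

-- Approach 1: keep the table, empty it, and reload
-- (note: TRUNCATE commits immediately in Redshift)
TRUNCATE TABLE sales_fact;
INSERT INTO sales_fact
SELECT * FROM staging_sales_fact;

-- Approach 2: drop the table and rebuild it from the query
DROP TABLE IF EXISTS sales_fact;
CREATE TABLE sales_fact AS
SELECT * FROM staging_sales_fact;

One practical difference to keep in mind: with the drop/CTAS pattern, grants and any DISTKEY/SORTKEY settings on the old table are not carried over automatically, so they have to be re-specified.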
We have been using both in our ETL, but I am interested in understanding what's happening behind the scenes on the AWS side.
In my opinion, Drop and Create Table As should be more performant, as it avoids the overhead of scanning/handling the data blocks of the existing table that an Insert Into statement needs.
Moreover, truncate in AWS Redshift does not reseed identity columns - Redshift Truncate table and reset Identity?
Please share your thoughts.
Redshift operates on 1MB blocks as the base unit of storage and coherency. When changes are made to a table, it is these blocks that are "published" for all to see when the changes are committed. A table is just a list (data structure) of the block ids that compose it, and there can be many versions of a table in flight at any time (if it is being changed while others are viewing it).
For the sake of this question, let's assume the table in question is large (contains a lot of data), which I expect is true. These two statements end up doing a common action - unlinking and freeing all the blocks in the table. The blocks are where all the data lives, so you'd think the speed of these two would be the same, and on idle systems they are close. Both automatically commit the results, so the command doesn't complete until the work is done. In this idle-system comparison I've seen DROP run faster, but then you need to CREATE the table again, so there is time needed to recreate the data structure of the table - though this can be done in a transaction block, so do we need to include the COMMIT? The bottom line is that on an idle system these two approaches are quite close in runtime, and when I last measured them for a client the DROP approach was a bit faster. I would advise you to read on before making your decision.
However, in the real world Redshift clusters are rarely idle, and under load these two statements can behave quite differently. DROP requires exclusive control over the table since it does not run inside of a transaction block. All other uses of the table must be closed (committed or rolled back) before DROP can execute. So if you are performing this DROP/recreate procedure on a table others are using, the DROP statement will be blocked until all those uses complete. This can take an indeterminate amount of time. For ETL processing on "hidden" or "unpublished" tables the DROP/recreate method can work, but you need to be really careful about what other sessions are accessing the table in question.
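If you do go the DROP route on a busy cluster, it can be worth checking whether other sessions still hold locks on the table before you run it; a quick, illustrative look at Redshift's STV_LOCKS system table is one way to do that:

-- Shows sessions currently holding table locks; a lock held by another PID on
-- your table means the DROP will wait until that transaction finishes
SELECT table_id, lock_owner_pid, lock_status, last_update
FROM stv_locks;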
Truncate does run inside of a transaction but performs a commit upon completion. This means that it won't be blocked by others working with the table. It's just that one version of the table is full (for those who were looking at it before truncate ran) and one version is completely empty. The data structure of the table has versions for each session that has it open, and each sees the blocks (or lack of blocks) that correspond to its version. I suspect that it is managing these data structures and propagating these changes through the commit queue that slows TRUNCATE down slightly - bookkeeping. The upside of this bookkeeping is that TRUNCATE will not be blocked by other sessions reading the table.
The deciding factor in choosing between these approaches is often not performance; it is which one has the locking and coherency behavior that works in your solution.

Django: Improve page load time by executing complex queries automatically over night and saving result in small lookup table

I am building a dashboard-like webapp in Django and my view takes forever to load due to a relatively large database (a single table with 60,000 rows... and growing), the complexity of the queries, and quite a lot of number crunching and data manipulation in Python. According to django-debug-toolbar, the page needs 12 seconds to load.
To speed up the page loading time I thought about the following solution:
Build a view that is called automatically every night, completes all the complex queries, number crunching and data manipulation, and saves the results in a small lookup table in the database
Build a second view that returns the dashboard but retrieves the data from the small lookup table via a very simple query and hence loads much faster
Since the queries from the first view are executed every night, the data is always up-to-date in the lookup table
My questions: Does my idea make sense, and if yes, does anyone have any experience with such an approach? How can I write a view that gets called automatically every night?
I also read about caching, but with caching the first load of the page after a database update would still take a very long time, and the data in the database gets updated on a regular basis.
Yes, it is common practice.
We are pre-calculating some stuff and we are using Celery to run those tasks around midnight daily. For some stuff we have a special new model, but usually we add database columns to the existing model that contain the pre-calculated information.
This approach basically has nothing to do with views - you use them normally, just access data differently.
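For what it's worth, a rough sketch of what the nightly pre-calculation could look like with Celery (the model names, the aggregation, and the schedule below are placeholders, not a drop-in implementation):

# tasks.py - hypothetical nightly pre-calculation task (names are illustrative)
from celery import shared_task
from django.db.models import Sum
from myapp.models import FactRow, DashboardSummary  # placeholder models

@shared_task
def rebuild_dashboard_summary():
    """Run the heavy queries once and store the result in a small lookup table."""
    rows = (
        FactRow.objects
        .values("category")
        .annotate(total=Sum("amount"))
    )
    DashboardSummary.objects.all().delete()
    DashboardSummary.objects.bulk_create(
        DashboardSummary(category=r["category"], total=r["total"]) for r in rows
    )

# settings.py / celery.py - run it shortly after midnight via Celery beat
# CELERY_BEAT_SCHEDULE = {
#     "rebuild-dashboard-summary": {
#         "task": "myapp.tasks.rebuild_dashboard_summary",
#         "schedule": crontab(hour=0, minute=30),  # from celery.schedules import crontab
#     },
# }

The dashboard view then only runs a trivial query against the small summary table.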

Oracle APEX - Reusable Pages?

We have some tables in our database that all have the same attributes, but each table is named differently. I'm not sure of the architect's original intent in creating them this way, but this is what I have to work with.
My question for all the expert Oracle APEX developers: is there a way to create a reusable page that I can pass the table name to, so that table name would be used in the reporting region and DML processing of that page?
I've read up on templates and plugins and don't see a path forward with those options. Of course, I'm new to web development, so forgive my ignorance.
We are using version 18.2.
Thanks,
Brian
For reporting purposes, you could use a source which is a function that returns a query (i.e. a SELECT statement). Doing so, you'd dynamically decide which table to select from.
However, DML isn't that simple. Instead of default row processing, you should write your own process(es) so that you'd insert/update/delete rows in the right table. I've never done that, but I'd say that it is possible. Basically, you'd keep all logic in the database (for example, a package) and call those procedures from your Apex application.
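To illustrate the reporting part: a region source of the "PL/SQL Function Body returning SQL Query" type could look roughly like the sketch below, assuming a page item P1_TABLE_NAME holds the chosen table name (the item name, table names, and column list are made up):

DECLARE
   l_sql VARCHAR2(4000);
BEGIN
   -- Whitelist the table names to avoid SQL injection through the item value
   IF :P1_TABLE_NAME NOT IN ('TABLE_A', 'TABLE_B', 'TABLE_C') THEN
      RAISE_APPLICATION_ERROR(-20001, 'Unexpected table: ' || :P1_TABLE_NAME);
   END IF;

   -- All tables share the same attributes, so one SELECT list fits them all
   l_sql := 'SELECT id, name, created_on FROM ' || :P1_TABLE_NAME;
   RETURN l_sql;
END;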
You could have multiple regions on one page; one region per table. Then use dynamic actions to show/hide the regions and run the select query based on a table name selected by the user.
Select table name from a dropdown or list
Show the region that matches the table name (dynamic action)
Hide any other regions that are visible (dynamic action)
Refresh the selected region so the data loads (dynamic action)
If that idea works let me know and I can provide a bit more guidance.
I never tried it with reports, but would it work to put all the reports on a single page and use an item with Server-Side Conditions to decide which one gets shown? You'd likely need separate items with a determined value for the page to recognize and display.
I know I did something like that to show buttons such as Delete, Save and Create dynamically, rather than creating two or more separate pages for editing certain information. In that case it was about which buttons to show based on a report's primary key being sent to the "Edit" page. If the value was empty, it meant you wanted to create a new record (also because the create button/link sent no PK). If a PK was sent (via an edit button/link), the page would recognize it, hide the create button, and show the edit button instead.

Copy Records in Oracle Apex

I need to copy selected row values and store them as new records.
I am using Oracle APEX 4.2 and a Tabular Form.
I need to use a checkbox to select the rows and a Copy button. When I select multiple rows and then click the Copy button, all the selected row values should be copied as new rows and saved.
Can anyone help?
Copying Records Through an APEX Tabular Form Input
The idea of cloning existing records from a single table through an Oracle APEX Tabular Form works without much interference with the default design that you can set up through the APEX wizard for page region content.
1. Build a table with an independent primary key.
2. It is suggested to include two auxiliary columns, COPY_REQUEST and COPIED_FROM, for running copy operations. Specific form elements will map to these columns on the tabular form that will be set up.
3. Build an Oracle stored procedure that can read which records need to be copied. This procedure will be invoked each time the SUBMIT button is pressed.
4. (optional) Consider suppressing step (3) when there is nothing to process (i.e., no records marked for copying).
The Working Table for Receiving Input: COPY_ME
TIP: You will have an easier time if you use the standard TABLE creation wizard. Designate CUSTOMER_ID as the PRIMARY_KEY and have APEX create its standard auto-incrementing functionality on top. (sequence plus trigger set up.)
Here's the sample data I used... though it doesn't matter. You can put in your own values and be able to verify what happened easily.
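If you would rather create the table by hand than through the wizard, a rough DDL sketch matching the columns used further below (the non-key columns simply mirror my sample data):

CREATE TABLE copy_me (
   customer_id   NUMBER        NOT NULL PRIMARY KEY,
   customer_name VARCHAR2(100),
   city          VARCHAR2(100),
   country       VARCHAR2(100),
   copy_request  CHAR(1),
   copied_from   NUMBER
);
-- plus the usual sequence + trigger (or the wizard-generated equivalent) to populate customer_id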
The Heavy Lifting: The Stored Procedure for Cloning Records in COPY_ME
This procedure works with one or more records at a time, keyed off a special identifier in the COPY_REQUEST column. After the task is done, the procedure cleans up and resets the request value.
create or replace procedure proc_copy_me_request is
   c_request_code CONSTANT char(1) := 'Y';

   -- Lock the rows that were flagged for copying so the flag can be cleared afterwards
   cursor copy_cursor is
      SELECT cme.customer_id, cme.customer_name, cme.city, cme.country,
             cme.copy_request
        FROM copy_me cme
       WHERE cme.copy_request = c_request_code
         FOR UPDATE OF cme.copy_request;
BEGIN
   FOR i IN copy_cursor LOOP
      -- Clone the row; a new customer_id comes from the table's trigger/sequence
      INSERT INTO copy_me (customer_name, city, country, copied_from)
      VALUES (i.customer_name, i.city, i.country, i.customer_id);

      -- Reset the request flag on the row that was just copied
      UPDATE copy_me
         SET copy_request = null
       WHERE CURRENT OF copy_cursor;
   END LOOP;
   COMMIT;
END proc_copy_me_request;
There is also a column that can be hidden. It tracks where the record was originally copied from.
Note that the cursor is using the FOR UPDATE OF and WHERE CURRENT OF notation. This is important because the procedure is changing the records that are referenced by it.
APEX Page Setup Instructions
Set up a standard FORM type page and choose the TABULAR FORM style. Follow the setup instructions, taking care to map the correct primary key and the PK sequence object created with the table in the previous steps above.
This is what your page set up will look like after these steps are completed:
EDIT The COPY_REQUEST Form Value:
Under the column attributes section, change the Display As option to "simple checkbox"
Under the list of values section, put a single value under the LOV Definition: Y (case sensitive in either way... just be consistent)
EDIT The COPIED_FROM Form Value:
Under the column attributes section, change the Display As option to "Display as Text(Saves State)". This is just to prevent users from stepping on this read-only field. You could also suppress it if it isn't important to know.
CREATE a New Process: Execute Copy Procedure
This is at the bottom of the same configuration page; there are very few things to change or add:
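In case the screenshot is not available, the process can be sketched roughly like this (a condition of the "Exists (SQL query returns at least one row)" type implements the optional suppression from the steps above; everything here is illustrative):

-- Process source (PL/SQL), fired on SUBMIT:
BEGIN
   proc_copy_me_request;
END;

-- Condition query: only run the process when at least one row is flagged
SELECT 1
  FROM copy_me
 WHERE copy_request = 'Y';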
Demonstration: Screenshot of COPY_ME Tabular Form Page in Action
The first screenshot below is before the page is tidied up and the checkbox control is put into place.
Plug in some test data and give it a try. The Page Process created in the step above conditionally invokes the stored procedure, which processes all pending copy requests at once when the SUBMIT form button is pressed.
COMMENTS: If you spend enough time tinkering with the built-in wizards in Oracle APEX, there are opportunities to learn new design patterns and process flows that are compatible with the tool. Adapting your approach can reduce the amount of additional work and frustration.

Overcoming querying limitations in Couchbase

We recently made a shift from relational (MySQL) to NoSQL (Couchbase). Basically it's a back-end for a social mobile game. We were facing a lot of problems scaling our backend to handle an increasing number of users. When using MySQL, loading a user took a lot of time because of the many joins between multiple tables. We saw a huge improvement after moving to Couchbase, especially when loading data, as most of it is kept in a single document.
On the downside, Couchbase also seems to have a lot of limitations as far as querying is concerned. Couchbase's alternative to SQL queries is views. While we managed to handle most of our queries using map-reduce, we are really having a hard time figuring out how to handle time-based queries, e.g. we need to filter users based on a timestamp attribute. We only want a user in the view if its time is less than the current time:
if(user.time < new Date().getTime() / 1000)
What happens is that once a user's time is set to some future time, it gets excluded from this view, which is the desired behavior, but it never gets added back to the view unless we update it - a document only gets re-indexed in the view when it's updated.
Our solution right now is to load the first x user documents and then check the time in our application. Sorting is done on the user.time attribute, so we get those users whose time is less than or near the current time. But I am not sure if this is actually going to work in a live environment. Ideally we would like to avoid these types of checks at the application level.
Also there are times, e.g. matchmaking, when we need to check multiple time-based attributes. Our current strategy doesn't work in such cases and we frequently get documents from the view which do not pass these checks when done in the application. I would really appreciate it if someone who has already tackled similar problems could share their experiences. Thanks in advance.
Update:
We tried using range queries, which work for only one key. Like I said, in most cases we have multiple time-based keys, meaning multiple ranges, which does not work.
If you use new Date().getTime() inside a view function, you'll always get the time when that view was indexed, just as you said: "it never gets added back to view unless we update it".
There are two ways:
Bad way (don't do this in production): query views with the stale=false param. That will cause the view to update before it returns results. But view indexing is a slow process, especially if you have > 1 million records.
Good way: use range requests. You just need to emit your date in the map function as a key, or as part of a compound key, and use a range request. You can see one example here or here (also, if you want to use DateTime in Couchbase, this example will be more useful). Or just look at my example below:
E.g. you will have docs like:
doc = {
    "id": 1,
    "type": "doctype",
    "timestamp": 123456, // document update or creation time
    "data": "lalala"
}
For those docs, the map function will look like:
map = function(doc, meta) {
    if (doc.type === "doctype") {
        emit(doc.timestamp, null);
    }
}
And now to get recently "updated" docs you need to query this view with params:
startKey="dateTimeNowFromApp"
endKey="{}"
descending=true
Note that startKey and endKey are swapped because I used descending order. Here is also a link to the documentation about the key types that Couchbase supports.
I've also found a link to a question that can help.