Power BI data import taking forever

I usually work with Power BI and everything normally goes well on my computer. Yesterday I installed it on a virtual machine running Windows Server 2019, which I connect to via Remote Desktop, to create a dashboard to visualize some data.
The problem I'm having is that it has been stuck for over 10 hours by now on the "Creating connections in model" step.
The screenshot is in French, but it's the usual loading dialog.
Some details:
I have already optimized the data and cannot reshape the tables any further.
The only big table (>100k rows) has around 800k rows.
I do have internet access and can ping whatever I want.
Any idea where this could come from? Thanks for your help!

OK, I found my problem this morning while poking around in the virtual machine's settings.
I had only 1 virtual processor, which was constantly being used by other applications, so Power BI could not get nearly as much processing power as it needed.
I just changed it and it loaded in seconds...
Conclusion: Power BI's speed depends on your processing power, so when working with it, close as many windows as possible on your computer.

Same for me. Import of a local Excel sheet (1 tab, 2 cols, 30 rows) did not complete after 30 minutes using only one processor.
Upgrading the VM to more than one processor solved the issue.
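If you're not sure whether the VM is the bottleneck, one quick check is to look at how many logical CPUs the guest actually sees and how busy they are while the import runs. A minimal sketch using Python and psutil (my own suggestion, not something from the thread; the 5-second sampling window is arbitrary):

# Rough sanity check of the resources the VM exposes while Power BI is importing.
# Requires psutil (pip install psutil).
import psutil

print("Logical CPUs visible to the guest:", psutil.cpu_count())
print("Per-CPU utilization over 5 s (%):", psutil.cpu_percent(interval=5, percpu=True))
print("Available RAM (GB):", round(psutil.virtual_memory().available / 1e9, 1))

If this reports a single CPU pinned near 100%, adding vCPUs (or closing other applications) is the fix, as described above.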

Related

How to increase the Power Query timeout beyond 5 minutes?

I'm using Power Query to query Kusto, and the query times out after 5 minutes even though I've set the timeout to 21 minutes, like this:
AzureDataExplorer.Contents("https://<cluster>.kusto.windows.net", "<database>", "<query>", [Timeout = #duration(0,0,21,0), ClientRequestProperties = [#"query_language" = "csl"]])
The query in question typically takes about 7-10 minutes when run directly in Kusto.
A similar question asked here had an answer that suggested going to "Data source settings" and clicking on "Change Source...", but that button is grayed out for me. Besides, the query-specific setting above should override a global setting, right?
Assuming that you're using the AzureDataExplorer.Contents() or Kusto.Contents() methods, there was a regression in the Timeout implementation of the connector. This was fixed on June 7, 2021, and should be included in version 3.0.52 of the connector (it should already be publicly available; make sure you have the latest version of Power BI Desktop).
If you're still facing an issue, contact me directly at itsagui(at)microsoft.com
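Independently of the connector fix, if you want to rule out the query or cluster itself outside of Power BI, here is a rough sketch using the azure-kusto-data Python SDK to run the same query with an extended server timeout. The cluster URL, database, and query are placeholders, and Azure CLI authentication is just one possible auth method:

# Sketch: run the Kusto query directly with a 21-minute server timeout,
# to check whether the slowness comes from the query or from the connector.
from datetime import timedelta
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder, ClientRequestProperties

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication("https://<cluster>.kusto.windows.net")
client = KustoClient(kcsb)

crp = ClientRequestProperties()
crp.set_option(ClientRequestProperties.request_timeout_option_name, timedelta(minutes=21))

response = client.execute("<database>", "<your query>", crp)
print(len(response.primary_results[0].rows), "rows returned")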

How to get average over time only when PC is up in Prometheus

I'm currently trying to get the average amount of free RAM over the last week on a Windows PC, using the WMI Exporter (https://github.com/martinlindhe/wmi_exporter) and Prometheus with the following query:
avg_over_time(wmi_os_physical_memory_free_bytes{instance="foo"}[1w])
The query inside "avg_over_time" is working correctly, but the problem is that the PC is not up 24/7. Since I can't have holes in a range, I was considering using a recording rule.
The other problem is that the PC doesn't start and stop at known times, so I can't use the solution from "How to get the average over time only during the day in Prometheus", because I can't tell at what time of day I need to start collecting.
Is there any recording rule that could aggregate only the data gathered while the PC was up?
Thanks in advance.
The above expression already does what you want, as there'll be no data for the periods when the target couldn't be scraped and avg_over_time doesn't try to do anything fancy with gaps.
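If you want to convince yourself, you can evaluate the same expression through the Prometheus HTTP API and inspect the result. A small Python sketch (the Prometheus URL is an assumption):

# Sketch: evaluate the avg_over_time expression via Prometheus' HTTP API.
# Periods when the PC was off simply contribute no samples to the average.
import requests

PROM_URL = "http://localhost:9090"  # assumed Prometheus address
query = 'avg_over_time(wmi_os_physical_memory_free_bytes{instance="foo"}[1w])'

resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=30)
resp.raise_for_status()
for series in resp.json()["data"]["result"]:
    timestamp, value = series["value"]
    print(series["metric"].get("instance"), "average free bytes:", float(value))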

Google AutoML Importing text items very slow

I'm importing text items to Google's AutoML. Each row contains around 5000 characters and I'm adding 70K of these rows. This is a multi-label data set. There is no progress bar or indication of how long this process will take. It's been running for a couple of hours. Is there any way to calculate the time remaining or the total estimated time? I'd like to add additional data sets, but I'm worried that this will be a very long process before the training even begins. Any sort of formula to create even a semi-wild guess would be great.
Thanks!
I don't think that's possible today, but I filed a feature request [1] that you can follow for updates. I asked for both importing data and training, as it could be useful for training too.
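It's not an ETA, but if you drive the import through the client library instead of the console, you at least get a handle on the long-running operation and can block (or poll) until it finishes. A rough sketch with the google-cloud-automl Python client; the project ID, dataset ID, and GCS path are hypothetical:

# Sketch: import text items into an existing AutoML dataset and wait for the
# long-running operation to complete. No progress percentage is exposed.
from google.cloud import automl

project_id = "my-project"             # hypothetical
dataset_id = "TCN1234567890123456"    # hypothetical AutoML dataset ID
gcs_csv = "gs://my-bucket/items.csv"  # hypothetical import file

client = automl.AutoMlClient()
dataset_full_id = client.dataset_path(project_id, "us-central1", dataset_id)

input_config = automl.InputConfig(gcs_source=automl.GcsSource(input_uris=[gcs_csv]))
operation = client.import_data(name=dataset_full_id, input_config=input_config)

print("Import started, operation:", operation.operation.name)
operation.result()  # blocks until the import finishes (no time estimate available)
print("Import finished.")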
I tried training with 50K records (~ 300 bytes/record) and the load took more than 20 mins after which I killed it. I retried with 1K, which ran for 20 mins and then emailed me an error message saying I had multiple labels per input (yes, so what? training data is going to have some of those) and I had >100 labels. I simplified the classification buckets and re-ran. It took another 20 mins and was successful. Then I ran 'training' which took 3 hours and billed me $11. That maps to $550 for 50K recs, assuming linear behavior. The prediction results were not bad for a first pass, but I got the feeling that it is throwing a super large neural net at the problem. Would help if they said what NN it was and its dimensions. They do say "beta" :)
Don't waste your time trying to use Google for text classification. I am a heavy GCP user, but Microsoft LUIS is far better, more precise, and so much faster that I can't believe both products are trying to solve the same problem.
LUIS has much better documentation, supports more languages, has a much better test interface, and is way faster. I don't know yet whether it's cheaper, because the pricing model is different, but we are willing to pay more.

Informatica PowerExchange CDC data arrives in target DB way too slowly

First of all, I'm very new to Informatica PowerCenter and PowerExchange.
We are using Informatica PowerCenter and PowerExchange to replicate CDC data from our source DB2 to a PostgreSQL DB. For this we have one workflow in which 7 tables are mapped, and we get the results in our PostgreSQL. It works fine so far, but performance is lacking. It's not that the size of the data is the problem; it's more the delay before I see results in the target DB.
When I insert or delete some data on the DB2 side (just 10 rows or so in one database), I mostly see the results in our PostgreSQL after about 10-30 seconds (very rarely in less than 5 seconds).
My goal would be to reduce this delay. Is this possible? What would I need for that?
I played a little with the commit interval and the DTM buffer size, but nothing helped much.
I also have the feeling that when I configure the workflow to run continuously, it's even slower than when I execute the workflow manually after making the inserts/deletes.
Thanks in advance

How to get the transferred byte size of a SQL query in Qt from an Oracle database?

We are trying to benchmark the database usage of an application written in Qt 5 that uses an Oracle database.
Counting queries and benchmarking is no problem, but now our supervisor also wants the size in bytes of the received results. At the moment we only use the Qt SQL interface, which doesn't give you the received byte count.
Is there a way (preferably within Qt) to get the transferred byte size?
My only idea at the moment is to calculate the byte size of a row, multiply it by the transferred row count, and use that as an estimate, but this is more of a crutch than a solution...
Thanks in advance,
Kai
You're assuming that the Oracle driver even reports such information. Does it?
Alas, it doesn't matter. You can easily create a transparent proxy within your application that forwards the data to/from the real database. Then point the driver at the proxy. The proxy will then have access to transfer sizes and can be easily queried about them.
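To illustrate the idea, here is a rough standalone sketch of such a byte-counting forwarder in Python (not Qt; inside the application the same thing could be built with QTcpServer). Hosts and ports are placeholders; you would point the Qt connection at 127.0.0.1:1522 instead of the real server:

# Sketch: a local TCP forwarder that sits between the app and the Oracle
# listener and counts the bytes flowing in each direction.
import socket
import threading

ORACLE_HOST, ORACLE_PORT = "db.example.com", 1521  # real database (placeholder)
LISTEN_HOST, LISTEN_PORT = "127.0.0.1", 1522       # what the app connects to

counters = {"to_db": 0, "from_db": 0}
lock = threading.Lock()

def pump(src, dst, key):
    # Copy data from src to dst, adding the transferred bytes to the counter.
    while True:
        data = src.recv(65536)
        if not data:
            break
        with lock:
            counters[key] += len(data)
        dst.sendall(data)
    dst.close()

def handle(client_sock):
    upstream = socket.create_connection((ORACLE_HOST, ORACLE_PORT))
    threading.Thread(target=pump, args=(client_sock, upstream, "to_db"), daemon=True).start()
    pump(upstream, client_sock, "from_db")
    print("bytes sent to DB:", counters["to_db"], "- bytes received:", counters["from_db"])

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind((LISTEN_HOST, LISTEN_PORT))
server.listen()
while True:
    conn, _ = server.accept()
    threading.Thread(target=handle, args=(conn,), daemon=True).start()

Note that this measures raw protocol bytes (including Oracle protocol overhead), so treat it as an approximation of the transferred volume rather than the exact payload size.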