Does anyone know if it's possible to connect a Power BI report to data in Cloudera Data Platform via PolyBase? I know you can get PolyBase to work with CDH, but I've not seen anything about CDP.
You should be able to connect to CDP via Impala:
https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-connect-impala
The reason you may not find much yet on connecting to the Cloudera Data Platform (CDP) is that it is the newest generation of the platform.
Another thing to note is that CDP is an entire platform, so there are many things you may want to connect to. The most likely tools are Hive and Impala, and the most likely forms are the Data Hub and the Data Platform.
The good news:
Though it will take some time for the blogosphere to catch up and for articles to be written or rewritten, CDP pretty much encompasses the capabilities of its predecessors, CDH and HDP. As such, if something works with either CDH or HDP for connecting to data, it is almost certain to be possible with CDP as well.
To answer concretely: it is definitely possible to use any BI tool, including Power BI, on top of the data in CDP.
Full disclosure: I am an employee of Cloudera.
I'm an experienced developer who knows very little about Power BI. So we've hired some consultants to implement our Power BI screens. And I provided them with a read-only login to my SQL Server database.
It works okay, but when we complained that the data never updates, they told us we should set up a VM to "assure that at the refreshing moment, the scheduled job is not going to fail. VM is always connected, so even during holidays, weekends, the data will be always refreshing."
They followed up with "If the database is on-premise, we need a gateway to connect power bi to the database. If the machine, where the gateway is installed is off, power bi can not connect to the database. So, we need a VM to assure that the gateway is always on."
But this makes zero sense to me. Our database is not on-premise if it's on the Internet and we've given them a connection string. They should be able to update the data at any time.
Can anyone tell me what I'm missing here? I'm starting to question these guys' knowledge. Is it this complicated for Power BI to automatically update its data?
Some data sources require a data gateway even if you put them on the open internet. Data sources that are typically deployed on private networks, or that require third-party drivers, need the Power BI on-premises data gateway for refresh. See the list here.
I'm working in a team of people who connect Power BI to a db2 database in order to create reports. However, people are doing this via two different drivers, and this is creating compatibility problems when refreshing each other's reports.
I want to get everyone using the same driver for consistency and compatibility, but I'm not sure which is best for the purpose.
The two drivers are:
IBM DB2 ODBC DRIVER DB2COPY1 (appears under the 'System DSN' tab in ODBC Data Source Administrator)
IBM DB2 ODBC DRIVER IBMDBCL1 (appears under the 'User DSN' tab in ODBC Data Source Administrator)
Does anyone know what the practical difference is between the two please? And is either considered industry standard for connecting to db2 from Power BI? I don't really know anything about what the drivers are doing behind the scenes. I had a look at this page but couldn't find anything explaining why you might choose one over another.
I see the same:
The fact that you have two of them doesn't mean you have compatibility problems.
Keep going like that, you are good to go.
Does anyone know what the practical difference is between the two please?
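Yes, one is registered as a User DSN, which is visible only to your own Windows account, and the other as a System DSN, which is visible to every user on the machine.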
And is either considered industry standard for connecting to db2 from Power BI?
Yes, you should use the standard connection that Power BI provides out of the box.
Currently our client is using Crystal reports (11.x), integrated with the old .Net desktop application.
We are looking to move towards a better reporting solution: dashboards, reports with filters, drill-downs, and better export options and formatting with Excel. The client still prints reports and needs a better printing experience.
The client already has a SQL Server license, and SSRS fits most of the requirements, but they need better dashboards. They like Power BI dashboards.
Can Power BI replace SSRS for both reports and dashboards, or do we still need both to complement each other?
If your client absolutely requires printable paginated reports, you'll likely still want to complement Power BI with SSRS. I've also seen this as a pain point for some organizations, but in many cases I am able to work with a client to adopt Power BI as an alternative; one can make the case that paginated reports are not required when you have the ability to drill down and even export row-level data if desired.
In situations where all employees who need to consume the information have access to licensing and technology (even a tablet or phone will do), then you can potentially supplant SSRS/Crystal entirely with Power BI. Your biggest issues are going to be when reports are mass-printed to low-level employees in operational positions, particularly those who don't work on a computer.
This is definitely more of an expectations management conversation and working with the organization to distill down precisely what their requirements are. It can get political, and even if they don't really need paginated, printed reports, some users will insist on them.
What is a better mBaaS that supports offline sync and caching?
I am evaluating several mBaaS solutions for my hybrid mobile app under development. I looked at Kinvey, Kii, buddy, and Telerik BackEnd platform. I have also come across some open source solutions like openmobster and dreamfactory. I am looking to store data in SQLite in the mobile app and then sync it back to an online data store. Kinvey supports this, but their pricing model (per user) is not suitable for my scenario. I can see that openmobster does this, but I need to understand how. Can I host it on an Azure VM or something similar? Also, please suggest any other solution, commercial or open source, that is capable of offline sync and caching with push notifications and data storage.
DreamFactory could be a good fit for your scenario. It is open source and comes with a full 30 days of free support. After that, it's only about $25/month for a developer account, and even this isn't required to use the product; it's specifically a support package.
To address your question a little more in depth: I don't believe DreamFactory supports offline syncing at the moment, though they plan to very soon. As for SQLite, DreamFactory's DSP (DreamFactory Service Platform) has a built-in SQLite driver to connect to that DB. However, it hasn't been tested enough for them to call it a fully supported RDBMS. One of the beautiful things about DreamFactory is that you can host the DSP on Azure or Amazon EC2 instances (cloud solutions), host it locally on your own server, or even use its own free hosted edition!
I would definitely take a little time to look into DF. It doesn't seem to me like you have much to lose, especially considering it's a free open-source product!
Feel free to ask me any questions you may have about DreamFactory!
-Mark
I'm trying to figure out which database would suit my needs. My C++ project needs a database that will run on devices sold to customers. Mainly it would only log data and events to a database on a local SSD. Write speed is the most important factor, as the logging frequency can be up to 1000 Hz (1 write per 1 ms). It must be possible to access the data remotely from other devices to produce graphical visualisations of it. I have tested SQLite with a third-party server, MySQL, and Postgres. Postgres seems to be quite slow compared to the others. From what I've read, Postgres performs well as concurrency increases, but in my case concurrency is, and will remain, quite low.
I'm wondering whether there is any other database for such needs. It also feels like MySQL and Postgres would be a little overkill for such requirements. Any suggestions?
PostgreSQL is an enterprise-quality database and is not a good fit for embedded devices. MySQL, while smaller, will also be a tight fit in an embedded device. SQLite is the most common choice and is widely used in embedded devices, even quite small ones.
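For a logging workload like the one in the question (up to 1000 writes per second), the usual way to keep SQLite fast is to group inserts into transactions instead of committing every row, and to consider write-ahead logging. Below is a minimal sketch of that idea, written in Python for brevity even though the project is C++; the same pragmas and transaction pattern are available through the SQLite C API. The table name `samples`, its columns, the helper names, and the batch size of 100 are all assumptions made for illustration, not anything from the question.

```python
import sqlite3


def open_log_db(path):
    """Open (or create) the log database with settings that favour write speed."""
    conn = sqlite3.connect(path)
    # WAL lets readers keep working while the logger writes.
    conn.execute("PRAGMA journal_mode=WAL")
    # NORMAL reduces fsync calls; a power cut may lose the last few commits.
    conn.execute("PRAGMA synchronous=NORMAL")
    # Hypothetical schema: timestamp in nanoseconds, channel id, measured value.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "ts_ns INTEGER NOT NULL, channel INTEGER NOT NULL, value REAL NOT NULL)"
    )
    return conn


def flush_batch(conn, batch):
    """Write one batch of rows inside a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO samples (ts_ns, channel, value) VALUES (?, ?, ?)", batch
        )


def log_samples(conn, samples, batch_size=100):
    """Buffer incoming samples and commit them batch_size rows at a time."""
    batch = []
    written = 0
    for sample in samples:
        batch.append(sample)
        if len(batch) >= batch_size:
            flush_batch(conn, batch)
            written += len(batch)
            batch = []
    if batch:  # flush whatever is left over
        flush_batch(conn, batch)
        written += len(batch)
    return written
```

With a batch size of 100 at 1000 Hz, this commits about ten times per second instead of a thousand. The trade-off is that rows still sitting in the buffer are lost if the device loses power, so the batch size should reflect how much data you can afford to lose.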
Go for SQLite, because your requirement states that your app will be running on devices. I'd guess they are mostly mobile devices, and almost all mobile devices support SQLite, so go for it.
Consider BerkeleyDB. It is a small-footprint embedded DB with a big commercial backer if you needed support, etc. There are open source versions as well as commercially licensed ones. There's no support for SQL querying, but unless you're doing quite complex relational queries this should not be a problem. Concurrency support is excellent, though initial database configuration tends to be awkward.
There's a Microsoft-only alternative in the form of the Extensible Storage Engine, that's free and available on most versions of Windows. There are various other 'DBM'-like simple embedded databases out there, so long as you don't feel you need SQL.
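To give a feel for what working without SQL looks like in a DBM-style store, here is a minimal sketch that uses Python's standard `dbm.dumb` module as a stand-in. It is not the BerkeleyDB or Extensible Storage Engine API, and `dbm.dumb` itself is far too slow for 1000 Hz logging; it only shows the access pattern. The helper names `append_event` and `read_range` and the key layout (an 8-byte big-endian timestamp) are assumptions chosen for the example.

```python
import dbm.dumb
import struct


def append_event(db, ts_ns, payload):
    """Store one event under its timestamp; the key is 8 big-endian bytes."""
    key = struct.pack(">Q", ts_ns)
    db[key] = payload


def read_range(db, start_ns, end_ns):
    """Return (timestamp, payload) pairs with start_ns <= timestamp < end_ns.

    A plain key/value store has no query planner, so this scans every key
    and filters in application code.
    """
    found = []
    for key in db.keys():
        ts_ns = struct.unpack(">Q", key)[0]
        if start_ns <= ts_ns < end_ns:
            found.append((ts_ns, db[key]))
    return sorted(found)
```

A store is opened with `dbm.dumb.open(path, "c")` and passed to both helpers. The range scan is the part that a relational database would handle for you with an index and a `WHERE` clause, which is the trade-off mentioned above.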
You might also consider an in-memory 'NoSQL'-style database; something like Redis will be very performant.
RDM Embedded may be a good fit for you. I'm with Raima, and this product allows you to access data remotely and to use in-memory or hybrid on-disk/in-memory database capabilities (www.raima.com/in-memory-database) if you need to. What could be useful in this particular case is that RDM products can be used together to manage data between embedded, mobile, desktop, and server devices. This can be easily set up through our products: RDM Embedded, RDM Mobile, RDM Workgroup, and RDM Server.
If you want to test performance of our database quickly before downloading the full product, go to our Database Performance Popcorn Samples.