What is the best way to implement ETL jobs using WSO2?
We've been trying to leverage data services within WSO2 EI 6.4.
Our objective is to fetch data from web services as well as RDBMS and to store it to an RDBMS.
Any suggestions / ideas will be much appreciated.
In my experience with WSO2 middleware, data services may not be the best fit for ETL jobs.
We had a similar case where we wanted to copy data from a database to another application.
For that we wrote an integration (Java) web service that fetches data from the database and sends it to the application through the application's interface (a web service exposed by the application). We then configured the integration web service in the EI scheduler to run periodically and sync the data.
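The approach above (read rows from a source database, push them to the target, run the job on a schedule) can be sketched roughly as follows. This is only an illustration in Python, not a WSO2 artifact: it uses in-memory SQLite databases in place of the real source and target, and the `customers` table, its columns, and the functions `sync_customers` and `make_demo_databases` are all hypothetical. In a real setup the load step would call the application's web service, and the EI scheduler would trigger the job.

```python
import sqlite3


def sync_customers(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Copy all rows of the (hypothetical) customers table from source to target."""
    # Extract: read everything from the source table.
    rows = source.execute("SELECT id, name FROM customers").fetchall()
    # Transform: a trivial cleanup step, trimming surrounding whitespace.
    cleaned = [(row_id, name.strip()) for row_id, name in rows]
    # Load: INSERT OR REPLACE keeps repeated scheduled runs idempotent.
    with target:  # commits on success, rolls back on error
        target.executemany(
            "INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)", cleaned
        )
    return len(cleaned)


def make_demo_databases():
    """Build two in-memory databases standing in for the real systems."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for conn in (source, target):
        conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, " Alice "), (2, "Bob")]
    )
    source.commit()
    return source, target
```

Because the load step is an upsert keyed on `id`, running the job again after a failure or on the next schedule tick does not duplicate rows.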
Yes, Stream Processor is a better option, but you need to check whether it really fits your ETL case.
I have to create reports in SAP Analytics Cloud using data saved in Delta tables in Databricks on AWS. I have come across some ready-made connectors (such as this: https://www.cdata.com/kb/tech/databricks-connect-sac.rst), but as a proof of concept my team has decided to deploy a Docker container with the SAP data provider (https://docs.aws.amazon.com/sap/latest/general/data-provider-installallation.html) and to pull the data into SAC via a JDBC connection. This feels like reinventing the wheel, so I was wondering whether there are ready-made tools for this purpose. If not, and anyone has done this using a Docker container, any tips or code would be much appreciated.
This is about a Reporting Server solution.
I need some advice on choosing a product that will host a SQL Server database, a web service app (one that calls a stored procedure and runs an SSIS package, so not much processing) and SSRS. I'm not familiar with this. It needs to be available 24/7, and as I said there is not much processing, just synchronizing data (a few hundred thousand records). What would you suggest?
Requirements:
SQL Server Enterprise 2017: this will hold the database and execute the SSIS package.
We have an SSIS package that will be executed from a .NET web service app, which will execute a stored procedure on user demand.
The server needs to run Reporting Services (SSRS).
Considerations:
Storage: Database will hold around 750K records (all text).
Bandwidth: There will be synchronization (data retrieval or updates only) with an external system.
Use: the client has asked to consider a dedicated instance since they will use it at their own discretion.
Now the only issue is that, as far as I know, we can't call a stored procedure from an outside system (outside the server), or at least I have not found a way to do that. That's why I want to host both solutions in one place, so the web service app can call the stored procedure locally.
So now I'm wondering, what should I do? Should I use a full VM? How much will it cost?
If you want to go with PaaS and not have to manage infrastructure, take a look at the Azure App Service Environment, an Azure App Service feature that provides a fully isolated and dedicated environment for securely running App Service apps at high scale. It can host your:
Windows web apps
Linux web apps
Docker containers
Mobile apps
Functions
For SQL you can use Azure SQL Database Managed Instance, a new deployment option of Azure SQL Database. It offers near 100% compatibility with the latest on-premises SQL Server (Enterprise Edition) Database Engine, a native virtual network (VNet) implementation that addresses common security concerns, and a business model favorable for on-premises SQL Server customers. It is a fully isolated instance of SQL Server.
I suggest hosting a static site on Blob storage, an Azure Function on the Consumption plan to make calls to the SQL database, and a SQL database. Of course, there are alternative architectures you can use; it all depends on the detailed requirements.
I am new to the Hadoop environment, sorry if the question is obvious...
I need to develop a web service to record and read large volumes of data. Because of this requirement I thought of using a Hadoop cluster and HBase as my database.
I have designed my HBase schema to satisfy my requirements, so far so good.
The thing is that since it is a service I am developing, I would like the users of the service not to know the internal representation of the data.
I do not want the users to have to invoke a Put to a certain table, for example, to the Clients table, but instead invoke a high-abstraction method, for example, createClient().
How do I add this abstraction layer on top of HBase while keeping the reliability, the distribution, and the capacity to serve lots of simultaneous users that HBase itself offers?
Thanks a lot
Consider HBase Stargate (the HBase REST gateway) to expose a REST server. If you want to obscure the table name in the URI, perhaps proxy Stargate with a web server.
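The kind of abstraction asked for in the question, where callers use `createClient()` instead of issuing a Put against the Clients table, might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not an HBase client: `TableStore`, `InMemoryStore`, `ClientService` and the `info` column family are hypothetical names, and the in-memory store only stands in for whatever actually talks to HBase (for example a REST gateway).

```python
import uuid
from typing import Dict, Optional, Protocol


class TableStore(Protocol):
    """Hypothetical minimal interface the service layer needs from its backend."""

    def put(self, table: str, row_key: str, columns: Dict[str, str]) -> None: ...

    def get(self, table: str, row_key: str) -> Optional[Dict[str, str]]: ...


class InMemoryStore:
    """Stand-in backend for this sketch; a real one would forward to HBase."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, Dict[str, str]]] = {}

    def put(self, table: str, row_key: str, columns: Dict[str, str]) -> None:
        self._tables.setdefault(table, {}).setdefault(row_key, {}).update(columns)

    def get(self, table: str, row_key: str) -> Optional[Dict[str, str]]:
        row = self._tables.get(table, {}).get(row_key)
        return dict(row) if row is not None else None


class ClientService:
    """High-level API: callers never see table names or column families."""

    _TABLE = "Clients"
    _FAMILY = "info"  # assumed column family name

    def __init__(self, store: TableStore) -> None:
        self._store = store

    def create_client(self, name: str, email: str) -> str:
        client_id = uuid.uuid4().hex  # row key generated by the service
        self._store.put(
            self._TABLE,
            client_id,
            {f"{self._FAMILY}:name": name, f"{self._FAMILY}:email": email},
        )
        return client_id

    def get_client(self, client_id: str) -> Optional[Dict[str, str]]:
        row = self._store.get(self._TABLE, client_id)
        if row is None:
            return None
        prefix = self._FAMILY + ":"
        # Strip the column family so the storage layout does not leak out.
        return {k[len(prefix):]: v for k, v in row.items() if k.startswith(prefix)}
```

Since the service object holds no state of its own beyond the store handle, several instances can run behind a load balancer, leaving reliability and distribution to the storage layer.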
I would like to build an app and collect some events from the app, and then show some event statistics like frequency, duration etc.
I've just investigated the AWS Cognito web service, but it stores only a set of key-value pairs of a limited total size.
I can, of course, build my own REST web service on top of a database and store all my events there. But I wonder if there are AWS web services that I can leverage to build such a solution. (If someone is familiar with Azure, it would be nice to see a possible solution there too!)
Any ideas, suggestions?
I haven't used any packaged web service for this; however, I do use REST methods for statistics in my apps and find that works well: low overhead, and easy to add, change and collect.
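Once the events are collected, the statistics mentioned in the question (frequency and duration) are simple aggregations. A minimal sketch, assuming each event carries a name plus start and end timestamps in seconds; the `Event` record and the `summarize` function are hypothetical names chosen for illustration:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Event:
    name: str
    start: float  # seconds, e.g. since the epoch
    end: float


def summarize(events: Iterable[Event]) -> Dict[str, Dict[str, float]]:
    """Return per-event-name count and mean duration."""
    counts: Dict[str, int] = defaultdict(int)
    totals: Dict[str, float] = defaultdict(float)
    for event in events:
        counts[event.name] += 1
        totals[event.name] += event.end - event.start
    return {
        name: {"count": counts[name], "mean_duration": totals[name] / counts[name]}
        for name in counts
    }
```

A REST endpoint would only need to accept such records and append them to storage; the aggregation can run on read or as a periodic job.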
I would suggest having a look at the AWS Mobile Analytics service (http://aws.amazon.com/mobileanalytics/)
Have a look at the Getting Started page http://aws.amazon.com/mobileanalytics/getting-started/
Seb
I am working with the dashboard and want custom events to be shown on it. Is it possible to see custom events this way?
Yes, you can do this. WSO2 BAM is designed to receive events and store them in Apache Cassandra. It can summarise the collected data with Apache Hive/Hadoop and write the results to an RDBMS; by default H2 is used for this.
You can write a Jaggery application or use the gadget generation tool. Any third-party dashboard can also be integrated, as long as it can read from the RDBMS to which the summarised data is written.