How to get a list of accounts that changed in a given time frame - UniVerse

We are planning to export data from UniVerse and load it into SQL (mirroring). It would be great if we could load only the accounts that have changed.

UniVerse offers 2 technologies that can do this for you, so you don't need to reinvent the wheel.
Talk to your support provider about Replication with EDA.
EDA stands for 'External Database Access'. It is a built-in technology that will automatically send any record updates to a foreign database, such as SQL Server, DB2 or Oracle.
EDA by itself will convert your local UniVerse tables to remote SQL tables. Most people won't want this; they will still want UniVerse tables locally (for performance reasons). You can use Replication (or Single Server Replication) to achieve the best of both worlds.
You can read the EDA manual on Rocket's site.
Note that you will need to be on the latest version of UniVerse. Luckily, UniVerse is highly backwards compatible, and Rocket's support and professional services teams are experienced in this.

Related

What is the difference between 'SAS' and 'Salesforce'

I will be starting full-time at a company where I was told that the application is developed using 'Sas' and 'salesforce'. What is the difference between the two?
And which online resources would you recommend for learning more about them?
SAS is software for statistical analysis. If your company/job description doesn't look like working with large sets of data & complex reporting that's probably not it.
They probably mean the SaaS (Software as a Service) model, also known as "the cloud", cloud computing etc. You write the program (or use / modify an existing one) but you don't buy servers or worry about network connections, electricity costs or load balancing (spikes in traffic will not cause your website to go down). Many apps operate in this model, for example Microsoft's Azure cloud (or even the online versions of MS Office). There's Siebel Oracle on Demand CRM, Microsoft Dynamics, and I think SAP also has a SaaS offering...
It's a big topic, I'm simplifying a lot here. And then there are Platform as a Service things too (PaaS) where they give you "just" the hosting etc but no base application to build on top of. You write everything you need from scratch and upload it. Think Heroku or Amazon Web Services (AWS).
Salesforce is "just" one more SaaS application. You start with a base application & database, similar to all other clients in the world. You can install plugins to it (some free, some paid), configure it yourself, write custom code if your functionality is too complex... You can do a lot with just clicks & drag & drop, but if you need to code stuff then JavaScript (for client-side) and Apex (for server-side) will be your friends. Apex is a bit similar to Java.
Where to start... Trailhead is a good source of self-paced training. You can sign up for a free Salesforce Developer Edition (it has almost all the features of the paid one but limited storage space) and try to pass some courses... Or in SF help & training there should be tons of videos (actually, the whole left menu there, "getting started with salesforce", might be good).

What is the difference between AWS Elasticsearch and AWS Redshift

I read in the documentation that both are for data analysis and both use a cluster structure, but I don't understand how their use cases differ.
Amazon Elasticsearch is a popular open-source search and analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analytics. (Source: Amazon Elasticsearch)
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. (Source: Amazon Redshift)
Amazon Redshift is a hosted data warehouse product, while Amazon Elasticsearch is a hosted ElasticSearch cluster.
Redshift is based on PostgreSQL and (afaik) mostly used for BI purposes and other compute-intensive jobs, while Amazon Elasticsearch is an out-of-the-box managed ElasticSearch cluster (which you cannot use to run SQL queries, since ES is a NoSQL database).
Both Amazon Redshift and Amazon ES are managed services, which means you don't need to do anything in order to manage your servers (this is what you pay for). Using the AWS Console you can add a new cluster, and you don't need to run any commands in order to install any software - you just need to choose which servers to run your cluster on (number of nodes, disk, RAM, etc).
If you are not familiar with ElasticSearch you should check their website.
Edit: It is now possible to write SQL queries on ElasticSearch: SQL Support for AWS ElasticSearch
I agree with @IMSoP's assertions above...
To compare the two is like comparing an elephant and a tiger - you're not really asking the right question quite yet.
What you should really be asking is - what are my requirements for my use cases to best fulfill my stakeholder / customer needs, first, and then which data storage technology best aligns with my requirements second...
To be clear - whether speaking of AWS ElasticSearch Service, or FOSS / Enterprise ElasticSearch (which have significant differences between them, even) - ElasticSearch is NOT a Relational Database (RDBMS), nor is it quite a NoSQL (Document Store) Database, either...
ElasticSearch is a Search Engine / Index. It does some things very well, for very specific use cases. However, and most significantly, unlike RDBMS data models, ElasticSearch or NoSQL are not going to provide you with FULL ACID Compliance, or Transactional Statement Processing. So if your use case prioritizes data integrity, constrainability, reliability, auditability, regulatory compliance, recoverability (to Point in Time, even), and normalization of the data model for performance and least repetition of data while providing deep cardinality and enforcing model constraints for optimal integrity, "NoSQL and Elastic are not the Droids you're looking for..." and you should be implementing an RDBMS solution. As already mentioned, the AWS Redshift Service is based on PostgreSQL - which is one of the most popular OpenSource RDBMS flavors out there, just offered by AWS as a fully managed solution / service for their customers.
Elastic falls between the RDBMS and NoSQL categories, as it is a Search Engine / Index that works most optimally with "single index" type use cases, where A LOT of content is indexed all at once and those documents aren't updated very frequently after the initial bulk indexing. But perhaps the most important thing I could stress is that in my experience it typically does not scale very cost effectively (even with managed cluster services) if you want your clusters to perform well, not degrade over time, retain large historical datasets, and remain highly available for your consumers - and for most it will likely become cost PROHIBITIVE VERY fast. That said, Elastic Search DOES still have very optimal use cases, so it is always worth evaluating against your unique requirements - just keep scalability and cost in mind while doing so.
Lastly let's call NoSQL what it is, a Document Store that stores collections of documents (most often in JSON format) and while they also do indexing, offer some semblance of an Authentication and Authorization model, provide CRUD operability (or even SQL support nowadays, which makes the career Enterprise Data Engineer in me giggle, that SQL is now the preferred means of querying data from their NoSQL instances! :D )- Still NOT a traditional database, likely won't provide you with much control over your data's integrity - BUT that is precisely what "NoSQL" Document Stores were designed to work best for - UNSTRUCTURED DATA - where you may not always know what your data model is going to look like from the start, or your use case prioritizes data model flexibility over enforcing data integrity in general (non mission critical data). Last - while most modern NoSQL Document Stores may have SOME features that appear on the surface to resemble RDBMS, I am not aware of ANY in that category at current that could claim to offer all that a relational database does, with Oracle MySQL's DocumentStore being probably the best of both worlds in my opinion (and not just because I've worked with it every day for the last decade, either...).
So - I hope Developers with similar questions come across this thread, and after reading are much better informed to make the most optimal design decisions for their use cases - because if we're all being honest with ourselves - everything we do in our profession is about data - either generating it, transporting it, rendering it, transforming it....it all starts and ends with data, and making the most optimal data storage decisions for your applications will literally define the rest of your project!
Cheers!
This strikes me as like asking "What is the difference between apples and oranges? I've heard they're both types of fruit."
AWS has an overview of the analytics products they offer, which at the time of writing lists 21 different services. They also have a list of database products which includes Redshift and 10 others. There's no particularly obvious reason why these two should be compared, and the others on both pages ignored.
There is inevitably a lot of overlap between the capabilities of these tools, so there is no way to write an exhaustive list of use cases for each. Their strengths and weaknesses, and the other tools they integrate easily with, will change over time, and some differences are a matter of "taste" or "style".
Regarding the two picked out in the question:
Elasticsearch is a product built by elastic.co, which AWS can manage the installation and configuration for. As its name suggests, its core functionality is based around search - it can be used to build a flexible but fast product search for an e-commerce site, for instance. It's also commonly used along with other tools to search and aggregate logs and monitoring data.
Redshift is a database system built by AWS, based on PostgreSQL but optimised for extremely large data sets. It is designed for "data warehouse" applications, where you want to write complex logical queries against the data, like "how many people in each city bought both a toothbrush and toothpaste, this year compared to last year".
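To make the "toothbrush and toothpaste" example concrete, here is a minimal sketch of that kind of warehouse-style query. It uses Python's built-in sqlite3 module purely as a stand-in so it can run anywhere; it is not Redshift itself, and the table names, column names and sample rows (customers, purchases, the cities) are all made up for illustration.

```python
import sqlite3


def both_products_by_city(conn):
    """Count customers per city and year who bought both products in that year."""
    query = """
        SELECT c.city, p.year, COUNT(*) AS buyers
        FROM (
            -- customers who bought BOTH products within the same year
            SELECT customer_id, year
            FROM purchases
            WHERE product IN ('toothbrush', 'toothpaste')
            GROUP BY customer_id, year
            HAVING COUNT(DISTINCT product) = 2
        ) AS p
        JOIN customers AS c ON c.id = p.customer_id
        GROUP BY c.city, p.year
        ORDER BY c.city, p.year
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data, in-memory only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE purchases (customer_id INTEGER, product TEXT, year INTEGER);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Leeds"), (2, "Leeds"), (3, "York")])
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", [
    (1, "toothbrush", 2023), (1, "toothpaste", 2023),
    (2, "toothbrush", 2023),
    (2, "toothbrush", 2024), (2, "toothpaste", 2024),
    (3, "toothbrush", 2024), (3, "toothpaste", 2024),
])

rows = both_products_by_city(conn)
for city, year, buyers in rows:
    print(city, year, buyers)
```

The same query shape (filter, group, join, group again) is what a data warehouse is built to run over very large tables; a search engine is aimed at a different kind of question, such as "find the documents that best match these words".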
Rather than trying to make an abstract comparison of all the different services available, it makes more sense to start from the use case which you actually have, and see which tool best fits that need.

What is a better mBaaS that supports offline sync and caching?

I am evaluating several mBaaS solutions for my hybrid mobile app under development. I looked at Kinvey, Kii, buddy, and Telerik BackEnd platform. I have also come across some open source solutions like openmobster and dreamfactory. I am looking to store data in SQLite in the mobile app and then sync it back with an online data store. Kinvey has this support, but their pricing model (per user) is not suitable in my scenario. I can see that openmobster does this, but what I need to understand is how. Can I host it on an Azure VM or something? Also, please suggest any other solution, commercial or open source, capable of doing offline sync and caching with push notifications and data storage.
DreamFactory could be a good fit for your scenario. It is open source and comes with a full 30 days of free support. After which it's only like $25/month for a developer account - and this isn't even a requirement to use its product. It's specifically a support package.
To address your question a little more in-depth... I don't believe DreamFactory supports offline syncing at the moment, though they plan to very soon. In regards to SQLite, DreamFactory's (DSP) product has a built-in SQLite driver to connect to that DB. However, it hasn't been tested enough for them to say it is a fully supported RDBMS. One of the beautiful things about DreamFactory is you're able to host the DSP (DreamFactory Service Platform) on Azure and Amazon EC2 instances (cloud solutions), host locally on your own server, or even use its own free hosted edition!
I would definitely take a little time to look into DF. It doesn't seem to me like you have much to lose. Especially, considering it's a free open-source product!
Feel free to ask me any questions you may have about DreamFactory!
-Mark

Server-based embedded database

I'm trying to figure out which database would suit my needs. My C++ project needs a database that will be running on devices sold to customers. Mainly it would only log data and events to a database on a local SSD. Write speed is the most important factor, as the logging frequency can be up to 1000 Hz (1 write per 1 ms). It must be possible to access the data remotely from other devices to make graphical visualisations of it. I have tested SQLite with a 3rd party server, MySQL and Postgres. Postgres seems to be quite slow compared to the others. As I've read, Postgres does better as concurrency increases, but in my case concurrency is and will be quite low.
I'm wondering whether there is any other database for such needs. It also feels like MySQL and Postgres would be a little overkill for such requirements. Any suggestions?
PostgreSQL is an enterprise quality database, and not fit for embedded devices. MySQL, while smaller, will also be a tight fit in an embedded device. SQLite is the most common, and is widely used in embedded devices, even quite small ones.
Go for SQLite, because your requirement states that your app will be running on DEVICES. I guess they are mostly mobile devices, and almost all mobile devices support SQLite... so go for it...
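If you do go with SQLite for 1000 Hz logging, the usual advice is to avoid one transaction per row and to group writes into batches instead. Below is a minimal sketch of that idea using Python's built-in sqlite3 module (the question is about C++, but the same pattern applies through the SQLite C API). The class name, table layout and batch size are assumptions made up for the example, and no throughput is claimed; measure on your own hardware.

```python
import sqlite3
import time


class BatchedLogger:
    """Buffers rows in memory and writes each batch in a single transaction."""

    def __init__(self, path, batch_size=1000):
        self.conn = sqlite3.connect(path)
        # WAL mode lets readers run while a write is in progress
        # (only takes effect for on-disk databases).
        self.conn.execute("PRAGMA journal_mode=WAL")
        # Fewer fsync calls than the default FULL setting.
        self.conn.execute("PRAGMA synchronous=NORMAL")
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS samples (ts REAL, channel TEXT, value REAL)"
        )
        self.batch_size = batch_size
        self.buffer = []

    def log(self, channel, value, ts=None):
        """Queue one sample; write to disk once the batch is full."""
        self.buffer.append((time.time() if ts is None else ts, channel, value))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all queued samples in one transaction."""
        if not self.buffer:
            return
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT INTO samples VALUES (?, ?, ?)", self.buffer
            )
        self.buffer.clear()

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]


# Usage: 250 samples with a batch size of 100 gives two automatic
# flushes, leaving 50 rows queued until the final flush().
logger = BatchedLogger(":memory:", batch_size=100)
for i in range(250):
    logger.log("sensor_a", float(i), ts=i / 1000.0)
pending = len(logger.buffer)
logger.flush()
total = logger.count()
```

The trade-off is that rows still sitting in the buffer are lost if the device loses power before the next flush, so the batch size is a balance between write speed and how much data you can afford to lose.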
Consider BerkeleyDB. It is a small-footprint embedded DB with a big commercial backer if you needed support, etc. There are open source versions as well as commercially licensed ones. There's no support for SQL querying, but unless you're doing quite complex relational queries this should not be a problem. Concurrency support is excellent, though initial database configuration tends to be awkward.
There's a Microsoft-only alternative in the form of the Extensible Storage Engine, that's free and available on most versions of Windows. There are various other 'DBM'-like simple embedded databases out there, so long as you don't feel you need SQL.
You might also consider an in-memory 'NoSQL'-style database; something like Redis will be very performant.
RDM Embedded may be a good fit for you. I'm with Raima and this product allows you to access data remotely and you can utilize the in-memory or a hybrid on-disk/in-memory database capabilities (www.raima.com/in-memory-database) if you need to. What could be useful for you in this particular case is that RDM products can be used together to manage data between embedded, mobile, desktop or server devices. This can be easily setup through our products, RDM Embedded, RDM Mobile, RDM Workgroup and RDM Server.
If you want to test performance of our database quickly before downloading the full product, go to our Database Performance Popcorn Samples.

Data Warehouse and Django

This is more of an architectural question than a technological one per se.
I am currently building a business website/social network that needs to store large volumes of data and use that data to draw analytics (consumer behavior).
I am using Django and a PostgreSQL database.
Now my question is: I want to expand this architecture to include a data warehouse. The ideal would be: the operational DB would be the current Django PostgreSQL database, and the data warehouse would be something additional, preferably in a multidimensional model.
We are still in a very early phase, we are going to test with 50 users, so something primitive such as a one-column table for starters would be enough.
I would like to know if somebody has experience with this situation and could recommend a framework to create a data warehouse, all while maintaining the operational DB with the Django models for ease of use (if possible).
Thank you in advance!
Here are some cool Open Source tools I used recently:
Kettle - great ETL tool, you can use this to extract the data from your operational database into your warehouse. Supports any database with a JDBC driver and makes it very easy to build e.g. a star schema.
Saiku - nice Web 2.0 frontend built on Pentaho Mondrian (MDX implementation). This allows your users to easily build complex aggregation queries (think Pivot table in Excel), and the Mondrian layer provides caching etc. to make things go fast. Try the demo here.
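To show what "extract from the operational database into a star schema" means in practice, here is a small hand-rolled sketch in Python with the built-in sqlite3 module. It is not Kettle and not how Kettle works internally; it only illustrates the shape of the result: one dimension table and one fact table. All table and column names (users, page_views, dim_user, fact_page_view) are hypothetical.

```python
import sqlite3


def load_star_schema(oltp, dw):
    """Copy operational rows into one dimension table and one fact table."""
    dw.executescript("""
        CREATE TABLE IF NOT EXISTS dim_user (
            user_key INTEGER PRIMARY KEY,
            username TEXT UNIQUE,
            country  TEXT
        );
        CREATE TABLE IF NOT EXISTS fact_page_view (
            user_key  INTEGER REFERENCES dim_user(user_key),
            view_date TEXT,
            views     INTEGER
        );
    """)
    with dw:  # one transaction for the whole load
        # Dimension: one row per user, skipping users already loaded.
        for username, country in oltp.execute(
            "SELECT username, country FROM users"
        ):
            dw.execute(
                "INSERT OR IGNORE INTO dim_user (username, country) VALUES (?, ?)",
                (username, country),
            )
        # Fact: page views pre-aggregated per user per day.
        daily = oltp.execute("""
            SELECT u.username, date(v.viewed_at), COUNT(*)
            FROM page_views AS v
            JOIN users AS u ON u.id = v.user_id
            GROUP BY u.username, date(v.viewed_at)
        """).fetchall()
        for username, day, views in daily:
            key = dw.execute(
                "SELECT user_key FROM dim_user WHERE username = ?", (username,)
            ).fetchone()[0]
            dw.execute(
                "INSERT INTO fact_page_view VALUES (?, ?, ?)", (key, day, views)
            )


# Hypothetical operational data standing in for the Django database.
oltp = sqlite3.connect(":memory:")
oltp.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, country TEXT);
    CREATE TABLE page_views (user_id INTEGER, viewed_at TEXT);
    INSERT INTO users VALUES (1, 'ana', 'PT'), (2, 'bob', 'UK');
    INSERT INTO page_views VALUES
        (1, '2024-01-01 10:00:00'),
        (1, '2024-01-01 11:00:00'),
        (2, '2024-01-02 09:00:00');
""")
dw = sqlite3.connect(":memory:")
load_star_schema(oltp, dw)

views_by_country = dw.execute("""
    SELECT d.country, SUM(f.views)
    FROM fact_page_view AS f
    JOIN dim_user AS d ON d.user_key = f.user_key
    GROUP BY d.country
    ORDER BY d.country
""").fetchall()
```

Note that this sketch would insert duplicate fact rows if run twice over the same data; a real ETL job (in Kettle or otherwise) needs some way to load only new rows.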
My answer does not necessarily apply to data warehousing. In your case I see the possibility to implement a NoSQL database solution alongside an OLTP relational storage, which in this case is PostgreSQL.
Why consider NoSQL? In addition to the obvious scalability benefits, NoSQL databases offer a number of advantages that will probably apply to your scenario. For instance, the flexibility of having records with different sets of fields, and key-based access.
Since you're still in "trial" stage you might find it easier to decide for a NoSQL database solution depending on your hosting provider. For instance AWS have SimpleDB, Google App Engine provide their own DataStore, etc. However there are plenty of other NoSQL solutions you can go for that have nice Python bindings.
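As a rough illustration of what "records with different sets of fields" and "key-based access" mean, here is a toy in-memory sketch in Python. It is not the API of SimpleDB, App Engine's DataStore or any real NoSQL product; the class and the keys are made up for the example.

```python
import json


class DocumentStore:
    """Toy key-value document store: each value is a JSON document."""

    def __init__(self):
        self._docs = {}

    def put(self, key, doc):
        # Documents are stored serialized, so no schema is enforced.
        self._docs[key] = json.dumps(doc)

    def get(self, key):
        raw = self._docs.get(key)
        return None if raw is None else json.loads(raw)


store = DocumentStore()
# Two records under the same kind of key, with different sets of fields.
store.put("user:1", {"name": "Ana", "interests": ["cycling", "jazz"]})
store.put("user:2", {"name": "Bob", "company": "Acme", "employees": 12})

ana = store.get("user:1")
bob = store.get("user:2")
missing = store.get("user:3")
```

Nothing here stops one record from storing "employees" as a number and another as text, which is the flexibility being described, and also the reason integrity checks end up in application code.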