Approach to data-analysis [closed] - clojure

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm looking to write a reporting tool. The data resides in a ~6GB PostgreSQL database. The application is an online store/catalog application that has items and orders. The stakeholders are requesting a feature that will let them search for an item and get a count of all orders for it in the last 2 years.
Some rows contain quantities and units of measure (UoM), so each row's quantity would have to be multiplied by its UoM.
It's also possible that other reporting functions will be necessary in the future.
I have not delved much into the data analysis aspect of programming. I enjoy Clojure, so I would be thrilled to find a solution that uses Clojure, but only if Clojure offers competitive tools for my needs.
Here are some options I'm considering:
merely SQL
Clojure
core.reducers
a clojure hadoop library
Hadoop
Can anyone shed some insight into these kinds of problems for me? Are there articles that you would recommend?

Hadoop is likely overkill for this project. Simply using Clojure-jdbc or Korma to read the data from the database and filter/reduce it in Clojure will most likely be fine. At work we routinely work with sequences of that size, though whether that is fast enough depends on the expected response time. You may need to do some preprocessing and caching if instantaneous responses are expected.
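For a sense of scale, the requested report (order count plus quantity times UoM for one item over the last two years) fits in a single SQL aggregate. The following is only a sketch in Python: an in-memory SQLite database stands in for the PostgreSQL one, and the `order_lines` table, its columns, and the `item_report` helper are hypothetical names, not taken from the question.

```python
import sqlite3
from datetime import date, timedelta

# In-memory SQLite stands in for PostgreSQL; the schema below is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_lines (
        item_name  TEXT,
        ordered_on TEXT,   -- ISO date string, e.g. '2024-05-01'
        quantity   REAL,
        uom_factor REAL    -- units per ordered quantity, e.g. 12 for a case of 12
    )
""")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?)",
    [
        ("widget", "2024-05-01", 2, 12),  # 24 units
        ("widget", "2023-01-15", 5, 1),   # 5 units
        ("widget", "2020-03-10", 9, 1),   # older than two years, excluded
        ("gadget", "2024-02-02", 1, 6),   # different item, excluded
    ],
)

def item_report(conn, item_name, as_of):
    """Return (order count, total units) for one item over the last ~2 years."""
    cutoff = (as_of - timedelta(days=730)).isoformat()
    sql = """
        SELECT COUNT(*), COALESCE(SUM(quantity * uom_factor), 0)
        FROM order_lines
        WHERE item_name = ? AND ordered_on >= ?
    """
    return conn.execute(sql, (item_name, cutoff)).fetchone()

# A fixed "today" keeps the example reproducible.
order_count, total_units = item_report(conn, "widget", date(2024, 6, 1))
print(order_count, total_units)
```

The same query could be issued from Clojure through a JDBC wrapper; letting the database do the aggregation avoids pulling every row into the application.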

Related

How to architect/design a knowledge base to solve issues from its history analysis? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have a ticketing system (let's say JIRA or similar) in which issues are filed against my application. My requirement is to build a knowledge base that lets me predict the solution to similar issues in the future by mining that history.
To explain further, the knowledge base should tell me how many times a given kind of issue has arisen in the past and what its root cause was most of the time (let's say 80% of the time). The repository should therefore hold an analysis of every issue, its possible root cause, and other relevant information about it.
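The core statistic described here (how often a kind of issue occurred and which root cause dominated) can be sketched in a few lines of Python. The ticket data, the category labels, and the `summarize` helper are all hypothetical; in practice the pairs would come from an export of the ticketing system.

```python
from collections import Counter, defaultdict

# Hypothetical resolved tickets: (issue category, root cause recorded at closing).
resolved_tickets = [
    ("login failure", "expired certificate"),
    ("login failure", "expired certificate"),
    ("login failure", "wrong LDAP config"),
    ("slow report", "missing index"),
    ("login failure", "expired certificate"),
    ("login failure", "expired certificate"),
]

# Count root causes separately for each issue category.
causes_by_category = defaultdict(Counter)
for category, root_cause in resolved_tickets:
    causes_by_category[category][root_cause] += 1

def summarize(category):
    """Return (occurrences, most common root cause, its share of occurrences)."""
    counts = causes_by_category[category]
    total = sum(counts.values())
    if total == 0:
        return 0, None, 0.0
    cause, hits = counts.most_common(1)[0]
    return total, cause, hits / total

print(summarize("login failure"))
```

This only works once each ticket carries a consistent category and root-cause label, which is usually the harder part of the problem.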
Just to start off to build such a knowledge base, I need to know following things:
What is the most commonly used technology/mechanism available to achieve this?
How should I architect/design a system to serve this kind of requirement?
Do I need to learn any particular language or database?
I would appreciate any information and pointers from the community that would give me a starting point in this direction.
Thanks.
I would advise against a "reinvent the wheel" approach.
There are perfectly good tools out there that cover your use cases. Look at ServiceNow or Desk.com as a CRM for tickets, or, if you just want a wiki that integrates with Jira, look at Atlassian's wiki.
You can also generate reports from Jira itself. I'm not sure why anyone would want to build their own when there are such great tools out there.

Advantages of Service Oriented Architecture [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I searched on Google, but didn't find a straight answer: what are the advantages of Service Oriented Architecture?
Can someone please highlight some of the benefits of SOA?
The two most important (at least in a practical sense) are:
Small, manageable (i.e. maintainable) components.
Services can be distributed across different machines. This makes the system highly scalable.
In other words: SOA is a good fit for the modern software development landscape, with its distributed teams and ever-changing requirements, be they functional or non-functional.
It gives your code a great deal of reusability and gives the business enormous power as well.
Let's say you start by creating a banking application. Next you need to create a mobile app for it, and on top of that you have to expose methods from your service to Master/Visa for transactions.
In that scenario, if the application has been designed with SOA in mind, a lot of code is reused, with the added advantage of centralized deployment.

Amazon DynamoDB vs relational database [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I want to store a huge amount of trading data (say a million records per day) in some kind of database. Each record is small and has a static structure: id (integer), time stamp (integer), price (float), size (float). The id field is the primary key (in relational database terms). I want to select records from a specific time range, ordered by time. This is straightforward in a relational database.
Is a NoSQL database (DynamoDB in particular) suitable for these requirements? Or should I use a traditional relational database?
I don't have any experience with NoSQL databases.
The straightforward answer to this question is yes, this fits DynamoDB's use case well. But there's a better answer: try it out and see!
I have been seeing a lot of this kind of question regarding AWS, namely "will this work?" as opposed to "how do I do this?" And the best way to answer that is to try it out and see. Unlike traditional IT, you don't have to do a lot of planning or invest a lot of capital up front to try it out. Spend a buck or two (literally that little) to run a little test program using DynamoDB and another using MySQL (or other RDBMS) and see how they work for you.
DynamoDB would work. However, given that each record is small and has a static structure, in my opinion a relational database would be equally well suited to this task, perhaps even better (which is very subjective).
Don't forget to calculate the costs of both solutions; you can easily install MySQL (free) or SQL Server (not free once you get past a certain point) on an EC2 instance and you will know exactly what your monthly costs will be.
DynamoDB is priced very differently, so you really need to quantify your read/write and storage requirements in order to know what you are in for. It is best to figure these things out ahead of time unless money is not a concern.
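For comparison, the relational version of the time-range query from the question is short. This is a sketch in Python using an in-memory SQLite database as a stand-in for MySQL or another RDBMS; the `trades` table, the `trades_ts` index, and the `trades_between` helper are assumed names built from the record layout in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trades (
        id    INTEGER PRIMARY KEY,
        ts    INTEGER NOT NULL,   -- time stamp as an integer
        price REAL NOT NULL,
        size  REAL NOT NULL
    )
""")
# An index on the time stamp lets range scans avoid reading the whole table.
conn.execute("CREATE INDEX trades_ts ON trades (ts)")

conn.executemany(
    "INSERT INTO trades (id, ts, price, size) VALUES (?, ?, ?, ?)",
    [
        (1, 1000, 10.5, 100.0),
        (2, 1005, 10.6, 50.0),
        (3, 990, 10.4, 75.0),
        (4, 1020, 10.7, 20.0),
    ],
)

def trades_between(conn, start_ts, end_ts):
    """Return trades with start_ts <= ts <= end_ts, ordered by time."""
    sql = """
        SELECT id, ts, price, size
        FROM trades
        WHERE ts BETWEEN ? AND ?
        ORDER BY ts
    """
    return conn.execute(sql, (start_ts, end_ts)).fetchall()

in_range = trades_between(conn, 990, 1005)
print(in_range)
```

In DynamoDB the equivalent access pattern depends on key design, since a Query needs an equality condition on the partition key and can only apply a range condition to the sort key, so a small trial of both approaches is a reasonable way to compare them.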

Advice on C++ database program [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I want to build a simple program that searches through a local database and returns a list of results. Essentially the program will be a searchable archive. This will be my first project of this type and I'm looking for some advice.
Fast reading of the database is important. Writing can be slow.
There will be at most a few thousand records, and most records will probably contain less than 3 kb in text.
Records should be flexible when it comes to their fields. Some records will have the field "abc", others will not. The number of fields per record may vary.
Should I simply write my data structures in C++, and then serialize them? Does it make sense to use an existing (lightweight) database framework? If so, can you recommend any that are free and easy to use and preferably written in modern C++?
My target platform is Windows and I'm using the Visual Studio compiler.
Edit: the consensus is to use SQLite. Any recommendations as to which of the many SQLite C++ wrappers to use?
As commented by @Joachim, I would suggest SQLite. I have used it in C++ projects and it's straightforward to use. You basically put two files in your project, sqlite3.c and sqlite3.h, and then start coding to the interface (read the last paragraph of http://www.sqlite.org/selfcontained.html). You don't have to worry about configuring a database server. It's fast and lightweight. Read about transactions in SQLite if you need to speed some operations up.
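One way to handle records whose fields vary is a separate name/value table keyed by record. The sketch below uses Python's built-in sqlite3 module purely for brevity; the SQL itself would be the same when issued from C++ through the SQLite C API. The table names and the `add_record` and `search` helpers are hypothetical, and this layout is just one possible design, not something the answer prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- One row per (record, field), so each record can carry different fields.
    CREATE TABLE record_fields (
        record_id INTEGER NOT NULL REFERENCES records(id),
        name      TEXT NOT NULL,
        value     TEXT NOT NULL,
        PRIMARY KEY (record_id, name)
    );
""")

def add_record(conn, title, fields):
    """Insert a record and its fields inside a single transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO records (title) VALUES (?)", (title,))
        record_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO record_fields (record_id, name, value) VALUES (?, ?, ?)",
            [(record_id, name, value) for name, value in fields.items()],
        )
    return record_id

def search(conn, field_name, needle):
    """Return titles of records whose given field contains the search text."""
    sql = """
        SELECT r.title
        FROM records r
        JOIN record_fields f ON f.record_id = r.id
        WHERE f.name = ? AND f.value LIKE ?
        ORDER BY r.title
    """
    return [row[0] for row in conn.execute(sql, (field_name, f"%{needle}%"))]

add_record(conn, "First entry", {"abc": "hello world", "author": "Ann"})
add_record(conn, "Second entry", {"author": "Bob"})  # no "abc" field
matches = search(conn, "abc", "hello")
print(matches)
```

Grouping the inserts for a record into one transaction is also the usual way to speed up bulk writes in SQLite, and with only a few thousand small records reads should be fast.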

Mantis and Redmine, which one is better for issue tracking? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I'm considering using Mantis or Redmine to manage projects (issue tracking).
I know both are really good.
For now, I won't connect it to SVN or Git (that may happen later).
The main purpose is tracking business issues with co-workers.
Please recommend one of them, or suggest another tool.
Thanks.
I can recommend redmine. I've been using it for more than 2 years, with 25-50 simultaneous users and more than 50 projects.
I went through a lot of updates without ever having any problems.
The database is properly normalized, so if you ever need to retrieve any data, you will be able to do so.
Numerous plugins exist which may cover special needs, if there are any.
Edit: In the meantime, I had to change over to Jira, but I'd go back to redmine anytime if I could.
I've never used Redmine, but our distributed team has been using Mantis for about 7-8 years on many projects. One of its benefits is its simplicity. We've even written a couple of our own extensions, e.g. a Kanban board (one of the Agile approaches) that is widely used in our process.
Sometimes I think it looks slightly outdated next to other modern tools, but it really works for us and we can extend it with our own PHP code.