I’m preparing my graduation project in computer science. I built this website and it's running perfectly, but my supervisor asked me to apply data mining to it.
But I don’t understand what I should do.
The website is a social network: each user has a profile and a blog, plus access to some e-books that can only be downloaded by registered users. The site also has a music server from which a registered user can download songs or add them as favorites on their profile page, and it shows ads (I used the OpenX script). These are most of the services where I could perform data mining; the website is www.sy-stu.com.
I need ideas: what should I do, and what is the best way to present it in the interview?
You could ask your professor what he had in mind when he asked for data mining. Data mining algorithms can perform many different tasks; you first need to define what you want to accomplish, and then find suitable algorithms and check what is technically possible.
Some ideas that came to my mind about using data mining in your project:
You could use data mining to find which songs (e-books, etc.) a user might favorite, based on other people's favorite songs (find similarities; association rules would probably be a good algorithm for this - see the sketch after this list).
You could use a clustering algorithm to group users based on some parameters and suggest that they connect with other people from the same group (if your site has such a feature).
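For the first idea, here is a minimal sketch in Python of a co-occurrence style "users who favorited X also favorited Y" recommender (the data below is made up; a real association-rule miner such as Apriori would go further, but this shows the principle):

    from collections import Counter

    # favorites: user_id -> set of song ids, e.g. loaded from your favorites table (hypothetical data)
    favorites = {
        1: {"song_a", "song_b", "song_c"},
        2: {"song_a", "song_c"},
        3: {"song_b", "song_d"},
    }

    def recommend(user_id, favorites, top_n=5):
        """Suggest songs favorited by users who share at least one favorite with this user."""
        mine = favorites[user_id]
        counts = Counter()
        for other, theirs in favorites.items():
            if other != user_id and mine & theirs:   # overlapping taste
                counts.update(theirs - mine)         # count songs this user does not have yet
        return [song for song, _ in counts.most_common(top_n)]

    print(recommend(2, favorites))   # e.g. ['song_b']

The same idea works for e-books or any other item users can favorite.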
Good luck!:)
Firstly, ask for clarification from your supervisor. Don't say 'What do you mean?', but ask 'Are you expecting something like this?' because it shows that you've at least thought about it.
If you can't think of anything, or your supervisor is vague, perform some simple data retrieval and analysis, e.g.
most active members
most and least popular songs and books
number of ads clicked
most popular website features
Elementary analysis should suffice - you aren't doing a statistics degree. Work out the most songs downloaded in a day or by a single user, the average number of songs per user, how many users visit each day, and how many sign up and never return.
The purpose is to demonstrate that your website is logging all activity, so that when you are asked 'how many books did the 20 most active users download in June' you will be able to work out the answer.
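As a concrete illustration, assuming you log downloads into a table like downloads(user_id, item_type, downloaded_at) (the schema, file name, and 2013 dates are made up for this sketch), that June question could be answered with a query along these lines:

    import sqlite3

    # Hypothetical schema: downloads(user_id INTEGER, item_type TEXT, downloaded_at TEXT in ISO format)
    conn = sqlite3.connect("website.db")   # placeholder database file

    query = """
    SELECT user_id, COUNT(*) AS books_in_june
    FROM downloads
    WHERE item_type = 'book'
      AND downloaded_at LIKE '2013-06%'
      AND user_id IN (
          SELECT user_id FROM downloads
          GROUP BY user_id
          ORDER BY COUNT(*) DESC
          LIMIT 20
      )
    GROUP BY user_id;
    """

    for user_id, books in conn.execute(query):
        print(user_id, books)
    conn.close()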
The alternative is a website that just runs while you have no knowledge of how your users are behaving and what they are doing, which means you aren't able to focus on the things they find important.
I don't know exactly what kind of data you are trying to mine, but have you checked out Google Analytics? It is very easy to set up: once you register, all you need to do is include the provided JavaScript snippet in your web pages. Google Analytics will give you plenty of statistics about access to your site and its visits. Is that what you need? The data it produces is very easy to read as well, and I reckon it would be suitable for you to present.
I need to write an API that provides access to data served as HTML documents from a web server. My users need to be able to perform queries over the data.
Say a website has a page that lists items and their owners, plus an additional set of profile pages that give each owner's reputation. An example query I may need to answer is "Give me the IDs and owners of all items submitted in 2013 whose owners have a reputation of at least 10".
Given a query to answer, I need to screen-scrape only the parts of the website required to answer it, and ideally cache the obtained information for use with future queries.
I have no problem writing the screen-scraping part, but I am struggling with designing the storage/query/cache part. Is there something about Clojure/Datomic that makes it especially suitable for this kind of data processing? I have been pointed in this direction before.
It seems like a nice challenge, but I'm not sure about a few things: a) would you like to expose a Datalog query box to your users and make them learn a Datalog-like syntax? b) exactly what kind of results do you wish to cache: raw DB responses, HTML-formatted text, JSON?
Anyway, I suggest you install and play a little with the Datomic console, if you haven't before, to get a feel for it; it seems to me the closest thing to what you want to achieve at the moment: https://www.youtube.com/watch?v=jyuBnl0XQ6s http://blog.datomic.com/2013/10/datomic-console.html
For the API, I suggest you use http://clojure-liberator.github.io/liberator/ as it provides sane defaults for implementing REST services and lets you focus on your app's behaviour.
I am sure this question may seem a bit lacking, but I literally do not know where to begin. I want to develop a solution that will allow me to manage ALL of my Amazon and Rakuten/Buy.com inventory from my own website.
My main concern is keeping the inventory in sync, so the process would be as follows (a rough code sketch of this loop follows the list):
1. Fetch Amazon orders sold today
a. Subtract the respective quantities
2. Fetch Rakuten orders sold today
a. Subtract the respective quantities
3. Update the internal DB of products
a. Send out updated feeds to Amazon and Rakuten.
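Not a definitive implementation, just a sketch of that loop in Python; every callable here (fetch_amazon_orders, push_rakuten_feed, etc.) is a placeholder you would back with the respective marketplace APIs:

    def sync_inventory(stock, fetch_amazon_orders, fetch_rakuten_orders,
                       push_amazon_feed, push_rakuten_feed):
        """One pass of the sync loop described above; all callables are placeholders."""
        # 1. and 2. Fetch today's orders from both marketplaces as (sku, qty_sold) pairs
        orders = fetch_amazon_orders() + fetch_rakuten_orders()

        # Subtract the respective quantities from the internal stock table
        for sku, qty_sold in orders:
            stock[sku] = stock.get(sku, 0) - qty_sold

        # 3. Push the updated quantities back out as feeds
        push_amazon_feed(stock)
        push_rakuten_feed(stock)
        return stock

    # Toy usage with in-memory data
    stock = {"SKU-1": 10, "SKU-2": 4}
    sync_inventory(stock,
                   fetch_amazon_orders=lambda: [("SKU-1", 2)],
                   fetch_rakuten_orders=lambda: [("SKU-2", 1)],
                   push_amazon_feed=print,
                   push_rakuten_feed=print)

You would typically run something like this on a schedule (cron or similar) and persist the stock table in your own database instead of a dict.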
Again, I apologize if this question seems a bit lacking, but I am having trouble understanding how exactly to implement this; any tips would be appreciated.
For the Amazon part look at https://developer.amazonservices.com/
For Rakuten, I think you will be able to do what you want via FTP access; I'm still researching this. If I find out more, I'll respond with a better answer.
In order to process orders, you'll need to be registered with Rakuten in order to get an authorisation token. For the API docs etc., try sending an email to support#rakuten.co.uk.
Incidentally, to send out updated feeds, you'll need to use the inventory API in order to update stock quantities (given that you'll be selling the same items on Amazon etc.).
I'm a member of a Facebook group that holds contests for artists on a weekly basis. There are over 5000 members in this group; fortunately, not all of them participate, because at the end of each week there is a vote for the favorite/best artwork of that week, and the admins have to go through the images one by one and manually count votes. Voting is limited to those who participate in the contest, so each artist places their vote as their image description... or part of it, anyway.
I wanted to create an app that would retrieve the photo info from the album to build a list of the submitted images and the artists to make counting votes much easier.
I have, in fact, created such an application, but it seems it only works on personal profiles and pages... not groups, due to the need to be on a "white list". It strikes me as strange that when a group is "OPEN" an app isn't even allowed to read data there, but OK.
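For reference, the kind of call the app makes against a profile or page album (but which is refused for group albums) is roughly this; the album ID and access token are placeholders:

    import requests  # pip install requests

    ALBUM_ID = "ALBUM_ID_HERE"          # placeholder
    ACCESS_TOKEN = "ACCESS_TOKEN_HERE"  # placeholder token

    url = "https://graph.facebook.com/v2.0/%s/photos" % ALBUM_ID
    params = {"fields": "from,name", "access_token": ACCESS_TOKEN, "limit": 100}

    entries = []
    while url:
        data = requests.get(url, params=params).json()
        for photo in data.get("data", []):
            artist = photo.get("from", {}).get("name")
            caption = photo.get("name", "")          # the description, which holds the vote
            entries.append((artist, caption))
        url = data.get("paging", {}).get("next")     # follow pagination if there is more
        params = {}                                  # the "next" URL already includes the parameters

    print(entries)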
My question is whether it is possible to get an app on that white list, or at least to build an app specifically for a group for this purpose. I have been unsuccessful in my attempts to find any information on this subject, so I am asking you all here at Stack Overflow, since you all seem to be in bed with Facebook in some way. I am just hoping to get a reply from someone who knows something, rather than guesses or assumptions.
The last contest had 325 participants and it was entirely too many for a poll.
I do not know if this topic has already been addressed... I used the search, but Stack Overflow uses Google for site search, and because these topics are paginated, Google has indexed results as being on a certain page; when you go there, the topic is nowhere to be found... not very helpful.
Anyway, thanks for your time, and I would be most appreciative of getting a reply rather than having the post buried to the point where nobody will see it.
I was wondering the same thing for a similar reason.
It appears not:
user_groups
Provides access to the list of groups the user is a member of as the groups connection.
This permission is reserved for apps that replicate the Facebook client on platforms that don’t have a native client.
https://developers.facebook.com/docs/facebook-login/permissions/v2.0#reference-extended-profile
If anybody discovers otherwise I'd love to know.
I don't know the exact answer to whether apps can be built for groups without being white-listed, but here is an alternative solution.
If the purpose of this exercise is to limit some functionality (or entries) to users who are members of the group, then why not request the "user_groups" permission from the user, access their groups through the Graph API, flag users as either members of the group or not, and extend the functionality accordingly? Perhaps you could even limit registration to only those who are currently members of the group.
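A minimal sketch of that membership check, assuming the user has granted the user_groups permission; the group ID and token are placeholders:

    import requests  # pip install requests

    GROUP_ID = "TARGET_GROUP_ID"        # placeholder: the group you care about
    ACCESS_TOKEN = "USER_ACCESS_TOKEN"  # placeholder: token carrying the user_groups permission

    def is_member_of_group(access_token, group_id):
        """Return True if the authenticated user belongs to the given group."""
        url = "https://graph.facebook.com/v2.0/me/groups"
        params = {"access_token": access_token, "limit": 100}
        while url:
            data = requests.get(url, params=params).json()
            if any(group.get("id") == group_id for group in data.get("data", [])):
                return True
            url = data.get("paging", {}).get("next")  # keep paging through the user's groups
            params = {}
        return False

    print(is_member_of_group(ACCESS_TOKEN, GROUP_ID))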
I'm writing my own service to track my growing library and notify me when books become available. I'm in the middle of 5 series, waiting for the next book to come out. I also pick some up locally and would like to grab similar titles from Amazon. How can I get my purchase history and similar titles? Is there an API for these? I haven't found anything in my searches.
I don't think Amazon exposes an API for order history.
The closest thing seems to be the product advertising API: http://docs.aws.amazon.com/AWSECommerceService/latest/DG/Welcome.html
That would allow you to search for items, for example using ItemSearch:
http://docs.aws.amazon.com/AWSECommerceService/latest/DG/ItemSearch.html
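As a rough sketch, an ItemSearch call through the third-party bottlenose wrapper might look like this (the credentials are placeholders, and you could equally sign the REST request yourself):

    import bottlenose  # pip install bottlenose -- a thin wrapper around the Product Advertising API

    amazon = bottlenose.Amazon("AWS_ACCESS_KEY",   # placeholder credentials
                               "AWS_SECRET_KEY",
                               "ASSOCIATE_TAG")

    # Search the Books index; the response is the raw XML described in the ItemSearch docs
    xml = amazon.ItemSearch(SearchIndex="Books",
                            Keywords="title of the next book in your series",
                            ResponseGroup="Small")
    print(xml[:500])  # inspect the start of the XML response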
Alternatively, you could probably write a script that scrapes the data by navigating through the order history pages, or one that helps you capture each page of results as you manually navigate your order history. You're on your own for this option, though.
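If you do go that route, the parsing side might look something like this (the CSS classes are purely hypothetical, so inspect the real page and adjust the selectors; note that the page sits behind a login):

    from bs4 import BeautifulSoup  # pip install beautifulsoup4

    # Parse an order-history page saved from the browser; the selectors below are made up.
    with open("order-history-page1.html", encoding="utf-8") as f:
        soup = BeautifulSoup(f, "html.parser")

    orders = []
    for box in soup.select(".order"):            # hypothetical container for one order
        title = box.select_one(".item-title")    # hypothetical title element
        date = box.select_one(".order-date")     # hypothetical date element
        if title and date:
            orders.append((date.get_text(strip=True), title.get_text(strip=True)))

    for date, title in orders:
        print(date, "-", title)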
Hi, this is actually a simple question that just came up out of curiosity...
I recently saw an online website-evaluation tool called teqpad.com. I have lots of questions about it:
1. How do they do it, e.g. page views, daily visitors, etc., without access to the real website?
2. Website worth: does this come anywhere near the real value of any site?
3. I don't know how they come up with the daily revenue figure.
4. I like the traffic-by-country feature; it looks just like the one in Google Analytics. How do they get that info?
5. Another one is the ISP info and the Google Maps location of the server.
Has anyone here written similar scripts? If so, what is your opinion?
1. They may be tracking user browser stats like Alexa does. (More info on Wikipedia.) A group of users installs a plug-in that reports which sites each user visits, similar to the way TV ratings work in most (all?) countries. This method is obviously not very reliable, and is often nowhere near the actual number of visitors.
2. This is usually based on bullshit pseudo-scientific calculations and is never a viable basis for evaluating the "value" of a web site, even though it may be possible to guesstimate the approximate ad revenue a site yields (see 3). But that is only one revenue stream; it says nothing about how expensive the site's daily maintenance is: servers, staff, content creation....
3. It should be possible to very roughly estimate daily revenue by taking the guesses on daily visitors/page views, counting how often ads are shown, and looking at what those ads usually yield per page view. It is probably pretty easy to get some rough numbers on what an ad view is worth on a big site if you're in the market.
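As a toy illustration of that estimate (every number below is hypothetical):

    # Back-of-the-envelope daily ad revenue estimate; all inputs are made-up guesses.
    daily_page_views = 50_000   # guessed daily page views
    ads_per_page = 3            # ad slots shown per page view
    cpm = 0.50                  # assumed revenue per 1000 ad impressions, in USD

    ad_impressions = daily_page_views * ads_per_page
    estimated_daily_revenue = ad_impressions / 1000 * cpm
    print("Estimated daily ad revenue: $%.2f" % estimated_daily_revenue)   # $75.00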
4. and 5. It is possible to track most IP addresses down to the visitor's country and sometimes even the city. See the Geo targeting article on Wikipedia.
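For what it's worth, here is a minimal sketch of such an IP lookup using the third-party geoip2 library and a downloaded MaxMind GeoLite2 database (the database path and IP address are placeholders):

    import geoip2.database  # pip install geoip2; also download a GeoLite2-City.mmdb file from MaxMind

    reader = geoip2.database.Reader("GeoLite2-City.mmdb")  # path to the downloaded database
    response = reader.city("203.0.113.42")                 # placeholder IP address

    print(response.country.name)   # e.g. "United States"
    print(response.city.name)      # may be None if the city is unknown
    reader.close()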