Suggesting similar articles based on the article a user is reading - python-2.7

I am looking for the best algorithm for suggesting articles in my project. We have a set of 1000 articles. I would like to recommend similar articles to a user based on the article they are reading. Which algorithm suits this best? I tried content-based recommendation, which involves training a model. In my case, simple text-based similarity to the article the user is currently reading would be enough; it does not need to use the history of articles the user has read.

Maybe look at what Karpathy has done with arxiv-sanity:
https://github.com/karpathy/arxiv-sanity-preserver
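
For plain text similarity with no user history, one common approach is to compare TF-IDF vectors by cosine similarity. Below is a minimal, stdlib-only sketch of that idea. It is not taken from arxiv-sanity, and the function names (`tokenize`, `tfidf_vectors`, `most_similar`) and the sample articles are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, weighted by TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF; the exact formula is a choice, not a requirement.
            term: count * (math.log((1.0 + n_docs) / (1.0 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(docs, index, top_n=3):
    """Return indices of the top_n documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [(cosine(vectors[index], v), i)
              for i, v in enumerate(vectors) if i != index]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_n]]


# Hypothetical sample data, standing in for the 1000 real articles.
articles = [
    "Python tips for parsing text files",
    "Parsing large text files in Python quickly",
    "Baking sourdough bread at home",
    "A beginner guide to sourdough starters",
]
print(most_similar(articles, 0, top_n=1))
```

With only 1000 articles, the vectors can probably be computed once and cached, so each lookup is just a pass over the stored vectors.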

Related

Is there a way to get user statistics from MoinMoin

I am a beginner with MoinMoin. I am looking for a way to capture some stats about users. Specifically, I am interested in the top contributors, their edit frequency, etc. Is there a way to do that?
Aside from the usual web server log based approach, you could look at Moin's editlog and eventlog; there are some helpers in the MoinMoin.log package.
Moin also has some built-in "stats" code in MoinMoin.stats and some macros that use it.
If you have questions, the #moin IRC channel on freenode (be patient) or the moin-user mailing list may be quicker than Stack Overflow.

Bayesian classification or similar technique for recommendation system

I'm working on a news app. On the home page, the user sees a list of headlines and then he can click one to read the article and comment.
I would like to offer an option for "recommended articles" based on his history. For example, if he read an article - I'll feed the algorithm with the headline keywords so it will learn what this user likes to read.
My problem with what I've read about Bayesian filters is that you need to train them with good input and bad input (such as good emails and spam emails). The difference in my case is that there are no bad examples. If the user didn't read an article, that doesn't make it a bad example (he still might read it in the future); but if he did read one, it's more likely that he'll read similar articles in the future.
Basically, I'm looking for an algorithm to help me recommend articles to a specific user - based on what he read in the past. It will run on a mobile device, so any implementation (C/C++/Obj-C) will work.
Thanks.
You can treat this as a binary classification problem. It is either an article he likes to read or an article he possibly doesn't like to read.
You can use the dlib C++ library for the binary classifier algorithm.
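
To make the binary-classification framing concrete, here is a rough sketch of a naive Bayes scorer over headline words, where headlines the user read are positives and headlines he has not read so far are treated as noisy negatives. It is written in Python for brevity and does not use dlib; the class name `ReadOrNotClassifier` and the sample headlines are invented for illustration, and treating unread articles as negatives is an assumption, not something the data guarantees.

```python
import math
import re
from collections import Counter


def words(headline):
    """Lowercase a headline and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", headline.lower())


class ReadOrNotClassifier(object):
    """Naive Bayes over headline words with add-one smoothing.

    Assumes both lists are non-empty; unread headlines are only
    'probably not interesting', so the result is a ranking signal.
    """

    def __init__(self, read_headlines, unread_headlines):
        self.counts = {
            "read": Counter(w for h in read_headlines for w in words(h)),
            "unread": Counter(w for h in unread_headlines for w in words(h)),
        }
        self.vocab = set(self.counts["read"]) | set(self.counts["unread"])
        total = float(len(read_headlines) + len(unread_headlines))
        self.log_prior = {
            "read": math.log(len(read_headlines) / total),
            "unread": math.log(len(unread_headlines) / total),
        }

    def _log_likelihood(self, label, headline):
        counts = self.counts[label]
        denom = float(sum(counts.values()) + len(self.vocab))
        # Words never seen during training are skipped for both classes.
        return sum(math.log((counts[w] + 1) / denom)
                   for w in words(headline) if w in self.vocab)

    def score(self, headline):
        """Log-odds that the headline belongs to the 'read' class."""
        return (self.log_prior["read"]
                + self._log_likelihood("read", headline)
                - self.log_prior["unread"]
                - self._log_likelihood("unread", headline))


# Hypothetical reading history.
read = ["Football final ends in penalty shootout",
        "Football transfer window opens"]
unread = ["Stock markets fall sharply",
          "Central bank raises interest rates"]
clf = ReadOrNotClassifier(read, unread)

candidates = ["Football club signs new striker",
              "Interest rates and stock markets"]
ranked = sorted(candidates, key=clf.score, reverse=True)
print(ranked[0])
```

Ranking new headlines by the score, rather than hard-classifying them, sidesteps part of the "no bad examples" problem: an unread article only lowers the score of its words a little, it never rules anything out.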

Automatic text classification using n-gram model

Hi, I'm a newbie to data mining. My task is to automatically classify text documents using the n-gram method.
I could not find proper resources on this topic. Kindly help me with how to proceed and where I can find tutorials on n-gram classification.
I need Java source code on this topic for my understanding.
Thanks in advance.
I highly recommend Stanford's online NLP course by Dan Jurafsky & Chris Manning. Chapter 4 addresses n-grams, but all the chapters before it give a great background.
Stanford also has some great open source software you can use for text classification, from tokenizing to part of speech tagging.
I found a better tutorial with documentation at:
http://textcat.sourceforge.net/README.txt
http://textcat.sourceforge.net/doc/index.html
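
TextCat implements the Cavnar and Trenkle n-gram method: build a ranked list of the most frequent character n-grams per category, then assign a document to the category whose ranking is closest under an "out-of-place" distance. The following is a small sketch of that technique, not TextCat's own code. It is in Python rather than Java to keep it short, and the function names, the `top_k=300` cutoff, and the two tiny training samples are illustrative assumptions.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams, most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, category_profile):
    """Sum of rank differences; n-grams missing from the category get a maximum penalty."""
    rank = {gram: i for i, gram in enumerate(category_profile)}
    max_penalty = len(category_profile)
    return sum(abs(i - rank[g]) if g in rank else max_penalty
               for i, g in enumerate(doc_profile))


def classify(text, category_profiles):
    """Return the category whose profile is closest to the text's profile."""
    doc_profile = ngram_profile(text)
    return min(category_profiles,
               key=lambda c: out_of_place(doc_profile, category_profiles[c]))


# Tiny hypothetical training samples; real profiles need far more text.
samples = {
    "english": "the quick brown fox jumps over the lazy dog "
               "and then the dog sleeps in the sun",
    "german": "der hund und die katze schlafen in der sonne "
              "und der fuchs springt",
}
profiles = {label: ngram_profile(text) for label, text in samples.items()}
print(classify("the dog sleeps in the sun", profiles))
```

The same structure (a frequency map, a sorted list, and a rank lookup) carries over to Java with `HashMap` and `ArrayList`.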

Django Comments and Rating Systems

I am looking for a blogging and comments system that integrates smoothly with my Django sites. I've found that there are a lot of options on the Net and got a bit lost, and I don't have much experience with this. I hope you can give me some suggestions.
Here are the things that I would like to have:
Tag clouds
Article archive (by month/by year)
Article ratings (e.g. with stars or custom icons)
Comments on a particular topic/article
Sub-comments on a particular comment (i.e. follow-up comments)
Blog/article search
Ability to link to other relevant articles (i.e. follow-up articles)
Pagination of comments if they get too long
OpenID support (e.g. Facebook, Hotmail, Blogger, Twitter, etc.)
Requiring login before a user can comment
Ability to retrieve blog headers and customize the display order
Ability to subscribe to an article via RSS
Ability to email an article to friends (this may not belong to the comments system)
If I missed some common features, please let me know. The comments system I am looking for should do most of what popular comment systems on the web do, e.g. WordPress.
Thank you so much, everyone. Have a nice day.
I myself really like django-threadedcomments. It supports threaded commenting like what you would see in Disqus.
I heard django-comment-utils is quite good; you may want to test it :)

How do you structure a database for a wiki site?

What does the table look like? Is there only one? How do you revert to older versions, similar to how Stack Overflow works?
The best way to go about this is to look at other software such as MediaWiki and see how they structure their database. Then you can pick and choose what you want to use to start off on your own wiki design.
On the other hand, you could always start off with a pretty basic spread of tables that would keep track of Users, Articles, Revisions on an Article, etc. and start spiraling out from there.
MediaWiki documents its database layout in its help pages.
I agree with CookieOfFortune's comment that you should take a look at an existing open source wiki to see how they do it, but I'll also offer this thought prefixed with the fact that I have no experience writing wiki software. Maybe some sort of partial star schema could be useful in maintaining the previous versions.
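
As a starting point for the "basic spread of tables" idea, here is a minimal sketch using SQLite: one `page` table pointing at its current revision, and one append-only `revision` table. Reverting is done by copying an old revision's body into a new revision, so history is never rewritten. This is loosely modelled on the page/revision split, but it is not MediaWiki's actual schema; all table, column, and function names here are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE page (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL UNIQUE,
    current_revision_id INTEGER
);
CREATE TABLE revision (
    id INTEGER PRIMARY KEY,
    page_id INTEGER NOT NULL REFERENCES page(id),
    author TEXT NOT NULL,
    body TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def save_revision(db, title, author, body):
    """Append a new revision and make it the page's current one."""
    db.execute("INSERT OR IGNORE INTO page (title) VALUES (?)", (title,))
    page_id = db.execute(
        "SELECT id FROM page WHERE title = ?", (title,)).fetchone()[0]
    cur = db.execute(
        "INSERT INTO revision (page_id, author, body) VALUES (?, ?, ?)",
        (page_id, author, body))
    db.execute("UPDATE page SET current_revision_id = ? WHERE id = ?",
               (cur.lastrowid, page_id))
    return cur.lastrowid


def current_body(db, title):
    """Return the body of the page's current revision, or None."""
    row = db.execute(
        "SELECT r.body FROM page p "
        "JOIN revision r ON r.id = p.current_revision_id "
        "WHERE p.title = ?", (title,)).fetchone()
    return row[0] if row else None


def revert(db, title, revision_id, author):
    """Revert by copying an old revision's body into a new revision."""
    row = db.execute(
        "SELECT r.body FROM revision r JOIN page p ON p.id = r.page_id "
        "WHERE p.title = ? AND r.id = ?", (title, revision_id)).fetchone()
    if row is None:
        raise ValueError("no such revision for this page")
    return save_revision(db, title, author, row[0])


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
first = save_revision(db, "Home", "alice", "Welcome")
save_revision(db, "Home", "bob", "Vandalised")
revert(db, "Home", first, "alice")
print(current_body(db, "Home"))
```

Because a revert is just another revision, the revert itself shows up in the history and can be undone the same way.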