As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Can you recommend a full-text search engine? (Preferably open source)
I have a database of many (though relatively short) HTML documents. I want users to be able to search this database by entering one or more search words in my C++ desktop application. Hence, I’m looking for a fast full-text search solution to integrate with my app. Ideally, it should:
Skip common words, such as the, of, and, etc.
Support stemming, i.e. search for run also finds documents containing runner, running and ran.
Be able to update its index in the background as new documents are added to the database.
Be able to provide search word suggestions (like Google Suggest)
Have a well-documented API
To illustrate, assume the database has just two documents:
Document 1: This is a test of text search.
Document 2: Testing is fun.
The following words should be in the index: fun, search, test, testing, text. If the user types t in the search box, I want the application to be able to suggest test, testing and text (Ideally, the application should be able to query the search engine for the 10 most common search words starting with t). A search for testing should return both documents.
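To make the expected behaviour concrete, here is a minimal Python sketch of the two-document example above. It is only an illustration, not the API of CLucene, Xapian or any other engine: the stop-word list, the `toy_stem` helper (a crude stand-in for a real stemmer) and the use of occurrence counts to rank suggestions are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Assumed stop-word list, just large enough for the two sample documents.
STOP_WORDS = {"a", "and", "is", "of", "the", "this"}


def toy_stem(word):
    # Hypothetical, deliberately crude stemmer: strips a trailing "ing".
    # A real engine would use a proper algorithm such as Porter's.
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    return word


def build_index(docs):
    postings = defaultdict(set)  # stem -> ids of documents containing it
    vocabulary = Counter()       # surface word -> number of occurrences
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in STOP_WORDS:
                continue
            vocabulary[word] += 1
            postings[toy_stem(word)].add(doc_id)
    return postings, vocabulary


def search(postings, word):
    # Stem the query the same way the documents were stemmed.
    return sorted(postings.get(toy_stem(word.lower()), set()))


def suggest(vocabulary, prefix, limit=10):
    # Most frequent indexed words starting with the prefix, ties alphabetical.
    matches = [(w, n) for w, n in vocabulary.items() if w.startswith(prefix)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [w for w, _ in matches[:limit]]


docs = {1: "This is a test of text search.", 2: "Testing is fun."}
postings, vocabulary = build_index(docs)
print(sorted(vocabulary))
print(suggest(vocabulary, "t"))
print(search(postings, "testing"))
```

The suggest feature only needs access to the term dictionary, which is why the question asks whether the word index itself can be queried.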
Other points:
I don't need multi-user support
I don't need support for complex queries
The database resides on the user's computer, so the indexing should be performed locally.
Can you suggest a C or C++ based solution? (I’ve briefly reviewed CLucene and Xapian, but I’m not sure if either will address my needs, especially querying the search word indexes for the suggest feature).
Also check out Sphinx
You can use CLucene for C/C++ and Sphider for PHP. Both are free; they take some time to set up and use, but they are not difficult to understand.
I have used the dtSearch module with great success.
They provide a DLL that you can use from your application to index just about anything, and it does more than what you ask for.
Note: it is not free.
I did not see in the question that you asked for a free one, so I am recommending my favorite.
dtSearch inspired me to write an indexer for my own language, Ellinika (Greek), for my sites, because I could not find what I was looking for in my language.
There are also some modules just for stemming, if all you need is to find suggestions for your words. I took my reference from here: http://tartarus.org/~martin/PorterStemmer/
For example, if you have a database such as MS SQL that already does some basic indexing, and someone searches for a word and nothing is found, you can stem the word yourself and search again...
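The "stem it yourself and search again" idea can be sketched in a few lines of Python. Everything here is hypothetical: `crude_stem` is a toy suffix stripper rather than the Porter algorithm linked above, and `lookup` with its `TABLE` stands in for whatever exact-match search the database already provides.

```python
def crude_stem(word):
    # Toy suffix stripping; NOT the Porter algorithm, just enough to illustrate.
    for suffix in ("ing", "ers", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search_with_fallback(exact_search, word):
    """Try the word as typed; if nothing is found, retry with its stem."""
    hits = exact_search(word)
    if hits:
        return hits
    stemmed = crude_stem(word)
    if stemmed != word:
        return exact_search(stemmed)
    return []


# Stand-in for the database's own exact-match index (hypothetical data).
TABLE = {"test": [1], "fun": [2]}


def lookup(word):
    return TABLE.get(word, [])


print(search_with_fallback(lookup, "testing"))
```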
Closed 10 years ago.
Is there some flowchart diagram tool that would (or could be made to) integrate with a self-hosted wiki?
Requirements:
basic functionality (e.g., drawing some boxes and some arrows)
would strongly prefer it to be visual (i.e., not written out in text that then gets converted)
allows for dynamic editing
it is important that the tool can be integrated into the wiki (e.g., as an extra panel somewhere)
can be run from a personal server
free
I've looked around at other threads here concerning a diagram tool, but they are either desktop applications, online ones which reside on third-party servers, or cost money.
[Edit] Thanks for the responses, but I would like them to be dynamically editable (I've added this to the requirements). What I mean is that I would like to integrate (or run it from a private server) some online collaborative diagramming tool. While I could create a JPG of something made in Graphviz and upload it, this is not easily editable. I would have to upload the source file somewhere, which someone would have to download and edit, then upload the new JPG.
Graphviz dot diagrams can be embedded in some wikis. Unfortunately for your requirements, it's text that gets converted. It's fairly simple to learn and use though.
http://www.graphviz.org/
EDIT: It's free / open source.
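To show what "text that gets converted" means in practice, here is a small Python sketch that writes DOT source for a boxes-and-arrows flowchart. The `to_dot` helper and the node names are made up for this example; rendering the output still requires Graphviz itself or a wiki plugin that embeds it.

```python
def to_dot(edges, name="flow"):
    """Build DOT source for a simple directed flowchart with box-shaped nodes."""
    lines = [f"digraph {name} {{", "    node [shape=box];"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


dot_source = to_dot([("Start", "Review"), ("Review", "Done")])
print(dot_source)
```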
I've been looking for something similar - collaborative flowcharts in a wiki. The most interesting so far is this Mediawiki extension: http://www.flowchartwiki.org/wiki/index.php/Main_Page
Balsamiq Mockups for XWiki is the closest thing I've seen. It's more of a previsualization tool however for application mockups, though I'm not sure if this is the kind of tool you're looking for.
It is free if you qualify under their licensing.
Another option would be using Mediawiki with the Dia extension.
I like using the svgedit plugin in DokuWiki for quick diagramming on the run. It produces standard SVG text files and has an always up-to-date JavaScript WYSIWYG editor. Also, I submitted a bug/feature request on GitHub and the requested functionality was added posthaste.
Edit: FOSS!
I understand this question is quite old, but you could try Origramy. It's a Flash-based visual tool, and the result can be retrieved from the component as XML. Alas, integration with a wiki must be done separately.
Not sure of the technology you have on your server, but Open Diagram can create a JPG image file on the server, which can then be referenced as a normal image in your wiki. It's open source.
I've enjoyed the simplicity of UMLet for a while as a desktop app. Don't let the name fool you! There is more than just UML - it has a lot of basic charting elements in it. It's not pretty, and it can be awkward sometimes, but it works. Has basic visual items in a template/toolbox that you double-click on to reproduce on your canvas. You can then move it about, resize it, or edit the item and modify it via text.
There isn't an existing online integration method (that I've seen), but since it's good old-fashioned Java, you might be able to make it happen.
It's free and distributed under the GNU General Public Licence.
Honestly, I think you are going to have to use Java and code an applet. There are also wondrous advancements in JavaScript libraries (AJAX, jQuery) that might assist in this...
Cheers, my friend.
Closed 10 years ago.
I need a note taking wiki-like "super" application. I'll start with a rundown of applications that I've already evaluated and/or used:
Wikidpad
Pros:
fast switching between the edit and view modes;
nice syntax (especially for pasting code snippets or just raw ASCII text, nice indenting visual clues);
it is a standalone application that doesn't require a server;
the wiki pages can be kept in flat text database;
easy drag-and-drop of file attachments (especially for image files).
Cons:
doesn't have history/version control of the pages and the state of the wiki database as a whole;
doesn't have the concept of namespaces for the wiki pages;
MoinMoin wiki
Pros:
nice syntax;
has a standalone server (Python based), which makes it truly portable and standalone;
keeps the pages in flat files;
has lots of nice plugins;
Cons:
it's a wiki, which means slow iterations of editing/taking notes, viewing, rinse and repeat...
doesn't have version control integration
Trac
Pros:
All of the features of the MoinMoin wiki, except the flat file database;
Version control integration: I can use the wiki changeset feature and the wiki pages as metadata of my personal codebase;
Cons:
All of the general drawbacks of the wikis;
Not truly portable;
todolist2 (by AbstractSpoon)
Pros:
fast, standalone todolist manager;
the tasks have a really nice feature that is important to me: a rich edit box for taking notes associated with the task, with a single key to flip between the task and the notes;
time tracking for the tasks;
Cons:
doesn't have version control built in (it has "simple" version control that just makes automatic backup copies of the project/data file with a time stamp embedded in the name).
it's hard to filter the tasks by urgency (in the GTD terms, it doesn't have the concept of the containers of tasks: Inbox, Maybe, Next action for each project, etc).
it doesn't have cross-referencing/linking between the tasks in wiki-like fashion.
Thinking Rock
Pros:
implements GTD almost perfectly;
it has notes for every action;
portable;
Cons:
(Maybe because of the Java GUI) doesn't have simple Undo when editing text notes;
it's clunky when switching between the projects/actions tree and the editable notes editbox;
doesn't have version control;
MonkeyGTD/TiddlyWiki
Pros:
truly standalone
almost 100% wiki
nice GTD implementation
Cons:
it's a little confusing, since there is no easy or user-friendly way to see an overview of the current structure of the wiki pages
I'm not sure if it scales well when there are lots of pages/data/text/attachments.
doesn't have source control integration;
I'm not sure about version control/pages history...
I want an application that has the following:
the speed and the ease of edit/preview iteration cycle of wikidpad.
the wiki pages and the associated attachments as they are (like wikidpad and MoinMoin).
version control for the wiki pages (like MoinMoin or Trac).
source control integration (like Trac).
time tracking like todolist2 and the task/project nesting like todolist2 and ThinkingRock.
the almost perfect GTD implementation of ThinkingRock or MonkeyGTD.
Obviously I haven't decided which one to use, because no single one of the aforementioned applications provides all the features I need. It is not that the features are mutually exclusive, or that combining them is impossible or impractical. Actually, I think wikidpad is the closest to my ideal, which means that I could:
implement the features that I need myself (version control, GTD-like features/properties for the wiki pages themselves, source control integration), or
continue to search and evaluate, or
get some interesting and valuable opinions here.
Try ConnectedText: http://www.connectedtext.com/
ConnectedText has all the pros of Wikidpad and none of the cons. ConnectedText has a much superior query engine and contains semantic extensions not available in Wikidpad, and is much more stable.
Try KNote http://www.smartgoldfish.com/download.html.
I'm not sure you'll be able to find any application that meets all your requirements, but here is my shameless plug for a note-taking desktop application for Windows that works like a personal wiki:
- http://www.ppcsoft.com/blog/personal-wiki.asp
If you specify which features are most important to you, it will be easier to tell which application is most suitable.
Any of the tools that save files as text can be added to your own version control system (which is better than using each tool's own version control), and you could use that system for all your important documents.
Closed 10 years ago.
Can anyone suggest a good open source Django project for learning Django development?
A great resource is www.djangopackages.com, which lists a lot of the notable Django apps out there, including links to their respective repos, popularity ratings, etc.
Another way to find popular projects is directly on GitHub: https://github.com/search?q=django
Finally:
Awesome Django # https://github.com/wsvincent/awesome-django
Awesome Python # https://github.com/vinta/awesome-python
django-basic-apps is also a very good start to learn django and reusable apps. These apps are simple enough and code is well written.
If you're looking to learn the popular reusable app feature of Django I would suggest Pinax, and you may also want to look at Django-Mingus. I'm the author behind Mingus and I recently posted a list of the apps included in Mingus along with a description of how and why they are used. It may be helpful in finding some projects you may want to use yourself. Here's a link: "The apps that power Django-Mingus"
There's also a ton of Django projects on Google Code, GitHub, and BitBucket. Just search for "django".
Django-CMS, mentioned above, and Fein-CMS are both good CMS projects to dive into, and the screencasts by Eric are terrific - I absolutely suggest any noob to Django watch all 13 of those screencasts.
I asked Malcolm Tredinnick a few weeks ago if there was a project he admired and he suggested Django Packages. They keep their source on GitHub.
I wouldn't say that it should be used as a Django tutorial but they have an admirable style of programming and I have picked up more than a few tips and tricks by reading their source. It is definitely a good example to learn from.
One of the best for newbies: the 13 screencasts "Django From the Ground Up" at This Week in Django.
Edit: the website is closed; view the archived page.
I recommend Waka Waka. It's a very well written wiki that should give you a good idea of how to develop in Django. It is an application used by Pinax, which is itself a big project to learn from.
You can also of course go through some of ubernostrum's code like Registration, profiles and Contact Form, which are a standard in the django world. But as some of them involve dynamic forms, it may be best to get to it, after a little actual coding.
If you're interested in running Django in App Engine, check out this project. Here's a demo.
Closed 9 years ago.
Let's say I want to aggregate information related to a specific niche from many sources (could be travel, technology, or whatever).
How would I do that?
Have a spider/crawler that crawls the web to find the information I need? (How would I tell the crawler what to crawl, since I don't want to fetch the whole web?)
Then have an indexing system to index and organize the information I crawled and also be a search engine?
Are systems like Nutch (lucene.apache.org/nutch) OK to use for what I want? Do you recommend something else?
Or can you recommend another approach?
For example, how is Techmeme.com built? (It's an aggregator of technology news and it's completely automated; only recently did they add some human intervention.)
What would it take to build such a service?
Or how does Kayak.com aggregate its data? (It's a travel aggregator service.)
This all depends on the aggregator you are looking for.
Types:
Loosely defined - Generally this requires your data source to be very flexible about determining the type of information it gathers (it answers the question: is this site/information travel related? Humour? Business related?)
Specific - This relaxes that requirement on the data storage, because all of the data is of a known kind, e.g. specifically travel related (flights, hotel prices, etc.).
Typically an aggregator is a system of subprograms:
Grabber - this searches for and grabs all of the content that needs to be summarized
Summarization - this is typically done through queries to the db and can be adjusted based on user preferences [through programming logic]
View - this formats the information for what the user would like to see and can respond to feedback on the user's likes or dislikes of the item suggested.
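As a rough sketch of how those three subprograms fit together, the Python below wires a grabber, a summarization step and a view over in-memory data. The source names, the sample headlines and the keyword filter are all invented for illustration; a real grabber would fetch feeds or pages, and summarization would usually be database queries.

```python
from collections import Counter


def grab(sources):
    """Grabber: collect raw items from every source."""
    items = []
    for name, fetch in sources.items():
        for title in fetch():
            items.append({"source": name, "title": title})
    return items


def summarize(items, keyword):
    """Summarization: keep items on the user's topic and count them per source."""
    matched = [i for i in items if keyword.lower() in i["title"].lower()]
    per_source = Counter(i["source"] for i in matched)
    return matched, per_source


def view(matched):
    """View: format the selected items for display."""
    return "\n".join(f"[{i['source']}] {i['title']}" for i in matched)


# Hypothetical sources; each callable returns a list of headlines.
sources = {
    "feed_a": lambda: ["Cheap flights to Rome", "New phone released"],
    "feed_b": lambda: ["Flights delayed by weather"],
}
matched, per_source = summarize(grab(sources), "flights")
print(view(matched))
```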
For a basic look - check out this: http://en.wikipedia.org/wiki/Aggregator
It will give you an overview of aggregators in general.
In terms of how to build your own aggregator if you're looking for something out of the box that can get you content that YOU want - I'd suggest this: http://dailyme.com/
If you're looking for a codebase / architecture to BUILD your own aggregator service, I'd suggest looking at something straightforward, like Open Reddit from http://www.reddit.com/
You need to define what your application is going to do. Building your own web crawler is a huge task as you tend to keep adding new features as you find you need them... only to complicate your design, etc...
Building an aggregator is much different. Whereas a crawler simply retrieves data to be processed later, an aggregator takes already defined sets of data and puts them together. If you use an aggregator, you will probably want to look for already defined travel feeds, financial feeds, travel data, etc... An aggregator is easier to build IMO, but it's more constrained.
If you, instead, want to build a crawler you'll need to define starting pages, define ending conditions (crawl depth, time, etc...) and so on and then still process the data afterwards (that is aggregate, summarize and so on).
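To illustrate starting pages and ending conditions, here is a minimal breadth-first crawler sketch in Python. The `fetch` callable, the `FAKE_WEB` dictionary and the `example.test` URLs are assumptions so the example runs without network access; a real crawler would also need politeness delays, robots.txt handling and error handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_urls, fetch, max_depth=2, max_pages=100):
    """Breadth-first crawl that stops at max_depth or after max_pages pages."""
    seen = set(start_urls)
    queue = deque((url, 0) for url in start_urls)
    pages = {}
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html  # kept for later processing (aggregate, summarize)
        if depth >= max_depth:
            continue  # ending condition: do not follow links any deeper
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


# In-memory stand-in for the web (hypothetical pages).
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/c">C</a>',
    "http://example.test/b": "no links here",
    "http://example.test/c": '<a href="/d">D</a>',
    "http://example.test/d": "too deep to be reached",
}
pages = crawl(["http://example.test/"], FAKE_WEB.get, max_depth=2)
print(sorted(pages))
```

With `max_depth=2`, the page at `/d` is never fetched because it is three links away from the starting page.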
Closed 10 years ago.
Does anyone know of any existing packages or libraries that can be used to build a calendar in a django app?
A quick google search reveals django-gencal, which looks like exactly what you need. It would also be worth looking at the snippets under the calendar tag on Django Snippets at http://www.djangosnippets.org/tags/calendar/.
It seems that django-calendar has become django-agenda: http://github.com/dokterbob/django-agenda
Great tips.
django-swingtime lives on
http://github.com/dakrauth/django-swingtime
The django-schedule code originally from thauber (thauber/django-schedule) has been forked and worked into the glamkit/glamkit-eventtools code for Galleries, Libraries, Museums and Archives. It has also been forked and updated by a variety of other folks, e.g. boskee/django-schedule, and my guess is that that might have fewer dependencies and be easier to integrate into another project. It says:
Django-schedule: A calendaring/scheduling application, featuring:
one-time and recurring events
calendar exceptions (occurrences changed or cancelled)
occurrences accessible through Event API and Period API
relations of events to generic objects
ready to use, nice user interface
view day, week, month, three months and year
project sample which can be launched immediately and reused in your project
See the github "network" tab for a graphical navigation from the point of view of a given branch to see how other branches relate to it (i.e. what is available for merging).
svn checkout http://django-calendar.googlecode.com/svn/trunk/ django-calendar-read-only
svn: URL 'http://django-calendar.googlecode.com/svn/trunk' doesn't exist
So a Google search may reveal it, but it no longer exists.
There is another calendar alternative here, Django Event Calendar from 3captus, that offers something a bit simpler. I'm trying it out now, but it looks like a better fit for me.
From the features list:
Full feature calendar display using python calendar class
Support month scrolling (forward or backward)
AJAX add, modify, delete GUI
Requires minimum knowledge of Django; should be a good complement after you are done with the Django tutorial
(http://www.djangoproject.com/documentation/tutorial01/)
Calendar and Event class can be used in any python project
Full unit test included
There are also some calendar functions built into Python itself, you can see a simple implementation here.
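As a small illustration of those built-in functions, the sketch below uses the standard library `calendar` module to get a month both as a grid of weeks and as an HTML table. The choice of February 2024 is arbitrary, and how the HTML would be passed to a Django template is left out.

```python
import calendar

# Weeks of the month as lists of day numbers; 0 marks days outside the month.
# The default first weekday is Monday.
weeks = calendar.monthcalendar(2024, 2)
for week in weeks:
    print(week)

# The same month rendered as an HTML <table>, which a view could pass to a template.
html = calendar.HTMLCalendar(firstweekday=calendar.MONDAY).formatmonth(2024, 2)
print(html[:60])
```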
Today I ran into django-swingtime. Worth checking out.