I'm using Aspell through pipes, and would like to know how I can add a new word to my personal dictionary.
For example, to check spelling for the word "tesst" I use:
echo tesst|aspell -a -p .aspell.en_US.pws
It is explained here that I can use "*word" to add a word to my personal dictionary:
echo *tesst|aspell -a -p .aspell.en_US.pws
But it doesn't.
What am I doing wrong?
Have you saved your personal dictionary?
A line prefixed with # will cause the personal dictionaries to be saved.
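A minimal sketch of sending both commands over the pipe, in Python. The helper names `build_add_word_input` and `add_word` are hypothetical; the `aspell -a -p` invocation is the one from the question, and the sketch assumes `aspell` is installed and on the PATH. Passing the text through `input=` also avoids the shell, where an unquoted `*tesst` may be expanded as a glob before Aspell sees it.

```python
import subprocess


def build_add_word_input(word):
    """Return pipe-mode input that adds `word` and then saves the dictionary."""
    # "*word" adds the word to the personal dictionary;
    # a line starting with "#" asks Aspell to save it.
    return f"*{word}\n#\n"


def add_word(word, personal_dict=".aspell.en_US.pws"):
    """Send the add-and-save commands to `aspell -a` (assumes aspell is installed)."""
    return subprocess.run(
        ["aspell", "-a", "-p", personal_dict],
        input=build_add_word_input(word),
        text=True,
        capture_output=True,
        check=True,
    )
```

Calling `add_word("tesst")` would then run Aspell once with both lines on standard input.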
Hope this helps
I could not get Aspell 6 working with personal dictionaries. Instead I had to create extra dictionaries, as I explain at http://vouters.dyndns.org/tima/Linux-Windows-Perl-Aspell-Determining_the_country_of_a_Web_query.html. Otherwise I was getting an error from Aspell. If you read the Perl code at the above URL, you will see it uses the Aspell 6 'extra-dicts' feature. This was my only workaround for Aspell's personal dictionaries.
FYI: the goal of the Perl code is to determine the language spoken by a Web visitor who enters a query into a company's Web search engine. All the queries are stored in an xlsx Excel file. The final purpose is to determine whether translating a company's English-language Web document into a local language is financially worthwhile.
In the hope this will help you.
Philippe Vouters (Fontainebleau/France)
Is there a "code free" way to get SOLR/LUCENE (or something similar) pointed at a set of word docs to make them quickly searchable by a user?
I am prototyping a system to search through some homegrown news articles, to see if there is value in it. Before I write code to handle search string input and document indexing, I want to see whether it is even worth trying to figure it all out.
Thanks,
Judd
Using the bin/post tool of Solr and the Tika handler (named the ExtractingRequestHandler), you should be able to get something up and running for prototyping rather quickly.
See the introduction of Uploading Data with Solr Cell using Apache Tika. Tika is used to process a wide range of different document types.
You can give the Solr post tool a directory or a list of files to submit to the index.
Automatically detect content types in a folder, and recursively scan it for documents for indexing into gettingstarted.
bin/post -c gettingstarted afolder/
I'm using "Safe Search and Replace on Database with Serialized Data v3.1.0" to do a search and replace on my database. I've been trying to write the regex myself but haven't had any luck, and Google doesn't seem to have the answer I'm looking for.
Basically I need a pattern that targets images in the upload folders from 2010-2015 with jpg|jpeg|png|gif file types, so that I can replace each match with a simple placeholder.png file I created.
Here's what I've managed to make that doesn't seem to work lol sorry if it's just terrible:
(uploads)(.*?).(jpg|png|gif|jpeg)
or
^\/wp-content\/uploads\/((2014|2013|2012|2011)|(2015\/(01|02|03|04|05|06|07|08)))\/(.+)(jpg|png|gif|jpeg)
I've tried other variations I've made but when conducting a "dry run" it states that 0 cells would have been changed.
The image URLs are full, not relative.
Can anyone assist?
Please try:
(uploads\/201[0-5]\/(?:0[1-9]|1[0-2])\/.*?\.(?:jpg|png|gif|jpeg))
REGEX 101 DEMO
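As a quick way to check this pattern outside the plugin, here is a small Python sketch. The function name `use_placeholder`, the example URLs, and the placeholder location `uploads/placeholder.png` are assumptions for illustration; only the pattern itself comes from the answer (with the `\/` escapes dropped, since Python does not need them).

```python
import re

# Pattern from the answer above; "/" needs no escaping in Python.
UPLOAD_IMAGE = re.compile(
    r"uploads/201[0-5]/(?:0[1-9]|1[0-2])/.*?\.(?:jpg|png|gif|jpeg)"
)


def use_placeholder(text, placeholder="uploads/placeholder.png"):
    """Replace every matching upload path in `text` with the placeholder."""
    return UPLOAD_IMAGE.sub(placeholder, text)


if __name__ == "__main__":
    # Hypothetical full URL; the path before "uploads/" is left untouched.
    print(use_placeholder("http://example.com/wp-content/uploads/2013/05/photo.jpg"))
```

Because `.*?` also matches spaces and quotes, a stricter character class may be safer on cells that contain several URLs.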
Hope this works for you:
^\/wp-content\/uploads\/(201[0-4]\/\d{2}|2015\/0[1-8])\/(.+\.(jpg|png|gif|jpeg))$
I tested it here: regexp101.
You will find a more in-depth explanation there.
Be aware that the uploads folder format can be changed in the Dashboard.
How can I generate a random word from a real language?
Does anybody know of an API on the internet with this functionality?
For example, I send an HTTP request to 'ht_tp://www.any...api.com/getword?lang=en' and get the response 'Town'. Or 'Fast'. Or 'Received'... Or I send an HTTP request to 'ht_tp://www.any...api.com/getword?lang=ru' and get the response 'Ходить'. Or 'Шапка'. Or 'Отправлено'... Any form (noun, adjective, verb, etc.) of a word in any language would do.
I found the resource 'http://www.randomlists.com/random-words', but it does not return JSON, is English only, and comes with no guarantee that it will keep working long term.
Any ideas, please?
See this answer: https://stackoverflow.com/questions/824422/can-i-get-an-english-dictionary-word-list-somewhere Download a word list, put it in a database and fetch a random record, or read a random line from the file each time. This way you don't depend on a third-party API, and you can extend it to every language you can find words for.
You can download the OpenOffice dictionaries. They come as extensions (.oxt), which are nothing more than ZIP files; you can open them with 7-Zip or similar. Inside you will find lots of files; the interesting ones for you are the *.dic files. They will also contain resolutions or number words.
When you encounter something like abandon/LdS, get rid of the /LdS; it is only used by hunspell.
Take these *.dic files, use their names as keys, put them into a database, and pick a random word from there for a given language code.
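The steps above can be sketched in Python roughly as follows. The function names `load_dic_words` and `random_word` are hypothetical, a plain dict stands in for the database, and the sketch assumes the lines have already been read and decoded from a *.dic file.

```python
import random


def load_dic_words(lines):
    """Extract plain words from the lines of a hunspell .dic file."""
    words = []
    for i, line in enumerate(lines):
        entry = line.strip()
        # The first line of a .dic file is normally the entry count; skip it.
        if not entry or (i == 0 and entry.isdigit()):
            continue
        # Drop affix flags such as "/LdS".
        word = entry.split("/", 1)[0].strip()
        if word:
            words.append(word)
    return words


def random_word(words_by_lang, lang, rng=random):
    """Pick a random word for a language code such as "en"."""
    return rng.choice(words_by_lang[lang])


if __name__ == "__main__":
    sample = ["3", "abandon/LdS", "town", "fast/Y"]  # made-up sample entries
    words_by_lang = {"en": load_dic_words(sample)}
    print(random_word(words_by_lang, "en"))
```

Keying the dict by language code mirrors the suggestion to use the file name as the key.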
Update
Older, but easier to access, are the archived hunspell dictionaries from OpenOffice.
This question can be viewed in two ways and therefore I give two answers:
To collect words, I would run a spider on websites with known language (Wikipedia is a good starting point) and strip HTML tags.
Generating words from a real language is trickier. Using statistics from the collected words, it is possible to use Markov chains that produce statistically plausible words. I have tried letter-by-letter generation, and that works poorly. It is probably a better approach to use syllable construction instead.
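For illustration, here is a minimal Python sketch of the letter-by-letter variant, the simplest form of the idea. The names `build_model` and `generate_word`, the `^`/`$` boundary markers, and the tiny training list are assumptions; the same structure should work with syllables as states if a syllable splitter is available.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # boundary markers, assumed not to occur in the words


def build_model(words):
    """Map each letter to the list of letters seen right after it."""
    model = defaultdict(list)
    for word in words:
        chars = [START] + list(word.lower()) + [END]
        for current, following in zip(chars, chars[1:]):
            model[current].append(following)
    return model


def generate_word(model, rng=random, max_len=12):
    """Walk the chain from START until END is drawn or max_len is reached."""
    letters = []
    current = START
    while len(letters) < max_len:
        current = rng.choice(model[current])
        if current == END:
            break
        letters.append(current)
    return "".join(letters)


if __name__ == "__main__":
    model = build_model(["town", "fast", "tower"])  # made-up training words
    print(generate_word(model))
```

Repeated followers are kept in the lists on purpose, so frequent transitions are drawn more often.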
Please note that I do not want to use a database.
I am now learning Unix network programming, and I have my university library's complete book list in separate txt files. For example, the books beginning with 'b' are stored in b.txt. The total across a-z is about 1 million records, one line per book, containing the book's name and other detailed info.
Now I want to write a program that provides a query service for the book list: for example, given a book name, it returns the detailed info for that book if it exists.
So I first need to build a module that performs the query.
Then I will write the server side, which calls the query module, gets the result, and sends it to the client module.
My question is: if I do not use a database, how do I implement the query module in C/C++? My idea is to first locate the file by first letter (for example, a book name beginning with H should be looked up in H.txt, or in H1.txt and H2.txt), open it with fopen, read it line by line, and compare each line with the queried book name using strFind, strCmp, or a similar function, returning the result if there is a match. I think this is too time-consuming to be practical. Is there any such query system I could use as a reference that does not use a database but has bearable response times?
There are several options. The cheapest option (low development time, low maintenance, low hardware requirements), IMO, is to create an HTML page on a separate site that links to all the data files. Then you set up another page that uses google.com to search that site. Then you just tell the Google web spider to index your site. That way you get excellent performance with minimal work. But... you don't get to program any C.
Simple solution using C:
Do as you yourself suggest. If you have lots of memory available for file caching the performance won't be so bad unless the load gets high.
There will still be some work to do with the rest of the solution since you should delegate the search to worker threads.
Intermediate solution using C:
Find a 3rd party search engine and integrate it with your network code.
Advanced solution using C:
Implement your own search engine.
The question is: why don't you want to use a database?
1. To make it easier to deploy?
SQLite may be a good choice.
2. To try another method?
Lucene is a good choice for information retrieval; it is written in Java.
CLucene is a port of Lucene to C++.
You may also need a stemmer (to get the root of words), ICTCLAS (Chinese term extraction), etc.
3. Because you would like to do everything yourself?
It is easy to manage text files on a system; however, for a "query system", storage is not enough. The main problem is IR (information retrieval).
You may want to learn about building, storing, and querying an index.
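As a rough illustration of the index-building idea, here is a Python sketch of an in-memory index that is built once at server start and then answers exact-title lookups without rescanning the files. The names `build_index` and `lookup` are hypothetical, and the tab separator between the title and the remaining fields is an assumption, since the question does not give the line format. In C++ the same role could be played by a `std::unordered_map`.

```python
def build_index(lines, sep="\t"):
    """Map a normalized book title to its full record line."""
    index = {}
    for line in lines:
        record = line.rstrip("\n")
        if not record:
            continue
        # Assumed layout: title, then the other fields, separated by `sep`.
        title = record.split(sep, 1)[0]
        index[title.strip().lower()] = record
    return index


def lookup(index, title):
    """Return the record for `title`, or None if it is not in the index."""
    return index.get(title.strip().lower())


if __name__ == "__main__":
    # Made-up sample records standing in for lines read from h.txt.
    index = build_index(["Hamlet\tShakespeare\t1603\n", "Hard Times\tDickens\t1854\n"])
    print(lookup(index, "hamlet"))
```

With a hash-based index the per-query cost no longer depends on file size; the trade-off is the memory needed to hold all records.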
I want my program to search Wikipedia, get the info it searches for, put it into a large string, and output it to a file. How can I do that in C++? Any info would be appreciated. More answers, please.
Use wget with the query URL
wget --output-document=result.html "http://en.wikipedia.org/wiki/Special:Search?search=jon+skeet&go=Go"
This searches for jon skeet and stores the result in result.html
To use it from C++ you can e.g. use the system() call to execute wget in a separate process.
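The same two steps (build the query URL, save the response) can also be sketched without a shell, shown here in Python for illustration only, since the question asks for C++. The function names and the User-Agent string are made up; the URL layout is the one from the wget command above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SEARCH_PAGE = "http://en.wikipedia.org/wiki/Special:Search"


def build_search_url(query):
    """Return the search URL with the query safely encoded."""
    return SEARCH_PAGE + "?" + urlencode({"search": query, "go": "Go"})


def save_search_result(query, path="result.html"):
    """Download the result page and write it to `path` (needs network access)."""
    # The User-Agent value is a made-up example name.
    request = Request(
        build_search_url(query),
        headers={"User-Agent": "example-fetcher/0.1"},
    )
    with urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Encoding the query in code avoids the shell-quoting problem that `&` causes on the command line.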
libcURL is pretty popular. I don't know that the interface is especially object-oriented, but it's certainly usable from C++.
There are a number of client APIs for MediaWiki (the wiki engine that powers Wikipedia). Here's a listing. They provide the ability to create/delete/edit/search articles. Nothing in straight C++ but it still may be useful.
DotNetWikiBot was quite useful on one project that I had...