I want to run a word count on this dataset:
http://snap.stanford.edu/data/web-Movies.html
I can't find a program on the internet that will help me do so.
Can you suggest something?
This is pretty amenable to MapReduce. If you're a Python person, you might like mrjob, which uses a word count example in a lot of its documentation:
http://pythonhosted.org/mrjob/guides/writing-mrjobs.html
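To make the map-reduce shape of a word count concrete, here is a minimal plain-Python sketch. It deliberately does not use the mrjob API; the function names `mapper`, `reducer` and `word_count` and the sample lines are made up for illustration, and the definition of a "word" (runs of letters and apostrophes) is an assumption you may want to change.

```python
import re
from collections import Counter
from itertools import chain

# Assumption: a "word" is a run of letters/apostrophes, case-insensitive.
WORD_RE = re.compile(r"[a-z']+")

def mapper(line):
    """Map step: emit a (word, 1) pair for every word on the line."""
    for word in WORD_RE.findall(line.lower()):
        yield word, 1

def reducer(pairs):
    """Reduce step: sum the counts emitted for each word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return counts

def word_count(lines):
    """Run the map step over every line, then reduce all pairs together."""
    return reducer(chain.from_iterable(mapper(line) for line in lines))

# Hypothetical sample input; a real run would iterate over the dataset file.
sample = ["A great movie", "Not a great movie at all"]
counts = word_count(sample)
```

For a file that fits on one machine, passing the open file object to `word_count` is enough, since it is read line by line. A framework such as mrjob becomes useful when the map and reduce steps need to be distributed.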
Have a look at easyLambda. It is a C++ and MPI library based on data-flow and map-reduce. It has a word-count example as well.
I am currently working on my thesis and am trying to analyze Illumina NGS sequencing results. I am not very familiar with bioinformatics, and in this part of my project I am trying to compare two VCF files, one from healthy tissue and one from tumor tissue. I want to compare these files and remove what they have in common. More specifically, I want to remove the variants found in the healthy tissue from the tumor file. Do you have any suggestions on which tool I should use, or any other way to do this analysis? I would be more than thankful for any help. Thank you in advance!
I understand your problem. The first thing I would recommend is VCFtools, a command-line tool for Unix-like systems (I don't know which OS you're running). It's pretty simple to use. If you want to do all the processing in, for example, Python, you can use the pandas library, which handles column-oriented data, or PyVCF, which is a parser for VCF files. I can help more if you can provide some example data.
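As a rough illustration of the "remove the healthy variants from the tumor file" idea, here is a standard-library sketch that needs neither pandas nor PyVCF. It assumes two variants are "the same" when CHROM, POS, REF and ALT match exactly, which is a simplification: it ignores multi-allelic records and variant normalization. The function names and sample records are invented for the example.

```python
def variant_key(line):
    """Identify a VCF data line by (CHROM, POS, REF, ALT)."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return chrom, int(pos), ref, alt

def tumor_only(tumor_lines, normal_lines):
    """Return tumor header lines plus records absent from the normal sample."""
    normal = {
        variant_key(line)
        for line in normal_lines
        if line.strip() and not line.startswith("#")
    }
    kept = []
    for line in tumor_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#") or variant_key(line) not in normal:
            kept.append(line)  # headers are kept, shared variants dropped
    return kept

# Hypothetical records, cut down to the first five VCF columns.
header = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT"]
tumor = header + ["chr1\t100\t.\tA\tG", "chr1\t200\t.\tC\tT", "chr2\t50\t.\tG\tA"]
normal = header + ["chr1\t100\t.\tA\tG", "chr2\t50\t.\tG\tC"]

result = tumor_only(tumor, normal)
```

With real files, the two arguments can be open file objects, since both are consumed line by line. Note that `chr2:50 G>A` survives here because the normal sample carries a different ALT allele at that position.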
I'm starting to create a small language (in VB.NET, yes I know, maybe not a good idea).
I have already started working through regex tutorials, but apparently regex is telling me to get lost.
I want to add commands that take arguments, such as a /print command, something like:
/PRINT["Hello world";"blue";propety:{bold;italic}]
So, for me, the regex is:
"{{^\^{\|^#\^~\{}~\^]|\~^[}^\}^#~\[}~^\}^##{\~{^}^#\#~#}\^#}^]|\|}]#\|{"
So you understand that's not something I like writing.
Could you show me how to construct a regex for the command above?
Regex alone isn't the best way to create a language that, well, actually works.
Read this article for more info. I'm sure you can find a better way to write a language if you really need to write one. In VB.NET...
Anyway, if you insist on writing it in VB.NET, I found a video that will help you with it.
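To show what "not regex alone" can look like, here is a small sketch of the usual split: a regex only recognizes individual tokens, and a hand-written parser handles the nesting. It is written in Python rather than VB.NET to keep it short, and the grammar is my guess from the single /PRINT example in the question (quoted strings, bare names, and `name:{a;b}` option lists separated by `;`), so treat every rule here as an assumption.

```python
import re

# One regex per token kind; the regex never has to match a whole command.
TOKEN_RE = re.compile(r'''
    \s*(?:
        (?P<string>"[^"]*")      # quoted string, no escape sequences
      | (?P<name>[A-Za-z_]\w*)   # command, option or flag name
      | (?P<punct>[/\[\];:{}])   # single-character punctuation
    )''', re.VERBOSE)

def tokenize(text):
    """Split the command text into (kind, value) tokens."""
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

def parse_command(text):
    """Parse '/NAME[arg;arg;...]' into (name, list_of_args)."""
    tokens = tokenize(text)
    pos = 0

    def expect(kind, value=None):
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok_kind, tok_value = tokens[pos]
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError(f"expected {value or kind}, got {tok_value!r}")
        pos += 1
        return tok_value

    def peek(value):
        return pos < len(tokens) and tokens[pos] == ("punct", value)

    def parse_arg():
        if pos < len(tokens) and tokens[pos][0] == "string":
            return expect("string")[1:-1]      # strip the quotes
        name = expect("name")
        if not peek(":"):
            return name                        # bare name argument
        expect("punct", ":")
        expect("punct", "{")
        flags = [expect("name")]
        while peek(";"):
            expect("punct", ";")
            flags.append(expect("name"))
        expect("punct", "}")
        return {name: flags}                   # option with a flag list

    expect("punct", "/")
    command = expect("name")
    expect("punct", "[")
    args = [parse_arg()]
    while peek(";"):
        expect("punct", ";")
        args.append(parse_arg())
    expect("punct", "]")
    if pos != len(tokens):
        raise SyntaxError("trailing input after ']'")
    return command, args

parsed = parse_command('/PRINT["Hello world";"blue";propety:{bold;italic}]')
```

The same two-stage structure carries over to VB.NET with `System.Text.RegularExpressions.Regex` for the tokenizer and ordinary functions for the parser. Adding a new command then means adding a grammar rule, not growing one giant pattern.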
I am trying to write C++ code that lets the user select from a variety of input fields, then calculates many different angles and shows the user the results, as well as a graph.
However, our lecturer has suggested that it may be a good idea to do the calculations in C++ and then pass the results to Excel.
Does anyone have any idea how to do this? I am literally looking for a way for the user to fill in the required values in C++ and then be AUTOMATICALLY taken to the Excel file, which shows the results in table and graph format.
If this is not possible, is there a way to display the results in table and graph format through C++?
Thanks very much in advance
Excel provides a COM interface which you can use from your C++ application.
This can be done in the way described in this article:
http://support.microsoft.com/kb/216686
This link might also be useful:
http://www.codeproject.com/Articles/10886/How-to-use-Managed-C-to-Automate-Excel
I think the second link would be better for you, as it's more of a step-by-step guide, which should help you work out the answer.
Use COM Automation to automate Excel.
The best way to do this is to use the VOLE library by Matthew Wilson at
http://vole.sourceforge.net/
Take a look at the examples. I do not think there is an example for Excel, but there is one for Microsoft Word at http://www.codeproject.com/KB/COM/VOLE_word.aspx
I have used VOLE in the past, and it makes things a whole lot easier.
I have been programming in C++ and R for the last 3 years.
I would like to know whether there is a search engine for C++ commands where I can find all the details about a given command.
Here is an example of what I am looking for.
This is a search engine for R commands:
http://www.rseek.org/
Google works pretty well.
If you only want C++ hits, use the site: restriction, as in "site:cppreference.com emplace_back"
Perhaps "site:cppreference.com pow" is a better example, since pow by itself would normally come up with many unrelated hits.
Of course, keyword search also works; try "site:cppreference.com natural logarithm"
I use this often: http://www.cplusplus.com/. It has a search box.
I want my program to search Wikipedia, get the info it searches for, put it into a large string, and output it to a file. How can I do that in C++? Any info would be appreciated. More answers please.
Use wget with the query URL:
wget --output-document=result.html "http://en.wikipedia.org/wiki/Special:Search?search=jon+skeet&go=Go"
This searches for jon skeet and stores the result in result.html. The URL is quoted because it contains an &, which the shell would otherwise treat as the end of the command.
To use it from C++ you can e.g. use the system() call to execute wget in a separate process.
libcurl is pretty popular. I don't know that the interface is especially object-oriented, but it's certainly usable from C++.
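Since the search term ends up inside both a URL and a shell command, it needs escaping twice. The sketch below shows that step in Python for brevity (a C++ program would build the same string before handing it to system()). The helper names `search_url` and `wget_command` are made up, and nothing here touches the network.

```python
import shlex
from urllib.parse import urlencode

def search_url(query):
    """Build the Special:Search URL with the query properly URL-encoded."""
    params = urlencode({"search": query, "go": "Go"})
    return "http://en.wikipedia.org/wiki/Special:Search?" + params

def wget_command(query, outfile="result.html"):
    """Build a shell-safe wget command line for the search URL."""
    return "wget --output-document={} {}".format(
        shlex.quote(outfile), shlex.quote(search_url(query))
    )

url = search_url("jon skeet")
command = wget_command("jon skeet")
```

The quoting matters because the URL contains `&` and `?`, which a shell would otherwise interpret itself instead of passing to wget.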
There are a number of client APIs for MediaWiki (the wiki engine that powers Wikipedia). Here's a listing. They provide the ability to create/delete/edit/search articles. Nothing in straight C++ but it still may be useful.
DotNetWikiBot was quite useful on one project that I had...