Where should I write MapReduce programs: in a text file or somewhere else?
What file format should I use to save a file containing a MapReduce program?
For example, Java code is saved in a text file named filename.java. What is the equivalent for a MapReduce program?
Please answer, as I need this badly.
You can implement a MapReduce program in Java. Alternatively, you can use other languages, including Perl, Python, Ruby, C++, PHP, and R, via Hadoop Streaming.
In Java, the MapReduce program is an ordinary Java source file, named for example mapreduce.java.
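To illustrate that a MapReduce program is just ordinary source code in a plain text file, here is a minimal word-count sketch in Python, written in the line-oriented style that Hadoop Streaming expects (tab-separated key and value). The function names and the sample input are assumptions for illustration. In a real streaming job the mapper and the reducer would each read records from standard input; here the map, sort, and reduce steps are simulated locally.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one 'word<TAB>1' record per word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Reduce step: sum the counts per word. Input must be sorted by word."""
    pairs = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Local simulation of map -> sort (the shuffle) -> reduce on made-up input.
sample = ["the quick fox", "The lazy dog"]
for record in reducer(sorted(mapper(sample))):
    print(record)
```

Saved as a .py file, this is the Python counterpart of the .java file mentioned above.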
For more help with MapReduce or Elastic MapReduce, see
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-what-is-emr.html
or
mapreduce count example
or
http://www.javaworld.com/javaworld/jw-09-2008/jw-09-hadoop.html
Thanks
I want to run a word count on this dataset:
http://snap.stanford.edu/data/web-Movies.html
I can't find a program on the internet that will help me do this.
Can you suggest something?
This problem is well suited to MapReduce. If you're a Python person, you might like mrjob, which uses a word count example in much of its documentation:
http://pythonhosted.org/mrjob/guides/writing-mrjobs.html
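As a rough sketch of how the word count could be applied to review data, the following plain Python (not mrjob) counts words only on lines that carry the review text. It assumes each review body sits on a line starting with "review/text:"; check that label against the downloaded file. The sample records are made up.

```python
import re
from collections import Counter

TEXT_PREFIX = "review/text:"  # assumed field label; verify against the real file


def map_words(line):
    """Map step: turn one review-text line into (word, 1) pairs."""
    if not line.startswith(TEXT_PREFIX):
        return []
    text = line[len(TEXT_PREFIX):].lower()
    return [(word, 1) for word in re.findall(r"[a-z']+", text)]


def reduce_counts(pairs):
    """Reduce step: sum the counts for each word."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return totals


def word_count(lines):
    """Run the map step on every line, then reduce all the pairs."""
    pairs = (pair for line in lines for pair in map_words(line))
    return reduce_counts(pairs)


sample = [
    "product/productId: X1",  # made-up record, skipped by the mapper
    "review/text: Great movie, great cast",
    "review/text: Not a great movie",
]
print(word_count(sample).most_common(2))
```

The same two functions map naturally onto the mapper and reducer methods of an mrjob job once the logic works on a small sample.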
Have a look at easyLambda. It is a C++ and MPI library based on data-flow and map-reduce. It has a word-count example as well.
How can I use Fortran to extract numerical data from a webpage's source code that has been saved in a text file? (e.g. https://www.google.com/finance?q=NYSE:KO) My aim is to retrieve the stock price from the source code. Any help will be appreciated!
Thanks in advance.
Do you really want to do that in Fortran? Fortran is not very well suited to string manipulation (in my opinion)!
I would recommend fetching and processing the HTML in another language (e.g. Python) and writing the data in a structured format that Fortran can read directly. You could call that script or program from Fortran using the statement
call system('command')
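A minimal sketch of such a helper script in Python might look like the following. The HTML snippet, the element id "price", and both function names are hypothetical. The real page markup will differ and should be inspected first, and an HTML parser is more robust than a regular expression for anything beyond a quick extraction.

```python
import re


def extract_price(html, element_id="price"):
    """Return the first number inside the element with the given id, or None."""
    pattern = rf'id="{re.escape(element_id)}"[^>]*>\s*([0-9][0-9,]*\.?[0-9]*)'
    match = re.search(pattern, html)
    return float(match.group(1).replace(",", "")) if match else None


def write_for_fortran(price, path):
    """Write the value as a single plain number on its own line."""
    with open(path, "w") as handle:
        handle.write(f"{price:.4f}\n")


# Made-up markup standing in for the saved page source.
sample_html = '<div><span id="price" class="pr">1,043.25</span></div>'
print(extract_price(sample_html))
```

A file written by write_for_fortran holds one bare number, which Fortran can read with a list-directed read such as read(unit, *) price.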
I tried to build a chatbot in AIML. I downloaded the code from http://nlp-addiction.com/chatbot/mathbot/ but couldn't work out how to run the program. Please help me.
An AIML file isn't program code; it's a data file (much like any other XML file).
You need to use an interpreter like Program-AB to load the file and use it to answer queries.
If you just want to test the contents and formatting of the AIML file, you could use Pandorabots and load the file into a blank bot fairly easily.
Yes, an AIML file isn't program code. It's a data format. You can learn more about it here: http://www.alicebot.org/aiml.html
AIML is a data encoding format that tells the bot what to do and when to do it. Many interpreters can be used to interpret AIML tags.
One of them is PyAIML, a Python-based interpreter that is fairly simple to use.
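To make the "data, not code" point concrete, here is a toy sketch that reads AIML categories with Python's standard XML parser and answers by exact pattern match. It is not PyAIML and not a real interpreter: wildcards, recursion tags, and namespaces are ignored, and the sample AIML is made up rather than taken from the mathbot files.

```python
import xml.etree.ElementTree as ET

# Made-up AIML content for illustration.
SAMPLE_AIML = """<aiml version="1.0.1">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>WHAT IS AIML</pattern>
    <template>A data format for chatbot rules.</template>
  </category>
</aiml>"""


def load_rules(aiml_text):
    """Read every <category> into a {pattern: template} dictionary."""
    root = ET.fromstring(aiml_text)
    return {
        category.findtext("pattern").strip(): category.findtext("template").strip()
        for category in root.iter("category")
    }


def respond(rules, user_input):
    """Exact-match lookup only; a real interpreter does far more."""
    return rules.get(user_input.strip().upper(), "I have no answer for that.")


rules = load_rules(SAMPLE_AIML)
print(respond(rules, "hello"))
```

An interpreter such as PyAIML or Program-AB plays the role of load_rules and respond here, with full support for the AIML tag set.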
I have a Python web application and I would like to run multiple scripts from it. The scripts are written in various languages, such as Bash, Lua, Perl, C++, and Ruby. Before running a script, I would like to parse it and replace predefined placeholders with actual values. For example, let's say I have the following Bash script:
#!/bin/bash
ping -c 3 {{ip}}
Then I would like to pass that script to a wrapper along with all the variables the script needs, in this case just the ip variable. The wrapper should replace all the variables with actual values, so if we pass in ip = 10.1.1.1, the script should become:
#!/bin/bash
ping -c 3 10.1.1.1
I want this functionality for all of these programming languages. So I'm using Python, which should in turn use some kind of wrapper that accepts a script plus arguments and outputs the corresponding script.
I've already found SWIG, but I don't know whether it does what I want, so suggestions are greatly appreciated.
Thank you
It sounds like you want m4, which would handle this easily for all scripting languages. C++, being a compiled language, would be a totally different undertaking, but I'll assume you didn't really mean that.
There are multiple free m4 implementations, and one is already installed on almost every Linux box.
You definitely don't want SWIG; it does something completely different.
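For comparison, the {{name}} substitution from the question can also be done directly in Python with a regular expression. This is not m4, just a sketch of the same idea. The render function and the placeholder syntax handling are assumptions for illustration.

```python
import re

# Matches {{name}}, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(script_text, values):
    """Replace each {{name}} with values[name]; a missing name raises KeyError."""
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), script_text)


template = "#!/bin/bash\nping -c 3 {{ip}}\n"
print(render(template, {"ip": "10.1.1.1"}))
```

If the values come from web input, they should be validated before substitution, since anything inserted into a shell script will be executed as part of it.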
I would design the scripts to take command line arguments, read a common configuration file, or read environment variables, instead of trying to modify the internals of the scripts.
Are you really going to have your Python web application modify the C++ code, recompile it, and then run it? Something about that seems wrong to me. It would probably be slow.
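A sketch of that approach from the Python side: the script stays unchanged on disk, and the web application passes values as command-line arguments and environment variables. The run_script helper and the TARGET_COUNT variable are hypothetical names, and a Python one-liner stands in for a real Bash, Lua, or Perl script.

```python
import os
import subprocess
import sys


def run_script(command, args=(), extra_env=None):
    """Run a script as-is, passing values as arguments and environment variables."""
    env = dict(os.environ, **(extra_env or {}))
    result = subprocess.run(
        [*command, *args], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in for a real script: echoes its first argument and one env variable.
child = [
    sys.executable,
    "-c",
    "import os, sys; print(sys.argv[1], os.environ['TARGET_COUNT'])",
]
print(run_script(child, args=["10.1.1.1"], extra_env={"TARGET_COUNT": "3"}))
```

Because the arguments are passed as a list rather than through a shell, the values are not interpreted as shell syntax, which avoids the injection risk that comes with rewriting script text.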
I have three standalone C++ components: Driver, Parser, and Translator.
Driver connects to a data source and fetches data, Parser parses the data, and Translator converts the data as needed, i.e. the flow of data looks like this:
Driver.Out --> Parser.In -- Parser.Out --> Translator.In
I want to write a runtime interpreter that ties these components together with queues and produces the desired output.
I want to run this interpreter as many times as needed, each instance being an independent process.
Any thoughts would be highly appreciated.
Did you consider embedding an interpreter like Lua inside your application, or embedding your application as an extension for OCaml or Python?
But I don't exactly understand your question.
Use flex and bison. A good book on how to write interpreters or compilers with them is Flex & Bison: Text Processing Tools by John Levine.
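The data flow described in the question (Driver to Parser to Translator, joined by queues) can be sketched as follows. This is an illustration in Python using threads within one process, not the asker's C++ components. The stage and run_pipeline names and the toy parse and translate functions are assumptions. For independent processes, a multiprocessing queue or an OS pipe would play the same role as queue.Queue here.

```python
import queue
import threading

STOP = object()  # sentinel that tells the next stage to shut down


def stage(work, inbox, outbox):
    """Apply work() to each item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            return
        outbox.put(work(item))


def run_pipeline(raw_records, parse, translate):
    """Driver -> Parser -> Translator, with a queue between each pair."""
    to_parser, to_translator, results = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=stage, args=(parse, to_parser, to_translator)),
        threading.Thread(target=stage, args=(translate, to_translator, results)),
    ]
    for worker in workers:
        worker.start()
    for record in raw_records:  # the driver role: feed fetched data in
        to_parser.put(record)
    to_parser.put(STOP)
    for worker in workers:
        worker.join()
    # Drain the result queue up to the sentinel.
    return list(iter(results.get, STOP))


# Toy stand-ins for the real Parser and Translator.
print(run_pipeline(
    ["a=1", "b=2"],
    parse=lambda text: text.split("="),
    translate=lambda pair: {pair[0]: int(pair[1])},
))
```

Each stage only knows its input and output queue, so stages can be added, removed, or reordered without changing the others.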