Are there any good HTML decoders in Clojure?
I have tried a few, such as the ones from clojure.tools.html-utils and the codec from Ring, but they don't decode the HTML fully, i.e. there are still some encoded symbols.
If I put my text into a website such as https://opinionatedgeek.com/Codecs/HtmlDecoder, for example, the HTML decodes properly into text.
The type of text I am getting is Broad Institute of MIT and Harvard - Cambridge/Boston, MA - ONSITE<p>Do you want to help cure cancer? Do you care about the mission behind your software engineering?<p>We are a motivated team of software engineers building scalable tools to analyze massive amounts of genomic data using cloud compute software to process 24TB of biological data daily... and that's just the beginning! We are co-developing products to advance science with the biggest partners in the industry -- working directly with and alongside their engineers.<p>We are seeking strong software engineers to join our team. We have a flat organizational structure with self-directed, agile teams.<p>We use Scala, Spark, Akka, React & Clojurescript. Experience in the tech stack or sciences not req'd.<p>Here is some recent information on our mission: http://www.wbur.org/commonhealth/2016/07/07/precision-medici...<p>Interested? Please email Amy Massey - massey#broadinstitute.org/n/n/nID is https://news.ycombinator.com/item?
When you put this through the website I linked above you get Broad Institute of MIT and Harvard - Cambridge/Boston, MA - ONSITEDo you want to help cure cancer? Do you care about the mission behind your software engineering?We are a motivated team of software engineers building scalable tools to analyze massive amounts of genomic data using cloud compute software to process 24TB of biological data daily... and that's just the beginning! We are co-developing products to advance science with the biggest partners in the industry -- working directly with and alongside their engineers.We are seeking strong software engineers to join our team. We have a flat organizational structure with self-directed, agile teams.We use Scala, Spark, Akka, React & Clojurescript. Experience in the tech stack or sciences not req'd.Here is some recent information on our mission: http://www.wbur.org/commonhealth/2016/07/07/precision-medici...Interested? Please email Amy Massey - massey#broadinstitute.org/n/n/nID is https://news.ycombinator.com/item?
This is how I want it to look
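;; Requires the jsoup library (org.jsoup/jsoup) on the classpath.
(import '[org.jsoup Jsoup])
;; s is the HTML string; .text drops the tags and decodes the entities.
(.text (Jsoup/parse s))
Taken from crouton.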
I'm working on a text mining problem: extracting the place from a text. The place could be just a state, or something more specific such as the name of a neighborhood in Chicago, or even a specific address. All places are in the US.
I've been trying the Yahoo PlaceMaker API, but I can't create an API key (the website is not responding). Is there any other way to do it, such as RapidMiner, or writing a comprehensive regex?
Consider the Stanford Named Entity Recognizer (NER). An online demo is here:
http://nlp.stanford.edu:8080/ner/process
It's a Java library. The license is GPL v2, though the license to distribute it in a standalone app is pricey.
Are there any open source concept mining tools available today? I have only come across tools like Leximancer, which seems to fit the role but is not open source and is quite expensive for an undergraduate student. I have been unsuccessful so far, since searching for the word 'concept' on both Google and Google Scholar does not seem to match what I want.
It seems to me you need a text mining tool for clustering. RapidMiner has an open-source, Java-based Community Edition which has several extensions (Text Mining, R, etc.). In addition, you can develop and integrate your own algorithms too.
Moreover, Rexer Analytics runs a comprehensive data mining survey annually; you can request the reports for free.
Does anyone know how to create thumbnails in C++ from NITF 2.1 images?
Using the package below you should be able to read a NITF image and then generate your own smaller version to save as a thumbnail.
NITRO is a full-fledged, extensible library solution for reading and writing National Imagery Transmission Format (NITF) files, a U.S. Department of Defense standard format. It is written in cross-platform C, with bindings available for other languages (C++, Java, Python). NITRO was originally developed by General Dynamics - Advanced Information Systems in 2004 and is continuously being improved. It is now released as open-source software under the GNU Lesser General Public License (LGPL).
http://nitro-nitf.sourceforge.net/wikka.php?wakka=HomePage
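The "generate your own smaller version" step could be sketched roughly as follows. This is only an illustration in Python (one of the binding languages listed above) and it does not use the NITRO API at all: it assumes the pixel data has already been read into a plain list of rows of integer grayscale samples, and the function name make_thumbnail is made up for this example. It shrinks the image by averaging square blocks of pixels.

```python
def make_thumbnail(pixels, factor):
    """Shrink a grayscale image by averaging factor x factor blocks.

    pixels: list of equal-length rows of integer sample values
            (assumed to be already decoded from the NITF image segment).
    factor: positive integer reduction factor.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    thumb = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Collect the block, clipped at the right and bottom edges.
            block = [
                pixels[y][x]
                for y in range(top, min(top + factor, height))
                for x in range(left, min(left + factor, width))
            ]
            # Integer mean of the block becomes one thumbnail pixel.
            row.append(sum(block) // len(block))
        thumb.append(row)
    return thumb
```

The same block-averaging loop translates directly to C++ once the image band has been read into a buffer; for large images you would read and reduce one strip of rows at a time rather than loading everything into memory.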
Basically, I want tools which generate source code visualizations such as:
function call graph
dependency graph
...
Doxygen is really excellent for this, although you will need to install GraphViz to get the graphs to draw.
Once you've got everything installed, it's really rather simple to draw the graphs. Make sure you set EXTRACT_ALL and CALL_GRAPH to YES (and HAVE_DOT to YES, so that GraphViz is actually used) and you should be good to go.
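As a rough sketch, the relevant part of a Doxyfile might look like the fragment below. EXTRACT_ALL and CALL_GRAPH are the options named above; HAVE_DOT and CALLER_GRAPH are additions on my part (HAVE_DOT enables the GraphViz-based graphs, CALLER_GRAPH is optional), and all other settings are assumed to be left at their defaults.

```
# Document every entity, even those without doc comments.
EXTRACT_ALL  = YES
# Use the dot tool from GraphViz to render graphs.
HAVE_DOT     = YES
# For each function, draw the functions it calls.
CALL_GRAPH   = YES
# Optional: for each function, also draw the functions that call it.
CALLER_GRAPH = YES
```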
The full documentation on this feature for Doxygen is here.
I strongly recommend BOUML. It's a free UML modelling application, which:
is extremely fast (fastest UML tool ever created, check out benchmarks),
has rock solid C++ import support,
has great SVG export support, which is important because viewing large graphs in a vector format that scales quickly, e.g. in Firefox, is very convenient (you can quickly switch between a "bird's eye" view and a class detail view),
is full featured and very actively developed (look at the development history; it's hard to believe that such fast progress is possible).
So: import your code into BOUML and view it there, or export to SVG and view it in Firefox.
For the free version:
The source is on GitHub as DoUML
Installers can be downloaded from http://www.bouml.fr/download.html
You can look at different tools for software design and modelling (Rational Rose, Sparx Enterprise Architect, Umbrello, etc.). The majority of them have some functionality for reverse engineering models from source code, producing UML class diagrams and sometimes even sequence diagrams (which are very close to a function call graph).
But after you generate some pictures for a really big code base, you may realise that such graphs are rather hard to read and understand. Unfortunately, the capabilities for visualizing complexity are very limited.
As for me, a "divide and conquer" approach is more convenient. You can extract different functionality blocks or layers from your code base (just sorting cpp files into different folders is sometimes enough). Another way is to use some scripts (bash, Python) to create simple CSV tables with the parameters of interest for files, classes or functions, such as "number of dependencies".
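A minimal sketch of such a script, in Python, might look like this. It is only an illustration: it treats the number of #include directives in a file as a crude stand-in for "number of dependencies", and the function names, the CSV column names and the list of file suffixes are all made up for the example.

```python
import csv
import re
from pathlib import Path

# Matches lines such as: #include <vector>  or  #  include "foo.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')
# Assumed set of C/C++ file suffixes; adjust to the project at hand.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".cxx", ".h", ".hh", ".hpp"}


def count_includes(path):
    """Return the number of #include directives in one source file."""
    count = 0
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if INCLUDE_RE.match(line):
                count += 1
    return count


def collect_rows(root):
    """Yield (relative path, include count) for each C/C++ file under root."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SOURCE_SUFFIXES:
            yield path.relative_to(root).as_posix(), count_includes(path)


def write_report(root, out):
    """Write a CSV table with one row per source file to the stream out."""
    writer = csv.writer(out)
    writer.writerow(["file", "include_count"])
    for row in collect_rows(root):
        writer.writerow(row)
```

Calling write_report("src", sys.stdout) (after import sys) prints the table for the tree under src; the resulting CSV can then be sorted or charted in a spreadsheet to spot the files with the most dependencies.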
If you use Visual Studio, the 2010 Ultimate release lets you generate sequence diagrams and dependency graphs. However, the release currently supports only .NET application projects.
The team has gotten lots of interest in supporting C++ in a future release, so you might want to stay tuned. In the meantime, you can post in the VS 2010 Architectural Discovery & Modeling Tools forum at http://social.msdn.microsoft.com/Forums/en-US/vsarch/threads to request an update. I know the product team loves hearing customer feedback about the tools.
You can also learn more about creating sequence diagrams and dependency diagrams from .NET code in the following topics:
How to: Find Code Using Architecture Explorer: http://msdn.microsoft.com/en-us/library/dd409431%28VS.100%29.aspx
How to: Generate Graph Documents from Code: http://msdn.microsoft.com/en-us/library/dd409453%28VS.100%29.aspx#SeeSpecificSource
How to: Explore Code with Sequence Diagrams: http://msdn.microsoft.com/en-us/library/ee317485%28VS.100%29.aspx
To try the RC release and provide feedback, download it at http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=457bab91-5eb2-4b36-b0f4-d6f34683c62a
Try Doxygen.
Example output from Xerces
In addition to the tools mentioned above, you may try Understand, but it is not free.
This might be a duplicate, but check out OllyDbg and IDA Pro; this website also has a whole bunch of resources with some very sexy images.