Cannot find org.apache.hadoop.hbase.mapreduce and mapred packages in hbase 1.2.6 - mapreduce

I am trying to upgrade an existing project's HBase version from 0.94.1 to 1.2.6. On the official website I can see packages and classes such as mapreduce and mapred, but they are not available in the jars that I found. I have found the hbase-common, hbase-annotations, and hbase-client jars; however, these jars do not include all the classes in the API. Where can I find them?

You can usually work this sort of stuff out by looking at the source. Going through the different releases, you can see the following:
0.94.1 - https://github.com/apache/hbase/tree/2fd647d99e4aae7da71cebb9efd017eebfe0e4fb
Everything in hbase was combined in a single package.
1.2.6 - https://github.com/apache/hbase/tree/2f9b9e17d0522e36063bf52ecc58576243d20b3f
The code has been split into numerous modules. In this version the mapred and mapreduce code has been moved into hbase-server.
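If you are resolving dependencies with Maven, a minimal sketch of the extra dependency (the coordinates match the 1.2.6 source tree linked above; adapt to your build tool):

    <!-- hbase-server is where the mapred/mapreduce classes live in 1.2.x -->
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-server</artifactId>
      <version>1.2.6</version>
    </dependency>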

Related

Stanford CoreNLP caseless classifier in NLTK

I can't find the file english.conll.4class.caseless.distsim.crf.ser.gz in the zip file downloaded from http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip. Can anyone please tell me how to get that caseless classifier for Stanford CoreNLP?
I don't think they provide the caseless models as standalone .gz files; they appear to derive them via a Makefile script. I checked the Linux versions as well and it's not available there either; it seems they build it via a truecaser. While I don't fully understand the mechanism, below is a pointer to where the references appear in the Stanford CoreNLP GitHub repository.
https://github.com/stanfordnlp/CoreNLP/blob/d558d95d80b36b5b45bc21882cbc0ef7452eda24/scripts/ner/Makefile
You can search for "english.conll.4class.caseless.distsim.crf.ser.gz" in the CoreNLP GitHub repository for more pointers.
FYI, you can also look at older versions, since the documentation mentions that they used to be provided separately.
For those who face the same problem:
Download the model jar from https://stanfordnlp.github.io/CoreNLP/index.html#download (there is a table that lists different models for different languages) and open/extract the jar contents (e.g., I used WinRAR). Then go to the edu/stanford/nlp/models/ner directory, where you can find the .ser.gz files for every model.
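As a command-line alternative (a sketch; the jar name is illustrative, use whichever models jar you actually downloaded), a jar is just a zip, so you can extract the models directly:

    # list the NER models in the jar, then pull out the caseless ones
    unzip -l stanford-corenlp-models.jar 'edu/stanford/nlp/models/ner/*'
    unzip -j stanford-corenlp-models.jar 'edu/stanford/nlp/models/ner/*caseless*' -d ner-models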

How should I provide library binaries to developers?

I want to make it easy for others to work on my repository. However, since some of the compiled dependencies are over 100 MB in size, I cannot include them in the repository; GitHub rejects files that large.
What is the best way to handle large binaries of dependencies? Building the libraries from source is not easy under Windows and takes hours. I don't want every developer to struggle with this process.
I've recently been working on using Ivy (http://ant.apache.org/ivy/) with C++ binaries. The basic idea is that you build the binaries for every build combination. You will then zip each build combination into a file with a name like mypackage-windows-vs12-x86-debug.zip. In your ivy.xml, you will associate each zip file with exactly one configuration (ex: windows-vs12-x86-debug). Then you publish this package of multiple zip files to an Ivy repo. You can either host the repo yourself or you can try to upload to an existing Ivy repo. You would create a package of zip files for each dependency, and the ivy.xml files will describe the dependency chain among all the packages.
Then, your developers must set up Ivy. In their ivy.xml files, they will list your package as a dependency, along with the configuration they need (ex: windows-vs12-x86-debug). They will also need to add an ivy resolve/retrieve step to their build. Ivy will download the zip files for your package and everything that your package depends on. Then they will need to set up unzip & move tasks in their builds to extract the binaries you are providing, and put them in places their build is expecting.
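A minimal sketch of the consumer side, assuming the configuration name used above (the org/module names are illustrative):

    <ivy-module version="2.0">
      <info organisation="com.example" module="myapp"/>
      <configurations>
        <conf name="windows-vs12-x86-debug"/>
      </configurations>
      <dependencies>
        <!-- resolves to mypackage-windows-vs12-x86-debug.zip plus transitive deps -->
        <dependency org="com.example" name="mypackage" rev="1.0"
                    conf="windows-vs12-x86-debug->windows-vs12-x86-debug"/>
      </dependencies>
    </ivy-module>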
Ivy's a cool tool but it is definitely streamlined for Java and not for C++. When it's all set up, it's pretty great. However, in my experience as a person who is not really familiar with DevOps at all, integrating it into a C++ build has been challenging. I found that it was easiest to create simple ant tasks that do the required ivy actions, then use my "regular" build system (make) to call those ant tasks when needed.
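For example (a sketch, assuming Ivy's ant tasks are available on ant's classpath), the build.xml can expose a single resolve/retrieve target:

    <project name="deps" xmlns:ivy="antlib:org.apache.ivy.ant">
      <target name="resolve">
        <!-- downloads the zips and lays them out under deps/ -->
        <ivy:retrieve pattern="deps/[conf]/[artifact]-[revision].[ext]"/>
      </target>
    </project>

make then just needs a rule along the lines of "deps: ; ant resolve" before the unzip and move steps.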
I should also mention that the reason I looked into Ivy was that I was implementing this in a corporate environment where I couldn't change system files. If you and your developers can do that, you may be better off with an RPM/APT system. You'd set up a repo and have your developers add it to the appropriate RPM/APT config file. Then they would run commands like "sudo apt-get install mypackage" and apt-get would do all the work of downloading and installing the right files in the right places. I don't know how this would work on Windows; maybe someone has created a Windows RPM/APT client.

Changing where Sitecore module is installed

I have a package I want to install. I would like the files to end up in a different directory than the installation wizard chooses for them.
For example, my Sitecore copy is running at C:\SiteCore\website
The module added files to C:\SiteCore\website\Console
I would like the files to ultimately live at C:\SiteCore\website\sitecore_modules\Console
I am using Sitecore 6.5 rev 111230, but we are planning to upgrade very soon. I would like for my installed packages to migrate seamlessly once we have upgraded. For reference, the package I want to install at the moment is the Sitecore Powershell Extensions. Although, I would prefer to apply a similar method to any future packages that I install.
Is there a secret switch in the package installation process to allow me to do this? Can I do it from the package installation wizard? Is there another way to install packages?
I'm assuming I can't just change the package path and expect everything to keep working. Do I have to update a configuration somewhere (a file or inside the Sitecore CMS GUI) to make the package recognize the new file locations?
The module creator defines where files exist. If you move them you run the risk of something not working. The best idea is to ask the creator on the Marketplace page of the module.
There is no turn-key way to change this.
I guess you can take the code from the Marketplace and modify it.
I don't know exactly how licensing works for Marketplace modules, but I think people are allowed to modify others' code.
Check the code and also the items; some fields may contain values for the folder path.
I discovered a way to accomplish this, but it can be quite involved or even impossible, depending on the complexity and size of the package.
First of all, I did take the question to the module creator and had a very helpful and informative conversation with the creator. So thanks for that suggestion - they may even move the install location in a future release, based on my request.
The workaround is to first install the package on a system as normal, then figure out everything that comes with the package. For files, this is easy if your Sitecore root is under source control. For items, this is really complicated. You can search for the installed items by owner, if you had the foresight to create and use a unique user for the package installation. Or you can check the untyped files in the package, which are essentially XML-based item manifests.
Once you have a detailed list, you make the desired modifications to the locations. Then you recreate the package yourself using the Sitecore package designer.
This works for simple packages - I did it to one small package that I hope to get up on the Sitecore marketplace as shared source soon. And by small, I mean it was 2 files and 3 items. The package that prompted me to ask this question would not cooperate with this workaround. The included .dll had some assumptions about the file structure hard-coded into it.
The workaround I took for the more complex package was really quite basic: I just created a new source-code project external to the required path. That let me wrap everything up neatly without getting medieval on the package files.
Thanks for both your answers, a very fine +1 to you.

C++ Boost thread library pulls in the whole development environment

I am using boost-thread in my application. When I deploy the application on a client machine (running Ubuntu 11.10), I need to make sure that libboost_thread.so is available. However, when I run "apt-get install libboost-thread1.46", it seems to pull in the whole development environment (libgcc, libboost1.46-dev, etc.). The machine needs just the runtime environment, not the development environment. I am wondering if there is a better way to handle this.
There is no such package: "libboost-thread1.46" does not exist on Ubuntu. apt-get treats the name as a regular expression, and the development package also matches the expression. The two candidate packages are named libboost-thread1.46-dev and libboost-thread1.46.1; the latter is the one you want. It depends only on three libraries (libgcc, libc, libstdc++), all of which you need to deploy anyway because your program and libboost-thread link against them.
So, deploy by installing libboost-thread1.46.1 and everything should be fine.
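That is (exact package name as above):

    sudo apt-get install libboost-thread1.46.1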
You can build the individual libraries you need yourself by downloading the Boost source tarball and using the bjam build tool.
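A sketch of that, for a Boost release of roughly this vintage (the build script is called b2 in later releases):

    # from the unpacked boost source tree
    ./bootstrap.sh --with-libraries=thread
    ./bjam link=shared variant=release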
You could link statically against boost.
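For instance (a sketch; library names and paths vary by distro and toolchain):

    # link boost statically while keeping the system libraries dynamic
    g++ -o myapp main.cpp -Wl,-Bstatic -lboost_thread -Wl,-Bdynamic -lpthread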
You can also use bcp and copy the necessary files into your own source tree. I personally have the headers installed on my system and just added the source files to my project (once.cpp, thread.cpp, timeconv.inl, tss_null.cpp on Linux).
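A minimal bcp invocation for that approach (the destination directory is illustrative):

    # copy boost.thread and everything it depends on into your own tree
    mkdir -p third_party/boost
    bcp thread third_party/boost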

Keeping dependency versions up to date in Leiningen projects

Is there a simple way to find out what versions of dependencies are available using Leiningen?
E.g., if I have a web app which depends on Hiccup and Compojure, how can I be sure that I'm on the latest version of each without going to the github page for each?
NOTE: I use Ant and Ivy for building my Java projects, so I have limited knowledge of Maven - so please spell out (or provide Fine Links for me to read) any Maven concepts that Leiningen exposes to me which would help with this (I know that under the hood, Leiningen uses Maven for dependency resolution). Ta.
The Clojure ecosystem has evolved since the original answer was offered. At the present time, I would recommend using lein-ancient:
A Leiningen plugin to check your project for outdated dependencies and plugins. This plugin supersedes lein-outdated and uses metadata XML files in the different Maven repositories instead of a Lucene-based search index. Version comparison is done using version-clj.
Its precursor, lein-outdated, has this helpful message in its README: "lein-outdated is outdated". :)
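Setup is minimal (a sketch; the plugin version shown is illustrative, check Clojars for the current one):

    ;; in ~/.lein/profiles.clj
    {:user {:plugins [[lein-ancient "0.6.15"]]}}

Then run "lein ancient" inside a project to list the dependencies that have newer versions available.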
The canonical way of doing this, at least for dependencies kept in clojars, is the lein-search plugin.
Update: See the lein-ancient answer above for a more up-to-date response.
You should have a look at the answer to this question. Leiningen uses the same versioning mechanism as maven so, for example, if you want to use the latest version of a given library, you can substitute the word "LATEST" for the version number. You can also specify a release version or a version range. Again, look at the answer at that link.
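For illustration, a project.clj sketch using those special version strings (they make builds non-reproducible, so pin exact versions for anything you ship):

    (defproject myapp "0.1.0-SNAPSHOT"
      :dependencies [[compojure "LATEST"]     ; newest version, snapshots included
                     [hiccup "RELEASE"]])     ; newest release version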
A web service that provides this info, along with badges for READMEs:
http://clj-deps.herokuapp.com
Disclaimer: it's by me.