HA on Yarn Timelineserver and MapReduce JobHistoryserver

HA on Yarn Timelineserver and MapReduce JobHistoryserver - mapreduce

Some Yarn services, like the ResourceManager, support High Availability.
After looking through the corresponding documentation of both the Timelineserver and the JobHistory, I haven't been able to see if that feature is supported for these 2 services. I have also been looking through the different properties of the corresponding configuration files (yarn-site and mapred-site) and haven't found anything either.
I found it strange that HA is not supported for none of the services mentioned.
Is it really not possible to enable such feature on both the Timelineserver and JobHistoryserver?
Has anyone been able to enable it?
Note: I have tried putting 2 hosts in properties such as yarn.timeline-service.webapp.address from yarn-site.xml, but it prints out errors.

Related

Mapping dependencies/requirements for GCP APIs/services

Does anyone knows a way to map the dependencies or requirements of any GCP API?
E.g. enabling container.googleapis.com would automatically enable compute.googleapis.com and others into a same chart/table/text/anything.
The GCP docs don't specify any such dependency for any API (from what I have seen so far). So I'm either looking for a Doc which specifies this, a gcloud command or a completely different tool that can help mapping it.

We don't have any public external documentation around service dependencies for now. therefore please open a FR in refer to this link

did you open a Feature Request as suggested ? If so, can you share the link ?
As a faint consolation, you can have a look at this article from which we can tell that the API interdependency information was once available through the serviceusage API.
There you'll find a diagram as of october 2020 (see screenshot bellow)

One workaround could be to use the Service Usage API. The disable method has a disableDependentServices field which disables all services that depend on the services being disabled.
You could enable a bunch of services in GCP, disable a service, and observe which dependent services are also disabled.

I did end up opening a feature request for this and the fact that I had to do so still boggles the mind.

Duplicating GCP Network and VPN Configuration

I have two GCP projects communicating with each over over a Classic VPN, I'd like to duplicate this entire configuration to another GCP account with two projects. So in addition to the tunnels and gateways, I have one network in each project to duplicate, some firewall rules, and a custom routing rule on one project.
I've found how I can largely dump these using various
gcloud compute [networks | vpn-tunnels | target-vpn-gateways] describe
commands, but looking at the create commands they don't seem setup to be piped to, nor use this output data as a file, not to mention there are some items that won't be applicable in the new projects.
I'm not just trying to save time, I'm trying to make sure I don't miss anything and I also want a hard copy of sorts, of my current configuration.
Is there any way to do this? thank you!

As clarified in other similar cases - like here and here - it's not possible to clone or duplicate entire projects in Google Cloud Platform. As explained in these other cases, you can use Terraformer as to generate Terraform files from existing infrastructure (reverse Terraform) and then, recreate the files in your new instance as explained here.
To summarize, you can try this CLI as a possible alternative to copy part of your structure, but as emphasized in this answer here, there is no automatic way or magic tool that will copy everything, so even your VMs configuration, your app contents, your data content, won't be duplicated.

GCP Deployment Manager - What Dev Ops Tool To Use In Conjunction?

I'm presently looking into GCP's Deployment Manager to deploy new projects, VMs and Cloud Storage buckets.
We need a web front end that authenticated users can connect to in order to deploy the required infrastructure, though I'm not sure what Dev Ops tools are recommended to work with this system. We have an instance of Jenkins and Octopus Deploy, though I see on Google's Configuration Management page (https://cloud.google.com/solutions/configuration-management) they suggest other tools like Ansible, Chef, Puppet and Saltstack.
I'm supposing that through one of these I can update something simple like a name variable in the config.yaml file and deploy a project.
Could I also ensure a chosen name for a project, VM or Cloud Storage bucket fits with a specific naming convention with one of these systems?
Which system do others use and why?

I use Deployment Manager, as all 3rd party tools are reliant upon the presence of GCP APIs, as well as trusting that those APIs are in line with the actual functionality of the underlying GCP tech.
GCP is decidedly behind the curve on API development, which means that even if you wanted to use TF or whatever, at some point you're going to be stuck inside the SDK, anyway. So that's why I went with Deployment Manager, as much as I wanted to have my whole infra/app deployment use other tools that I was more comfortable with.
To specifically answer your question about validating naming schema, what you would probably want to do is write a wrapper script that uses the gcloud deployment-manager subcommand. Do your validation in the wrapper script, then run the gcloud deployment-manager stuff.
Word of warning about Deployment Manager: it makes troubleshooting very difficult. Very often it will obscure the error that can help you actually establish the root cause of a problem. I can't tell you how many times somebody in my office has shouted "UGGH! Shut UP with your Error 400!" I hope that Google takes note from my pointed survey feedback and refactors DM to pass the original error through.
Anyway, hope this helps. GCP has come a long way, but they've still got work to do.

Cannot set consistency level when querying Amazon Keyspaces service from DataGrip

I'm trying to perform inserts on Amazon's Managed Cassandra service from IntelliJ's DataGrip IDE, however I recieve the following error:
Consistency level LOCAL_ONE is not supported for this operation. Supported consistency levels are: LOCAL_QUORUM
This is due to Amazon using the LOCAL_QUORUM consistency level for writes.
I tried to set the consistency level with CONSISTENCY LOCAL_QUORUM; before running other queries but it returned the following error:
line 1:0 no viable alternative at input 'CONSISTENCY' ([CONSISTENCY])
From my understanding, this is because CONSISTENCY is a cqlsh command and not a CQL command.
I cannot find any way to set the consistency level from within DataGrip so that I can run scripts and populate my tables.
Ultimately, I will use plain cqlsh if I cannot find a solution but I was hoping to use DataGrip as I find it useful and have many databases already configured. I hope someone can shed some light on the issue, this seems like it should be a basic feature.

I am Max from DataGrip team, and the correct answer is:
It could be JDBC driver issue and the desired method hasn't been implemented yet. Since you're trying to run pure cqlsh command as SQL. Follow the issue DBE-10638.

It's a DataGrip bug, see https://youtrack.jetbrains.com/issue/DBE-10182 :
Cassandra 'CONSISTENCY' command is not supported
So upvote that bug, and maybe add a comment that it makes DataGrip useless for writing to Amazon Managed Cassandra

Amazon Keyspaces (Apache Cassandra)
Now I used DataGrip version 2020.1.3 (Buy Licensed)
Encounter problems as well.
Cannot change type CONSISTENCY ONE to LOCAL_QUORUM
I have opened an issue already and waiting for the investigation.
So, I try so many tools and found that DBeaver is working,
The CONSISTENCY can be selected in the configuration GUI.
https://dbeaver.com/download

Kubernetes: how to properly change apiserver runtime settings

I'm using kube-aws to run a Kubernetes cluster on AWS, and everything works as expected.
Now, I realize that cron jobs aren't turned on in the version I'm using (v1.7.10_coreos.0), while the documentation for Kubernetes only states the following:
For previous versions of cluster (< 1.8) you need to explicitly enable batch/v2alpha1 API by passing --runtime-config=batch/v2alpha1=true to the API server (see Turn on or off an API version for your cluster for more).
And the documentation directed to in that text only states this (it's the actual, full documentation):
Specific API versions can be turned on or off by passing --runtime-config=api/ flag while bringing up the API server. For example: to turn off v1 API, pass --runtime-config=api/v1=false. runtime-config also supports 2 special keys: api/all and api/legacy to control all and legacy APIs respectively. For example, for turning off all API versions except v1, pass --runtime-config=api/all=false,api/v1=true. For the purposes of these flags, legacy APIs are those APIs which have been explicitly deprecated (e.g. v1beta3).
I have been unsuccessful in finding information about how to change the configuration of a running cluster, and I, of course, don't want to try to re-run the command on api-server.
Note that kube-aws still use hyperkube, and not kubeadm. Also, the /etc/kubernetes/manifests-directory only contains the ssl-directory.
The setting I want to apply is this: --runtime-config=batch/v2alpha1=true
What is the proper way, preferably using kubectl, to apply this setting and have the apiservers restarted?
Thanks.

batch/v2alpha1=true is set by default in kube-aws. You can find it here

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js