Can we run multiple jobs in parallel having just one GoCD agent? - go-cd

I have installed the GoCD server and one GoCD agent.
I have created a new pipeline: Pipeline1
with one stage: Stage1
and two jobs: Job1 and Job2
Since I have only one GoCD agent, will these two jobs be executed in parallel?

No, one agent only runs one job at a time.
You can, however, run multiple agents in parallel on the same machine, so that you can execute several jobs in parallel.
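For example, if Docker is available on that machine, one quick way to get more agents is to start additional agent containers. This is only a sketch: the image tag and server URL below are placeholders, and GO_SERVER_URL is the variable used by the official GoCD agent images, so check the image documentation for your version.
# start two agents on the same machine, both registering with the same server
docker run -d -e GO_SERVER_URL=https://my-go-server:8154/go gocd/gocd-agent-alpine-3.18:v23.5.0
docker run -d -e GO_SERVER_URL=https://my-go-server:8154/go gocd/gocd-agent-alpine-3.18:v23.5.0
Each container registers as a separate agent, so Job1 and Job2 can then be scheduled at the same time.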

Related

Run command from terminal window in AWS Instance at specified time or on start up

I have an AWS Cloud9 instance that starts running at 11:52 PM MST and stops running at 11:59 PM MST. The instance contains a Dockerfile that, when run with the correct mount, runs a set of C++ .cpp files that collect live web data. The ultimate goal is for this instance to be fully automatic, so that every night it collects the live web data for that date; that is why the instance is up at the very end of each day. Is it possible to have my AWS instance run a given command in a terminal at a certain time, say 11:55 PM, or even upon startup? So that at that time, or at startup, the command "docker run -it...." is run within the instance.
Is automating this process possible? I have looked into CloudWatch Events and think that might be the best way to automate this, but I am not quite sure how I would create a rule to do the job. If it is not possible to automate a certain command within a terminal window, could I automate the Dockerfile to run at a certain time?
Of course you can automate running commands, not just Docker but in fact any command, using the cron daemon. All you need to do is place your command in a shell script, say doc.sh, in your desired directory. Then:
SSH into your instance.
Open a terminal and type crontab -e.
Add an entry of the form a b c d e /directory/command
where a = minute, b = hour, c = day of month, d = month, e = day of week,
and /directory/command is the location of the script you want to run.
For more cron examples, see https://www.cyberciti.biz/faq/how-do-i-add-jobs-to-cron-under-linux-or-unix-oses/
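As a concrete sketch (the script path is an assumption), a crontab entry that runs the script at 11:55 PM every day by the instance's clock would look like:
# m h dom mon dow command
55 23 * * * /home/ec2-user/doc.sh
and an entry that runs it at every boot of the instance:
@reboot /home/ec2-user/doc.sh
Note that there is no terminal attached when cron runs the job, so inside doc.sh you would typically drop the -it flags from docker run (or use -d) and redirect output to a log file.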
If you have a Dockerfile that you want to run for a few minutes a day, you should look into Fargate. You can schedule an event with CloudWatch, run the container, and then shut it down when it's done.
It will probably cost around $0.01/day to run this.
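As a sketch of the scheduling side (the rule name is an assumption; CloudWatch schedule expressions are evaluated in UTC, and MST is UTC-7, so 11:55 PM MST is 06:55 UTC):
aws events put-rule --name nightly-scrape --schedule-expression "cron(55 6 * * ? *)"
The rule's target would then be the Fargate task definition that runs your container, which you can attach via the console or aws events put-targets.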

How to execute mvn clean install in goCD

The mvn clean install command does not execute in GoCD. The pipeline gets triggered, but nothing is displayed in the logs and the job keeps running forever, even after setting the inactivity timeout to 1 minute.
I have created a pipeline and added the mvn clean install command to it, as in the image below. Please let me know what needs to be changed to generate artifacts as the first step.
The most important clue is in your first screenshot: it says "Agent: Not yet assigned". That means that no agent (aka worker) could be found that can handle your job.
Please read the manual on managing agents, specifically the section Matching jobs to agents.
Frequent reasons why no agent can be assigned:
No agents available at all
The agent(s) are in environments, but the pipeline isn't
A mismatch between the resources specified on the job and those assigned to the agent in agent management (see the sketch below).
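For the last point, a minimal sketch of the job side in the XML config (the resource name maven is an assumption):
<job name="Job1">
  <resources>
    <resource>maven</resource>
  </resources>
  <tasks>
    <exec command="mvn">
      <arg>clean</arg>
      <arg>install</arg>
    </exec>
  </tasks>
</job>
The same maven resource must then be assigned to at least one agent on the agents page, otherwise the job stays in "Not yet assigned".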

Is there a way to configure and change Yarn scheduler at runtime?

Currently I am using the default YARN scheduler, but would like to do something like this:
Run Yarn using the default scheduler
If (number of jobs in queue > X) {
Change the Yarn scheduler to FIFO
}
Is this even possible through code?
Note that I am running Spark jobs on an AWS EMR cluster with YARN as the resource manager.
Well, it is possible: you can have a poller check the current queue (using the ResourceManager API) and update yarn-site.xml, followed by a restart of the ResourceManager. However, restarting the RM can impact your queue, because the currently running jobs will be killed or shut down (and probably retried later).
If you need a more efficient switch between the Capacity and FIFO schedulers, you may need to extend those classes and design your own scheduler that does the job of your pseudocode.
EMR by default uses the Capacity Scheduler with the DefaultResourceCalculator and runs jobs on the default queue. For example, EMR has YARN configuration on paths like the following:
/home/hadoop/.versions/2.4.0-amzn-6/etc/hadoop/yarn-site.xml
<property><name>yarn.resourcemanager.scheduler.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value></property>
with
/home/hadoop/.versions/2.4.0-amzn-6/etc/hadoop/capacity-scheduler.xml
org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
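Putting the first suggestion together, a rough sketch of such a poller (the threshold, configuration path, and restart commands are assumptions and vary by EMR release; it also assumes jq is installed):
#!/bin/bash
# Count applications waiting in the queue via the ResourceManager REST API
PENDING=$(curl -s http://localhost:8088/ws/v1/cluster/metrics | jq '.clusterMetrics.appsPending')
if [ "$PENDING" -gt 10 ]; then
  # Swap the scheduler class in yarn-site.xml from Capacity to FIFO
  sudo sed -i 's|scheduler.capacity.CapacityScheduler|scheduler.fifo.FifoScheduler|' /etc/hadoop/conf/yarn-site.xml
  # Restart the ResourceManager so the new scheduler takes effect
  sudo /sbin/stop hadoop-yarn-resourcemanager; sudo /sbin/start hadoop-yarn-resourcemanager
fi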

EMR & Spark adding dependencies after cluster creation

Is it possible to install additional libs/dependencies after the cluster is already up and running?
Things I've done that are related to this:
I've already done the pre-creation bootstrapping process (this is a different solution altogether)
alternatively, SSH'd into each node and installed dependencies after the cluster comes up
I think the post-startup installation solution would involve being able to fire off a command to all the executors from the driver. Is there a way to accomplish this?
If you want additional dependencies for your steps, you can add them in the command of the step (e.g. for Spark, use the --jars option).
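For example, a sketch of a step command (the jar path, package coordinates, and script name are placeholders):
spark-submit --jars /home/hadoop/libs/my-dependency.jar \
  --packages com.databricks:spark-redshift_2.10:0.6.0 \
  my_job.py
The jars listed with --jars are distributed to the executors for that application, so nothing has to be pre-installed on the nodes.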

Spark job does not leverage all nodes

So my setup on AWS is 1 master node and 2 executor nodes.
I'd expect both executor nodes to work on my task, but I can see that only one registers normally; the other registers as the ApplicationMaster. I can also see that 16 partitions are processed at a time.
I am using spark-shell for now, with all the default settings, on EMR 4.3. Command to start the shell:
export SPARK_EXECUTOR_MEMORY=20g
export SPARK_DRIVER_MEMORY=4g
spark-shell --num-executors 2 --executor-cores 16 --packages com.databricks:spark-redshift_2.10:0.6.0 --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0
Any ideas where to start debugging this? Or is it correct behaviour?
I think the issue is that you are running in 'cluster' mode, so the Spark driver is running inside an ApplicationMaster on one of the executor nodes and is using 1 core. Because your executors require 16 cores each, that node only has 15 cores available and does not have the required resources to launch a second executor. You can verify this by looking at "Nodes" in the YARN UI.
The solution may be to launch the spark shell in client mode (--deploy-mode client) or to change the number of executor cores, as in the sketch below.
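A sketch combining both suggestions (15 cores is an assumption that leaves a core free for the ApplicationMaster on whichever node hosts it; the remaining flags are copied from the question):
spark-shell --deploy-mode client --num-executors 2 --executor-cores 15 \
  --packages com.databricks:spark-redshift_2.10:0.6.0 \
  --driver-java-options "-Xss100M" --conf spark.driver.maxResultSize=0
With 15-core executors, the node that also hosts the ApplicationMaster still has enough free cores to launch an executor, so both nodes should register.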