Concurrent workflow not starting from PMCMD Command - concurrency

I have a requirement to start workflow concurrently with multiple instances, all instances need to run in parallel. When I run an instance it is running and related param file is being picked up. But when I start another instance to run in parallel with previous instance, it is giving below Error.
"Start Workflow Advanced: ERROR: Workflow [wf_name]: Could not start execution of this workflow because the current run on this Integration Service has not completed yet."
I tried doing this using PMCMDcommand like below. It's starting without any param file and without instance name. But PMCMD log is showing the the workflow is started for the given instance successfully.
pmcmd startworkflow -sv 'INT_......' -d 'DOM_......' -u 'venkat' -p MyPass.... -f 'MyFold...' -nowait -rin $inst_name $wf_name
This is working fine in our test environment. But not working in QA. Is there a configuration setting to avoid this behavior.

Please make sure the workflow is properly configured to allow multiple executions: the Configure Concurrent Execution has to be enabled and Allow concurrent run... needs to be correctly set. If you run with same instance name, the Allow concurent run with same instance name must be chosen. Otherwise, choose the Allow concurent run only with unique instance name, add the instance name and desired parameter file to the list below.
In your command I don't see the parameterfile, so I assume the latter should be the proper setup.

The issue is resolved by restarting the integration service. We did not restart integration service to fix this issue. But that resolved this issue. When we contacted informatica support for resolution, below KB link is provided by them. https://kb.informatica.com/solution/23/Pages/59/501120.aspx
Please find the thread I have opened in Informatica network.
https://network.informatica.com/thread/83540

Related

How do I Prevent Long Lived Selenium Processes from Generating OOM Errors?

How do I prevent containers that use selenium from 'freezing' after getting OOM errors and seeing Failed to start thread - pthread_create failed (EAGAIN)? What is the root cause and how do I fix it? Further, how can I test the solution locally and how can I implement the solution on AWS?
The following is a guide to diagnose and solve OOM issues occurring due to long lived Selenium instances. Specifically, on AWS you will likely get Failed to start thread - pthread_create failed (EAGAIN) for attributes: stacksize: 1024k, guardsize: 0k, detached. and the dreaded OOM error should your Selenium process run long enough. This guide assumes that you are using Docker and can run your docker file via docker run <some args if you wish> your_image.
Diagnosis
Run your app using docker run .... In a separate tab do docker ps, find your Container ID and do docker container stats CONTAINER_ID. The key is to observe the PIDS column. Now trigger the selenium process to run, ideally many times (to test this you may want to create a for loop simply to test. You will notice that the PIDs grows without bound. This is because (reference: Selenium leaves behind running processes?) will leave around zombie processes.
Solution
The solution is to cull the zombie processes. Specifically per Selenium leaves behind running processes? there is a flag --init and this will be the key to the solution. You need to run docker run --init ... (note the --init). Per https://docs.docker.com/engine/reference/run/ "You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container." . To be confident that the solution will work for you, run your image with docker run --init .... Re-trigger the calls to Selenium. This time the PIDS may grow, but not without bound (for me the number of PIDs never passed 200).
Solution - AWS
References are https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-taskdefinition-linuxparameters.html and https://www.ernestchiang.com/en/posts/2021/using-amazon-ecs-exec/
In the 'Task Definition' (search for 'ECS' -> click on Task Definitions) select your task definition. Then scroll to the bottom and click on Configure via JSON. Next, find linuxParameters and, if it is null, replace null with the value:
{
"initProcessEnabled": true
}
If you already have a JSON value for linuxParameters then just add "initProcessEnabled": true as a JSON parameter. Next, crate your task definition and deploy!
Solution - Other
I have not used Google Cloud or Microsoft offerings, so I do not know how to add the --init flag. If someone with such experience could tell me how to do that, I would be happy to update the guide.

How to run docker task with Amazon ECS - getting error `STOPPED (CannotStartContainerError: Error response from dae)`

My goal is to execute a benchmark deployed as a docker image. While doing so, I had too many issues, so I decided to first make something extremely trivial work.
So I decided to follow the guide in https://docs.aws.amazon.com/AmazonECS/latest/developerguide/create-task-definition.html
and use the "ping" example - it should just ping a domain couple of times, and stop.
The problem is, I always receive this message in the task status:
STOPPED (CannotStartContainerError: Error response from dae)
I tried it with various subnets and security groups, but the result is always the same - the task starts, and after a minute or two fails with the message above.
I even tried it on a fresh new AWS account, using these steps:
in https://us-east-2.console.aws.amazon.com/ecs/ created new cluster (networking only)
in task definitions, created a taskdef
with docker image alpine:latest, command ping -c 4 google.com
then I select the cluster, switch to "tasks" tab, and enter the run dialog
with one of pre-created subnets
After executing:
the task appears in the cluster's tasks list in PENDING state
it takes couple of minutes
eventually (using refresh button), it changes to the mentioned message - STOPPED (CannotStartContainerError: Error response from dae)
My guess is that the reason is:
either the task cannot download the image
or the instance cannot reach outside net
What can I be doing wrong? How to fix?
In my case too the log group was the problem. The one I had configured wasnt working. Hence I enabled the "Auto-configure CloudWatch Logs" option in the "Log Configuration" of the container settings.
Also if you open the stopped task, navigate to the container section, expand it, under the Details section you can see a detailed error message. Screenshot below
It could be a problem with the entry point as pointed in the comments of the question (in the task definition) Entrypoint: ["sh","-c"]
It could also be a bad reference, for example a wrong log group in the LogConfiguration or something similar.
I just create de group log in my cloudwatch console because it have not created, and now everything is going well.

GoCD Custom Command

I am trying to run a very simple custom command "echo helloworld" in GoCD as per the Getting Started Guide Part 2 however, the job does not finish with the Console saying Waiting for console logs and raw output saying Console log for this job is unavailable as it may have been purged by Go or deleted externally.
My job looks like the following which was taken from typing "echo" in the Lookup Command (which is different to the Getting Started example which I tried first with the same result)
Judging from the screenshot, the problem seems to be that no agent is assigned to the task. For an agent to be assigned, it must satisfy all of these conditions:
An agent must be running, and connected to the server
The agent must be enabled on the "Agents" page
If you use environments, the job and the agent need to be in the same environment
The agent needs to have all of the resources assigned that are configured in the job
Found the issue.
The Pipelines have to be in the same Environment to work.

Amazon EC2 custom AMI not running bootstrap (user-data)

I have encountered an issue when creating custom AMIs (images) on EC2 instances. If I start up a Windows default 2012 server instance with a custom bootstrap/user-data script such as;
<powershell>
PowerShell "(New-Object System.Net.WebClient).DownloadFile('http://download.microsoft.com/download/3/2/2/3224B87F-CFA0-4E70-BDA3-3DE650EFEBA5/vcredist_x64.exe','C:\vcredist_x64.exe')"
</powershell>
It will work as intended and go to the URL and download the file, and store it on the C: Drive.
But if I setup a Windows Server Instance, then create a image from it, and store it as a Custom AMI, then deploy it with the exact same custom user-data script it will not work. But if I go to the instance url (http://169.254.169.254/latest/user-data) it will show the script has imported successfully but has not been executed.
After checking the error logs I have noticed this on a regular occasion:
Failed to fetch instance metadata http://169.254.169.254/latest/user-data with exception The remote server returned an error: (404) Not Found.
Update 4/15/2017: For EC2Launch and Windows Server 2016 AMIs
Per AWS documentation for EC2Launch, Windows Server 2016 users can continue using the persist tags introduced in EC2Config 2.1.10:
For EC2Config version 2.1.10 and later, or for EC2Launch, you can use
true in the user data to enable the plug-in after
user data execution.
User data example:
<powershell>
insert script here
</powershell>
<persist>true</persist>
For subsequent boots:
Windows Server 2016 users must additionally enable configure and enable EC2Launch instead of EC2Config. EC2Config was deprecated on Windows Server 2016 AMIs in favor of EC2Launch.
Run the following powershell to schedule a Windows Task that will run the user data on next boot:
C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\InitializeInstance.ps1 –Schedule
By design, this task is disabled after it is run for the first time. However, using the persist tag causes Invoke-UserData to schedule a separate task via Register-FunctionScheduler, to persist your user data on subsequent boots. You can see this for yourself at C:\ProgramData\Amazon\EC2-Windows\Launch\Module\Scripts\Invoke-Userdata.ps1.
Further troubleshooting:
If you're having additional issues with your user data scripts, you can find the user data execution logs at C:\ProgramData\Amazon\EC2-Windows\Launch\Log\UserdataExecution.log for instances sourced from the WS 2016 base AMI.
Original Answer: For EC2Config and older versions of Windows Server
User data execution is automatically disabled after the initial boot. When you created your image, it is probable that execution had already been disabled. This is configurable manually within C:\Program Files\Amazon\Ec2ConfigService\Settings\Config.xml.
The documentation for "Configuring a Windows Instance Using the EC2Config Service" suggests several options:
Programmatically create a scheduled task to run at system start using schtasks.exe /Create, and point the scheduled task to the user data script (or another script) at C:\Program Files\Amazon\Ec2ConfigServer\Scripts\UserScript.ps1.
Programmatically enable the user data plug-in in Config.xml.
Example, from the documentation:
<powershell>
$EC2SettingsFile="C:\Program Files\Amazon\Ec2ConfigService\Settings\Config.xml"
$xml = [xml](get-content $EC2SettingsFile)
$xmlElement = $xml.get_DocumentElement()
$xmlElementToModify = $xmlElement.Plugins
foreach ($element in $xmlElementToModify.Plugin)
{
if ($element.name -eq "Ec2SetPassword")
{
$element.State="Enabled"
}
elseif ($element.name -eq "Ec2HandleUserData")
{
$element.State="Enabled"
}
}
$xml.Save($EC2SettingsFile)
</powershell>
Starting with EC2Config version 2.1.10, you can use <persist>true</persist> to enable the plug-in after user data execution.
Example, from the documentation:
<powershell>
insert script here
</powershell>
<persist>true</persist>
Another solution that worked for me is to run Sysprep with EC2Launch.
The issue is that AWS doesn't reestablish the route to the profile service (169.254.169.254) in your custom AMI. See response by SanjitPatel in this post. So when I tried to use my custom AMI to create spot requests, my new instances were failing to find user data.
Shutting down with Sysprep, essentially forces AWS re-do all setup work on the instance, as if it were run for the first time. So when you create your instance, shut it down with Sysprep and then create your custom AMI, AWS will setup the profile service route correctly for the new instances and execute your user data. This also avoids manually changing Windows Tasks and executing user data on subsequent boots, as persist tag does.
Here is a quick step-by-step:
Create an instance using one of the AWS Windows AMIs (Windows Server 2016 Nano Server doesn't support Sysprep) and passing your desired user data (this may be optional, but good to make sure AWS wires setup scripts correctly to handle user data).
Customize your instance as needed.
Shut down your instance with Sysprep. Just open EC2LaunchSettings application and click "Shutdown with Sysprep". Full instructions here.
Create your custom AMI from the instance you just shut down.
Use your custom AMI to create other instances, passing user data on instance creation. User data will be executed on instance launch. In my case, I used Spot Request screen, which had a User Data text box.
Hope this helps!
At the end of initial bootstrap (UserData) script, just append persist tag as shown below.
Works perfectly.
<powershell>
insert script here
</powershell>
<persist>true</persist>
For those people that got here from Google and are running a Server 2016 instance, it seems that this is no longer possible.
Server2016 doesn't have ec2config service and so you can't use the persist flag.
<persist>true</persist>
Described in Anthony Neace's post.
Server 2016 uses EC2Launch and I haven't yet seen how it's possible to run a script at every boot. You can run a script on the first boot, but subsequent boots will not run it.
I added below powershell script to run during the AMI bake process which helped me fix this issue. This was Windows server 2019.
$EC2LaunchInitInstance = "C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\InitializeInstance.ps1"
$EC2LaunchSysprep = "C:\ProgramData\Amazon\EC2-Windows\Launch\Scripts\SysprepInstance.ps1"
Invoke-Expression -Command "$EC2LaunchInitInstance -Schedule"
Invoke-Expression -Command "$EC2LaunchSysprep -NoShutdown"

Hello World PipeLine with ShelCommandlActivity

I'm trying to create a simple dataFlow pipeline with a single Activity of ShellCommandActivity type. I've attached the configuration of the activity and ec2 resource.
When I execute this the Ec2Resource sits in the WAITING_ON_DEPENDENCIES state then after sometime changes to TIMEDOUT. The ShellCommandActivity is always in the CANCELED state. I see the instance launch and very quicky changes to the terminated stated.
I've specified a s3 log file url, but that never gets updated.
Can anyone give me any pointers? Also is there any guidance out there on debugging this?
Thanks!!
You are currently forcing your instance to shut down after 1 minute which gives the TIMEOUT status if it can't execute in that time. Try increasing it to 50 minutes.
Also make sure you are using an AMI that runs Amazon Linux and that you are using full absolute paths in your scripts.
S3 log files are written as:
s3://bucket/folder/