Ansible AWS: using a role_arn with ansible-playbook not giving permissions

I have been stuck on this issue for days, and I can't find anything describing the exact problem I've been having. Currently, I have my credentials and config set up like so:
~/.aws/credentials
[default]
aws_access_key_id = ###########
aws_secret_access_key = ######################
[dev]
role_arn=arn:aws:iam::############:role/###AccessRole
source_profile=default
~/.aws/config
[default]
region = us-east-1
output = json
[profile dev]
role_arn = arn:aws:iam::############:role/###AccessRole
source_profile = default
When I run AWS CLI commands, everything works fine. If I use AWS credentials that have admin permissions it also works, but I can't do that in our system.
The default profile deliberately can't access anything on its own; it is only used to assume the dev role. However, I can't get Ansible to recognise dev. The same configuration works across Terraform, the AWS CLI, and Git. Below are my command and the resulting error from ansible-playbook; I have redacted some info and tidied the output. As you can see, I'm using ec2.ini and ec2.py.
Has anyone come across this? Is it to do with using role_arn with Ansible? I have tried a plethora of things to get this working; what follows is the current state of things.
Thanks in advance!
AWS_PROFILE=dev ansible-playbook -i ./inventory/ec2.py playbook.yml --private-key ###.pem
----
[WARNING]: * Failed to parse {home}/Ansible/Bastion/inventory/ec2.py with script
plugin: Inventory script ({home}/Ansible/Bastion/inventory/ec2.py) had an
execution error: Traceback (most recent call last): File
"{home}/Ansible/Bastion/inventory/ec2.py", line 1712, in <module>
Ec2Inventory() File "{home}Ansible/Bastion/inventory/ec2.py", line 285, in
__init__ self.do_api_calls_update_cache() File
"{home}/Ansible/Bastion/inventory/ec2.py", line 552, in do_api_calls_update_cache
self.get_instances_by_region(region) File
"{home}/Ansible/Bastion/inventory/ec2.py", line 608, in get_instances_by_region
conn = self.connect(region) File "{home}/Ansible/Bastion/inventory/ec2.py", line
570, in connect conn = self.connect_to_aws(ec2, region) File
"{home}/Ansible/Bastion/inventory/ec2.py", line 591, in connect_to_aws
sts_conn = sts.connect_to_region(region, **connect_args) File "{home}.local/lib/python2.7/site-
packages/boto/sts/__init__.py", line 51, in connect_to_region **kw_params) File
"{home}/.local/lib/python2.7/site-packages/boto/regioninfo.py", line 220, in connect return
region.connect(**kw_params) File "{home}/.local/lib/python2.7/site-packages/boto/regioninfo.py",
line 290, in connect return self.connection_cls(region=self, **kw_params) File
"{home}/.local/lib/python2.7/site-packages/boto/sts/connection.py", line 107, in __init__
provider=provider) File "{home}/.local/lib/python2.7/site-packages/boto/connection.py", line
1100, in __init__ provider=provider) File "{home}/.local/lib/python2.7/site-
packages/boto/connection.py", line 555, in __init__ profile_name) File
"{home}/.local/lib/python2.7/site-packages/boto/provider.py", line 201, in __init__
self.get_credentials(access_key, secret_key, security_token, profile_name) File
"{home}/.local/lib/python2.7/site-packages/boto/provider.py", line 297, in get_credentials
profile_name) boto.provider.ProfileNotFoundError: Profile "dev" not found!
[WARNING]: * Failed to parse {home}/Ansible/Bastion/inventory/ec2.py with ini
plugin: {home}/Ansible/Bastion/inventory/ec2.py:3: Error parsing host definition
''''': No closing quotation
[WARNING]: Unable to parse {home}/Ansible/Bastion/inventory/ec2.py as an inventory
source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does
not match 'all'
PLAY [Create kp and access instance] *********************************************************
TASK [Setup variables] *************************************************************************************
ok: [localhost]
TASK [Backup previous key] *************************************************************************
changed: [localhost]
TASK [generate SSH key]
*******************************************************************
changed: [localhost]
TASK [Start and register instance] *****************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Profile given for AWS was not found. Please fix and retry."}
PLAY RECAP *************************************************************************************************
localhost : ok=3 changed=2 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
EDIT: outside Ansible the dev profile resolves correctly; aws configure list and aws sts get-caller-identity for the dev profile show:
Name                     Value                        Type           Location
----                     -----                        ----           --------
profile                  dev                          manual         --profile
access_key               ****************####         assume-role
secret_key               ****************####         assume-role
region                   <not set>                    None           None
{
"UserId": "<ACCESS_KEY?>:botocore-session-##########",
"Account": "############",
"Arn": "arn:aws:sts::############:assumed-role/###AccessRole/botocore-session-##########"
}

ec2.py is too old: it only uses boto (not boto3) and cannot work with assumed roles. It is also deprecated; the correct way to do AWS dynamic inventory now is the aws_ec2 plugin from the amazon.aws collection. It uses boto3, supports roles, and is more flexible overall. If needed, there is a compatibility config here that reproduces the old ec2.py groups, but for the long run it is recommended to use the aws_ec2 groups and variables directly.
Check this link on GitHub for the full story.
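For reference, a minimal aws_ec2 inventory source for this setup could look like the sketch below. The file name, region and keyed_groups are illustrative assumptions; the key point is that the plugin goes through boto3, which understands role_arn/source_profile profiles:

# inventory/aws_ec2.yml (hypothetical name; the plugin only accepts files ending in aws_ec2.yml or aws_ec2.yaml)
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
# boto3 resolves role_arn/source_profile, so the existing "dev" profile can be reused directly
profile: dev
# alternatively, assume the role explicitly instead of naming a profile:
# iam_role_arn: arn:aws:iam::############:role/###AccessRole
keyed_groups:
  # builds groups from EC2 tags, roughly like the old ec2.py tag_* groups
  - prefix: tag
    key: tags

It is then passed to ansible-playbook with -i inventory/aws_ec2.yml in place of ec2.py.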

Related

Creating Connection for RedshiftDataOperator

So I went to the Airflow documentation for AWS Redshift; there are two operators that can execute a SQL query: RedshiftSQLOperator and RedshiftDataOperator. I have already implemented my job using RedshiftSQLOperator, but I want to do it using RedshiftDataOperator instead, because I don't want to use a Postgres connection (as RedshiftSQLOperator does) but the AWS API.
RedshiftDataOperator Documentation
I read that documentation; there is an aws_conn_id parameter. But when I try to use the same connection ID, I get an error.
[2023-01-11, 04:55:56 UTC] {base.py:68} INFO - Using connection ID 'redshift_default' for task execution.
[2023-01-11, 04:55:56 UTC] {base_aws.py:206} INFO - Credentials retrieved from login
[2023-01-11, 04:55:56 UTC] {taskinstance.py:1889} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/amazon/aws/operators/redshift_data.py", line 146, in execute
self.statement_id = self.execute_query()
File "/home/airflow/.local/lib/python3.7/site-packages/airflow/providers/amazon/aws/operators/redshift_data.py", line 124, in execute_query
resp = self.hook.conn.execute_statement(**filter_values)
File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 415, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/airflow/.local/lib/python3.7/site-packages/botocore/client.py", line 745, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (UnrecognizedClientException) when calling the ExecuteStatement operation: The security token included in the request is invalid.
The task definition:
redshift_data_task = RedshiftDataOperator(
    task_id='redshift_data_task',
    database='rds',
    region='ap-southeast-1',
    aws_conn_id='redshift_default',
    sql="""
        call some_procedure();
    """
)
What should I fill in for the Airflow connection? In the documentation there is no example of the values I should provide. Thanks
Airflow RedshiftDataOperator Connection Required Value
Have you tried using the Amazon Redshift connection? There is both an option for authenticating using your Redshift credentials:
Connection ID: redshift_default
Connection Type: Amazon Redshift
Host: <your-redshift-endpoint> (for example, redshift-cluster-1.123456789.us-west-1.redshift.amazonaws.com)
Schema: <your-redshift-database> (for example, dev, test, prod, etc.)
Login: <your-redshift-username> (for example, awsuser)
Password: <your-redshift-password>
Port: <your-redshift-port> (for example, 5439)
(source)
and an option for using an IAM role (there is an example in the first link).
Disclaimer: I work at Astronomer :)
EDIT: Tested the following with Airflow 2.5.0 and Amazon provider 6.2.0:
Added the IP of my Airflow instance to the VPC security group with "All traffic" access.
An Airflow Connection with the connection ID aws_default, connection type "Amazon Web Services", and extra: { "aws_access_key_id": "<your-access-key-id>", "aws_secret_access_key": "<your-secret-access-key>", "region_name": "<your-region-name>" }. All other fields blank. I used a root key for my toy AWS account. If you use other credentials, you need to make sure the IAM role has access and the right permissions to the Redshift cluster (there is a list in the link above).
Operator code:
red = RedshiftDataOperator(
    task_id="red",
    database="dev",
    sql="SELECT * FROM dev.public.users LIMIT 5;",
    cluster_identifier="redshift-cluster-1",
    db_user="awsuser",
    aws_conn_id="aws_default"
)

Ansible inventory file in google cloud won't parse with any playbooks

My inventory file won't work with any of my playbooks. I've tested it on multiple playbooks, including ones that have worked for me in the past, and I can't figure out what's misconfigured or where the issue might be. The inventory file is fairly simple, I have all the prerequisites for running Ansible, Ansible is up to date, and I'm positive the credentials file is fine since it works with my Terraform configuration. Any input is appreciated!
Inventory file:
plugin: gcp_compute
projects:
  - projectName
auth_kind: serviceaccount
service_account_file: /home/.../.config/gcloud/application_default_credentials.json
keyed_groups:
  - prefix: gcp
    key: labels
  - prefix: tags
    key: tags
Error that comes up when I try to use the inventory:
config file = None
configured module search path = ['/home/.../.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /home/.../.local/lib/python3.10/site-packages/ansible
ansible collection location = /home/.../.ansible/collections:/usr/share/ansible/collections
executable location = /home/.../.local/bin/ansible-playbook
python version = 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] (/usr/bin/python3)
jinja version = 3.0.3
libyaml = True
No config file found; using defaults
host_list declined parsing /home/.../inventory.gcp.yaml as it did not pass its verify_file() method
redirecting (type: inventory) ansible.builtin.gcp_compute to google.cloud.gcp_compute
Using inventory plugin 'ansible_collections.google.cloud.plugins.inventory.gcp_compute' to process inventory source '/home/.../inventory.gcp.yaml'
toml declined parsing /home/.../inventory.gcp.yaml as it did not pass its verify_file() method
[WARNING]: * Failed to parse /home/.../inventory.gcp.yaml with script plugin: problem running
/home/.../inventory.gcp.yaml --list ([Errno 8] Exec format error:
'/home/.../inventory.gcp.yaml')
File "/home/.../.local/lib/python3.10/site-packages/ansible/inventory/manager.py", line 293, in parse_source
plugin.parse(self._inventory, self._loader, source, cache=cache)
...
plugin.parse(self._inventory, self._loader, source, cache=cache)
File "/home/.../.local/lib/python3.10/site-packages/ansible/plugins/inventory/yaml.py", line 114, in parse
raise AnsibleParserError('Plugin configuration YAML file, not YAML inventory')
[WARNING]: * Failed to parse /home/.../inventory.gcp.yaml with ini plugin: Invalid host pattern
'plugin:' supplied, ending in ':' is not allowed, this character is reserved to provide a port.
File "/home/.../.local/lib/python3.10/site-packages/ansible/inventory/manager.py", line 293, in parse_source
plugin.parse(self._inventory, self._loader, source, cache=cache)
File "/home/.../.local/lib/python3.10/site-packages/ansible/plugins/inventory/ini.py", line 137, in parse
raise AnsibleParserError(e)
[WARNING]: Unable to parse /home/.../inventory.gcp.yaml as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

GCP| Composer Dataproc submit job| Auth credential not found

I am running a GCP Composer cluster on GKE. I am defining a DAG to submit a job to a Dataproc cluster. I have read the GCP docs, which say that Composer's service account will be used by the workers to send the Dataproc API requests.
But DataprocSubmitJobOperator reports an error while getting the auth credentials.
Stack trace below. Composer env info attached.
Any suggestions for fixing this issue are appreciated.
[2022-08-23, 16:03:25 UTC] {taskinstance.py:1448} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=harshit.bapna#dexterity.ai
AIRFLOW_CTX_DAG_ID=dataproc_spark_operators
AIRFLOW_CTX_TASK_ID=pyspark_task
AIRFLOW_CTX_EXECUTION_DATE=2022-08-23T16:03:16.986859+00:00
AIRFLOW_CTX_DAG_RUN_ID=manual__2022-08-23T16:03:16.986859+00:00
[2022-08-23, 16:03:25 UTC] {dataproc.py:1847} INFO - Submitting job
[2022-08-23, 16:03:25 UTC] {credentials_provider.py:312} INFO - Getting connection using `google.auth.default()` since no key file is defined for hook.
[2022-08-23, 16:03:25 UTC] {taskinstance.py:1776} ERROR - Task failed with exception
Traceback (most recent call last):
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/operators/dataproc.py", line 1849, in execute
job_object = self.hook.submit_job(
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 439, in inner_wrapper
return func(self, *args, **kwargs)
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataproc.py", line 869, in submit_job
client = self.get_job_client(region=region)
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataproc.py", line 258, in get_job_client
credentials=self._get_credentials(), client_info=CLIENT_INFO, client_options=client_options
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 261, in _get_credentials
credentials, _ = self._get_credentials_and_project_id()
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 240, in _get_credentials_and_project_id
credentials, project_id = get_credentials_and_project_id(
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/utils/credentials_provider.py", line 321, in get_credentials_and_project_id
return _CredentialProvider(*args, **kwargs).get_credentials_and_project()
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/utils/credentials_provider.py", line 229, in get_credentials_and_project
credentials, project_id = self._get_credentials_using_adc()
File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/utils/credentials_provider.py", line 307, in _get_credentials_using_adc
credentials, project_id = google.auth.default(scopes=self.scopes)
File "/opt/python3.8/lib/python3.8/site-packages/google/auth/_default.py", line 459, in default
credentials, project_id = checker()
File "/opt/python3.8/lib/python3.8/site-packages/google/auth/_default.py", line 221, in _get_explicit_environ_credentials
credentials, project_id = load_credentials_from_file(
File "/opt/python3.8/lib/python3.8/site-packages/google/auth/_default.py", line 107, in load_credentials_from_file
raise exceptions.DefaultCredentialsError(
google.auth.exceptions.DefaultCredentialsError: File celery was not found.
[2022-08-23, 16:03:25 UTC] {taskinstance.py:1279} INFO - Marking task as UP_FOR_RETRY. dag_id=dataproc_spark_operators, task_id=pyspark_task, execution_date=20220823T160316, start_date=20220823T160324, end_date=20220823T160325
[2022-08-23, 16:03:25 UTC] {standard_task_runner.py:93} ERROR - Failed to execute job 32837 for task pyspark_task (File celery was not found.; 356144)
[2022-08-23, 16:03:26 UTC] {local_task_job.py:154} INFO - Task exited with return code 1
[2022-08-23, 16:03:26 UTC] {local_task_job.py:264} INFO - 0 downstream tasks scheduled from follow-on schedule check
GCP Composer Env
Based on the error File celery was not found, I think Application Default Credentials (ADC) is trying to read a file named celery and not finding it. Check whether you have set the environment variable GOOGLE_APPLICATION_CREDENTIALS, because if it is set, ADC will read the file it points to:
If the environment variable GOOGLE_APPLICATION_CREDENTIALS is set, ADC uses the service account key or configuration file that the variable points to.
If the environment variable GOOGLE_APPLICATION_CREDENTIALS isn't set, ADC uses the service account that is attached to the resource that is running your code.
This service account might be a default service account provided by Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run, or Cloud Functions. It might also be a user-managed service account that you created.
GCP doc

Salt masters behind ELB have flaky connection to minions

I am running the following setup at AWS:
An Elastic Load Balancer in front of two EC2 machines (Amazon Linux), each running the salt-master in a Docker container
Two EC2 instances with salt-minions installed
The 'master' value in the minion config is set to the DNS name of the load balancer (SaltMaster-env-vpc-test.szfegmankg.us-east-1.elasticbeanstalk.com)
The ELB accepts all traffic from the minions
The Salt-masters accept all traffic from the ELB as well as from the minions
The Salt-masters PKI Folder is shared between the two masters
The Salt-masters have the same private+public keys
The Salt-masters run on 2017.7.1
The Salt-minions run on 2016.11.5 (I tried it with 2017.7.1, but got the same results)
The Salt-minions accept all traffic from the ELB as well as from the masters
The master config looks as follows:
open_mode: True
worker_threads: 20
auto_accept: True
log_level: error
log_level_logfile: debug
extension_modules: srv/salt/ext
rest_cherrypy:
  port: 8000
  disable_ssl: True
  debug: True
external_auth:
  pam:
    saltdev:
      - .*
      - '@runner'
# Setting the job_cache to redis.
# The redis config settings are generated at the start of the docker container and
# will be written into /etc/salt/master.d/redis.conf
master_job_cache: redis
cache: redis
pki_dir: /etc/salt/pki/master/efs
The minion config looks as follows:
id: WIN-AB3GO7BJ72I
log_file: C:\salt.log
multiprocessing: False
log_level_logfile: debug
pki_dir: /conf/pki/minion
master: SaltMaster-env-vpc-test.szfegmankg.us-east-1.elasticbeanstalk.com
master_type: str
master_alive_interval: 30
open_mode: True
root_dir: c:\salt
ipc_mode: tcp
recon_default: 1000
recon_max: 199000
recon_randomize: True
In the master log files, I can see on both masters:
2017-09-05 10:06:18,118 [salt.utils.verify][DEBUG ][35] This salt-master instance has accepted 2 minion keys.
A salt-key -L on both masters yield the same result:
Accepted Keys:
WIN-AB3GO7BJ72I
WIN-EDMP9VB716B
Denied Keys:
Unaccepted Keys:
Rejected Keys:
So it looks like all is fine and everything should work. However, a test.ping is extremely flaky. Sometimes it works, but most of the time it doesn't.
Most of the time neither master gets any return from the minion and on the minion side I can see in the log that the minion never receives the message to execute 'test.ping' from the master.
Example 1:
test.ping from Master1:
root@d7383ff8f8bf:/# salt 'WIN-EDMP9VB716B' test.ping
[ERROR ] Exception raised when processing __virtual__ function for salt.loaded.int.cache.consul. Module will not be loaded: 'module' object has no attribute 'Consul'
[ERROR ] An un-handled exception was caught by salt's global exception handler:
KeyError: 'redis.ls'
Traceback (most recent call last):
File "/usr/bin/salt", line 10, in <module>
salt_main()
File "/usr/lib/python2.7/dist-packages/salt/scripts.py", line 476, in salt_main
client.run()
File "/usr/lib/python2.7/dist-packages/salt/cli/salt.py", line 173, in run
for full_ret in cmd_func(**kwargs):
File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 805, in cmd_cli
**kwargs):
File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 1597, in get_cli_event_returns
connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
File "/usr/lib/python2.7/dist-packages/salt/utils/minions.py", line 577, in connected_ids
search = self.cache.ls('minions')
File "/usr/lib/python2.7/dist-packages/salt/cache/__init__.py", line 244, in ls
return self.modules[fun](bank, **self._kwargs)
File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1113, in __getitem__
func = super(LazyLoader, self).__getitem__(item)
File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 101, in __getitem__
raise KeyError(key)
KeyError: 'redis.ls'
Traceback (most recent call last):
File "/usr/bin/salt", line 10, in <module>
salt_main()
File "/usr/lib/python2.7/dist-packages/salt/scripts.py", line 476, in salt_main
client.run()
File "/usr/lib/python2.7/dist-packages/salt/cli/salt.py", line 173, in run
for full_ret in cmd_func(**kwargs):
File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 805, in cmd_cli
**kwargs):
File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 1597, in get_cli_event_returns
connected_minions = salt.utils.minions.CkMinions(self.opts).connected_ids()
File "/usr/lib/python2.7/dist-packages/salt/utils/minions.py", line 577, in connected_ids
search = self.cache.ls('minions')
File "/usr/lib/python2.7/dist-packages/salt/cache/__init__.py", line 244, in ls
return self.modules[fun](bank, **self._kwargs)
File "/usr/lib/python2.7/dist-packages/salt/loader.py", line 1113, in __getitem__
func = super(LazyLoader, self).__getitem__(item)
File "/usr/lib/python2.7/dist-packages/salt/utils/lazy.py", line 101, in __getitem__
raise KeyError(key)
KeyError: 'redis.ls'
I am aware that the redis error will be fixed soon (https://github.com/saltstack/salt/issues/43295).
Example 2:
test.ping from Master1, ~ 1 Minute after Example 1:
root@d7383ff8f8bf:/# salt 'WIN-EDMP9VB716B' test.ping
WIN-EDMP9VB716B:
True
Also during my tests, a test.ping from Master2 never succeeded.
I would like to know whether there is some flaw in my setup that I am not seeing, or whether Salt only works with something like HAProxy as the load balancer.
Or maybe Salt doesn't work behind an ELB at all?
See https://github.com/saltstack/salt/issues/43368 for more answers.
TL;DR
Because there is no session stickiness for TCP connections, it is currently not possible to work with a salt-master that is behind an ELB if you use the ELB's IP/name as the entry point.
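For illustration only (this is a workaround sketch, not something from the linked issue): one alternative entry point is to list the masters directly in the minion config and rely on Salt's multi-master support instead of going through the ELB DNS name. The addresses below are placeholders:

# hypothetical addition to the existing minion config; 10.0.1.10 / 10.0.2.10 stand in for the two masters
master:
  - 10.0.1.10
  - 10.0.2.10
# with the default master_type of 'str', the minion keeps a connection to every listed
# master (multi-master); 'failover' would make it connect to one at a time instead
master_alive_interval: 30

Since the two masters in this setup already share their PKI directory and keys, the shared-key requirement for multi-master is already met.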

S3 module for downloading files is not working in ansible

This is the Ansible code written for downloading files from the S3 bucket "artefact-test":
- name: Download customization artifacts from S3
  s3:
    bucket: "artefact-test"
    object: "cust/gitbranching.txt"
    dest: "/home/ubuntu/"
    mode: get
    region: "{{ s3_region }}"
    profile: "{{ s3_profile }}"
I have set the boto profile and the AWS profile too. I get different errors which I don't think are valid, like:
failed: [127.0.0.1] => {"failed": true, "parsed": false}
Traceback (most recent call last):
File "/home/dmittal/.ansible/tmp/ansible-tmp-1462436903.77-107775915578620/s3", line 2320, in <module>
main()
File "/home/dmittal/.ansible/tmp/ansible-tmp-1462436903.77-107775915578620/s3", line 304, in main
ec2_url, aws_access_key, aws_secret_key, region = get_ec2_creds(module)
File "/home/dmittal/.ansible/tmp/ansible-tmp-1462436903.77-107775915578620/s3", line 2273, in get_ec2_creds
region, ec2_url, boto_params = get_aws_connection_info(module)
File "/home/dmittal/.ansible/tmp/ansible-tmp-1462436903.77-107775915578620/s3", line 2260, in get_aws_connection_info
if not boto_supports_profile_name():
File "/home/dmittal/.ansible/tmp/ansible-tmp-1462436903.77-107775915578620/s3", line 2191, in boto_supports_profile_name
return hasattr(boto.ec2.EC2Connection, 'profile_name')
AttributeError: 'module' object has no attribute 'ec2'
failed: [127.0.0.1] => {"failed": true}
msg: Target bucket cannot be found
failed: [127.0.0.1] => {"failed": true}
msg: Target key cannot be found
Whereas the bucket and key specified both exist on AWS. The same thing works if I use AWS CLI commands to do the same.
There seems to have been a bug related to this. You might want to try adding the following import statement to the file ansible/module_utils/ec2.py:
import boto.ec2
Something like this - https://github.com/ansible/ansible/blob/9cf43ce17f20171b5740a6e0773f666cb47e2d5e/lib/ansible/module_utils/ec2.py#L31