I have a NiFi flow that saves data to S3. Whenever I change environments or machines, I have to place a directory with the credentials on each EC2 instance (and sometimes I need to change the credentials). Is there a way to connect to S3 automatically, without having to change the credentials file on every machine change?
Thanks
I'm not sure I understand the question. Do you want to set the process so that NiFi is agnostic to the credentials and just "saves data to S3", being told the credentials by the particular machine this flow is running on? Or embed the credentials in NiFi so that no matter which machine this flow is running on, it uses the same credentials? Both are possible.
Credentials provided by machine
You can populate the AWS credentials (Access Key and Secret Key) in three ways:
Provide a file path for the Credentials File processor property pointing to a file on disk which contains these credentials.
Populate the appropriate properties of the PutS3Object component using parameters.
Create an AWSCredentialsProviderControllerService instance with those values and reference it from this processor.
Whatever credentials are in the credentials file on disk, the parameter context, or the referenced controller service will be used. If the flow segment is deployed to a different NiFi instance (and the appropriate credentials file exists, the parameters are populated, or the controller service is populated, depending on the scenario), those new values will be used.
Credentials embedded in NiFi flow
Either populate the AWS credentials (Access Key and Secret Key) in the appropriate properties of the PutS3Object component, or create an AWSCredentialsProviderControllerService instance with those values and reference it from this processor. If you deploy this flow to another NiFi instance, it will continue to use these same credentials.
Related
I am trying to create a Django project that uses a Google Cloud Storage bucket and deploy it on Heroku or another cloud service.
To use Cloud Storage, I need to authenticate with a service account, so I downloaded a JSON file containing the service account credentials.
The application needs a path to that JSON file, which means I must save the file within the application.
I can use an environment variable to hide the path itself, but I still need the JSON file saved somewhere in the Django project, and on the remote server once deployed. I am not sure whether this is safe.
How can I access Google Cloud Storage safely during production?
The reason to use environment variables is to make sure sensitive credentials are not shared by accident, for example via GitHub. If someone has access to your server, then everything is compromised no matter what.
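One way to avoid keeping the key file inside the project is to store the JSON content itself in an environment variable (for example a Heroku config var) and parse it at startup. The following is a minimal sketch of that idea; the variable name `GCS_SERVICE_ACCOUNT_JSON` and the helper name are assumptions made for illustration, not part of Django or any Google library.

```python
import json
import os

# Hypothetical variable name; set it in the hosting platform's config, not in the repo.
ENV_VAR = "GCS_SERVICE_ACCOUNT_JSON"


def service_account_info_from_env(var_name: str = ENV_VAR) -> dict:
    """Parse the service account JSON stored in an environment variable."""
    raw = os.environ.get(var_name)
    if not raw:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    info = json.loads(raw)
    # Service account key files carry a "type" field with this value.
    if info.get("type") != "service_account":
        raise ValueError("Expected a service account key (type == 'service_account')")
    return info
```

The resulting dict can then be passed to `google.oauth2.service_account.Credentials.from_service_account_info()` from the google-auth package, so no key file has to exist on disk.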
Currently we put the Google Cloud credentials in a JSON file and point to it via the GOOGLE_APPLICATION_CREDENTIALS environment variable.
But we do not want to hardcode the private key in the JSON file for security reasons. We want to put the private key in Azure Key Vault (yes, we use Azure Key Vault). Is there a way to provide the credentials to GCP programmatically, so that I can read Azure Key Vault and supply the private key via code? I tried the GoogleCredentials and Google's DefaultCredentialsProvider classes, among others, but could not find a proper example.
Note: The google credentials type is Service Credential
Any help is much appreciated.
Store the service account JSON base64 encoded in Azure Key Vault as a string.
Read the string back from Key Vault and base64 decode back to a JSON string.
Load the JSON string using one of the SDK member methods with names similar to from_service_account_info() (Python SDK).
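The three steps above can be sketched in Python roughly as follows. This assumes the `azure-identity`, `azure-keyvault-secrets`, and `google-auth` packages are installed; the function names, `vault_url`, and `secret_name` are placeholders chosen for this example.

```python
import base64
import json


def encode_key_file(json_text: str) -> str:
    """Step 1: base64-encode the service account JSON for storage as a secret."""
    return base64.b64encode(json_text.encode("utf-8")).decode("ascii")


def decode_key_secret(secret_value: str) -> dict:
    """Step 2: base64-decode the secret value and parse it back into a dict."""
    return json.loads(base64.b64decode(secret_value).decode("utf-8"))


def load_gcp_credentials(vault_url: str, secret_name: str):
    """Step 3: fetch the secret from Key Vault and build credentials in memory.

    vault_url and secret_name are placeholders for your own vault and secret.
    """
    # Imported lazily so the helpers above work without the cloud SDKs installed.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient
    from google.oauth2 import service_account

    client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
    secret = client.get_secret(secret_name)
    info = decode_key_secret(secret.value)
    return service_account.Credentials.from_service_account_info(info)
```

With this approach the private key only ever exists in memory on the application side; nothing is written to a JSON file on disk.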
John's answer is the correct one if you want to load a secret from Azure vault.
However, if you need a service account credential inside a Google Cloud environment, you don't need a service account key file: you can use the service account that is automatically loaded into the metadata server.
In almost every Google Cloud service, you can customize the service account to use. In the worst case, you need to use the default service account for the context and grant it the correct permissions.
Using a service account key file isn't good practice on Google Cloud products. It's a long-lived credential that you have to rotate yourself and keep secret.
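To illustrate what "loaded into the metadata server" means, here is a rough sketch that requests a short-lived access token for the attached service account using only the standard library. Google's client libraries normally do this for you, so this is shown only to make the mechanism visible; the function names are made up for this example, and the fetch only succeeds when run on Google Cloud.

```python
import json
import urllib.request

# Metadata server endpoint for the attached (default) service account's token.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def build_token_request() -> urllib.request.Request:
    """Build the metadata request; the Metadata-Flavor header is required."""
    return urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})


def fetch_access_token(timeout: float = 2.0) -> str:
    """Return a short-lived access token. Only reachable from inside Google Cloud."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        return json.load(resp)["access_token"]
```

Because the token is issued on demand and expires quickly, there is no long-lived key to rotate or protect.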
I am writing an application where I have to upload media files to GCS. So, I created a storage bucket and also created a service account which is being used by the application to put and get images from the bucket. To access this service account from the application I had to generate a private key as a JSON file.
I have tested my code and it is working fine. Now I want to push this code to my GitHub repository, but I don't want this service account key to be in GitHub.
How do I keep this service account key secret while still letting all my colleagues use it?
I am going to put my application on GCP Container Instance and I want it to work there as well.
As I understand it, if your application runs inside GCP and uses a custom service account, you might not need any private keys (as JSON files) at all.
The custom service account used by your application should be granted the relevant IAM roles/permissions on the corresponding GCS bucket. That may be all you need to do.
You can assign those IAM roles/permissions either manually (through UI console), or using CLI commands, or as part of your deployment CI/CD pipeline.
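As one possible way to script the role assignment (for instance from a CI/CD step), the sketch below uses the `google-cloud-storage` Python package. The helper names are invented for this example, and the bucket name, role, and member you pass in are your own values.

```python
def add_binding(bindings: list, role: str, member: str) -> list:
    """Add member to the unconditional binding for role, creating it if needed."""
    for binding in bindings:
        if binding["role"] == role and not binding.get("condition"):
            # Members may arrive as a list or a set; normalise to a set.
            binding["members"] = set(binding["members"]) | {member}
            return bindings
    bindings.append({"role": role, "members": {member}})
    return bindings


def grant_bucket_role(bucket_name: str, role: str, member: str) -> None:
    """Grant role on the bucket to member, e.g. 'serviceAccount:<email>'."""
    # Lazy import: requires the google-cloud-storage package.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    policy = bucket.get_iam_policy(requested_policy_version=3)
    add_binding(policy.bindings, role, member)
    bucket.set_iam_policy(policy)
```

A call might look like `grant_bucket_role("my-bucket", "roles/storage.objectAdmin", "serviceAccount:my-app@my-project.iam.gserviceaccount.com")`, where the bucket and service account names are placeholders.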
We wanted to copy a file from one project's storage to another.
I have credentials for project A and project B in separate service accounts.
The only way we knew how to copy files was to add service key credential permissions to the bucket's access control list.
Is there some other way to run commands across accounts using multiple service keys?
You can use Cloud Storage Transfer Service to accomplish this.
The docs should guide you through setting up the permissions for the buckets in both projects and doing the transfers programmatically or in the console.
You need to get the service account email associated with the Storage Transfer Service by entering your project ID in the Try this API page. You then need to give this service account email the required roles to access the data at the source. Storage Object Viewer should be sufficient.
At the data destination, you need to get the service account email for the second project ID, then give it the Storage Legacy Bucket Writer role.
You can then do the transfer using the snippets in the docs.
I want to deploy a node application on a google cloud compute engine micro instance from a source control repo.
As part of this deployment I want to use KMS to store database credentials rather than having them in my source control. To get the credentials from KMS, I first need to authenticate on the instance with gcloud.
Is it safe to just install the gcloud CLI as part of a startup script and let the default service account handle the authentication, then use this to pull in the decrypted details and save them to a file?
The docs walk through development examples, but I've not found anything about how this should work in production, especially as I obviously don't want to store the gcloud credentials in source control either.
Yes, this is exactly what we recommend: use the default service account to authenticate to KMS and decrypt a file with the credentials in it. You can store the resulting data in a file, but I usually either pipe it directly to the service that needs it or put it in tmpfs so it's only stored in RAM.
You can check the encrypted credentials file into your source repository, store it in Google Cloud Storage, or elsewhere. (You create the encrypted file by using a different account, such as your personal account or another service account, which has wrap but not unwrap access on the KMS key, to encrypt the credentials file.)
If you use this method, you have a clean line of control:
Your administrative user authentication gates the ability to run code as the trusted service account.
Only that service account can decrypt the credentials.
There is no need to store a secret in cleartext anywhere.
Thank you for using Google Cloud KMS!
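The decrypt-on-startup approach described in this answer could look roughly like the sketch below. It is written in Python rather than Node purely for illustration and assumes the `google-cloud-kms` package. The function names, the `/dev/shm` location (tmpfs on most Linux systems), and the `db-credentials` filename are assumptions of this example.

```python
import os


def decrypt_with_kms(key_name: str, ciphertext: bytes) -> bytes:
    """Decrypt ciphertext with Cloud KMS using the instance's service account.

    key_name has the form
    projects/<project>/locations/<location>/keyRings/<ring>/cryptoKeys/<key>.
    """
    # Lazy import: requires the google-cloud-kms package. No key file is passed;
    # the client picks up the default service account on the instance.
    from google.cloud import kms

    client = kms.KeyManagementServiceClient()
    response = client.decrypt(request={"name": key_name, "ciphertext": ciphertext})
    return response.plaintext


def write_secret(plaintext: bytes, directory: str = "/dev/shm",
                 filename: str = "db-credentials") -> str:
    """Write the secret with owner-only permissions and return its path."""
    path = os.path.join(directory, filename)
    # Create the file with mode 0600 so other users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(plaintext)
    return path
```

Keeping the decrypted file on tmpfs means it lives only in RAM and disappears on reboot; piping the plaintext straight to the consuming service avoids the file entirely.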