Add files to S3 Bucket using Shell Script - amazon-web-services

Goal: to push files in gri/ to S3 bucket using SendToS3.sh shell script.
I am following this Tutorial.
SendToS3.sh is in cwd. It needs to fetch all files, that are not in sub-folders, in cwd's gri/.
Terminal:
me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/data$ ./SendToS3.sh
./SendToS3.sh: line 17: logInfo: command not found
curl: Can't open '/gri/*'!
curl: try 'curl --help' or 'curl --manual' for more information
curl: (26) Failed to open/read local data from file/application
./SendToS3.sh: line 27: logInfo: command not found
SendToS3.sh:
bucket=simulation
files_location=/gri/ # !
now_time=$(date +"%H%M%S")
contentType="application/x-compressed-tar"
dateValue=`date -R`
# your key goes here..
s3Key= # CENSORED
# your secrets goes here..
s3Secret= # CENSORED
function pushToS3()
{
files_path=$1
for file in $files_path*
do
fname=$(basename $file)
logInfo "Start sending $fname to S3"
resource="/${bucket}/${now_date}/${fname}_${now_time}"
stringToSign="PUT\n\n${contentType}\n${dateValue}\n${resource}"
signature=`echo -en ${stringToSign} | openssl sha1 -hmac ${s3Secret} -binary | base64`
curl -X PUT -T "${file}" \
-H "Host: ${bucket}.s3.amazonaws.com" \
-H "Date: ${dateValue}" \
-H "Content-Type: ${contentType}" \
-H "Authorization: AWS ${s3Key}:${signature}" \
https://${bucket}.s3.amazonaws.com/${now_date}/${fname}_${now_time}
logInfo "$fname has been sent to S3 successfully."
done
}
pushToS3 $files_location
Please let me know if there is anything else I can add to post.

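logInfo is not defined anywhere your script can find it (it is neither a shell function nor an installed command), so either define it or switch those calls to echo. The curl error comes from the path: files_location=/gri/ points at the filesystem root rather than the gri/ folder next to the script, so the glob /gri/* matches nothing and curl receives the literal string '/gri/*'. Use a relative path such as ./gri/ instead. If the folder exists but still cannot be read, it could be a file permission error; in that case try running:
chmod u+rwx gri
Alternatively, you could use the AWS CLI instead, which is much easier to use imo.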
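A minimal sketch of both fixes, assuming bash. logInfo is defined as a small shell function (the timestamped format is my assumption, since the tutorial's version isn't shown), and listTopLevelFiles is a hypothetical helper that loops over a folder while skipping sub-folders and an unmatched glob:

```shell
#!/usr/bin/env bash
# logInfo is not a standard command, so define it in the script itself.
# The timestamped format is an assumption; plain echo would do as well.
logInfo() {
    printf '%s [INFO] %s\n' "$(date +%H:%M:%S)" "$*"
}

# Hypothetical helper: print the regular files directly inside a folder.
# Sub-folders are skipped, and so is the literal pattern that the shell
# leaves behind when the glob matches nothing.
listTopLevelFiles() {
    local dir=$1 file
    for file in "$dir"/*; do
        [ -f "$file" ] || continue
        printf '%s\n' "$file"
    done
}
```

Called as listTopLevelFiles ./gri from the folder that contains SendToS3.sh, this prints nothing at all when gri/ is missing or empty, instead of handing '/gri/*' to curl.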

The error comes from the following line. The glob /gri/* matched nothing, which means the folder /gri/ does not exist, is empty, or the user launching the script does not have access to it.
curl: Can't open '/gri/*'!
Moreover, logInfo is not available to your script SendToS3.sh: it is neither installed as an executable on your PATH nor defined as a function in the script.
Define it in the script, or verify the installation and add the binary to the PATH env variable.
./SendToS3.sh: line 17: logInfo: command not found
Bonus: instead of using curl, you can use aws-cli, which is built to interact with AWS services. Please find the documentation for s3 here: https://docs.aws.amazon.com/cli/latest/reference/s3/
For example, you can copy a file to a bucket with this command:
aws s3 cp <path_to_file> s3://<bucket_name>/<path>/
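As a rough sketch of how the loop from the question could look with aws s3 cp, assuming the AWS CLI is installed and configured with credentials. pushFilesToS3 is a hypothetical name, and the date format for now_date is my assumption, because the original script uses now_date without ever setting it:

```shell
#!/usr/bin/env bash
# Usage: pushFilesToS3 <bucket> <file>...
# Uploads each regular file; directories (and unmatched globs) are skipped.
pushFilesToS3() {
    local bucket=$1 file fname now_date now_time
    shift
    now_date=$(date +%Y%m%d)   # assumed format; not defined in the original
    now_time=$(date +%H%M%S)   # same format as the original script
    for file in "$@"; do
        [ -f "$file" ] || continue
        fname=$(basename "$file")
        # Same key layout as the curl version: <date>/<name>_<time>
        aws s3 cp "$file" "s3://${bucket}/${now_date}/${fname}_${now_time}" || return 1
    done
}
```

From the folder containing the script, pushFilesToS3 simulation gri/* would then upload the files directly inside gri/ and leave the sub-folders alone. No request signing is needed, since the CLI handles authentication itself.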

Related

gcurl: command not found in Google Cloud Shell

In Google Cloud Shell, I would like to see a list of enabled services.
When I run the following command:
gcurl "https://serviceusage.googleapis.com/v1/projects/myProjectId/services?filter=state:ENABLED"
Then I got this error.
-bash: gcurl: command not found
How to install gcurl?
gcurl is an alias for regular curl plus some headers:
alias gcurl='curl -H "$(oauth2l header --json ~/credentials.json cloud-platform userinfo.email)" -H "Content-Type: application/json"'
Please see here for more details.
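Note that bash does not expand aliases in non-interactive scripts by default, so if you want gcurl inside a script, a function may be more convenient. A small sketch that sends the same two headers as the alias above, assuming oauth2l is installed and ~/credentials.json exists:

```shell
#!/usr/bin/env bash
# Function equivalent of the gcurl alias: same headers, and every
# argument is passed through to curl unchanged.
gcurl() {
    curl -H "$(oauth2l header --json ~/credentials.json cloud-platform userinfo.email)" \
         -H "Content-Type: application/json" \
         "$@"
}
```

After defining it (for example in ~/.bashrc), the command from the question should work as written.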

How to check for Jupyter active notebooks through command line

I have an AWS EMR cluster running JupyterHub version 0.8.1+, and I want to check whether any active notebooks are currently running code.
I've tried the commands below, but they don't output what I'm looking for: the user's server is always running, and notebooks can be open without any code being executed.
# only lists running servers and jovyan is always running.
sudo docker exec jupyterhub jupyter notebook list
# No useful information outputted
curl -k -i -H "Accept: application/json" "https://localhost:9443/api/sessions"
# always lists processes regardless of running notebooks
ps aux | grep ipykernel
# The last_activity only updates when a user creates a new file or folder in the ui.
curl -k https://localhost:9443/hub/api/users/$user -H "Authorization: token $admin_token" | jq -r .last_activity
curl -k https://localhost:9443/hub/api/users -H "Authorization: token $admin_token" | jq -r .last_activity
I'm following this AWS blog post to check whether the entire EMR cluster is idle before terminating it, but the Jupyter checks there never seem to have been fully implemented.
https://aws.amazon.com/blogs/big-data/optimize-amazon-emr-costs-with-idle-checks-and-automatic-resource-termination-using-advanced-amazon-cloudwatch-metrics-and-aws-lambda/
Most of the files referenced can be found on GitHub: https://github.com/septian-putra/emr-monitoring
To see whether a notebook kernel is "idle" or "busy", you can run:
curl -ks https://localhost:9443/user/jovyan/api/kernels -H "Authorization: token ${admin_token}"
Pipe that into grep -q inside a simple if statement to get a true/false busy value. Test the pipeline's exit status directly rather than its output, since grep -q prints nothing.
if curl -ks https://localhost:9443/user/jovyan/api/kernels -H "Authorization: token ${admin_token}" | grep -q "busy"; then
JUPYTER_BUSY_NOTEBOOKS=1
else
JUPYTER_BUSY_NOTEBOOKS=0
fi
(curl -ks: -s for silent output and -k to skip SSL certificate verification. jovyan is my admin user.)
Documentation
https://jupyter-kernel-gateway.readthedocs.io/en/latest/websocket-mode.html#http-resources
/api/sessions might also be useful to look at.
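If you need a count rather than a yes/no answer, here is a small sketch. It assumes the kernels endpoint returns a JSON list in which each kernel object carries an execution_state field; countBusyKernels is a hypothetical helper name, and it uses plain grep so that jq is not required:

```shell
#!/usr/bin/env bash
# Count the kernels whose execution_state is "busy" in the JSON on stdin.
# grep -o prints one line per match, so wc -l gives the number of matches
# even when the whole response arrives on a single line.
countBusyKernels() {
    grep -o '"execution_state": *"busy"' | wc -l | tr -d ' '
}
```

For example: JUPYTER_BUSY_NOTEBOOKS=$(curl -ks https://localhost:9443/user/jovyan/api/kernels -H "Authorization: token ${admin_token}" | countBusyKernels)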

Using mssql-tools bcp from HDFS NFS mount

Trying to run bcp from the mssql-tools package (using centos7) to export tsv file data from an HDFS location mounted to local FS via NFS Gateway, but running into errors like...
SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC Driver 17 for SQL Server]Unable to open BCP error-file
or
SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC Driver 17 for SQL Server]Unable to open BCP host data-file
The bcp command being run looks like...
/opt/mssql-tools/bin/bcp "$TABLE" in \
"$filename" \
$TO_SERVER_ODBCDSN \
-U $USER -P $PASSWORD \
-d $DB \
$RECOMMEDED_IMPORT_MODE \
-t "\t" \
-e ${filename}.bcperror.log
# with the actual command w/ variables resolved looks like...
/opt/mssql-tools/bin/bcp "ACCOUNT" in \
"/HDFS_NFS/path/to/tsv/1_0_0.tsv" \
-D -S MyMSSQLServer \
-U myuser -P mypassword \
-d SOME_MSSQL_DB \
-c \
-t "\t" \
-e /HDFS_NFS/path/to/store/errlogs/1_0_0.tsv.bcperror.log
all of this seems fine to me, yet also sometimes getting errors like...
/opt/mssql-tools/bin/bcp: unknown option
usage: /opt/mssql-tools/bin/bcp {dbtable | query} {in | out | queryout | format} datafile ...
so not sure what that's about either. My /etc/odbc.ini file looks like...
[MyMSSQLServer]
Driver=ODBC Driver 17 for SQL Server
Description=My MS SQL Server
Trace=No
Server=<the server's IP>
Anyone know any further debugging tips or fixes for this?
The problem appears to be that the error log file given with the -e option already existed at that location, and HDFS (mounted via NFS or not) did not like bcp trying to overwrite it. To overwrite a file in HDFS you would normally do something like
hadoop fs -put -f /some/local/file /hdfs/location/for/file
and I assume bcp was attempting something different through the NFS gateway. There could also have been latency problems with bcp accessing the HDFS NFS location. Running the bcp command from the original post without the -e option worked.
As a workaround, based on another SO post, I bring the files down (hadoop fs -get ...) to a local temp dir /home/user/tmp/<some uuid>/, do what needs to be done there, and then hadoop fs -put ....
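The workaround could be wrapped up roughly like this. It is only a sketch under assumptions: bcpViaLocalCopy and the BCP_BIN variable are hypothetical names, the paths are HDFS paths (not the NFS mount), and the connection options (-D -S, -U, -P, -d) are passed through as extra arguments:

```shell
#!/usr/bin/env bash
# Path to bcp; override BCP_BIN if it lives elsewhere (assumed location
# taken from the question).
BCP_BIN=${BCP_BIN:-/opt/mssql-tools/bin/bcp}

# Usage: bcpViaLocalCopy <table> <hdfs_tsv> <hdfs_errlog_dir> [bcp options...]
bcpViaLocalCopy() {
    local table=$1 hdfs_file=$2 hdfs_err_dir=$3
    shift 3
    local tmp_dir local_file err_file status
    tmp_dir=$(mktemp -d) || return 1
    local_file="$tmp_dir/$(basename "$hdfs_file")"
    err_file="$local_file.bcperror.log"

    # 1. Bring the data file down to the local filesystem.
    hadoop fs -get "$hdfs_file" "$local_file" || { rm -rf "$tmp_dir"; return 1; }

    # 2. Run bcp on local paths only, so -e never writes into HDFS.
    "$BCP_BIN" "$table" in "$local_file" "$@" -c -t '\t' -e "$err_file"
    status=$?

    # 3. Push the error log back if bcp wrote one, overwriting any old copy.
    if [ -s "$err_file" ]; then
        hadoop fs -put -f "$err_file" "$hdfs_err_dir/"
    fi
    rm -rf "$tmp_dir"
    return "$status"
}
```

mktemp -d stands in for the /home/user/tmp/<some uuid>/ directory, and the function returns bcp's own exit status so the caller can still detect a failed load.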

How to connect to AWS simple AD using ldapsearch?

I've created a Simple AD on AWS and I'm trying to connect to it using the Administrator credentials set up while creating the Simple AD. I'm running the ldapsearch command from another EC2 instance in the same subnet. However, I'm running into an authentication error, and I'm pretty sure it's not the password, as I've tried changing it multiple times with no luck.
Below is the ldapsearch command I'm using.
$ ldapsearch -x -v -h "10.*.*.112" -b "dc=corp-testing,dc=example,dc=com" -D "Administrator@corp-testing.example.com" -W sAMAccountName=Administrator
Below is the output:
ldap_initialize( ldap://10.*.*.112 )
Enter LDAP Password:
ldap_bind: Invalid credentials (49)
additional info: 80090308: LdapErr: DSID-0C0903A9, comment: AcceptSecurityContext error, data 52e, v1db1
Would someone be able to point out the issue on this?
I ran into the same issue and found the solution: the username needs to be prefixed with the directory's NetBIOS name (available on the Directory details page). Then log in with:
ldapsearch -x \
-h 10.*.*.112 \
-b "cn=Users,dc=corp-testing,dc=example,dc=com" \
-D "${NetBIOSNAME}\\Administrator" \
-W sAMAccountName=Administrator
Obviously, change ${NetBIOSNAME} to the appropriate value.
Okay, I figured it out, although I don't know why it works. Try changing your search to:
ldapsearch -x -v -H "ldap://10.*.*.112:389/" -b "dc=corp-testing,dc=example,dc=com" -D "cn=Administrator,dc=corp-testing,dc=example,dc=com" -W sAMAccountName=Administrator
I tried this several times without the trailing / on the URI, but it didn't work.
The problem was that the Administrator account is inside the Users node, so -D should include cn=Users as well:
ldapsearch -x -v -h "10.*.*.112" -b "dc=corp-testing,dc=example,dc=com" -D "cn=Administrator,cn=Users,dc=corp-testing,dc=example,dc=com" -W sAMAccountName=Administrator
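To avoid typing the long DN by hand, it can be derived from the directory's DNS name. A small sketch, where domainToBaseDn and userBindDn are hypothetical helper names and the account is assumed to live in the default Users container:

```shell
#!/usr/bin/env bash
# Turn a DNS domain name into an LDAP base DN:
# every dot becomes ",dc=" and the first label gets a "dc=" prefix.
domainToBaseDn() {
    printf '%s\n' "$1" | sed -e 's/\./,dc=/g' -e 's/^/dc=/'
}

# Build the bind DN for an account under cn=Users.
# Usage: userBindDn <account> <domain>
userBindDn() {
    printf 'cn=%s,cn=Users,%s\n' "$1" "$(domainToBaseDn "$2")"
}
```

For example, -D "$(userBindDn Administrator corp-testing.example.com)" expands to the same bind DN as in the command above.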

Cannot access /encrypt endpoint of PCF p-config-server service

I have followed the instructions from https://github.com/spring-cloud-services-samples/cook and managed to install and run Config Server in PCF environment (SERVICE: Config Server, PLAN: standard).
I'm now trying to hit /encrypt endpoint of the p-config-server service, in order to encrypt new value. I'm following the instructions at http://docs.run.pivotal.io/spring-cloud-services/config-server/configuring-with-git.html#encryption-and-encrypted-values:
TOKEN=$(curl -k ACCESS_TOKEN_URI -u CLIENT_ID:CLIENT_SECRET -d grant_type=client_credentials | jq -r .access_token)
curl -k -H "Authorization: bearer $TOKEN" -H "Accept: application/json" URI/encrypt -d 'VALUE'
...but I always get:
{
"error": "access_denied",
"error_description": "Access is denied"
}
On the other hand, if I call the standard endpoint to get the config for an app, I'm able to retrieve JSON containing the app properties. E.g.
TOKEN=$(curl -k ACCESS_TOKEN_URI -u CLIENT_ID:CLIENT_SECRET -d grant_type=client_credentials | jq -r .access_token)
curl -k -H "Authorization: bearer $TOKEN" -H "Accept: application/json" URI/my-app/default
... gives result like:
{"name":"my-app","profiles":["default"],"label":null,"version":"bb6e64592ced731ebba272430291a595e0f14a77","state":null,"propertySources":[{"name":"https://github.com/some-user/config/my-app.yml","source":{"my-property.name":"Test123"}}]}
Any idea why I can not access /encrypt endpoint?
Thank you.
Btw, I'm executing the command in CentOS Linux release 7.4.1708, with installed jq (command-line JSON processor).
I got the answer from Cloud Foundry IT support. In my CF environment, the "encrypt" endpoint needs a trailing slash (/), so it should be ...URI/encrypt/ -d 'VALUE'. Maybe it helps someone.
One more hint I got: there is a CF CLI plugin for Spring Cloud Services that you can use for convenience.
https://github.com/pivotal-cf/spring-cloud-services-cli-plugin
cf install-plugin -r CF-Community "Spring Cloud Services"
cf help config-server-encrypt-value
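Put together as a function, the curl variant with the trailing slash might look like the sketch below. configServerEncrypt is a hypothetical name, the four connection values are the ones from the service credentials, and whether the trailing slash is required may depend on your CF environment:

```shell
#!/usr/bin/env bash
# Usage: configServerEncrypt <access_token_uri> <client_id> <client_secret> <uri> <value>
configServerEncrypt() {
    local token_uri=$1 client_id=$2 client_secret=$3 uri=$4 value=$5 token
    # Fetch an OAuth token with the client credentials grant.
    token=$(curl -k -s "$token_uri" -u "$client_id:$client_secret" \
                 -d grant_type=client_credentials | jq -r .access_token) || return 1
    # jq prints "null" when the response has no access_token field.
    if [ -z "$token" ] || [ "$token" = "null" ]; then
        return 1
    fi
    # Strip any trailing slash from the base URI, then call /encrypt/
    # with the trailing slash that made the difference here.
    curl -k -s -H "Authorization: bearer $token" -H "Accept: application/json" \
         "${uri%/}/encrypt/" -d "$value"
}
```

It prints whatever the server returns, so on success the output is the encrypted value.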
Actually, you need to run the cf env command first and note the configuration values from its output, which looks for example like this:
{
"VCAP_SERVICES": {
"p-config-server": [
{
"credentials": {
"access_token_uri": "https://p-spring-cloud-services.uaa.cf.wise.com/oauth/token",
"client_id": "p-config-server-876cd13b-1564-4a9a-9d44-c7c8a6257b73",
"client_secret": "rU7dMUw6bQjR",
"uri": "https://config-86b38ce0-eed8-4c01-adb4-1a651a6178e2.apps.wise.com"
},
[...]
Then use those values in your curl bash script, for example:
TOKEN=$(curl -k https://p-spring-cloud-services.uaa.cf.wise.com/oauth/token -u p-config-server-876cd13b-1564-4a9a-9d44-c7c8a6257b73:rU7dMUw6bQjR -d grant_type=client_credentials | jq -r .access_token)
curl -k -H "Authorization: bearer $TOKEN" -H "Accept: application/json" URI/ENDPOINT | jq .
Basically, the following values are required:
ACCESS_TOKEN_URI with the value of credentials.access_token_uri
CLIENT_ID with the value of credentials.client_id
CLIENT_SECRET with the value of credentials.client_secret
URI with the value of credentials.uri
Replace ENDPOINT with the relevant endpoint:
application/profile to retrieve configuration from a Config Server service instance
eureka/apps to retrieve the registry from a Service Registry service instance
After that, I think you will no longer get the access denied response.