I've enabled log file rotation to Amazon S3; every hour Amazon creates a file "var_log_httpd_rotated_error_log.gz" for every instance in my Elastic Beanstalk environment.
First question:
The log files will not overlap, right? Every time Amazon saves the file to S3, it also deletes it from the instance and starts a new one?
Second question:
How can I collect all those files? I want to build a server that collects them and lets me search them for text.
That is what rotating means. It stops writing to one file and begins writing to a new file.
If they are being uploaded to S3, you can write code to download and index those files. Splunk or Loggly may help you here.
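If you only need plain text search rather than a full Splunk/Loggly setup, a small script can pull the rotated files down and scan them. Here is a rough sketch with boto3; the bucket name and prefix are placeholders you would replace with whatever Elastic Beanstalk actually created for your environment.

import gzip
import boto3

s3 = boto3.client("s3")

BUCKET = "my-eb-logs-bucket"             # placeholder, not your real bucket
PREFIX = "resources/environments/logs/"  # placeholder prefix

def search_logs(needle):
    """Download every rotated .gz log under PREFIX and print matching lines."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            if not obj["Key"].endswith(".gz"):
                continue
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            for line in gzip.decompress(body).splitlines():
                text = line.decode("utf-8", errors="replace")
                if needle in text:
                    print(obj["Key"], text)

search_logs("Internal Server Error")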
I am a beginner with GCP. I want to have two folders, processed and unprocessed, in a Cloud Storage bucket. Whenever a file arrives in the bucket from any source, a Cloud Function is triggered; if the file is successfully inserted into the target (such as BigQuery), the file goes into the processed folder, and if not, into the unprocessed folder.
I want to know how I can get alerts when files go into the unprocessed (error) folder.
Do I have to write code, or should I write a Cloud Function, or something else that gets me alerts?
Any help will be appreciated.
Thank you
As you mentioned, using Cloud Functions is the right approach.
A simple function is all that is required; it should be deployed with the proper trigger associated with the bucket.
More details, with examples, can be found here:
https://cloud.google.com/functions/docs/calling/storage
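As a rough illustration of what that function could look like (1st-gen background function style, as in the linked docs): the "unprocessed/" folder name is an assumption, and this sketch only logs an ERROR entry, which you can turn into an email/SMS notification with a Cloud Logging log-based alert, or swap for a Pub/Sub publish.

import logging

def notify_unprocessed(event, context):
    """Triggered by google.storage.object.finalize on the bucket."""
    bucket = event["bucket"]
    name = event["name"]          # object path inside the bucket
    if name.startswith("unprocessed/"):
        # ERROR-level entries are easy to match with a log-based alert policy.
        logging.error("File landed in unprocessed folder: gs://%s/%s", bucket, name)

It would be deployed with --trigger-event google.storage.object.finalize and --trigger-resource pointing at your bucket, as described on the docs page above.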
I have an ETL application which is supposed to migrate to AWS infrastructure. The scheduler used in my application is Tivoli Workload Scheduler, and we want to keep using it on the cloud as well, where it has file dependencies.
Now when we move to AWS, the files to be watched will land in an S3 bucket. Can we put the OPEN dependency on files in S3? If yes, what would be the hostname (HOST#Filepath)?
If not, what services should be used to serve this purpose? I have both time and file dependencies in my schedules.
E.g. a file might get uploaded to S3 at 1 AM. At 3 AM my schedule gets triggered and looks for the file in the S3 bucket. If it is present, execution starts; if not, it should wait as per the other parameters on TWS.
Any help or advice would be nice to have.
If I understand this correctly, the job triggered at 3 AM needs to identify all files uploaded within the last, e.g., 24 hours.
You can list all S3 objects to find everything uploaded within a specific period of time.
A better solution would be to create an S3 upload trigger which sends a message to SQS, and have your code inspect the queue depth (number of messages) there and start processing the files one by one. An additional benefit is the assurance that all items are processed without having to worry about time overlaps.
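A sketch of both options with boto3, using placeholder bucket/queue names: the first function lists objects whose LastModified falls within the last N hours (what the 3 AM job could do on its own), the second checks the depth of an SQS queue fed by an S3 event notification.

from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

def recent_uploads(bucket, prefix="", hours=24):
    """Return keys whose LastModified falls within the last `hours` hours."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=hours)
    keys = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["LastModified"] >= cutoff:
                keys.append(obj["Key"])
    return keys

def pending_messages(queue_url):
    """Approximate number of S3 event messages waiting in the queue."""
    attrs = sqs.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    return int(attrs["Attributes"]["ApproximateNumberOfMessages"])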
I have developed one small application. For logs, I'm using watchtower with AWS. The logs are working fine; logs are inserted into CloudWatch split by source file, but I want all logs to be recorded under a single file name only (for example api.views). Is this possible? If yes, how?
Solved this problem: I wrote a logging function in one file and call that function wherever I want to write logs, so all logs are saved under the same file name.
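Something along these lines; the module, group and stream names are made up, and note that the keyword arguments of CloudWatchLogHandler differ between watchtower versions (log_group_name/log_stream_name in newer releases, log_group/stream_name in older ones).

# log_util.py (hypothetical shared module), imported by every other module
import logging
import watchtower

logger = logging.getLogger("api.views")      # single shared logger name
logger.setLevel(logging.INFO)
logger.addHandler(
    watchtower.CloudWatchLogHandler(
        log_group_name="my-app",             # assumed log group name
        log_stream_name="api.views",         # everything lands in one stream
    )
)

def write_log(message, level=logging.INFO):
    """Call this from any module so all logs share one stream."""
    logger.log(level, message)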
I have a monthly activity where I get hundreds of PDF files in a folder and I need to transfer those to an AWS server. Currently I do this manually, but I need to automate the process of transferring all PDF files from my local folder to a specific folder on AWS.
Also this process takes a lot of time (approx. 5 hours for 500 PDF files). Is there a way to speed up the process?
While copying from local to AWS you are presumably using a tool like WinSCP or an SSH client, so you could automate the same thing with a script:
scp [-r] /your/pdf/dir youruser@awshost:/home/user/path/
If you want more speed, you could run multiple scp commands in parallel in multiple terminals, and perhaps split the files into logically grouped directories as they are created.
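If you prefer to script the parallel scp part rather than open several terminals, a rough Python wrapper could look like this; it assumes key-based SSH auth is already set up, and the user, host and paths are placeholders.

import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

REMOTE = "youruser@awshost:/home/user/path/"   # placeholder

def copy_one(pdf):
    # Each call runs one scp process; several run at the same time.
    subprocess.run(["scp", str(pdf), REMOTE], check=True)

pdfs = list(Path("/your/pdf/dir").glob("*.pdf"))
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(copy_one, pdfs))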
You can zip the files and transfer the archive, then unzip it after the transfer.
Or else write a program which iterates over all files in your folder and uploads them to S3 using the S3 API methods, as in the sketch below.
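A minimal sketch of that second suggestion, uploading straight to S3 with boto3 and the same parallel pattern; the bucket name and key prefix are assumptions.

import boto3
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

s3 = boto3.client("s3")
BUCKET = "my-pdf-bucket"     # placeholder bucket name

def upload_one(pdf):
    # Key prefix "monthly/" is just an example layout.
    s3.upload_file(str(pdf), BUCKET, f"monthly/{pdf.name}")

pdfs = list(Path("/your/pdf/dir").glob("*.pdf"))
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(upload_one, pdfs))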
My system generates large log files continuously and I want to upload all of them to Amazon S3. I am planning to use the s3 sync command for this. My system appends logs to the same file until it reaches about 50 MB, then it creates a new log file. I understand that the sync command will sync the modified local log file to the S3 bucket, but I don't want to upload the entire log file every time the file changes, as the files are large and sending the same data again and again will consume my bandwidth.
So I am wondering: does the s3 sync command send the entire modified file, or just the delta?
The documentation implies that it copies the whole updated file:
"Recursively copies new and updated files"
Plus, there would be no way to compute a delta without downloading the file from S3, which would effectively double the cost of an upload, since you'd pay for both the download and the upload.