I got my first job doing BI support with AWS, and the company runs several Glue jobs, which are very expensive. I want to change that: instead of Glue jobs, use Lambda functions. The question is, how do I convert a Glue job into a Lambda function? Can anybody help? Thanks.
In general: you don't.
A Glue job can a) run for far longer, b) consume far more resources, and c) carry code and dependencies far exceeding Lambda's limits. You can't replace a Glue job with a Lambda unless you did not need a Glue job in the first place, i.e. you operate on little data, for a short time, with little code. Even if that is the case, you would need to be a lot more specific about how the current job is integrated: triggers will no longer work, network connectivity might no longer work, etc.
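If the workload really is that small, a minimal Lambda sketch might look like the following. This is purely illustrative: the bucket, keys, and CSV columns are made-up assumptions, and it only works for objects small enough to fit in Lambda's memory and time limits.

```python
import csv
import io

import boto3

s3 = boto3.client("s3")


def handler(event, context):
    # Hypothetical source/target locations; replace with your own.
    bucket = "my-etl-bucket"
    src_key = "incoming/events.csv"
    dst_key = "processed/events_clean.csv"

    # Read the small input file into memory.
    body = s3.get_object(Bucket=bucket, Key=src_key)["Body"].read().decode("utf-8")
    rows = list(csv.DictReader(io.StringIO(body)))

    # Trivial transform: drop rows without a user_id and lowercase event names.
    cleaned = [
        {**row, "event": row["event"].lower()}
        for row in rows
        if row.get("user_id")
    ]

    # Write the transformed file back to S3.
    if cleaned:
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=list(cleaned[0].keys()))
        writer.writeheader()
        writer.writerows(cleaned)
        s3.put_object(Bucket=bucket, Key=dst_key, Body=out.getvalue())

    return {"rows_in": len(rows), "rows_out": len(cleaned)}
```

Anything heavier than this, in data volume, run time, or dependencies, is exactly what Glue is for.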
A video player sends the server log data about what the user has been doing (start, pause, play, playing, etc.)
Sending the logs to the server and storing them in the DB, then running queued jobs to calculate stats on these has worked... okay, so far.
It's clear there should be some sort of optimization here. What services provide the best custom log storage?
What would be the best manual option? I'm considering running some Lambda functions and storing the data in AWS (RDS?) manually, but I'm wondering if maintaining such a setup is warranted.
I would store the logs in AWS S3 (storage), then use AWS Glue (transform) and AWS Athena for ad-hoc querying of the different stats. This will still work out cheaper than a traditional database approach, and it has a lot of other advantages.
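As a rough illustration of the query side of that pipeline, here is a hedged boto3 sketch that runs an ad-hoc Athena query over logs stored in S3. The database, table, column names, and result bucket are assumptions for illustration; the real schema would come from your Glue crawler or table definition.

```python
import time

import boto3

athena = boto3.client("athena")

# Hypothetical names; replace with your Glue database/table and result bucket.
DATABASE = "video_logs"
OUTPUT = "s3://my-athena-results/player-stats/"

QUERY = """
SELECT event, COUNT(*) AS occurrences
FROM player_events
WHERE dt = '2024-01-01'
GROUP BY event
"""


def run_query():
    execution = athena.start_query_execution(
        QueryString=QUERY,
        QueryExecutionContext={"Database": DATABASE},
        ResultConfiguration={"OutputLocation": OUTPUT},
    )
    query_id = execution["QueryExecutionId"]

    # Poll until the query finishes (fine for ad-hoc use).
    while True:
        state = athena.get_query_execution(QueryExecutionId=query_id)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(1)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    return athena.get_query_results(QueryExecutionId=query_id)
```

You pay Athena per data scanned, so partitioning the logs (e.g. by date) keeps the per-query cost small.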
I would like to use the ML model I created in AWS in my QuickSight reports.
Is there a way to consume the ML endpoint in order to run batch predictions in QuickSight?
Can I define a 'calculated field' in order to do that?
At this time there is no direct integration between Amazon SageMaker and QuickSight. However, you can use SageMaker's batch transform jobs to score data outside of QuickSight and then import the results into QuickSight for visualization. Batch transform jobs write their output to S3, which is a supported input data source for QuickSight.
https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-sagemaker-supports-high-throughput-batch-transform-jobs-for-non-real-time-inferencing/
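A hedged sketch of kicking off such a batch transform with boto3 might look like this; the job name, model name, bucket paths, and instance type are placeholders, and the real input/output formats depend on how your model was trained.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Hypothetical names/paths; substitute your own model and buckets.
sagemaker.create_transform_job(
    TransformJobName="quicksight-batch-predictions-2024-01-01",
    ModelName="my-trained-model",
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://my-bucket/batch-input/",
            }
        },
        "ContentType": "text/csv",
        "SplitType": "Line",
    },
    TransformOutput={
        # Predictions land here; point QuickSight at this prefix.
        "S3OutputPath": "s3://my-bucket/batch-output/",
        "AssembleWith": "Line",
    },
    TransformResources={
        "InstanceType": "ml.m5.large",
        "InstanceCount": 1,
    },
)
```

Once the job finishes, the predictions in the output prefix can be imported into QuickSight like any other S3 data set.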
Depending on how fancy you want to be, you can also integrate calls to AWS services such as AWS Lambda or AWS SageMaker as a user-defined function (UDF) within your datastore. Here are a few resources that may help:
https://docs.aws.amazon.com/redshift/latest/dg/user-defined-functions.html
https://aws.amazon.com/blogs/big-data/from-sql-to-microservices-integrating-aws-lambda-with-relational-databases/
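As a rough sketch of the Lambda side of that UDF pattern (assuming the database forwards batches of rows to a Lambda function and reads back scores, as the linked blog post describes for relational databases), the handler might look something like this; the payload shape and the scoring logic are illustrative assumptions, not a real integration.

```python
def handler(event, context):
    # Assume the UDF sends a batch of rows, e.g. {"rows": [[1.2, 3.4], [5.6, 7.8]]}.
    rows = event.get("rows", [])

    # Placeholder "model": score each row with a toy linear function.
    # In practice you would invoke your SageMaker endpoint or load a model here.
    predictions = [0.5 * r[0] + 0.25 * r[1] for r in rows]

    return {"predictions": predictions}
```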
Calculated fields will probably not help you in this regard - calculated fields are restricted to a relatively small set of operations, and none of these operations support calls to external sources.
https://docs.aws.amazon.com/quicksight/latest/user/calculated-field-reference.html
I have an EC2 instance that schedules many tasks (using crontab).
Some of them are executed every 1 minute, some every 5 minutes, and so on.
I want to move all cron tasks into AWS service.
I am trying to figure out which AWS service would give me the best solution.
I found 2 services that can schedule cron-like tasks:
AWS Data Pipeline
AWS Lambda
Which of them would give me the best solution?
I don't know how you want to define "best", but if you have many tasks, each one will require a separate pipeline, and that will cost you around $1 each.
Lambda, on the other hand, will probably be much cheaper: you get 1M requests free, and they're $0.20 per million after that. You will also get charged based on the time and memory each task takes to run. There are some limits (5 minutes is the max run time, I think), so you'll have to take that into consideration.
But overall, I think Lambda will be much cheaper to run.
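For scale, a task that runs every minute is only about 43,200 invocations a month, comfortably inside the free tier. Assuming the schedule is driven by a CloudWatch Events (EventBridge) cron rule targeting the Lambda function, which the answer does not spell out, a minimal boto3 sketch might look like this; the rule name, function ARN, and schedule are placeholders.

```python
import boto3

events = boto3.client("events")
lambda_client = boto3.client("lambda")

# Placeholder values; replace with your own function and schedule.
RULE_NAME = "every-five-minutes"
FUNCTION_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:my-cron-task"

# Create (or update) a scheduled rule that fires every 5 minutes.
rule = events.put_rule(
    Name=RULE_NAME,
    ScheduleExpression="cron(0/5 * * * ? *)",
    State="ENABLED",
)

# Allow CloudWatch Events to invoke the function.
lambda_client.add_permission(
    FunctionName=FUNCTION_ARN,
    StatementId="allow-events-every-five-minutes",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule["RuleArn"],
)

# Point the rule at the Lambda function.
events.put_targets(
    Rule=RULE_NAME,
    Targets=[{"Id": "my-cron-task", "Arn": FUNCTION_ARN}],
)
```

One rule per crontab entry, each pointing at the relevant function, replaces the crontab on the EC2 instance.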
For our video platform we store all of our video files in AWS S3 (and sometimes deliver them via CloudFront). Customers are divided into groups; for every group we created a bucket with a cost allocation tag.
So at this point we can monitor storage and streaming costs per group. But for a new project we are required to produce those reports per customer.
What would be the best approach? We could create a bucket for every customer, but I'm not a fan of that.
We could inspect the access logs, but according to the manual they can be "wrong".
Any suggestions?
The documentation is only hedging against the occasional lost or delayed log file. They are not guaranteed to be perfect, but in practice, they are reliable. I get the sense that the purpose of the disclaimer is to avoid petty disputes, rather than significant discrepancies.
Consider using the logs to do your own reporting on your existing projects, where you already know the costs... and compare those results to the results you get with the tag-based billing setup. If the answers are consistent, the problem seems effectively solved.
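A hedged sketch of that kind of do-it-yourself reporting: parse the S3 server access logs and sum bytes sent per customer, assuming each customer's objects live under a per-customer key prefix (an assumption; your layout may differ). The field positions follow the standard space-delimited S3 access log layout, with simplified parsing.

```python
import shlex
from collections import defaultdict

import boto3

s3 = boto3.client("s3")

# Hypothetical locations; replace with your log bucket/prefix.
LOG_BUCKET = "my-access-logs"
LOG_PREFIX = "video-bucket-logs/"


def bytes_sent_per_customer():
    """Sum bytes sent per customer, assuming keys look like 'customer-id/...'."""
    totals = defaultdict(int)
    paginator = s3.get_paginator("list_objects_v2")

    for page in paginator.paginate(Bucket=LOG_BUCKET, Prefix=LOG_PREFIX):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=LOG_BUCKET, Key=obj["Key"])["Body"]
            for line in body.read().decode("utf-8").splitlines():
                fields = shlex.split(line)
                if len(fields) < 14:
                    continue  # skip malformed lines
                # With shlex splitting, the bracketed timestamp occupies two
                # tokens, so the object key is field 8 and bytes sent is field 12.
                key, bytes_sent = fields[8], fields[12]
                customer = key.split("/", 1)[0]
                if bytes_sent.isdigit():
                    totals[customer] += int(bytes_sent)
    return dict(totals)
```

Run the same aggregation over a group you already bill with tags and compare the totals; if they line up, the per-customer numbers should be trustworthy enough for reporting.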
I am looking for a way to pass log events from an AWS application to my company site.
The thing is that the AWS application is 100% firewalled from everything except one IP address, because it's an encryption-related service.
I just don't know which service I should use for this. There are so many services that I really have no idea which one fits.
I think I'd just use a simple messaging service; does this make sense? The thing is, there are plenty of events (let's say 1M per day), so I don't want big extra costs for this.
Sorry for the generic question, but I think it's quite concrete: "What is the best way to pass event messages from AWS when the volume is approximately 1M per day at about 256 bytes each on average?"
I'd like to connect to an AWS service rather than to any of the EC2 hosts...
On both sides I have Tomcats with the AWS SDK.
I just want to avoid rewriting. Maybe I should do it with S3? The files are immutable, but I could upload files every hour. I don't need real-time events; I just need the log files on site for user-experience analysis and so that customers can access them. But having the logs in 1M chunks would require further assembling, etc. I am really confused, sorry.
Kinesis is good for streaming event data. S3 is good if you already have files that you want stored.
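If Kinesis is the route, a minimal producer sketch might look like the following; the stream name and event shape are assumptions, and at roughly 1M events per day you would normally batch records (put_records) rather than send them one at a time.

```python
import json

import boto3

kinesis = boto3.client("kinesis")

# Hypothetical stream name; create the stream separately.
STREAM_NAME = "app-log-events"


def send_event(event: dict) -> None:
    """Push a single ~256-byte log event into the stream."""
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(event).encode("utf-8"),
        # Partition by user so each user's events stay ordered within a shard.
        PartitionKey=str(event.get("user_id", "anonymous")),
    )


# Example usage:
send_event({"user_id": 42, "action": "play", "ts": "2024-01-01T12:00:00Z"})
```

A consumer on your side (or Kinesis Data Firehose delivering to S3) can then assemble the events into hourly log files, which also covers the S3 option you mention.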