I need to run OCR on images that contain text in Hindi, Marathi, Malayalam, and other languages. I am using the AWS Textract API from a Python script, but OCR on a scanned Hindi document returns incorrect, English-like words.
Does AWS Textract support the Hindi language?
Please guide me on this.
Thank You in Advance.
The following AWS documents list the languages supported by Amazon Textract:
Amazon Textract supports handwriting and five new languages
Hard Limits in Amazon Textract
Amazon Textract FAQs
You could also consider Amazon Comprehend; please refer to this document for more information about the languages it supports.
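To check what the service actually returns for a given page, here is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. `detect_document_text` is the synchronous Textract call; the helper names and the file path argument are made up for illustration. It collects the LINE blocks from the response and tests whether any character falls in the Devanagari Unicode block (U+0900 to U+097F), which Hindi and Marathi use. If the scanned page is Hindi but the check is False, the service returned Latin characters rather than recognising the script.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def contains_devanagari(text):
    """True if any character lies in the Devanagari block (U+0900..U+097F)."""
    return any("\u0900" <= ch <= "\u097f" for ch in text)


def ocr_image(path):
    """Run Textract on a local image file (hypothetical path) and return its lines."""
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("textract")
    with open(path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return extract_lines(response)
```

Calling `contains_devanagari(" ".join(ocr_image("scan.png")))` on a Hindi scan is a quick way to confirm the symptom described in the question.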
I am trying to access the AWS RDS API to describe DB snapshots. I plan to parse the response so that I can list all the available snapshots by ID using Groovy. However, the biggest problem I am having is calling the API in the first place. I took a look at AWS's reference on this topic, but I can't figure out how to generate the pre-signed portion of the request with credentials. I am not sure why that part is even necessary. Why can't the user authenticate using the access key ID and secret access key combination?
The reference:
https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_DescribeDBSnapshots.html
The section with the issue:
https://rds.us-west-2.amazonaws.com/
?Action=DescribeDBSnapshots
&IncludePublic=false
&IncludeShared=true
&MaxRecords=100
&SignatureMethod=HmacSHA256
&SignatureVersion=4
&Version=2014-09-01
&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=AKIADQKE4SARGYLE/20140421/us-west-2/rds/aws4_request
&X-Amz-Date=20140421T194732Z
&X-Amz-SignedHeaders=content-type;host;user-agent;x-amz-content-sha256;x-amz-date
&X-Amz-Signature=4aa31bdcf7b5e00dadffbd6dc8448a31871e283ffe270e77890e15487354bcca
If Groovy is a hard requirement, I'd look into something like this: https://grails.org/plugin/aws-sdk
If you're comfortable with Java, I'd say use the official AWS SDK.
If you're scripting this out, you could also use the official AWS CLI tool and do something like:
aws rds describe-db-snapshots [OPTIONS]
From there you could use a tool like jq to zero in on and parse out your specific IDs. You can find more documentation here.
The way you'd authorize with the SDK is either through environment variables (the preferred approach) or by hardcoding your KEY and SECRET (a big no-no).
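On the question of why the key pair is not simply sent along: with Signature Version 4 the secret access key never travels with the request. A signing key is derived from it and used to compute the X-Amz-Signature value, which is what the SDK and CLI do for you. Below is a rough sketch of only the key-derivation and signing steps, using the Python standard library. It assumes the string to sign (built from the canonical request) already exists, and the function names are illustrative rather than part of any AWS library.

```python
import hashlib
import hmac


def _hmac_sha256(key, msg):
    """HMAC-SHA256 of a text message with a bytes key, returned as raw bytes."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key, date_stamp, region, service):
    """Derive the SigV4 signing key; date_stamp is YYYYMMDD, e.g. '20140421'."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key, string_to_sign):
    """Hex digest that would be sent as X-Amz-Signature."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

The date, region and service arguments correspond to the pieces visible in the X-Amz-Credential parameter of the example request (20140421/us-west-2/rds/aws4_request).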
I think that rather than trying to communicate with the API directly, you should make use of the built-in wrappers that AWS provides.
If you're accessing this from a supported programming language, take a look at the AWS SDKs. There are currently officially supported libraries for:
C++
Go
Java
JavaScript
.NET
NodeJS
PHP
Python
Ruby
If your language of choice is not covered, there may already be a third-party solution. Alternatively, take a look at the AWS CLI to resolve your problem.
For your specific action describe-db-snapshots you can get a list of all IDs by running the below, then parse as JSON.
aws rds describe-db-snapshots --query 'DBSnapshots[*].DBSnapshotIdentifier' --output json
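If Python is an option, the same listing can be sketched with boto3, assuming it is installed and credentials come from the environment. `describe_db_snapshots` and its paginator are real boto3 calls; the function names and the default region (taken from the URL in the question) are only for illustration.

```python
def snapshot_ids(pages):
    """Collect DBSnapshotIdentifier values from describe_db_snapshots response pages."""
    return [
        snap["DBSnapshotIdentifier"]
        for page in pages
        for snap in page.get("DBSnapshots", [])
    ]


def list_snapshot_ids(region="us-west-2"):
    """Query RDS and return all snapshot IDs; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so snapshot_ids() works on its own

    paginator = boto3.client("rds", region_name=region).get_paginator(
        "describe_db_snapshots"
    )
    return snapshot_ids(paginator.paginate())
```

The paginator follows the Marker field for you, so results beyond the MaxRecords limit of a single response are included.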
I want to extract structured text from PDF contracts with AWS Textract.
Is the service configurable? For example, can I set the minimum vertical blank space the system uses to split paragraphs?
Thank you!
No, Amazon Textract is not configurable. It is a managed service that is meant to work "as is" without detailed settings.
Available actions are shown on: Actions - Amazon Textract
I'm looking for a tool that will allow me to query my Redshift cluster via a REST API. I'm building an analytics UI and would rather not have to stand up a separate server in order to query my instance.
Any suggestions would be much appreciated.
You could use the boto SDK to describe Redshift clusters (describe_clusters).
http://boto.cloudhackers.com/en/latest/ref/redshift.html
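A minimal sketch of that call, written against boto3 (the successor to the boto library linked above, which is an assumption about what you have installed) and assuming credentials are configured. Note that `describe_clusters` returns cluster metadata such as the endpoint; it does not run SQL against the cluster. The function names are illustrative.

```python
def cluster_endpoints(response):
    """Map each ClusterIdentifier to (address, port), or None if no endpoint is listed."""
    result = {}
    for cluster in response.get("Clusters", []):
        endpoint = cluster.get("Endpoint")
        result[cluster["ClusterIdentifier"]] = (
            (endpoint.get("Address"), endpoint.get("Port")) if endpoint else None
        )
    return result


def describe_redshift_clusters():
    """Call Redshift and summarise the clusters; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so cluster_endpoints() works on its own

    return cluster_endpoints(boto3.client("redshift").describe_clusters())
```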
I am trying to consider AWS as an option now that Parse is shutting down.
My question is: when creating a social network similar to Instagram, would AWS S3 for photo/video storage combined with DynamoDB for the database (friend connections, URL references to photos in S3) be a viable choice? Are these two products together roughly equivalent to what Parse offers?
I am not an expert in this topic, but I would like to share my opinion. AWS is more than capable of satisfying the needs of most applications. (As far as I know, Parse was using AWS for its infrastructure; I can't give a reference though.)
The drawback is that the learning curve is much steeper compared to what Parse offered. Also, in my opinion, the SDK documentation has a lot of room for improvement.
I was developing an app that is a kind of social network using Parse, and most of the work was done when Parse announced the bad news. After long research I decided to migrate to AWS. I am using Cognito for authentication, DynamoDB for data storage, Lambda for "cloud code", S3 for "PFFile" and SNS for push notifications. As far as I have seen, these services combined completely substitute for Parse; however, it is harder to implement.
I hope this was helpful.
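To make the S3 plus DynamoDB pairing concrete, here is a rough sketch of storing a photo in S3 and a record pointing at it in DynamoDB. It assumes boto3 and configured credentials; the table name, attribute names and key layout are all hypothetical, and only `put_object` and `put_item` are real boto3 calls.

```python
def photo_item(user_id, photo_id, bucket, created_at):
    """Build a DynamoDB item (low-level attribute format) that points at an S3 object."""
    key = f"photos/{user_id}/{photo_id}.jpg"  # hypothetical key layout
    return {
        "user_id": {"S": user_id},
        "photo_id": {"S": photo_id},
        "s3_bucket": {"S": bucket},
        "s3_key": {"S": key},
        "created_at": {"S": created_at},
    }


def save_photo(image_bytes, item, table="Photos"):
    """Upload the bytes to S3, then record the reference in a (hypothetical) table."""
    import boto3  # imported lazily so photo_item() works without boto3

    boto3.client("s3").put_object(
        Bucket=item["s3_bucket"]["S"], Key=item["s3_key"]["S"], Body=image_bytes
    )
    boto3.client("dynamodb").put_item(TableName=table, Item=item)
```

The database only holds the bucket and key, so the media itself stays in S3, which matches the split described in the question.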
Hello all, I am looking for an Amazon API that can be used to pull the search volume of a keyword from Amazon. Is there any such API available?
This website does the same, but I'm not sure which services or APIs it uses: https://www.merchantwords.com/
As Amazon does not publish these numbers, my guess is that it's just applied statistics, based on search and Amazon-ranking click-through distributions and the way products' Amazon rankings fluctuate.