Python AWS SDK is missing the Transcribe streaming API

I checked the GitHub code for Transcribe streaming options, and it looks like there is no mention of Transcribe streaming, either in the docs or in the service definition file: src/botocore/botocore/data/transcribe/2017-10-26/service-2.json.
But I see documentation for Ruby: https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/TranscribeStreamingService.html
This is why I believe it makes sense to support it in a scripting (not compiled) language.
Are there any plans to add it in the future? If yes, when approximately?
Or am I simply missing it?
I saw that the documentation describes the low-level communication for this kind of API, but I want to save development time by reusing the library.

The Python SDK for Transcribe streaming is not available yet.

Related

Working with google cloud storage in julia applications

I have a query related to Google Cloud Storage for a Julia application.
Currently, I am hosting a Julia application (Docker container) on GCP and would like to allow the app to use Cloud Storage buckets to write and read data.
I have explored a few packages that promise to do this.
GoogleCloud.jl
The docs for this package show a clear and concise implementation. However, adding this package results in incremental compilation warnings, with many of the packages failing to compile. I have opened an issue on their GitHub page: https://github.com/JuliaCloud/GoogleCloud.jl/issues/41
GCP.jl
The scope is limited; currently the only supported service is BigQuery.
Python package google
This is quite informative and operational, but it will take a toll on the code's performance. Do advise, though, if this is the only viable option.
I would like to know: are there other methods that can be used to configure a Julia app to work with Google Cloud Storage?
Thanks, I look forward to the suggestions!
GCP.jl is promising, and you may be able to do this with gRPC, if Julia supports gRPC (see below).
Discovery
Google has two types of SDK (a.k.a. Client Library). API Client Libraries are available for all of Google's APIs|services.
Cloud Client Libraries are newer and more language-idiomatic, but are only available for Cloud services. Google Cloud Storage (GCS) is part of Cloud but, in this case, I think an API Client Library is worth pursuing...
Google's API (!) Client Libraries are auto-generated from a so-called Discovery document. Interestingly, GCP.jl specifically describes using Discovery to generate the BigQuery SDK and mentions that you can use the same mechanism for any other API Client Library (e.g. GCS).
NOTE: See the explanation of Google Discovery.
I'm unfamiliar with Julia but, if you can understand enough of that repo to confirm that it's using the Discovery document to generate APIs, and can work out how to reconfigure it for GCS, this approach would provide you with a 100%-fidelity SDK for Cloud Storage (and any other Google API|service).
Someone else tried to use the code to generate an SDK for Sheets and had an issue, so it may not be perfect.
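If it helps to see what that generator consumes: a Discovery document is plain JSON served over HTTPS, one per API|version. A minimal sketch of fetching the one for Cloud Storage (shown in Java purely for illustration; the URL pattern comes from Google's Discovery Service documentation):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class FetchDiscoveryDoc {
        public static void main(String[] args) throws Exception {
            // The Discovery document for Cloud Storage v1: JSON describing every
            // GCS REST resource and method -- the input an SDK generator consumes.
            HttpRequest request = HttpRequest.newBuilder(URI.create(
                    "https://www.googleapis.com/discovery/v1/apis/storage/v1/rest")).build();
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }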
gRPC
Google publishes Protobufs for the subset of its services that support gRPC. If you'd prefer to use gRPC, it ought to be possible to use the Protobufs in Google's repo to define a gRPC client for Cloud Storage.

AWS SDK S3Waiter waitUntilObjectExists usage guidance

I am new to the AWS S3 SDK and need guidance on using S3Waiter.waitUntilObjectExists(); I could not find any exact examples. I have an S3 bucket into which upstream processes upload files every four hours. I have a while(true) loop that polls this bucket, but it seems unnecessary and does a lot of I/O.
I read about S3Waiter.waitUntilObjectExists() and it seems to be the applicable, best-practice solution in my case.
The examples for the Amazon S3 V2 API are located in the AWS documentation GitHub repo. You will find the latest Java V2 examples there, tested via unit tests:
https://github.com/awsdocs/aws-doc-sdk-examples/tree/master/javav2/example_code/s3/src/main/java/com/example/s3
For example, if you want to learn how to use waiters when you create a bucket, see this example:
https://github.com/awsdocs/aws-doc-sdk-examples/blob/master/javav2/example_code/s3/src/main/java/com/example/s3/CreateBucket.java
This concept is also explained in the AWS SDK for Java 2.x Developer guide:
Using waiters
And yes, using waiters is best practice vs looping and polling.
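For what it's worth, here is a minimal sketch of waitUntilObjectExists() with the Java SDK 2.x (bucket and key names are made up); the waiter issues the HeadObject polling for you, with backoff and a timeout, so the while(true) loop goes away:

    import software.amazon.awssdk.core.waiters.WaiterResponse;
    import software.amazon.awssdk.services.s3.S3Client;
    import software.amazon.awssdk.services.s3.model.HeadObjectRequest;
    import software.amazon.awssdk.services.s3.model.HeadObjectResponse;
    import software.amazon.awssdk.services.s3.waiters.S3Waiter;

    public class WaitForObject {
        public static void main(String[] args) {
            S3Client s3 = S3Client.create();
            S3Waiter waiter = s3.waiter();
            HeadObjectRequest request = HeadObjectRequest.builder()
                    .bucket("my-bucket")        // made-up bucket
                    .key("incoming/batch.csv")  // made-up key
                    .build();
            // Blocks until the object exists or the waiter gives up.
            WaiterResponse<HeadObjectResponse> response = waiter.waitUntilObjectExists(request);
            response.matched().response().ifPresent(r ->
                    System.out.println("Object exists, size: " + r.contentLength()));
            s3.close();
        }
    }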

Does Amazon provide S3DistCp java api?

I want to integrate S3-to-Hadoop file transfer into Java code. Does Amazon provide any API for this task? I want to use S3DistCp.
Not AFAIK, but you can use the Apache one; it's in the org.apache.hadoop:hadoop-distcp module. I use it in tests (more specifically, the Hadoop Azure and S3A clients use it to verify that DistCp works with their object stores).
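As a rough sketch of driving it programmatically (paths are made up, and this assumes the Hadoop 2.x DistCpOptions constructor; Hadoop 3.x builds the options via DistCpOptions.Builder instead):

    import java.util.Collections;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.tools.DistCp;
    import org.apache.hadoop.tools.DistCpOptions;

    public class CopyToS3 {
        public static void main(String[] args) throws Exception {
            // S3A credentials are assumed to be configured in core-site.xml
            // (fs.s3a.access.key / fs.s3a.secret.key) or via the environment.
            Configuration conf = new Configuration();
            DistCpOptions options = new DistCpOptions(
                    Collections.singletonList(new Path("hdfs:///data/export")), // made-up source
                    new Path("s3a://my-bucket/backup/"));                       // made-up target
            // execute() launches the MapReduce copy job and blocks until it finishes.
            new DistCp(conf, options).execute();
        }
    }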
You can use a Java decompiler to see the source code and the implementation inside s3distcp.jar, which can be downloaded from the following location:
s3://elasticmapreduce/libs/s3distcp/1.latest/s3distcp.jar

Mocks for AWS SimpleWorkflowService and ElasticMapReduce

Are there any mocks for AWS SWF or EMR available anywhere? I tried looking at some other AWS API mocks, such as https://github.com/atlassian/localstack/ or https://github.com/treelogic-swe/aws-mock, but they don't have SWF or EMR, which are the services that would be really painful to reproduce. I'm just not sure if anyone has heard of a way to locally test things that depend on those services.
The "moto" project (https://github.com/spulec/moto) groups mocks for the "boto" library (the official python sdk for AWS), and it has mocks for basic things in SWF (disclaimer: I'm the author who contributed them) and EMR.
If you happen to work in Python, they're ready to use via the @mock_swf decorator (use moto 0.4.x for boto 2.x, or 1.x for boto 3.x). If you work with another language, moto supports a server mode that mimics an AWS endpoint. The SWF service is not provided out of the box there yet, but with a minor change in "moto/backends.py" you should be able to try it. I think the EMR service works out of the box.
Should you have any issue with the SWF mocks in this project, you can file an issue on the GitHub project; don't hesitate to cc me directly (@jbbarth), and I can probably help improve them.
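For the server-mode route from a non-Python language, the idea is to start moto as a local endpoint (e.g. moto_server emr -p 5000; the port is arbitrary) and point your SDK client at it. A sketch with the AWS SDK for Java 2.x (the credentials are dummies; moto doesn't validate them):

    import java.net.URI;

    import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
    import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.emr.EmrClient;
    import software.amazon.awssdk.services.emr.model.ListClustersResponse;

    public class MotoEmrExample {
        public static void main(String[] args) {
            // Point the EMR client at the local moto server instead of AWS.
            EmrClient emr = EmrClient.builder()
                    .endpointOverride(URI.create("http://localhost:5000"))
                    .region(Region.US_EAST_1)
                    .credentialsProvider(StaticCredentialsProvider.create(
                            AwsBasicCredentials.create("testing", "testing")))
                    .build();

            ListClustersResponse clusters = emr.listClusters();
            System.out.println("Clusters: " + clusters.clusters());
        }
    }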

Can we consume DynamoDB Streams in C++ using the AWS SDK for C++?

I created a table with streams enabled. How can I get the stream records whenever an update/insertion happens? What are the steps involved? Or is it not possible in C++?
You should be able to do it using the AWS SDK for C++.
Here is an example:
https://aws.amazon.com/blogs/developer/using-a-thread-pool-with-the-aws-sdk-for-c/
The AWS blog has many more examples: https://aws.amazon.com/blogs/developer/category/cpp/
Check these links below (look at aws-cpp-sdk-kinesis) for more info:
https://github.com/aws/aws-sdk-cpp
https://aws.amazon.com/blogs/aws/aws-sdk-for-c-now-ready-for-production-use/
https://aws.amazon.com/blogs/aws/introducing-the-aws-sdk-for-c/
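Whatever the language, the call sequence for reading a stream is the same: DescribeStream to list the shards, GetShardIterator per shard, then GetRecords in a loop. A sketch of that flow (written in Java for brevity; the aws-cpp-sdk-dynamodbstreams client exposes the same operations, and the stream ARN is made up):

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreams;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClientBuilder;
    import com.amazonaws.services.dynamodbv2.model.*;

    public class ReadStream {
        public static void main(String[] args) {
            String streamArn = "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/stream/LABEL"; // made up
            AmazonDynamoDBStreams streams = AmazonDynamoDBStreamsClientBuilder.defaultClient();

            DescribeStreamResult desc = streams.describeStream(
                    new DescribeStreamRequest().withStreamArn(streamArn));
            for (Shard shard : desc.getStreamDescription().getShards()) {
                // TRIM_HORIZON = start from the oldest record still retained in the stream.
                String iterator = streams.getShardIterator(new GetShardIteratorRequest()
                        .withStreamArn(streamArn)
                        .withShardId(shard.getShardId())
                        .withShardIteratorType(ShardIteratorType.TRIM_HORIZON))
                        .getShardIterator();
                while (iterator != null) {
                    GetRecordsResult result = streams.getRecords(
                            new GetRecordsRequest().withShardIterator(iterator));
                    for (Record record : result.getRecords()) {
                        System.out.println(record.getEventName() + ": " + record.getDynamodb());
                    }
                    if (result.getRecords().isEmpty()) break; // demo only: stop when the shard is idle
                    iterator = result.getNextShardIterator();
                }
            }
        }
    }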