Is Amazon S3 storage a well-specified standard?

I have seen several storage products advertised as "S3-compatible".
Is there a well-defined specification available for these compatible products and if so, which things are specified?
For example, is there a well-defined REST API? Eventual consistency guarantees? Availability of webhooks?
I understand that different vendors make different business and implementation choices in these products, so I'd like to know where that line is drawn and whether "S3-compatible" means anything more precise than "sort of like the product Amazon sells, or used to sell the last time we checked".

"S3-compatible" typically means "you can use any tools that work with S3, but point them to us and it will work fine".
This includes using the AWS Command-Line Interface (CLI) with a different endpoint (and a different set of credentials, of course).
The Amazon S3 REST API Reference is well-defined and has effectively become an industry standard.
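For example, a minimal sketch with boto3 pointed at a hypothetical S3-compatible vendor; the endpoint URL, bucket, and credentials below are placeholders, not a real service (with the AWS CLI, the equivalent is its --endpoint-url option):

```python
# Minimal sketch: the standard AWS SDK (boto3) pointed at a hypothetical
# S3-compatible endpoint. Endpoint, bucket, and credentials are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example-vendor.com",  # vendor endpoint, not AWS
    aws_access_key_id="VENDOR_ACCESS_KEY",
    aws_secret_access_key="VENDOR_SECRET_KEY",
)

# The same calls you would make against AWS S3 work unchanged:
s3.put_object(Bucket="my-bucket", Key="hello.txt", Body=b"hello")
body = s3.get_object(Bucket="my-bucket", Key="hello.txt")["Body"].read()
print(body)
```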

Related

What does the gidl query string parameter represent in the header of the 'CheckCookie' 302 redirect after authenticating to Google Cloud Console?

I am trying to determine what the query string parameters represent after authenticating to Google Cloud Console.
It is the same when I log in to different GCP accounts, and it is also the same when a friend logs into their GCP account. However, I noticed from this Stack Overflow question that the gidl there is different from mine and my friend's. Does anyone know what this ID represents?
An interface description language (IDL) is a language that allows unambiguous specification of the interfaces that client objects may use and that (server) object implementations provide, as well as all needed related constructs such as exceptions and data types.
IDLs are commonly used in remote procedure call software. In these cases the machines at either end of the link may be using different operating systems and programming languages, and the IDL offers a bridge between the two systems.
Programs that use these interfaces or create the associated data types cannot be written in the IDL itself, but in a programming language for which mappings from IDL constructs have been defined.
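As a loose illustration only (not a real IDL toolchain), here is a toy IDL-style interface in a comment and a hand-written Python mapping of it; real IDL compilers (CORBA, protobuf, Thrift) generate such mapping code automatically:

```python
# Illustration only: a toy IDL-style interface (in the comment below) and a
# hand-written Python "mapping" of it. Real IDL compilers generate this code.
#
#   // hypothetical IDL
#   interface Quote {
#       float price(in string product_id) raises (UnknownProduct);
#   };
from abc import ABC, abstractmethod


class UnknownProduct(Exception):
    """The IDL exception mapped onto a native Python exception."""


class Quote(ABC):
    """Client-side view: the interface, independent of any implementation."""

    @abstractmethod
    def price(self, product_id: str) -> float: ...


class QuoteServant(Quote):
    """Server-side object implementation, written in the target language."""

    _prices = {"widget": 9.99}

    def price(self, product_id: str) -> float:
        try:
            return self._prices[product_id]
        except KeyError:
            raise UnknownProduct(product_id)
```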

Correct Architecture for Micro Services with Multiple Customer Interfaces

I am new to micro services and I am keen to use this architecture. I am interested to know what architecture structure should be used for systems with multiple customer interfaces where customer systems may use one or many of the available services. Here is a simple illustration of a couple of ways I think it would be used:
An example of this type of system could be:
- a company with multiple staff using the system for product quotes, using the products, quotes, and users microservices
- a company with a website to display products, using the products microservice
- a company with multiple staff using the system for its own quotes, using the quotes and users microservices
Each of these companies would have its own custom-built interface, displaying only the relevant services.
As in the illustrations, all quotes, products, and users could be stored locally to the microservices, using unique references to identify each company's records. I don't know if this is advisable, as the data could grow fast and become difficult to manage.
Alternatively, I could store data such as users and quotes locally to the client system and reference the microservices only for data that is generic. Here the microservices would just handle common logic and return results. This feels somewhat illogical and problematic to me.
I've not been able to find anything online to explain the best course of action for this scenario and would be grateful for any experienced feedback on this.
I am afraid you will not find many useful recipes or patterns for microservice architectures yet. I think the relative quiet on your question is because it doesn't have enough detail for anybody to readily grasp. I will hazard a guess:
From first principles, you have the concept of a quote, which would have to interrogate the product service to get a price and other details. It might need to access users to produce commission information, and customers for things like discounts and lead times. Similar concepts may be used in different applications; for example inventory, catalog, and ordering [slightly different from quote].
The idea in microservices is to reduce the overlap between these concepts by extracting the common operations into their own (micro)services and constructing the aggregate services in terms of them, as in the sketch below. Just because something exists as a service does not mean it has to be publicly available; it can be private to just these services.
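A hedged sketch of that aggregation idea in Python (the service URL, route, and field names are hypothetical):

```python
# Sketch: a quote service composing its result from a (private) product
# service over HTTP. URL, route, and field names are hypothetical.
import requests  # third-party HTTP client


PRODUCT_SERVICE = "http://product-service.internal/api"  # private to the platform


def build_quote(product_id: str, quantity: int, discount: float = 0.0) -> dict:
    """Combine the product service's price with quote-local logic."""
    resp = requests.get(f"{PRODUCT_SERVICE}/products/{product_id}", timeout=2)
    resp.raise_for_status()
    product = resp.json()
    subtotal = product["price"] * quantity
    return {
        "product_id": product_id,
        "quantity": quantity,
        "total": round(subtotal * (1 - discount), 2),
    }
```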
When you have factored your system into these single-function services, the resulting system will communicate more, but it can be deployed more flexibly. For example, more resources and/or redundancy might be applied to the product service if it is overtaxed by requests from many services. In the end, infrastructure like a service mesh helps isolate the implementation of these microservices from these sorts of deployment considerations.
Don't be misled into thinking there is a free lunch. Microservice architectures require more upfront work in defining the service boundaries. A failure in this critical area can yield much worse problems than a poorly scaled monolithic app. Even when you have defined your services well, you might find they rely upon external services that are not as well considered. The only solace is that it is much easier to insulate yourself from these if you have already insulated the rest of your system from its parts.
After much research, following various online courses, video tutorials, and some documentation provided by Netflix, I have come to understand that the first structure in the diagram is the best solution.
Each service should be able to effectively function independently, with the exception of referencing other services for additional information. Each service should in effect be able to be picked up and put into another system without any need to be aware of anything beyond the API layer of the architecture.
I hope this is of some use to someone trying to get to grips with this architecture.

Google Speech API compliance in Canada

A customer I am working with wants to use Google Speech API for transcribing audio but there are compliance concerns.
I know that you can upload files directly or have the API access files in Google Cloud Storage. For either of these methods is anyone familiar with how they interact with the data compliance laws in Canada?
For instance if the audio files are uploaded to a Cloud Storage bucket at the Montreal datacenter and we make an API call on it does the file ever leave that datacenter?
Thanks in advance for any insights!
Stack Overflow is not a great place to get a legal opinion, but is there a particular standard for compliance that they require? Google Cloud has a number of international data compliance certifications, one of which might be the one your customer requires. Talk to your customer and see what they need, and take a look at Google Cloud's list of standards that they are compliant with to see if it meets those needs: https://cloud.google.com/security/compliance
For example, the Cloud Speech API is compliant with ISO 27018, an international standard for cloud service privacy. Is that sufficient for your customer? You'll need to ask them.
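For reference, the pattern being asked about looks roughly like this, assuming the google-cloud-speech client library and a hypothetical bucket; note that nothing in the client code itself pins down where the audio is processed, which is exactly why the question has to be settled against Google's compliance documentation:

```python
# Rough sketch: Speech-to-Text reading audio directly from a Cloud Storage
# bucket. The bucket (assumed to be in northamerica-northeast1, Montreal)
# and the file name are hypothetical.
from google.cloud import speech

client = speech.SpeechClient()
audio = speech.RecognitionAudio(uri="gs://my-montreal-bucket/interview.flac")
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.FLAC,
    sample_rate_hertz=16000,
    language_code="en-CA",
)

# Nothing in this call constrains *where* Google processes the audio;
# that is a platform guarantee to verify, not something the SDK controls.
response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)
```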

Does SNS retain my data?

I am evaluating push notification services and cannot use services on the cloud as laws prohibit customer identification data being stored off-premise.
Question
Is there any chance data will be stored off-premise if I use the AWS SNS API (not the console) to send push notifications to end-user devices via code hosted on-premise (using the AWS SDK)? In other words, will SNS retain my data, or will it forget it right after it sends the notification?
What have I tried so far?
Combed through the documentation as much as I could, but couldn't find anything to be 100% sure.
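For reference, this is the kind of call I mean, a minimal boto3 sketch (the endpoint ARN and payload are placeholders for an already-registered device endpoint):

```python
# Minimal sketch of the on-premise code in question: publishing a push
# notification through SNS with boto3. ARN and payload are placeholders.
import json

import boto3

sns = boto3.client("sns", region_name="us-east-1")

# Platform endpoint ARN for one device, previously registered with SNS.
endpoint_arn = "arn:aws:sns:us-east-1:123456789012:endpoint/GCM/my-app/EXAMPLE"

sns.publish(
    TargetArn=endpoint_arn,
    MessageStructure="json",
    Message=json.dumps({
        "default": "Hello",
        "GCM": json.dumps({"notification": {"title": "Hello", "body": "Hi"}}),
    }),
)
```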
Would appreciate any pointers on this. TIA.
I would pose this question directly to AWS, as it pertains to a legal requirement. I would clarify whether the laws you need to comply with concern data at rest, data in transit, or both, and whether there are circumstances where either would be acceptable once certain security requirements are met.
Knowing no real detail about your use case, I will say that AWS has a Region specifically for use by the US Government (GovCloud). If your solution is for the US Government, then you should make use of this Region, as it ticks off a lot of compliance forms for you well in advance.
You can open a support ticket in the AWS console.
Again if there is a legal requirement for your data I thoroughly recommend that you ask AWS directly so that you may reference their answer in writing in the future.
Even if they didn't store it, how can you prove that to auditors?
Besides, what is the difference between storing something in memory (which they obviously have to do) and storing something on disk? One is volatile and the other isn't I guess. But from a compliance point of view, an admin on the box can get both, so who cares if the hardware with your data on it is a stick of RAM or a disk plugged into a SATA port?

What is low-level storage management like iRODS exactly for (in fedora-commons)?

I am not clear about the actual advantage of having iRODS or any other low-level storage management. What are its benefits exactly, and when should we use it?
In Fedora-commons with normal file system low level storage:
a datastream created on May 8th, 2009 might be located in the 2009/0508/20/48/ directory.
How is iRODS helpful here?
I wanted to close the loop here, for other Stack Overflow users.
You posted the same question to our Google Group https://groups.google.com/d/msg/irod-chat/fti4ZHvmS-Y/LU8CQCZQHwAJ The question was answered there, and, thanks to you, the response is now also posted on the iRODS.org FAQ: http://irods.org/faq/
Here it is, once again, for posterity:
Don’t think of iRODS as simply low level storage management.
iRODS is really the only platform for policy managed data preservation. It does indeed virtualize storage, providing a global, logical namespace over heterogeneous types of storage, but it also allows you to enforce preservation policies at each storage location, no matter what client or access method is used. It also provides a global metadata catalog that is automatically maintained and reflects the application of your preservation policies, allowing audit and verification of your preservation policies.
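To make the logical-namespace and metadata-catalog point concrete, here is a small sketch using the python-irodsclient library; the host, zone, and paths are placeholders:

```python
# Sketch with python-irodsclient: clients address data by logical path and
# attach catalog metadata, regardless of the physical storage underneath.
from irods.session import iRODSSession

with iRODSSession(host="irods.example.org", port=1247, user="alice",
                  password="secret", zone="exampleZone") as session:
    logical_path = "/exampleZone/home/alice/datastream.xml"

    # Upload: iRODS records the physical location internally; clients see
    # only the logical namespace (contrast the 2009/0508/20/48/ layout).
    session.data_objects.put("datastream.xml", logical_path)

    # Attach metadata to the catalog entry; server-side policies (rules)
    # can apply such metadata automatically on ingest.
    obj = session.data_objects.get(logical_path)
    obj.metadata.add("preservation-policy", "replicate-twice")
```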
iRODS is developing a powerful metadata management capability, with pluggable indexing and query capabilities that allow synchronization with external indices (e.g. Elasticsearch, MAUI, the Jena triple store).
With the pluggable rule engine and asynchronous messaging architecture, it becomes rather straightforward to generate audit and provenance metadata that will track every single (pre- and post-) operation on your data, including any plugins you may develop or utilize.
iRODS is middleware rather than a prepackaged solution. This middleware supports plugins and configurable policies at all points, so you are not limited by a pre-defined set of tools. iRODS can also be connected to a wide range of preservation, computation, and enterprise services, can manage large amounts of data (both in number of objects and size of those objects), and can efficiently move and manage data using high-performance protocols, including third-party data transfer protocols.
iRODS is built to support federation, so that your preservation environment may share data with other institutions or organizations while remaining under your own audit and policy control. Many organizations are doing this for many millions of objects, many thousands of users, and with a large range of object sizes.