We have an internal application. Over time, new applications were requested that exchange data with each other, and the interaction became bound to the database schema, meaning changes in the database require changes everywhere else. As we plan to build even more applications that depend on the same data, this will quickly become an unmanageable mess.
Now I'm looking to abstract that interaction behind an API, but I'm having trouble choosing the right tool.
The interaction can be complex at times: data is posted to one service, and once the action has completed, the sender should be notified of that.
Another example: some data has no context without data from other services. Let's say there is one service for [Schools] and one for [Students]. If a [School] gets deleted or changed, the [Student] needs to be informed about it immediately, not the next time he comes to [School].
Advice? Suggestions? SOAP/REST/?
I don't think you need an API. In my opinion you need an architecture which decouples your database from the domain logic and the other parts of the application. Examples of such an architecture are clean architecture, onion architecture, and hexagonal architecture (ports & adapters by another name). They share the same concepts: you have domain logic which does not depend on any framework, external lib, delivery method, data storage solution, etc. This domain logic communicates with the outside world through adapters with well-defined interfaces. If you first design the inside of your domain logic and the interfaces of the adapters, and only afterwards the outside components, then it is called domain-driven design (DDD).
So for example, if you want to move from MySQL to MongoDB, you already have a DataStorageInterface, and the only thing you need is to write a MongoDBAdapter which implements this interface, and of course migrate the data...
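To make the idea concrete, here is a minimal ports & adapters sketch in Python; only DataStorageInterface and MongoDBAdapter come from the answer, the method names and the pymongo collection are my assumptions:

```python
from abc import ABC, abstractmethod

class DataStorageInterface(ABC):
    """Port: the domain logic depends only on this interface."""

    @abstractmethod
    def save_user(self, user_id: str, name: str) -> None: ...

    @abstractmethod
    def find_user(self, user_id: str) -> dict | None: ...

class MongoDBAdapter(DataStorageInterface):
    """Adapter: moving from MySQL to MongoDB means writing this class
    (and migrating the data), without touching the domain logic."""

    def __init__(self, collection):
        self._users = collection  # e.g. a pymongo collection

    def save_user(self, user_id: str, name: str) -> None:
        self._users.replace_one({"_id": user_id},
                                {"_id": user_id, "name": name},
                                upsert=True)

    def find_user(self, user_id: str):
        return self._users.find_one({"_id": user_id})
```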
To design the adapters you can use two additional concepts: command query responsibility segregation (CQRS) and event sourcing (ES).

CQRS is for connecting delivery methods like REST, SOAP, web applications, etc. to the domain logic. For example, you can raise a CreateUserCommand from your REST API. The proper listener in the domain logic processes that command, and on success it raises a domain event, like UserCreatedEvent. Your REST API can listen to that event and respond with a success message to the REST client. The UserCreatedEvent can be listened to by one or more storage adapters too, so they can process that event and persist the new user. You don't necessarily have to use only a single database: if a relational database is faster for a specific type of query, you can use that, and if a NoSQL database suits the job better, you can use that too. So you can use as many databases as you want for your queries; the only thing you need is to write a storage adapter for each of them. For example, if your REST client wants to retrieve the profile of a specific user, it can raise a GetUserProfileByIdQuery, and the domain logic can ask the adapter of a database which can serve that query. The adapter can then send, for example, an SQL query to a MySQL database and return the response.

With ES you add an EventStorage to your system, which stores the raised domain events. It can be very useful if you want to migrate your data from one query database to another: you create a new storage adapter for the new database and replay all of the domain events from the EventStorage in historical order to that adapter, so it can fill the new database with the relevant data. That's all; you don't have to write complicated migration scripts...
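A minimal Python sketch of that command/event flow. CreateUserCommand and UserCreatedEvent are from the answer; the in-process bus and the in-memory EventStorage are illustrative assumptions:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CreateUserCommand:
    user_id: str
    name: str

@dataclass
class UserCreatedEvent:
    user_id: str
    name: str

class EventBus:
    """In-process stand-in for whatever eventing infrastructure you use."""
    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def publish(self, event):
        for listener in self._listeners[type(event)]:
            listener(event)

def handle_create_user(cmd: CreateUserCommand, bus: EventBus) -> None:
    # ... domain validation would happen here ...
    bus.publish(UserCreatedEvent(cmd.user_id, cmd.name))

bus = EventBus()
event_store = []  # the ES "EventStorage": every domain event, in order
bus.subscribe(UserCreatedEvent, event_store.append)
# A storage adapter and the REST layer can both listen to the same event:
bus.subscribe(UserCreatedEvent, lambda e: print("persist user", e.user_id))
bus.subscribe(UserCreatedEvent, lambda e: print("respond 201 for", e.user_id))

handle_create_user(CreateUserCommand("42", "Alice"), bus)
```

Migrating to a new query database then means replaying event_store in historical order against the new adapter, exactly as described above.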
In your case I think you should create at least domain events, and use event sourcing. That will totally decouple your database from the other parts of your application. Adding a REST or SOAP API can have a similar effect, but making HTTP connections to access your database can slow down your application.
Related
My question is regarding clean architecture. I am quite new to .NET Core, and my assignment is to integrate a third-party API into a console app so the data is stored in an SQL database. I have been researching for the past two days, and I found many examples of how to create a repository service querying a local database, but I cannot find any decent example of how to integrate data from external APIs into a local data storage solution, an SQL database in my case.
Any information on how to structure services in a clean manner would help me a lot!
It basically boils down to whether you need to store data from a third-party API locally. You are not the owner of that data.
In many cases it's better to query the external API each time you need some data. You don't know if the data will change, or when. E.g. you query weather data or a currency conversion rate each time. The external system knows how to produce such results, so ask the system each time. It's often OK to cache data from third-party APIs, depending on the needs of your application (does your user need weather updates with each refresh, or is once a day enough, e.g. on the web site of a skiing area).
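For the caching option, a minimal sketch of a time-based cache in Python; the URL and the one-day TTL are assumptions:

```python
import time
import requests

_cache: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 24 * 60 * 60  # "once a day is enough"

def get_weather(city: str) -> dict:
    now = time.time()
    if city in _cache and now - _cache[city][0] < TTL_SECONDS:
        return _cache[city][1]  # cached copy is fresh enough: reuse it
    resp = requests.get("https://api.example.com/weather",
                        params={"city": city}, timeout=5)
    resp.raise_for_status()
    data = resp.json()
    _cache[city] = (now, data)
    return data
```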
Now, there can be cases when the third-party API is loosely coupled with your domain. E.g. you use a separate API to manage devices in a smart-home solution. In this case we are talking about a microservice architecture (Microservices by Fowler is a good start). It's OK in this case to build your local copy of the data from the third-party API. Imagine your service is only interested in how often devices get replaced. You may query the device management API for broken devices and store this information locally in the form that is better for your application. You get the data, reorganize it so it fits you better, store it, and use it.
So basically, as always with such broad questions: it depends. On what you are building, on what the external API provides, and on what you want to do with the data you query.
Getting technical on your question: you query data, e.g. via REST, from a third party. This gets you TheirObjectDto. You process this data. Their object may change; you are not the owner of the contract. You create MyObject, which contains the data that you need, and save it to your DB (via a repository, if you want). You build an entity and a DB table for it. When the third-party API changes and returns TheirObjectDto2, you still work with MyObject. The minimal change for you is to convert TheirObjectDto2 to MyObject; your app continues working. As a second step, if you want some new info from TheirObjectDto2, you modify your internal data structure.
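A minimal sketch of that mapping in Python; TheirObjectDto and MyObject are from the answer, the field names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class TheirObjectDto:   # shape owned by the third party
    full_name: str
    extra_stuff: str

@dataclass
class MyObject:         # shape owned by you, stored in your DB
    name: str

def to_my_object(dto: TheirObjectDto) -> MyObject:
    # The only place that knows about their contract. When they ship
    # TheirObjectDto2, only this function has to change.
    return MyObject(name=dto.full_name)
```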
I'd like to use the microservices architectural pattern for a new system, but I'm having trouble figuring out how to share and merge data between the services when the services are isolated from each other. In particular, I'm thinking of returning consolidated data to populate a web app UI over HTTP.
For context, I'm intending to deploy each service to its own isolated environment (Heroku), where I won't be able to communicate internally between services (e.g. via //localhost:PORT). I plan to use RabbitMQ for inter-service communication, and Postgres for the database.
The decoupling of services makes sense for CREATE operations (a sketch of this flow follows the steps below):
Authenticated user with UserId submits 'Join group' webform on the frontend
A new GroupJoinRequest including the UserId is added to the RabbitMQ queue
The Groups service picks up the event and processes it, referencing the user's UserId
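Here is the sketch of that CREATE flow using pika, a common Python RabbitMQ client; the queue name and message shape are assumptions:

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="group_join_requests", durable=True)

# Frontend/API side: enqueue the GroupJoinRequest
channel.basic_publish(
    exchange="",
    routing_key="group_join_requests",
    body=json.dumps({"type": "GroupJoinRequest",
                     "userId": "user-123", "groupId": "group-9"}),
)

# Groups service side: pick up the event and process it
def on_message(ch, method, properties, body):
    event = json.loads(body)
    print("adding", event["userId"], "to group", event["groupId"])
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="group_join_requests", on_message_callback=on_message)
# channel.start_consuming()  # blocks; in reality this runs in the Groups service
```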
However, READ operations are much harder if I want to merge data across tables/schemas. Let's say I want to get details for all the users in a certain group. In a monolithic design, I'd just do a SQL JOIN across the Users and the Groups tables, but that loses the isolation benefits of microservices.
My options seem to be as follows:
Database-per-service, public API per service
To view all the Users in a Group, a site visitor gets a list of UserIDs associated with a group from the Groups service, then queries the Users service separately to get their names.
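A minimal sketch of this composition in Python; the URLs and response shapes are assumptions:

```python
import requests

def users_in_group(group_id: str) -> list[dict]:
    # First round trip: the Groups service knows only the UserIDs.
    group = requests.get(f"https://groups.example.com/groups/{group_id}",
                         timeout=5).json()
    user_ids = group["userIds"]
    # Second round trip: the Users service resolves IDs to names.
    users = requests.get("https://users.example.com/users",
                         params={"ids": ",".join(user_ids)},
                         timeout=5).json()
    # Merging happens client-side, in application code.
    return [{"id": u["id"], "name": u["name"]} for u in users]
```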
Pros:
very clear separation of concerns
each service is entirely responsible for its own data
Cons:
requires multiple HTTP requests
a lot of postprocessing has to be done client-side
multiple SQL queries can't be optimized
Database-per-service, services share data over HTTP, single public API
A public API server handles request endpoints. Application logic in the API server makes requests to each service over an HTTP channel that is only accessible to other services in the system.
Pros:
good separation of concerns
each service is responsible for an API contract but can do whatever it wants with schema and data store, so long as API responses don't change
Cons:
non-performant
HTTP seems a weird transport mechanism to be using for internal comms
ends up exposing multiple services to the public internet (even if they're notionally locked down), so security threats grow from greater attack surface
Database-per-service, services share data through message broker
Given I've already got RabbitMQ running, I could just use it to queue requests for data and then to send the data itself (see the sketch after the steps). So for example:
client requests all Users in a Group
the public API service sends a GetUsersInGroup event with a RequestID
the Groups service picks this up, and adds the UserIDs to the queue
the Users service picks this up, and adds the User data onto the queue
the API service listens for events with the RequestID, waits for the responses, merges the data into the correct format, and sends back to the client
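Here is the sketch of the broker-based request/response flow from the API service's side, again using pika; the queue names, payloads, and correlation via RequestID are assumptions based on the steps above:

```python
import json
import uuid
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="queries")
channel.queue_declare(queue="api_responses")

# Public API service: publish the query, tagged with a RequestID.
request_id = str(uuid.uuid4())
channel.basic_publish(
    exchange="",
    routing_key="queries",
    body=json.dumps({"type": "GetUsersInGroup",
                     "requestId": request_id, "groupId": "group-9"}),
)

# Public API service: collect partial responses, correlate, then merge.
partials = []

def on_response(ch, method, properties, body):
    msg = json.loads(body)
    if msg["requestId"] == request_id:
        partials.append(msg["data"])  # merge once all parts have arrived
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue="api_responses", on_message_callback=on_response)
# channel.start_consuming()  # blocks while waiting for the responses
```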
Pros:
Using existing infrastructure
good decoupling
inter-service requests remain internal (no public APIs)
Cons:
Multiple SQL queries
Lots of data processing at the application layer
harder to reason about
Seems strange to pass large quantities of data around via the event system
Latency?
Services share a database, separated by schema, other services read from VIEWs
Services are isolated into database schemas. Schemas can only be written to by their respective services. Services expose a SQL VIEW layer on their schemas that can be queried by other services.
The VIEW functions as an API contract; even if the underlying schema or service application logic changes, the VIEW exposes the same data, so that consumers are insulated from internal changes.
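A minimal sketch of the VIEW-as-contract idea using psycopg2 against Postgres; the schema, table, and column names are assumptions:

```python
import psycopg2

conn = psycopg2.connect("dbname=app")
with conn, conn.cursor() as cur:
    # Owned by the Users service: the internal schema can change freely,
    # as long as the view keeps exposing the same columns.
    cur.execute("""
        CREATE OR REPLACE VIEW users_api.users AS
        SELECT u.id, u.display_name AS name
        FROM users_internal.accounts u;
    """)
    # Other services only ever query the stable view:
    cur.execute("SELECT id, name FROM users_api.users WHERE id = ANY(%s)",
                (["user-1", "user-2"],))
    print(cur.fetchall())
```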
Pros:
Presumably much more performant (single SQL query can get all relevant data)
Foreign key management much easier
Less infrastructure to maintain
Easier to run reports that span multiple services
Cons:
tighter coupling between services
breaks the idea of fundamentally atomic services that don't know about each other
adds a monolithic component (database) that may be hard to scale (in contrast to atomic services which can scale databases independently as required)
Locks all services into using the same system of record (Postgres might not be the best database for all services)
I'm leaning towards the last option, but would appreciate any thoughts on other approaches.
To evaluate the pros and cons, I think you should focus on what the microservices architecture is aiming to achieve. In my opinion, microservices is an architectural style aiming to build loosely coupled applications. It is not designed to build high-performance applications, so some sacrifice of performance and some data redundancy are things we are ready to accept when we decide to build applications the microservices way.
I don't think your services should share a database; tighter coupling sacrifices the main objective of the microservices architecture. My suggestion is to create a consolidated data service which picks up the data-change events from all the other services and updates the database behind it. You might want to design the database behind the consolidated data service in a way that is optimized for queries (like a data warehouse), because that's all this service will be used for. You might want to consider using a NoSQL database to support your consolidated data service.
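A minimal sketch of such a consolidated data service in Python: one consumer subscribes to change events from the other services and maintains a query-optimized copy. The event shapes, queue name, and the in-memory stand-in for the store are assumptions:

```python
import json
import pika

read_model: dict[str, dict] = {}  # stand-in for the warehouse/NoSQL store

def apply_event(event: dict) -> None:
    # Simplified: assumes events arrive in order.
    if event["type"] == "UserCreated":
        read_model[event["userId"]] = {"name": event["name"], "groups": []}
    elif event["type"] == "UserJoinedGroup":
        read_model[event["userId"]]["groups"].append(event["groupId"])

def on_message(ch, method, properties, body):
    apply_event(json.loads(body))
    ch.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="domain_events", durable=True)
channel.basic_consume(queue="domain_events", on_message_callback=on_message)
channel.start_consuming()
```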
I'm creating a web application and decided to use a microservices approach. Would you please tell me what the best, or at least a common, approach is to organize access to the database from all the web services (login, comments, etc.)? Is it good practice to create a DAO web service and use only it to read/write values in the application's database, or should each web service have its own DAO layer?
Each microservice should be a full-fledged application with all necessary layers (which doesn't mean there cannot be shared code between microservices, but they have to run in separate processes).
Besides, it is often recommended that each microservice have its own database. See http://microservices.io/patterns/data/database-per-service.html and https://www.nginx.com/blog/microservices-at-netflix-architectural-best-practices/. Therefore, I don't really see the point of a web service that would only act as a data-access facade.
Microservices are great, but it is not good to start with too many microservices right away. If you have doubts about how to define the boundaries between microservices in your application, start with a monolith (all the while keeping the code clean, object-oriented, and well designed, with clear layers and interfaces). When you get to a more mature state of the application, you will more easily see the right places to split it into independently deployable services.
The key is to keep together things that should really be coupled. When we try to decouple everything from everything, we end up creating too many layers of interfaces, and this slows us down.
I think it's not a good approach.
DB operations are critical in any process, so they must live in the DAO layer inside the microservice. Why wouldn't you want to implement it inside?
Using a shared service, you lose control, and if you have to change the process logic, you have to change the DAO service (affecting all the services).
In my opinion it is not a good idea.
I think that using services to expose data from a database is ideal because of the flexibility it provides. Developing a REST service to expose some or all of your data as a service provides the flexibility to consume the data directly in the UI via AJAX or by other services, which can process the data and generate new information. These consumers do not need to implement a DAO and can be written in any language. While a REST service over your entire database is probably not a microservice, a case could be made for breaking it down as read-only services for Students, Professors, and Classes exposed on the school web site(s), with separate Create, Update, and Delete (CUD) services available only to the registrar's office desktop applications.
For example, building a service that exposes a statistical value over the data protects the data from examination by a user/program that only needs the statistic, without requiring the service to implement an entire DAO for the components of that statistic. Full-function databases like SQL Server or Oracle provide a lot of functionality that application developers can use, including complex queries (using indexes), statistics, and the application of set operations on data.
Having a database service is a completely valid pattern. In fact, it is one of the key examples in the book Building Microservices of where to start when extracting aspects of a monolith into a microservice.
How to organize your code around such an idea is a different issue. Yes, from the DB client programmer's standpoint, having the same DAO layer on each DB client makes a lot of sense.
The DAO pattern may be suitable for binding your DB to the one programming language that you use. But then you need to ask yourself why you are exposing your database as a web service if all access to it will be mediated by the same DAO infrastructure. Or are you going to create one DAO implementation for each client programming language binding?
If all database clients are going to be written in the same programming language, then are you sure you really need to wrap your DB as a microservice? After all, the DB is usually already a remote service with a well-defined network protocol optimized to transfer data fast and reliably. Why add HTTP on top of it? What are you expecting to gain from adding such complexity?
Another problem with using the DAO pattern is that the DAO structure does not necessarily follow the evolution of the web service. The web service may evolve in ways that do not break old clients, and you may have different clients using different features of the microservice. In that case you are not sharing the same DAO layer structure across clients.
Make sure you are not using RPC-style programming over web services; it does not make much sense, and you would basically be throwing away one of the key advantages of microservices: the decoupling between service and client.
I've just started messing around with AWS DynamoDB in my iOS app and I have a few questions.
Currently, I have my app communicating directly with my DynamoDB database. I've been reading around lately, and people are saying this isn't the proper way to go about getting data from my database.
By this I mean I just have a function in my code that queries my DynamoDB database and returns the result.
The way I do it works, but is there a better way I should be going about this?
Amazon DynamoDB itself is a highly scalable service, and standing up another server in front of it means scaling that server as well, in line with the RCUs/WCUs configured for your tables, which we can and should avoid.
If your mobile application doesn't need a backend server and you can perform all the business functions from the mobile device, then you should probably think about the following (a sketch follows the list):
Using the AWS DynamoDB SDK for iOS to write the client application that runs on the mobile device
Using the AWS Token Vending Machine to authenticate your mobile users and grant them credentials for running operations on DynamoDB tables
Controlling access (i.e. which operations should be allowed on which tables, etc.) using IAM policies
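Here is the sketch, in Python/boto3 rather than the iOS SDK, of calling DynamoDB with vended temporary credentials; the table name, key, and credential placeholders are assumptions:

```python
import boto3

session = boto3.Session(
    aws_access_key_id="ASIA...",   # temporary credentials, handed to the
    aws_secret_access_key="...",   # client app by the token-vending service
    aws_session_token="...",
    region_name="us-east-1",
)
table = session.resource("dynamodb").Table("UserData")

# An IAM policy on the vended role can restrict access to the caller's own
# partition key (e.g. via the dynamodb:LeadingKeys condition key).
item = table.get_item(Key={"userId": "user-123"}).get("Item")
print(item)
```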
HTH.
From what you say, I guess that you are talking about a way to distribute data to many clients (iOS apps).
There are a few integration patterns (a very good book on this: Enterprise Integration Patterns), one of which is called Shared Database. It is essentially about using a common database for multiple clients to share data. The main drawback of that pattern (in your case) is that you are making assumptions about what the database schema looks like. It can potentially give you headaches supporting the schema in the future if your business logic changes.
The more advanced approach would be to send events on every change in your data instead of writing changes to the database directly from the client apps. This way you can add additional processing to the events before the data they carry is written to the database. For example, you may want to change the event format in a new version of your app but still support legacy users, so you add a translation procedure which transforms both types of events into the format that fits the database schema. It's basically a question of whether to work with diffs vs. snapshots.
You should be aware of the added complexity of working with events; it can be overkill if your app is simple and changes to the schema are unlikely.
Also consider that you can do such data preprocessing using DynamoDB Streams, which gives you some of the advantages of using events while keeping the implementation simple.
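A minimal sketch of such preprocessing as an AWS Lambda handler consuming a DynamoDB Stream; the translation logic is an assumption, while the Records/NewImage layout is the standard Streams event format:

```python
def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        new_image = record["dynamodb"]["NewImage"]
        # DynamoDB attribute values are typed, e.g. {"S": "Alice"}.
        name = new_image["name"]["S"]
        # ...translate/enrich here, then write the result wherever it
        # needs to go (another table, a warehouse, an event queue)...
        print("changed item:", name)
```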
I’ve been trying to wrap my head around how to expose my domain objects to the client. Whether I’m using a rich client or I’m using the web, I want to use the MVP and repository patterns.
What I’m trying to wrap my head around is how I expose my repository and model, which will be on the server. Is it even possible to expose complex business objects that have state via a web service, or will I have to use a proprietary technology that is not language/platform agnostic, like .Net remoting, EJB, COM+, DCOM, etc?
Some other constraints are that I don’t want to have to keep loading the complex domain object from the database, or passing it all over the wire, every time I want to do an operation. Some complex logic might be that certain areas of the screen are disabled or invisible based on the user’s permissions in combination with the state of the object. Validation and error-message information will also need to be displayed to the user. I want to be able to logically call a lot of my domain object operations as if they were running on the same machine.
With the web, you have free rein. You don’t have to expose your objects across service boundaries, so you can make them as rich as you would like. I’m trying to create an N-tier architecture that is rich and works when the client calling the model is on a different machine.
You can expose your domain objects like any other object through REST or web services. I think the key is to understand that you will have to expose services that provide business value in a single call, and these do not necessarily map 1:1 to your repositories. So while you, on the server, may expect a single service call to use multiple repositories and perform various aggregations, the things you expose over any kind of web service should be more or less complete results. The operations you expose on the service should not expose individual repositories but rather focus on meaningful operations that provide a given business value.
I hope this helps somewhat.
You can use a SOAP formatter for .Net remoting, but the resulting service will probably be hard to consume as a service, and it will surely be very chatty.
If you want your domain model to be consumed as a service, it should be designed as a service.
As stated in domain-driven design, a service is stateless, so it won't expose your objects directly. Your service should expose methods that provide meaningful business operations that will be executed as a single unit.
Usually, consider the model in your client to be in a different bounded context, because its concerns will be a bit different from those of the model on the server.
"What I’m trying to wrap my head around is how I expose my repository and model, which will be on the server. Is it even possible to expose complex business objects that have state via a web service, or will I have to use a proprietary technology that is not language/platform agnostic, like .Net remoting, EJB, COM+, DCOM, etc?"
A good domain model is going to be highly behavioral and designed around the problem domain (and your discussions with domain experts), so I'd argue against designing it to be exposed to remote consumers (in the same way that designing it from the database or GUI first is a bad idea).
Instead, I'd look at using a style like REST or messaging, decide on the interface you want to expose, and then map to/from the domain. So if you went with REST, you'd design your resources and API (URLs, representations, etc.) and then fulfill it from the domain model.
If this becomes unnatural, you can always have multiple models; for example, mapping a separate, read-only, presentation-specific model to the same data source (or one which wraps the complex behavioral domain model) is an approach I've used several times.
"Some other constraints are that I don’t want to have to keep loading the complex domain object from the database or passing it all over the wire every time I want to do an operation."
Look at caching in HTTP and at supporting multiple representations for a resource; also look at caching within your data-access solution.
"Validation and error message information will also need to be displayed to the user. I want to be able to logically call a lot of my domain object operations as if it were running on the same machine."
You can either represent this as a resource or, more likely, look at HTTP status codes and the response bodies you'd want to use in those situations.
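A minimal sketch of that approach using Flask; the route and error format are assumptions:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/orders", methods=["POST"])
def create_order():
    payload = request.get_json(silent=True) or {}
    errors = {}
    if not payload.get("productId"):
        errors["productId"] = "is required"
    if payload.get("quantity", 0) <= 0:
        errors["quantity"] = "must be a positive integer"
    if errors:
        # 422 Unprocessable Entity: well-formed request, invalid content.
        return jsonify({"errors": errors}), 422
    return jsonify({"status": "created"}), 201
```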