For my master's thesis I have to develop a web service that constantly monitors user interaction in a web application (recording, for example, the most clicked buttons or the most visited pages). The main goal of this project is to generate a set of behavioral profiles.
I have some experience with web services, but I feel I need some sort of starting point for this kind of project (similar projects, existing theses or articles). Hope someone can help.
Assuming you are talking about RESTful web services, a good starting point would be to list all the APIs you think you will need, in terms of GET, POST, PUT and DELETE methods.
For example, you might first need a list of all the buttons, links and pages in the website you are supposed to monitor (let's say getAllItems). Then, for each individual item, you need code to capture the 'onclick', 'onbuttonpress', etc. events (let's say getOnClickForButtonA). You will have to save all this information periodically in a database.
When you have enough information in the database, you can write code to read it and generate some stats out of it.
So: a set of services to gather all the info,
a set of services to store it, and
a set of services to analyze it.
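If it helps, here is a minimal sketch of the gathering and analyzing pieces, assuming a Flask backend and a SQLite table; all endpoint, table and field names are made up for illustration:

```python
# Minimal sketch of an interaction-gathering service (hypothetical names),
# assuming Flask and SQLite; a real project would add auth, batching, etc.
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)
DB = "interactions.db"

def init_db():
    with sqlite3.connect(DB) as conn:
        conn.execute("""CREATE TABLE IF NOT EXISTS events (
            user_id TEXT, event_type TEXT, target TEXT, page TEXT,
            ts DATETIME DEFAULT CURRENT_TIMESTAMP)""")

@app.route("/events", methods=["POST"])
def record_event():
    # The page's JavaScript would POST one record per click or page visit.
    e = request.get_json()
    with sqlite3.connect(DB) as conn:
        conn.execute(
            "INSERT INTO events (user_id, event_type, target, page) VALUES (?, ?, ?, ?)",
            (e["user_id"], e["type"], e.get("target"), e["page"]))
    return jsonify(status="ok"), 201

@app.route("/stats/top-targets", methods=["GET"])
def top_targets():
    # Simplest possible "analysis" service: the most-clicked buttons/links.
    with sqlite3.connect(DB) as conn:
        rows = conn.execute("""SELECT target, COUNT(*) AS clicks FROM events
                               WHERE event_type = 'click' GROUP BY target
                               ORDER BY clicks DESC LIMIT 10""").fetchall()
    return jsonify([{"target": t, "clicks": c} for t, c in rows])

if __name__ == "__main__":
    init_db()
    app.run()
```

The behavioral-profile part would then be another set of queries or jobs reading from the same table.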
Hope this helps!
Consider the following microservices for an online store project:
Users Service keeps account data about the store's users (including first name, last name, email address, etc.)
Purchase Service keeps track of details about users' purchases.
Each service provides a UI for viewing and managing its relevant entities.
The Purchase Service index page lists purchases. Each purchase item should have the following fields:
id, full name of purchasing user, purchased item title and price.
Furthermore, as part of the index page, I'd like to have a search box to let the store manager search purchases by purchasing user name.
It is not clear to me how to get back data which the Purchase Service does not hold - for example: a user's full name.
The problem gets worse when trying to do more complicated things like search purchases by purchasing user name.
I figured that I can obviously solve this by syncing users between the two services by broadcasting some sort of event on user creation (and saving only the relevant user properties on the Purchase Service end). That's far from ideal from my perspective. How do you deal with this when you have millions of users? Would you create millions of records in each service which consumes user data?
Another obvious option is exposing an API at the Users Service end which brings back user details based on given ids. That means that on every page load in the Purchase Service, I'd have to make a call to the Users Service in order to get the right user names. Not ideal, but I can live with it.
What about implementing a purchase search based on user name? Well, I can always expose another API endpoint at the Users Service end which receives the query term, performs a text search over user names in the Users Service, and then returns all the user details which match the criteria. At the Purchase Service end, map the relevant ids back to the right names and show them in the page. This approach is not ideal either.
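For what it's worth, the "lookup by ids" option would look roughly like this on the Purchase Service side; the endpoint path and field names below are just placeholders, not a real API:

```python
# Hypothetical sketch: the Purchase Service batches the user ids shown on its
# index page and asks the Users Service for the matching names in one call.
import requests

USERS_SERVICE = "http://users-service.internal"  # placeholder address

def render_purchase_index(purchases):
    user_ids = {p["user_id"] for p in purchases}
    resp = requests.get(f"{USERS_SERVICE}/users",
                        params={"ids": ",".join(user_ids)})
    names = {u["id"]: f'{u["first_name"]} {u["last_name"]}' for u in resp.json()}
    return [
        {"id": p["id"],
         "full_name": names.get(p["user_id"], "unknown"),
         "title": p["item_title"],
         "price": p["price"]}
        for p in purchases
    ]
```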
Am I missing something? Is there another approach for implementing the above? Maybe the fact that I'm facing this issue is a sort of code smell? I would love to hear other solutions.
This seems to be a very common and central question when moving into microservices. I wish there was a good answer for that :-)
About the suggested pattern already mentioned here, I would use the term Data Denormalization rather than Polyglot Persistence, as it doesn't necessarily need to involve different persistence technologies. The point is that each service handles its own data. And yes, you have data duplication, and you usually need some kind of event bus to share data across services.
There's another option, which is a sort of take on the first: making the search itself a separate service.
So in your example, you have the Users Service for managing users. The Purchases Service manages purchases. Each handles its own data and only the data it needs (so, for instance, the Purchases Service doesn't really need the user name, only the ID). And you have a third service - the Search Service - that consumes data produced by the other services and creates a search "view" from the combined data.
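A rough sketch of that idea, with the event bus stubbed out and all event names and fields invented for illustration, might look like this:

```python
# The Search Service subscribes to events from the other services and keeps
# its own denormalized view. A real setup would use a broker (RabbitMQ,
# Kafka, etc.) and a search engine instead of in-memory dicts.
search_index = {}   # purchase_id -> denormalized document
user_names = {}     # user_id -> full name, kept only for building documents

def on_user_created(event):
    user_names[event["user_id"]] = f'{event["first_name"]} {event["last_name"]}'

def on_purchase_created(event):
    search_index[event["purchase_id"]] = {
        "purchase_id": event["purchase_id"],
        "user_name": user_names.get(event["user_id"], ""),
        "item_title": event["item_title"],
        "price": event["price"],
    }

def search_by_user_name(term):
    # Serves the store manager's search box from the combined view.
    term = term.lower()
    return [doc for doc in search_index.values()
            if term in doc["user_name"].lower()]
```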
It's totally fine to keep appropriate data in different databases; it's called Polyglot Persistence. Yes, you would like to keep user data and data about purchases separately and use a message queue for sync. Millions of users seems fine to me; that's a scalability issue, not a design issue ;-)
In the case of search - you probably want to search on more than just the user name, right? So, if you use a message queue to update data between services, you can also easily route this data to Elasticsearch, for example. And from Elasticsearch's perspective it doesn't really matter which field to index - user name or product title.
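As a hedged sketch of that routing idea, assuming the official Elasticsearch Python client (8.x-style arguments) and a stubbed-out queue consumer:

```python
# Whatever pops off the message queue simply gets indexed; Elasticsearch does
# not care whether the indexed field is a user name or a product title.
# Event fields and index names are assumptions for illustration.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def handle_purchase_event(event):
    # Called for each event arriving from the queue that also syncs the services.
    es.index(index="purchases", id=event["purchase_id"], document={
        "user_name": event["user_name"],
        "item_title": event["item_title"],
        "price": event["price"],
    })

def search_purchases(term):
    result = es.search(index="purchases",
                       query={"match": {"user_name": term}})
    return [hit["_source"] for hit in result["hits"]["hits"]]
```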
I usually use both approaches. Sometimes I have another service which sits on top of x other services and combines their data. I don't really like this approach because it causes dependencies and coupling between services. So in general, in my last projects we tried to stick to polyglot persistence.
Also consider that if you need x sub-HTTP requests to combine data in some kind of middleware service, it will lead to higher latency. We always try to cut down the number of requests for one task and handle everything possible through asynchronous queues (especially data sync).
If you conceptualize modules as the owners and controllers of the data they work on, then your model must also communicate that data out of each module to the others. In contrast, the modules in a manufacturing process have access to change data without possessing and controlling it.
Microservices is an architecture for distributed processing, like most code, where modules pass the data around to work on it. From classic articles by Harvard Business Review and McKinsey on the subject of owning members of a supply chain, I identified complexities arising from this model and wrote an article teaching programmers what they need to know: http://www.powersemantics.com/p.html
Manufacturing is an architecture for integrated processing, where modules work on the data without passing it around from point to point. This can be accomplished by having modules configured to access the same memory, files or database tables. My architecture shows how to accomplish this in memory via reference properties.
When you consider "exposing an API at the Users Service end which brings back user details based on given ids", you need to be aware that this creates what HBR calls "irreversible" complexity, which I've dubbed centralization complexity. Don't build A->B (distributed) systems, because you can't decentralize them later after failing to separate requirements. Requirements in production processes represent user instructions, and centralized modules only enable you to change the wrong users' processes. In other words, centralized modules don't document user groups or distinguish them from derived-product users.
This post http://www.theserverside.net/tt/articles/showarticle.tss?id=Top5WSMistakes encourages me to create the web service at the business logic layer, but many people use it at the data access layer.
I want to create a project where I access the same data repository from a desktop application, a website and a cell phone. What would you recommend?
Is there any case where it may be a good idea to implement web services at both layers?
The question is too open-ended, so the answer is: it depends.
What needs do your applications have for the data? Is it just data access, or is some business logic involved? If it is just accessing data, do you really want the client to have direct control over it? How similar are the three applications? Do they share functionality or just data?
As I see it there are two main paths you can choose:
1 - expose a web service for the business logic, with the data hidden behind the web service. This is a good setup if the three clients (I'll call the desktop app, web app and cell phone "clients" since that is what they are) share functionality (i.e. they are different views onto the same business model). This avoids duplicating similar business logic in all the clients;
2 - expose the data directly with a web service. This is a good setup if the three clients have nothing in common but just use the same data for different purposes. But in this case, with three sets of business logic, where are you going to put the logic? In the clients? How will that work for the desktop application (considering you install this desktop app 300 times or so)? You again need some service, and the clients need to be thin clients, not thick ones.
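A loose sketch of the difference between the two paths, assuming Flask and a made-up order entity (path 1 exposes a business operation, path 2 exposes raw data and leaves the rules to each client):

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

# Path 1: business-level service; the data and the rules stay behind it,
# shared by all three clients.
@app.route("/orders", methods=["POST"])
def place_order():
    order = request.get_json()
    # Business logic lives here once.
    if order["quantity"] <= 0:
        return jsonify(error="quantity must be positive"), 400
    # ... persist the order, apply pricing rules, etc.
    return jsonify(status="accepted"), 201

# Path 2: data-level service; clients get raw rows and must apply the
# business rules themselves, duplicated per client.
@app.route("/tables/orders", methods=["GET"])
def dump_orders():
    # ... read rows straight from the data store and return them
    return jsonify(rows=[])
```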
If you take 1) and 2) into consideration you will see that usually it is better to have a service layer in front of your data.
Going back to the "it depends", analyze your special needs first and only then choose the solution that is best suited for your situation.
How about a point 3? Make your data access layer into a library (.jar, .dll or whatever technology you are using) and make that available to the (1? 2? 3?) business web services that serve your clients.
Nowadays a lot of web applications provide an API for other applications to use.
I am new to using APIs, so I want to understand the use cases for them.
Let's take Basecamp as an example.
What are the use cases for using their API in my web application?
For inserting the current data in my web application into a newly created Basecamp account, instead of entering everything manually, which could take days or weeks if the data is huge?
For updating my application's data when the user changes something in Basecamp? If so, how do I know, for example, when a user adds/edits/removes a contact in Basecamp? Do I make a request from the backend and check every minute?
For making a backup of the Basecamp data so I can move it to other applications if necessary?
Are all the above examples good use cases for the usage of API?
Are there more use cases?
I want to have a clear picture of why it's good to use another web service's API and how I can leverage that in my application.
Thanks.
I've found the biggest reason to use and provide web services is to be able to programmatically drive the application with another process. This allows the coupling of different actions in different applications driven by one event/process/trigger.
For example, I could use the web services provided by Basecamp, my bug tracking database and the continuous integration server. I could tie all those things together and kick them off from a commit hook script.
I can have a monitor in production automatically open a ticket in our ticket tracker. This could trigger an autoremediation process from the ticket tracker which logs into the box remotely and restarts the service.
The other major reason I've seen to use and provide web services is to reduce double entry. If you do change management in your production environment, that usually means you create Change tickets. The changes that occur may also need to be reflected in the Change Management Database, which is usually a model of how production is supposed to look. Most of these systems don't automatically drive the update of your configuration item with the data from the change. Using web services you can stitch them together to eliminate the double (manual) entry that would normally occur.
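As an illustration only, a commit hook stitching a few such services together might look like the sketch below; every URL and payload field is hypothetical, and each call is just a plain HTTP request to whatever API the tool exposes:

```python
# Hypothetical commit-hook script tying a bug tracker, a CI server and a
# project-management tool together from one trigger.
import sys
import requests

def on_commit(commit_id, message, author):
    # 1. Close the bug referenced in the commit message (e.g. "fixes #123").
    if "#" in message:
        bug_id = message.split("#")[1].split()[0]
        requests.post(f"https://bugs.example.com/api/bugs/{bug_id}/close",
                      json={"comment": f"Fixed in commit {commit_id}"})

    # 2. Kick off a CI build for this commit.
    requests.post("https://ci.example.com/api/builds",
                  json={"revision": commit_id})

    # 3. Log the activity to the project-management tool (Basecamp-like).
    requests.post("https://pm.example.com/api/messages",
                  json={"subject": f"Commit by {author}", "body": message})

if __name__ == "__main__":
    on_commit(*sys.argv[1:4])
```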
APIs are used any time you want to get data to/from an application without using the default interface.
- I'd bet there's a mobile app that uses the Basecamp API.
- You could use the API to pull information from Basecamp into another application (like project management software or an individual's todo webpage).
- The geekiest of us may prefer to update Basecamp from a script or command line rather than interrupting our workflow to open a web page and click around.
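As a sketch of that last point (with a made-up endpoint and token handling, not the real Basecamp API):

```python
# Add a to-do item from the command line without opening the web UI.
# The URL, payload fields and auth scheme are placeholders for illustration.
import os
import sys
import requests

API = "https://example.basecamp-like.com/api/todos"
TOKEN = os.environ.get("API_TOKEN", "")

def add_todo(text):
    resp = requests.post(API,
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         json={"content": text})
    resp.raise_for_status()
    print("created:", resp.json().get("id"))

if __name__ == "__main__":
    add_todo(" ".join(sys.argv[1:]))
```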
I am tasked with creating an API that would allow 3rd-party customers to send orders into our Microsoft Dynamics NAV 5.0 SP1.
I want to be able to create a SalesOrder in Dynamics NAV not with the client but via an API, so I can allow a separate process to enter orders automatically.
Any help is appreciated in leading me in the right direction.
Well, it depends on how complicated you want to make it. Do you need confirmation of the Sales Order creation in "real time"? If so, you'll need to use a Web Service and ensure that there is a network path from wherever customers will create orders (public internet, extranet) to your NAV Web Service - likely using a VPN tunnel, etc.
Alternatively, if you can live with a batch type process then you can have your customers create SOs via a web-based form, etc. and then import these orders into NAV on a regular basis using Dataports or XMLPorts.
For example, you may have a form online that your customer can create an Order on that places the Order in a staging table in SQL or even an XML or CSV file. Then you can run a process on a regular basis that imports these Orders into NAV and creates the appropriate SalesOrders.
Presumably, you also need a way to expose your Item database to the Ordering interface so customers can select which Items to order (and therefore create SalesLines from).
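As a rough illustration of the batch approach, the web form could drop orders into a staging table that a scheduled job later exports for a Dataport/XMLPort to pick up; the table and column names here are assumptions:

```python
# SQLite stands in for the real staging database; NAV would later import the
# exported file (or read the table directly) to create the SalesOrders.
import csv
import sqlite3

def init_staging(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS staged_orders (
        customer_no TEXT, item_no TEXT, quantity INTEGER, imported INTEGER DEFAULT 0)""")

def stage_order(conn, customer_no, item_no, quantity):
    # Called by the web form when the customer submits an order.
    conn.execute("""INSERT INTO staged_orders (customer_no, item_no, quantity, imported)
                    VALUES (?, ?, ?, 0)""", (customer_no, item_no, quantity))
    conn.commit()

def export_for_nav(conn, path="staged_orders.csv"):
    # Periodic job: dump un-imported rows to a CSV that a Dataport/XMLPort reads.
    rows = conn.execute(
        "SELECT customer_no, item_no, quantity FROM staged_orders WHERE imported = 0").fetchall()
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    conn.execute("UPDATE staged_orders SET imported = 1 WHERE imported = 0")
    conn.commit()
```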
Which type of scenario are you interested in?
Web Services is the way to go; we have several applications that have a similar requirement. I'd recommend building an interface (ASP, to utilise the web service from NAV) and having it talk to NAV that way.
Editing the database directly is not recommended, as it will cause locking and may result in deadlocks if you're not careful. Also, NAV can be quite sensitive when it comes to the database, so it's best not to write to it directly if possible :)
I'd recommend creating a codeunit that handles the sales order, in which you can create your functions, e.g. 'CreateOrder', and then expose that via Web Services. Even if you're not planning to use a web-based interface, NAV uses the SOAP protocol -- many libraries exist to enable you to connect and interface with Web Services from other languages, like Java, for instance.
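For illustration, calling such a published codeunit from Python with the zeep SOAP library might look like the sketch below; the WSDL URL, company name and the CreateOrder signature are assumptions about how you expose the codeunit, not NAV defaults:

```python
# Hedged sketch: consume the exposed codeunit over SOAP from Python.
from zeep import Client

wsdl = "http://nav-server:7047/DynamicsNAV/WS/MyCompany/Codeunit/SalesOrderMgt?wsdl"
client = Client(wsdl)

# Call the exposed function; the parameters depend on how you wrote the codeunit.
order_no = client.service.CreateOrder(customerNo="C10000", itemNo="1000", quantity=5)
print("Created sales order", order_no)
```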
At my day job we have load balanced web servers which talk to load balanced app servers via web services (and lately WCF). At any given time, we have 4-6 different teams that have the ability to add new web sites or services or consume existing services. We probably have about 20-30 different web applications and corresponding services.
Unfortunately, given that we have no centralized control over this due to competing priorities, org structures, project timelines, financial buckets, etc., it is quite a mess. We have a variety of services that are reused, but a bunch that are specific to a front-end.
Ideally we would have better control over this situation, and we are trying to get control over it, but that is taking a while. One thing we would like to do is find out more about all of the inter-relationships between the web sites and the app servers.
I have used Reflector to find dependencies among assemblies, but would like to be able to see the traffic patterns between services.
What are the options for trying to map out web service relationships? For the most part, we are mainly talking about internal services (web to app, app to app, batch to app, etc.). Off the top of my head, I can think of two ways to approach it:
Analyze assemblies for any web references. The drawback here is that not everything is a web reference and I'm not sure how WCF connections are listed. However, this would at least be a start for finding 80% of the connections. Does anyone know of any tools that can do that analysis? Like I said, I've used Reflector for assembly references but can't find anything for web references.
Possibly tap into IIS and passively monitor the traffic coming in and out, and somehow figure out what is being called and from where. We are looking at enterprise tools that could help, but it would be a while before they are implemented (and they cost a lot). But is there anything out there that could help out quickly and cheaply? One tool in particular (AmberPoint) can tap into IIS on the servers, monitor inbound and outbound traffic, add a little special sauce and begin to build a map of the traffic. Very nice, but it costs a bundle.
I know, I know, how the heck did you get into this mess in the first place? Beats me, just trying to help us get control of it and get out of it.
Thanks,
Matt
The easiest way is to look through the logs, but if they don't include the referrer then you may also want to monitor what is going out from your web server to the app server. You can use tools like Wireshark or Microsoft Network Monitor to see this traffic.
The other "solution" and I use this loosely is to bind a specific web server to app server and then run through a bundle and see what it is hitting on the app server. You could probably do this in a test environment to lesson the effects on the users of the site.
You need a service registry (UDDI??)... If you had a means to catalog these services and their consumers, it would make this job of dependency discovery a lot easier. That is not an easy solution, though. It takes time and documentation to get a catalog in place.
I think the quickest solution would be to query your IIS logs and find source URLs which originate from your own servers. You would at least be able to track down which servers your consumers are coming from.
Also, if you already have some kind of authentication mechanism in place, you could trace who is using a particular service based on login.
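Picking up the IIS-log idea, a quick-and-dirty sketch that scans W3C-format logs, keeps only requests coming from your own server subnets and tallies which URLs each internal caller hits (the subnet prefixes and log file name are assumptions):

```python
# Field names follow the standard W3C "#Fields:" header written by IIS.
from collections import Counter

INTERNAL_PREFIXES = ("10.", "192.168.")  # adjust to your own server subnets

def map_consumers(log_path):
    calls = Counter()
    fields = []
    with open(log_path) as f:
        for line in f:
            if line.startswith("#Fields:"):
                fields = line.split()[1:]
                continue
            if line.startswith("#") or not line.strip():
                continue
            row = dict(zip(fields, line.split()))
            client = row.get("c-ip", "")
            if client.startswith(INTERNAL_PREFIXES):
                calls[(client, row.get("cs-uri-stem", ""))] += 1
    return calls

# Example: show the 20 busiest internal caller -> service URL pairs.
for (caller, url), count in map_consumers("u_ex120101.log").most_common(20):
    print(f"{caller} -> {url}: {count}")
```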
You are right about AmberPoint. There are other tools that catalog the service traffic and provide reports showing what is happening to your services. Systinet, SOA Software and Actional also have products similar to AmberPoint, but AmberPoint has a freeware version, I believe.