How to track users with WebSocket++ 0.3.x (C++)

Is there a way to track per-user data with WebSocket++ 0.3.x?
I want to be able to identify users so as to keep track of what they're looking at and what should be sent to them. Take Stack Overflow as an example: while you're looking at this question, a websocket could (and I think does) keep in memory that you're looking at this question and send you the appropriate updates, like votes, new comments and answers, and Stack Exchange notifications in the upper left corner.
Also, the users need to be identifiable. Is there a session ID inherent in WebSockets that is already hiding in WebSocket++? If not, how does WebSocket++ track users?

The simplest way is, as you mentioned in your answer, to use connection_hdl as a key for an associative container that stores any other data you need. WebSocket++ does have some other options for cases where that sort of lookup would be too costly.
Since this is a common question, I've written up some more formal examples & documentation on how to do this here: http://www.zaphoyd.com/websocketpp/manual/common-patterns/storing-connection-specificsession-information.
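To make the container approach concrete, here is a minimal sketch, assuming a C++11 build where connection_hdl is a std::weak_ptr<void> (the SessionData fields and handler names are invented for illustration). Because the handle is a weak pointer, the map must order keys with std::owner_less rather than operator<:

    #include <map>
    #include <memory>
    #include <string>
    #include <websocketpp/common/connection_hdl.hpp>

    // Hypothetical per-connection session data (invented for this sketch).
    struct SessionData {
        std::string user_id;
        std::string watching;  // e.g. which question the user has open
    };

    // connection_hdl is a weak_ptr<void>, so the map needs owner_less.
    typedef std::map<websocketpp::connection_hdl, SessionData,
                     std::owner_less<websocketpp::connection_hdl>> SessionMap;

    SessionMap sessions;

    // Typical lifecycle: insert in the open handler, look up in the
    // message handler, erase in the close handler.
    void on_open(websocketpp::connection_hdl hdl) {
        sessions[hdl] = SessionData();
    }

    void on_close(websocketpp::connection_hdl hdl) {
        sessions.erase(hdl);
    }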

Ah, OK. It was hiding in plain sight all along. http://www.zaphoyd.com/websocketpp/manual/building-program-websocket
The connection handle is a token that uniquely identifies the connection that received the message. It can be used to identify where to send reply messages or stored and used to push messages later. The type of the connection handle is websocketpp::connection_hdl.
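For example (a hedged sketch; the asio config, handler wiring, and error handling are assumed and omitted), a handle stored earlier can simply be handed back to the endpoint to push a message:

    #include <websocketpp/config/asio_no_tls.hpp>
    #include <websocketpp/server.hpp>

    typedef websocketpp::server<websocketpp::config::asio> server_t;

    // Push an update over a connection whose handle was stored earlier,
    // e.g. in an on_open handler.
    void push_update(server_t& server, websocketpp::connection_hdl hdl) {
        server.send(hdl, "new answer posted",
                    websocketpp::frame::opcode::text);
    }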

Related

How to handle out-of-order microservice messages?

We have adopted an AWS-powered microservice architecture where different sorts of payloads enter the system, each with a UUID and a type, via mysql.lambda_async calls from our database.
The problem is that we've noticed messages can arrive out of order. Imagine the scenario with the following messages:
DEASSIGN_ROLE
ASSIGN_ROLE
When the actual intention was a quick toggle:
ASSIGN_ROLE
DEASSIGN_ROLE
Now we have a user with the wrong (elevated) permissions.
I've done some cursory research and for example answers like Handling out of order events in CQRS read side suggest using sequence numbers.
Introducing a sequence number would be quite hard, as we have many different types of messages. A sequence number would require a synchronous counter, and we have gone to great pains to be purely asynchronous. Bear in mind that the system generating the messages is ultimately an SQL trigger.
Are there simpler solutions I am missing?
I would say there is an unsolvable tension here:
you want to be fully asynchronous
you need sequentiality in your results
We had the same problem as yours, and we ended up assigning sequence numbers per message type, staying asynchronous and parallel where possible (according to message types/topics).
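To illustrate the idea (the class and key names below are invented, not from the original system): a counter is only needed per topic, so producers of unrelated topics never synchronize with each other. Both ASSIGN_ROLE and DEASSIGN_ROLE for one user would share a topic key (e.g. "role:<user-uuid>"), ordering them relative to each other while everything else stays parallel; the consumer simply drops anything stale:

    #include <cstdint>
    #include <map>
    #include <string>

    // Tracks the highest sequence number applied per topic and rejects
    // anything older, so related messages can't be applied out of order.
    class SequenceGuard {
    public:
        // Returns true if the message should be applied, false if stale.
        bool should_apply(const std::string& topic, std::uint64_t seq) {
            auto it = last_applied_.find(topic);
            if (it != last_applied_.end() && seq <= it->second)
                return false;           // already applied something newer
            last_applied_[topic] = seq; // remember the newest sequence
            return true;
        }

    private:
        std::map<std::string, std::uint64_t> last_applied_;
    };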

REST philosophy - how to handle services and side effects

I've been diving into REST lately, and a few things still bug me:
1) Since there are only resources and no services to call, how can I provide operations to the client that only do stuff and don't change any data?
For example, in my application it is possible to trigger a service that connects to a remote server and executes a shell script. I don't see how this scenario maps to a resource.
2) Another thing I'm not sure about is side effects: Let's say I have a resource that can be in certain states. When transitioning into another state, a lot of things might happen (e-mails might be sent). The transition is triggered by the client. Should I handle this transition merely by letting the resource be updated via PUT? This feels a bit odd.
For the client this means that updating an attribute of this resource might only change the attribute, or it might also do a lot of other things. So PUT =/= PUT, kind of.
And implementation-wise, I have to check what exactly the PUT request changed, and according to that trigger the side effects. So there would be a lot of checks like if (old_attribute != new_attribute) { side_effects }.
Is this how it's supposed to be?
BR,
Philipp
Since there are only resources and no services to call, how can I provide operations to the client that only do stuff and don't change any data?
HTTP is a document transport application. Send documents (ie: messages) that trigger the behaviors that you want.
In other words, you can think about the message you are sending as a description of a task, or as an entry being added to a task queue. "I'm creating a task resource that describes some work I want done."
Jim Webber covers this pretty well.
Another thing I'm not sure about is side effects: Let's say I have a resource that can be in certain states. When transitioning into another state, a lot of things might happen (e-mails might be sent). The transition is triggered by the client. Should I handle this transition merely by letting the resource be updated via PUT?
Maybe, but that's not your only choice -- you could handle the transition by having the client put some other resource (ie, a message describing the change to be made). That affords having a number of messages (commands) that describe very specific modifications to the domain entity.
In other words, you can work around PUT =/= PUT by putting more specific things.
(In HTTP, the semantics of PUT are effectively create or replace. Which is great for dumb documents, or CRUD, but need a bit of design help when applied to an entity with its own agency.)
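As an invented illustration (the path and body here are hypothetical, not from any standard): rather than PUTting the modified entity, the client might POST a command document that names the transition, and the server owns whatever side effects that transition implies:

    POST /orders/42/state-transitions HTTP/1.1
    Content-Type: application/json

    { "transition": "approve", "comment": "looks good" }

The server records the command, performs the transition, and sends the e-mails; the client never has to express the change as a diff of attributes, and PUT keeps its plain create-or-replace meaning.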
And implementation-wise, I have to check what exactly the PUT request changed, and according to that trigger the side effects.
Is this how it's supposed to be?
Sort of. Review Udi Dahan's talk on reliable messaging; it's not REST specific, but it may help clarify the separation of responsibilities here.

Getting a feed in a client-server model

In a typical client-server model, what does it mean to subscribe or unsubscribe to a feed? Is there a generic codebase, boilerplate model, or set of standard procedures or class designs involved? This is all C++ based. There's no other information, other than that the client is attempting to connect to the server to retrieve data based on some sort of signature. I know it's somewhat vague, but I guess this is really a question of what to keep in mind and what a typical subscribe or unsubscribe method might entail. Maybe something along the lines of extending a client-server model like http://www.linuxhowtos.org/C_C++/socket.htm.
This is primarily an information architecture question. "Subscribing to feeds" implies that the server offers a lot of information, which may not be uniformly relevant to all clients. Feeds are a mechanism by which clients can select relevant information.
Concretely, you first need to identify the atoms of information that you have. What are the smallest chunks of data? What properties do they have? Can new atoms replace older atoms, and if so, what identifies their relation? Are there other atom relations besides replacement?
Next, there's the mapping of those atoms to particular feeds. What are the possible combinations of atoms needed by a client? How can these combinations be bundled into two or more feeds? Is it possible to map each atom uniquely to a single feed, or must atoms be shared between feeds? If so, is that rare enough that you can ignore it and just send duplicates?
When a client connects, how do you figure out which atoms need to be sent? Is it just live streaming (atoms are sent only when they're generated on the server), do you have a set of current atoms (sent when a client connects), or do you need some history as well? Is there client caching?
It's clear that you can't have a single off-the-shelf solution when the business side is so diverse.
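That said, the subscription bookkeeping itself is fairly generic. Here is a hedged sketch of just that part (the names are invented; a real server would layer this over the socket handling from the linked tutorial):

    #include <map>
    #include <set>
    #include <string>

    // Hypothetical client handle; a real server would wrap a socket here.
    typedef int ClientId;

    // Minimal feed registry: tracks which clients want which feed.
    class FeedRegistry {
    public:
        void subscribe(const std::string& feed, ClientId client) {
            subscribers_[feed].insert(client);
        }

        void unsubscribe(const std::string& feed, ClientId client) {
            auto it = subscribers_.find(feed);
            if (it == subscribers_.end()) return;
            it->second.erase(client);
            if (it->second.empty()) subscribers_.erase(it);  // drop empty feeds
        }

        // The clients that should receive an atom published on `feed`.
        const std::set<ClientId>& audience(const std::string& feed) const {
            static const std::set<ClientId> kEmpty;
            auto it = subscribers_.find(feed);
            return it == subscribers_.end() ? kEmpty : it->second;
        }

    private:
        std::map<std::string, std::set<ClientId>> subscribers_;
    };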

Where should I validate data in Java EE?

I did a search on the board, and there were some threads related to what I'm asking, but the other questions were not exactly like my situation.
I want to implement a service (EJBs) that different clients (REST API, web service, JSF managed beans, and maybe some other client) are going to use. My question is: in this scenario, where should data validation occur?
It seems reasonable to me to do it inside my business layer (EJBs), since I don't want to implement one validator per client, but I don't see people doing it...
best regards,
Oliver
The general advice would be: every component that exposes functionality to the outside should validate the input it receives. It should not hope for the best that it will always receive valid input. Additionally, as you said, it keeps the validation in one place.
On the other hand, when you have both sides under your control, it may be a reasonable decision to validate early on the client and document the expected/required valid input data.
You have a similar problem when designing a relational database structure - you can have all sorts of constraints to ensure valid input data or you can check validity in the component storing the data in the database.
And don't forget: whenever you validate in a deeper layer, all higher layers have to handle the exceptions or error messages when validation fails.
Regarding your specific question: the fact that the same service is used by different clients argues for validating within the service.
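The principle is language-independent; sketched here in C++ for consistency with the rest of this page (the names are invented): the service validates at its own boundary, so a REST client, a web-service client, and a JSF bean all hit the same checks:

    #include <stdexcept>
    #include <string>

    // Hypothetical service: validates its own input regardless of which
    // client is calling it, so the checks live in exactly one place.
    class UserService {
    public:
        void rename(int user_id, const std::string& new_name) {
            if (user_id <= 0)
                throw std::invalid_argument("invalid user id");
            if (new_name.empty() || new_name.size() > 100)
                throw std::invalid_argument("name must be 1-100 characters");
            // ... perform the update; every caller passed the same checks.
        }
    };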

Cleaning up missed geocoding (or general advice on data cleaning)

I've got a rather large database of location addresses (500k+) from around the world, though many of the addresses are duplicates or near duplicates.
Whenever a new address is entered, I check to see if it is in the database already, and if so, I take the existing lat/long and apply it to the new entry.
The reason I don't link to a separate table is that the addresses are not used as a group to search on, and there are often enough differences in the addresses that I want to keep them distinct.
If I have a complete match on the address, I apply that lat/long. If not, I go to the city level and apply that; if I can't get a match there, I have a separate process to run.
Now that you have the extensive background, the problem: occasionally I end up with a lat/long that is far outside the normal acceptable range of error. Strangely, it is normally just one or two of these lat/longs that fall outside the range, while the rest of the data exists in the database with the correct city name.
How would you recommend cleaning up the data? I've got the GeoNames database, so theoretically I have the correct data. What I'm struggling with is the routine you would run to get this done.
If someone could point me in the direction of some (low-level) data-scrubbing techniques, that would be great.
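One possible routine, sketched with invented names and a guessed threshold, assuming GeoNames provides a reference lat/long per city: measure each record's distance to its city's reference point, flag the few that are implausibly far, and re-geocode only those:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Record { double lat, lon; };  // one geocoded address row

    // Great-circle distance in kilometres (haversine formula).
    double distance_km(double lat1, double lon1, double lat2, double lon2) {
        const double kPi = 3.14159265358979323846;
        const double kEarthRadiusKm = 6371.0;
        auto rad = [&](double deg) { return deg * kPi / 180.0; };
        double dlat = rad(lat2 - lat1), dlon = rad(lon2 - lon1);
        double a = std::sin(dlat / 2) * std::sin(dlat / 2) +
                   std::cos(rad(lat1)) * std::cos(rad(lat2)) *
                   std::sin(dlon / 2) * std::sin(dlon / 2);
        return 2.0 * kEarthRadiusKm * std::asin(std::sqrt(a));
    }

    // Return the indices of records suspiciously far from the city's
    // reference coordinates (threshold is a guess; tune per data set).
    std::vector<std::size_t> find_outliers(const std::vector<Record>& rows,
                                           double city_lat, double city_lon,
                                           double threshold_km = 50.0) {
        std::vector<std::size_t> bad;
        for (std::size_t i = 0; i < rows.size(); ++i)
            if (distance_km(rows[i].lat, rows[i].lon,
                            city_lat, city_lon) > threshold_km)
                bad.push_back(i);  // candidate for re-geocoding
        return bad;
    }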
This is an old question, but true principles never die, right?
I work in the address verification industry for a company called SmartyStreets. When you have a large list of addresses that need to be "cleaned up", polished to official standards, and then relied upon for any aspect of your operations, you'd best look into CASS-Certified software (US only; other countries vary widely, and many don't officially offer such a service).
The USPS licenses CASS-Certified vendors to "scrub" or "clean up" (meaning: standardize and verify) address data. I would suggest that you look into a service such as SmartyStreets' LiveAddress to verify addresses or process a list all at once. There are other options, but I think this is the most flexible and affordable for you. You can scrub your initial list then use the API to validate new addresses as you receive them.
Update: I see you're using JSON for various things (I love JSON, by the way, it's so easy to use). There aren't many providers of the services you need which offer it, but SmartyStreets does. Further, you'll be able to educate yourself on the topic of address validation by reading some of the resources/articles on that site.