Should I prefix my method with "get" or "load" when communicating with a web service? - web-services

I'm writing a desktop application that communicates with a web service. Would you name all web-service functions that fetch data LoadXXXX, since they take a while to execute? Or would you use GetXXXX, for instance when getting just a single object?

Use MyObject.GetXXXX() when the method returns XXXX.
Use MyObject.LoadXXXX() when XXXX will be loaded into MyObject, in other words, when MyObject keeps control of XXXX.
The same applies to web services, I guess.

I would use Load if you expect it to take "file-time" and Get if you expect it to take "simple DB" time.
That is, if the call is expensive, use "Load".

Get. And then provide a way of calling them asynchronously to emphasize that they may be out to lunch for a while...

Do what the verb implies. GetXXX implies that something is being returned to the caller, while LoadXXX doesn't necessarily return something as it may be just loading something into memory.
For an API, use GetXXX to be clear to the caller that something will be returned.

Always use Get, except perhaps when actually loading something (e.g., loading a file into memory).

When I read LoadXXX, I'm already thinking that the data comes from some storage media. Since the web service is up in the cloud, GetXXX feels more natural.
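To make the distinction concrete, here is a minimal Python sketch of the convention the answers describe (the class, method, and service names are hypothetical): GetXXX returns the data to the caller, while LoadXXX populates the object itself.

```python
class Customer:
    """Illustrates the Get-vs-Load naming convention (hypothetical example)."""

    def __init__(self, service):
        self.service = service
        self.orders = None

    def get_orders(self):
        # "Get": fetches and RETURNS the data; the caller keeps it.
        return self.service.fetch_orders()

    def load_orders(self):
        # "Load": fetches the data and stores it ON this object.
        self.orders = self.service.fetch_orders()
```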

What options do I have in Amazon RDS for using 'fixed_date'?

Historically, in Oracle I've used the fixed_date parameter to change the system date in order to run a series of interlinked reports and verify that those links are still correct.
Now that we've moved to Amazon RDS, that capability is not available.
What are my options?
I've considered changing all calls to 'system_date' to use a custom function that simulates this. (Ugh, this spans hundreds of packages, but it is possible.)
Are there better options for using fixed_date?
It seems like the only option you have is to create a custom function and replace all the calls to system_date:
CREATE OR REPLACE FUNCTION fml.system_date
RETURN DATE
AS
BEGIN
  RETURN to_date('03-04-2021', 'DD-MM-YYYY');
END;
Not sure I would take this approach, but you could also investigate "stored outlines" if there are not too many queries involved, and have them call the alternate function/package instead. The fixed_date call will still fail, but maybe it can serve as a workaround. That outline could then be used only for the reports user, for example.
I am not sure why Amazon doesn't support something like this yet...

Should I store a list in memory or in a database and should I build a class to connect to DB?

I am writing a C++ program. I have a class that provides services for the rest of the classes in the program.
I am now writing the classes and the UML.
1) The class in question has a task list that changes over time, and conditions are checked against this list. I am thinking of keeping it in a database table where every row represents a task; that way, if the program crashes or stops working, I can restore the last state. The other option is to keep the task list in memory and keep a copy in the database.
The task list needs to be searched every second.
Which approach is more recommended?
2) In order to write to and read from the database, I can either call the database directly from the class or build a database communication class. If I write a data communication class, I need to provide specific operations and build a mini server for it,
e.g. write a row to the database, read a row from the database, update only the first column, etc.
what is the recommended approach for this?
Thanks.
First, if the database is obvious and easy, and there are no performance problems, just do that. You're talking about running a query once/second, and maybe marking a task done or adding a new one every so often; even sqlite on a slow SMB share should be able to handle that just fine.
If you do need to optimize it, then there are two approaches: Either still with the database and cache it in-memory, or use memory as your primary storage and come up with a persistence mechanism that uses the database. But until you need to optimize it, don't.
Next, how should you do it? Your question makes it sound like you're thinking in terms of a whole three-tier system, with a "mini-server" sitting between the database server and your task list. There's really no need for that. What you want is a bespoke ORM, but that makes it sound more complicated than it is. All you're doing is writing a class that wraps a database connection and provides a handful of methods—get_due, mark_done, add, get_next_id—each of which maps SQL parameters to Task members. For example (with no error handling):
void mark_done(const Task& task) {
    // Parameterized query; 'db' is the wrapped connection object.
    db.execute("UPDATE Task SET done = 1 WHERE id = ?", task.id);
}
Three more methods like that, plus a constructor to connect to the database (including creating the Task table if it didn't already exist), and your class is done.
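As an illustration of that shape, here is a Python/sqlite3 sketch of such a wrapper class. The question is about C++, so take this as a language-neutral outline of the pattern; the table layout and method names are invented for the example.

```python
import sqlite3


class TaskStore:
    """Wraps a database connection and maps SQL to task operations."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS Task ("
            "id INTEGER PRIMARY KEY, due REAL, done INTEGER DEFAULT 0)"
        )

    def add(self, due):
        # Insert a task and return its generated id.
        cur = self.db.execute("INSERT INTO Task (due) VALUES (?)", (due,))
        self.db.commit()
        return cur.lastrowid

    def get_due(self, now):
        # Ids of all not-yet-done tasks whose due time has passed.
        rows = self.db.execute(
            "SELECT id FROM Task WHERE done = 0 AND due <= ?", (now,)
        )
        return [row[0] for row in rows]

    def mark_done(self, task_id):
        self.db.execute("UPDATE Task SET done = 1 WHERE id = ?", (task_id,))
        self.db.commit()
```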
The reason you don't want to write the database stuff directly into Task is that you don't really have anywhere to store shared information like the database connection object; either you need globals (or class attributes, which are effectively globals), or you need copies in every single Task instance (or, really, weak references—which you're going to fake with either a reference or a raw pointer, either way leading to shutdown problems somewhere down the line).
Finally, your whole reason for doing this is error recovery, and databases do a great job of journaling so nothing ever gets inconsistent, but you do have to make sure to structure your app to take advantage of that. For example, you may want to mark all the now-due tasks "in process", then process them, then mark them all "done"; that way, at recovery time, you know exactly which tasks may or may not have been done, and can act appropriately. The more steps you can commit to the database, the less data loss you have to deal with—but of course the more code you have to write, and the slower it gets. So, do as much as necessary, but no more.
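The mark-then-process idea above can be sketched with an in-memory dict standing in for the status column (names are hypothetical; a real version would commit each status change to the database):

```python
def process_batch(status, due_ids, handler):
    # Step 1: mark the whole batch "in_process" (a commit point in a real DB).
    for tid in due_ids:
        status[tid] = "in_process"
    # Steps 2-3: do the work, then record completion per task.
    for tid in due_ids:
        handler(tid)
        status[tid] = "done"


def tasks_to_recheck(status):
    # After a crash, only the "in_process" tasks are ambiguous.
    return [tid for tid, s in status.items() if s == "in_process"]
```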
Saving information in a database just to recover crashed state may be a bit of overkill.
Ideally you want to serialize the list and save it as binary, XML, or CSV values. This can be done on a timer or on certain events in your application.
A database may also be used if you can come up with a structure that maps naturally to tables, so that you can do a one-to-one mapping between the objects and easily write SQL queries. But keep that on a separate layer for abstraction.

Should output be encoded at the API or client level?

We are moving our Web app architecture to being microservice based. We have an internal debate about whether a REST API that provides content (in JSON, let's say) should encode the content to make it safe, or whether the consumers that take that content and display it (in HTML, for example, or otherwise use it) should be responsible for that encoding. The use case is preventing XSS attacks and similar.
The provider stance is "Well, we can't know how to encode it for everyone, or how you're going to use the content, so of course the consumers should encode the content."
The consumer stance is "There is one provider and multiple consumers, so it's more secure to do it once in the providing API than to hope that every consumer does it."
Are there any generally accepted best practices on this and why?
As a rule, data passing through "internal" processes (whatever "internal" might mean to you) should be stored or encoded in whatever internal format makes sense. The format chosen is typically designed to minimize encoding/decoding steps and to prevent data loss.
Then, upon output, data is encoded using whatever output format makes sense. Preventing data loss is important, but also proper escaping and formatting is key here.
So, for example, with internal APIs, data in binary format may be sufficient. But when you output JSON or HTML or XML or PDF, you have to encode and escape your data appropriately to fit the output format.
The important point here is that different output formats have different concepts of "safe". What's "safe" for HTML may not be safe for JSON, and what's safe for JSON may not be safe for SQL. Data is encoded upon output specifically so that you can use the proper encoding for the task. You cannot assume that this step is done for you ahead of time, nor should you put your output function in the position to determine whether or not encoding must be done. If you stick with the rule: "output function ALWAYS encodes for safety", then you will never have to worry about data injection attacks.
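A small Python sketch of "encode at output, per format": the same payload needs different escaping depending on whether it is destined for an HTML or a JSON context.

```python
import html
import json

payload = '<script>alert("x")</script>'

# HTML context: neutralize markup characters so the browser renders text.
as_html = html.escape(payload)

# JSON context: produce a valid, quoted JSON string literal instead.
as_json = json.dumps(payload)
```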
I would say that the two important points are the following:
The encoding used by the provider MUST be specified with extreme clarity and precision in a reference document, so that all consumer implementors can know what to expect.
Whatever default encoding is used by the provider MUST keep all needed information, i.e. still be amenable to transcoding by any consumer who would wish to do it.
If you follow these two rules then you will have done 95% of the job for reliability and security.
As for your specific question, a good practice is a middle-ground: the provider follows by default a "generic" encoding, but consumers can ask (optionally) for a specific encoding which the provider may then apply -- this allows the provider to support a number of dumb, lightweight clients of possibly different kinds and can be extended later on with extra encodings without breaking the API.
I firmly believe it is both the consumer and the provider that need to do their part in being good citizens in the security space.
As the provider, I want to make sure I deliver a secure product. I don't need to know the context in which my client is going to use my product; all I need to know is how I am going to deliver it. If my delivery is in JSON, then I can use that context to escape my data before sending it off, and similarly for XML, plain text, etc. Furthermore, there are transport methods that already aid in security; JSONP is one such delivery method, which helps ensure the payload is consumed appropriately.
As the consumer (and by the way, in our environment no one is the final consumer; we are all providers to the final end client, mostly end users via a web browser), we also have to secure the data at this end. I would never trust a black-box API to do this job for me; I would always make a point to ensure a secure payload. There are many tools out there (the ESAPI project from OWASP comes to mind) that will aid in sanitizing data by context. Remember that you are eventually sending this data on to the end user (browser), and if there is something awry you won't be able to pass the buck: your service will be viewed as the vulnerable one regardless of where the flaw lies. Additionally, as the consumer, you may not always be able to rely on the black-box provider to fix their flaws in a timely fashion. What if their support is lacking, or they have higher priorities? Does that mean you continue to ship a known flaw to your end users?
Security is about layers, and having safeguards at the source and end-points is always preferable.

Web Service to return complex object with optional parts

I'm trying to think of the correct design for a web service. Essentially, this service is going to perform a client search in a number of disparate systems, and return the results.
Now, a client can have various pieces of information attached - e.g. various pieces of contact information, their address(es), personal information. Some of this information may be complex to retrieve from some systems, so if the consumer isn't going to use it, I'd like them to have some way of indicating that to the web service.
One obvious approach would be to have different methods for different combinations of wanted detail - but as the combinations grow, so too do the number of methods. Another approach I've looked at is to add two string array parameters to the method call, where one array is a list of required items (e.g. I require contact information), and the other is optional items (e.g. if you're going to pull in their names anyway, you might as well return that to me).
A third approach would be to add additional methods to retrieve the detail. But that's going to explode the number of round trips if I need all the details for potentially hundreds of clients who make up the result.
To be honest, I'm not sure I like any of the above approaches. So how would you design such a generic client search service?
(Considered CW since there might not be a single "right" answer, but I'll wait and see what sort of answers arrive)
Create a "criteria" object and use that as a parameter. Such an object should have a bunch of properties to indicate the information you want. For example "IncludeAddresses" or "IncludeFullContactInformation".
The consumer is then responsible for setting the right properties to true, and all combinations are possible. This will also make the code in the service easier to write. You can simply write if (criteria.IncludeAddresses) { response.Addresses = GetAddresses(); }
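Sketched in Python (the answer's snippet looks like C#; the field names and helper functions here are invented for illustration), the criteria object is just a bag of flags the service checks before doing each expensive lookup:

```python
from dataclasses import dataclass


@dataclass
class SearchCriteria:
    include_addresses: bool = False
    include_contact_information: bool = False


def fetch_addresses():
    return ["221B Baker St"]  # stand-in for an expensive backend call


def fetch_contacts():
    return ["alice@example.com"]  # stand-in for an expensive backend call


def search_clients(criteria):
    # Cheap data is always included; expensive pieces only on request.
    response = {"names": ["Alice", "Bob"]}
    if criteria.include_addresses:
        response["addresses"] = fetch_addresses()
    if criteria.include_contact_information:
        response["contacts"] = fetch_contacts()
    return response
```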
Any non-structured or semi-structured data is best handled by XML. You might pass XML data via a string, or wrap it in a class that adds some functionality to it. Use XPathNavigator to go through the XML. You can also use the XmlDocument class, although it is not too friendly to use. In any case, you will need some kind of class to handle the XML content.
That's why XML was invented: to handle data whose structure is not clearly defined.

How should I do an ISerializable Stream (or near enough)

I have a web service accessed via SOAP. I'd really like one of the methods to return a Stream.
What are my options?
My thoughts right now amount to implement Stream and stuff all the data in a string. Is there a type that does this already? If possible (and I don't think it is) I'd love to actually tunnel the stream through SOAP so that data gets pulled lazily even after the method returns.
Your best bet is to read the Stream into a byte array. You can then serialize the byte array in the web service. The client can then consume the raw byte array and reassemble it into its original format.
I've also used the same strategy for uploading files via a web service, and it worked great.
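A sketch of that round trip in Python: SOAP toolkits typically serialize a byte array as base64, so the stream is drained into bytes on the server side and reassembled into a stream on the client side (function names are illustrative).

```python
import base64
import io


def stream_to_payload(stream):
    # Server side: drain the stream and encode it for the SOAP envelope.
    return base64.b64encode(stream.read()).decode("ascii")


def payload_to_stream(payload):
    # Client side: decode and re-wrap the bytes as a stream.
    return io.BytesIO(base64.b64decode(payload))
```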