Apache Axis 1: Disable serialization of xsd:float in scientific notation - web-services

We have a webservice client that calls a service (rpc-encoded, the reason we are still using Axis 1) that defines some values as xsd:float but refuses values passed in scientific notation. I understand that it is totally standard compliant behaviour of the client (https://www.w3.org/TR/xmlschema-2/ 3.2.4.1) and probably it would be the right way for the server to use xsd:decimal I wonder if there is a way to tell axis not to use scientific notation for xsd:float and xsd:double
Possibly related: BigDecimal has scientific notation in soap message asks how to achieve this with JAX-B with the accepted answer suggesting to use an XMLAdapter. Maybe there exists a similar mechanism for axis.

It looks like in Axis (http://axis.apache.org/axis/java/user-guide.html#XML_-_Java_Data_Mapping_in_Axis), you can define your own serializer and deserializer for (e.g.) float (you will have to change the xsd: ; do you have access?). This is done via a typeMapping:
<typeMapping qname="ns:local" xmlns:ns="someNamespace"
languageSpecificType="java:my.java.thingy"
serializer="my.java.Serializer"
deserializer="my.java.DeserializerFactory"
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>

Related

Google Natural Language Sentiment Analysis incorrect result

We have Google Natural AI integrated into our product for Sentiment Analysis (https://cloud.google.com/natural-language). One of the customers complained that when they write "BAD" then it shows a positive sentiment.
On further investigation, we found that when google Sentiment Analysis Natural Language API is called with input as BAD or Bad (pls see its in all caps or first letter caps ), it identifies text as an entity (a location or consumer good) & sends back the result as Positive while when we write "bad" in all small case, it sends negative.
Has anyone faced a similar problem? How did you solve it?
One obvious way looks like converting text into a small case but that may break some other use cases (maybe where entities do not get analyzed due to a small case text). Another way which we are building is to use our own dictionary of words with sentiments before calling google APIs but that doesn't answer the said problem, which may occur with any other text.
Inputs will help us. Thank you!
The NLP API uses an underlying model that is neural in nature. The knowledge comes from training on real world text. It is normal to get different results for different capitalizations as they can relate to different uses of the same trigram, e.g. Mike (person), mike (microphone, slang), MIKE (military alphabet entry).
The second key aspect is that the model is tuned and meant to be used on larger pieces of text and not on single words, hence good results can not be expected in this case.

How to perform integration of a number in python2.7

Someone please help me to integrate a_z as shown in the picture using Python2.7
Is there any predefined method to perform the action.Thank you
Try this:
http://docs.scipy.org/doc/scipy/reference/tutorial/integrate.html
for numerical integration.
In your particular example, a_z is a linear function of t and thus you dont need any numerical methods, you can do a symbolic integral and just compute the resultant function, which is much more efficient.
In fact, the solution is already printed there, so not sure what else you need to know?

Language agnostic cookie encoding / decoding standards

I'm having difficulties to figure out what is the standard (or is there any?) for encoding/decoding cookie values regardless to backend platforms.
According to RFC 2109:
The VALUE is opaque to the user agent and may be anything the origin server chooses to send, possibly in a server-selected printable ASCII encoding. "Opaque" implies that the content is of interest and relevance only to the origin server. The content may, in fact, be readable by anyone that examines the Set-Cookie header.
which sounds like "server is the boss" and it decides whatever the encoding will apply. This makes it quite difficult to set a cookie from, say PHP backend and read it from Python or Java or whatever, without writing any manual encode/decode handling on both sides.
Let's say we have a value needs to be encoded. Russian /"печенье (*} значения"/ means "cookie value" with some additional non alpha-numeric chars in it.
Python:
Almost every WSGI server does the same and uses Python's SimpleCookie class that encodes to / decodes from octal literals even though many says that octal literals are depreciated in ECMA-262, strict mode. Wtf?
So, our raw cookie value becomes "/\"\320\277\320\265\321\207\320\265\320\275\321\214\320\265 (*} \320\267\320\275\320\260\321\207\320\265\320\275\320\270\321\217\"/"
Node.js:
Haven't tested at all but I'm just guessing a JavaScript backend would do it with native encodeURIComponent and decodeURIComponent functions that use hexadecimal escaping / unescaping?
PHP:
PHP applies urlencode to the cookie values that is similar to encodeURIComponent but not exactly the same.
So the raw value becomes; %2F%22%D0%BF%D0%B5%D1%87%D0%B5%D0%BD%D1%8C%D0%B5+%28%2A%7D+%D0%B7%D0%BD%D0%B0%D1%87%D0%B5%D0%BD%D0%B8%D1%8F%22%2F that is not even wrapped with double quotes.
However; if the JavaScript value variable has the PHP encoded value above, decodeURIComponent(value) gives /"печенье+(*}+значения"/, see "+" chars instead of spaces..
What is the situation in Java, Ruby, Perl and .NET? Which language is following (or closest) to the desired behaviour. Actually, is there any standard for this defined by W3?
I think you've got things a bit mixed up here. The server's encoding does not matter to the client, and it shouldn't. That is what RFC 2109 is trying to say here.
The concept of cookies in http is similar to this in real life: Upon paying the entrance fee to a club you get an ink stamp on your wrist. This allows you to leave and reenter the club without paying again. All you have to do is show your wrist to the bouncer. In this real life example, you don't care what it looks like, it might even be invisible in normal light - all that is important is that the bouncer recognises the thing. If you were to wash it off, you'll lose the privilege of reentering the club without paying again.
In HTTP the same thing is happening. The server sets a cookie with the browser. When the browser comes back to the server (read: the next HTTP request), it shows the cookie to the server. The server recognises the cookie, and acts accordingly. Such a cookie could be something as simple as a "WasHereBefore" marker. Again, it's not important that the browser understands what it is. If you delete your cookie, the server will just act as if it has never seen you before, just like the bouncer in that club would if you washed off that ink stamp.
Today, a lot of cookies store just one important piece of information: a session identifier. Everything else is stored server-side and associated with that session identifier. The advantage of this system is that the actual data never leaves the server and as such can be trusted. Everything that is stored client-side can be tampered with and shouldn't be trusted.
Edit: After reading your comment and reading your question yet again, I think I finally understood your situation, and why you're interested in the cookie's actual encoding rather than just leaving it to your programming language: If you have two different software environments on the same server (e.g.: Perl and PHP), you may want to decode a cookie that was set by the other language. In the above example, PHP has to decode the Perl cookie or vice versa.
There is no standard in how data is stored in a cookie. The standard only says that a browser will send the cookie back exactly as it was received. The encoding scheme used is whatever your programming language sees fit.
Going back to the real life example, you now have two bouncers one speaking English, the other speaking Russian. The two will have to agree on one type of ink stamp. More likely than not this will involve at least one of them learning the other's language.
Since the browser behaviour is standardized, you can either imitate one languages encoding scheme in all other languages used on your server, or simply create your own standardized encoding scheme in all languages being used. You may have to use lower level routines, such as PHP's header() instead of higher level routines, such as start_session() to achieve this.
BTW: In the same manner, it is the server side programming language that decides how to store server side session data. You cannot access Perl's CGI::Session by using PHP's $_SESSION array.
Regardless of the cookie being opaque to the client, it still needs to conform to the HTTP spec. rfc2616 specifies that all HTTP headers should be ASCII (ISO-8859-1). rfc5987 extends that to support other character sets, but I don't know how widely supported it is.
I prefer to encode into UTF8 and wrap with base64 encoding. It's fast, ubiquitous, and will never mangle your data at either end.
You will need to ensure an explicit conversion into UTF8 even when wrapping it. Other languages & runtimes, while supporting Unicode, may not store strings as UTF8 internally... like many Windows APIs. Python 2.x, in my experience, rarely gets Unicode strings right without explicit conversion.
ENCODE: nativeString -> utfEncode() -> base64Encode()
DECODE: base64Decode() -> utfDecode() -> nativeString
Almost every language I know of, these days, supports this. You can look for a universal single-function encode, but I err on the side of caution and choose the two-step approach... especially with foreign character sets.

Once something is HTML or URL encoded should it ever be decoded? Is encoding enough?

First time AntiXSS 4 user here. In order to make my application more secure, I've used Microsoft.Security.Application.Encoder.UrlEncode on QueryString parameters and
Microsoft.Security.Application.Encoder.HtmlEncode on a parameter entered into a form field.
I have a multiple and I would appreciate it if you could try to answer all of them (doesn't have to be at once or by the same person - any abswers at all would be very helpful).
My first question is am I using these methods appropriately (that is am I using an appropriate AntiXSS method for an appropriate situation)?
My second question is once I've encoded something, should it ever be decoded. I am confused because I know that HttpUtility class provides ways to both encode and decode so why isn't the same done in AntiXSS? If this helps, the parameters that I've encoded are never going to be treated as anything other then text inside the application.
My third question is related to the third one but I wanted to emphasize it because it's important (and is probably the source of my overall confusion). I've heard that the .NET framework automatically decodes things like QueryStrings, hence no no need for explicit decode method. If that is so, then what is the point of HTML encoding something in the first place if it is going to be undone. It just... doesn't seem safe? What am I missing, especially since, as mentioned the HttpUtility class provides for decoding.
And the last question, does AntiXSS help against SQL injection at all or does it only protext against XSS attacks?
It's hard to say if you're using it correctly. If you use UrlEncode when building a query string which is then output as a link in a page then yes that's correct. If you're Html Encoding when you write something out as a value then yes, that's correct (well kind of, if it's set via an HTML attribute you ought to use HtmlAttributeEncode, but they're pretty much the same.)
The .NET decoders work with AntiXSS's encoded values, so there was no point in me rewriting them grin
The point of encoding is that you do it when you output. So, for example, if a user has, on a form, input window.alert('Numpty!) and you just put that input raw in your output the javascript would run. If you encoded it first you would see < become < and so on.
No, SQL injection is an entirely different problem.

parser: parsing formulas in template files

I will first describe the problem and then what I currently look at, in terms of libraries.
In my application, we have a set of variables that are always available. For example: TOTAL_ITEMS, PRICE, CONTRACTS, ETC (we have around 15 of them). A clients of the application would like to have certain calculations performed and displayed, using those variables. Up until now, I have been constantly adding those calculations to the app. It's pain in the butt, and I would like to make it more generic by way of creating a template, where the user can specify a set of formulas that the application will parse and calculate.
Here is one case:
total_cost = CONTRACTS*PRICE*TOTAL_ITEMS
So, want to do something like that for the user to define in the template file:
total_cost = CONTRACTS*PRICE*TOTAL_ITEMS and some meta-date, like screen to display it on. Hence they will be specifying the formula with a screen. And the file will contain many formulas of this nature.
Right now, I am looking at two libraies: Spirit and matheval
Would anyone make recommendations what's better for this task, as well as references, examples, links?
Please let me know if the question is unclear, and I will try to further clarify it .
Thanks,
Sasha
If you have a fixed number of variables it may be a bit overkill to invoke a parser. Though Spirit is cool and I've been wanting to use it in a project.
I would probably just tokenize the string, make a map of your variables keyed by name (assuming all your variables are ints):
map<const char*,int*> vars;
vars["CONTRACTS"] = &contracts;
...
Then use a simple postfix calculator function to do the actual math.
Edit:
Looking at MathEval, it seems to do exactly what you want; set variables and evaluate mathematical functions using those variables. I'm not sure why you would want to create a solution at the level of a syntax parser. Do you have any requirements that MathEval does not fulfill?
Looks like it shouldn't be too hard to generate a simple parser using yacc and bison and integrate it into your code.
I don't know about matheval, but boost::spirit can do that for you pretty efficiently : see there.
If you're into template metaprogramming, you may want to have a look into Boost::Proto, but it will take some time to get started using it.