How does Sentry aggregate errors?

I am using Sentry (in a Django project), and I'd like to know how I can get the errors to aggregate properly. I am logging certain user actions as errors, so there is no underlying system exception, and I am using the culprit attribute to set a friendly error name. The message is templated and follows a common pattern ("User 'x' was unable to perform action because 'y'"), but is never exactly the same (different users, different conditions).
Sentry clearly uses some set of attributes under the hood to determine whether to aggregate errors as the same exception, but despite having looked through the code, I can't work out how.
Can anyone short-cut my having to dig further into the code and tell me what properties I need to set in order to manage aggregation as I would like?
[UPDATE 1: event grouping]
This line appears in sentry.models.Group:
class Group(MessageBase):
    """
    Aggregated message which summarizes a set of Events.
    """
    ...
    class Meta:
        unique_together = (('project', 'logger', 'culprit', 'checksum'),)
    ...
Which makes sense - project, logger and culprit are things I am setting at the moment - the problem is checksum. I will investigate further, but 'checksum' suggests binary equivalence, which is never going to work. Surely it must be possible to group instances of the same exception with different attributes?
[UPDATE 2: event checksums]
The event checksum comes from the sentry.manager.get_checksum_from_event method:
def get_checksum_from_event(event):
    for interface in event.interfaces.itervalues():
        result = interface.get_hash()
        if result:
            hash = hashlib.md5()
            for r in result:
                hash.update(to_string(r))
            return hash.hexdigest()
    return hashlib.md5(to_string(event.message)).hexdigest()
Next stop - where do the event interfaces come from?
[UPDATE 3: event interfaces]
I have worked out that interfaces refer to the standard mechanism for describing data passed into sentry events, and that I am using the standard sentry.interfaces.Message and sentry.interfaces.User interfaces.
Both of these will contain different data depending on the exception instance - and so a checksum will never match. Is there any way that I can exclude these from the checksum calculation? (Or at least the User interface value, as that has to be different - the Message interface value I could standardise.)
[UPDATE 4: solution]
Here are the two get_hash functions for the Message and User interfaces respectively:
# sentry.interfaces.Message
def get_hash(self):
    return [self.message]

# sentry.interfaces.User
def get_hash(self):
    return []
Looking at these two, only Message.get_hash returns a value that is picked up by the get_checksum_from_event method, so this is the one that will be hashed and returned. The net effect is that the checksum is evaluated on the message alone, which in theory means that I can standardise the message and keep the user definition unique.
I've answered my own question here, but hopefully my investigation is of use to others having the same problem. (As an aside, I've also submitted a pull request against the Sentry documentation as part of this ;-))
(Note to anyone using / extending Sentry with custom interfaces - if you want to avoid your interface being used to group exceptions, return an empty list from get_hash.)

See my final update in the question itself. Events are aggregated on a combination of the 'project', 'logger', 'culprit' and 'checksum' properties. The first three of these are relatively easy to control - the fourth, 'checksum', is a function of the type of data sent as part of the event.
Sentry uses the concept of 'interfaces' to control the structure of data passed in, and each interface comes with an implementation of get_hash, which is used to return a hash value for the data passed in. Sentry comes with a number of standard interfaces ('Message', 'User', 'HTTP', 'Stacktrace', 'Query', 'Exception'), and these each have their own implementation of get_hash. The default (inherited from the Interface base class) is an empty list, which does not affect the checksum.
If no interface returns a non-empty hash, the event message itself is hashed and returned as the checksum, meaning that the message would need to be identical across events for them to be grouped.
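The checksum logic described above can be sketched in a few lines. This is a simplified, standalone illustration and not Sentry's own code: the Interface, Message and User classes and the checksum function below are stand-ins written for this example, and the sketch assumes Python 3 (hence the explicit encode calls).

```python
import hashlib


class Interface(object):
    """Stand-in base class: by default an interface contributes nothing."""
    def get_hash(self):
        return []


class Message(Interface):
    """Stand-in for a message interface: the message drives the checksum."""
    def __init__(self, message):
        self.message = message

    def get_hash(self):
        return [self.message]


class User(Interface):
    """Stand-in for a user interface: inherits the empty get_hash."""
    def __init__(self, username):
        self.username = username


def checksum(interfaces, fallback_message):
    # The first interface that returns a non-empty hash decides the checksum.
    for interface in interfaces:
        result = interface.get_hash()
        if result:
            digest = hashlib.md5()
            for part in result:
                digest.update(str(part).encode('utf-8'))
            return digest.hexdigest()
    # No interface contributed anything: hash the event message instead.
    return hashlib.md5(fallback_message.encode('utf-8')).hexdigest()


standard = "User was unable to perform action"
alice = checksum([User('alice'), Message(standard)], standard)
bob = checksum([User('bob'), Message(standard)], standard)
other = checksum([User('alice'), Message("Something else failed")], standard)
```

In this sketch alice and bob share a checksum because only the message contributes to it, while other differs because its message differs.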

I've had a similar problem with exceptions. Currently our system is capturing only exceptions, and I was confused about why some of these were merged into a single error while others were not.
With your information above I extracted the get_hash output for my errors and looked for the differences. What I found is that the grouped errors all came from a self-written exception type that has an empty Exception.message value.
get_hash output:
[<class 'StorageException'>, StorageException()]
whereas the errors that were reported separately came from an exception class that has a filled message value (the Jinja template engine):
[<class 'jinja2.exceptions.UndefinedError'>, UndefinedError('dict object has no attribute LISTza_*XYZ*',)]
Different exception messages trigger different reports; in my case the merge was caused by the lack of an Exception.message value.
Implementation:
class StorageException(Exception):
    def __init__(self, value):
        Exception.__init__(self)
        self.value = value
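A small sketch of why this class produces an empty message: Exception.__init__ is called without arguments, so the exception's args (and therefore its string form) stay empty whatever value is passed in. DescriptiveStorageException is a hypothetical variant added only for comparison; whether you actually want one group per distinct message depends on how you want the errors aggregated.

```python
class StorageException(Exception):
    def __init__(self, value):
        # No arguments are forwarded, so args is () and str(e) is empty.
        Exception.__init__(self)
        self.value = value


class DescriptiveStorageException(Exception):
    """Hypothetical variant that forwards the value to the base class."""
    def __init__(self, value):
        Exception.__init__(self, value)
        self.value = value


first = StorageException("disk full")
second = StorageException("permission denied")
described = DescriptiveStorageException("disk full")

# Both StorageException instances look identical to anything that only
# inspects the exception type and its message.
same_message = str(first) == str(second)
```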

Related

@mswjs/data question: why does the RTK-Query sandbox example need separately hand-coded POST and PUT mocks?

This is a question about the default behaviour of the @mswjs/data toHandlers function, using this example that creates mocks for RTK-Query calls with @mswjs/data:
https://codesandbox.io/s/github/reduxjs/redux-toolkit/tree/master/examples/query/react/mutations?from-embed
The file src/mocks/db.ts creates a mock database using @mswjs/data and defines default HTTP mock responses using ...db.post.toHandlers('rest'), but it fails to work if I remove the additional PUT and POST mocks.
My understanding, according to the GitHub documentation, is that the @mswjs/data toHandlers() function provides PUT and POST mock API calls by default for a defined database model (in this case posts). I am seeking advice to better understand why toHandlers does not work for PUT and POST in this example, i.e. if I remove the PUT and POST mock API calls, the requests fail.
What do the manual PUT and POST API mocks do that the default toHandlers don't?

You are correct to state that .toHandlers() generates both POST /posts and PUT /posts/:id request handlers. The RTK-Query example adds those handlers explicitly for the following reasons:
To emulate flaky error behavior by returning an error response based on the Math.random() value in the handler.
To set the id primary key to nanoid().
Adding a post fails if you remove the explicit POST /posts handler because the model definition for post does not define the initial value for the id primary key. You cannot create an entity without providing a primary key to it, which the example does not:
// PostManager.tsx
// The "post" state only contains the name of the new post.
const [post, setPost] = useState<Pick<Post, "name">>(initialValue);
// Only the "post" state is passed to the code that dispatches the
// "POST /posts" request handled by MSW.
await addPost(post).unwrap();
If we omit the random error behavior, I think the example should've used nanoid as the initial value of the id property in the model description:
import { nanoid } from "@reduxjs/toolkit";

const db = factory({
  post: {
-   id: primaryKey(String),
+   id: primaryKey(nanoid),
    name: String
  }
});
This way you would be able to create new posts by supplying the name only. The value of the id primary key would be generated using the value getter—the nanoid function.
The post edit operation functions correctly even if you remove the explicit PUT /posts/:id request handler because, unlike the POST handler, the PUT one is only there to implement a flaky error behavior (the edited post id is provided in the path parameters: req.params.id).

What is the difference between startFlow and startTrackedFlow in Corda?

So what is the advantage of using startTrackedFlow over startFlow?

The difference is defined in the official documentation:
The process of starting a flow returns a FlowHandle that you can use to observe the result, and which also contains a permanent identifier for the invoked flow in the form of the StateMachineRunId. Should you also wish to track the progress of your flow (see Progress tracking) then you can invoke your flow instead using CordaRPCOps.startTrackedFlowDynamic or any of its corresponding CordaRPCOps.startTrackedFlow extension functions. These will return a FlowProgressHandle, which is just like a FlowHandle except that it also contains an observable progress field.

Enforcing custom enumeration in AWS LEX for slot values

I want to be able to specify a custom list of valid options for a slot that LEX will either attempt to approximate towards or, in the event that no valid option can be approximated, reject the invalid response.
At first I attempted to do this through custom slot types. And though their examples may lead you to believe these are enumerations, they are not. A user still has the capacity to input whatever value they like.
Their documentation has this to say: https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/migrating-to-the-improved-built-in-and-custom-slot-types#literal
A custom slot type is not the equivalent of an enumeration. Values outside the list may still be returned if recognized by the spoken language understanding system. Although input to a custom slot type is weighted towards the values in the list, it is not constrained to just the items on the list. Your code still needs to include validation and error checking when using slot values.
I am aware that I can validate their submission through a lambda after they have completed their full submission, but by then it's too late. A user has submitted their full intent message. I'm unable to capture it midway and correct them.
Am I missing some way to input slot options or a configuration option for custom slot types? Is there any way to enforce a custom list of options for a slot? (Similar to utterances for intents, or the built in slot types, which will ask the same question again if there is no match.)
Thanks!

"I'm unable to capture it midway and correct them."
You can capture the error in lambda without fulfilling the intent and starting over. Here's how I validate input with Python.
If you detect a validation error in lambda, you can elicit the same slot and pass your error message. This allows you to set complex validation rules and have your bot return specific responses to the user.
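def validate(value):
    # Re-prompt for the same slot when the value is not an allowed option.
    if value not in ['foo', 'bar']:
        return elicit_slot("Your response must be foo or bar")

def elicit_slot(error_message):
    # current_intent, current_slots and slot_with_validation_error are
    # assumed to have been read from the incoming Lex event.
    return {
        'dialogAction': {
            'type': 'ElicitSlot',
            'intentName': current_intent,
            'slots': current_slots,
            'slotToElicit': slot_with_validation_error,
            'message': {'contentType': 'PlainText', 'content': error_message}
        }
    }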
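For context, here is a slightly fuller sketch of how that validation could be wired into a dialog code hook. It assumes the Lex (V1) Lambda event format, where the intent name and slots arrive under event['currentIntent']. The slot name 'Choice', the ALLOWED_VALUES list and the validate_dialog function are hypothetical names chosen for this example, and fulfillment handling is deliberately left out.

```python
ALLOWED_VALUES = ['foo', 'bar']  # hypothetical list of valid options
SLOT_NAME = 'Choice'             # hypothetical slot name


def elicit_slot(intent_name, slots, slot_to_elicit, error_message):
    # Ask Lex to prompt the user for the same slot again.
    return {
        'dialogAction': {
            'type': 'ElicitSlot',
            'intentName': intent_name,
            'slots': slots,
            'slotToElicit': slot_to_elicit,
            'message': {'contentType': 'PlainText', 'content': error_message},
        }
    }


def delegate(slots):
    # Hand control back to Lex so it can continue with the next slot.
    return {'dialogAction': {'type': 'Delegate', 'slots': slots}}


def validate_dialog(event):
    """Validate the slot on each dialog turn (assumes a DialogCodeHook call)."""
    intent = event['currentIntent']
    slots = dict(intent['slots'])
    value = slots.get(SLOT_NAME)
    if value is not None and value.lower() not in ALLOWED_VALUES:
        slots[SLOT_NAME] = None  # clear the rejected value before re-asking
        return elicit_slot(intent['name'], slots, SLOT_NAME,
                           "Your response must be foo or bar")
    return delegate(slots)
```

Because the check runs on every dialog turn rather than at fulfillment, the user is corrected as soon as the invalid value is supplied.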

Disabling loggers if No handlers are present

I'm creating a python script and class object that passes logging handlers into the latter. The handlers are created using the logging class like so:
#Debug Handler
debug_handler = logging.FileHandler('Debug.log')
debug_handler.setLevel(logging.DEBUG)
#Info Handler
info_handler = logging.FileHandler('Normal.log')
info_handler.setLevel(logging.INFO)
These handler objects are passed directly into the object initializer:
def __init__(self, type, path, info_handler = False, debug_handler = False):
    #Establishes Class Logger
    self.logger = logging.getLogger('LoggerName')
    self.logger.setLevel(logging.DEBUG)
    if (info_handler):
        self.logger.addHandler(info_handler)
    if (debug_handler):
        self.logger.addHandler(debug_handler)
My goal is to make the handlers completely optional for the object class, but to use them I must scatter calls across the code as frequently as print statements. Ex:
self.logger.info('INITIALIZING RESULTS OBJECT')
Which means that it will error should no handlers be passed. How can I manage/nullify these statements without placing try/catch on every single call in the code?

Update: There should be no issue with calling a logger when no handlers are present. The library simply prints out a statement acknowledging the lack of a handler. My error was caused by trying to add a handler that was not defined, and this was easily rectified by the statement below:
if (info_handler):
    self.logger.addHandler(info_handler)
if (debug_handler):
    self.logger.addHandler(debug_handler)
I'll leave this question up to show basic syntax for the logging library.
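If you would rather the logger stay completely silent when no handlers are supplied, one option is to attach a logging.NullHandler. The following is a minimal sketch, not the original class: Results is a hypothetical class name and the constructor is reduced to the logging parts.

```python
import logging


class Results(object):
    def __init__(self, info_handler=None, debug_handler=None):
        self.logger = logging.getLogger('LoggerName')
        self.logger.setLevel(logging.DEBUG)
        # A NullHandler discards every record, so logging calls are safe
        # and quiet even when the caller passes no handlers at all.
        if not any(isinstance(h, logging.NullHandler)
                   for h in self.logger.handlers):
            self.logger.addHandler(logging.NullHandler())
        # Attach only the handlers that were actually provided.
        for handler in (info_handler, debug_handler):
            if handler is not None:
                self.logger.addHandler(handler)
        self.logger.info('INITIALIZING RESULTS OBJECT')


results = Results()  # no handlers passed; logging calls still work
results.logger.debug('this record is discarded')
```

Using None as the default instead of False makes the "not provided" case explicit, and the logging calls elsewhere in the class need no try/except or conditional guards.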

Asynchronous network calls

I made a class that has an asynchronous OpenWebPage() function. Once you call OpenWebPage(someUrl), a handler gets called - OnPageLoad(reply). I have been using a global variable called lastAction to take care of stuff once a page is loaded - handler checks what is the lastAction and calls an appropriate function. For example:
this->lastAction = "homepage";
this->OpenWebPage("http://www.hardwarebase.net");

void OnPageLoad(reply)
{
    if(this->lastAction == "homepage")
    {
        this->lastAction = "login";
        this->Login(); // POSTs a form and OnPageLoad gets called again
    }
    else if(this->lastAction == "login")
    {
        this->PostLogin(); // Checks whether we logged in properly, sets lastAction to "new topic" and goes to the new topic URL
    }
    else if(this->lastAction == "new topic")
    {
        this->WriteTopic(); // Does some more stuff ... you get the point
    }
}
Now, this is rather hard to write and keep track of when we have a large number of "actions". When I was doing stuff in Python (synchronously) it was much easier, like:
OpenWebPage("http://hardwarebase.net")  # Stores the loaded page HTML in self.page
OpenWebPage("http://hardwarebase.net/login", {"user": username, "pw": password})  # POSTs a form
if(self.page == ...):  # now do some more checks etc.
    # do something more
Imagine now that I have a queue class which holds the actions: homepage, login, new topic. How am I supposed to execute all those actions (in proper order, one after one!) via the asynchronous callback? The first example is totally hard-coded obviously.
I hope you understand my question, because frankly I fear this is the worst question ever written :x
P.S. All this is done in Qt.

You are inviting all manner of bugs if you try and use a single member variable to maintain state for an arbitrary number of asynchronous operations, which is what you describe above. There is no way for you to determine the order that the OpenWebPage calls complete, so there's also no way to associate the value of lastAction at any given time with any specific operation.
There are a number of ways to solve this, e.g.:
Encapsulate web page loading in an immutable class that processes one page per instance
Return an object from OpenWebPage which tracks progress and stores the operation's state
Fire a signal when an operation completes and attach the operation's context to the signal
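As a rough illustration of the last two ideas, here is a sketch in Python (used for brevity; the same structure carries over to Qt signals and slots). Each queued action carries its own context into its callback, and the next action is started only from inside the previous one's callback, so the order is preserved without a shared lastAction. FakeBrowser and ActionQueue are hypothetical names; FakeBrowser merely simulates an asynchronous loader.

```python
from collections import deque


class FakeBrowser(object):
    """Stand-in for an asynchronous page loader: callbacks fire later."""
    def __init__(self):
        self._pending = deque()

    def open_web_page(self, url, on_load):
        self._pending.append((url, on_load))

    def run(self):
        # Simulates the event loop delivering "page loaded" notifications.
        while self._pending:
            url, on_load = self._pending.popleft()
            on_load("<html>%s</html>" % url)


class ActionQueue(object):
    def __init__(self, browser, actions):
        self.browser = browser
        self.actions = deque(actions)  # (name, url) pairs, in order
        self.completed = []

    def start(self):
        self._next()

    def _next(self):
        if not self.actions:
            return
        name, url = self.actions.popleft()
        # Bind the action's name to its own callback instead of storing
        # it in a shared member variable.
        self.browser.open_web_page(
            url, lambda reply, name=name: self._on_page_load(name, reply))

    def _on_page_load(self, name, reply):
        self.completed.append(name)  # per-action handling would go here
        self._next()                 # start the next action only now


browser = FakeBrowser()
queue = ActionQueue(browser, [
    ("homepage", "http://www.hardwarebase.net"),
    ("login", "http://www.hardwarebase.net/login"),
    ("new topic", "http://www.hardwarebase.net/newtopic"),
])
queue.start()
browser.run()
```

Because only one request is in flight at a time and each callback knows which action it belongs to, adding a new action means appending to the queue rather than extending an if/else chain.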

You need to add "return" statement in the end of every "if" branch: in your code, all "if" branches are executed in the first OnPageLoad call.
Generally, asynchronous state management is always more complicated than synchronous. Consider replacing the lastAction type with an enumeration. Also, if the OnPageLoad thread context is arbitrary, you need to synchronize access to global variables.
