I'm creating a Python script and a class, and the script passes logging handlers into the class. The handlers are created with the logging module like so:
#Debug Handler
debug_handler = logging.FileHandler('Debug.log')
debug_handler.setLevel(logging.DEBUG)
#Info Handler
info_handler = logging.FileHandler('Normal.log')
info_handler.setLevel(logging.INFO)
These handler objects are passed directly into the object initializer:
def __init__(self, type, path, info_handler=False, debug_handler=False):
    # Establishes the class logger
    self.logger = logging.getLogger('LoggerName')
    self.logger.setLevel(logging.DEBUG)
    if info_handler:
        self.logger.addHandler(info_handler)
    if debug_handler:
        self.logger.addHandler(debug_handler)
My goal is to make the handlers completely optional for the class, but to use them I have to scatter logging calls across the code as frequently as print statements, e.g.:
self.logger.info('INITIALIZING RESULTS OBJECT')
Which means it will error should no handlers be passed. How can I manage/nullify these statements without wrapping every single call in the code in try/except?
Update: There is actually no issue with calling a logger when no handlers are present; the library simply prints a warning acknowledging the lack of a handler. My error was caused by trying to add a handler that was not defined, and this was easily rectified by the check below:
if info_handler:
    self.logger.addHandler(info_handler)
if debug_handler:
    self.logger.addHandler(debug_handler)
I'll leave this question up to show basic syntax for the logging library.
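For completeness, here is a minimal sketch of how the optional handlers could be made safe without any error handling at the call sites, using the standard logging.NullHandler. The class name and the None defaults are illustrative, not the original code:

import logging

class Results:  # illustrative name for the class in question
    def __init__(self, info_handler=None, debug_handler=None):
        self.logger = logging.getLogger('LoggerName')
        self.logger.setLevel(logging.DEBUG)
        if info_handler:
            self.logger.addHandler(info_handler)
        if debug_handler:
            self.logger.addHandler(debug_handler)
        if not self.logger.handlers:
            # No handlers supplied: attach a NullHandler so calls such as
            # self.logger.info(...) are silently discarded instead of
            # triggering a "no handlers could be found" warning.
            self.logger.addHandler(logging.NullHandler())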
I am using a ServiceBusTrigger to execute code when receiving a message. I would like to use the Singleton attribute to limit which messages can be executed in parallel. This attribute allows specifying a scope bound to properties on the incoming message, such that messages with different values can be executed in parallel but ones with the same value must be done serially.
This works when using top level properties on the incoming message object like CorrelationId.
Example
[Singleton("{CorrelationId}", SingletonScope.Function, Mode = SingletonMode.Function)]
public async Task HandleMessage(
[ServiceBusTrigger("my-topic-name", "my-subscription-name"), ServiceBusAccount("my-account-name")]
Message message,
CancellationToken cancellationToken
)
{
await Task.Yield();
}
What I am struggling to figure out is how to achieve the same behavior with user properties on the message. These are stored in the UserProperties dictionary on the Message object. I'm not seeing a way to refer to these in the binding expression of the Singleton attribute, but it seems like this would be a very common use case when combining Singleton with ServiceBusTrigger.
The Service Bus bindings expose message metadata in binding expressions, so userProperties.<key> should do the trick.
I'm trying to get to grips with Python's logging module which frankly so far has not been approachable. Currently I have one 'main' logger in my main script:
logger = logging.getLogger(__name__)
handler = logging.FileHandler('debug.log')
handler.setFormatter(logging.Formatter('%(levelname)s: %(asctime)s: %(name)s: %(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug(
    '{} run for {} using {} values ({} mode).'.format(
        skill, datetime.now(), key, mode
    )
)
and I have a secondary logger in an imported module:
logger = logging.getLogger(__name__)
handler = logging.FileHandler('debug.log')
handler.setFormatter(logging.Formatter('%(levelname)s: %(asctime)s: %(name)s: %(message)s'))
logger.addHandler(handler)
However, although I tell both loggers to log to a file only (each has only the handler I've set), I still get information printed to stdout from the root logger. Inspecting logging.root.handlers shows the root logger has a StreamHandler, which only appears when I import the module containing the second logger.
My hackish fix for the extra stream output is to just delete the handler from the root logger's handlers. However, this feels like a non-canonical solution. I'm assuming I've used the module incorrectly rather than this being its intended behaviour. How are you meant to set up loggers in this hierarchical fashion correctly?
A proper minimal reproducible example would certainly help here - I can't reproduce this root logger handler suddenly appearing out of the blue.
This being said, you're doing it wrong anyway: one of the main goals of the logging module, which is unfortunately not clearly and explicitly documented, is to separate logger usage (the getLogger() and logger.log() / logger.info() etc. calls) from logging configuration.
The point is that library code cannot know in which context it will be used (only the application will), so library code should NOT try to configure its loggers in any way: just get a logger and use it, period. It is then up to the application (here your main.py script) to configure the loggers for the libraries (hint: the logging.config.dictConfig() function is by far the most usable way to configure everything at once).
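A minimal sketch of that split, assuming a hypothetical library module called mymodule and reusing the debug.log file name and format string from the question:

# main.py -- only the application configures logging, once, at startup
import logging.config

logging.config.dictConfig({
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'default': {'format': '%(levelname)s: %(asctime)s: %(name)s: %(message)s'},
    },
    'handlers': {
        'file': {
            'class': 'logging.FileHandler',
            'filename': 'debug.log',
            'formatter': 'default',
        },
    },
    'root': {'level': 'DEBUG', 'handlers': ['file']},
})

# mymodule.py -- library code just gets a logger and uses it, no configuration
import logging

logger = logging.getLogger(__name__)

def do_work():
    logger.debug('doing work')  # ends up in debug.log via the root handler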
I'm trying to create a Logger object which can log info to my console without having the root name.
import logging

# Set up logger.
logger = logging.getLogger()
handler = logging.StreamHandler()
handler.setLevel(logging.DEBUG)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)
logger.info("test")
This prints two log messages: the correctly formatted one from the handler I set up, and the one I would have got if I hadn't added a handler at all. What's the issue?
INFO:root:test
INFO:test
After messing around with it, I'm finding that this only occurs if a) I am adding the handler or b) I import another module with a logger.
I think you've missed
logger.setLevel(logging.DEBUG)
before logging; you only set the level on your handler. Without it, I could not get any output at all.
And since you got two outputs, maybe another file you import also creates and configures a logger, adding a handler to the root logger?
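Building on that, here is a hedged sketch of one way to get a single, root-free line on the console: use a named logger (the name 'myapp' is just illustrative) instead of the root logger, set the level on the logger itself, and stop propagation so any handler attached to the root elsewhere doesn't print a second copy:

import logging

logger = logging.getLogger('myapp')   # a named logger, not the root logger
logger.setLevel(logging.DEBUG)        # level on the logger, not only the handler
logger.propagate = False              # don't pass records up to the root logger's handlers

handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))
logger.addHandler(handler)

logger.info('test')                   # prints exactly one line: INFO:test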
I have an AppConfig.ready() implementation which depends on the readiness of another application.
Is there a signal or method (which I could implement) which gets called after all application ready() methods have been called?
I know that Django calls the ready() methods in the order of INSTALLED_APPS.
But I don't want to enforce a particular ordering of INSTALLED_APPS.
Example:
INSTALLED_APPS = [
    'app_a',
    'app_b',
    ...
]
How can "app_a" receive a signal (or method call) after "app_b" processed AppConfig.ready()?
(reordering INSTALLED_APPS is not a solution)
I'm afraid the answer is No. Populating the application registry happens in django.setup(). If you look at the source code, you will see that neither apps.registry.Apps.populate() nor django.setup() dispatches any signals upon completion.
Here are some ideas:
You could dispatch a custom signal yourself, but that would require that you do that in all entry points of your Django project, e.g. manage.py, wsgi.py and any scripts that use django.setup().
You could connect to request_started and disconnect when your handler is called (a sketch of this follows the list below).
If you are initializing some kind of property, you could defer that initialization until the first access.
Whether any of these approaches works for you obviously depends on what exactly you are trying to achieve.
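As a minimal sketch of the request_started idea (the app name app_a and the dispatch_uid string are illustrative, not anything Django prescribes):

# app_a/apps.py (hypothetical location)
from django.apps import AppConfig
from django.core.signals import request_started


def _late_init(sender, **kwargs):
    # Runs on the first request, i.e. after every AppConfig.ready() has run,
    # then unregisters itself so it only fires once.
    request_started.disconnect(_late_init, dispatch_uid='app_a_late_init')
    # ... do the work that needs app_b to be fully ready ...


class AppAConfig(AppConfig):
    name = 'app_a'

    def ready(self):
        request_started.connect(_late_init, dispatch_uid='app_a_late_init')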
So there is a VERY hackish way to accomplish what you might want...
Inside django.apps.registry lives the singleton apps, which Django uses to populate the applications; see setup() in django/__init__.py.
The way apps.populate works is that it uses a non-reentrant (thread-based) locking mechanism to only allow apps.populate to happen in an idempotent, thread-safe manner.
The stripped down source for the Apps class which is what the singleton apps is instantiated from:
class Apps(object):

    def __init__(self, installed_apps=()):
        # Lock for thread-safe population.
        self._lock = threading.Lock()

    def populate(self, installed_apps=None):
        if self.ready:
            return
        with self._lock:
            if self.ready:
                return
            for app_config in self.get_app_configs():
                app_config.ready()
            self.ready = True
With this knowledge, you could create some threading.Thread objects that wait on a condition. These consumer threads use a threading.Condition to send cross-thread signals (which enforces the ordering you need). Here is a mocked-out example of how that would work:
import threading

from django.apps import apps, AppConfig

# here we are using the "apps._lock" to synchronize our threads, which
# is the dirty little trick that makes this work
foo_ready = threading.Condition(apps._lock)


class FooAppConfig(AppConfig):
    name = "foo"

    def ready(self):
        t = threading.Thread(name='Foo.ready', target=self._ready_foo, args=(foo_ready,))
        t.daemon = True
        t.start()

    def _ready_foo(self, foo_ready):
        with foo_ready:
            # setup foo
            foo_ready.notifyAll()  # let everyone else waiting continue


class BarAppConfig(AppConfig):
    name = "bar"

    def ready(self):
        t = threading.Thread(name='Bar.ready', target=self._ready_bar, args=(foo_ready,))
        t.daemon = True
        t.start()

    def _ready_bar(self, foo_ready):
        with foo_ready:
            foo_ready.wait()  # wait until foo is ready
            # setup bar
Again, this ONLY lets you control the flow of the ready() calls from the individual AppConfigs; it doesn't control the order models get loaded, etc.
But if, as you stated, you have an AppConfig.ready implementation that depends on another app being ready first, this should do the trick.
Reasoning:
Why Conditions? The reason this uses threading.Condition over threading.Event is two-fold. Firstly, conditions are wrapped in a locking layer. This means that you will continue to operate under controlled circumstances if the need arises (accessing shared resources, etc). Secondly, because of this tight level of control, staying inside the threading.Condition's context will allow you to chain the configurations in some desirable ordering. You can see how that might be done with the following snippet:
lock = threading.Lock()
foo_ready = threading.Condition(lock)
bar_ready = threading.Condition(lock)
baz_ready = threading.Condition(lock)
Why daemonic threads? The reason for this is that, if your Django application were to die sometime between acquiring and releasing the lock in apps.populate, the background threads would continue to spin waiting for the lock to release. Setting them to daemon mode allows the process to exit cleanly without needing to .join those threads.
You can add a dummy app whose only purpose is to fire a custom all_apps_are_ready signal (or a method call on AppConfig).
Put this app at the end of INSTALLED_APPS.
If this app receives the AppConfig.ready() method call, you know that all other apps are ready.
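A minimal sketch of such a dummy app, with hypothetical names (an apps_ready app and an all_apps_ready signal):

# apps_ready/signals.py (hypothetical app, listed last in INSTALLED_APPS)
import django.dispatch

all_apps_ready = django.dispatch.Signal()


# apps_ready/apps.py
from django.apps import AppConfig
from apps_ready.signals import all_apps_ready


class AppsReadyConfig(AppConfig):
    name = 'apps_ready'

    def ready(self):
        # Every app listed before this one has already run its ready(),
        # so announcing readiness here is safe.
        all_apps_ready.send(sender=self.__class__)

Apps such as app_a can then connect a receiver to all_apps_ready from their own ready() methods.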
An alternative solution:
Subclass AppConfig and send a signal at the end of ready. Use this subclass in all your apps. If you have a dependency on one being loaded, hook up to that signal/sender pair.
If you need more details, don't hesitate!
There are some subtleties to this method:
1) Where to put the signal definition (I suspect manage.py would work, or you could even monkey-patch django.setup to ensure it gets called everywhere that it is). You could also put it in a core app that is always the first one in INSTALLED_APPS, or somewhere Django will always load before any AppConfigs are loaded.
2) Where to register the signal receiver (you should be able to do this in AppConfig.__init__ or possibly just globally in that file).
See https://docs.djangoproject.com/en/dev/ref/applications/#how-applications-are-loaded
Therefore, the setup is as follows:
When django first starts up, register the signal.
At the end of every AppConfig.ready(), send the signal (with the AppConfig instance as the sender).
In AppConfigs that need to respond to the signal, register a receiver in __init__ with the appropriate sender.
Let me know how it goes!
If you need it to work for third-party apps, keep in mind that you can override the AppConfigs for these apps (the convention is to place these in a directory called apps). Alternatively, you could monkey-patch AppConfig.
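A minimal sketch of this approach; the module layout and the names (SignalingAppConfig, app_config_ready) are illustrative, and the sender here is the AppConfig class rather than the instance, so receivers can filter on it when connecting:

# signals.py -- a module Django loads before any AppConfig
import django.dispatch

app_config_ready = django.dispatch.Signal()


# base.py -- shared base class used by all of your apps
from django.apps import AppConfig
from signals import app_config_ready


class SignalingAppConfig(AppConfig):
    def ready(self):
        super().ready()
        # Announce that this particular app has finished its ready() phase.
        app_config_ready.send(sender=self.__class__)


# app_a/apps.py -- app_a wants to run code once app_b is ready
class AppAConfig(SignalingAppConfig):
    name = 'app_a'

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        from app_b.apps import AppBConfig  # hypothetical AppConfig of app_b
        app_config_ready.connect(self._on_app_b_ready, sender=AppBConfig)

    def _on_app_b_ready(self, sender, **kwargs):
        ...  # app_b has finished its ready() at this point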
I am using Sentry (in a django project), and I'd like to know how I can get the errors to aggregate properly. I am logging certain user actions as errors, so there is no underlying system exception, and am using the culprit attribute to set a friendly error name. The message is templated, and contains a common message ("User 'x' was unable to perform action because 'y'"), but is never exactly the same (different users, different conditions).
Sentry clearly uses some set of attributes under the hood to determine whether to aggregate errors as the same exception, but despite having looked through the code, I can't work out how.
Can anyone short-cut my having to dig further into the code and tell me what properties I need to set in order to manage aggregation as I would like?
[UPDATE 1: event grouping]
This line appears in sentry.models.Group:
class Group(MessageBase):
    """
    Aggregated message which summarizes a set of Events.
    """
    ...

    class Meta:
        unique_together = (('project', 'logger', 'culprit', 'checksum'),)
        ...
Which makes sense - project, logger and culprit I am already setting - the problem is checksum. I will investigate further; however, 'checksum' suggests binary equivalence, which is never going to work - it must be possible to group instances of the same exception with different attributes?
[UPDATE 2: event checksums]
The event checksum comes from the sentry.manager.get_checksum_from_event method:
def get_checksum_from_event(event):
    for interface in event.interfaces.itervalues():
        result = interface.get_hash()
        if result:
            hash = hashlib.md5()
            for r in result:
                hash.update(to_string(r))
            return hash.hexdigest()
    return hashlib.md5(to_string(event.message)).hexdigest()
Next stop - where do the event interfaces come from?
[UPDATE 3: event interfaces]
I have worked out that interfaces refer to the standard mechanism for describing data passed into sentry events, and that I am using the standard sentry.interfaces.Message and sentry.interfaces.User interfaces.
Both of these will contain different data depending on the exception instance - and so a checksum will never match. Is there any way that I can exclude these from the checksum calculation? (Or at least the User interface value, as that has to be different - the Message interface value I could standardise.)
[UPDATE 4: solution]
Here are the two get_hash functions for the Message and User interfaces respectively:
# sentry.interfaces.Message
def get_hash(self):
    return [self.message]


# sentry.interfaces.User
def get_hash(self):
    return []
Looking at these two, only Message.get_hash returns a value that is picked up by the get_checksum_from_event method, and so this is the one that will be hashed. The net effect is that the checksum is evaluated on the message alone - which in theory means that I can standardise the message and keep the user definition unique.
I've answered my own question here, but hopefully my investigation is of use to others having the same problem. (As an aside, I've also submitted a pull request against the Sentry documentation as part of this ;-))
(Note to anyone using / extending Sentry with custom interfaces - if you want to avoid your interface being used to group exceptions, return an empty list.)
See my final update in the question itself. Events are aggregated on a combination of 'project', 'logger', 'culprit' and 'checksum' properties. The first three of these are relatively easy to control - the fourth, 'checksum' is a function of the type of data sent as part of the event.
Sentry uses the concept of 'interfaces' to control the structure of data passed in, and each interface comes with an implementation of get_hash, which is used to return a hash value for the data passed in. Sentry comes with a number of standard interfaces ('Message', 'User', 'HTTP', 'Stacktrace', 'Query', 'Exception'), and each of these has its own implementation of get_hash. The default (inherited from the Interface base class) is an empty list, which does not affect the checksum.
In the absence of any valid interfaces, the event message itself is hashed and returned as the checksum, meaning that events are only grouped together when their messages are identical.
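To make the grouping concrete, here is a stdlib-only sketch that mirrors the fallback path quoted above (the checksum_for helper and the sample messages are purely illustrative, not Sentry code):

import hashlib

def checksum_for(message):
    # Mirrors the fallback in get_checksum_from_event: when only the Message
    # interface contributes to the hash, the checksum is derived from the
    # message text alone.
    return hashlib.md5(message.encode('utf-8')).hexdigest()

template = "User was unable to perform action because of a permissions error"
verbose = "User 'alice' was unable to perform action because of a permissions error"

checksum_for(template) == checksum_for(template)  # True: these events group together
checksum_for(template) == checksum_for(verbose)   # False: interpolating the user splits the group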
I've had a similar problem with exceptions. Our system currently captures only exceptions, and I was confused as to why some of these were merged into a single error while others were not.
Using the information above, I extracted the get_hash methods and tried to find the differences 'raising' my errors. What I found out is that the grouped errors all came from a self-written exception type that has an empty Exception.message value.
get_hash output:
[<class 'StorageException'>, StorageException()]
and the multiple separate errors came from an exception class that has a filled-in message value (the Jinja template engine):
[<class 'jinja2.exceptions.UndefinedError'>, UndefinedError('dict object has no attribute LISTza_*XYZ*',)]
Different exception messages trigger different reports; in my case the merging was caused by the missing Exception.message value.
Implementation:
class StorageException(Exception):
    def __init__(self, value):
        Exception.__init__(self)
        self.value = value
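If you actually want those StorageException events to stop being merged, a sketch of one possible fix (assuming, as observed above, that the empty message is what drives the shared hash) is to pass the value through to the base class:

class StorageException(Exception):
    def __init__(self, value):
        # Passing the value to Exception.__init__ populates the exception's
        # args, so its repr - which feeds get_hash - is no longer the bare
        # StorageException(), and distinct errors get distinct checksums.
        Exception.__init__(self, value)
        self.value = value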