MessageReceiver2topic/Subscriptions/mutationProcessor' is force detached - azure-servicebus-topics

We are having issues with an import process, and after investigation it looks like it's related to issues with the Service Bus.
Exception #1
The link '678ed7a8-31d5-4236-8cc1-a316c3329943;42:47:48:source(address:topic/Subscriptions/mutationProcessor):MessageReceiver2topic/Subscriptions/mutationProcessor' is force detached. Code: RenewToken. Details: Unauthorized access. 'Listen' claim(s) are required to perform this operation. Resource: 'sb://asb.servicebus.windows.net/topic/subscriptions/mutationprocessor'.. TrackingId:b352c1a9858b497b869a58a7be09ae2a_G12, SystemTracker:gateway7, Timestamp:2020-07-27T15:18:02
Expectation:
What happens to messages that are sent while the subscription's receiver is detached?
Workarounds/fixes?
GitHub links for reference:
https://github.com/Azure/azure-sdk-for-net/issues/11619
https://github.com/Azure/azure-sdk-for-net/issues/8884
The connection was inactive for more than the allowed 60000 milliseconds and is closed by container '1c7fe518-491a-47dd-aa5e-5ae96f0245df'.
Below is a Microsoft documentation link that describes a similar, though not identical, message.
https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-amqp-troubleshoot
Expectation:
Would this cause any issues if messages were sent while the connection is inactive? If so, please suggest a way to handle it.
GitHub links for reference:
https://github.com/Azure/azure-service-bus-java/issues/280
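As for what happens to the messages: Service Bus topics are durable, so messages published while a subscription's receiver link is force-detached (or while an idle AMQP connection has been closed) stay in the subscription until they are received, dead-lettered or expire; once the missing 'Listen' claim on the SAS policy/token is fixed, a reconnecting receiver picks them up. Below is a minimal sketch, assuming the newer Azure.Messaging.ServiceBus SDK and a placeholder connection string (the topic and subscription names are taken from the error above), of a processor that surfaces these errors through its error handler while the SDK retries the link:

// Minimal sketch, assuming the Azure.Messaging.ServiceBus SDK; the connection string is a placeholder.
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

class Program
{
    static async Task Main()
    {
        // The connection string must come from a SAS policy that includes the 'Listen' claim.
        await using var client = new ServiceBusClient("<connection-string-with-Listen-claim>");

        // Topic "topic", subscription "mutationProcessor" (from the error's resource path).
        await using var processor = client.CreateProcessor("topic", "mutationProcessor");

        processor.ProcessMessageAsync += async args =>
        {
            // Messages published while the receiver link was detached are still delivered
            // here once the link is re-established; they were stored, not lost.
            Console.WriteLine($"Received {args.Message.MessageId}");
            await args.CompleteMessageAsync(args.Message);
        };

        processor.ProcessErrorAsync += args =>
        {
            // Force-detach and idle-timeout errors are reported here while the SDK
            // retries the underlying AMQP link/connection in the background.
            Console.WriteLine($"{args.ErrorSource}: {args.Exception.Message}");
            return Task.CompletedTask;
        };

        await processor.StartProcessingAsync();
        Console.ReadKey();
        await processor.StopProcessingAsync();
    }
}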

Related

ReadEventsAsync gets EventHubsException(ConsumerDisconnected) intermittently

I am using the EventHubConsumerClient.ReadEventsAsync method to read events from an Event Hub. It works perfectly when I use the default Event Hub. However, when I route it to a new Event Hub I get an EventHubsException(ConsumerDisconnected) from time to time. The documentation says this happens because "A client was forcefully disconnected from an Event Hub instance. This typically occurs when another consumer with higher OwnerLevel asserts ownership over the partition and consumer group." I get this exception almost every time; it only works occasionally. Does anyone know how to resolve this? Or is there a better way to read messages from an Event Hub? I don't want to use EventProcessorClient, since it requires a BlobContainerClient.
For the code, I followed the sample:
await using var consumerClient = new EventHubConsumerClient(
    EventHubConsumerClient.DefaultConsumerGroupName,
    eventHubConnectionString,
    eventHubName);

await foreach (PartitionEvent partitionEvent in consumerClient.ReadEventsAsync(cancelToken))
{
    ...
}
The error that you're seeing is very specific to a single scenario: another client has opened an AMQP link to one of the partitions you're reading from and has requested that the Event Hubs service give it exclusive access. This results in the Event Hubs service terminating your link with an AMQP error code of Stolen, which the Event Hubs SDK translates into the form that you're seeing.
These requests for exclusive access are enforced on a consumer group level. In your snippet, you're using the default consumer group, which is apparently also used by other consumers. As a best practice, I'd recommend that you create a unique consumer group for each application that is reading from the Event Hub - unless you specifically want them to interact.
In your case, your client is not requesting exclusive access, so any client that does request it will take precedence. If you create a new consumer group and configure your client to use it, I would expect the disconnect errors to stop.
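As a rough illustration of that suggestion, reusing the variable names from the snippet in the question; "my-app-consumers" is a placeholder for a consumer group you would first create on the Event Hub:

using System;
using System.Threading;
using Azure.Messaging.EventHubs.Consumer;

// Read from a dedicated consumer group instead of $Default, so exclusive readers
// owned by other applications cannot steal the partitions you are reading.
await using var consumerClient = new EventHubConsumerClient(
    "my-app-consumers",          // placeholder: create this consumer group on the Event Hub first
    eventHubConnectionString,
    eventHubName);

using var cancelSource = new CancellationTokenSource(TimeSpan.FromSeconds(90));

await foreach (PartitionEvent partitionEvent in consumerClient.ReadEventsAsync(cancelSource.Token))
{
    // process partitionEvent.Data here
}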

How can I filter out errors on Sentry to avoid consuming my quota?

I'm using Sentry to log my errors, but there are errors I'm not able to fix (or that can't be fixed by me), like
OSError (write error)
or errors that come from RQ (each time I deploy my app)
or client errors (which are the client's own errors).
I can't just ignore them, because they consume all my quota. How can I filter out these errors?
Here are some references for interested people:
uwsgi: OSError: write error during GET request
Fixing broken pipe error in uWSGI with Python
https://github.com/unbit/uwsgi/issues/1623
I created a Gist for rate limiting the number of events that are sent to Sentry:
https://gist.github.com/jurrian/e22f8e724b8499a29c5537e956f0dc7f
It uses ratelimitingfilter, which can be configured to set a rate per minute and, additionally, to add a burst so that rate limiting only starts after a number of events.
I get the same errors, but I never had any problems with my quota. If you really want to filter them, you can just do it in your SDK:
https://docs.sentry.io/error-reporting/configuration/filtering/?platform=python
But beware, this could hide other errors as mentioned here:
https://github.com/pypa/warehouse/issues/679
To save yourself some quota, you have two options:
Avoid forwarding events client side, thus preventing events from being sent to Sentry at all. Have a look at the docs for available client-side filters. The drawback with this approach is, of course, that you need a new code deployment for any adjustment of client-side filters, and some clients may not instantly reflect your code changes.
Avoid forwarding events on Sentry's side, via inbound filters ([Project] > Project Settings > Inbound Filters). According to the Sentry documentation on quota usage, events filtered via inbound filters do not affect your quota.
Inbound filters include:
Common browser extension errors
Events coming from localhost
Known legacy browsers errors
Known web crawlers
By their error message
From specific release versions of your code
From certain IP addresses
Business plans and above also allow filtering events by error message.

Random “upstream connect error or disconnect/reset before headers” between services with Istio 1.3

So, this problem is happening randomly (it seems) and between different services.
For example, we have a service A which needs to talk to a service B, and sometimes we get this error, but after a while the error goes away. It doesn't happen too often.
When this happens, we see the error log in service A throwing the "upstream connect error" message, but none in service B. So we think it might be related to the sidecars.
One thing we noticed is that in service B we get a lot of these error messages in the istio-proxy container:
[src/istio/mixerclient/report_batch.cc:109] Mixer Report failed with: UNAVAILABLE:upstream connect error or disconnect/reset before headers. reset reason: connection failure
According to the documentation, when a request comes in, Envoy asks Mixer if everything is good (authorization and other things), and if Mixer doesn't reply, the request does not succeed. That's why there is an option called policyCheckFailOpen.
We have that set to false, which I guess is a sane default; we don't want the request to go through if Mixer cannot be reached. But why can't it be reached?
disablePolicyChecks: true
policyCheckFailOpen: false
controlPlaneSecurityEnabled: false
NOTE: istio-policy is running with the istio-proxy sidecar. Is that correct?
We don't see that error in some other services, which can also fail.
Another log entry that I see a lot, and this one happens in all the services not running as root with fsGroup defined in the YAML files, is:
watchFileEvents: "/etc/certs": MODIFY|ATTRIB
watchFileEvents: "/etc/certs/..2020_02_10_09_41_46.891624651": MODIFY|ATTRIB
watchFileEvents: notifying
One of the leads I'm chasing is the default circuitBreakers values. Could that be related to this?
Thanks
The error you are seeing is caused by a failure to establish a connection to istio-policy.
Based on this GitHub issue, community members added two answers that could help you with your issue:
If mTLS is enabled globally, make sure you set controlPlaneSecurityEnabled: true.
I was facing the same issue; then I read about protocol selection. I realised that the name of the port in the service definition should start with the protocol, for example http-. This fixed the issue for me. If you still face the issue, you might need to look at the tls-check for the pods and resolve it using destination rules and policies.
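For illustration only, a sketch of the port-naming convention mentioned in that answer (service and port names are placeholders); with Istio's protocol selection, a port name of the form http-<suffix> tells the sidecar to treat the traffic as HTTP:

apiVersion: v1
kind: Service
metadata:
  name: service-b            # placeholder
spec:
  selector:
    app: service-b           # placeholder
  ports:
    - name: http-web         # "http-" prefix drives Istio's protocol selection
      port: 80
      targetPort: 8080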
istio-policy is running with the istio-proxy sidecar. Is that correct?
Yes, I just checked it and it's with sidecar.
Let me know if that helps.

Worker role using Event Hubs gives 'No connection handler was found for virtual host'

I have a worker role that uses an EventProcessorHost to ingest data from an EventHub. I frequently receive error messages of the following kind:
Microsoft.ServiceBus.Messaging.MessagingCommunicationException:
No connection handler was found for virtual host 'myservicebusnamespace.servicebus.windows.net:42777'. Remote container id is 'f37c72ee313c4d658588ad9855773e51'. TrackingId:1d200122575745cc89bb714ffd533b6d_B5_B5, SystemTracker:SharedConnectionListener, Timestamp:8/29/2016 6:13:45 AM
at Microsoft.ServiceBus.Common.ExceptionDispatcher.Throw(Exception exception)
at Microsoft.ServiceBus.Common.Parallel.TaskHelpers.EndAsyncResult(IAsyncResult asyncResult)
at Microsoft.ServiceBus.Messaging.IteratorAsyncResult`1.StepCallback(IAsyncResult result)
I can't seem to find a way to catch this exception. It seems I can just ignore the error, because everything works as expected. (I had previously mentioned here that it was dropping messages because of this error, but I have since found out that a bug in the software that sends the messages caused this problem.) However, I would like to know what causes these errors, since they clog up my logging now and then.
Can anyone shed some light on the cause?
The Event Hub partitions are distributed across multiple servers. They sometimes move due to load balancing, upgrades, and other reasons. When this happens, the client connection is lost with this error. The connection will be re-established very quickly, so you should not see any issues with message processing. It is safe to ignore this communication error.
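If you still want these transient errors to show up in your own logs rather than as unhandled noise, the Microsoft.ServiceBus.Messaging SDK from your stack trace reports them through the ExceptionReceived event on EventProcessorOptions. A rough sketch, with placeholder names for the host, hub, connection strings and your IEventProcessor implementation:

using System;
using Microsoft.ServiceBus.Messaging;

var host = new EventProcessorHost(
    "myWorkerRoleInstance",                  // placeholder host name
    "myeventhub",                            // placeholder Event Hub path
    EventHubConsumerGroup.DefaultGroupName,
    eventHubConnectionString,                // placeholder
    storageConnectionString);                // placeholder

var options = new EventProcessorOptions();
options.ExceptionReceived += (sender, e) =>
{
    // Transient MessagingCommunicationExceptions (e.g. a partition moving to
    // another server) land here; log them at low severity and move on.
    Console.WriteLine("{0}: {1}", e.Action, e.Exception.Message);
};

await host.RegisterEventProcessorAsync<MyEventProcessor>(options);  // MyEventProcessor is your IEventProcessor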

Spring Integration Pop3MailReceiver stops polling silently without logging why

Problem
I have a very basic configuration for a Spring integration mail adapter setup (below is the relevant sample):
<int:channel id="emailChannel">
    <int:interceptors>
        <int:wire-tap channel="logger"/>
    </int:interceptors>
</int:channel>

<mail:inbound-channel-adapter id="popChannel"
    store-uri="pop3://user:password#domain.net/INBOX"
    channel="emailChannel"
    should-delete-messages="true"
    auto-startup="true">
    <int:poller max-messages-per-poll="1" fixed-rate="30000"/>
</mail:inbound-channel-adapter>

<int:logging-channel-adapter id="logger" level="DEBUG"/>

<int:service-activator input-channel="emailChannel" ref="mailResultsProcessor" method="onMessage"/>
This is working fine the majority of the time, and I can see the logs showing the polling (it also works fine hooking into my mailResultsProcessor when a mail is there):
2013-08-13 08:19:29,748 [task-scheduler-3] DEBUG org.springframework.integration.mail.Pop3MailReceiver - opening folder [pop3://user:password#fomain.net/INBOX]
2013-08-13 08:19:29,796 [task-scheduler-3] INFO org.springframework.integration.mail.Pop3MailReceiver - attempting to receive mail from folder [INBOX]
2013-08-13 08:19:29,796 [task-scheduler-3] DEBUG org.springframework.integration.mail.Pop3MailReceiver - found 0 new messages
2013-08-13 08:19:29,796 [task-scheduler-3] DEBUG org.springframework.integration.mail.Pop3MailReceiver - Received 0 messages
2013-08-13 08:19:29,893 [task-scheduler-3] DEBUG org.springframework.integration.endpoint.SourcePollingChannelAdapter - Received no Message during the poll, returning 'false'
The problem I have is that the polling stops during the day, with no indication in the logs of why it has stopped working. The only way I can tell is that the debug output above is no longer present in the logs and e-mails build up in the e-mail account.
Questions
Has anyone seen this before, and do you know how to resolve it?
Is there a change I can make in my configuration to capture the issue in the log? I thought the logging channel adapter set to DEBUG would have this covered.
Using version 2.2.3.RELEASE of Spring Integration on Tomcat 7; log output defaults to catalina.out. Deployed on a standard AWS Tomcat 7 instance.
Most likely the poller thread is hung someplace upstream. With your configuration, the next poll won't happen until the current poll completes.
You can use jstack or VisualVM to get a thread dump to find out what the thread is doing.
Another possibility is that you are suffering from poller thread starvation, if you have a lot of other polled elements in your application (depending on their configuration). The default taskScheduler bean has only 10 threads.
You can add a task executor to the <poller/> so each poll is handed off to another thread, but be aware that that can result in concurrent polls if a polled task takes longer to execute than the polling rate.
To resolve this problem specifically, I used the configuration below:
<mail:inbound-channel-adapter id="popChannel"
    store-uri="pop3://***/INBOX"
    channel="emailChannel"
    should-delete-messages="true"
    auto-startup="true">
    <int:poller max-messages-per-poll="5" fixed-rate="60000" task-executor="pool"/>
</mail:inbound-channel-adapter>

<task:executor id="pool" pool-size="10" keep-alive="50"/>
Once we moved to this approach we saw no further problems, and, as with any use of a pool, the advantage is that any threads that become a problem are cleaned up and recreated.