EventHub EventDataBatch exceeds MaxMessageSize - azure-eventhub

I am using the Microsoft.Azure.EventHubs 2.0.0 NuGet package to add messages to a batch:
private EventDataBatch _currentBatch;
_currentBatch = _eventHubClient.CreateBatch();
...
var json = @event.ToEventJson();
var data = new EventData(Encoding.UTF8.GetBytes(json));
if (_currentBatch.TryAdd(data))
{
    return;
}
But the created batch sometimes (not on all created batches) throws:
{Microsoft.Azure.EventHubs.MessageSizeExceededException: The received
message (delivery-id:0, size:262192 bytes) exceeds the limit (262144
bytes) currently allowed on the link.
during:
await _eventHubClient.SendBatchAsync(batch);
It is deterministic, as I have a set of data that always throws this error during the tests.
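For anyone hitting this: the snippet above returns when TryAdd succeeds but never handles the case where it returns false, and the 262144-byte link limit also has to cover AMQP framing overhead on top of the raw event bodies. Below is a minimal sketch of the usual "send the full batch, then start a new one" pattern; the 240 KB headroom figure and the AddAsync/MyEvent names are illustrative assumptions, not part of the question:
private EventDataBatch _currentBatch;

private async Task AddAsync(MyEvent @event) // MyEvent is a hypothetical stand-in for your event type
{
    var data = new EventData(Encoding.UTF8.GetBytes(@event.ToEventJson()));
    if (_currentBatch == null)
    {
        // Leave headroom below the 256 KB link limit for per-message framing overhead.
        _currentBatch = _eventHubClient.CreateBatch(new BatchOptions { MaxMessageSize = 240 * 1024 });
    }
    if (_currentBatch.TryAdd(data))
    {
        return;
    }
    // The batch is full: flush it, then retry the event in a fresh batch.
    await _eventHubClient.SendAsync(_currentBatch); // SendAsync(EventDataBatch) in Microsoft.Azure.EventHubs
    _currentBatch = _eventHubClient.CreateBatch(new BatchOptions { MaxMessageSize = 240 * 1024 });
    if (!_currentBatch.TryAdd(data))
    {
        throw new InvalidOperationException("A single event exceeds the batch size limit.");
    }
}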

Related

Azure Webjob, KeyVault Configuration extension, Socket Error

Need some help determining whether this is a bug in my code or in the Key Vault configuration extensions.
I have a .NET Core console-based WebJob. All was working fine until a few weeks ago, when we started getting occasional startup errors: Socket Error 10060 - socket timed out, or "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond".
These were all related to loading configuration layers (app settings, environment, command line and Key Vault). The errors stemmed from the Key Vault provider once Build() was executed on the host builder.
I initially added the retry policy with the default HttpStatusCodeErrorDetectionStrategy and an exponential back-off, but it never executes.
Finally, I added my own retry policy with my own detection strategy (see below). It is still not fired.
I have stripped the code down to a hello-world-like example and included the messages from the WebJob.
Here is the code summary:
Main
public static async Task<int> Main(string[] args)
{
    var host = CreateHostBuilder(args)
        .UseConsoleLifetime()
        .Build();

    using var serviceScope = host.Services.CreateScope();
    var services = serviceScope.ServiceProvider;

    //**stripped down to logging just for debug
    var loggerFactory = host.Services.GetRequiredService<ILoggerFactory>();
    var logger = loggerFactory.CreateLogger("Main");
    logger.LogDebug("Hello Test App Started OK. Exiting.");
    //**Normally lots of service calls go here to do real work**
    return 0;
}
HostBuilder - why a HostBuilder? We use lots of components that are built for Web API and web apps, so it was convenient to use a similar services model.
public static IHostBuilder CreateHostBuilder(string[] args)
{
    var host = Host
        .CreateDefaultBuilder(args)
        .ConfigureAppConfiguration((ctx, config) =>
        {
            // override with Key Vault
            var azureServiceTokenProvider = new AzureServiceTokenProvider(); // this is awesome - it will use MSI or the Visual Studio connection
            var keyVaultClient = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(azureServiceTokenProvider.KeyVaultTokenCallback));
            var retryPolicy = new RetryPolicy<ServerErrorDetectionStrategy>(
                new ExponentialBackoffRetryStrategy(
                    retryCount: 5,
                    minBackoff: TimeSpan.FromSeconds(1.0),
                    maxBackoff: TimeSpan.FromSeconds(16.0),
                    deltaBackoff: TimeSpan.FromSeconds(2.0)
                )
            );
            retryPolicy.Retrying += RetryPolicy_Retrying;
            keyVaultClient.SetRetryPolicy(retryPolicy);
            var prebuiltConfig = config.Build();
            config.AddAzureKeyVault(prebuiltConfig.GetSection("KeyVaultSettings").GetValue<string>("KeyVaultUri"), keyVaultClient, new DefaultKeyVaultSecretManager());
            config.AddCommandLine(args);
        })
        .ConfigureLogging((ctx, loggingBuilder) => // note - this runs AFTER app configuration, whatever order it is in
        {
            loggingBuilder.ClearProviders();
            loggingBuilder
                .AddConsole()
                .AddDebug()
                .AddApplicationInsightsWebJobs(config => config.InstrumentationKey = ctx.Configuration["APPINSIGHTS_INSTRUMENTATIONKEY"]);
        })
        .ConfigureServices((ctx, services) =>
        {
            services
                .AddApplicationInsightsTelemetry();
            services
                .AddOptions();
        });
    return host;
}
Event - this is never fired.
private static void RetryPolicy_Retrying(object sender, RetryingEventArgs e)
{
    Console.WriteLine($"Retrying, count = {e.CurrentRetryCount}, Last Exception={e.LastException}, Delay={e.Delay}");
}
Detection strategy - this only fires for the non-MSI attempt to contact the Key Vault.
public class ServerErrorDetectionStrategy : ITransientErrorDetectionStrategy
{
    public bool IsTransient(Exception ex)
    {
        if (ex != null)
        {
            Console.WriteLine($"Exception {ex.Message} received, {ex.GetType()?.FullName}");
            HttpRequestWithStatusException httpException;
            if ((httpException = ex as HttpRequestWithStatusException) != null)
            {
                switch (httpException.StatusCode)
                {
                    case HttpStatusCode.RequestTimeout:
                    case HttpStatusCode.GatewayTimeout:
                    case HttpStatusCode.InternalServerError:
                    case HttpStatusCode.ServiceUnavailable:
                        return true;
                }
            }
            SocketException socketException;
            if ((socketException = (ex as SocketException)) != null)
            {
                Console.WriteLine($"Exception {socketException.Message} received, Error Code: {socketException.ErrorCode}, SocketErrorCode: {socketException.SocketErrorCode}");
                if (socketException.SocketErrorCode == SocketError.TimedOut)
                {
                    return true;
                }
            }
        }
        return false;
    }
}
WebJob Output
[SYS INFO] Status changed to Initializing
[SYS INFO] Run script 'run.cmd' with script host - 'WindowsScriptHost'
[SYS INFO] Status changed to Running
[INFO]
[INFO] D:\local\Temp\jobs\triggered\HelloWebJob\42wj5ipx.ukj>dotnet HelloWebJob.dll
[INFO] Exception Response status code indicates server error: 401 (Unauthorized). received, Microsoft.Rest.TransientFaultHandling.HttpRequestWithStatusException
[INFO] Exception A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. received, System.Net.Http.HttpRequestException
[ERR ] Unhandled exception. System.Net.Http.HttpRequestException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[ERR ] ---> System.Net.Sockets.SocketException (10060): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[ERR ] at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
[ERR ] --- End of inner exception stack trace ---
[ERR ] at Microsoft.Rest.RetryDelegatingHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
[ERR ] at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
[ERR ] at Microsoft.Azure.KeyVault.KeyVaultClient.GetSecretWithHttpMessagesAsync(String vaultBaseUrl, String secretName, String secretVersion, Dictionary`2 customHeaders, CancellationToken cancellationToken)
[ERR ] at Microsoft.Azure.KeyVault.KeyVaultClientExtensions.GetSecretAsync(IKeyVaultClient operations, String secretIdentifier, CancellationToken cancellationToken)
[ERR ] at Microsoft.Extensions.Configuration.AzureKeyVault.AzureKeyVaultConfigurationProvider.LoadAsync()
[ERR ] at Microsoft.Extensions.Configuration.AzureKeyVault.AzureKeyVaultConfigurationProvider.Load()
[ERR ] at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
[ERR ] at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
[ERR ] at Microsoft.Extensions.Hosting.HostBuilder.BuildAppConfiguration()
[ERR ] at Microsoft.Extensions.Hosting.HostBuilder.Build()
[ERR ] at HelloWebJob.Program.Main(String[] args) in C:\Users\mark\Source\Repos\HelloWebJob\HelloWebJob\Program.cs:line 21
[ERR ] at HelloWebJob.Program.<Main>(String[] args)
[SYS INFO] Status changed to Failed
[SYS ERR ] Job failed due to exit code -532462766
This is an issue in the Key Vault connectivity which has been identified by the product group (PG). Below is an official statement from the Product Group:
The Microsoft Azure App Service Team has identified an issue with the
Key Vault references for App Service and Azure Functions feature
related to intermittent failure to resolve references at runtime.
Engineers identified a regression in the system that reduced the
performance and availability of our scale unit’s ability to retrieve
key vault references at runtime. A patch has been written and deployed
to our fleet of VMs to mitigate this issue.
We are continuously taking steps to improve the Azure Web App service
and our processes to ensure such incidents do not occur in the future,
and in this case it includes (but is not limited to):
- Improving detection and testing of the performance and availability of the Key Vault App Setting References feature
- Improvements to our platform to ensure high availability of this feature at runtime
We apologize for any inconvenience.
For almost everyone, updating packages to the new Microsoft.Azure packages has mitigated this issue, so trying those would be my first suggestion.
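For reference, a minimal sketch of the same configuration block on the newer Azure.* packages (an assumption on my part: Azure.Identity plus Azure.Extensions.AspNetCore.Configuration.Secrets; the KeyVaultSettings:KeyVaultUri key is carried over from the question). DefaultAzureCredential covers MSI and the Visual Studio sign-in, and the new SecretClient retries transient failures out of the box:
.ConfigureAppConfiguration((ctx, config) =>
{
    var prebuiltConfig = config.Build();
    var vaultUri = new Uri(prebuiltConfig.GetSection("KeyVaultSettings").GetValue<string>("KeyVaultUri"));
    // DefaultAzureCredential is from Azure.Identity; AddAzureKeyVault and
    // KeyVaultSecretManager are from Azure.Extensions.AspNetCore.Configuration.Secrets.
    config.AddAzureKeyVault(new SecretClient(vaultUri, new DefaultAzureCredential()), new KeyVaultSecretManager());
    config.AddCommandLine(args);
})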
Thanks @HarshitaSingh-MSFT, makes sense, though I searched for this when I had the problem and couldn't find it.
As a workaround, I added some basic retry code to the startup.
Main looks like this for now:
public static async Task<int> Main(string[] args)
{
    IHost host = null;
    int retries = 5;
    while (true)
    {
        try
        {
            Console.WriteLine("Building Host...");
            var hostBuilder = CreateHostBuilder(args)
                .UseConsoleLifetime();
            host = hostBuilder.Build();
            break;
        }
        catch (HttpRequestException hEx)
        {
            Console.WriteLine($"HTTP Exception in host builder. {hEx.Message}, Name:{hEx.GetType().Name}");
            SocketException se;
            if ((se = hEx.InnerException as SocketException) != null)
            {
                if (se.SocketErrorCode == SocketError.TimedOut)
                {
                    Console.WriteLine($"Socket error in host builder. Retrying...");
                    if (retries > 0)
                    {
                        retries--;
                        await Task.Delay(5000);
                        host?.Dispose();
                    }
                    else
                    {
                        throw;
                    }
                }
                else
                {
                    throw;
                }
            }
            else
            {
                // The inner exception is not a socket error; rethrow rather than looping forever.
                throw;
            }
        }
    }
    using var serviceScope = host.Services.CreateScope();
    var services = serviceScope.ServiceProvider;
    var transferService = services.GetRequiredService<IRunPinTransfer>();
    var result = await transferService.ProcessAsync();
    return result;
}

How to set the max receive message length in Google Cloud request?

I have the following Google Cloud call:
var builder = new TextToSpeechClientBuilder();
builder.JsonCredentials = @"...";
var client = builder.Build();
var data = client.SynthesizeSpeech(new SynthesisInput { Ssml = text },
    new VoiceSelectionParams { LanguageCode = culture.TwoLetterISOLanguageName },
    new AudioConfig { AudioEncoding = AudioEncoding.Linear16, SampleRateHertz = 8000 }).AudioContent;
It throws the following exception:
Grpc.Core.RpcException: 'Status(StatusCode="ResourceExhausted", Detail="Received message larger than max (4675411 vs. 4194304)")'
The request is less than 2000 bytes, so it seems that the response is too big. The server wants to send the response, but the client can't accept it.
How can I increase this limit?
UPDATE: Since version 2.2.0 it is possible to set the max response size:
var channelOptions = GrpcChannelOptions.Empty
    .WithKeepAliveTime(TimeSpan.FromMinutes(1))
    .WithEnableServiceConfigResolution(false)
    .WithMaxReceiveMessageSize(1024 * 1024 * 1024);
var builder = new TextToSpeechClientBuilder();
builder.JsonCredentials = jsonCredentials;
builder.GrpcChannelOptions = channelOptions;
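The client is then built from the same builder; completing the snippet with the synthesis call from the original question:
var client = builder.Build(); // picks up the enlarged max receive size
var data = client.SynthesizeSpeech(new SynthesisInput { Ssml = text },
    new VoiceSelectionParams { LanguageCode = culture.TwoLetterISOLanguageName },
    new AudioConfig { AudioEncoding = AudioEncoding.Linear16, SampleRateHertz = 8000 }).AudioContent;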
In order to overcome the error Received message larger than max (4675411 vs. 4194304), you need to set the inbound message size to a greater value. You can do it when instantiating your builder, similarly to when creating a channel. Below is how you can do it in Java:
textToSpeechClient = TextToSpeechClient.create(TextToSpeechSettings
    .newBuilder()
    .setTransportChannelProvider(
        TextToSpeechSettings.defaultGrpcTransportProviderBuilder()
            .setMaxInboundMessageSize(8790801)
            .build())
    .build());
Here is the documentation for the setMaxInboundMessageSize method.
UPDATE: As I understand it, you want to use C#, and since I could not find a method to set the inbound message size such as in Java (above), I have opened a public issue with Google. Although I do not have an ETA for it, you can keep track of it here.

AWS lambda send partial response

I have a lambda function which does a series of actions. I have a react application which triggers the lambda function.
Is there a way I can send a partial response from the lambda function after each action is complete?
const testFunction = async (event, context) => { // handler must be async to use await
    let partialResponse1 = await action1(event);
    // send partial response to client
    let partialResponse2 = await action2(partialResponse1);
    // send partial response to client
    let partialResponse3 = await action3(partialResponse2);
    // send partial response to client
    let response = await action4(partialResponse3);
    // send final response
}
Is this possible in Lambda functions? If so, how can we do this? Any reference docs or sample code would be a great help.
Thanks.
Note: This is a fairly simple case of showing a loader with a percentage on the client side. I don't want to overcomplicate things with SQS or Step Functions.
I am still looking for an answer for this.
From what I understand, you're using API Gateway + Lambda and are looking to show the progress of the Lambda via the UI.
Since each step must finish before the next step begins, I see no reason not to call the Lambda 4 times, or to split it into 4 separate Lambdas.
E.g.:
// Not real syntax!
try {
    res1 = await ajax.post(/process, {stage: 1, data: ... });
    out(stage 1 complete);
    res2 = await ajax.post(/process, {stage: 2, data: res1});
    out(stage 2 complete);
    res3 = await ajax.post(/process, {stage: 3, data: res2});
    out(stage 3 complete);
    res4 = await ajax.post(/process, {stage: 4, data: res3});
    out(stage 4 complete);
    out(process finished);
} catch(err) {
    out(stage {$err.stage-number} failed to complete);
}
If you still want all 4 calls to be executed during the same Lambda execution, you may do the following (this is especially true if the process is expected to be very long, since it's usually not good practice to keep a long-hanging HTTP transaction open).
You may implement it by saving the "progress" in a database, and when the process is complete, saving the results to the database as well.
All you need to do then is query the status every X seconds.
// Not real syntax
Gateway-API --> lambda1 - startProcess(): returns ID {
    uuid = randomUUID();
    write to dynamoDB { status: starting };
    send sqs-message-to-start-process(data, uuid);
    return response { uuid: uuid };
}

SQS --> lambda2 - execute(): returns void {
    try {
        let partialResponse1 = await action1(event);
        write to dynamoDB { status: action 1 complete };
        let partialResponse2 = await action2(partialResponse1);
        write to dynamoDB { status: action 2 complete };
        let partialResponse3 = await action3(partialResponse2);
        write to dynamoDB { status: action 3 complete };
        let response = await action4(partialResponse3);
        write to dynamoDB { status: action 4 complete, response: response };
    } catch(err) {
        write to dynamoDB { status: failed, error: err };
    }
}

Gateway-API --> lambda3 - getStatus(uuid): returns status {
    return status from dynamoDB (uuid);
}
Your UI Code:
res = ajax.get(/startProcess);
uuid = res.uuid;
in interval every X (e.g. 3) seconds {
    status = ajax.get(/getStatus?uuid=uuid);
    show(status);
    if (status.error) {
        handle(status.error) and break;
    }
    if (status.response) {
        handle(status.response) and break;
    }
}
Just remember that Lambdas cannot exceed 15 minutes of execution. Therefore, you need to be 100% certain that whatever the process does, it never exceeds this hard limit.
What you are looking for is to have the response exposed as a stream that you can write to and flush.
Unfortunately, it's not available in Node.js:
How to stream AWS Lambda response in node?
https://docs.aws.amazon.com/lambda/latest/dg/programming-model.html
But you can still do the streaming if you use Java:
https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-stream.html
package example;

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestStreamHandler;

public class Hello implements RequestStreamHandler {
    // RequestStreamHandler's method is handleRequest, and IOException must be imported.
    public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context) throws IOException {
        int letter;
        while ((letter = inputStream.read()) != -1) {
            outputStream.write(Character.toUpperCase(letter));
        }
    }
}
Aman,
You can push the partial outputs into SQS and read the SQS messages to process those messages. This is a simple and scalable architecture. AWS provides SQS SDKs in different languages, for example JavaScript, Java, Python, etc.
Reading from and writing to SQS is very easy using the SDK, and it can be implemented server-side or in your UI layer (with proper IAM); see the sketch below.
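As a rough sketch of that pattern with the .NET SDK (AWSSDK.SQS) - the queue URL and message shape here are placeholders, and the JavaScript/Java/Python SDKs follow the same send/receive/delete flow:
using Amazon.SQS;
using Amazon.SQS.Model;

var sqs = new AmazonSQSClient();
var queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/progress-queue"; // placeholder

// Producer (inside the Lambda): publish a progress message after each action.
await sqs.SendMessageAsync(new SendMessageRequest
{
    QueueUrl = queueUrl,
    MessageBody = "{\"stage\": 1, \"status\": \"complete\"}"
});

// Consumer (server side or UI layer, with proper IAM): long-poll for progress.
var response = await sqs.ReceiveMessageAsync(new ReceiveMessageRequest
{
    QueueUrl = queueUrl,
    WaitTimeSeconds = 10,
    MaxNumberOfMessages = 10
});
foreach (var message in response.Messages)
{
    Console.WriteLine(message.Body);
    await sqs.DeleteMessageAsync(queueUrl, message.ReceiptHandle); // delete after processing to avoid redelivery
}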
I found that AWS Step Functions may be what you need:
AWS Step Functions lets you coordinate multiple AWS services into serverless workflows so you can build and update apps quickly.
Check this link for more detail:
In our example, you are a developer who has been asked to create a serverless application to automate handling of support tickets in a call center. While you could have one Lambda function call the other, you worry that managing all of those connections will become challenging as the call center application becomes more sophisticated. Plus, any change in the flow of the application will require changes in multiple places, and you could end up writing the same code over and over again.

Google IOT per device heartbeat alert using Stackdriver

I'd like to alert on the lack of a heartbeat (or 0 bytes received) from any one of a large number of Google IoT Core devices. I can't seem to do this in Stackdriver. It instead appears to let me alert on the entire device registry, which does not give me what I'm looking for (how would I know that a particular device is disconnected?).
So how does one go about doing this?
I have no idea why this question was downvoted as "too broad".
The truth is that Google IoT doesn't have per-device alerting, but instead offers only alerting on an entire device registry. If this is not true, please reply to this post. The page that clearly states this is here:
Cloud IoT Core exports usage metrics that can be monitored
programmatically or accessed via Stackdriver Monitoring. These metrics
are aggregated at the device registry level. You can use Stackdriver
to create dashboards or set up alerts.
The importance of having per-device alerting is built into the promise assumed in this statement:
Operational information about the health and functioning of devices is
important to ensure that your data-gathering fabric is healthy and
performing well. Devices might be located in harsh environments or in
hard-to-access locations. Monitoring operational intelligence for your
IoT devices is key to preserving the business-relevant data stream.
So it's not easy today to get an alert if one among many globally dispersed devices loses connectivity. One needs to build that, and depending on what one is trying to do, it would entail different solutions.
In my case I wanted to alert if the last heartbeat time or last event state publish was older than 5 minutes. For this I need to run a looping function that scans the device registry and performs this operation regularly. The usage of this API is outlined in this other SO post: Google iot core connection status
For reference, here's a Firebase function I just wrote to check a device's online status. It probably needs some tweaks and further testing, but it should give anybody else something to start with:
// Example code to call this function
// const checkDeviceOnline = functions.httpsCallable('checkDeviceOnline');
// Include 'current' key for 'current' online status to force update on db with delta
// const isOnline = await checkDeviceOnline({ deviceID: 'XXXX', current: true })
export const checkDeviceOnline = functions.https.onCall(async (data, context) => {
    if (!context.auth) {
        throw new functions.https.HttpsError('failed-precondition', 'You must be logged in to call this function!');
    }
    // deviceID is passed in deviceID object key
    const deviceID = data.deviceID
    const dbUpdate = (isOnline) => {
        if (('wasOnline' in data) && data.wasOnline !== isOnline) {
            db.collection("devices").doc(deviceID).update({ online: isOnline })
        }
        return isOnline
    }
    const deviceLastSeen = () => {
        // We only want to use these to determine "latest seen timestamp"
        const stamps = ["lastHeartbeatTime", "lastEventTime", "lastStateTime", "lastConfigAckTime", "deviceAckTime"]
        return stamps.map(key => moment(data[key], "YYYY-MM-DDTHH:mm:ssZ").unix()).filter(epoch => !isNaN(epoch) && epoch > 0).sort().reverse().shift()
    }
    await dm.setAuth()
    const iotDevice: any = await dm.getDevice(deviceID)
    if (!iotDevice) {
        // 'not-found' is a valid HttpsError code; custom codes like 'failed-get-device' throw at runtime
        throw new functions.https.HttpsError('not-found', 'Failed to get device!');
    }
    console.log('iotDevice', iotDevice)
    // If there is no error status and there is last heartbeat time, assume device is online
    if (!iotDevice.lastErrorStatus && iotDevice.lastHeartbeatTime) {
        return dbUpdate(true)
    }
    // Add iotDevice.config.deviceAckTime to root of object
    // For some reason in all my tests, I NEVER receive anything on lastConfigAckTime, so this is my workaround
    if (iotDevice.config && iotDevice.config.deviceAckTime) iotDevice.deviceAckTime = iotDevice.config.deviceAckTime
    // If there is a last error status, let's make sure it's not a stale (old) one
    const lastSeenEpoch = deviceLastSeen()
    const errorEpoch = iotDevice.lastErrorTime ? moment(iotDevice.lastErrorTime, "YYYY-MM-DDTHH:mm:ssZ").unix() : false
    console.log('lastSeen:', lastSeenEpoch, 'errorEpoch:', errorEpoch)
    // Device should be online, the error timestamp is older than latest timestamp for heartbeat, state, etc
    if (lastSeenEpoch && errorEpoch && (lastSeenEpoch > errorEpoch)) {
        return dbUpdate(true)
    }
    // error status code 4 matches
    // lastErrorStatus.code = 4
    // lastErrorStatus.message = mqtt: SERVER: The connection was closed because MQTT keep-alive check failed.
    // will also be 4 for other mqtt errors like command not sent (qos 1 not acknowledged, etc)
    if (iotDevice.lastErrorStatus && iotDevice.lastErrorStatus.code && iotDevice.lastErrorStatus.code === 4) {
        return dbUpdate(false)
    }
    return dbUpdate(false)
})
I also created a function to use with commands, to send a command to the device to check if it's online:
export const isDeviceOnline = functions.https.onCall(async (data, context) => {
    if (!context.auth) {
        throw new functions.https.HttpsError('failed-precondition', 'You must be logged in to call this function!');
    }
    // deviceID is passed in deviceID object key
    const deviceID = data.deviceID
    await dm.setAuth()
    const dbUpdate = (isOnline) => {
        if (('wasOnline' in data) && data.wasOnline !== isOnline) {
            console.log('updating db', deviceID, isOnline)
            db.collection("devices").doc(deviceID).update({ online: isOnline })
        } else {
            console.log('NOT updating db', deviceID, isOnline)
        }
        return isOnline
    }
    try {
        await dm.sendCommand(deviceID, 'alive?', 'alive')
        console.log('Assuming device is online after successful alive? command')
        return dbUpdate(true)
    } catch (error) {
        console.log("Unable to send alive? command", error)
        return dbUpdate(false)
    }
})
This also uses my version of a modified DeviceManager; you can find all the example code in this gist (to make sure you're using the latest update, and to keep this post small):
https://gist.github.com/tripflex/3eff9c425f8b0c037c40f5744e46c319
All of this code, just to check if a device is online or not ... which could easily be handled by Google emitting some kind of event or adding an easy way to handle this. COME ON GOOGLE, GET IT TOGETHER!

AppFabric Topic Subscription

I am trying to assemble a simple AppFabric Topic whereby messages are sent and received using the SessionId. The code does not abort, but brokeredMessage is always null. Here is the code:
// BTW, the topic already exists
var messagingFactory = MessagingFactory.Create(uri, credentials);
var topicClient = messagingFactory.CreateTopicClient(topicName);
var sender = topicClient.CreateSender();
var message = BrokeredMessage.CreateMessage("Top of the day!");
message.SessionId = "1";
sender.Send(message);
var subscription = topic.AddSubscription("1", new SubscriptionDescription { RequiresSession = true});
var mikeSubscriptionClient = messagingFactory.CreateSubscriptionClient(subscription);
var receiver = mikeSubscriptionClient.AcceptSessionReceiver("1");
BrokeredMessage brokeredMessage;
receiver.TryReceive(TimeSpan.FromMinutes(1), out brokeredMessage); // brokeredMessage always null
You have two problems in your code:
1. You create a subscription AFTER you send the message. You need to create the subscription before sending, because a subscription tells the topic to, in a sense, copy the message into several different "buckets".
2. You are using TryReceive but are not checking its result. It returns true if a message was received, and false if not (e.g. a timeout has occurred).
I am writing my sample application and will post it on our blog today. I will post the link here as well. But until then: move the subscription logic to before sending the message, move the receiver to after it, and you will start seeing results. A reordered sketch follows below.
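Here is a minimal reordering using only the calls that already appear in the question (CTP-era AppFabric API; topic, uri, credentials and topicName are assumed to exist as in the question, and names may differ in later SDKs):
var messagingFactory = MessagingFactory.Create(uri, credentials);

// 1. Create the session-aware subscription first, so the topic has a "bucket" to copy the message into.
var subscription = topic.AddSubscription("1", new SubscriptionDescription { RequiresSession = true });

// 2. Then send the message.
var topicClient = messagingFactory.CreateTopicClient(topicName);
var sender = topicClient.CreateSender();
var message = BrokeredMessage.CreateMessage("Top of the day!");
message.SessionId = "1";
sender.Send(message);

// 3. Receive, checking TryReceive's result instead of ignoring it.
var mikeSubscriptionClient = messagingFactory.CreateSubscriptionClient(subscription);
var receiver = mikeSubscriptionClient.AcceptSessionReceiver("1");
BrokeredMessage brokeredMessage;
if (receiver.TryReceive(TimeSpan.FromMinutes(1), out brokeredMessage))
{
    Console.WriteLine(brokeredMessage.GetBody<string>()); // GetBody<T> as the payload accessor is an assumption
}
else
{
    Console.WriteLine("Timed out waiting for a message.");
}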
Update:
As promised, here is the link to my blog post on getting started with AppFabric Queues, Topics, Subscriptions.