PubSub "Connection reset by peer" on GCP

I encountered the following exception:
"System.IO.IOException: Unable to read data from the transport connection: Connection reset by peer.\n ---> System.Net.Sockets.SocketException (104): Connection reset by peer\n --- End of inner exception stack trace ---\n at Google.Cloud.PubSub.V1.SubscriberClientImpl.SingleChannel.HandleRpcFailure(Exception e)\n at Google.Cloud.PubSub.V1.SubscriberClientImpl.SingleChannel.HandlePullMoveNext(Task initTask)\n at Google.Cloud.PubSub.V1.SubscriberClientImpl.SingleChannel.StartAsync()\n at Google.Cloud.PubSub.V1.Tasks.ForwardingAwaiter.GetResult()\n at Google.Cloud.PubSub.V1.Tasks.Extensions.<>c__DisplayClass4_0.<g__Inner|0>d.MoveNext()\n--- End of stack trace from previous location ---\n
The "Invoke" function below is run by a scheduler every 5 seconds to pull messages from my subscription:
public async Task Invoke()
{
var subscriber = await SubscriberClient.CreateAsync(CreateSubscriptionName());
await subscriber.StartAsync((msg, cancellationToken) =>
{
//....
return Task.FromResult(SubscriberClient.Reply.Ack);
});
await subscriber.StopAsync(CancellationToken.None);
}
How can I fix this?
Thanks!
I've already checked the documentation, which says:
PublisherClient and SubscriberClient are expensive to create, so when regularly publishing or subscribing to the same topic or subscription then a singleton client instance should be created and used for the lifetime of the application.
But I still don't know how to do that.
My guess is that I am leaving too many connections open?
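For illustration, here is one way to read the "singleton client" advice quoted above. This is only a sketch in Python, not the .NET API: ExpensiveClient, get_client and invoke are hypothetical names, and the stand-in class merely counts how often it is constructed. The idea is that the scheduled function reuses one cached client instead of creating a new one on every run.

```python
import functools


class ExpensiveClient:
    """Hypothetical stand-in for a client that opens a connection when created."""

    instances_created = 0

    def __init__(self, subscription):
        type(self).instances_created += 1
        self.subscription = subscription

    def pull(self):
        # A real client would talk to the service here.
        return f"message from {self.subscription}"


@functools.lru_cache(maxsize=None)
def get_client(subscription):
    """Create the client once per subscription and reuse it afterwards."""
    return ExpensiveClient(subscription)


def invoke(subscription="my-subscription"):
    """What the scheduler would call every few seconds."""
    return get_client(subscription).pull()


if __name__ == "__main__":
    for _ in range(3):
        print(invoke())
    print("clients created:", ExpensiveClient.instances_created)
```

Whether the cached instance lives in a module-level cache, a static field or a dependency-injection container is a design choice; the point is only that creation happens once for the lifetime of the application.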


django channels async consumer blocking on http request

I have the following async consumer:
import aiohttp
from channels.generic.websocket import AsyncWebsocketConsumer

class MyAsyncConsumer(AsyncWebsocketConsumer):
    async def send_http_request(self):
        async with aiohttp.ClientSession(
            timeout=aiohttp.ClientTimeout(total=60)  # We have 60 seconds total timeout
        ) as session:
            await session.post('my_url', json={
                'key': 'value'
            })

    async def connect(self):
        await self.accept()
        await self.send_http_request()

    async def receive(self, text_data=None, bytes_data=None):
        print(text_data)
Here, in the connect method, I first accept the connection and then call a method that issues an HTTP request with aiohttp, which has a 60-second timeout. Let's assume that the URL we're sending the request to is inaccessible. My initial understanding was that, since all these methods are coroutines, if a message arrives while we are waiting for the response, the receive method would be called and we could process the message before the request finishes. In reality, I only start receiving messages after the request times out, so it seems the consumer waits for send_http_request to finish before it can receive messages.
If I replace
await self.send_http_request()
with
asyncio.create_task(self.send_http_request())
I can receive messages while the request is being made, as I no longer await its completion in the connect method.
My understanding was that in the first case, too, I would be able to receive messages while awaiting the request, since we are using different coroutines, but that is not the case. Could it be that the whole consumer instance works as a single coroutine? Can someone clarify what's happening here?
Consumers in Django Channels each run their own run loop (an async Task). But this is per consumer, not per message, so if you are handling a message and you await something, the entire run loop for that WebSocket connection is awaiting.
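The behaviour described above can be reproduced without Channels. The following is a minimal sketch using only asyncio; SequentialConsumer is a hypothetical toy class (not the Channels implementation) whose run method handles events one after another, the way a single per-connection task would. With background=False the slow call is awaited inside connect, so receive has to wait; with background=True it is started with asyncio.create_task and receive runs first.

```python
import asyncio


class SequentialConsumer:
    """Toy stand-in for a consumer: one task handles its events in order."""

    def __init__(self, background):
        self.background = background
        self.log = []
        self.tasks = set()  # keep references so tasks are not garbage collected

    async def slow_request(self):
        await asyncio.sleep(0.2)  # stands in for the slow HTTP request
        self.log.append("request done")

    async def connect(self):
        self.log.append("connected")
        if self.background:
            task = asyncio.create_task(self.slow_request())
            self.tasks.add(task)
        else:
            await self.slow_request()  # blocks this consumer's event handling

    async def receive(self, text):
        self.log.append(f"received {text}")

    async def run(self, events):
        # Events are dispatched strictly one at a time.
        for name, args in events:
            await getattr(self, name)(*args)
        await asyncio.gather(*self.tasks)


def demo(background):
    consumer = SequentialConsumer(background)
    asyncio.run(consumer.run([("connect", ()), ("receive", ("hello",))]))
    return consumer.log


if __name__ == "__main__":
    print(demo(background=False))
    print(demo(background=True))
```

If you start background tasks like this in a real consumer, keep a reference to each task and consider cancelling them when the connection closes.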

Azure Webjob, KeyVault Configuration extension, Socket Error

I need some help determining whether this is a bug in my code or in the Key Vault configuration extension.
I have a .NET Core console-based WebJob. It was all working fine until a few weeks ago, when we started getting occasional startup errors: Socket Error 10060 (socket timed out), or "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond".
These were all related to loading the configuration layers (app settings, environment, command line and Key Vault). The errors came from the Key Vault provider when Build was called on the host builder.
I initially added a retry policy with the default HttpStatusCodeErrorDetectionStrategy and an exponential back-off, but it is not executing.
Finally, I added my own retry policy with my own detection strategy (see below). It is still not being fired.
I have stripped the code down to a hello-world-like example and included the messages from the WebJob.
Here is the code summary:
Main
public static async Task<int> Main(string[] args)
{
var host = CreateHostBuilder(args)
.UseConsoleLifetime()
.Build();
using var serviceScope = host.Services.CreateScope();
var services = serviceScope.ServiceProvider;
//**stripped down to logging just for debug
var loggerFactory = host.Services.GetRequiredService<ILoggerFactory>();
var logger = loggerFactory.CreateLogger("Main");
logger.LogDebug("Hello Test App Started OK. Exiting.");
//**Normally lots of service calls go here to do real work**
return 0;
}
HostBuilder - why hostbuilder? We use lots of components that are built for webapi and webapps so it was convenient to use a similar services model.
public static IHostBuilder CreateHostBuilder(string[] args)
{
var host = Host
.CreateDefaultBuilder(args)
.ConfigureAppConfiguration((ctx, config) =>
{
//override with keyvault
var azureServiceTokenProvider = new AzureServiceTokenProvider(); //this is awesome - it will use MSI or Visual Studio connection
var keyVaultClient = new KeyVaultClient(new KeyVaultClient.AuthenticationCallback(azureServiceTokenProvider.KeyVaultTokenCallback));
var retryPolicy = new RetryPolicy<ServerErrorDetectionStrategy>(
new ExponentialBackoffRetryStrategy(
retryCount: 5,
minBackoff: TimeSpan.FromSeconds(1.0),
maxBackoff: TimeSpan.FromSeconds(16.0),
deltaBackoff: TimeSpan.FromSeconds(2.0)
)
);
retryPolicy.Retrying += RetryPolicy_Retrying;
keyVaultClient.SetRetryPolicy(retryPolicy);
var prebuiltConfig = config.Build();
config.AddAzureKeyVault(prebuiltConfig.GetSection("KeyVaultSettings").GetValue<string>("KeyVaultUri"), keyVaultClient, new DefaultKeyVaultSecretManager());
config.AddCommandLine(args);
})
.ConfigureLogging((ctx, loggingBuilder) => //note - this is run AFTER app configuration - whatever the order it is in.
{
loggingBuilder.ClearProviders();
loggingBuilder
.AddConsole()
.AddDebug()
.AddApplicationInsightsWebJobs(config => config.InstrumentationKey = ctx.Configuration["APPINSIGHTS_INSTRUMENTATIONKEY"]);
})
.ConfigureServices((ctx, services) =>
{
services
.AddApplicationInsightsTelemetry();
services
.AddOptions();
});
return host;
}
Event - this is never fired.
private static void RetryPolicy_Retrying(object sender, RetryingEventArgs e)
{
Console.WriteLine($"Retrying, count = {e.CurrentRetryCount}, Last Exception={e.LastException}, Delay={e.Delay}");
}
Retry Policy - only fires for the non-MSI attempt to contact the keyvault.
public class ServerErrorDetectionStrategy : ITransientErrorDetectionStrategy
{
public bool IsTransient(Exception ex)
{
if (ex != null)
{
Console.WriteLine($"Exception {ex.Message} received, {ex.GetType()?.FullName}");
HttpRequestWithStatusException httpException;
if ((httpException = ex as HttpRequestWithStatusException) != null)
{
switch(httpException.StatusCode)
{
case HttpStatusCode.RequestTimeout:
case HttpStatusCode.GatewayTimeout:
case HttpStatusCode.InternalServerError:
case HttpStatusCode.ServiceUnavailable:
return true;
}
}
SocketException socketException;
if((socketException = (ex as SocketException)) != null)
{
Console.WriteLine($"Exception {socketException.Message} received, Error Code: {socketException.ErrorCode}, SocketErrorCode: {socketException.SocketErrorCode}");
if (socketException.SocketErrorCode == SocketError.TimedOut)
{
return true;
}
}
}
return false;
}
}
WebJob Output
[SYS INFO] Status changed to Initializing
[SYS INFO] Run script 'run.cmd' with script host - 'WindowsScriptHost'
[SYS INFO] Status changed to Running
[INFO]
[INFO] D:\local\Temp\jobs\triggered\HelloWebJob\42wj5ipx.ukj>dotnet HelloWebJob.dll
[INFO] Exception Response status code indicates server error: 401 (Unauthorized). received, Microsoft.Rest.TransientFaultHandling.HttpRequestWithStatusException
[INFO] Exception A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. received, System.Net.Http.HttpRequestException
[ERR ] Unhandled exception. System.Net.Http.HttpRequestException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[ERR ] ---> System.Net.Sockets.SocketException (10060): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
[ERR ] at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
[ERR ] --- End of inner exception stack trace ---
[ERR ] at Microsoft.Rest.RetryDelegatingHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
[ERR ] at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
[ERR ] at Microsoft.Azure.KeyVault.KeyVaultClient.GetSecretWithHttpMessagesAsync(String vaultBaseUrl, String secretName, String secretVersion, Dictionary`2 customHeaders, CancellationToken cancellationToken)
[ERR ] at Microsoft.Azure.KeyVault.KeyVaultClientExtensions.GetSecretAsync(IKeyVaultClient operations, String secretIdentifier, CancellationToken cancellationToken)
[ERR ] at Microsoft.Extensions.Configuration.AzureKeyVault.AzureKeyVaultConfigurationProvider.LoadAsync()
[ERR ] at Microsoft.Extensions.Configuration.AzureKeyVault.AzureKeyVaultConfigurationProvider.Load()
[ERR ] at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
[ERR ] at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
[ERR ] at Microsoft.Extensions.Hosting.HostBuilder.BuildAppConfiguration()
[ERR ] at Microsoft.Extensions.Hosting.HostBuilder.Build()
[ERR ] at HelloWebJob.Program.Main(String[] args) in C:\Users\mark\Source\Repos\HelloWebJob\HelloWebJob\Program.cs:line 21
[ERR ] at HelloWebJob.Program.<Main>(String[] args)
[SYS INFO] Status changed to Failed
[SYS ERR ] Job failed due to exit code -532462766
This is a Key Vault connectivity issue that has been identified by the product group (PG). Below is the official statement from the product group:
The Microsoft Azure App Service Team has identified an issue with the
Key Vault references for App Service and Azure Functions feature
related to intermittent failure to resolve references at runtime.
Engineers identified a regression in the system that reduced the
performance and availability of our scale unit’s ability to retrieve
key vault references at runtime. A patch has been written and deployed
to our fleet of VMs to mitigate this issue.
We are continuously taking steps to improve the Azure Web App service
and our processes to ensure such incidents do not occur in the future,
and in this case, it includes (but is not limited to):
- Improving detection and testing of performance and availability of the Key Vault App Setting References feature
- Improvements to our platform to ensure high availability of this feature at runtime.
We apologize for any inconvenience.
For almost everyone, updating packages to the new Microsoft.Azure packages has mitigated this issue, so trying those would be my first suggestion.
Thanks @HarshitaSingh-MSFT, that makes sense, though I searched for this when I had the problem and couldn't find it.
As a workaround, I added some basic retry code to the startup.
Main looks like this for now:
public static async Task<int> Main(string[] args)
{
IHost host = null;
int retries = 5;
while (true)
{
try
{
Console.WriteLine("Building Host...");
var hostBuilder = CreateHostBuilder(args)
.UseConsoleLifetime();
host = hostBuilder.Build();
break;
}
catch (HttpRequestException hEx)
when (hEx.InnerException is SocketException se && se.SocketErrorCode == SocketError.TimedOut && retries > 0)
{
// Only socket timeouts are retried, and only while retries remain.
// Any other exception propagates instead of looping forever.
Console.WriteLine($"Socket timeout in host builder. {hEx.Message}, Name:{hEx.GetType().Name}. Retrying...");
retries--;
await Task.Delay(5000);
}
}
using var serviceScope = host.Services.CreateScope();
var services = serviceScope.ServiceProvider;
var transferService = services.GetRequiredService<IRunPinTransfer>();
var result = await transferService.ProcessAsync();
return result;
}
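The retry idea behind both the retry policy and the workaround can be sketched generically. The Python below is an assumption-laden illustration, not the Microsoft.Rest retry strategy: retry_with_backoff, is_transient and the simple doubling schedule are hypothetical choices made for the example. It retries only errors classified as transient, doubles the delay up to a cap, and re-raises everything else immediately.

```python
import time


def retry_with_backoff(operation, retries=5, min_delay=1.0, max_delay=16.0,
                       is_transient=lambda err: isinstance(err, TimeoutError),
                       sleep=time.sleep):
    """Call operation(); on a transient error wait and try again.

    The delay starts at min_delay and doubles after each failure, capped at
    max_delay. Non-transient errors, and the last failure, are re-raised.
    """
    delay = min_delay
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception as err:
            if attempt == retries or not is_transient(err):
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    attempts = {"count": 0}

    def build_host():
        # Hypothetical startup step that times out twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("connection attempt timed out")
        return "host"

    # A short min_delay keeps the demonstration quick.
    print(retry_with_backoff(build_host, min_delay=0.01))
```

Passing the sleep function as a parameter is only there to make the schedule easy to test without waiting.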

AWS lambda send partial response

I have a lambda function which does a series of actions. I have a react application which triggers the lambda function.
Is there a way I can send a partial response from the lambda function after each action is complete?
const testFunction = async (event, context, callback) => {
let partialResponse1 = await action1(event);
// send partial response to client
let partialResponse2 = await action2(partialResponse1);
// send partial response to client
let partialResponse3 = await action3(partialResponse2);
// send partial response to client
let response = await action4(partialResponse3);
// send final response
}
Is this possible in Lambda functions? If so, how can we do this? Any reference docs or sample code would be a great help.
Thanks.
Note: This is a fairly simple case of showing a loader with a percentage on the client side. I don't want to overcomplicate things with SQS or Step Functions.
I am still looking for an answer for this.
From what I understand, you're using API Gateway + Lambda and want to show the progress of the Lambda in the UI.
Since each step must finish before the next step begins, I see no reason not to call the Lambda 4 times, or to split it into 4 separate Lambdas.
E.g.:
// Not real syntax!
try {
res1 = await ajax.post(/process, {stage: 1, data: ... });
out(stage 1 complete);
res2 = await ajax.post(/process, {stage: 2, data: res1});
out(stage 2 complete);
res3 = await ajax.post(/process, {stage: 3, data: res2});
out(stage 3 complete);
res4 = await ajax.post(/process, {stage: 4, data: res3});
out(stage 4 complete);
out(process finished);
} catch(err) {
out(stage {$err.stage-number} failed to complete);
}
If you still want all 4 actions to run during the same Lambda execution, you can do the following. This is especially relevant if the process is expected to take a long time, because it is usually not good practice to keep a long-hanging HTTP request open.
Save the progress in a database as the process runs and, when the process completes, save the result to the database as well.
The client then only needs to query the status every X seconds.
// Not real syntax
Gateway-API --> lambda1 - startProcess(): returns ID {
uuid = randomUUID();
write to dynamoDB { status: starting }.
send sqs-message-to-start-process(data, uuid);
return response { uuid: uuid };
}
SQS --> lambda2 - execute(): returns void {
try {
let partialResponse1 = await action1(event);
write to dynamoDB { status: action 1 complete }.
// send partial response to client
let partialResponse2 = await action2(partialResponse1);
write to dynamoDB { status: action 2 complete }.
// send partial response to client
let partialResponse3 = await action3(partialResponse2);
write to dynamoDB { status: action 3 complete }.
// send partial response to client
let response = await action4(partialResponse3);
write to dynamoDB { status: action 4 complete, response: response }.
} catch(err) {
write to dynamoDB { status: failed, error: err }.
}
}
Gateway-API --> lambda3 -> getStatus(uuid): returns status {
return status from dynamoDB (uuid);
}
Your UI Code:
res = ajax.get(/startProcess);
uuid = res.uuid;
every X (e.g. 3) seconds {
status = ajax.get(/getStatus?uuid=uuid);
show(status);
if (status.error) {
handle(status.error) and break;
}
if (status.response) {
handle(status.response) and break;
}
}
Just remember that a Lambda execution cannot exceed 15 minutes. Therefore, you need to be certain that the process never exceeds this hard limit.
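To make the start / execute / poll flow concrete, here is a small local simulation in Python. It is a sketch under loud assumptions: an in-memory dictionary stands in for the DynamoDB table, a thread stands in for the SQS-triggered worker Lambda, and start_process, run_job, get_status and poll are hypothetical names chosen for this example.

```python
import threading
import time
import uuid

STATUS = {}               # in-memory stand-in for the status table
LOCK = threading.Lock()


def set_status(job_id, **fields):
    """Merge new fields into the stored status of a job."""
    with LOCK:
        STATUS[job_id] = {**STATUS.get(job_id, {}), **fields}


def get_status(job_id):
    """What the getStatus endpoint would return."""
    with LOCK:
        return dict(STATUS.get(job_id, {}))


def run_job(job_id, data, actions):
    """Worker: run the actions in order and record progress after each one."""
    try:
        for number, action in enumerate(actions, start=1):
            data = action(data)
            set_status(job_id, status=f"action {number} complete")
        set_status(job_id, response=data)
    except Exception as err:
        set_status(job_id, status="failed", error=str(err))


def start_process(data, actions):
    """What the startProcess endpoint would do: register the job, return its id."""
    job_id = str(uuid.uuid4())
    set_status(job_id, status="starting")
    threading.Thread(target=run_job, args=(job_id, data, actions)).start()
    return job_id


def poll(job_id, interval=0.05, timeout=5.0):
    """Client side: query the status until a response or an error appears."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if "error" in status or "response" in status:
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish in time")


if __name__ == "__main__":
    job = start_process(0, [lambda x: x + 1] * 4)
    print(poll(job))
```

In a real deployment the client would display each intermediate status it sees while polling, which is what drives the progress indicator.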
What you are looking for is a response exposed as a stream, which you can write to and flush.
Unfortunately, that is not available in Node.js:
How to stream AWS Lambda response in node?
https://docs.aws.amazon.com/lambda/latest/dg/programming-model.html
But you can still do the streaming if you use Java
https://docs.aws.amazon.com/lambda/latest/dg/java-handler-io-type-stream.html
package example;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import com.amazonaws.services.lambda.runtime.RequestStreamHandler;
import com.amazonaws.services.lambda.runtime.Context;
public class Hello implements RequestStreamHandler {
    // RequestStreamHandler declares handleRequest, so the method must use that name.
    @Override
    public void handleRequest(InputStream inputStream, OutputStream outputStream, Context context) throws IOException {
        int letter;
        // Copy the input to the output one byte at a time, upper-casing each character.
        while ((letter = inputStream.read()) != -1)
        {
            outputStream.write(Character.toUpperCase(letter));
        }
    }
}
Aman,
You can push the partial outputs into SQS and read the SQS messages to process them. This is a simple and scalable architecture. AWS provides SQS SDKs for different languages, for example JavaScript, Java and Python.
Reading from and writing to SQS is very easy using the SDK, and it can be implemented on the server side or in your UI layer (with proper IAM permissions).
I found that AWS Step Functions may be what you need:
AWS Step Functions lets you coordinate multiple AWS services into serverless workflows so you can build and update apps quickly.
Check this link for more detail:
In our example, you are a developer who has been asked to create a serverless application to automate handling of support tickets in a call center. While you could have one Lambda function call the other, you worry that managing all of those connections will become challenging as the call center application becomes more sophisticated. Plus, any change in the flow of the application will require changes in multiple places, and you could end up writing the same code over and over again.

Dispatch Queues for Asynchronous Web Service Call

I'm calling the Facebook Graph API to get the email, facebook ID and name of a user that logs into my app through Facebook.
I successfully get the information; I'm now trying to use dispatch groups so that the function that calls the Graph API waits until the call completes before returning. The graph request is asynchronous.
I can't figure out why this code is locking up.
1) Create a dispatch group
2) Enter said dispatch group
3) Leave the group once the info is retrieved or an error is found
4) Wait for the group to be left before returning
It seems like my dispatch group enter is not called correctly, but I can't figure out why.
class func getFBInformation()->Bool {
var fbResult = false
let fbGraphGroup = DispatchGroup()
fbGraphGroup.enter()
FBSDKGraphRequest(graphPath: "/me", parameters: ["fields": "id, name, email"]).start { (connection, result, err) in
if err != nil {
fbResult = false
print("Pre Error Signal")
fbGraphGroup.leave()
return
}
if let resultDict = result as? [String:AnyObject] {
<Do things with graph results>
print("Pre success signal")
fbResult = true
fbGraphGroup.leave()
}
}
fbGraphGroup.wait()
print("Post signal")
return fbResult
}
How could it work?
First you enter the group; then you wait on the main thread until the completion handler leaves the group. But the completion handler cannot execute on the main thread, because that thread is blocked in wait(), so it never gets to leave the group.
As far as I know, the API dispatches your completion handler on the main queue. No additional synchronization is necessary.
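The same trap can be shown outside of Swift. The sketch below is a Python analogy, not GCD: a single-worker ThreadPoolExecutor plays the role of the serial main queue, and fetch_profile is a hypothetical fake API (with made-up profile data) that delivers its callback on that queue. Blocking on the queue while waiting for the callback never succeeds, which is why the blocking version uses a timeout here instead of hanging; passing a completion handler works.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A single worker thread stands in for the serial main queue.
main_queue = ThreadPoolExecutor(max_workers=1)


def fetch_profile(completion):
    """Hypothetical async API: delivers its result later, on main_queue."""
    def deliver():
        completion({"id": "1", "name": "Ann"}, None)  # made-up result, no error
    threading.Timer(0.05, lambda: main_queue.submit(deliver)).start()


def blocking_get_info(timeout=0.3):
    """Anti-pattern: wait for the callback while occupying the queue it needs."""
    done = threading.Event()
    fetch_profile(lambda result, err: done.set())
    # The callback is queued behind this very function, so the wait times out.
    return done.wait(timeout)


def get_info(completion):
    """Non-blocking version: report the outcome through a completion handler."""
    fetch_profile(lambda result, err: completion(err is None and result is not None))


if __name__ == "__main__":
    print("blocking succeeded:", main_queue.submit(blocking_get_info).result())
    finished = threading.Event()
    get_info(lambda ok: (print("callback succeeded:", ok), finished.set()))
    finished.wait(2.0)
    main_queue.shutdown()
```

In the Swift code the equivalent change is to give the function a completion closure parameter instead of a Bool return value, and to call that closure from inside the graph request's handler.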

SQS AWS - message.acknowledge - exception with acknowledge mode: UNORDERED_ACKNOWLEDGE

I was running the SyncMessageReceiverUnorderedAcknowledge.java program, exactly as written on: http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/jmsclient.html#jmsclient-ackmode
public class SyncMessageReceiverUnorderedAcknowledge {
// Visibility time-out for the queue. It must match to the one set for the queue for this example to work.
private static final long TIME_OUT_SECONDS = 1;
public static void main(String args[]) throws JMSException, InterruptedException {
// Create the configuration for the example
ExampleConfiguration config = ExampleConfiguration.parseConfig("SyncMessageReceiverUnorderedAcknowledge", args);
// Setup logging for the example
ExampleCommon.setupLogging();
// Create the connection factory based on the config
SQSConnectionFactory connectionFactory =
SQSConnectionFactory.builder()
.withRegion(config.getRegion())
.withAWSCredentialsProvider(config.getCredentialsProvider())
.build();
// Create the connection
SQSConnection connection = connectionFactory.createConnection();
// Create the queue if needed
ExampleCommon.ensureQueueExists(connection, config.getQueueName());
// Create the session with unordered acknowledge mode
Session session = connection.createSession(false, SQSSession.UNORDERED_ACKNOWLEDGE);
// Create the producer and consume
MessageProducer producer = session.createProducer(session.createQueue(config.getQueueName()));
MessageConsumer consumer = session.createConsumer(session.createQueue(config.getQueueName()));
// Open the connection
connection.start();
// Send two text messages
sendMessage(producer, session, "Message 1");
sendMessage(producer, session, "Message 2");
// Receive a message and don't acknowledge it
receiveMessage(consumer, false);
// Receive another message and acknowledge it
receiveMessage(consumer, true);
// Wait for the visibility time out, so that unacknowledged messages reappear in the queue
System.out.println("Waiting for visibility timeout...");
Thread.sleep(TimeUnit.SECONDS.toMillis(TIME_OUT_SECONDS));
// Attempt to receive another message and acknowledge it. This will result in receiving the first message since
// we have acknowledged only the second message. In the UNORDERED_ACKNOWLEDGE mode, all the messages must
// be explicitly acknowledged.
receiveMessage(consumer, true);
// Close the connection. This will close the session automatically
connection.close();
System.out.println("Connection closed.");
}
/**
* Sends a message through the producer.
*
* @param producer Message producer
* @param session Session
* @param messageText Text for the message to be sent
* @throws JMSException
*/
private static void sendMessage(MessageProducer producer, Session session, String messageText) throws JMSException {
// Create a text message and send it
producer.send(session.createTextMessage(messageText));
}
/**
* Receives a message through the consumer synchronously with the default timeout (TIME_OUT_SECONDS).
* If a message is received, the message is printed. If no message is received, "Queue is empty!" is
* printed.
*
* @param consumer Message consumer
* @param acknowledge If true and a message is received, the received message is acknowledged.
* @throws JMSException
*/
private static void receiveMessage(MessageConsumer consumer, boolean acknowledge) throws JMSException {
// Receive a message
Message message = consumer.receive(TimeUnit.SECONDS.toMillis(TIME_OUT_SECONDS));
if (message == null) {
System.out.println("Queue is empty!");
} else {
// Since this queue has only text messages, cast the message object and print the text
System.out.println("Received: " + ((TextMessage) message).getText());
// Acknowledge the message if asked
if (acknowledge) message.acknowledge();
}
}
}
When the below code is reached:
// Create the session with unordered acknowledge mode
Session session = connection.createSession(false,SQSSession.UNORDERED_ACKNOWLEDGE);
// Acknowledge the message if asked
if (acknowledge) message.acknowledge();
I get the following exception:
Exception in thread "main" java.lang.NoSuchMethodError: com.amazonaws.services.sqs.AmazonSQS.deleteMessage(Lcom/amazonaws/services/sqs/model/DeleteMessageRequest;)V
at com.amazon.sqs.javamessaging.AmazonSQSMessagingClientWrapper.deleteMessage(AmazonSQSMessagingClientWrapper.java:127)
at com.amazon.sqs.javamessaging.acknowledge.UnorderedAcknowledger.acknowledge(UnorderedAcknowledger.java:42)
at com.amazon.sqs.javamessaging.message.SQSMessage.acknowledge(SQSMessage.java:883)
at sample.sqs.SyncMessageReceiverUnorderedAcknowledge.receiveMessage(SyncMessageReceiverUnorderedAcknowledge.java:116)
at sample.sqs.SyncMessageReceiverUnorderedAcknowledge.main(SyncMessageReceiverUnorderedAcknowledge.java:67)
I am running with the following gradle dependencies:
compile("com.amazonaws:aws-java-sdk-sqs:1.11.13")
compile("com.amazonaws:amazon-sqs-java-messaging-lib:1.0.0")
I debugged the code, and all the AWS Java classes look fine.
In addition, I created a new program that runs deleteMessage in isolation.
With deleteMessage(DeleteMessageRequest deleteMessageRequest), I get the same exception.
But with DeleteMessageBatchResult deleteMessageBatch(DeleteMessageBatchRequest deleteMessageBatchRequest), which is in the same class, it works!
I cleaned the Gradle cache folder, downloaded all the jars again, and ran clean and build, but I get the same results :-(
Help will be highly appreciated.
https://github.com/awslabs/amazon-sqs-java-messaging-lib/issues/22
I changed the Gradle dependency:
compile("com.amazonaws:aws-java-sdk-sqs:1.9.6")
And now it works perfectly.
Opening the pom.xml of the amazon-sqs-java-messaging-lib project, I can see:
<properties> <aws-java-sdk.version>1.9.6</aws-java-sdk.version> </properties>
I wonder when the AWS team plans to update it to the latest version (e.g. 1.11.13), or at least a recent one.