How to configure OpenTelemetry agent for an Akka application

I am trying to export metrics and traces from my Akka app written in Scala using OpenTelemetry agent with the purpose of consuming the data in OpenSearch.
Technology stack for my application:
Akka - 2.6.*
RabbitMQ (amqp client 5.12.*)
PostgreSQL (jdbc 42.2.*)
I've added OpenTelemetry instrumentation runtime dependency to build.sbt:
val runtimeDependencies: Seq[ModuleID] = Seq(
"io.opentelemetry.instrumentation" % "opentelemetry-instrumentation-api" % otelInstrumentationVersion % "runtime"
)
...
libraryDependencies ++= compileDependencies ++ testDependencies ++ runtimeDependencies,
I am passing OpenTelemetry configurations in a properties file:
export JAVA_OPTS="... \
-javaagent:lib/opentelemetry/opentelemetry-javaagent-all-v1.6.0.jar \
-Dotel.javaagent.configuration-file=lib/opentelemetry/otel.properties"
The only other related piece in my code is the properties file:
otel.service.name=my-app
otel.traces.exporter=jaeger
otel.propagators=jaeger
I do receive some traces in OpenSearch, but they are disparate and unrelated, whereas I would expect them to be linked. For example, a message is received on a RabbitMQ topic, makes its way into an actor, and the actor eventually issues a SQL query. With linked spans I could see, for each execution, how much time each step took.
This is an approximate view that I get in OpenSearch:
I would love to be able to follow the documentation, but I find OpenTelemetry's configuration guide sparse at this point.
Update:
Not sure whether this is relevant, but I get a warning in Data Prepper:
2021-09-29T16:50:50,861 [raw-pipeline-prepper-worker-5-thread-1] WARN com.amazon.dataprepper.plugins.prepper.oteltrace.OTelTraceRawPrepper - Missing trace group for SpanId: 922097e31cf96c72

OK, so I got around this by running across this issue and then reading about how to suppress specific instrumentations.
So, to reduce clutter in the tracing dashboard, one would add something like the following to the properties file (or the equivalent via environment variables):
otel.instrumentation.rabbitmq.enabled=false
otel.instrumentation.grpc.enabled=false
Note that I removed the two cluttering instrumentation libraries specific to my use case. For another application one would choose other libraries from link #2 above. This way, the spans that you as the application developer declare become the roots.


Java Slack API: use of jetty libraries

I'm following this Java example to post a message to a Slack channel (chat.postMessage). It uses these 3 lines referencing SlackAppServer:
import com.slack.api.bolt.jetty.SlackAppServer;
var server = new SlackAppServer(app);
server.start();
Two questions regarding the use of SlackAppServer and com.slack.api.bolt.jetty:
Is the use of SlackAppServer absolutely necessary for posting a message to a Slack channel using the Java Bolt libraries?
The com.slack.api bolt-jetty library uses the first three Jetty libraries below, which in turn rely on three more Jetty libraries; running "mvn spring-boot:run" produces the error below. The Maven dependency tree doesn't show any overlapping libraries, but some kind of incompatibility or conflict seems to be taking place. Has anyone else seen this or similar errors?
jetty-servlet, jetty-server, jetty-webapp
jetty-alpn-server, http2-server, jetty-alpn-conscrypt-server
The error:
Caused by: java.lang.IllegalAccessError: failed to access class org.eclipse.jetty.util.ArrayTernaryTrie from
class org.eclipse.jetty.http.PathMap (org.eclipse.jetty.util.ArrayTernaryTrie and org.eclipse.jetty.http.PathMap are in unnamed module of loader 'app')
at org.eclipse.jetty.http.PathMap.<init>(PathMap.java:96)
at org.eclipse.jetty.http.PathMap.<init>(PathMap.java:117)
at org.eclipse.jetty.http.PathMap.<init>(PathMap.java:107)
at org.eclipse.jetty.security.ConstraintSecurityHandler.<init>(ConstraintSecurityHandler.java:68)
... 25 more

How to deploy large nodejs package to AWS Lambda?

I am trying to deploy a simple script to AWS Lambda that would generate critical css for a website. Running this serverless seems to make sense (but I cannot find any working examples).
The problem is with package size. I am trying to use https://github.com/pocketjoso/penthouse. When I simply npm install penthouse suddenly the package size is over 300MB. Size limit on Lambda is only 250MB and it will not upload.
Is there any way to solve this? Perhaps download penthouse on the fly? If so, is there any example?
Performance is not so critical in this case as it would be called only a few times a day by an automated process.
Looking at the bundle size of the package (https://bundlephobia.com/result?p=penthouse), it doesn't appear that your issue is primarily with the penthouse package. Although I cannot say for certain, I think it's mainly down to the size of your other dependencies.
Nevertheless, seeing as this isn't a critical system and will be accessed a few times a day by automated processes, you can reduce the size of your node_modules folder by using a CDN.
There are a number of services which allow you to do this. I have primarily used UNPKG and jsDelivr in the past, as they appear to be reliable with minimal-to-no downtime.
I lack the required detail from your question regarding which technology you're specifically using and the extent you can go to in order to achieve your desired result, but there are a few options you can choose:
Utilise webpack's externals configuration:
https://webpack.js.org/configuration/externals/
Use a CDN library loader such as: https://www.npmjs.com/package/import-cdn-js
Or https://www.npmjs.com/package/from-cdn
loadjs is another option: https://github.com/muicss/loadjs
scriptjs https://www.npmjs.com/package/scriptjs
I don't know much about penthouse but with scriptjs, I assume you can achieve something like this:
var fs = require("fs");
var penthouseScript = require("scriptjs");
penthouseScript("https://cdn.jsdelivr.net/npm/penthouse@2.2.2/lib/index.min.js", () => {
  // penthouse related code
  penthouse({
    url: 'http://google.com',
    cssString: 'body { color: red }'
  })
    .then(criticalCss => {
      // use the critical css
      fs.writeFileSync('outfile.css', criticalCss);
    });
});

Should I be concerned about datastoreRpcErrors?

When I run Dataflow jobs that write to Google Cloud Datastore, I sometimes see in the metrics that I had one or two datastoreRpcErrors:
Since these Datastore writes usually contain a batch of keys, I am wondering whether some retry happens automatically when an RpcError occurs. If not, what would be a good way to handle these cases?
tl;dr: By default, a commit that fails with a retryable RPC error is retried automatically, up to 5 times.
I dug into the code of datastoreio in the Beam Python SDK. It looks like the final entity mutations are flushed in batches via DatastoreWriteFn().
# Flush the current batch of mutations to Cloud Datastore.
_, latency_ms = helper.write_mutations(
    self._datastore, self._project, self._mutations,
    self._throttler, self._update_rpc_stats,
    throttle_delay=_Mutate._WRITE_BATCH_TARGET_LATENCY_MS/1000)
The RPCError is caught by this block of code in write_mutations in the helper. The commit method is decorated with @retry.with_exponential_backoff, with the default number of retries set to 5, and retry_on_rpc_error defines the concrete RPCError and SocketError reasons that trigger a retry.
for mutation in mutations:
  commit_request.mutations.add().CopyFrom(mutation)
@retry.with_exponential_backoff(num_retries=5,
                                retry_filter=retry_on_rpc_error)
def commit(request):
  # Client-side throttling.
  while throttler.throttle_request(time.time()*1000):
    ...
  try:
    response = datastore.commit(request)
    ...
  except (RPCError, SocketError):
    if rpc_stats_callback:
      rpc_stats_callback(errors=1)
    raise
...
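To make the retry behaviour described above concrete, here is a minimal, self-contained sketch of a retry decorator with exponential backoff. It is not Beam's implementation: the RPCError class, the retry_on_rpc_error filter (retry only on 5xx codes), the initial_delay_secs parameter and the injectable sleep function are all assumptions made for illustration, and refinements such as jitter are left out.

```python
import functools
import time


class RPCError(Exception):
    """Hypothetical stand-in for the helper's RPCError; carries a status code."""

    def __init__(self, code):
        super().__init__(f"RPC failed with code {code}")
        self.code = code


def retry_on_rpc_error(exc):
    """Hypothetical filter: treat only server-side (5xx) RPC errors as retryable."""
    return isinstance(exc, RPCError) and exc.code >= 500


def with_exponential_backoff(num_retries=5, retry_filter=lambda exc: True,
                             initial_delay_secs=1.0, sleep=time.sleep):
    """Simplified sketch of a retry decorator (an assumption, not Beam's code)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            delay = initial_delay_secs
            # One initial attempt plus up to num_retries retries.
            for attempt in range(num_retries + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    # Give up if retries are exhausted or the error is not retryable.
                    if attempt == num_retries or not retry_filter(exc):
                        raise
                    sleep(delay)
                    delay *= 2  # double the wait before the next attempt
        return wrapper
    return decorator


# Usage: a fake commit that fails twice with a retryable error, then succeeds.
attempts = []
delays = []  # records the requested waits instead of actually sleeping


@with_exponential_backoff(num_retries=5, retry_filter=retry_on_rpc_error,
                          sleep=delays.append)
def commit(request):
    attempts.append(request)
    if len(attempts) < 3:
        raise RPCError(503)
    return "committed"


result = commit("batch-1")
print(result, len(attempts), delays)
```

In this sketch a non-retryable error (for example a 4xx code) is re-raised immediately, and a retryable one is re-raised once the retries are used up, so the caller still sees the failure.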
I think you should first determine which kind of error occurred in order to see what your options are.
The official Datastore documentation has a list of all the possible errors and their error codes. Fortunately, they come with recommended actions for each.
My advice is that you implement their recommendations and look for alternatives if they are not effective for you.

Can Amazon Simple Workflow (SWF) be made to work with jRuby?

For uninteresting reasons, I have to use jRuby on a particular project where we also want to use Amazon Simple Workflow (SWF). I don't have a choice in the jRuby department, so please don't say "use MRI".
The first problem I ran into is that jRuby doesn't support forking and SWF activity workers love to fork. After hacking through the SWF ruby libraries, I was able to figure out how to attach a logger and also figure out how to prevent forking, which was tremendously helpful:
AWS::Flow::ActivityWorker.new(
  swf.client, domain, "my_tasklist", MyActivities
) do |options|
  options.logger = Logger.new("logs/swf_logger.log")
  options.use_forking = false
end
This prevented forking, but now I'm hitting more exceptions deep in the SWF source code having to do with Fibers and the context not existing:
Error in the poller, exception:
AWS::Flow::Core::NoContextException: AWS::Flow::Core::NoContextException stacktrace:
"aws-flow-2.4.0/lib/aws/flow/implementation.rb:38:in 'task'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:292:in 'respond_activity_task_failed'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:204:in 'respond_activity_task_failed_with_retry'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:335:in 'process_single_task'",
"aws-flow-2.4.0/lib/aws/decider/task_poller.rb:388:in 'poll_and_process_single_task'",
"aws-flow-2.4.0/lib/aws/decider/worker.rb:447:in 'run_once'",
"aws-flow-2.4.0/lib/aws/decider/worker.rb:419:in 'start'",
"org/jruby/RubyKernel.java:1501:in `loop'",
"aws-flow-2.4.0/lib/aws/decider/worker.rb:417:in 'start'",
"/Users/trcull/dev/etl/flow/etl_runner.rb:28:in 'start_workers'"
This is the SWF code at that line:
# @param [Future] future
#   Unused; defaults to **nil**.
#
# @param block
#   The block of code to be executed when the task is run.
#
# @raise [NoContextException]
#   If the current fiber does not respond to `Fiber.__context__`.
#
# @return [Future]
#   The tasks result, which is a {Future}.
#
def task(future = nil, &block)
  fiber = ::Fiber.current
  raise NoContextException unless fiber.respond_to? :__context__
  context = fiber.__context__
  t = Task.new(nil, &block)
  task_context = TaskContext.new(:parent => context.get_closest_containing_scope, :task => t)
  context << t
  t.result
end
I fear this is another flavor of the same forking problem and also fear that I'm facing a long road of slogging through SWF source code and working around problems until I finally hit a wall I can't work around.
So, my question is, has anyone actually gotten jRuby and SWF to work together? If so, is there a list of steps and workarounds somewhere I can be pointed to? Googling for "SWF and jRuby" hasn't turned up anything so far and I'm already 1 1/2 days into this task.
I think the issue might be that aws-flow-ruby doesn't support Ruby 2.0. I found this PDF dated Jan 22, 2015.
1.2.1 Tested Ruby Runtimes
The AWS Flow Framework for Ruby has been tested with the official Ruby 1.9 runtime, also known as YARV. Other versions of the Ruby runtime may work, but are unsupported.
I have a partial answer to my own question. The answer to "Can SWF be made to work on jRuby" is "Yes...ish."
I was, indeed, able to get a workflow working end-to-end (and even make calls to a database via JDBC, the original reason I had to do this). So, that's the "yes" part of the answer. Yes, SWF can be made to work on jRuby.
Here's the "ish" part of the answer.
The stack trace I posted above is the result of SWF trying to raise an ActivityTaskFailedException due to a problem in some of my activity code. That part is my fault. What's not my fault is that the superclass of ActivityTaskFailedException has this code in it:
def initialize(reason = "Something went wrong in Flow",
               details = "But this indicates that it got corrupted getting out")
  super(reason)
  @reason = reason
  @details = details
  details = details.message if details.is_a? Exception
  self.set_backtrace(details)
end
When your activity throws an exception, the "details" variable you see above is filled with a String. MRI is perfectly happy to take a String as an argument to set_backtrace(), but jRuby is not, and jRuby throws an exception saying that "details" must be an Array of Strings. This exception blows through all the nice error catching logic of the SWF library and into this code that's trying to do incompatible things with the Fiber library. That code then throws a follow-on exception and kills the activity worker thread entirely.
So, you can run SWF on jRuby as long as your activity and workflow code never, ever throws exceptions because otherwise those exceptions will kill your worker threads (which is not the intended behavior of SWF workers). What they are designed to do instead is communicate the exception back to SWF in a nice, trackable, recoverable fashion. But, the SWF code that does the communicating back to SWF has, itself, code that's incompatible with jRuby.
To get past this problem, I monkey-patched AWS::Flow::FlowException like so:
def initialize(reason = "Something went wrong in Flow",
               details = "But this indicates that it got corrupted getting out")
  super(reason)
  @reason = reason
  @details = details
  details = details.message if details.is_a? Exception
  details = [details] if details.is_a? String
  self.set_backtrace(details)
end
Hope that helps someone in the same situation as me.
I'm using JFlow; it lets you start SWF flow activity workers with JRuby.

How to enable DEBUG level logging with Jetty embedded?

I'm trying to set the logging level to DEBUG in an embedded Jetty instance.
The documentation at http://docs.codehaus.org/display/JETTY/Debugging says to -
call SystemProperty.set("DEBUG", "true") before calling new
org.mortbay.jetty.Server().
I'm not sure what the SystemProperty class is; it doesn't seem to be documented anywhere. I tried System.setProperty(), but that didn't do the trick.
My question was answered on the Jetty mailing list by Joakim Erdfelt:
You are looking at the old Jetty 6.x docs at docs.codehaus.org.
DEBUG logging is just a logging level determined by the logging
implementation you choose to use.
If you use slf4j, then use slf4j's docs for configuring logging level. http://slf4j.org/manual.html
If you use java.util.logging, use the JVM docs. http://docs.oracle.com/javase/6/docs/technotes/guides/logging/overview.html
If you use the built-in StdErrLog, then there is a pattern to follow.
-D{classref}.LEVEL={level}
Where {classref} is the class reference you want to set the level on
(and all sub-class refs), and {level} is one of the values ALL, DEBUG,
INFO, WARN
Example:
-Dorg.eclipse.jetty.LEVEL=INFO - this will enable INFO level logging for all jetty packages / classes.
-Dorg.eclipse.jetty.io.LEVEL=DEBUG - this will enable DEBUG level logging for IO classes only
-Dorg.eclipse.jetty.servlet.LEVEL=ALL - this will enable ALL logging (trace events, internally ignored exceptions, etc..) for servlet
packages.
-Dorg.eclipse.jetty.util.thread.QueuedThreadPool.LEVEL=ALL - this will enable level ALL+ on the specific class only.
Add these options to the java command line:
-Dorg.eclipse.jetty.util.log.class=org.eclipse.jetty.util.log.StdErrLog
-Dorg.eclipse.jetty.LEVEL=DEBUG
If you just want to quickly get log messages on stderr, add something like this to the java command line:
-Dorg.eclipse.jetty.util.log.class=org.eclipse.jetty.util.log.StdErrLog -D{classref}.LEVEL=DEBUG
You can use this snippet to enable logging:
import org.eclipse.jetty.util.log.Log;
import org.eclipse.jetty.util.log.StdErrLog;
// ...
StdErrLog logger = new StdErrLog();
logger.setDebugEnabled(true);
Log.setLog(logger);